
The Traveling Salesman Enables the Rapid Synthesis of Repetitive Polypeptides

A codon-scrambling algorithm
enables the PCR-based synthesis of repetitive proteins by finding the
least-repetitive synonymous gene sequence

The synthesis
of genes encoding highly repetitive polypeptides is one of the unsolved
problems in synthetic biology. While fast, scalable, high-throughput methods for
the synthesis of non-repetitive genes are readily
available, these methods rely on piecing together oligonucleotides, or gene
fragments. For highly repetitive genes,
these methods fail because the gene fragments are too similar to one another to
yield precise results. However,
because several different DNA codons encode the same amino acid, synthetic
biologists can avoid troublesome DNA repeats by swapping in synonymous codons
that yield the same polypeptide. The challenge is finding the
least-repetitive DNA sequence that still encodes the desired polypeptide or
protein. In their publication in Nature Materials, Research
Triangle MRSEC professor Ashutosh Chilkoti and
graduate fellow Nicholas Tang from
Duke University have removed this hurdle by developing a freely available
computer program based on the “traveling salesman” mathematics problem. With
this program, they successfully synthesized 19 different repetitive proteins
using commercial biotechnology services. Synthetic biologists can now find the
least-repetitive gene sequence to build the molecule they want to study. The
researchers say their program will allow those with limited resources or
expertise to easily explore synthetic biomaterials that were once available to
only a small fraction of the field. “This advance really democratizes the field
of synthetic biology and levels the playing field,” said Tang. “Before, you had
to have a lot of expertise and patience to work with repetitive sequences, but
now anyone can just order them online. We think this could really break open
the bottleneck that has held the field back and hopefully recruit more people
into the field.”
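
To make the codon-swapping idea concrete, here is a minimal Python sketch. It is not the researchers' program and does not use a traveling-salesman formulation; it swaps in a simple greedy heuristic purely for illustration. The peptide repeat, the partial codon table (a subset of the standard genetic code), the window length of 9 nucleotides, and all function names are assumptions introduced for this example.

```python
# Illustrative sketch only: a greedy stand-in, NOT the published algorithm.
# Partial codon table (subset of the standard genetic code) for the
# amino acids used in the hypothetical peptide below.
SYNONYMOUS_CODONS = {
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "V": ["GTT", "GTC", "GTA", "GTG"],
    "P": ["CCT", "CCC", "CCA", "CCG"],
}

# Reverse lookup: codon -> amino acid.
CODON_TO_AA = {
    codon: aa for aa, codons in SYNONYMOUS_CODONS.items() for codon in codons
}


def translate(dna):
    """Translate a DNA string (length divisible by 3) back into a peptide."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def repeated_kmers(dna, k=9):
    """Count windows of length k that duplicate an earlier window.

    This is a crude repetitiveness score chosen for the example.
    """
    seen = set()
    repeats = 0
    for i in range(len(dna) - k + 1):
        kmer = dna[i:i + k]
        if kmer in seen:
            repeats += 1
        else:
            seen.add(kmer)
    return repeats


def naive_encode(peptide):
    """Always use the first listed codon, so peptide repeats become DNA repeats."""
    return "".join(SYNONYMOUS_CODONS[aa][0] for aa in peptide)


def greedy_scramble(peptide, k=9):
    """Pick, one residue at a time, the synonymous codon that adds the fewest
    repeated k-mers to the sequence built so far (ties go to the first codon)."""
    dna = ""
    for aa in peptide:
        best = min(
            SYNONYMOUS_CODONS[aa],
            key=lambda codon: repeated_kmers(dna + codon, k),
        )
        dna += best
    return dna


# Hypothetical repetitive peptide: four copies of a five-residue motif.
peptide = "VPGVG" * 4
naive = naive_encode(peptide)
scrambled = greedy_scramble(peptide)

print("naive    :", naive, "repeats:", repeated_kmers(naive))
print("scrambled:", scrambled, "repeats:", repeated_kmers(scrambled))
```

Both sequences translate to the same peptide, but the scrambled one reuses fewer 9-nucleotide windows. A greedy choice like this only looks one codon ahead; a global optimization, such as the traveling-salesman approach the researchers describe, aims for the least-repetitive sequence overall.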

Computational Results for the Optimization of a Variety of Repetitive Proteins