Gene synthesis enables creation and modification of genetic sequences at an unprecedented pace, offering enormous potential for new biological functionality but also increasing the need for biosurveillance. In this paper, we introduce a bioinformatics technique for determining whether a gene is natural or synthetic based solely on nucleotide sequence. This technique, grounded in codon theory and machine learning, can correctly classify genes with 97.7% accuracy on a novel data set. We then classify ∼19,000 unique genes from the Addgene non-profit plasmid repository to investigate whether natural and synthetic genes have differential use in heterologous expression. Phylogenetic analysis of distance between source and expression organisms reveals that researchers are using synthesis to source genes from more genetically distant organisms, particularly for longer genes. We provide empirical evidence that gene synthesis is leading biologists to sample more broadly across the diversity of life, and we provide a foundational tool for the biosurveillance community.

Biologists and bioengineers often transfer genes across organisms to test genetic hypotheses or to endow their favorite model organisms with novel traits or functionality [1, 2]. In the first industrial example of recombinant DNA technology, Eli Lilly and Genentech expressed a synthetic gene encoding human insulin in the model bacterium Escherichia coli for drug manufacturing [3]. Soon afterwards, biologists began sourcing genes encoding thermostable polymerases [4] from thermophilic bacteria and the well-known green fluorescent protein (GFP) [5] from the jellyfish as research tools. More recent biological research focused on mammalian models has featured considerable introduction of bacterial genes, notably the targeted genome editing tool CRISPR-Cas9 [6, 7, 8] and tools for optogenetics [9, 10]. The growing field of synthetic biology also drives gene transfer because the genome sequences of non-model organisms present a treasure trove of potentially novel and orthogonal genes for testing in model organisms [11, 12].

Using DNA synthesis to transfer synthetic gene sequences from one organism to another may succeed where transferring natural gene sequences would fail. Although natural genes have the potential for direct transfer from one organism to another because of the universality of the genetic code, many such sequences would express poorly when moved into a new organism because of differences in codon usage, GC content, or the presence of expression-limiting regulatory elements [13, 14]. These concerns only worsen as sequence length increases because the potential for problematic codons increases, as does the time required to manually convert these codons using PCR-based or restriction enzyme-based approaches. Such constraints can limit what genetic engineers accomplish. In contrast with these restrictions on moving genes using traditional methods, gene synthesis can faithfully and rapidly recode natural sequences of large lengths [15, 16]. Recoding algorithms harness synonymous codons that more closely reflect the expression organism and preserve the natural protein sequence [17]. Though the subtle implications of codon choice for the rate and quality of protein production are still being understood [18, 19], such codon optimization is so valuable for expression that commercial gene synthesis service providers typically offer this option by default.

We posit that codon optimization offers a promising way to identify synthetic genes and the engineered organisms that contain them and thus provides the first way, to the best of our knowledge, to identify synthetic sequences from sequence alone. In the past, such engineering efforts could have been detected through the scars from gene editing, but such methods are becoming obsolete because of advances in scar-less molecular cloning [20, 21] and genome engineering techniques [22]. The ability to accurately identify synthetic genes enhances biosurveillance for organisms taking on non-native traits, which may be harmful or illicit. Although commercial DNA synthesis suppliers screen orders for similarity to select agents [23, 24, 25, 26], detection of synthetic genes within organismal genomes is particularly valuable for cases where conventional biosecurity control could be circumvented, such as when synthesis is done on a non-regulated machine. Such detection is also relevant for biosafety in the event of accidental release of engineered organisms. The importance of additional biosurveillance capability has been articulated widely, for example by a major U.S. bipartisan biodefense study [27], ongoing U.S. […]
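To make the idea of codon recoding concrete, the following is a minimal Python sketch, not the classifier described in this paper. It swaps each codon for a synonymous codon assumed to be favored by the expression host, checks that the protein sequence is preserved, and computes a crude score for how closely a sequence follows those preferences. The `PREFERRED` table, the function names, and the example sequence are illustrative assumptions; a real analysis would use measured codon usage for the host and a trained model.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical host preferences (illustrative only, not measured frequencies):
# one favored codon for a handful of amino acids and the stop signal.
PREFERRED = {"L": "CTG", "R": "CGT", "G": "GGC", "K": "AAA", "*": "TAA"}


def codons(seq):
    """Split an in-frame coding sequence into a list of codons."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq):
    """Translate a coding sequence into one-letter amino acids ('*' = stop)."""
    return "".join(CODON_TABLE[c] for c in codons(seq))


def recode(seq, preferred=PREFERRED):
    """Replace each codon with the host-preferred synonymous codon.

    Codons whose amino acid has no entry in `preferred` are left unchanged,
    so the encoded protein is always preserved.
    """
    return "".join(preferred.get(CODON_TABLE[c], c) for c in codons(seq))


def preferred_fraction(seq, preferred=PREFERRED):
    """Fraction of scorable codons that already match the preferred codon."""
    scorable = [c for c in codons(seq) if CODON_TABLE[c] in preferred]
    if not scorable:
        return 0.0
    hits = sum(c == preferred[CODON_TABLE[c]] for c in scorable)
    return hits / len(scorable)


if __name__ == "__main__":
    natural = "ATGTTAAGAGGAAAGTAG"  # made-up sequence encoding M-L-R-G-K-stop
    optimized = recode(natural)
    print(translate(natural) == translate(optimized))  # protein is unchanged
    print(preferred_fraction(natural), preferred_fraction(optimized))
```

A sequence whose codons match the host's preferences almost uniformly is the kind of signal that could suggest codon optimization. A single score like this is only a toy feature, whereas the technique in this paper combines codon theory with machine learning.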