1
Omachi Y, Saito N, Furusawa C. Rare-event sampling analysis uncovers the fitness landscape of the genetic code. PLoS Comput Biol 2023; 19:e1011034. [PMID: 37068098 PMCID: PMC10138212 DOI: 10.1371/journal.pcbi.1011034] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/16/2022] [Revised: 04/27/2023] [Accepted: 03/16/2023] [Indexed: 04/18/2023]
Abstract
The genetic code refers to a rule that maps 64 codons to 20 amino acids. Nearly all organisms, with few exceptions, share the same genetic code, the standard genetic code (SGC). While it remains unclear why this universal code has arisen and been maintained during evolution, it may have been preserved under selection pressure. Theoretical studies comparing the SGC and numerically created hypothetical random genetic codes have suggested that the SGC has been subject to strong selection pressure for being robust against translation errors. However, these prior studies have searched for random genetic codes in only a small subspace of the possible code space due to limitations in computation time. Thus, how the genetic code has evolved, and the characteristics of the genetic code fitness landscape, remain unclear. By applying multicanonical Monte Carlo, an efficient rare-event sampling method, we sampled random codes from a much broader random ensemble of genetic codes than in previous studies, estimating that only one out of every 10^20 random codes is more robust than the SGC. This estimate is significantly smaller than the previous estimate, one in a million. We also characterized the fitness landscape of the genetic code that has four major fitness peaks, one of which includes the SGC. Furthermore, genetic algorithm analysis revealed that evolution under such a multi-peaked fitness landscape could be strongly biased toward a narrow peak, in an evolutionary path-dependent manner.
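The comparison of the SGC against random codes described in this abstract can be illustrated with a minimal sketch. It uses plain uniform sampling, not the multicanonical Monte Carlo method the authors applied. The cost function (mean squared change in Kyte-Doolittle hydropathy over single-nucleotide substitutions), the ensemble (permuting the 20 amino acids among the SGC's synonymous codon blocks), and all function names are assumptions chosen for illustration and are not taken from the paper.

```python
import random
from itertools import product

BASES = "TCAG"
# Standard genetic code in the conventional TCAG ordering ("*" marks stop codons).
SGC_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]
SGC = dict(zip(CODONS, SGC_AA))

# Kyte-Doolittle hydropathy index (assumed here as the amino acid property).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def code_cost(code):
    """Mean squared hydropathy change over all single-nucleotide
    substitutions between sense codons (lower means more robust)."""
    total, count = 0.0, 0
    for codon, aa in code.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                mutant_aa = code[mutant]
                if mutant_aa == "*":
                    continue  # substitutions to stop codons are ignored here
                total += (HYDROPATHY[aa] - HYDROPATHY[mutant_aa]) ** 2
                count += 1
    return total / count


def shuffled_code(rng):
    """Random code that keeps the SGC block structure and stop codons
    but permutes which amino acid is assigned to which block."""
    amino_acids = sorted(set(SGC_AA) - {"*"})
    permuted = amino_acids[:]
    rng.shuffle(permuted)
    mapping = dict(zip(amino_acids, permuted))
    mapping["*"] = "*"
    return {codon: mapping[aa] for codon, aa in SGC.items()}


def fraction_better(n_samples, seed=0):
    """Fraction of sampled random codes with a lower cost than the SGC."""
    rng = random.Random(seed)
    sgc_cost = code_cost(SGC)
    better = sum(code_cost(shuffled_code(rng)) < sgc_cost for _ in range(n_samples))
    return better / n_samples


if __name__ == "__main__":
    print("SGC cost:", code_cost(SGC))
    print("fraction of random codes better than SGC:", fraction_better(10_000))
```

Uniform sampling of this kind can only resolve fractions down to roughly 1/n_samples, which is why an estimate of a far rarer fraction calls for a rare-event method such as the one named in the abstract.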
Affiliation(s)
- Yuji Omachi
  - Graduate School of Sciences, The University of Tokyo, Hongo, Tokyo, Japan
- Nen Saito
  - Graduate School of Integrated Sciences for Life, Hiroshima University, Higashi-Hiroshima City, Hiroshima, Japan
  - Exploratory Research Center on Life and Living Systems, National Institutes of Natural Sciences, Okazaki, Aichi, Japan
  - Universal Biology Institute, The University of Tokyo, Hongo, Tokyo, Japan
- Chikara Furusawa
  - Graduate School of Sciences, The University of Tokyo, Hongo, Tokyo, Japan
  - Universal Biology Institute, The University of Tokyo, Hongo, Tokyo, Japan
  - Center for Biosystems Dynamics Research, RIKEN, Suita, Osaka, Japan

2
Wang X, Dong Q, Chen G, Zhang J, Liu Y, Cai Y. Frameshift and wild-type proteins are often highly similar because the genetic code and genomes were optimized for frameshift tolerance. BMC Genomics 2022; 23:416. [PMID: 35655139 PMCID: PMC9164415 DOI: 10.1186/s12864-022-08435-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2021] [Accepted: 03/02/2022] [Indexed: 11/10/2022]
Abstract
Frameshift mutations have been considered of significant importance for the molecular evolution of proteins and their coding genes, while frameshift protein sequences encoded in the alternative reading frames of coding genes have been considered to be meaningless. However, functional frameshifts have been found to be widespread. It was puzzling how a frameshift protein kept its structure and functionality while substantial changes occurred in its primary amino-acid sequence. This study shows that the similarities among frameshifts and wild types are higher than random similarities and are determined at different levels. Frameshift substitutions are more conservative than random substitutions in the standard genetic code (SGC). The frameshift substitution score of the SGC ranks in the top 2.0-3.5% of alternative genetic codes, showing that the SGC is nearly optimal for frameshift tolerance. In many genes and certain genomes, frameshift-resistant codons and codon pairs appear more frequently than expected, suggesting that frameshift tolerance is achieved through not only the optimality of the genetic code but, more importantly, the further optimization of a specific gene or genome through the usage of codons and codon pairs, which sheds light on the role of frameshift mutations in molecular and genomic evolution.

3
Abstract
Selection for resource conservation can shape the coding sequences of organisms living in nutrient-limited environments. Recently, it was proposed that selection for resource conservation, specifically for nitrogen and carbon content, has also shaped the structure of the standard genetic code, such that the missense mutations the code allows tend to cause small increases in the number of nitrogen and carbon atoms in amino acids. Moreover, it was proposed that this optimization is not confounded by known optimizations of the standard genetic code, such as for polar requirement or hydropathy. We challenge these claims. We show the proposed optimization for nitrogen conservation is highly sensitive to choice of null model and the proposed optimization for carbon conservation is confounded by the known conservative nature of the standard genetic code with respect to the molecular volume of amino acids. There is therefore little evidence the standard genetic code is optimized for resource conservation. We discuss our findings in the context of null models of the standard genetic code.
Affiliation(s)
- Hana Rozhoňová
  - Institute of Integrative Biology, ETH Zürich, Zürich, Switzerland
  - Swiss Institute of Bioinformatics, Quartier UNIL-Sorge, Lausanne, Switzerland
- Joshua L Payne
  - Institute of Integrative Biology, ETH Zürich, Zürich, Switzerland
  - Swiss Institute of Bioinformatics, Quartier UNIL-Sorge, Lausanne, Switzerland

4
Schmidt M, Kubyshkin V. How To Quantify a Genetic Firewall? A Polarity-Based Metric for Genetic Code Engineering. Chembiochem 2021; 22:1268-1284. [PMID: 33231343 PMCID: PMC8049029 DOI: 10.1002/cbic.202000758] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/05/2020] [Revised: 11/20/2020] [Indexed: 12/14/2022]
Abstract
Genetic code engineering aims to produce organisms that translate genetic information in a different way from that prescribed by the standard genetic code. This endeavor could eventually lead to genetic isolation, where an organism that operates under a different genetic code will not be able to exchange functional genes with other living species, thereby standing behind a genetic firewall. It is not clear, however, how distinct the code should be, or how to measure the distance. We have developed a metric (Δcode) where we assigned polarity indices (clog D7) to amino acids to calculate the distances between pairs of genetic codes. We then calculated the distances within a set of 204 genetic codes, including the 24 known distinct natural codes, 11 extreme-distance codes created computationally, nine theoretical special-purpose codes from the literature and 160 codes in which canonical amino acids were replaced by noncanonical chemical analogues. The metric can be used for building strategies towards creating semantically alienated organisms, and testing the strength of genetic firewalls. This metric provides the basis for a map of the genetic codes that could guide future efforts towards novel biochemical worlds, biosafety and deep barcoding applications.
Affiliation(s)
- Vladimir Kubyshkin
  - Department of Chemistry, University of Manitoba, Dysart Road 144, Winnipeg, R3T 2N2, Canada

5
Yan Y, Maurer-Alcalá XX, Knight R, Kosakovsky Pond SL, Katz LA. Single-Cell Transcriptomics Reveal a Correlation between Genome Architecture and Gene Family Evolution in Ciliates. mBio 2019; 10:e02524-19. [PMID: 31874915 PMCID: PMC6935857 DOI: 10.1128/mbio.02524-19] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 10/11/2019] [Accepted: 10/30/2019] [Indexed: 12/17/2022]
Abstract
Ciliates, a eukaryotic clade that is over 1 billion years old, are defined by division of genome function between transcriptionally inactive germline micronuclei and functional somatic macronuclei. To date, most analyses of gene family evolution have been limited to cultivable model lineages (e.g., Tetrahymena, Paramecium, Oxytricha, and Stylonychia). Here, we focus on the uncultivable Karyorelictea and its understudied sister class Heterotrichea, which represent two extremes in genome architecture. Somatic macronuclei within the Karyorelictea are described as nearly diploid, while the Heterotrichea have hyperpolyploid somatic genomes. Previous analyses indicate that genome architecture impacts ciliate gene family evolution as the most diverse and largest gene families are found in lineages with extensively processed somatic genomes (i.e., possessing thousands of gene-sized chromosomes). To further assess ciliate gene family evolution, we analyzed 43 single-cell transcriptomes from 33 ciliate species representing 10 classes. Focusing on conserved eukaryotic genes, we use estimates of transcript diversity as a proxy for the number of paralogs in gene families among four focal clades: Karyorelictea, Heterotrichea, extensive fragmenters (with gene-size somatic chromosomes), and non-extensive fragmenters (with more traditional somatic chromosomes), the latter two within the subphylum Intramacronucleata. 
Our results show that (i) the Karyorelictea have the lowest average transcript diversity, while Heterotrichea are highest among the four groups; (ii) proteins in Karyorelictea are under the highest functional constraints, and the patterns of selection in ciliates may reflect genome architecture; and (iii) stop codon reassignments vary among members of the Heterotrichea and Spirotrichea but are conserved in other classes. IMPORTANCE To further our understanding of genome evolution in eukaryotes, we assess the relationship between patterns of molecular evolution within gene families and variable genome structures found among ciliates. We combine single-cell transcriptomics with bioinformatic tools, focusing on understudied and uncultivable lineages selected from across the ciliate tree of life. Our analyses show that genome architecture correlates with patterns of protein evolution as lineages with more canonical somatic genomes, such as the class Karyorelictea, have more conserved patterns of molecular evolution compared to other classes. This study showcases the power of single-cell transcriptomics for investigating genome architecture and evolution in uncultivable microbial lineages and provides transcriptomic resources for further research on genome evolution.
Affiliation(s)
- Ying Yan
  - Smith College, Department of Biological Sciences, Northampton, Massachusetts, USA
- Xyrus X Maurer-Alcalá
  - Smith College, Department of Biological Sciences, Northampton, Massachusetts, USA
  - University of Massachusetts Amherst, Program in Organismic and Evolutionary Biology, Amherst, Massachusetts, USA
- Rob Knight
  - University of California San Diego, Department of Pediatrics, San Diego, California, USA
  - University of California San Diego, Department of Computer Science and Engineering, San Diego, California, USA
  - University of California San Diego, Center for Microbiome Innovation, San Diego, California, USA
- Sergei L Kosakovsky Pond
  - Temple University, Institute for Genomics and Evolutionary Medicine, Philadelphia, Pennsylvania, USA
- Laura A Katz
  - Smith College, Department of Biological Sciences, Northampton, Massachusetts, USA
  - University of Massachusetts Amherst, Program in Organismic and Evolutionary Biology, Amherst, Massachusetts, USA

6
Schmidt M. A metric space for semantic containment: Towards the implementation of genetic firewalls. Biosystems 2019; 185:104015. [PMID: 31408698 DOI: 10.1016/j.biosystems.2019.104015] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 07/17/2019] [Revised: 08/06/2019] [Accepted: 08/08/2019] [Indexed: 12/13/2022]
Abstract
Analysing or engineering the genetic code has mainly been considered as an approach to reduce or increase the mutational robustness of the genetic code, i.e. the error tolerance in DNA mutations, or to enable the incorporation of non-canonical amino acids. The approach of "semantic containment", however, is less interested in altering the mutational tolerance of the standard code than in creating synthetic alternative genetic codes that limit or altogether impede horizontal gene transfer between natural and genomically recoded organisms (GROs). A major claim or conjecture of semantic containment is: "the farther, the safer", meaning, the less similarity there is between two codes, the less chance of a horizontal gene transfer, and the stronger the genetic firewall. So far, no metrics were available to measure and quantify the "genetic distance" between different genetic codes. Such a metric, however, is paramount to allow the experimental testing and evaluation of the validity of semantic biocontainment for the first time. Here, we introduce a metric space to measure exactly the distance (dissimilarity) between different genetic codes, in order to provide a framework to evaluate the relation between distance and strength of a genetic firewall. Results are presented that incorporate bespoke metrics when producing alternative genetic codes according to predefined goals, specifications and limitations. Finally, as an outlook, implications and challenges for genetic firewall(s) are discussed for dual- and multi-code systems.
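The idea of a distance between genetic codes can be made concrete with a small sketch. The abstract does not state which metric the paper defines, so the Hamming distance over the 64 codon assignments is used here purely as a stand-in: it is one of the simplest functions that satisfies the metric axioms. The helper names are hypothetical, and the second code is the vertebrate mitochondrial code, which differs from the standard code in four codons.

```python
from itertools import product

BASES = "TCAG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]
# Standard genetic code in the conventional TCAG ordering ("*" marks stop codons).
SGC_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def make_code(aa_string):
    """Build a codon -> amino acid table from a 64-character string."""
    return dict(zip(CODONS, aa_string))


def with_reassignments(code, changes):
    """Return a copy of a code with some codons reassigned."""
    new_code = dict(code)
    new_code.update(changes)
    return new_code


def hamming_distance(code_a, code_b):
    """Number of codons whose meaning differs between two codes.
    An assumed stand-in metric, not the one defined in the paper."""
    return sum(code_a[c] != code_b[c] for c in CODONS)


sgc = make_code(SGC_AA)
# Vertebrate mitochondrial code: AGA and AGG are stops, ATA is Met, TGA is Trp.
vert_mito = with_reassignments(sgc, {"AGA": "*", "AGG": "*", "ATA": "M", "TGA": "W"})

if __name__ == "__main__":
    print(hamming_distance(sgc, vert_mito))  # 4
```

A count of differing codons treats every reassignment as equally severe. A metric intended to predict firewall strength would presumably also weight how different the exchanged amino acids are.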

7
Genetic codes optimized as a traveling salesman problem. PLoS One 2019; 14:e0224552. [PMID: 31658301 PMCID: PMC6816573 DOI: 10.1371/journal.pone.0224552] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 03/26/2019] [Accepted: 10/16/2019] [Indexed: 11/19/2022]
Abstract
The Standard Genetic Code (SGC) is robust to mutational errors such that frequently occurring mutations minimally alter the physio-chemistry of amino acids. The apparent correlation between the evolutionary distances among codons and the physio-chemical distances among their cognate amino acids suggests an early co-diversification between the codons and amino acids. Here we formulated the co-minimization of evolutionary distances between codons and physio-chemical distances between amino acids as a Traveling Salesman Problem (TSP) and solved it with a Hopfield neural network. In this unsupervised learning algorithm, macromolecules (e.g., tRNAs and aminoacyl-tRNA synthetases) associating codons with amino acids were considered biological analogs of Hopfield neurons associating "tour cities" with "tour positions". The Hopfield network efficiently yielded an abundance of genetic codes that were more error-minimizing than SGC and could thus be used to design artificial genetic codes. We further argue that as a self-optimization algorithm, the Hopfield neural network provides a model of origin of SGC and other adaptive molecular systems through evolutionary learning.

8
Optimization of the standard genetic code in terms of two mutation types: Point mutations and frameshifts. Biosystems 2019; 181:44-50. [DOI: 10.1016/j.biosystems.2019.04.012] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 03/18/2019] [Accepted: 04/27/2019] [Indexed: 02/08/2023]

9
Błażej P, Wnętrzak M, Mackiewicz D, Mackiewicz P. The influence of different types of translational inaccuracies on the genetic code structure. BMC Bioinformatics 2019; 20:114. [PMID: 30841864 PMCID: PMC6404327 DOI: 10.1186/s12859-019-2661-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 10/02/2018] [Accepted: 01/29/2019] [Indexed: 12/23/2022]
Abstract
BACKGROUND The standard genetic code is a recipe for assigning unambiguously 21 labels, i.e. amino acids and stop translation signal, to 64 codons. However, at early stages of the translational machinery development, the codons did not have to be read unambiguously and the early genetic codes could have contained some ambiguous assignments of codons to amino acids. Therefore, the goal of this work was to obtain the genetic code structures which could have evolved assuming different types of inaccuracy of the translational machinery starting from unambiguous assignments of codons to amino acids. RESULTS We developed a theoretical model assuming that the level of uncertainty of codon assignments can gradually decrease during the simulations. Since it is postulated that the standard code has evolved to be robust against point mutations and mistranslations, we developed three simulation scenarios assuming that such errors can influence one, two or three codon positions. The simulated codes were selected using the evolutionary algorithm methodology to decrease coding ambiguity and increase their robustness against mistranslation. CONCLUSIONS The results indicate that the typical codon block structure of the genetic code could have evolved to decrease the ambiguity of amino acid to codon assignments and to increase the fidelity of reading the genetic information. However, the robustness to errors was not the decisive factor that influenced the genetic code evolution because it is possible to find theoretical codes that minimize the reading errors better than the standard genetic code.
Affiliation(s)
- Paweł Błażej
  - Department of Genomics, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, 50-383 Poland
- Małgorzata Wnętrzak
  - Department of Genomics, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, 50-383 Poland
- Dorota Mackiewicz
  - Department of Genomics, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, 50-383 Poland
- Paweł Mackiewicz
  - Department of Genomics, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, 50-383 Poland

10
Many alternative and theoretical genetic codes are more robust to amino acid replacements than the standard genetic code. J Theor Biol 2019; 464:21-32. [DOI: 10.1016/j.jtbi.2018.12.030] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 10/01/2018] [Revised: 12/17/2018] [Accepted: 12/19/2018] [Indexed: 02/07/2023]

11
Wnętrzak M, Błażej P, Mackiewicz D, Mackiewicz P. The optimality of the standard genetic code assessed by an eight-objective evolutionary algorithm. BMC Evol Biol 2018; 18:192. [PMID: 30545289 PMCID: PMC6293558 DOI: 10.1186/s12862-018-1304-0] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 12/15/2017] [Accepted: 11/22/2018] [Indexed: 02/07/2023]
Abstract
BACKGROUND The standard genetic code (SGC) is a unique set of rules that assigns amino acids to codons. Similar amino acids tend to have similar codons, indicating that the code evolved to minimize the costs of amino acid replacements in proteins, caused by mutations or translational errors. However, if such optimization in fact occurred, many different properties of amino acids must have been taken into account during the code evolution. Therefore, this problem can be reformulated as a multi-objective optimization task, in which the selection constraints are represented by measures based on various amino acid properties. RESULTS To study the optimality of the SGC we applied a multi-objective evolutionary algorithm and we used the representatives of eight clusters, which grouped over 500 indices describing various physicochemical properties of amino acids. This allowed us to avoid an arbitrary choice of amino acid features as optimization criteria. As a consequence, we were able to conduct a more general study on the properties of the SGC than the ones presented so far in other papers on this topic. We considered two models of the genetic code, one preserving the characteristic codon block structure of the SGC and the other without this restriction. The results revealed that the SGC could be significantly improved in terms of error minimization and is therefore not fully optimized. Its structure differs significantly from the structure of the codes optimized to minimize the costs of amino acid replacements. On the other hand, using newly defined quality measures that placed the SGC in the global space of theoretical genetic codes, we showed that the SGC is definitely closer to the codes that minimize the costs of amino acid replacements than those maximizing them. CONCLUSIONS The standard genetic code most likely represents only a partially optimized system, which emerged under the influence of many different factors. Our findings can be useful to researchers involved in modifying the genetic code of living organisms and designing artificial ones.
Affiliation(s)
- Małgorzata Wnętrzak
  - Department of Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, 50-383, Wrocław, Poland
- Paweł Błażej
  - Department of Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, 50-383, Wrocław, Poland
- Dorota Mackiewicz
  - Department of Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, 50-383, Wrocław, Poland
- Paweł Mackiewicz
  - Department of Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, 50-383, Wrocław, Poland

12
Błażej P, Wnętrzak M, Mackiewicz D, Mackiewicz P. Optimization of the standard genetic code according to three codon positions using an evolutionary algorithm. PLoS One 2018; 13:e0201715. [PMID: 30092017 PMCID: PMC6084934 DOI: 10.1371/journal.pone.0201715] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 03/12/2018] [Accepted: 07/21/2018] [Indexed: 12/28/2022]
Abstract
Many biological systems are typically examined from the point of view of adaptation to certain conditions or requirements. One such system is the standard genetic code (SGC), which generally minimizes the cost of amino acid replacements resulting from mutations or mistranslations. However, no full consensus has been reached on the factors that caused the evolution of this feature. One of the hypotheses suggests that code optimality was directly selected as an advantage to preserve information about encoded proteins. An important feature that should be considered when studying the SGC is the different roles of the three codon positions. Therefore, we investigated the robustness of this code regarding the cost of amino acid replacements resulting from substitutions in these positions separately and the sum of these costs. We applied a modified evolutionary algorithm and included four models of the genetic code assuming various restrictions on its structure. The SGC was compared both with the codes that minimize the objective function and those that maximize it. This approach allowed us to place the SGC in the global space of possible codes, which is a more appropriate and unbiased comparison than that with randomly generated codes because they are characterized by relatively uniform amino acid assignments to codons. The SGC appeared to be well optimized at the global scale, but its individual positions were not fully optimized because there were codes that were optimized for only one codon position and simultaneously outperformed the SGC at the other positions. We also found that different code structures may lead to the same optimality and that random codes can show a tendency to minimize costs under some of the genetic code models. Our results suggest that the optimality of SGC could be a by-product of other processes.
Affiliation(s)
- Paweł Błażej
  - Department of Genomics, Faculty of Biotechnology, University of Wrocław, Wrocław, Poland
- Małgorzata Wnętrzak
  - Department of Genomics, Faculty of Biotechnology, University of Wrocław, Wrocław, Poland
- Dorota Mackiewicz
  - Department of Genomics, Faculty of Biotechnology, University of Wrocław, Wrocław, Poland
- Paweł Mackiewicz
  - Department of Genomics, Faculty of Biotechnology, University of Wrocław, Wrocław, Poland

13
de Oliveira LL, Freitas AA, Tinós R. Multi-objective genetic algorithms in the study of the genetic code’s adaptability. Inf Sci (N Y) 2018. [DOI: 10.1016/j.ins.2017.10.022] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 02/04/2023]

14
Abstract
The standard genetic code is robust to mutations during transcription and translation. Point mutations are likely to be synonymous or to preserve the chemical properties of the original amino acid. Saturation mutagenesis experiments suggest that in some cases the best-performing mutant requires replacement of more than a single nucleotide within a codon. These replacements are essentially inaccessible to common error-based laboratory engineering techniques that alter a single nucleotide per mutation event, due to the extreme rarity of adjacent mutations. In this theoretical study, we suggest a radical reordering of the genetic code that maximizes the mutagenic potential of single nucleotide replacements. We explore several possible genetic codes that allow a greater degree of accessibility to the mutational landscape and may result in a hyperevolvable organism that could serve as an ideal platform for directed evolution experiments. We then conclude by evaluating the challenges of constructing such recoded organisms and their potential applications within the field of synthetic biology. The conservative nature of the genetic code prevents bioengineers from efficiently accessing the full mutational landscape of a gene via common error-prone methods. Here, we present two computational approaches to generate alternative genetic codes with increased accessibility. These new codes allow mutational transitions to a larger pool of amino acids and with a greater extent of chemical differences, based on a single nucleotide replacement within the codon, thus increasing evolvability both at the single-gene and at the genome levels. Given the widespread use of these techniques for strain and protein improvement, along with more fundamental evolutionary biology questions, the use of recoded organisms that maximize evolvability should significantly improve the efficiency of directed evolution, library generation, and fitness maximization.
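The notion of accessibility discussed in this abstract can be sketched by counting, for each codon of the standard code, how many different amino acids a single nucleotide replacement can reach. This is only an illustrative measure under that assumption. The function names are hypothetical, and the abstract does not say how the authors score accessibility or chemical difference.

```python
from itertools import product

BASES = "TCAG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]
# Standard genetic code in the conventional TCAG ordering ("*" marks stop codons).
SGC_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
SGC = dict(zip(CODONS, SGC_AA))


def accessible_amino_acids(codon, code=SGC):
    """Amino acids reachable from `codon` by changing exactly one
    nucleotide, excluding the current amino acid and stop codons."""
    current = code[codon]
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant_aa = code[codon[:pos] + base + codon[pos + 1:]]
            if mutant_aa not in (current, "*"):
                reachable.add(mutant_aa)
    return reachable


def mean_accessibility(code=SGC):
    """Average number of distinct amino acids reachable per sense codon."""
    sense = [c for c in CODONS if code[c] != "*"]
    return sum(len(accessible_amino_acids(c, code)) for c in sense) / len(sense)


if __name__ == "__main__":
    print(sorted(accessible_amino_acids("ATG")))  # ['I', 'K', 'L', 'R', 'T', 'V']
    print(mean_accessibility())
```

Each codon has nine single-nucleotide neighbours, so nine is the upper bound of this measure. A recoded table could be scored the same way by passing it as the `code` argument.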

15
Santos J, Monteagudo Á. Inclusion of the fitness sharing technique in an evolutionary algorithm to analyze the fitness landscape of the genetic code adaptability. BMC Bioinformatics 2017; 18:195. [PMID: 28347270 PMCID: PMC5369190 DOI: 10.1186/s12859-017-1608-x] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 11/26/2016] [Accepted: 03/16/2017] [Indexed: 11/26/2022]
Abstract
Background The canonical code, although prevailing in complex genomes, is not universal. The canonical genetic code has been shown to be more robust than random codes, but it has not been clearly determined how it evolved towards its current form. The error minimization theory considers the minimization of the adverse effects of point mutations as the main selection factor in the evolution of the code. We have used simulated evolution in a computer to search for optimized codes, which helps to obtain information about the optimization level of the canonical code in its evolution. A genetic algorithm searches for efficient codes in a fitness landscape that corresponds with the adaptability of possible hypothetical genetic codes. The lower the effects of errors or mutations in the codon bases of a hypothetical code, the more efficient or optimal is that code. The inclusion of the fitness sharing technique in the evolutionary algorithm allows the extent to which the canonical genetic code is in an area corresponding to a deep local minimum to be easily determined, even in the high dimensional spaces considered. Results The analyses show that the canonical code is not in a deep local minimum and that the fitness landscape is not a multimodal fitness landscape with deep and separated peaks. Moreover, the canonical code is clearly far away from the areas of higher fitness in the landscape. Conclusions Given the absence of deep local minima in the landscape, although the code could evolve and different forces could shape its structure, the nature of the fitness landscape considered in the error minimization theory does not explain why the canonical code ended its evolution in a location which is not an area of a localized deep minimum of the huge fitness landscape.
Affiliation(s)
- José Santos
  - Department of Computer Science, University of A Coruña, Campus de Elviña s/n, A Coruña, 15071, Spain
- Ángel Monteagudo
  - Department of Computer Science, University of A Coruña, Campus de Elviña s/n, A Coruña, 15071, Spain

16
The role of crossover operator in evolutionary-based approach to the problem of genetic code optimization. Biosystems 2016; 150:61-72. [DOI: 10.1016/j.biosystems.2016.08.008] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 03/16/2016] [Revised: 05/20/2016] [Accepted: 08/11/2016] [Indexed: 11/17/2022]

17
Gardini S, Cheli S, Baroni S, Di Lascio G, Mangiavacchi G, Micheletti N, Monaco CL, Savini L, Alocci D, Mangani S, Niccolai N. On Nature's Strategy for Assigning Genetic Code Multiplicity. PLoS One 2016; 11:e0148174. [PMID: 26849571 PMCID: PMC4746209 DOI: 10.1371/journal.pone.0148174] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 10/07/2015] [Accepted: 01/13/2016] [Indexed: 11/26/2022]
Abstract
Genetic code redundancy would yield, on average, the assignment of three codons for each of the natural amino acids. The fact that this number is observed only for Ile incorporation and for stopping RNA translation still awaits an overall explanation. Through a structural bioinformatics approach, the wealth of information stored in the Protein Data Bank has been used here to look for unambiguous clues to decipher the rationale of the standard genetic code (SGC) in assigning from one to six different codons for amino acid translation. Leu and Arg, both protected from translational errors by six codons, offer the clearest clue by appearing as the most abundant amino acids in protein-protein and protein-nucleic acid interfaces. Other SGC hidden messages have been sought by analyzing, in a protein structure framework, the roles of over- and under-protected amino acids.
Affiliation(s)
- Simone Gardini
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
- Sara Cheli
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
- Silvia Baroni
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
- Gabriele Di Lascio
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
- Guido Mangiavacchi
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
- Nicholas Micheletti
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
- Carmen Luigia Monaco
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
- Lorenzo Savini
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
- Davide Alocci
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
- Stefano Mangani
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
- Neri Niccolai
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
|
18
|
Kumar B, Saini S. Analysis of the optimality of the standard genetic code. Molecular BioSystems 2016; 12:2642-51. [DOI: 10.1039/c6mb00262e] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 11/21/2022]
Abstract
Many theories have been proposed attempting to explain the origin of the genetic code. In this work, we compare the performance of the standard genetic code against millions of randomly generated codes. The comparison covers the ability of genetic codes to encode additional information and their robustness to frameshift mutations.
Affiliation(s)
- Balaji Kumar
- Department of Chemical Engineering, Indian Institute of Technology Bombay, Mumbai 400 076, India
- Supreet Saini
- Department of Chemical Engineering, Indian Institute of Technology Bombay, Mumbai 400 076, India
|
19
|
de Oliveira LL, de Oliveira PSL, Tinós R. A multiobjective approach to the genetic code adaptability problem. BMC Bioinformatics 2015; 16:52. [PMID: 25879480 PMCID: PMC4341243 DOI: 10.1186/s12859-015-0480-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 03/27/2014] [Accepted: 01/27/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The organization of the canonical code has intrigued researchers since it was first described. If we consider all codes mapping the 64 codons into 20 amino acids and one stop codon, there are more than 1.51×10^84 possible genetic codes. The main question related to the organization of the genetic code is why exactly the canonical code was selected among this huge number of possible genetic codes. Many researchers argue that the organization of the canonical code is a product of natural selection and that the code's robustness against mutations would support this hypothesis. In order to investigate the natural selection hypothesis, some researchers employ optimization algorithms to identify regions of the genetic code space where the best codes, according to a given evaluation function, can be found (engineering approach). The optimization process uses only one objective to evaluate the codes, generally based on the robustness for an amino acid property. Only one objective is also employed in the statistical approach for the comparison of the canonical code with random codes. We propose a multiobjective approach where two or more objectives are considered simultaneously to evaluate the genetic codes.
RESULTS In order to test our hypothesis that the multiobjective approach is useful for the analysis of the genetic code adaptability, we implemented a multiobjective optimization algorithm where two objectives are simultaneously optimized. Using as objectives the robustness against mutation with the amino acid property polar requirement (objective 1) and robustness with respect to hydropathy index or molecular volume (objective 2), we found solutions closer to the canonical genetic code in terms of robustness, when compared with the results using only one objective reported by other authors.
CONCLUSIONS Using more objectives, more optimal solutions are obtained and, as a consequence, more information can be used to investigate the adaptability of the genetic code. The multiobjective approach is also more natural, because more than one objective was adapted during the evolutionary process of the canonical genetic code. Our results suggest that the evaluation function employed to compare genetic codes should consider simultaneously more than one objective, in contrast to what has been done in the literature.
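The figure of more than 1.51×10^84 possible codes quoted in the BACKGROUND can be reproduced with a counting argument. The sketch below is not from the cited paper; it assumes that a "possible code" means an assignment of 64 codons to 21 meanings (20 amino acids plus stop) in which every meaning is used at least once, and counts such assignments by inclusion-exclusion. The function name `count_surjections` is hypothetical.

```python
from math import comb

def count_surjections(n_codons: int = 64, n_meanings: int = 21) -> int:
    """Count maps from codons onto meanings that use every meaning at least once.

    Inclusion-exclusion: subtract maps that miss at least one meaning,
    add back those that miss at least two, and so on.
    """
    return sum(
        (-1) ** k * comb(n_meanings, k) * (n_meanings - k) ** n_codons
        for k in range(n_meanings + 1)
    )

total = count_surjections()
unrestricted = 21 ** 64  # all maps, including ones that leave a meaning unused

print(f"codes using all 21 meanings: {total:.3e}")
print(f"all codon-to-meaning maps:   {unrestricted:.3e}")
```

Python integers are arbitrary precision, so the sum is exact; only the printed form is rounded. Under the stated assumption the count is on the order of 10^84, a little over a third of the unrestricted 21^64 maps.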
Affiliation(s)
- Renato Tinós
- Department of Computing and Mathematics, University of São Paulo, Ribeirão Preto, Brazil.
|