1
Di Giulio M. Theories of the origin of the genetic code: Strong corroboration for the coevolution theory. Biosystems 2024; 239:105217. [PMID: 38663520 DOI: 10.1016/j.biosystems.2024.105217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Revised: 04/16/2024] [Accepted: 04/18/2024] [Indexed: 04/29/2024]
Abstract
I analyzed all the theories and models of the origin of the genetic code, and over the years, I have considered the main suggestions that could explain this origin. The conclusion of this analysis is that the coevolution theory of the origin of the genetic code is the theory that best captures the majority of observations concerning the organization of the genetic code. In other words, the biosynthetic relationships between amino acids would have heavily influenced the origin of the organization of the genetic code, as supported by the coevolution theory. Instead, the presence in the genetic code of physicochemical properties of amino acids, which have also been linked to the physicochemical properties of anticodons or codons or bases by stereochemical and physicochemical theories, would simply be the result of natural selection. More explicitly, I maintain that these correlations between codons, anticodons or bases and amino acids are in fact the result not of a real correlation between amino acids and codons, for example, but are only the effect of the intervention of natural selection. Specifically, in the genetic code table we expect, for example, that the most similar codons - that is, those that differ by only one base - will have more similar physicochemical properties. Therefore, the 64 codons of the genetic code table ordered in a certain way would also represent an ordering of some of their physicochemical properties. Now, a study aimed at clarifying which physicochemical property of amino acids has influenced the allocation of amino acids in the genetic code has established that the partition energy of amino acids has played a decisive role in this. Indeed, under some conditions, the genetic code was found to be approximately 98% optimized on its columns. In this same work, it was shown that this was most likely the result of the action of natural selection.
If natural selection had truly allocated the amino acids in the genetic code in such a way that similar amino acids also have similar codons - this, not through a mechanism of physicochemical interaction between, for example, codons and amino acids - then it might turn out that even different physicochemical properties of codons (or anticodons or bases) show some correlation with the physicochemical properties of amino acids, simply because the partition energy of amino acids is correlated with other physicochemical properties of amino acids. It is very likely that this would inevitably lead to a correlation between codons (or anticodons or bases) and amino acids. In other words, since the codons (anticodons or bases) are ordered in the genetic code, that is to say, some of their physicochemical properties should also be ordered by a similar order, and given that the amino acids would also appear to have been ordered in the genetic code by natural selection, then it should inevitably turn out that there is a correlation between, for example, the hydrophobicity of anticodons and that of amino acids. Instead, the intervention of natural selection in organizing the genetic code would appear to be highly compatible with the main mechanism of structuring the genetic code as supported by the coevolution theory. This would make the coevolution theory the only plausible explanation for the origin of the genetic code.
Affiliation(s)
- Massimo Di Giulio
- The Ionian School, Early Evolution of Life Department, Genetic Code and tRNA Origin Laboratory, Via Roma 19, 67030, Alfedena, L'Aquila, Italy.
2
Katoh T, Suga H. A comprehensive analysis of translational misdecoding pattern and its implication on genetic code evolution. Nucleic Acids Res 2023; 51:10642-10652. [PMID: 37638759 PMCID: PMC10602915 DOI: 10.1093/nar/gkad707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2023] [Revised: 07/19/2023] [Accepted: 08/19/2023] [Indexed: 08/29/2023] Open
Abstract
The universal genetic code is comprised of 61 sense codons, which are assigned to 20 canonical amino acids. However, the evolutionary basis for the highly conserved mapping between amino acids and their codons remains incompletely understood. A possible selective pressure of evolution would be minimization of deleterious effects caused by misdecoding. Here we comprehensively analyzed the misdecoding pattern of 61 codons against 19 noncognate amino acids where an arbitrary amino acid was omitted, and revealed the following two rules. (i) If the second codon base is U or C, misdecoding is frequently induced by mismatches at the first and/or third base, where any mismatches are widely tolerated; whereas misdecoding with the second-base mismatch is promoted by only U-G or C-A pair formation. (ii) If the second codon base is A or G, misdecoding is promoted by only G-U or U-G pair formation at the first or second position. In addition, evaluation of functional/structural diversities of amino acids revealed that less diverse amino acid sets are assigned at codons that induce more frequent misdecoding, and vice versa, so as to minimize deleterious effects of misdecoding in the modern genetic code.
Affiliation(s)
- Takayuki Katoh
- Department of Chemistry, Graduate School of Science, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
- Hiroaki Suga
- Department of Chemistry, Graduate School of Science, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
3
Wills PR. Origins of Genetic Coding: Self-Guided Molecular Self-Organisation. ENTROPY (BASEL, SWITZERLAND) 2023; 25:1281. [PMID: 37761580 PMCID: PMC10527755 DOI: 10.3390/e25091281] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/10/2023] [Revised: 08/22/2023] [Accepted: 08/28/2023] [Indexed: 09/29/2023]
Abstract
The origin of genetic coding is characterised as an event of cosmic significance in which quantum mechanical causation was transcended by constructive computation. Computational causation entered the physico-chemical processes of the pre-biotic world by the incidental satisfaction of a condition of reflexivity between polymer sequence information and system elements able to facilitate their own production through translation of that information. This event, which has previously been modelled in the dynamics of Gene-Replication-Translation systems, is properly described as a process of self-guided self-organisation. The spontaneous emergence of a primordial genetic code between two-letter alphabets of nucleotide triplets and amino acids is easily possible, starting with random peptide synthesis that is RNA-sequence-dependent. The evident self-organising mechanism is the simultaneous quasi-species bifurcation of the populations of information-carrying genes and enzymes with aminoacyl-tRNA synthetase-like activities. This mechanism allowed the code to evolve very rapidly to the ~20 amino acid limit apparent for the reflexive differentiation of amino acid properties using protein catalysts. The self-organisation of semantics in this domain of physical chemistry conferred on emergent molecular biology exquisite computational control over the nanoscopic events needed for its self-construction.
Affiliation(s)
- Peter R Wills
- Department of Physics, University of Auckland, Auckland PB 92019, New Zealand
4
Omachi Y, Saito N, Furusawa C. Rare-event sampling analysis uncovers the fitness landscape of the genetic code. PLoS Comput Biol 2023; 19:e1011034. [PMID: 37068098 PMCID: PMC10138212 DOI: 10.1371/journal.pcbi.1011034] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/16/2022] [Revised: 04/27/2023] [Accepted: 03/16/2023] [Indexed: 04/18/2023] Open
Abstract
The genetic code refers to a rule that maps 64 codons to 20 amino acids. Nearly all organisms, with few exceptions, share the same genetic code, the standard genetic code (SGC). While it remains unclear why this universal code has arisen and been maintained during evolution, it may have been preserved under selection pressure. Theoretical studies comparing the SGC and numerically created hypothetical random genetic codes have suggested that the SGC has been subject to strong selection pressure for being robust against translation errors. However, these prior studies have searched for random genetic codes in only a small subspace of the possible code space due to limitations in computation time. Thus, how the genetic code has evolved, and the characteristics of the genetic code fitness landscape, remain unclear. By applying multicanonical Monte Carlo, an efficient rare-event sampling method, we efficiently sampled random codes from a much broader random ensemble of genetic codes than in previous studies, estimating that only one out of every 10^20 random codes is more robust than the SGC. This estimate is significantly smaller than the previous estimate, one in a million. We also characterized the fitness landscape of the genetic code that has four major fitness peaks, one of which includes the SGC. Furthermore, genetic algorithm analysis revealed that evolution under such a multi-peaked fitness landscape could be strongly biased toward a narrow peak, in an evolutionary path-dependent manner.
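The comparison described in this abstract, scoring the SGC against randomly generated codes for robustness to errors, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the cost function (mean squared change in Kyte-Doolittle hydropathy over single-base substitutions), the random ensemble (shuffling the 20 amino acids among the SGC's synonymous blocks), and the function names. It also uses plain Monte Carlo sampling rather than the multicanonical method the authors apply, so it cannot resolve very rare events.

```python
import random
from itertools import product

BASES = "TCAG"  # DNA alphabet; T stands in for U
# Standard genetic code in the conventional TCAG ordering; "*" marks a stop
SGC = dict(zip(
    ("".join(c) for c in product(BASES, repeat=3)),
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
))

# Kyte-Doolittle hydropathy index, used here as an assumed stand-in property
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def error_cost(code):
    """Mean squared hydropathy change over all single-base substitutions
    between sense codons (substitutions to or from stops are ignored)."""
    total, count = 0.0, 0
    for codon, aa in code.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                neighbor_aa = code[codon[:pos] + base + codon[pos + 1:]]
                if neighbor_aa == "*":
                    continue
                total += (HYDROPATHY[aa] - HYDROPATHY[neighbor_aa]) ** 2
                count += 1
    return total / count


def shuffled_code(rng):
    """Random code that keeps the SGC block structure and stop codons but
    permutes which amino acid occupies which synonymous block."""
    amino_acids = sorted(HYDROPATHY)
    permuted = amino_acids[:]
    rng.shuffle(permuted)
    relabel = dict(zip(amino_acids, permuted))
    relabel["*"] = "*"
    return {codon: relabel[aa] for codon, aa in SGC.items()}


def fraction_more_robust(n_samples, seed=0):
    """Fraction of sampled random codes with a lower cost than the SGC."""
    rng = random.Random(seed)
    sgc_cost = error_cost(SGC)
    better = sum(error_cost(shuffled_code(rng)) < sgc_cost
                 for _ in range(n_samples))
    return better / n_samples


if __name__ == "__main__":
    print("SGC cost:", error_cost(SGC))
    print("fraction of random codes more robust:", fraction_more_robust(10_000))
```

With naive sampling, the smallest nonzero fraction that can be observed is one over the number of samples, which is the limitation that motivates a rare-event sampling method.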
Affiliation(s)
- Yuji Omachi
- Graduate School of Sciences, The University of Tokyo, Hongo, Tokyo, Japan
- Nen Saito
- Graduate School of Integrated Sciences for Life, Hiroshima University, Higashi-Hiroshima City, Hiroshima, Japan
- Exploratory Research Center on Life and Living Systems, National Institutes of Natural Sciences, Okazaki, Aichi, Japan
- Universal Biology Institute, The University of Tokyo, Hongo, Tokyo, Japan
- Chikara Furusawa
- Graduate School of Sciences, The University of Tokyo, Hongo, Tokyo, Japan
- Universal Biology Institute, The University of Tokyo, Hongo, Tokyo, Japan
- Center for Biosystems Dynamics Research, RIKEN, Suita, Osaka, Japan
5
Fontecilla-Camps JC. Reflections on the Origin and Early Evolution of the Genetic Code. Chembiochem 2023; 24:e202300048. [PMID: 37052530 DOI: 10.1002/cbic.202300048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Revised: 03/01/2023] [Indexed: 04/14/2023]
Abstract
Examination of the genetic code (GeCo) reveals that amino acids coded by (A/U) codons display a large functional spectrum and bind RNA whereas, except for Arg, those coded by (G/C) codons do not. From a stereochemical viewpoint, the clear preference for (A/U)-rich codons to be located at the GeCo half blocks suggests they were specifically determined. Conversely, the overall lower affinity of cognate amino acids for their (G/C)-rich anticodons points to their late arrival to the GeCo. It is proposed that i) initially the code was composed of the eight (A/U) codons; ii) these codons were duplicated when G/C nucleotides were added to their wobble positions, and three new codons with G/C in their first position were incorporated; and iii) a combination of A/U and G/C nucleotides progressively generated the remaining codons.
6
Di Giulio M. The error minimization of the genetic code would have been determined by natural selection and not by a neutral evolution. Biosystems 2023; 224:104838. [PMID: 36657560 DOI: 10.1016/j.biosystems.2023.104838] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Revised: 01/10/2023] [Accepted: 01/15/2023] [Indexed: 01/18/2023]
Abstract
I discuss the mechanisms by which the error minimization observed in the genetic code would have been produced; that is, the ability of the genetic code to buffer, for example, the deleterious effects of translation errors. Here, I analyse whether the error minimization was produced by the intervention of natural selection or whether it is an emergent, that is, neutral property; in other words, whether it is a by-product of another mechanism that was structuring the genetic code. In particular, I criticize Massey's simulations (2008) - favouring the neutral hypothesis - which, containing elements of natural selection, would render his conclusions at least partly tautological. Furthermore, I criticize some of Koonin's (2017) interpretations regarding Massey's simulations. Finally, I criticize the opinion of Janzen et al. (2022) according to which their self-aminoacylating ribozyme system would have been capable of generating an error minimization of the genetic code as its emergent property. That is to say, I criticize, more generally, a neutral origin of error minimization. Indeed, any mechanism for structuring the genetic code would be capable of generating, in theory, such an emergent property. The problem is that to demonstrate this, it would be necessary to show that the level of optimization achieved by the genetic code would be that expected under the neutral hypothesis, which Janzen et al. (2022) did not do. Therefore, their view is only a hypothesis and is very far from being corroborated by their results. Instead, in the literature there is strong evidence that the level of optimization achieved by the genetic code is so high that it would imply, per se, an intervention of natural selection in the origin of error minimization of the genetic code. On the other hand, this level of optimization would be very far from what might have been produced by a neutral process.
Affiliation(s)
- Massimo Di Giulio
- The Ionian School, Early Evolution of Life Department, Genetic Code and tRNA Origin Laboratory, Via Roma 19, 67030, Alfedena, L'Aquila, Italy.
7
Zhao F, Akanuma S. Ancestral Sequence Reconstruction of the Ribosomal Protein uS8 and Reduction of Amino Acid Usage to a Smaller Alphabet. J Mol Evol 2023; 91:10-23. [PMID: 36396786 DOI: 10.1007/s00239-022-10078-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/14/2022] [Accepted: 11/08/2022] [Indexed: 11/19/2022]
Abstract
Understanding the origin and early evolution of proteins is important for unveiling how the RNA world developed into an RNA-protein world. Because the composition of organic molecules in the Earth's primitive environment was plausibly not as diverse as today, the number of different amino acids used in early protein synthesis is likely to be substantially less than the current 20 proteinogenic residues. In this study, we have explored the thermal stability and RNA binding of ancestral variants of the ribosomal protein uS8 constructed from a reduced-alphabet of amino acids. First, we built a phylogenetic tree based on the amino acid sequences of uS8 from multiple extant organisms and used the tree to infer two plausible amino acid sequences corresponding to the last bacterial common ancestor of uS8. Both ancestral proteins were thermally stable and bound to an RNA fragment. By eliminating individual amino acid letters and monitoring thermal stability and RNA binding in the resulting proteins, we reduced the size of the amino acid set constituting one of the ancestral proteins, eventually finding that convergent sequences consisting of 15- or 14-amino acid alphabets still folded into stable structures that bound to the RNA fragment. Furthermore, a simplified variant reconstructed from a 13-amino-acid alphabet retained affinity for the RNA fragment, although it lost conformational stability. Collectively, RNA-binding activity may be achieved with a subset of the current 20 amino acids, raising the possibility of a simpler composition of RNA-binding proteins in the earliest stage of protein evolution.
Affiliation(s)
- Fangzheng Zhao
- Faculty of Human Sciences, Waseda University, 2-579-15, Mikajima, Tokorozawa, Saitama, 359-1192, Japan
- Satoshi Akanuma
- Faculty of Human Sciences, Waseda University, 2-579-15, Mikajima, Tokorozawa, Saitama, 359-1192, Japan.
8
Yarus M. A crescendo of competent coding (c3) contains the Standard Genetic Code. RNA (NEW YORK, N.Y.) 2022; 28:1337-1347. [PMID: 35868841 PMCID: PMC9479743 DOI: 10.1261/rna.079275.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2022] [Accepted: 07/18/2022] [Indexed: 06/15/2023]
Abstract
The Standard Genetic Code (SGC) can arise by fusion of partial codes evolved in different individuals, perhaps for differing prior tasks. Such code fragments can be unified into an SGC after later evolution of accurate third-position Crick wobble. Late wobble advent fills in the coding table, leaving only later development of translational initiation and termination to reach the SGC in separated domains of life. This code fusion mechanism is computationally implemented here. Late Crick wobble after C3 fusion (c3-lCw) is tested for its ability to evolve the SGC. Compared with previously studied isolated coding tables, or with increasing numbers of parallel, but nonfusing codes, c3-lCw reaches the SGC sooner, is successful in a smaller population, and presents accurate and complete codes more frequently. Notably, a long crescendo of SGC-like codes is exposed for selection of superior translation. c3-lCw also effectively suppresses varied disordered assignments, thus converging on a unified code. Such merged codes closely approach the SGC, making its selection plausible. For example: Under routine conditions, ≈1 of 22 c3-lCw environments evolves codes with ≥20 assignments and ≤3 differences from the SGC, notably including codes identical to the Standard Genetic Code.
Affiliation(s)
- Michael Yarus
- Department of Molecular, Cellular and Developmental Biology, University of Colorado, Boulder, Colorado 80309-0347, USA
9
Martínez Giménez JA, Tabares Seisdedos R. A Cofactor-Based Mechanism for the Origin of the Genetic Code. ORIGINS LIFE EVOL B 2022; 52:149-163. [PMID: 36071304 DOI: 10.1007/s11084-022-09628-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2022] [Accepted: 08/02/2022] [Indexed: 11/24/2022]
Abstract
The origin of the genetic code is probably the central problem of the studies on the origin of life. The key question to answer is the molecular mechanism that allows the association of the amino acids with their triplet codons. We proposed that the codon-anticodon duplex located in the acceptor stem of primitive tRNAs would facilitate the chemical reactions required to synthesize cognate amino acids from simple amino acids (glycine, valine, and aspartic acid) linked to the 3' acceptor end. In our view, various nucleotide-A-derived cofactors (with reactive chemical groups) may be attached to the codon-anticodon duplex, which allows group-transferring reactions from cofactors to simple amino acids, thereby producing the final amino acid. The nucleotide-A-derived cofactors could be incorporated into the RNA duplex (helix) by docking Adenosine (cofactor) into the minor groove via an interaction similar to the A-minor motif, forming a base triple between Adenosine and one complementary base pair of the duplex. Furthermore, we propose that this codon-anticodon duplex could initially catalyze a self-aminoacylation reaction with a simple amino acid. Therefore, the sequence of bases in the codon-anticodon duplex would determine the reactions that occurred during the formation of new amino acids for selective binding of nucleotide-A-derived cofactors.
Affiliation(s)
- Rafael Tabares Seisdedos
- Departamento de Medicina, Facultad de Medicina de Valencia, (CIBERSAM; INCLIVA-UV), Universidad de Valencia, Av. Blasco Ibañez 17, 46010, Valencia, Spain.
10
Model of Genetic Code Structure Evolution under Various Types of Codon Reading. Int J Mol Sci 2022; 23:ijms23031690. [PMID: 35163612 PMCID: PMC8835785 DOI: 10.3390/ijms23031690] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2021] [Revised: 01/23/2022] [Accepted: 01/25/2022] [Indexed: 11/28/2022] Open
Abstract
The standard genetic code (SGC) is a set of rules according to which 64 codons are assigned to 20 canonical amino acids and a stop coding signal. As a consequence, the SGC is redundant because there is a greater number of codons than the number of encoded labels. This redundancy implies the existence of codons that encode the same genetic information. The size and organization of such synonymous codon blocks are important characteristics of the SGC structure whose evolution is still unclear. Therefore, we studied possible evolutionary mechanisms of the codon block structure. We conducted computer simulations assuming that coding systems at early stages of the SGC evolution were sets of ambiguous codon assignments with high entropy. We included three types of reading systems characterized by different inaccuracy and pattern of codon recognition. In contrast to the previous study, we allowed for evolution of the reading systems and their competition. The simulations performed under minimization of translational errors and reduction of coding ambiguity produced a coding system resistant to these errors. The reading system similar to that present in the SGC dominated the others very quickly. The surviving system was also characterized by low entropy and possessed properties similar to those of the SGC. Our simulations show that the unambiguous SGC could have emerged from a code with a lower level of ambiguity and that the number of tRNAs increased during the evolution.
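The redundancy described in this abstract, 64 codons mapped onto 21 labels in synonymous blocks of different sizes, can be tabulated directly. The following is a small illustrative sketch, not part of the cited study; the function names are hypothetical, and the code table is written in the DNA alphabet (T in place of U) in the conventional TCAG ordering.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code in the conventional TCAG ordering; "*" marks a stop
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def standard_genetic_code():
    """Map each of the 64 codons to its one-letter label."""
    codons = ["".join(c) for c in product(BASES, repeat=3)]
    return dict(zip(codons, AMINO_ACIDS))


def block_sizes(code):
    """Number of synonymous codons assigned to each label."""
    return Counter(code.values())


def degeneracy_classes(code):
    """Group labels by how many codons encode them."""
    classes = {}
    for label, size in block_sizes(code).items():
        classes.setdefault(size, []).append(label)
    return {size: sorted(labels) for size, labels in sorted(classes.items())}


if __name__ == "__main__":
    code = standard_genetic_code()
    for size, labels in degeneracy_classes(code).items():
        print(f"{size} codon(s): {' '.join(labels)}")
```

Grouping labels by block size makes the uneven degeneracy visible at a glance, including the six-codon amino acids (leucine, arginine, and serine).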
11
Caldararo F, Di Giulio M. The genetic code is very close to a global optimum in a model of its origin taking into account both the partition energy of amino acids and their biosynthetic relationships. Biosystems 2022; 214:104613. [DOI: 10.1016/j.biosystems.2022.104613] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/19/2021] [Revised: 01/16/2022] [Accepted: 01/17/2022] [Indexed: 01/23/2023]
12
Factors in Protobiomonomer Selection for the Origin of the Standard Genetic Code. Acta Biotheor 2021; 69:745-767. [PMID: 34283307 DOI: 10.1007/s10441-021-09420-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2020] [Accepted: 07/01/2021] [Indexed: 10/20/2022]
Abstract
Natural selection of specific protobiomonomers during abiogenic development of the prototype genetic code is hindered by the diversity of structural, spatial, and rotational isomers that have identical elemental composition and molecular mass (M), but can vary significantly in their physicochemical characteristics, such as the melting temperature Tm, the Tm:M ratio, and the solubility in water, due to different positions of atoms in the molecule. These parameters differ between cis- and trans-isomers of dicarboxylic acids, spatial monosaccharide isomers, and structural isomers of α-, β-, and γ-amino acids. The stable planar heterocyclic molecules of the major nucleobases comprise four (C, H, N, O) or three (C, H, N) elements and contain a single -C=C bond and two nitrogen atoms in each heterocycle involved in C-N and C=N bonds. They exist as isomeric resonance hybrids of single and double bonds and as a mixture of tautomer forms due to the presence of -C=O and/or -NH2 side groups. They are thermostable, insoluble in water, and exhibit solid-state stability, which is of central importance for DNA molecules as carriers of genetic information. In M-Tm diagrams, proteinogenic amino acids and the corresponding codons are distributed fairly regularly relative to the distinct clusters of purine and pyrimidine bases, reflecting the correspondence between codons and amino acids that was established in different periods of genetic code development. The body of data on the evolution of the genetic code system indicates that the elemental composition and molecular structure of protobiomonomers, and their M, Tm, photostability, and aqueous solubility determined their selection in the emergence of the standard genetic code.
13
Pawłowski PH. The Codon Usage in the Minimal Natural Cell. ORIGINS LIFE EVOL B 2021; 51:215-230. [PMID: 34694559 DOI: 10.1007/s11084-021-09616-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2021] [Accepted: 08/24/2021] [Indexed: 10/20/2022]
Abstract
A statistical analysis of the variation in contents with the size of the currently known smallest genomes, N. deltocephalinicola, C. ruddii, N. equitans, and M. genitalium, made it possible to identify a minimal set of codons capable of naturally building a modern-type free-living unicellular organism in an early stage of evolution. Using a linear regression model, the potential codon distribution in the minimal natural cell was predicted and compared to the composition of the smallest synthetic genome, JCVI-Syn3.0. The distribution of the molecular weight of potentially coded amino acids was also calculated. The main differences in the features of the minimal natural cell and the H. sapiens genome were analyzed. In this regard, the content percentage of respective amino acids and their polarization charge properties were reported and compared. The fractions of occurring nucleotides were calculated, too. Then, the estimated numbers of codons in a minimal natural cell were related to the expected numbers for random distribution. The observed increase or decrease in content relative to the calculated random filling was related to evolutionary preferences, varying with the subsequent eras of the evolution of the genetic code.
Affiliation(s)
- Piotr H Pawłowski
- Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Warszawa, Poland.
14
The Combinatorial Fusion Cascade to Generate the Standard Genetic Code. Life (Basel) 2021; 11:life11090975. [PMID: 34575125 PMCID: PMC8467831 DOI: 10.3390/life11090975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2021] [Revised: 09/14/2021] [Accepted: 09/14/2021] [Indexed: 11/17/2022] Open
Abstract
A combinatorial fusion cascade was proposed as a transition stage between prebiotic chemistry and early forms of life. The combinatorial fusion cascade consists of three stages: eight initial complementary pairs of amino acids, four protocodes, and the standard genetic code. The initial complementary pairs and the protocodes are divided into dominant and recessive entities. The transitions between these stages obey the same combinatorial fusion rules for all amino acids. The combinatorial fusion cascade mathematically describes the codon assignments in the standard genetic code. It explains the availability of amino acids with even and odd numbers of codons, the appearance of stop codons, the inclusion of novel canonical amino acids, the exceptionally high numbers of codons for the amino acids arginine, leucine, and serine, and the temporal order of amino acid inclusion into the genetic code. The temporal order of amino acids within the cascade is congruent with the consensus temporal order previously derived from the similarities between the available hypotheses. Control over combinatorial fusion cascades would open the road for a novel technology to develop artificial microorganisms.
15
Pawlak K, Wnetrzak M, Mackiewicz D, Mackiewicz P, Błażej P. Models of genetic code structure evolution with variable number of coded labels. Biosystems 2021; 210:104528. [PMID: 34492316 DOI: 10.1016/j.biosystems.2021.104528] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/17/2021] [Revised: 08/26/2021] [Accepted: 08/27/2021] [Indexed: 10/20/2022]
Abstract
It is assumed that at the early stage of cell evolution its translation machinery was characterized by high noise, i.e. ambiguous assignment of codons to amino acids in the genetic code, which initially encoded only a few amino acids. Next, during its evolution new amino acids were added to this code. Taking these facts into account, we investigated theoretical models of the genetic code's structure, which evolved from a set of ambiguous codon assignments into a coding system with a low level of uncertainty. We considered three types of translational inaccuracies assuming a different number of fixed codon positions. We applied a modified version of an evolutionary algorithm to find the genetic codes that most effectively reduced the initial uncertainty in the assignment of codons to encoded labels, i.e. amino acids and a stop translation signal. We examined codes with the number of labels from four to 22. Our results indicated that the quality of genetic code structure is strongly dependent on the number of encoded labels as well as the type of translational mechanism. Stricter assignment of codons to labels was preferred by codes encoding a larger number of labels. The results showed that a smaller degeneracy of codes evolved from a more tolerant coding with the stepwise addition of coded amino acids to the genetic code. The distribution of codon groups in the standard genetic code corresponds well to the translation model assuming two fixed codon positions, whereas the six-codon groups can be relics from previous stages of evolution when the code was characterized by a greater uncertainty.
Affiliation(s)
- Konrad Pawlak
- Department of Bioinformatics and Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, Poland
- Małgorzata Wnetrzak
- Department of Bioinformatics and Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, Poland
- Dorota Mackiewicz
- Department of Bioinformatics and Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, Poland
- Paweł Mackiewicz
- Department of Bioinformatics and Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, Poland
- Paweł Błażej
- Department of Bioinformatics and Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, Poland.
| |
Collapse
|
16
|
Abstract
Minimally evolved codes are constructed here; these have randomly chosen standard genetic code (SGC) triplets, completed with completely random triplet assignments. Such "genetic codes" have not evolved, but retain SGC qualities. Retained qualities are basic, part of the underpinning of coding. For example, the sensitivity of coding to arbitrary assignments, which must be < ∼10%, is intrinsic. Such sensitivity comes from the elementary combinatorial properties of coding and constrains any SGC evolution hypothesis. Similarly, assignment of last-evolved functions is difficult because of late kinetic phenomena, likely common across codes. Census of minimally evolved code assignments shows that shape and size of wobble domains controls the code's fit into a coding table, strongly shifting accuracy of codon assignments. Access to the SGC therefore requires a plausible pathway to limited randomness, avoiding difficult completion while fitting a highly ordered, degenerate code into a preset three-dimensional space. Three-dimensional late Crick wobble in a genetic code assembled by lateral transfer between early partial codes satisfies these varied, simultaneous requirements. By allowing parallel evolution of SGC domains, this origin can yield shortened evolution to SGC-level order and allow the code to arise in smaller populations. It effectively yields full codes. Less obviously, it unifies previously studied chemical, biochemical, and wobble order in amino acid assignment, including a stereochemical minority of triplet-amino acid associations. Finally, fusion of intermediates into the final SGC is credible, mirroring broadly accepted later cellular evolution.
|
17
|
Zolyan S. On the context-sensitive grammar of the genetic code. Biosystems 2021; 208:104497. [PMID: 34352327 DOI: 10.1016/j.biosystems.2021.104497] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/31/2021] [Revised: 07/26/2021] [Accepted: 07/27/2021] [Indexed: 11/29/2022]
Abstract
We address the possibilities of the semiotic description of the genetic information as a dual and self-replicative system of correspondence between its biochemical substance and semiotic form of organization. Combining the principles of contextual dependence and arbitrariness of sign leads to the conclusion that the genetic code's primary elements (nucleotides) can be considered not as biochemical constants but as semiotic or, more precisely, grammatical variables. We suggest describing the genetic code as a language, consisting of 1) units of the alphabet; 2) a vocabulary that includes meaningful items and the rules of correspondence between units of different levels; 3) context-sensitive grammar - a system of rules for the formation of units based on abstract grammatical categories.
Affiliation(s)
- Suren Zolyan
- Immanuel Kant Baltic Federal University, Kaliningrad, Russia; Institute of Scientific Information on Social Sciences of the Russian Academy of Sciences, Moscow, Russia; Institute of Philosophy, Sociology, and Law, National Academy of Sciences of the Republic of Armenia, Yerevan, Armenia.
|
18
|
Martínez-Giménez JA, Tabares-Seisdedos R. Possible Ancestral Functions of the Genetic and RNA Operational Precodes and the Origin of the Genetic System. ORIGINS LIFE EVOL B 2021; 51:167-183. [PMID: 34097191 DOI: 10.1007/s11084-021-09610-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/12/2021] [Accepted: 05/17/2021] [Indexed: 11/24/2022]
Abstract
The origin of genetic systems is the central problem in the study of the origin of life for which various explanatory hypotheses have been presented. One model suggests that both ancestral transfer ribonucleic acid (tRNA) molecules and primitive ribosomes were originally involved in RNA replication (Campbell 1991). According to this model the early tRNA molecules catalyzed their own self-loading with a trinucleotide complementary to their anticodon triplet, while the primordial ribosome (protoribosome) catalyzed the transfer of these terminal trinucleotides from one tRNA to another tRNA harboring the growing RNA polymer at the 3'-end. Here we present the notion that the anticodon-codon-like pairs presumably located in the acceptor stem of primordial tRNAs (Rodin et al. 1996) (thus being and remaining, after the code and translation origins, the major contributor to the RNA operational code (Schimmel et al. 1993)) might have originally been used for RNA replication rather than translation; these anticodon and acceptor stem triplets would have been involved in accurately loading the 3'-end of tRNAs with a trinucleotide complementary to their anticodon triplet, thus allowing the accurate repair of tRNAs for their use by the protoribosome during RNA replication. We propose that tRNAs could have catalyzed their own trinucleotide self-loading by forming catalytic tRNA dimers which would have had polymerase activity. Therefore, the loading mechanism and its evolution may have been a basic step in the emergence of new genetic mechanisms such as genetic translation. The evolutionary implications of this proposed loading mechanism are also discussed.
Affiliation(s)
- Rafael Tabares-Seisdedos
- Departamento de Medicina, Facultad de Medicina de Valencia, Universidad de Valencia, Av. Blasco Ibañez 17, 46010, Valencia, Spain.
|
19
|
Abstract
Wobble coding is inevitable during evolution of the Standard Genetic Code (SGC). It ultimately splits half of NN U/C/A/G coding boxes with different assignments. Further, it contributes to pervasive SGC order by reinforcing close spacing for identical SGC assignments. But wobble cannot appear too soon, or it will inhibit encoding and more decisively, obstruct evolution of full coding tables. However, these prior results assumed Crick wobble, NN U/C and NN A/G, read by a single adaptor RNA. Superwobble translates NN U/C/A/G codons, using one adaptor RNA with an unmodified 5' anticodon U (appropriate to earliest coding) in modern mitochondria, plastids, and mycoplasma. Assuming the SGC was selected when evolving codes most resembled it, characteristics of the critical selection events can be calculated. For example, continuous superwobble infrequently evolves SGC-like coding tables. So, continuous superwobble is a very improbable origin hypothesis. In contrast, late-arising superwobble shares late Crick wobble's frequent resemblance to SGC order. Thus late superwobble is possible, but yields SGC-like assignments less frequently than late Crick wobble. Ancient coding ambiguity, most simply, arose from Crick wobble alone. This is consistent with SGC assignments to NAN codons.
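The statement that half of the NN U/C/A/G coding boxes end up split can be checked directly against the standard genetic code table. The following is a minimal sketch, not part of the cited work: the function names `box_assignments` and `split_boxes` are illustrative assumptions, and the table is the usual one-letter encoding of the SGC with `*` marking stop codons.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in the conventional order UUU, UUC, UUA, UUG, UCU, ..., GGG.
# '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
SGC = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def box_assignments(prefix):
    """Return the meanings of the four codons prefix+U/C/A/G, in that order."""
    return [SGC[prefix + third] for third in BASES]


def split_boxes():
    """Return the NN prefixes whose four codons do not all share one meaning."""
    prefixes = ["".join(p) for p in product(BASES, repeat=2)]
    return [p for p in prefixes if len(set(box_assignments(p))) > 1]


if __name__ == "__main__":
    split = split_boxes()
    print(f"{len(split)} of 16 boxes are split: {split}")
```

Counting stop signals as a distinct meaning, this classifies a box as split whenever its four third-position variants disagree, which is a simplification of the wobble analysis in the abstract.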
Affiliation(s)
- Michael Yarus
- Department of Molecular, Cellular and Developmental Biology, University of Colorado Boulder, Boulder, CO, 80309-0347, USA.
|
20
|
Abstract
The code is meaningless unless translated. (Monod 1971, 143)
We address issues of a description of the origin and evolution of the genetic code from the semiotics standpoint. Developing the concept of codepoiesis introduced by M. Barbieri, a new idea of semio-poiesis is proposed. Semio-poiesis, a recursive auto-referential processing of a semiotic system, becomes a form of organization of the bio-world when and while notions of meaning and aiming are introduced into it. The description of the genetic code as a semiotic system (grammar and vocabulary) allows us to apply the method of internal reconstruction to it: on the basis of heterogeneity and irregularity of the current state, to explicate possible previous states and various ways of forming coding and textualization mechanisms. The revealed patterns and irregularities are consistent with hypotheses about the origin and evolution of the genetic code.
|
21
|
Nesterov-Mueller A, Popov R, Seligmann H. Combinatorial Fusion Rules to Describe Codon Assignment in the Standard Genetic Code. Life (Basel) 2020; 11:life11010004. [PMID: 33374866 PMCID: PMC7824455 DOI: 10.3390/life11010004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/01/2020] [Revised: 12/15/2020] [Accepted: 12/21/2020] [Indexed: 11/16/2022] Open
Abstract
We propose combinatorial fusion rules that describe the codon assignment in the standard genetic code simply and uniformly for all canonical amino acids. These rules become obvious if the origin of the standard genetic code is considered as a result of a fusion of four protocodes: two dominant AU and GC protocodes and two recessive AU and GC protocodes. The biochemical meaning of the fusion rules consists of retaining the complementarity between cognate codons of the small hydrophobic amino acids and large charged or polar amino acids within the protocodes. The proto-tRNAs were assembled in the form of two kissing hairpins with 9-base and 10-base loops in the case of dominant protocodes and two 9-base loops in the case of recessive protocodes. The fusion rules reveal the connection between the stop codons, the non-canonical amino acids, pyrrolysine and selenocysteine, and deviations in the translation of mitochondria. Using fusion rules, we predicted the existence of additional amino acids that are essential for the development of the standard genetic code. The validity of the proposed partition of the genetic code into dominant and recessive protocodes is considered with reference to state-of-the-art hypotheses. The formation of two aminoacyl-tRNA synthetase classes is compatible with the four-protocode partition.
Affiliation(s)
- Alexander Nesterov-Mueller
- Institute of Microstructure Technology, Karlsruhe Institute of Technology (KIT), 76344 Eggenstein-Leopoldshafen, Germany
- Roman Popov
- Institute of Microstructure Technology, Karlsruhe Institute of Technology (KIT), 76344 Eggenstein-Leopoldshafen, Germany
- Hervé Seligmann
- Institute of Microstructure Technology, Karlsruhe Institute of Technology (KIT), 76344 Eggenstein-Leopoldshafen, Germany
- The National Natural History Collections, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
- Laboratory AGEIS EA 7407, Team Tools for e-GnosisMedical & LabcomCNRS/UGA/OrangeLabs Telecoms4Health, Faculty of Medicine, Université Grenoble Alpes, F-38700 La Tronche, France
|
22
|
Kunnev D. Origin of Life: The Point of No Return. Life (Basel) 2020; 10:life10110269. [PMID: 33153087 PMCID: PMC7693465 DOI: 10.3390/life10110269] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/19/2020] [Revised: 11/01/2020] [Accepted: 11/01/2020] [Indexed: 12/13/2022] Open
Abstract
Origin of life research is one of the greatest scientific frontiers of mankind. Many hypotheses have been proposed to explain how life began. Although different hypotheses emphasize different initial phenomena, all of them agree around one important concept: at some point, along with the chain of events toward life, Darwinian evolution emerged. There is no consensus, however, how this occurred. Frequently, the mechanism leading to Darwinian evolution is not addressed and it is assumed that this problem could be solved later, with experimental proof of the hypothesis. Here, the author first defines the minimum components required for Darwinian evolution and then from this standpoint, analyzes some of the hypotheses for the origin of life. Distinctive features of Darwinian evolution and life rooted in the interaction between information and its corresponding structure/function are then reviewed. Due to the obligatory dependency of the information and structure subject to Darwinian evolution, these components must be locked in their origin. One of the most distinctive characteristics of Darwinian evolution in comparison with all other processes is the establishment of a fundamentally new level of matter capable of evolving and adapting. Therefore, the initiation of Darwinian evolution is the "point of no return" after which life begins. In summary: a definition and a mechanism for Darwinian evolution are provided together with a critical analysis of some of the hypotheses for the origin of life.
Affiliation(s)
- Dimiter Kunnev
- Department of Oral Biology, University at Buffalo, Buffalo, NY 14263, USA
|
23
|
Teixeira SC, Borges BC, Oliveira VQ, Carregosa LS, Bastos LA, Santos IA, Jardim ACG, Melo FF, Freitas LM, Rodrigues VM, Lopes DS. Insights into the antiviral activity of phospholipases A2 (PLA2s) from snake venoms. Int J Biol Macromol 2020; 164:616-625. [PMID: 32698062 PMCID: PMC7368918 DOI: 10.1016/j.ijbiomac.2020.07.178] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 05/27/2020] [Revised: 07/08/2020] [Accepted: 07/14/2020] [Indexed: 12/18/2022]
Abstract
Viruses are associated with several human diseases that affect a large number of individuals, and hence directly affect global health and the economy. Owing to the lack of efficient vaccines and antiviral therapies and to the emergence of resistant strains, many viruses are considered a potential threat to public health. Therefore, research has been conducted to identify new drug candidates for future treatments. Among these efforts, antiviral research based on natural molecules is a promising approach. Phospholipases A2 (PLA2s) isolated from snake venom have shown significant antiviral activity against some viruses such as Dengue virus, Human Immunodeficiency virus, Hepatitis C virus and Yellow fever virus, and have emerged as an attractive alternative strategy for the development of novel antiviral therapy. Thus, this review provides an overview of remarkable findings involving PLA2s from snake venom that possess antiviral activity, and discusses the mechanisms of action mediated by PLA2s against different stages of the virus replication cycle. Additionally, molecular docking simulations were performed to examine the interaction between phospholipids from the Dengue virus envelope and PLA2s from Bothrops asper snake venom. Studies on snake venom PLA2s highlight the potential use of these proteins for the development of broad-spectrum antiviral drugs.
Affiliation(s)
- S C Teixeira
- Department of Immunology, Institute of Biomedical Science, Federal University of Uberlândia, Uberlândia, MG, Brazil
- B C Borges
- Department of Immunology, Institute of Biomedical Science, Federal University of Uberlândia, Uberlândia, MG, Brazil
- V Q Oliveira
- Multidisciplinary Institute of Health, Anísio Teixeira Campus, Federal University of Bahia, Vitória da Conquista, BA, Brazil
- L S Carregosa
- Multidisciplinary Institute of Health, Anísio Teixeira Campus, Federal University of Bahia, Vitória da Conquista, BA, Brazil
- L A Bastos
- Multidisciplinary Institute of Health, Anísio Teixeira Campus, Federal University of Bahia, Vitória da Conquista, BA, Brazil
- I A Santos
- Laboratory of Virology, Institute of Biomedical Science, Federal University of Uberlândia, Uberlândia, MG, Brazil
- A C G Jardim
- Laboratory of Virology, Institute of Biomedical Science, Federal University of Uberlândia, Uberlândia, MG, Brazil
- F F Melo
- Multidisciplinary Institute of Health, Anísio Teixeira Campus, Federal University of Bahia, Vitória da Conquista, BA, Brazil
- L M Freitas
- Multidisciplinary Institute of Health, Anísio Teixeira Campus, Federal University of Bahia, Vitória da Conquista, BA, Brazil
- V M Rodrigues
- Laboratory of Biochemistry and Animal Toxins, Institute of Biotechnology, Federal University of Uberlândia, Uberlândia, MG, Brazil.
- D S Lopes
- Multidisciplinary Institute of Health, Anísio Teixeira Campus, Federal University of Bahia, Vitória da Conquista, BA, Brazil; Institute of Health Sciences, Department of Bio-Function, Federal University of Bahia, Salvador, BA, Brazil.
|
24
|
Chu XY, Zhang HY. Cofactors as Molecular Fossils To Trace the Origin and Evolution of Proteins. Chembiochem 2020; 21:3161-3168. [PMID: 32515532 DOI: 10.1002/cbic.202000027] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 01/19/2020] [Revised: 06/03/2020] [Indexed: 12/16/2022]
Abstract
Due to their early origin and extreme conservation, cofactors are valuable molecular fossils for tracing the origin and evolution of proteins. First, as the order of protein folds binding with cofactors roughly coincides with protein-fold chronology, cofactors are considered to have facilitated the origin of primitive proteins by selecting them from pools of random amino acid sequences. Second, in the subsequent evolution of proteins, cofactors still played an important role. More interestingly, as metallic cofactors evolved with geochemical variations, some geochemical events left imprints in the chronology of protein architecture; this provides further evidence supporting the coevolution of biochemistry and geochemistry. In this paper, we attempt to review the molecular fossils used in tracing the origin and evolution of proteins, with a special focus on cofactors.
Affiliation(s)
- Xin-Yi Chu
- Hubei Key Laboratory of Agricultural Bioinformatics College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
- Hong-Yu Zhang
- Hubei Key Laboratory of Agricultural Bioinformatics College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
|
25
|
Gospodinov A, Kunnev D. Universal Codons with Enrichment from GC to AU Nucleotide Composition Reveal a Chronological Assignment from Early to Late Along with LUCA Formation. Life (Basel) 2020; 10:life10060081. [PMID: 32516985 PMCID: PMC7345086 DOI: 10.3390/life10060081] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 05/03/2020] [Revised: 05/30/2020] [Accepted: 06/03/2020] [Indexed: 12/14/2022] Open
Abstract
The emergence of a primitive genetic code should be considered the most essential event during the origin of life. Almost a complete set of codons (as we know them) should have been established relatively early during the evolution of the last universal common ancestor (LUCA) from which all known organisms descended. Many hypotheses have been proposed to explain the driving forces and chronology of the evolution of the genetic code; however, none is commonly accepted. In the current paper, we explore the features of the genetic code that, in our view, reflect the mechanism and the chronological order of the origin of the genetic code. Our hypothesis postulates that the primordial RNA was mostly GC-rich, and this bias was reflected in the order of amino acid codon assignment. If we arrange the codons and their corresponding amino acids from GC-rich to AU-rich, we find that: 1. The amino acids encoded by GC-rich codons (Ala, Gly, Arg, and Pro) are those that contribute the most to the interactions with RNA (if incorporated into short peptides). 2. This order correlates with the addition of novel functions necessary for the evolution from simple to longer folded peptides. 3. The overlay of aminoacyl-tRNA synthetases (aaRS) to the amino acid order produces a distinctive zonal distribution for class I and class II suggesting an interdependent origin. These correlations could be explained by the active role of the bridge peptide (BP), which we proposed earlier in the evolution of the genetic code.
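The first observation, that the GC-rich codons encode Ala, Gly, Arg and Pro, can be reproduced from the standard genetic code table. The sketch below assumes that "GC-rich" means codons composed only of G and C; the helper names `gc_count` and `meanings_by_gc` are illustrative and do not come from the paper.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in the conventional order UUU, UUC, UUA, UUG, UCU, ..., GGG.
# '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
SGC = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def gc_count(codon):
    """Number of G or C bases in a codon (0 to 3)."""
    return sum(base in "GC" for base in codon)


def meanings_by_gc(n):
    """Sorted one-letter meanings of all codons containing exactly n G/C bases."""
    return sorted({aa for codon, aa in SGC.items() if gc_count(codon) == n})


if __name__ == "__main__":
    # Walk from GC-only codons (3) down to AU-only codons (0).
    for n in range(3, -1, -1):
        print(n, meanings_by_gc(n))
```

Grouping by the count of G/C bases is only one way to order codons from GC-rich to AU-rich; the paper's own arrangement may weight codon positions differently.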
Affiliation(s)
- Anastas Gospodinov
- Roumen Tsanev Institute of Molecular Biology, Bulgarian Academy of Sciences, Acad. G. Bonchev Str. 21, Sofia 1113, Bulgaria
- Dimiter Kunnev
- Department of Molecular & Cellular Biology, Roswell Park Cancer Institute, Buffalo, NY 14263, USA
|
26
|
Kimura M, Akanuma S. Reconstruction and Characterization of Thermally Stable and Catalytically Active Proteins Comprising an Alphabet of ~ 13 Amino Acids. J Mol Evol 2020; 88:372-381. [PMID: 32201904 DOI: 10.1007/s00239-020-09938-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 12/08/2019] [Accepted: 03/11/2020] [Indexed: 10/24/2022]
Abstract
While extant organisms synthesize proteins using approximately 20 kinds of genetically coded amino acids, the earliest protein synthesis system is likely to have been much simpler, utilizing a reduced set of amino acids. However, which types of building blocks were involved in primordial protein synthesis remains unclear. Herein, we reconstructed three convergent sequences of an ancestral nucleoside diphosphate kinase, each comprising a 10 amino acid "alphabet," and found that two of these variants folded into soluble and stable tertiary structures. Therefore, an alphabet consisting of 10 amino acids contains sufficient information for creating stable proteins. Furthermore, re-incorporation of a few more amino acid types into the active site of the 10 amino acid variants improved the catalytic activity, although the specific activity was not as high as that of extant proteins. Collectively, our results provide experimental support for the idea that robust protein scaffolds can be built with a subset of the current 20 amino acids that might have existed abundantly in the prebiotic environment, while the other amino acids, especially those with functional sidechains, evolved to contribute to efficient enzyme catalysis.
Affiliation(s)
- Madoka Kimura
- Faculty of Human Sciences, Waseda University, 2-579-15 Mikajima, Tokorozawa, Saitama, 359-1192, Japan
- Satoshi Akanuma
- Faculty of Human Sciences, Waseda University, 2-579-15 Mikajima, Tokorozawa, Saitama, 359-1192, Japan.
|
27
|
Błażej P, Wnetrzak M, Mackiewicz D, Mackiewicz P. Basic principles of the genetic code extension. ROYAL SOCIETY OPEN SCIENCE 2020; 7:191384. [PMID: 32257313 PMCID: PMC7062095 DOI: 10.1098/rsos.191384] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/22/2019] [Accepted: 01/09/2020] [Indexed: 05/08/2023]
Abstract
Compounds including non-canonical amino acids (ncAAs) or other artificially designed molecules can find a lot of applications in medicine, industry and biotechnology. They can be produced thanks to the modification or extension of the standard genetic code (SGC). Such peptides or proteins including the ncAAs can be constantly delivered in a stable way by organisms with the customized genetic code. Among several methods of engineering the code, using non-canonical base pairs is especially promising, because it enables generating many new codons, which can be used to encode any new amino acid. Since even one pair of new bases can extend the SGC up to 216 codons generated by a six-letter nucleotide alphabet, the extension of the SGC can be achieved in many ways. Here, we proposed a stepwise procedure of the SGC extension with one pair of non-canonical bases to minimize the consequences of point mutations. We reported relationships between codons in the framework of graph theory. All 216 codons were represented as nodes of the graph, whereas its edges were induced by all possible single nucleotide mutations occurring between codons. Therefore, every set of canonical and newly added codons induces a specific subgraph. We characterized the properties of the induced subgraphs generated by selected sets of codons. Thanks to that, we were able to describe a procedure for incremental addition of the set of meaningful codons up to the full coding system consisting of three pairs of bases. The procedure of gradual extension of the SGC makes the whole system robust to changing genetic information due to mutations and is compatible with the views assuming that codons and amino acids were added successively to the primordial SGC, which evolved minimizing harmful consequences of mutations or mistranslations of encoded proteins.
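The graph representation described in this abstract can be sketched in a few lines: codons are nodes, and two codons are joined by an edge when they differ by a single nucleotide. This is an illustrative sketch only. The letters `X` and `Y` are placeholder names for the non-canonical base pair, and `codons`, `differ_by_one` and `induced_edges` are assumed helper names, not the authors' implementation.

```python
from itertools import combinations, product

CANONICAL = "ACGU"
EXTENDED = CANONICAL + "XY"  # X and Y are placeholder names for one non-canonical pair


def codons(alphabet):
    """All triplets over the given nucleotide alphabet."""
    return ["".join(c) for c in product(alphabet, repeat=3)]


def differ_by_one(a, b):
    """True when two codons differ at exactly one position (a point mutation)."""
    return sum(x != y for x, y in zip(a, b)) == 1


def induced_edges(nodes):
    """Edges of the subgraph induced by a set of codons."""
    return [(a, b) for a, b in combinations(nodes, 2) if differ_by_one(a, b)]


if __name__ == "__main__":
    full = codons(EXTENDED)
    print(len(full), "codons,", len(induced_edges(full)), "edges in the full graph")
    print(len(induced_edges(codons(CANONICAL))), "edges among the 64 canonical codons")
```

With a six-letter alphabet every codon has 3 × 5 = 15 single-mutation neighbours, so the full graph has 216 × 15 / 2 = 1620 edges; the subgraph induced by the 64 canonical codons has 64 × 9 / 2 = 288.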
Affiliation(s)
- Paweł Błażej
- Department of Bioinformatics and Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, Poland
|
28
|
Adaptive Properties of the Genetically Encoded Amino Acid Alphabet Are Inherited from Its Subsets. Sci Rep 2019; 9:12468. [PMID: 31462646 PMCID: PMC6713743 DOI: 10.1038/s41598-019-47574-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 01/08/2019] [Accepted: 07/08/2019] [Indexed: 01/11/2023] Open
Abstract
Life uses a common set of 20 coded amino acids (CAAs) to construct proteins. This set was likely canonicalized during early evolution; before this, smaller amino acid sets were gradually expanded as new synthetic, proofreading and coding mechanisms became biologically available. Many possible subsets of the modern CAAs or other presently uncoded amino acids could have comprised the earlier sets. We explore the hypothesis that the CAAs were selectively fixed due to their unique adaptive chemical properties, which facilitate folding, catalysis, and solubility of proteins, and gave adaptive value to organisms able to encode them. Specifically, we studied in silico hypothetical CAA sets of 3–19 amino acids comprised of 1913 structurally diverse α-amino acids, exploring the adaptive value of their combined physicochemical properties relative to those of the modern CAA set. We find that even hypothetical sets containing modern CAA members are especially adaptive; it is difficult to find sets even among a large choice of alternatives that cover the chemical property space more amply. These results suggest that each time a CAA was discovered and embedded during evolution, it provided an adaptive value unusual among many alternatives, and each selective step may have helped bootstrap the developing set to include still more CAAs.
|
29
|
Abstract
The universal triple-nucleotide genetic code is often viewed as a given, randomly selected through evolution. However, as summarized in this article, many observations and deductions within structural and thermodynamic frameworks help to explain the forces that must have shaped the code during the early evolution of life on Earth.
|
30
|
Di Giulio M. The key role of the elongation factors in the origin of the organization of the genetic code. Biosystems 2019; 181:20-26. [DOI: 10.1016/j.biosystems.2019.04.009] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 03/04/2019] [Revised: 04/13/2019] [Accepted: 04/13/2019] [Indexed: 11/29/2022]
|
31
|
Optimization of the standard genetic code in terms of two mutation types: Point mutations and frameshifts. Biosystems 2019; 181:44-50. [DOI: 10.1016/j.biosystems.2019.04.012] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 03/18/2019] [Accepted: 04/27/2019] [Indexed: 02/08/2023]
|
32
|
Ikehara K. The Origin of tRNA Deduced from Pseudomonas aeruginosa 5' Anticodon-Stem Sequence: Anticodon-stem loop hypothesis. ORIGINS LIFE EVOL B 2019; 49:61-75. [PMID: 31077036 DOI: 10.1007/s11084-019-09573-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 11/23/2018] [Accepted: 02/28/2019] [Indexed: 10/26/2022]
Abstract
The riddle of the origin of life is unsolved as yet. One of the best ways to solve the riddle would be to find a vestige of the first life from databases of DNA and/or protein of modern organisms. It would be, especially, important to know the origin of tRNA, because it mediates between genetic information and the amino acid sequence of a protein. Here I attempt to find a vestige of the origin and evolution of tRNA from base sequences of Pseudomonas aeruginosa tRNA gene. It was first perceived that 5' anticodon (AntiC) stem sequences of P. aeruginosa tRNA for translation of G-start codon (GNN) are intimately and mutually related. Then, mutual relations among all of the forty-two 5' AntiC stem sequences of P. aeruginosa tRNA were examined. These relationships imply that P. aeruginosa tRNA originated from four anticodon stem-loops (AntiC-SL) translating GNC codons to the corresponding four amino acids, Gly, Ala, Asp and Val (where N is G, C, A, or T). In contrast to the case of AntiC-stem sequence, a mutual relation map could not be drawn with D-, T- and acceptor-stem sequences of P. aeruginosa tRNA. Thus I conclude that the four AntiC-SLs were the first primeval tRNAs.
Affiliation(s)
- Kenji Ikehara
- G&L Kyosei Institute, Koharu Bld. 202, Hokkeji 153-4, Nara, 630-8001, Japan.
- The International Institute for Advanced Studies of Japan, Kizugawadai 9-3, Kizugawa, Kyoto, 619-0225, Japan.
- Professor Emeritus of Nara Women's University, Nara, Japan.
|
33
|
Błażej P, Wnetrzak M, Mackiewicz D, Mackiewicz P. The influence of different types of translational inaccuracies on the genetic code structure. BMC Bioinformatics 2019; 20:114. [PMID: 30841864 PMCID: PMC6404327 DOI: 10.1186/s12859-019-2661-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 10/02/2018] [Accepted: 01/29/2019] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND The standard genetic code is a recipe for assigning unambiguously 21 labels, i.e. amino acids and stop translation signal, to 64 codons. However, at early stages of the translational machinery development, the codons did not have to be read unambiguously and the early genetic codes could have contained some ambiguous assignments of codons to amino acids. Therefore, the goal of this work was to obtain the genetic code structures which could have evolved assuming different types of inaccuracy of the translational machinery starting from unambiguous assignments of codons to amino acids. RESULTS We developed a theoretical model assuming that the level of uncertainty of codon assignments can gradually decrease during the simulations. Since it is postulated that the standard code has evolved to be robust against point mutations and mistranslations, we developed three simulation scenarios assuming that such errors can influence one, two or three codon positions. The simulated codes were selected using the evolutionary algorithm methodology to decrease coding ambiguity and increase their robustness against mistranslation. CONCLUSIONS The results indicate that the typical codon block structure of the genetic code could have evolved to decrease the ambiguity of amino acid to codon assignments and to increase the fidelity of reading the genetic information. However, the robustness to errors was not the decisive factor that influenced the genetic code evolution because it is possible to find theoretical codes that minimize the reading errors better than the standard genetic code.
Affiliation(s)
- Paweł Błażej
- Department of Genomics, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, 50-383 Poland
- Małgorzata Wnętrzak
- Department of Genomics, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, 50-383 Poland
- Dorota Mackiewicz
- Department of Genomics, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, 50-383 Poland
- Paweł Mackiewicz
- Department of Genomics, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, 50-383 Poland
|
34
|
Many alternative and theoretical genetic codes are more robust to amino acid replacements than the standard genetic code. J Theor Biol 2019; 464:21-32. [DOI: 10.1016/j.jtbi.2018.12.030] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 10/01/2018] [Revised: 12/17/2018] [Accepted: 12/19/2018] [Indexed: 02/07/2023]
|
35
|
Rogers SO. Evolution of the genetic code based on conservative changes of codons, amino acids, and aminoacyl tRNA synthetases. J Theor Biol 2019; 466:1-10. [PMID: 30658052 DOI: 10.1016/j.jtbi.2019.01.022] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 07/19/2018] [Revised: 01/10/2019] [Accepted: 01/14/2019] [Indexed: 11/30/2022]
Abstract
The genetic code, as arranged in the standard tabular form, displays a non-random structure relating to the characteristics of the amino acids. An alternative arrangement can be made by organizing the code according to aminoacyl-tRNA synthetases (aaRSs), codons, and reverse complement codons, which illuminates a coevolutionary process that led to the contemporary genetic code. As amino acids were added to the genetic code, they were recognized by aaRSs that interact with stereochemically similar amino acids. Single nucleotide changes in the codons and anticodons were favored over more extensive changes, such that there was a logical stepwise progression in the evolution of the genetic code. The model presented traces the evolution of the genetic code accounting for these steps. Amino acid frequencies in ancient proteins and the preponderance of GNN codons in mRNAs for ancient proteins indicate that the genetic code began with alanine, aspartate, glutamate, glycine, and valine, with alanine being in the highest proportions. In addition to being consistent in terms of conservative changes in codon nucleotides, the model also is consistent with respect to aaRS classes, aaRS attachment to the tRNA, amino acid stereochemistry, and to a large extent with amino acid physicochemistry, and biochemical pathways.
Affiliation(s)
- Scott O Rogers
- Department of Biological Sciences, Bowling Green State University, Bowling Green, OH, United States.
|
36
|
Wnętrzak M, Błażej P, Mackiewicz D, Mackiewicz P. The optimality of the standard genetic code assessed by an eight-objective evolutionary algorithm. BMC Evol Biol 2018; 18:192. [PMID: 30545289 PMCID: PMC6293558 DOI: 10.1186/s12862-018-1304-0] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 12/15/2017] [Accepted: 11/22/2018] [Indexed: 02/07/2023] [Open Access]
Abstract
BACKGROUND The standard genetic code (SGC) is a unique set of rules which assign amino acids to codons. Similar amino acids tend to have similar codons, indicating that the code evolved to minimize the costs of amino acid replacements in proteins caused by mutations or translational errors. However, if such optimization in fact occurred, many different properties of amino acids must have been taken into account during the code evolution. Therefore, this problem can be reformulated as a multi-objective optimization task, in which the selection constraints are represented by measures based on various amino acid properties. RESULTS To study the optimality of the SGC we applied a multi-objective evolutionary algorithm and we used the representatives of eight clusters, which grouped over 500 indices describing various physicochemical properties of amino acids. In this way we avoided an arbitrary choice of amino acid features as optimization criteria. As a consequence, we were able to conduct a more general study on the properties of the SGC than the ones presented so far in other papers on this topic. We considered two models of the genetic code, one preserving the characteristic codon block structure of the SGC and the other without this restriction. The results revealed that the SGC could be significantly improved in terms of error minimization; hence, it is not fully optimized. Its structure differs significantly from the structure of the codes optimized to minimize the costs of amino acid replacements. On the other hand, using newly defined quality measures that placed the SGC in the global space of theoretical genetic codes, we showed that the SGC is definitely closer to the codes that minimize the costs of amino acid replacements than to those maximizing them. CONCLUSIONS The standard genetic code most likely represents only a partially optimized system, which emerged under the influence of many different factors. Our findings can be useful to researchers involved in modifying the genetic code of living organisms and designing artificial ones.
Affiliation(s)
- Małgorzata Wnętrzak
- Department of Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, 50-383, Wrocław, Poland
- Paweł Błażej
- Department of Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, 50-383, Wrocław, Poland
- Dorota Mackiewicz
- Department of Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, 50-383, Wrocław, Poland
- Paweł Mackiewicz
- Department of Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, 50-383, Wrocław, Poland.
|
37
|
Facchiano A, Di Giulio M. The genetic code is not an optimal code in a model taking into account both the biosynthetic relationships between amino acids and their physicochemical properties. J Theor Biol 2018; 459:45-51. [DOI: 10.1016/j.jtbi.2018.09.021] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 02/28/2018] [Revised: 09/04/2018] [Accepted: 09/19/2018] [Indexed: 01/22/2023]
|
38
|
Di Giulio M. A Non-neutral Origin for Error Minimization in the Origin of the Genetic Code. J Mol Evol 2018; 86:593-597. [PMID: 30361751 DOI: 10.1007/s00239-018-9871-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 06/19/2018] [Accepted: 10/17/2018] [Indexed: 11/29/2022]
Abstract
Massey (J Mol Evol 67:510-516, 2008; J Theor Biol 408:237-242, 2016; Nat Comput. https://doi.org/10.1007/s11047-017-9669-3, 2018) claims that the error minimization of the genetic code arose by means of a neutral process and was not due to the action of natural selection. Here, I argue that this neutralist hypothesis of the origin of error minimization is not based directly on any neutral process, though it could be so only indirectly. On the contrary, it was natural selection that acted during the origin of the genetic code, determining the property that similar amino acids are coded by similar codons within the genetic code table.
Affiliation(s)
- Massimo Di Giulio
- Early Evolution of Life Laboratory, Institute of Biosciences and Bioresources, CNR, Via P. Castellino, 111, 80131, Naples, Italy.
|
39
|
Kunnev D, Gospodinov A. Possible Emergence of Sequence Specific RNA Aminoacylation via Peptide Intermediary to Initiate Darwinian Evolution and Code Through Origin of Life. Life (Basel) 2018; 8:E44. [PMID: 30279401 PMCID: PMC6316189 DOI: 10.3390/life8040044] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 08/08/2018] [Revised: 09/30/2018] [Accepted: 09/30/2018] [Indexed: 12/12/2022] [Open Access]
Abstract
One of the most intriguing questions in biological science is how life originated on Earth. A large number of hypotheses have been proposed to explain it, each putting an emphasis on different events leading to functional translation and a self-sustained system. Here, we propose a set of interactions that could have taken place in the prebiotic environment. According to our hypothesis, hybridization-induced proximity of short aminoacylated RNAs led to the synthesis of peptides of random sequence. We postulate that among these emerged a type of peptide(s) capable of stimulating the interaction between specific RNAs and specific amino acids, which we call "bridge peptide" (BP). We conclude that translation should have emerged at the same time as the standard genetic code began to evolve, owing to the stabilizing effect of BPs on RNA-peptide complexes. Ribosomes, ribozymes, and enzyme-directed RNA replication could have co-evolved within the same period, as a logical outcome of an RNA-peptide world, without the need for an RNA-only self-sustained step.
Affiliation(s)
- Dimiter Kunnev
- Roswell Park Cancer Institute, Department of Molecular & Cellular Biology, Buffalo, NY 14263, USA.
- Anastas Gospodinov
- Roumen Tsanev Institute of Molecular Biology, Bulgarian Academy of Sciences, Acad. G. Bonchev Str. 21, Sofia 1113, Bulgaria.
|
40
|
Błażej P, Wnętrzak M, Mackiewicz D, Mackiewicz P. Optimization of the standard genetic code according to three codon positions using an evolutionary algorithm. PLoS One 2018; 13:e0201715. [PMID: 30092017 PMCID: PMC6084934 DOI: 10.1371/journal.pone.0201715] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 03/12/2018] [Accepted: 07/21/2018] [Indexed: 12/28/2022] [Open Access]
Abstract
Many biological systems are typically examined from the point of view of adaptation to certain conditions or requirements. One such system is the standard genetic code (SGC), which generally minimizes the cost of amino acid replacements resulting from mutations or mistranslations. However, no full consensus has been reached on the factors that caused the evolution of this feature. One of the hypotheses suggests that code optimality was directly selected as an advantage to preserve information about encoded proteins. An important feature that should be considered when studying the SGC is the different roles of the three codon positions. Therefore, we investigated the robustness of this code regarding the cost of amino acid replacements resulting from substitutions in these positions separately and the sum of these costs. We applied a modified evolutionary algorithm and included four models of the genetic code assuming various restrictions on its structure. The SGC was compared both with the codes that minimize the objective function and those that maximize it. This approach allowed us to place the SGC in the global space of possible codes, which is a more appropriate and unbiased comparison than that with randomly generated codes because they are characterized by relatively uniform amino acid assignments to codons. The SGC appeared to be well optimized at the global scale, but its individual positions were not fully optimized because there were codes that were optimized for only one codon position and simultaneously outperformed the SGC at the other positions. We also found that different code structures may lead to the same optimality and that random codes can show a tendency to minimize costs under some of the genetic code models. Our results suggest that the optimality of SGC could be a by-product of other processes.
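The per-position comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' evolutionary algorithm: it only scores the standard genetic code itself, and it uses the Kyte-Doolittle hydropathy index with mean squared differences as assumed stand-ins for the paper's cost measure. The names `SGC`, `HYDROPATHY`, and `position_costs` are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in NCBI table-1 order (first base varies slowest).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
SGC = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Kyte-Doolittle hydropathy index (assumed property; the paper may use others).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def position_costs(code, scale):
    """Mean squared property change for single-base substitutions,
    computed separately for codon positions 1, 2 and 3.
    Substitutions from or to a stop codon are ignored."""
    totals = [0.0, 0.0, 0.0]
    counts = [0, 0, 0]
    for codon, aa in code.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                new_aa = code[mutant]
                if new_aa == "*":
                    continue
                totals[pos] += (scale[aa] - scale[new_aa]) ** 2
                counts[pos] += 1
    return [t / n for t, n in zip(totals, counts)]


costs = position_costs(SGC, HYDROPATHY)
print("per-position costs:", [round(c, 2) for c in costs])
print("summed cost:", round(sum(costs), 2))
```

Under these assumptions the third position has the lowest cost, because most third-base substitutions are synonymous. Comparing such values against codes that minimize or maximize the same function would require the optimization step that this sketch omits.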
Affiliation(s)
- Paweł Błażej
- Department of Genomics, Faculty of Biotechnology, University of Wrocław, Wrocław, Poland
- Małgorzata Wnętrzak
- Department of Genomics, Faculty of Biotechnology, University of Wrocław, Wrocław, Poland
- Dorota Mackiewicz
- Department of Genomics, Faculty of Biotechnology, University of Wrocław, Wrocław, Poland
- Paweł Mackiewicz
- Department of Genomics, Faculty of Biotechnology, University of Wrocław, Wrocław, Poland
|
41
|
Di Giulio M. A discriminative test among the different theories proposed to explain the origin of the genetic code: The coevolution theory finds additional support. Biosystems 2018; 169-170:1-4. [DOI: 10.1016/j.biosystems.2018.05.002] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 02/23/2018] [Revised: 04/26/2018] [Accepted: 05/07/2018] [Indexed: 11/29/2022]
|
42
|
Tripathi S, Deem MW. The Standard Genetic Code Facilitates Exploration of the Space of Functional Nucleotide Sequences. J Mol Evol 2018; 86:325-339. [PMID: 29959476 DOI: 10.1007/s00239-018-9852-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 12/06/2017] [Accepted: 06/21/2018] [Indexed: 01/07/2023]
Abstract
The standard genetic code is well known to be optimized for minimizing the phenotypic effects of single-nucleotide substitutions, a property that was likely selected for during the emergence of a universal code. Given the fitness advantage afforded by high standing genetic diversity in a population in a dynamic environment, it is possible that selection to explore a large fraction of the space of functional proteins also occurred. To determine whether selection for such a property played a role during the emergence of the nearly universal standard genetic code, we investigated the number of functional variants of the Escherichia coli PhoQ protein explored at different time scales under translation using different genetic codes. We found that the standard genetic code is highly optimal for exploring a large fraction of the space of functional PhoQ variants at intermediate time scales as compared to random codes. Environmental changes, in response to which genetic diversity in a population provides a fitness advantage, are likely to have occurred at these intermediate time scales. Our results indicate that the ability of the standard code to explore a large fraction of the space of functional sequence variants arises from a balance between robustness and flexibility and is largely independent of the property of the standard code to minimize the phenotypic effects of mutations. We propose that selection to explore a large fraction of the functional sequence space while minimizing the phenotypic effects of mutations contributed toward the emergence of the standard code as the universal genetic code.
Affiliation(s)
- Shubham Tripathi
- PhD Program in Systems, Synthetic, and Physical Biology, Rice University, Houston, TX, 77005, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX, 77005, USA
- Michael W Deem
- PhD Program in Systems, Synthetic, and Physical Biology, Rice University, Houston, TX, 77005, USA.
- Center for Theoretical Biological Physics, Rice University, Houston, TX, 77005, USA.
- Department of Bioengineering, Rice University, Houston, TX, 77005, USA.
- Department of Physics and Astronomy, Rice University, Houston, TX, 77005, USA.
|
43
|
Frank A, Froese T. The Standard Genetic Code can Evolve from a Two-Letter GC Code Without Information Loss or Costly Reassignments. Orig Life Evol Biosph 2018; 48:259-272. [PMID: 29959584 DOI: 10.1007/s11084-018-9559-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 04/10/2018] [Accepted: 06/21/2018] [Indexed: 11/27/2022]
Abstract
It is widely agreed that the standard genetic code must have been preceded by a simpler code that encoded fewer amino acids. How this simpler code could have expanded into the standard genetic code is not well understood because most changes to the code are costly. Taking inspiration from the recently synthesized six-letter code, we propose a novel hypothesis: the initial genetic code consisted of only two letters, G and C, and then expanded the number of available codons via the introduction of an additional pair of letters, A and U. Various lines of evidence, including the relative prebiotic abundance of the earliest assigned amino acids, the balance of their hydrophobicity, and the higher GC content in genome coding regions, indicate that the original two nucleotides were indeed G and C. This process of code expansion probably started with the third base, continued with the second base, and ended up as the standard genetic code when the second pair of letters was introduced into the first base. The proposed process is consistent with the available empirical evidence, and it uniquely avoids the problem of costly code changes by positing instead that the code expanded its capacity via the creation of new codons with extra letters.
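The codon-space arithmetic behind this proposal can be made concrete with a short sketch. It enumerates the codons available at each stage, following the expansion order given in the abstract (third base, then second, then first). The function name and stage labels are illustrative assumptions, and no amino acid assignments are modelled.

```python
from itertools import product

GC = "GC"      # the proposed initial two-letter alphabet
FULL = "GCAU"  # alphabet after A and U are admitted


def codons(first, second, third):
    """All triplets whose three positions draw from the given alphabets."""
    return ["".join(t) for t in product(first, second, third)]


stages = {
    "two-letter GC code": codons(GC, GC, GC),
    "A/U admitted at third base": codons(GC, GC, FULL),
    "A/U admitted at second base": codons(GC, FULL, FULL),
    "A/U admitted at first base": codons(FULL, FULL, FULL),
}

for label, triplets in stages.items():
    print(f"{label}: {len(triplets)} codons")
```

Each stage contains the previous one as a subset, which is the sense in which capacity can grow without reassigning codons that are already in use.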
Affiliation(s)
- Alejandro Frank
- Institute for Nuclear Sciences (ICN), National Autonomous University of Mexico (UNAM), Mexico City, Mexico
- Center for the Sciences of Complexity (C3), National Autonomous University of Mexico (UNAM), Mexico City, Mexico
- El Colegio Nacional, Mexico City, Mexico
- Tom Froese
- Center for the Sciences of Complexity (C3), National Autonomous University of Mexico (UNAM), Mexico City, Mexico.
- Institute for Applied Mathematics and Systems Research (IIMAS), National Autonomous University of Mexico (UNAM), Mexico City, Mexico.
|
44
|
Geyer R, Madany Mamlouk A. On the efficiency of the genetic code after frameshift mutations. PeerJ 2018; 6:e4825. [PMID: 29844977 PMCID: PMC5967371 DOI: 10.7717/peerj.4825] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 10/02/2017] [Accepted: 05/02/2018] [Indexed: 01/05/2023] [Open Access]
Abstract
Statistical and biochemical studies of the standard genetic code (SGC) have found evidence that the impact of mistranslations is minimized in a way that erroneous codes are either synonymous or code for an amino acid with polarity similar to that of the originally coded amino acid. It could be quantified that the SGC is optimized to protect this specific chemical property as well as possible. In recent work, it has been speculated that the multilevel optimization of the genetic code stands in the wider context of overlapping codes. This work tries to follow the systematic approach on mistranslations and to extend those analyses to the general effect of frameshift mutations on the polarity conservation of amino acids. We generated one million random codes and compared their average polarity change over all triplets and the whole set of possible frameshift mutations. While the natural code, just as for point mutations, appears to be competitively robust against frameshift mutations as well, we found that both optimizations appear to be independent of each other. For both, better codes can be found, but it becomes significantly more difficult to find candidates that optimize all of these features, as the SGC does. We conclude that the SGC is not only very efficient in minimizing the consequences of mistranslations, but is also optimized in amino acid polarity conservation for all three effects of code alteration, namely translational errors, point mutations and frameshift mutations. In other words, our result demonstrates that the SGC appears to be much more than just “one in a million”.
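The random-code comparison summarized in this abstract can be sketched as follows, under several loudly stated assumptions. The Kyte-Doolittle hydropathy index stands in for the polarity measure. Random codes are made by permuting the 20 amino acids among the SGC's codon blocks with stops fixed. A frameshift is modelled as reading a codon one base to the left or right with every possible neighbouring base. None of these choices is confirmed by the source, and the sample is far smaller than one million codes.

```python
import random
from itertools import product

BASES = "TCAG"
# Standard genetic code in NCBI table-1 order (first base varies slowest).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
SGC = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Kyte-Doolittle hydropathy index (assumed stand-in for polarity).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def frameshift_cost(code, scale):
    """Mean absolute property change when the frame slips by one base."""
    total, n = 0.0, 0
    for codon, aa in code.items():
        if aa == "*":
            continue
        for base in BASES:
            # For codon XYZ and neighbouring base N: +1 reads YZN, -1 reads NXY.
            for shifted in (codon[1:] + base, base + codon[:2]):
                new_aa = code[shifted]
                if new_aa != "*":
                    total += abs(scale[aa] - scale[new_aa])
                    n += 1
    return total / n


def random_code(rng):
    """Permute the 20 amino acids among the SGC codon blocks; stops stay put."""
    amino_acids = sorted(set(AMINO_ACIDS) - {"*"})
    shuffled = amino_acids[:]
    rng.shuffle(shuffled)
    mapping = dict(zip(amino_acids, shuffled))
    mapping["*"] = "*"
    return {codon: mapping[aa] for codon, aa in SGC.items()}


def fraction_better(n_codes=1000, seed=0):
    """Share of random codes with a lower frameshift cost than the SGC."""
    rng = random.Random(seed)
    sgc_cost = frameshift_cost(SGC, HYDROPATHY)
    better = sum(
        frameshift_cost(random_code(rng), HYDROPATHY) < sgc_cost
        for _ in range(n_codes)
    )
    return better / n_codes


print("fraction of random codes beating the SGC:", fraction_better())
```

A small fraction would indicate robustness under this particular measure only. Reproducing the paper's joint analysis would also require the point-mutation and mistranslation costs, which are not modelled here.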
Affiliation(s)
- Regine Geyer
- Institute for Neuro- and Bioinformatics, University of Lübeck, Lübeck, Germany
- Amir Madany Mamlouk
- Institute for Neuro- and Bioinformatics, University of Lübeck, Lübeck, Germany
|
45
|
Froese T, Campos JI, Fujishima K, Kiga D, Virgo N. Horizontal transfer of code fragments between protocells can explain the origins of the genetic code without vertical descent. Sci Rep 2018; 8:3532. [PMID: 29476089 PMCID: PMC5824800 DOI: 10.1038/s41598-018-21973-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 01/11/2018] [Accepted: 02/14/2018] [Indexed: 11/09/2022] [Open Access]
Abstract
Theories of the origin of the genetic code typically appeal to natural selection and/or mutation of hereditable traits to explain its regularities and error robustness, yet the present translation system presupposes high-fidelity replication. Woese's solution to this bootstrapping problem was to assume that code optimization had played a key role in reducing the effect of errors caused by the early translation system. He further conjectured that initially evolution was dominated by horizontal exchange of cellular components among loosely organized protocells ("progenotes"), rather than by vertical transmission of genes. Here we simulated such communal evolution based on horizontal transfer of code fragments, possibly involving pairs of tRNAs and their cognate aminoacyl tRNA synthetases or a precursor tRNA ribozyme capable of catalysing its own aminoacylation, by using an iterated learning model. This is the first model to confirm Woese's conjecture that regularity, optimality, and (near) universality could have emerged via horizontal interactions alone.
Affiliation(s)
- Tom Froese
- Institute for Applied Mathematics and Systems Research (IIMAS), National Autonomous University of Mexico (UNAM), Mexico City, 04510, Mexico.
- Center for the Sciences of Complexity (C3), National Autonomous University of Mexico (UNAM), Mexico City, 04510, Mexico.
- Jorge I Campos
- Center for the Sciences of Complexity (C3), National Autonomous University of Mexico (UNAM), Mexico City, 04510, Mexico.
- Faculty of Higher Education Aragon, National Autonomous University of Mexico (UNAM), Nezahualcoyotl City, State of Mexico, 57130, Mexico
- Kosuke Fujishima
- Earth-Life Science Institute, Tokyo Institute of Technology, Meguro-ku, Tokyo, 152-8550, Japan.
- Institute for Advanced Biosciences, Keio University, Tsuruoka, 9970035, Japan
- Daisuke Kiga
- Faculty of Science and Engineering, School of Advanced Science and Engineering, Waseda University, Shinjuku, Tokyo, 169-8555, Japan
- Nathaniel Virgo
- Earth-Life Science Institute, Tokyo Institute of Technology, Meguro-ku, Tokyo, 152-8550, Japan
|
46
|
The evolution of the genetic code: Impasses and challenges. Biosystems 2018; 164:217-225. [DOI: 10.1016/j.biosystems.2017.10.006] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 07/16/2017] [Revised: 10/06/2017] [Accepted: 10/09/2017] [Indexed: 01/17/2023]
|
47
|
Comprehensive reduction of amino acid set in a protein suggests the importance of prebiotic amino acids for stable proteins. Sci Rep 2018; 8:1227. [PMID: 29352156 PMCID: PMC5775292 DOI: 10.1038/s41598-018-19561-1] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 11/08/2017] [Accepted: 01/03/2018] [Indexed: 11/19/2022] [Open Access]
Abstract
Modern organisms commonly use the same set of 20 genetically coded amino acids for protein synthesis, with very few exceptions. However, earlier protein synthesis was plausibly much simpler than the modern one and utilized only a limited set of amino acids. Nevertheless, few experimental tests of this issue with arbitrarily chosen amino acid sets had been reported prior to this report. Herein we comprehensively and systematically reduced the size of the amino acid set constituting an ancestral nucleoside kinase that was reconstructed in our previous study. We eventually found that two convergent sequences, each composed of a 13-amino acid alphabet, folded into soluble, stable and catalytically active structures, even though their stabilities and activities were not as high as those of the parent protein. Notably, many but not all of the reduced-set amino acids coincide with those plausibly abundant on the primitive Earth. The inconsistent amino acids appeared to be important for catalytic activity but not for stability. Therefore, our findings suggest that the prebiotically abundant amino acids were used for creating stable protein structures and other amino acids with functional side chains were recruited to achieve efficient catalysis.
|
48
|
Affiliation(s)
- Eugene V. Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
- Artem S. Novozhilov
- Department of Mathematics, North Dakota State University, Fargo, North Dakota 58108, USA
|
49
|
Di Giulio M. The aminoacyl-tRNA synthetases had only a marginal role in the origin of the organization of the genetic code: Evidence in favor of the coevolution theory. J Theor Biol 2017; 432:14-24. [DOI: 10.1016/j.jtbi.2017.08.005] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 03/20/2017] [Revised: 08/01/2017] [Accepted: 08/03/2017] [Indexed: 10/19/2022]
|
50
|
Frozen Accident Pushing 50: Stereochemistry, Expansion, and Chance in the Evolution of the Genetic Code. Life (Basel) 2017; 7:life7020022. [PMID: 28545255 PMCID: PMC5492144 DOI: 10.3390/life7020022] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 03/10/2017] [Revised: 05/19/2017] [Accepted: 05/20/2017] [Indexed: 12/31/2022] [Open Access]
Abstract
Nearly 50 years ago, Francis Crick propounded the frozen accident scenario for the evolution of the genetic code along with the hypothesis that the early translation system consisted primarily of RNA. Under the frozen accident perspective, the code is universal among modern life forms because any change in codon assignment would be highly deleterious. The frozen accident can be considered the default theory of code evolution because it does not imply any specific interactions between amino acids and the cognate codons or anticodons, or any particular properties of the code. The subsequent 49 years of code studies have elucidated notable features of the standard code, such as high robustness to errors, but failed to develop a compelling explanation for codon assignments. In particular, stereochemical affinity between amino acids and the cognate codons or anticodons does not seem to account for the origin and evolution of the code. Here, I expand Crick’s hypothesis on RNA-only translation system by presenting evidence that this early translation already attained high fidelity that allowed protein evolution. I outline an experimentally testable scenario for the evolution of the code that combines a distinct version of the stereochemical hypothesis, in which amino acids are recognized via unique sites in the tertiary structure of proto-tRNAs, rather than by anticodons, expansion of the code via proto-tRNA duplication, and the frozen accident.
|