1
Di Giulio M. Theories of the origin of the genetic code: Strong corroboration for the coevolution theory. Biosystems 2024; 239:105217. [PMID: 38663520 DOI: 10.1016/j.biosystems.2024.105217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Revised: 04/16/2024] [Accepted: 04/18/2024] [Indexed: 04/29/2024]
Abstract
I analyzed all the theories and models of the origin of the genetic code, and over the years, I have considered the main suggestions that could explain this origin. The conclusion of this analysis is that the coevolution theory of the origin of the genetic code is the theory that best captures the majority of observations concerning the organization of the genetic code. In other words, the biosynthetic relationships between amino acids would have heavily influenced the origin of the organization of the genetic code, as supported by the coevolution theory. Instead, the presence in the genetic code of physicochemical properties of amino acids, which have also been linked to the physicochemical properties of anticodons or codons or bases by stereochemical and physicochemical theories, would simply be the result of natural selection. More explicitly, I maintain that these correlations between codons, anticodons or bases and amino acids are in fact the result not of a real correlation between amino acids and codons, for example, but are only the effect of the intervention of natural selection. Specifically, in the genetic code table we expect, for example, that the most similar codons - that is, those that differ by only one base - will have more similar physicochemical properties. Therefore, the 64 codons of the genetic code table ordered in a certain way would also represent an ordering of some of their physicochemical properties. Now, a study aimed at clarifying which physicochemical property of amino acids has influenced the allocation of amino acids in the genetic code has established that the partition energy of amino acids played a decisive role in this. Indeed, under some conditions, the genetic code was found to be approximately 98% optimized on its columns. In this same work, it was shown that this was most likely the result of the action of natural selection.
If natural selection had truly allocated the amino acids in the genetic code in such a way that similar amino acids also have similar codons - and not through a mechanism of physicochemical interaction between, for example, codons and amino acids - then it might turn out that even different physicochemical properties of codons (or anticodons or bases) show some correlation with the physicochemical properties of amino acids, simply because the partition energy of amino acids is correlated with other physicochemical properties of amino acids. It is very likely that this would inevitably lead to a correlation between codons (or anticodons or bases) and amino acids. In other words, since the codons (anticodons or bases) are ordered in the genetic code, that is to say, since some of their physicochemical properties should also follow a similar order, and given that the amino acids would also appear to have been ordered in the genetic code by natural selection, then it should inevitably turn out that there is a correlation between, for example, the hydrophobicity of anticodons and that of amino acids. Instead, the intervention of natural selection in organizing the genetic code would appear to be highly compatible with the main mechanism of structuring the genetic code as supported by the coevolution theory. This would make the coevolution theory the only plausible explanation for the origin of the genetic code.
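The abstract's notion of "most similar codons", those that differ by only one base, can be made concrete with a minimal sketch. It only enumerates single-base neighbours; it does not reproduce the paper's optimization analysis. The names `BASES`, `single_base_neighbors`, `all_codons` and `neighbors_of_gcu` are illustrative assumptions, not taken from the source.

```python
from itertools import product

# RNA alphabet; the ordering is arbitrary and chosen only for illustration.
BASES = "UCAG"


def single_base_neighbors(codon):
    """Return every codon that differs from `codon` at exactly one position."""
    neighbors = []
    for pos in range(3):
        for base in BASES:
            if codon[pos] != base:
                # Swap in one different base at this position only.
                neighbors.append(codon[:pos] + base + codon[pos + 1:])
    return neighbors


# All 64 codons of the genetic code table.
all_codons = ["".join(c) for c in product(BASES, repeat=3)]

# Hypothetical example codon: GCU and its single-base neighbours.
neighbors_of_gcu = single_base_neighbors("GCU")
```

Each codon has three positions and three alternative bases per position, so it has nine such neighbours. These are the codon pairs for which the abstract expects similar physicochemical properties.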
Affiliation(s)
- Massimo Di Giulio
- The Ionian School, Early Evolution of Life Department, Genetic Code and tRNA Origin Laboratory, Via Roma 19, 67030, Alfedena, L'Aquila, Italy.
2
Seligmann H, Raoult D. Stem-Loop RNA Hairpins in Giant Viruses: Invading rRNA-Like Repeats and a Template Free RNA. Front Microbiol 2018; 9:101. [PMID: 29449833 PMCID: PMC5799277 DOI: 10.3389/fmicb.2018.00101] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 08/18/2017] [Accepted: 01/16/2018] [Indexed: 12/31/2022] Open
Abstract
We examine the hypothesis that de novo template-free RNAs still form spontaneously, as they did at the origins of life, invade modern genomes, and contribute new genetic material. Previously, analyses of RNA secondary structures suggested that some RNAs resembling ancestral (t)RNAs formed recently de novo, while other parasitic sequences cluster with rRNAs. Here positive control analyses of additional RNA secondary structures confirm ancestral and de novo statuses of RNA grouped according to secondary structure. Viroids with branched stems resemble de novo RNAs, whereas rod-shaped viroids resemble rRNA secondary structures, independently of GC contents. 5' UTR leading regions of West Nile and Dengue flavivirid viruses resemble de novo and rRNA structures, respectively. An RNA homologous with Megavirus, Dengue and West Nile genomes, copperhead snake microsatellites and levant cotton repeats, not templated by Mimivirus' genome, persists throughout Mimivirus' infection. Its secondary structure clusters with candidate de novo RNAs. The saltatory phyletic distribution and secondary structure of Mimivirus' peculiar RNA suggest occasional template-free polymerization of this sequence, rather than noncanonical transcriptions (swinger polymerization, posttranscriptional editing).
Affiliation(s)
- Hervé Seligmann
- Unité de Recherche sur les Maladies Infectieuses et Tropicales Emergentes, UMR MEPHI, Aix-Marseille Université, IRD, Assistance Publique-Hôpitaux de Marseille, Institut Hospitalo-Universitaire Méditerranée-Infection, Marseille, France
- Didier Raoult
- Unité de Recherche sur les Maladies Infectieuses et Tropicales Emergentes, UMR MEPHI, Aix-Marseille Université, IRD, Assistance Publique-Hôpitaux de Marseille, Institut Hospitalo-Universitaire Méditerranée-Infection, Marseille, France
3
Di Giulio M. The aminoacyl-tRNA synthetases had only a marginal role in the origin of the organization of the genetic code: Evidence in favor of the coevolution theory. J Theor Biol 2017; 432:14-24. [DOI: 10.1016/j.jtbi.2017.08.005] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 03/20/2017] [Revised: 08/01/2017] [Accepted: 08/03/2017] [Indexed: 10/19/2022]
4
Bhattacharyya S, Varshney U. Evolution of initiator tRNAs and selection of methionine as the initiating amino acid. RNA Biol 2016; 13:810-9. [PMID: 27322343 DOI: 10.1080/15476286.2016.1195943] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 01/17/2023] Open
Abstract
Transfer RNAs (tRNAs) have been important in shaping biomolecular evolution. Initiator tRNAs (tRNAi), a special class of tRNAs, carry methionine (or its derivative, formyl-methionine) to ribosomes to start the enormously energy-consuming but highly regulated process of protein synthesis. The processes of tRNAi evolution and the selection of methionine as the universal initiating amino acid remain an enigmatic problem. We constructed phylogenetic trees using the whole sequence, the acceptor-TψC arm ('minihelix'), and the anticodon-dihydrouridine arm regions of tRNAi from 158 species belonging to all 3 domains of life. All the trees distinctly assembled into 3 domains of life. Large trees, generated using data for all the tRNAs of a vast number of species, fail to reveal the major evolutionary events and the identity of the probable elongator tRNA sequences that could be the ancestor of tRNAi. Therefore, we constructed trees using the minihelix or the whole sequence of species-specific tRNAs, and iterated our analysis on 50 eubacterial species. We identified tRNA(Pro), tRNA(Glu), or tRNA(Thr) (but surprisingly not elongator tRNA(Met)) as probable ancestors of tRNAi. We then determined the factors imposing selection of methionine as the initiating amino acid. Overall frequency of occurrence of methionine, whose metabolic cost of synthesis is the highest among all amino acids, remains almost unchanged across the 3 domains of life. Our correlation analysis shows that its high metabolic cost is independent of many physicochemical properties of the side chain. Our results indicate that selection of methionine as the initiating amino acid was possibly a consequence of the evolution of one-carbon metabolism, which plays an important role in regulating translation initiation.
Affiliation(s)
- Souvik Bhattacharyya
- Department of Microbiology and Cell Biology, Indian Institute of Science, Bangalore, India
- Umesh Varshney
- Department of Microbiology and Cell Biology, Indian Institute of Science, Bangalore, India; Jawaharlal Nehru Center for Advanced Scientific Research, Jakkur, Bangalore, India
5
Ancestral Reconstruction of a Pre-LUCA Aminoacyl-tRNA Synthetase Ancestor Supports the Late Addition of Trp to the Genetic Code. J Mol Evol 2015; 80:171-85. [PMID: 25791872 DOI: 10.1007/s00239-015-9672-1] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Received: 01/17/2015] [Accepted: 03/09/2015] [Indexed: 01/14/2023]
Abstract
The genetic code was likely complete in its current form by the time of the last universal common ancestor (LUCA). Several scenarios have been proposed for explaining the code's pre-LUCA emergence and expansion, and the relative order of the appearance of amino acids used in translation. One co-evolutionary model of genetic code expansion proposes that at least some amino acids were added to the code by the ancient divergence of aminoacyl-tRNA synthetase (aaRS) families. Of all the amino acids used within the genetic code, Trp is most frequently claimed as a relatively recent addition. We observe that, since TrpRS and TyrRS are paralogous protein families retaining significant sequence similarity, the inferred sequence composition of their ancestor can be used to evaluate this co-evolutionary model of genetic code expansion. We show that ancestral sequence reconstructions of the pre-LUCA paralog ancestor of TyrRS and TrpRS have several sites containing Tyr, yet a complete absence of sites containing Trp. This is consistent with the paralog ancestor being specific for the utilization of Tyr, with Trp being a subsequent addition to the genetic code facilitated by a process of aaRS divergence and neofunctionalization. Only after this divergence could Trp be specifically encoded and incorporated into proteins, including the TyrRS and TrpRS descendant lineages themselves. This early absence of Trp is observed under both homogeneous and non-homogeneous models of ancestral sequence reconstruction. Simulations support that this observed absence of Trp is unlikely to be due to chance or model bias. These results support that the final stages of genetic code evolution occurred well within the "protein world," and that the presence-absence of Trp within conserved sites of ancient protein domains is a likely measure of their relative antiquity, permitting the relative timing of extremely early events within protein evolution before LUCA.
6
A mechanism for functional segregation of mitochondrial and cytosolic genetic codes. Proc Natl Acad Sci U S A 2009; 106:19420-5. [PMID: 19880741 DOI: 10.1073/pnas.0909937106] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Indexed: 11/18/2022] Open
Abstract
The coexistence of multiple gene translation machineries is a feature of eukaryotic cells and a result of the endosymbiotic events that gave rise to mitochondria, plastids, and other organelles. The conditions required for the integration of these apparatuses within a single cell are not understood, but current evidence indicates that complete ablation of the mitochondrial protein synthesis apparatus and its substitution by its cytosolic equivalent is not possible. Why certain mitochondrial components and not others can be substituted by cytosolic equivalents is not known. In trypanosomatids this situation reaches a limit, because certain aminoacyl-tRNA synthetases are mitochondrial specific despite the fact that all tRNAs in these organisms are shared between cytosol and mitochondria. Here we report that a mitochondria-specific lysyl-tRNA synthetase in Trypanosoma has evolved a mechanism to block the activity of the enzyme during its synthesis and translocation. Only when the enzyme reaches the mitochondria is it activated through the cleavage of a C-terminal structural extension, preventing the possibility of the enzyme being active in the cytosol.
7
Di Giulio M. An extension of the coevolution theory of the origin of the genetic code. Biol Direct 2008; 3:37. [PMID: 18775066 PMCID: PMC2538516 DOI: 10.1186/1745-6150-3-37] [Citation(s) in RCA: 112] [Impact Index Per Article: 7.0] [Received: 07/30/2008] [Accepted: 09/05/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The coevolution theory of the origin of the genetic code suggests that the genetic code is an imprint of the biosynthetic relationships between amino acids. However, this theory does not seem to attribute a role to the biosynthetic relationships between the earliest amino acids that evolved along the pathways of energetic metabolism. As a result, the coevolution theory is unable to clearly define the very earliest phases of genetic code origin. In order to remove this difficulty, I here suggest an extension of the coevolution theory that attributes a crucial role to the first amino acids that evolved along these biosynthetic pathways and to their biosynthetic relationships, even when defined by the non-amino acid molecules that are their precursors.
RESULTS It is re-observed that the first amino acids to evolve along these biosynthetic pathways are predominantly those codified by codons of the type GNN, and this observation is found to be statistically significant. Furthermore, the close biosynthetic relationships between the sibling amino acids Ala-Ser, Ser-Gly, Asp-Glu, and Ala-Val are not random in the genetic code table and reinforce the hypothesis that the biosynthetic relationships between these six amino acids played a crucial role in defining the very earliest phases of genetic code origin.
CONCLUSION All this leads to the hypothesis that there existed a code, GNS, reflecting the biosynthetic relationships between these six amino acids which, as it defines the very earliest phases of genetic code origin, removes the main difficulty of the coevolution theory. Furthermore, it is here discussed how this code might have naturally led to the code codifying only for the domains of the codons of precursor amino acids, as predicted by the coevolution theory. Finally, the hypothesis here suggested also removes other problems of the coevolution theory, such as the existence for certain pairs of amino acids with an unclear biosynthetic relationship between the precursor and product amino acids and the collocation of Ala between the amino acids Val and Leu belonging to the pyruvate biosynthetic family, which the coevolution theory considered as belonging to different biosyntheses.
REVIEWERS This article was reviewed by Rob Knight, Paul Higgs (nominated by Laura Landweber), and Eugene Koonin.
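As a small illustration of the codon classes named in this abstract, the sketch below lists which amino acids the modern standard genetic code assigns to GNN and GNS codons. It assumes the IUPAC reading of the symbols (N = any base, S = G or C), which the abstract does not spell out. It tabulates only the present-day code, not the hypothesized ancestral GNS code. The names `STANDARD_CODE`, `codons_matching`, `gnn` and `gns` are illustrative assumptions.

```python
from itertools import product

# Standard genetic code in the conventional UCAG ordering (first base varies
# slowest, third base fastest); '*' marks stop codons.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_matching(first_base, third_bases):
    """Select codons with a given first base and an allowed set of third bases."""
    return {
        codon: aa
        for codon, aa in STANDARD_CODE.items()
        if codon[0] == first_base and codon[2] in third_bases
    }


# GNN: G, then any base, then any base (assumed IUPAC meaning of N).
gnn = codons_matching("G", "UCAG")
# GNS: G, then any base, then G or C (assumed IUPAC meaning of S).
gns = codons_matching("G", "GC")

for codon in sorted(gns):
    print(codon, gns[codon])
```

In the modern table these codons specify Val, Ala, Asp, Glu and Gly. Ser, the sixth amino acid named in the abstract, has no GNN codon today, so the proposed GNS code should be read as a hypothesis about an earlier stage of the code rather than as a subset of the present one.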
Affiliation(s)
- Massimo Di Giulio
- Laboratory for Molecular Evolution, Institute of Genetics and Biophysics Adriano Buzzati Traverso, CNR, Via P. Castellino, 111, 80131 Naples, Napoli, Italy.
8
Di Giulio M. The origin of the genetic code: theories and their relationships, a review. Biosystems 2004; 80:175-84. [PMID: 15823416 DOI: 10.1016/j.biosystems.2004.11.005] [Citation(s) in RCA: 97] [Impact Index Per Article: 4.9] [Received: 08/03/2004] [Revised: 11/12/2004] [Accepted: 11/18/2004] [Indexed: 10/26/2022]
Abstract
A review of the main theories proposed to explain the origin of the genetic code is presented. I analyze arguments and data in favour of different theories proposed to explain the origin of the organization of the genetic code. It is possible to suggest a mechanism that makes compatible the different theories of the origin of the code, even if these are based on a historical or physicochemical determinism and thus appear incompatible by definition. Finally, I discuss the question of why a given number of synonymous codons was attributed to the amino acids in the genetic code.
Affiliation(s)
- Massimo Di Giulio
- Institute of Genetics and Biophysics Adriano Buzzati-Traverso, CNR, Naples, Italy
9
10
Abstract
The coevolution theory of genetic code origin (Wong, J.T. 1975, Proc. Natl Acad. Sci. U.S.A. 72, 1909-1912) is assumed here to be substantially correct. This theory is based on the strict parallelism of the biosynthetic relationships between amino acids and the organization of the genetic code and postulates that these relationships were mediated by tRNA-like molecules on which the biosynthetic transformations between precursor and product amino acids took place. These transformations underlay the mechanism that gave rise to genetic code organization. One of the pathways representing these transformations that is found in current organisms, and which is thus probably a molecular fossil, is the Met-tRNA(fMet)-->fMet-tRNA(fMet) pathway. This pathway is present only in the Bacteria domain. This, along with other observations and arguments, leads us to believe that this pathway is a clear violation of the universality of the genetic code. Furthermore, the presence of this pathway only in the Bacteria domain seems to imply that the translation apparatus was still rapidly evolving when this pathway was fixed. This, in turn, appears to imply that the last universal common ancestor was a progenote. Finally, the implications that the finding of this pathway has for the stereochemical theory of genetic code origin are discussed.
Affiliation(s)
- M Di Giulio
- International Institute of Genetics and Biophysics, CNR, Via G. Marconi 10, 80125 Naples, Italy.
11
Di Giulio M. The beta-sheets of proteins, the biosynthetic relationships between amino acids, and the origin of the genetic code. Orig Life Evol Biosph 1996; 26:589-609. [PMID: 9008882 DOI: 10.1007/bf01808222] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.1] [Indexed: 02/03/2023]
Abstract
Two forces are generally hypothesised as being responsible for conditioning the origin of the organization of the genetic code: the physicochemical properties of amino acids and their biosynthetic relationships (relationships between precursor and product amino acids). If we assume that the biosynthetic relationships between amino acids were fundamental in defining the genetic code, then it is reasonable to expect that the distribution of physicochemical properties among the amino acids in precursor-product relationships cannot be random but must, rather, be affected by some selective constraints imposed by the structure of primitive proteins. Analysis shows that measurements representing the 'size' of amino acids, e.g. bulkiness, are specifically associated with the pairs of amino acids in precursor-product relationships. However, the size of amino acids cannot have been selected per se but, rather, because it reflects the beta-sheets of proteins which are, therefore, identified as the main adaptive theme promoting the origin of genetic code organization. By contrast, there are no traces of the alpha-helix in the genetic code table. The above considerations make it necessary to re-examine the relationship linking the hydrophilicity of the dinucleoside monophosphates of anticodons and the polarity and bulkiness of amino acids. It can be concluded that this relationship seems to be meaningful only between the hydrophilicity of anticodons and the polarity of amino acids. The latter relationship is supposed to have been operative on hairpin structures, ancestors of the tRNA molecule. Moreover, it is on these very structures that the biosynthetic links between precursor and product amino acids might have been achieved, and the interaction between the hydrophilicity of anticodons and the polarity of amino acids might have had a role in the concession of codons (anticodons) from precursors to products.
Affiliation(s)
- M Di Giulio
- International Institute of Genetics and Biophysics, CNR, Napoli, Italy
12
Di Giulio M. The phylogeny of tRNAs seems to confirm the predictions of the coevolution theory of the origin of the genetic code. Orig Life Evol Biosph 1995; 25:549-64. [PMID: 7494635 DOI: 10.1007/bf01582024] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.0] [Indexed: 01/25/2023]
Abstract
An extensive analysis of the evolutionary relationships existing between transfer RNAs, performed using parsimony algorithms, is presented. After building up an estimate of the tRNA ancestral sequences, these sequences are then compared using certain methods. The results seem to suggest that the coevolution hypothesis (Wong, J.T., 1975, Proc. Natl. Acad. Sci. USA 72, 1909-1912), which sees the genetic code as a map of the biosynthetic relationships between amino acids, is better supported than the hypotheses that see the physicochemical properties of amino acids as the main adaptive theme that led to the structuring of the genetic code.
Affiliation(s)
- M Di Giulio
- International Institute of Genetics and Biophysics, CNR, Napoli, Italy
13
Abstract
The evolutionary relationships between transfer RNA (tRNA) molecules are analyzed by parsimony algorithms. The position of the topologies expected on the basis of the hypotheses made to explain the origin of the genetic code, on the frequency distribution of all the possible tree topologies of the evolutionary relationships between tRNAs seems to lead to the following conclusion: The hypothesis (Wong, J. T., Proc. Natl. Acad. Sci. USA, 1975, 72: 1909-1912) that sees the genetic code as a map of the biosynthetic relationships between amino acids seems to occupy a statistically significant position on these frequency distributions, thus reflecting a significant part of the tRNA phylogeny.
Affiliation(s)
- M Di Giulio
- International Institute of Genetics and Biophysics, CNR, Naples, Italy
14
Abstract
Sequence data and evolutionary arguments suggest that a similarity may exist between the C-terminal end of glutaminyl-tRNA synthetase (GlnRS) and the catalytic domain of glutamine amidotransferases (GATs). If true, this would seem to imply that the amidation reaction of the Glu-tRNA(Gln) complex was the evolutionary precursor of the direct tRNA(Gln) aminoacylation pathway. Since the C-terminal end of GlnRS does not now have an important functional role, it can be concluded that this sequence contains vestiges that lead us to believe that it represents a palimpsest. This sequence still conserves the remains of the evolutionary transition: amidation reaction-->aminoacylation reaction. This may be important in deciding which mechanism gave origin to the genetic code organization. These observations, together with results obtained by Gatti and Tzagoloff [J. Mol. Biol. (1991) 218:557-568], lead to the hypothesis that the class I aminoacyl-tRNA synthetases (ARSs) may be homologous to the GATs of the trpG subfamily, while the class II ARSs may be homologous to the GATs of the purF subfamily. Overall, this seems to point to the existence of an intimate evolutionary link between the proteins involved in the primitive metabolism and aminoacyl-tRNA synthetases.
Affiliation(s)
- M Di Giulio
- International Institute of Genetics and Biophysics, CNR, Naples, Napoli, Italy