1
Di Giulio M. Theories of the origin of the genetic code: Strong corroboration for the coevolution theory. Biosystems 2024; 239:105217. [PMID: 38663520 DOI: 10.1016/j.biosystems.2024.105217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Revised: 04/16/2024] [Accepted: 04/18/2024] [Indexed: 04/29/2024]
Abstract
I analyzed all the theories and models of the origin of the genetic code and, over the years, have considered the main suggestions that could explain this origin. The conclusion of this analysis is that the coevolution theory of the origin of the genetic code is the theory that best captures the majority of observations concerning the organization of the genetic code. In other words, the biosynthetic relationships between amino acids would have heavily influenced the origin of the organization of the genetic code, as the coevolution theory maintains. By contrast, the presence in the genetic code of physicochemical properties of amino acids, which stereochemical and physicochemical theories have also linked to the physicochemical properties of anticodons, codons or bases, would simply be the result of natural selection. More explicitly, I maintain that these correlations between codons, anticodons or bases and amino acids are in fact not the result of a real correlation between amino acids and codons, for example, but only the effect of the intervention of natural selection. Specifically, in the genetic code table we expect, for example, that the most similar codons - that is, those that differ by only one base - will have more similar physicochemical properties. Therefore, the 64 codons of the genetic code table, ordered in a certain way, would also represent an ordering of some of their physicochemical properties. Now, a study aimed at clarifying which physicochemical property of amino acids influenced the allocation of amino acids in the genetic code established that the partition energy of amino acids played a decisive role in this. Indeed, under some conditions, the genetic code was found to be approximately 98% optimized on its columns. The same work showed that this was most likely the result of the action of natural selection.
If natural selection had truly allocated the amino acids in the genetic code in such a way that similar amino acids also have similar codons, and had done so without any mechanism of physicochemical interaction between, for example, codons and amino acids, then even different physicochemical properties of codons (or anticodons or bases) might show some correlation with the physicochemical properties of amino acids, simply because the partition energy of amino acids is correlated with their other physicochemical properties. It is very likely that this would inevitably lead to a correlation between codons (or anticodons or bases) and amino acids. In other words, since the codons (or anticodons or bases) are ordered in the genetic code, some of their physicochemical properties should be ordered in a similar way; and given that the amino acids also appear to have been ordered in the genetic code by natural selection, it should inevitably turn out that there is a correlation between, for example, the hydrophobicity of anticodons and that of amino acids. On the other hand, the intervention of natural selection in organizing the genetic code would appear to be highly compatible with the main mechanism of structuring the genetic code as supported by the coevolution theory. This would make the coevolution theory the only plausible explanation for the origin of the genetic code.
Affiliation(s)
- Massimo Di Giulio
- The Ionian School, Early Evolution of Life Department, Genetic Code and tRNA Origin Laboratory, Via Roma 19, 67030, Alfedena, L'Aquila, Italy.
2
Di Giulio M. The time of appearance of the genetic code. Biosystems 2024; 237:105159. [PMID: 38373543 DOI: 10.1016/j.biosystems.2024.105159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Revised: 02/13/2024] [Accepted: 02/16/2024] [Indexed: 02/21/2024]
Abstract
I support the hypothesis that the origin of the genetic code occurred simultaneously with the evolution of cellularity. That is to say, I favour the hypothesis that the origin of the genetic code is a very, very late event in the history of life on Earth. I corroborate this hypothesis with observations favouring a progenote stage for the Last Universal Common Ancestor (LUCA), for the ancestor of bacteria and for that of archaea. Indeed, these progenotic stages would imply that, at that time, the origin of the genetic code was still ongoing, simply because this origin would fall within the very definition of progenote. Therefore, if the evolution of cellularity had truly been coeval with the origin of the genetic code, at least in its terminal part, then this would favour theories such as the coevolution theory of the origin of the genetic code, because this theory postulates that this origin must have occurred in extremely complex protocellular conditions and not through stereochemical or physicochemical interactions belonging to other stages of the origin of life. In this sense, the coevolution theory would be corroborated while the stereochemical and physicochemical theories would be weakened. Therefore, the origin of the genetic code would be linked to the origin of the cell and not to the origin of life, as is sometimes asserted. I will therefore discuss the late hypothesis of the origin of the genetic code in the context of the theories proposed to explain this origin and, more generally, of its implications for the early evolution of life.
Affiliation(s)
- Massimo Di Giulio
- The Ionian School, Early Evolution of Life Department, Genetic Code and tRNA Origin Laboratory, Via Roma 19, 67030, Alfedena, L'Aquila, Italy.
3
Štambuk N, Konjevoda P, Brčić-Kostić K, Baković J, Štambuk A. New algorithm for the analysis of nucleotide and amino acid evolutionary relationships based on Klein four-group. Biosystems 2023; 233:105030. [PMID: 37717902 DOI: 10.1016/j.biosystems.2023.105030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 09/10/2023] [Accepted: 09/10/2023] [Indexed: 09/19/2023]
Abstract
Phylogenetics is the study of ancestral relationships among biological species. Such sequence analyses are often represented as phylogenetic trees. The branching pattern of each tree and its topology reflect the evolutionary relatedness between the analyzed sequences. We present a Klein four-group algorithm (K4A) for the evolutionary analysis of nucleotide and amino acid sequences. The Klein four-group set of operators consists of the identity e (U) and three elements: a = transition (C), b = transversion (G) and c = transition-transversion or complementarity (A). We generated Klein four-group based distance matrices of (1) the Cayley table (CK4), (2) table rows (K4R), (3) table columns (K4C), and (4) Euclidean 2D distance (K4E). The performance of the matrices was tested on a dataset of RecA proteins in bacteria, eukaryotes (Rad51 homolog) and archaea (RadA homolog). RecA and its functional homologs are found in all species and are essential for the repair and maintenance of DNA. Consequently, they represent a good model for the study of the evolutionary relationships of protein and nucleotide sequences. The ancestral relationship between the sequences was correctly classified by all K4A matrices with respect to general topology. All distance matrices exhibited small variations among species, and the overall results of tree classification were in agreement with the general patterns obtained by the standard BLOSUM and PAM substitution matrices. During the evolution of a code there is a phase of optimization of system rules: the ambiguity of the code is eliminated, and the system starts producing specific components. The Klein four-group algorithm is consistent with the concept of ambiguity reduction. It also enables the use of different genetic code table variants optimized for particular transitions in evolution based on biological specificity.
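The group structure named in this abstract can be illustrated with a minimal sketch. The abstract only gives the correspondence e = U, a = C, b = G, c = A; the 2-bit encoding and the function names below are assumptions made for illustration, and the paper's distance matrices (CK4, K4R, K4C, K4E) are not reproduced here. With two bits per element, the Klein four-group product is simply bitwise XOR.

```python
# Hypothetical 2-bit encoding (an assumption, not taken from the paper):
# U = e (identity), C = a (transition), G = b (transversion),
# A = c (transition-transversion / complementarity).
BITS = {"U": 0b00, "C": 0b01, "G": 0b10, "A": 0b11}
BASES = {value: base for base, value in BITS.items()}
ORDER = "UCGA"


def k4_product(x: str, y: str) -> str:
    """Compose two Klein four-group elements (labelled by bases) via XOR."""
    return BASES[BITS[x] ^ BITS[y]]


def cayley_table() -> dict:
    """Return the 4x4 Cayley table as a nested dict: table[x][y] = x * y."""
    return {x: {y: k4_product(x, y) for y in ORDER} for x in ORDER}


if __name__ == "__main__":
    table = cayley_table()
    print("  " + " ".join(ORDER))
    for x in ORDER:
        print(x, " ".join(table[x][y] for y in ORDER))
```

Every element is its own inverse and the product of any two distinct non-identity elements is the third, which is the defining property of the Klein four-group.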
Affiliation(s)
- Nikola Štambuk
- Centre for Nuclear Magnetic Resonance, Ruđer Bošković Institute, Bijenička cesta 54, HR-10000, Zagreb, Croatia.
- Paško Konjevoda
- Laboratory for Epigenomics, Division of Molecular Medicine, Ruđer Bošković Institute, Bijenička cesta 54, HR-10000, Zagreb, Croatia.
- Krunoslav Brčić-Kostić
- Laboratory of Evolutionary Genetics, Division of Molecular Biology, Ruđer Bošković Institute, Bijenička cesta 54, HR-10000, Zagreb, Croatia
- Josip Baković
- University Hospital Dubrava, Department of Surgery, Avenija Gojka Šuška 6, HR-10000, Zagreb, Croatia
- Albert Štambuk
- Faculty of Kinesiology, University of Zagreb, Horvaćanski zavoj 15, HR-10000 Zagreb, Croatia
4
Caldararo F, Di Giulio M. The genetic code is very close to a global optimum in a model of its origin taking into account both the partition energy of amino acids and their biosynthetic relationships. Biosystems 2022; 214:104613. [DOI: 10.1016/j.biosystems.2022.104613] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/19/2021] [Revised: 01/16/2022] [Accepted: 01/17/2022] [Indexed: 01/23/2023]
5
Konjevoda P, Štambuk N. Relational model of the standard genetic code. Biosystems 2021; 210:104529. [PMID: 34464669 DOI: 10.1016/j.biosystems.2021.104529] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/03/2021] [Revised: 08/26/2021] [Accepted: 08/27/2021] [Indexed: 11/28/2022]
Abstract
The genetic code is a set of rules that establishes a mapping between triplets in messenger RNA and amino acids in proteins. The most common way to display these rules is the Standard Genetic Code (SGC) table. This paper takes an alternative approach, based on the relational data model by Edgar F. Codd (Commun. ACM, 13:377-387, 1970). The relational model (RM) proposes a distributed storage of data in a collection of tables (called relations), which can be connected by shared commonality. The basic elements of a table are rows (called records or tuples) and columns (called fields or attributes). According to the relational data model, the SGC table represents the so-called unnormalized form of a table. Using normalization rules, it is possible to subdivide the SGC table into four tables. The rows and columns of the single tables are defined by the first and second base, and the individual tables by the third codon base. The result of this model is an approach to managing genetic code data, represented in terms of tuples and grouped into relations, with a table structure and language consistent with first-order (predicate) logic. The RM explains that the final step in the development of the SGC was the adoption of the coding function by the third base, which forms an informational/functional unit with the first base despite their different physical locations in a triplet. This enabled the synthesis of specific proteins without ambiguity, in accordance with the concept of ambiguity reduction and the five phases of the general model on the origin of biological codes by Marcello Barbieri (BioSystems 181:11-19, 2019).
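The subdivision described in this abstract can be sketched as follows. This is only an illustration of the partition (one table per third base, with cells keyed by first and second base); the names `records` and `split_by_third_base` are hypothetical, and the sketch does not reproduce the paper's normalization procedure. The 64-letter string is the standard genetic code written in U, C, A, G order with the third base varying fastest; `*` marks a stop codon.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def records():
    """Return the code as a flat relation of (first, second, third, amino_acid) tuples."""
    return [
        (first, second, third, amino_acid)
        for (first, second, third), amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
    ]


def split_by_third_base(rows):
    """Split the flat relation into four tables, one per third codon base.

    Each table maps (first base, second base) -> amino acid.
    """
    tables = {base: {} for base in BASES}
    for first, second, third, amino_acid in rows:
        tables[third][(first, second)] = amino_acid
    return tables


if __name__ == "__main__":
    tables = split_by_third_base(records())
    for third, table in tables.items():
        print(f"Third base {third}:")
        for first in BASES:
            print("  ", first, " ".join(table[(first, second)] for second in BASES))
```

Keeping the tuples in a flat list first, and deriving the four tables from it, mirrors the idea of one underlying relation that is then decomposed.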
Affiliation(s)
- Paško Konjevoda
- Laboratory for Epigenomics, Division of Molecular Medicine, Ruđer Bošković Institute, Bijenička cesta 54, HR-10000 Zagreb, Croatia.
- Nikola Štambuk
- Center for Nuclear Magnetic Resonance, Ruđer Bošković Institute, Bijenička cesta 54, HR-10000 Zagreb, Croatia.
6
Ying J, Ding R, Liu Y, Zhao Y. Prebiotic Chemistry in Aqueous Environment: A Review of Peptide Synthesis and Its Relationship with Genetic Code. CHINESE J CHEM 2021. [DOI: 10.1002/cjoc.202100120] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/08/2022]
Affiliation(s)
- Jianxi Ying
- Institute of Drug Discovery Technology, Ningbo University, No. 818 Fenghua Road, Ningbo, Zhejiang 315211, China
- Qian Xuesen Collaborative Research Center of Astrochemistry and Space Life Sciences, Ningbo University, No. 818 Fenghua Road, Ningbo, Zhejiang 315211, China
- Ruiwen Ding
- Institute of Drug Discovery Technology, Ningbo University, No. 818 Fenghua Road, Ningbo, Zhejiang 315211, China
- Qian Xuesen Collaborative Research Center of Astrochemistry and Space Life Sciences, Ningbo University, No. 818 Fenghua Road, Ningbo, Zhejiang 315211, China
- Yan Liu
- College of Chemistry and Chemical Engineering, Xiamen University, No. 422 Siming South Road, Xiamen, Fujian 361005, China
- Yufen Zhao
- Institute of Drug Discovery Technology, Ningbo University, No. 818 Fenghua Road, Ningbo, Zhejiang 315211, China
- Qian Xuesen Collaborative Research Center of Astrochemistry and Space Life Sciences, Ningbo University, No. 818 Fenghua Road, Ningbo, Zhejiang 315211, China
- College of Chemistry and Chemical Engineering, Xiamen University, No. 422 Siming South Road, Xiamen, Fujian 361005, China
7
Takénaka A, Moras D. Correlation between equi-partition of aminoacyl-tRNA synthetases and amino-acid biosynthesis pathways. Nucleic Acids Res 2020; 48:3277-3285. [PMID: 31965182 PMCID: PMC7102985 DOI: 10.1093/nar/gkaa013] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 12/04/2019] [Revised: 12/31/2019] [Accepted: 01/07/2020] [Indexed: 12/11/2022] [Open access]
Abstract
The partition of aminoacyl-tRNA synthetases (aaRSs) into two classes of equal size, and the correlated amino acid distribution, is a puzzling, still unexplained observation. We propose that the timescale of amino acid synthesis, assumed to be proportional to the number of reaction steps (NE) involved in the biosynthesis pathway, is one of the parameters that controlled the timescale of aaRS appearance. Because all pathways branch at fructose-6-phosphate on the metabolic pathway, this product is defined as the common origin for the NE comparison. For each amino acid, the NE value, counted from the origin to the final product, provides a timescale for the pathways to be established. An archeological approach based on NE reveals that aaRSs of the two classes are generated in pairs along this timescale. The results support the coevolution theory for the origin of the genetic code, with an earlier appearance of class II aaRSs.
Affiliation(s)
- Akio Takénaka
- Research Institute, Chiba Institute of Technology, 2-17-1 Tsudanuma, Narashino, Chiba 275-0016, Japan; Faculty of Pharmacy, Shenyang Pharmaceutical University, Benxi, Liaoning 117004, China
- Dino Moras
- Department of Integrated Structural Biology, Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), 1 rue Laurent Fries, Illkirch 67404, France; Centre National de Recherche Scientifique (CNRS) UMR 7104, France; Institut National de Santé et de Recherche Médicale (INSERM) U1258, France; Université de Strasbourg, Illkirch, France
8
Barbieri M. Evolution of the genetic code: The ambiguity-reduction theory. Biosystems 2019; 185:104024. [DOI: 10.1016/j.biosystems.2019.104024] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 08/06/2019] [Revised: 08/26/2019] [Accepted: 08/26/2019] [Indexed: 10/26/2022]
9
Di Giulio M. The key role of the elongation factors in the origin of the organization of the genetic code. Biosystems 2019; 181:20-26. [DOI: 10.1016/j.biosystems.2019.04.009] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 03/04/2019] [Revised: 04/13/2019] [Accepted: 04/13/2019] [Indexed: 11/29/2022]
10
Optimization of the standard genetic code in terms of two mutation types: Point mutations and frameshifts. Biosystems 2019; 181:44-50. [DOI: 10.1016/j.biosystems.2019.04.012] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 03/18/2019] [Accepted: 04/27/2019] [Indexed: 02/08/2023]
11
Many alternative and theoretical genetic codes are more robust to amino acid replacements than the standard genetic code. J Theor Biol 2019; 464:21-32. [DOI: 10.1016/j.jtbi.2018.12.030] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 10/01/2018] [Revised: 12/17/2018] [Accepted: 12/19/2018] [Indexed: 02/07/2023]
12
Di Giulio M. A Non-neutral Origin for Error Minimization in the Origin of the Genetic Code. J Mol Evol 2018; 86:593-597. [PMID: 30361751 DOI: 10.1007/s00239-018-9871-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 06/19/2018] [Accepted: 10/17/2018] [Indexed: 11/29/2022]
Abstract
Massey (J Mol Evol 67:510-516, 2008; J Theor Biol 408:237-242, 2016; Nat Comput. https://doi.org/10.1007/s11047-017-9669-3, 2018) claims that the error minimization of the genetic code was derived by means of a neutral process and was not due to the action of natural selection. Here, I argue that this neutralist hypothesis of the origin of error minimization is not based directly on any neutral process, but could be so only indirectly. On the contrary, it was natural selection that acted during the origin of the genetic code, determining the property that similar amino acids are coded by similar codons within the genetic code table.
Affiliation(s)
- Massimo Di Giulio
- Early Evolution of Life Laboratory, Institute of Biosciences and Bioresources, CNR, Via P. Castellino, 111, 80131, Naples, Italy.
13
Błażej P, Wnętrzak M, Mackiewicz D, Mackiewicz P. Optimization of the standard genetic code according to three codon positions using an evolutionary algorithm. PLoS One 2018; 13:e0201715. [PMID: 30092017 PMCID: PMC6084934 DOI: 10.1371/journal.pone.0201715] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 03/12/2018] [Accepted: 07/21/2018] [Indexed: 12/28/2022] [Open access]
Abstract
Many biological systems are typically examined from the point of view of adaptation to certain conditions or requirements. One such system is the standard genetic code (SGC), which generally minimizes the cost of amino acid replacements resulting from mutations or mistranslations. However, no full consensus has been reached on the factors that caused the evolution of this feature. One of the hypotheses suggests that code optimality was directly selected as an advantage to preserve information about encoded proteins. An important feature that should be considered when studying the SGC is the different roles of the three codon positions. Therefore, we investigated the robustness of this code regarding the cost of amino acid replacements resulting from substitutions in these positions separately and the sum of these costs. We applied a modified evolutionary algorithm and included four models of the genetic code assuming various restrictions on its structure. The SGC was compared both with the codes that minimize the objective function and those that maximize it. This approach allowed us to place the SGC in the global space of possible codes, which is a more appropriate and unbiased comparison than that with randomly generated codes because they are characterized by relatively uniform amino acid assignments to codons. The SGC appeared to be well optimized at the global scale, but its individual positions were not fully optimized because there were codes that were optimized for only one codon position and simultaneously outperformed the SGC at the other positions. We also found that different code structures may lead to the same optimality and that random codes can show a tendency to minimize costs under some of the genetic code models. Our results suggest that the optimality of SGC could be a by-product of other processes.
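The per-position cost mentioned in this abstract can be sketched roughly as below. This is not the authors' objective function or their evolutionary algorithm: the choice of the Kyte-Doolittle hydropathy scale, the squared-difference cost, the equal weighting of all single-base substitutions, and the decision to skip substitutions involving stop codons are all assumptions made for illustration, as is the function name `position_costs`.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Kyte-Doolittle hydropathy index, used here only as an illustrative property.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def position_costs(code=CODE, scale=HYDROPATHY):
    """Mean squared property change per codon position over single-base substitutions.

    Substitutions from or to a stop codon are skipped (an assumption).
    """
    totals = [0.0, 0.0, 0.0]
    counts = [0, 0, 0]
    for codon, amino_acid in code.items():
        if amino_acid == "*":
            continue
        for position in range(3):
            for base in BASES:
                if base == codon[position]:
                    continue
                mutant = codon[:position] + base + codon[position + 1:]
                new_amino_acid = code[mutant]
                if new_amino_acid == "*":
                    continue
                totals[position] += (scale[amino_acid] - scale[new_amino_acid]) ** 2
                counts[position] += 1
    return [total / count for total, count in zip(totals, counts)]


if __name__ == "__main__":
    for position, cost in enumerate(position_costs(), start=1):
        print(f"codon position {position}: mean squared hydropathy change = {cost:.2f}")
```

Summing the three values gives a single cost for the whole code, which is the quantity one would compare against alternative codon assignments.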
Affiliation(s)
- Paweł Błażej
- Department of Genomics, Faculty of Biotechnology, University of Wrocław, Wrocław, Poland
- Małgorzata Wnętrzak
- Department of Genomics, Faculty of Biotechnology, University of Wrocław, Wrocław, Poland
- Dorota Mackiewicz
- Department of Genomics, Faculty of Biotechnology, University of Wrocław, Wrocław, Poland
- Paweł Mackiewicz
- Department of Genomics, Faculty of Biotechnology, University of Wrocław, Wrocław, Poland