1
Pawlak K, Błażej P, Mackiewicz D, Mackiewicz P. The Influence of the Selection at the Amino Acid Level on Synonymous Codon Usage from the Viewpoint of Alternative Genetic Codes. Int J Mol Sci 2023; 24:ijms24021185. [PMID: 36674703; PMCID: PMC9866869; DOI: 10.3390/ijms24021185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 12/19/2022] [Accepted: 12/30/2022] [Indexed: 01/11/2023] Open
Abstract
Synonymous codon usage can be influenced by mutations and/or selection, e.g., for speed of protein translation and correct folding. However, this codon bias can also be affected by a general selection at the amino acid level due to differences in the acceptance of the loss and generation of these codons. To assess the importance of this effect, we constructed a mutation-selection model, in which we generated almost 90,000 stationary nucleotide distributions produced by mutational processes and applied a selection based on differences in physicochemical properties of amino acids. Under these conditions, we calculated the usage of fourfold degenerated (4FD) codons and compared it with the usage characteristic of the pure mutations. We considered both the standard genetic code (SGC) and alternative genetic codes (AGCs). The analyses showed that a majority of AGCs produced a greater 4FD codon bias than the SGC. The mutations producing more thymine or adenine than guanine and cytosine increased the differences in usage. On the other hand, the mutational pressures generating a lot of cytosine or guanine with a low content of adenine and thymine decreased this bias because the nucleotide content of most 4FD codons stayed in the compositional equilibrium with these pressures. The comparison of the theoretical results with those for real protein coding sequences showed that the influence of selection at the amino acid level on the synonymous codon usage cannot be neglected. The analyses indicate that the effect of amino acid selection cannot be disregarded and that it can interfere with other selection factors influencing codon usage, especially in AT-rich genomes, in which AGCs are usually used.
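To make the notion of 4FD codons concrete, the following is a minimal Python sketch, not the authors' model. It lists the fourfold degenerate codon families of the standard genetic code and computes the synonymous usage expected under pure mutation, assuming (as a simplification introduced here) that codon positions are independent so that usage within a family is proportional to the stationary frequency of the third base. The function names and the AT-rich frequencies are hypothetical illustration values.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def fourfold_families(code):
    """Map each two-base prefix whose four codons share one amino acid to that amino acid."""
    families = {}
    for b1, b2 in product(BASES, repeat=2):
        amino_acids = {code[b1 + b2 + b3] for b3 in BASES}
        if len(amino_acids) == 1 and "*" not in amino_acids:
            families[b1 + b2] = amino_acids.pop()
    return families


def neutral_usage(prefix, pi):
    """Synonymous usage within a 4FD family under pure mutation.

    Assumption made for this sketch: positions are independent, so the
    usage is proportional to the stationary frequency of the third base.
    """
    total = sum(pi[b] for b in BASES)
    return {prefix + b: pi[b] / total for b in BASES}


FAMILIES = fourfold_families(CODE)
# Hypothetical AT-rich stationary distribution, for illustration only.
AT_RICH = {"T": 0.35, "C": 0.15, "A": 0.35, "G": 0.15}
USAGE = neutral_usage("GC", AT_RICH)  # alanine family GCN
```

Any deviation of observed 4FD usage from such a neutral baseline is the kind of bias the abstract attributes to selection at the amino acid level; the selection step itself is not modeled in this sketch.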
2
Model of Genetic Code Structure Evolution under Various Types of Codon Reading. Int J Mol Sci 2022; 23:ijms23031690. [PMID: 35163612; PMCID: PMC8835785; DOI: 10.3390/ijms23031690] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2021] [Revised: 01/23/2022] [Accepted: 01/25/2022] [Indexed: 11/28/2022] Open
Abstract
The standard genetic code (SGC) is a set of rules according to which 64 codons are assigned to 20 canonical amino acids and a stop coding signal. As a consequence, the SGC is redundant because there is a greater number of codons than the number of encoded labels. This redundancy implies the existence of codons that encode the same genetic information. The size and organization of such synonymous codon blocks are important characteristics of the SGC structure, whose evolution is still unclear. Therefore, we studied possible evolutionary mechanisms of the codon block structure. We conducted computer simulations assuming that coding systems at early stages of the SGC evolution were sets of ambiguous codon assignments with high entropy. We included three types of reading systems characterized by different inaccuracies and patterns of codon recognition. In contrast to the previous study, we allowed for evolution of the reading systems and their competition. The simulations performed under minimization of translational errors and reduction of coding ambiguity produced a coding system resistant to these errors. The reading system similar to that present in the SGC dominated the others very quickly. The surviving system was also characterized by low entropy and possessed properties similar to those of the SGC. Our simulations show that the unambiguous SGC could have emerged from a code with a lower level of ambiguity and that the number of tRNAs increased during the evolution.
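The phrase "ambiguous codon assignments with high entropy" can be illustrated with a short Python sketch. It assumes Shannon entropy over the probabilities with which a single codon is read as each label; the abstract does not state the exact measure used in the study, so this choice and the names below are assumptions made for illustration.

```python
import math


def codon_entropy(probabilities):
    """Shannon entropy in bits of one codon's assignment distribution over labels."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


N_LABELS = 21  # 20 canonical amino acids plus the stop signal

# A fully ambiguous codon: read as every label with equal probability.
H_AMBIGUOUS = codon_entropy([1 / N_LABELS] * N_LABELS)

# An unambiguous codon: always read as the same label.
H_UNAMBIGUOUS = codon_entropy([1.0])
```

Under this measure a fully ambiguous codon carries log2(21) bits of uncertainty and an unambiguous one carries none, so a reduction of coding ambiguity corresponds to entropy falling toward zero.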
3
Factors in Protobiomonomer Selection for the Origin of the Standard Genetic Code. Acta Biotheor 2021; 69:745-767. [PMID: 34283307; DOI: 10.1007/s10441-021-09420-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2020] [Accepted: 07/01/2021] [Indexed: 10/20/2022]
Abstract
Natural selection of specific protobiomonomers during abiogenic development of the prototype genetic code is hindered by the diversity of structural, spatial, and rotational isomers that have identical elemental composition and molecular mass (M), but can vary significantly in their physicochemical characteristics, such as the melting temperature Tm, the Tm:M ratio, and the solubility in water, due to different positions of atoms in the molecule. These parameters differ between cis- and trans-isomers of dicarboxylic acids, spatial monosaccharide isomers, and structural isomers of α-, β-, and γ-amino acids. The stable planar heterocyclic molecules of the major nucleobases comprise four (C, H, N, O) or three (C, H, N) elements and contain a single -C=C bond and two nitrogen atoms in each heterocycle involved in C-N and C=N bonds. They exist as isomeric resonance hybrids of single and double bonds and as a mixture of tautomer forms due to the presence of -C=O and/or -NH2 side groups. They are thermostable, insoluble in water, and exhibit solid-state stability, which is of central importance for DNA molecules as carriers of genetic information. In M-Tm diagrams, proteinogenic amino acids and the corresponding codons are distributed fairly regularly relative to the distinct clusters of purine and pyrimidine bases, reflecting the correspondence between codons and amino acids that was established in different periods of genetic code development. The body of data on the evolution of the genetic code system indicates that the elemental composition and molecular structure of protobiomonomers, and their M, Tm, photostability, and aqueous solubility determined their selection in the emergence of the standard genetic code.
4
Pawlak K, Wnetrzak M, Mackiewicz D, Mackiewicz P, Błażej P. Models of genetic code structure evolution with variable number of coded labels. Biosystems 2021; 210:104528. [PMID: 34492316; DOI: 10.1016/j.biosystems.2021.104528] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/17/2021] [Revised: 08/26/2021] [Accepted: 08/27/2021] [Indexed: 10/20/2022]
Abstract
It is assumed that at the early stage of cell evolution its translation machinery was characterized by high noise, i.e., ambiguous assignment of codons to amino acids in the genetic code, which initially encoded only a few amino acids. Next, during its evolution, new amino acids were added to this code. Taking these facts into account, we investigated theoretical models of the genetic code's structure, which evolved from a set of ambiguous codon assignments into a coding system with a low level of uncertainty. We considered three types of translational inaccuracies assuming a different number of fixed codon positions. We applied a modified version of an evolutionary algorithm to find the genetic codes that most effectively reduced the initial uncertainty in the assignment of codons to encoded labels, i.e., amino acids and a stop translation signal. We examined codes with the number of labels from four to 22. Our results indicated that the quality of the genetic code structure is strongly dependent on the number of encoded labels as well as the type of translational mechanism. Stricter assignments of codons to labels were preferred by the codes encoding a larger number of labels. The results showed that a smaller degeneracy of codes evolved from a more tolerant coding with the stepwise addition of coded amino acids to the genetic code. The distribution of codon groups in the standard genetic code corresponds well to the translation model assuming two fixed codon positions, whereas the six-codon groups can be relics from previous stages of evolution, when the code was characterized by a greater uncertainty.
Affiliation(s)
- Konrad Pawlak
  - Department of Bioinformatics and Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, Poland
- Małgorzata Wnetrzak
  - Department of Bioinformatics and Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, Poland
- Dorota Mackiewicz
  - Department of Bioinformatics and Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, Poland
- Paweł Mackiewicz
  - Department of Bioinformatics and Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, Poland
- Paweł Błażej
  - Department of Bioinformatics and Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, Poland
5
Nowak K, Błażej P, Wnetrzak M, Mackiewicz D, Mackiewicz P. Some theoretical aspects of reprogramming the standard genetic code. Genetics 2021; 218:6169163. [PMID: 33711098; PMCID: PMC8128387; DOI: 10.1093/genetics/iyab040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2020] [Accepted: 02/11/2021] [Indexed: 11/12/2022] Open
Abstract
Reprogramming of the standard genetic code to include non-canonical amino acids (ncAAs) opens new prospects for medicine, industry, and biotechnology. There are several methods of code engineering, which allow us to store new genetic information in DNA sequences and to produce proteins with new properties. Here, we provided a theoretical background for the optimal genetic code expansion, which may find application in the experimental design of the genetic code. We assumed that the expanded genetic code includes both canonical and non-canonical information stored in 64 classical codons. What is more, the new coding system is robust to point mutations and minimizes the possibility of reversion from the new to the old information. In order to find such codes, we applied graph theory to analyze the properties of optimal codon sets. We presented a formal procedure for finding the optimal codes with various numbers of vacant codons that could be assigned to new amino acids. Finally, we discussed the optimal number of newly incorporated ncAAs and also the optimal size of codon groups that can be assigned to ncAAs.
Affiliation(s)
- Kuba Nowak
  - Faculty of Mathematics and Computer Science, University of Wrocław, ul. F. Joliot-Curie 15, 50-383 Wrocław, Poland
- Paweł Błażej
  - Department of Bioinformatics and Genomics, Faculty of Biotechnology, University of Wrocław, ul. F. Joliot-Curie 14a, 50-383 Wrocław, Poland
- Małgorzata Wnetrzak
  - Department of Bioinformatics and Genomics, Faculty of Biotechnology, University of Wrocław, ul. F. Joliot-Curie 14a, 50-383 Wrocław, Poland
- Dorota Mackiewicz
  - Department of Bioinformatics and Genomics, Faculty of Biotechnology, University of Wrocław, ul. F. Joliot-Curie 14a, 50-383 Wrocław, Poland
- Paweł Mackiewicz
  - Department of Bioinformatics and Genomics, Faculty of Biotechnology, University of Wrocław, ul. F. Joliot-Curie 14a, 50-383 Wrocław, Poland
6
Demongeot J, Moreira A, Seligmann H. Negative CG dinucleotide bias: An explanation based on feedback loops between Arginine codon assignments and theoretical minimal RNA rings. Bioessays 2020; 43:e2000071. [PMID: 33319381; DOI: 10.1002/bies.202000071] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/02/2020] [Revised: 11/23/2020] [Accepted: 11/26/2020] [Indexed: 01/05/2023]
Abstract
Theoretical minimal RNA rings are candidate primordial genes evolved for non-redundant coding of the genetic code's 22 coding signals (one codon per biogenic amino acid, a start and a stop codon) over the shortest possible length: 29520 22-nucleotide-long RNA rings solve this min-max constraint. Numerous RNA ring properties are reminiscent of natural genes. Here we present analyses showing that all RNA rings lack the dinucleotide CG (a mutable, chemically unstable dinucleotide coding for Arginine), bearing a resemblance to known CG-depleted genomes. CG in "incomplete" RNA rings (not coding for all coding signals, with only 3-12 nucleotides) gradually decreases towards CG absence in complete, 22-nucleotide-long RNA rings. Presumably, feedback loops during RNA ring growth in the course of evolution (when amino acid assignment fixed the genetic code) assigned Arg to codons lacking CG (AGR) to avoid CG. Hence, as a chemical property of base pairs, CG mutability restructured the genetic code, thereby establishing itself as genetically encoded biological information.
Affiliation(s)
- Jacques Demongeot
  - Laboratory AGEIS EA 7407, Team Tools for e-Gnosis Medical & Labcom CNRS/UGA/OrangeLabs Telecom4Health, Faculty of Medicine, Université Grenoble Alpes, La Tronche, France
- Andrés Moreira
  - Departamento de Informática, Universidad Técnica Federico Santa María, Santiago, Chile
- Hervé Seligmann
  - Laboratory AGEIS EA 7407, Team Tools for e-Gnosis Medical & Labcom CNRS/UGA/OrangeLabs Telecom4Health, Faculty of Medicine, Université Grenoble Alpes, La Tronche, France
  - The National Natural History Collections, The Hebrew University of Jerusalem, Jerusalem, Israel
  - Institute of Microstructure Technology, Karlsruhe Institute of Technology (KIT), Eggenstein-Leopoldshafen, Germany
7
Demongeot J, Seligmann H. Codon assignment evolvability in theoretical minimal RNA rings. Gene 2020; 769:145208. [PMID: 33031892; DOI: 10.1016/j.gene.2020.145208] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/13/2020] [Revised: 09/28/2020] [Accepted: 09/29/2020] [Indexed: 12/28/2022]
Abstract
Codon-amino acid assignments of the genetic code evolve for 15 of the 64 codons (23.4%): AAA, AGA, AGG, ATA, CGG, CTA, CTG, CTC, CTT, TAA, TAG, TCA, TCG, TGA and TTA (GNN codons are notably absent), across the 31 genetic codes (NCBI list completed with recently suggested green algal mitochondrial genetic codes). Their usage in 25 theoretical minimal RNA rings is examined. RNA rings are designed in silico to code once over the shortest length for all 22 coding signals (start and stop codons and each amino acid according to the standard genetic code). Though designed along coding constraints, RNA rings resemble ancestral tRNA loops, assigning to each RNA ring a putative anticodon, a cognate amino acid and an evolutionary genetic code integration rank for that cognate amino acid. Analyses here show that (1) there are biases against/for evolvable codons in the two first vs last thirds of RNA ring coding sequences, (2) RNA rings with evolvable codons have recent cognates, and (3) evolvable codon and cytosine numbers in RNA ring compositions are positively correlated. Applying alternative genetic codes to RNA rings designed for nonredundant coding according to the standard genetic code reveals unsuspected properties of the standard genetic code and of RNA rings, notably on codon assignment evolvability and the special role of cytosine in relation to codon assignment evolvability and of the genetic code's coding structure.
Affiliation(s)
- Jacques Demongeot
  - Université Grenoble Alpes, Faculty of Medicine, Laboratory AGEIS EA 7407, Team Tools for e-Gnosis Medical, F-38700 La Tronche, France
- Hervé Seligmann
  - The National Natural History Collections, The Hebrew University of Jerusalem, 91404 Jerusalem, Israel
8
Di Giulio M. LUCA as well as the ancestors of archaea, bacteria and eukaryotes were progenotes: Inference from the distribution and diversity of the reading mechanism of the AUA and AUG codons in the domains of life. Biosystems 2020; 198:104239. [PMID: 32919036; DOI: 10.1016/j.biosystems.2020.104239] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 06/25/2020] [Revised: 09/01/2020] [Accepted: 09/01/2020] [Indexed: 11/25/2022]
Abstract
Here I use the following rationale: if it is possible to identify the evolutionary stage at which a trait that functions in some aspect of the genetic code or, more generally, in protein synthesis originated, then that evolutionary stage must have been characterized by high translational noise, because the trait originated for the first time during that stage. That is to say, if this trait had a non-marginal role in the realization of the genetic code, or in protein synthesis, then the origin of this trait would imply that, more generally, the genetic code itself was still originating. But if the genetic code was still originating at that precise evolutionary stage, then there was high translational noise, which in turn implies the presence of a protocell, i.e., a progenote, which is by definition characterized by high translational noise. I apply this rationale to the mechanism of modification of base 34 of the anticodon of an isoleucine tRNA that leads to the reading of AUA and AUG codons in archaea, bacteria and eukaryotes. The phylogenetic distribution of this mechanism in these phyletic lineages indicates that it originated only after the evolutionary stage of the last universal common ancestor (LUCA), namely, during the formation of cellular domains, i.e., at the stage of the ancestors of these main phyletic lineages. Furthermore, given that this mechanism of modification of base 34 of the anticodon of the isoleucine tRNA appears to have emerged at a stage in the origin of the genetic code, albeit in its terminal phases, all this would imply that the ancestors of bacteria, archaea and eukaryotes were progenotes. If so, then all the more so the LUCA would also have been a progenote, since it preceded these ancestors temporally. A consequence of all this reasoning might be that, since these three ancestors were progenotes that differed from each other, if at least one of them had evolved into at least two real and fundamentally different cells, then the number of cellular domains would be greater than three.
Affiliation(s)
- Massimo Di Giulio
  - The Ionian School, Genetic Code and tRNA Origin Laboratory, Via Roma 19, 67030, Alfedena (L'Aquila), Italy
  - Institute of Biosciences and Bioresources, National Research Council, Via P. Castellino, 111, 80131, Naples, Italy
9
Seligmann H. First arrived, first served: competition between codons for codon-amino acid stereochemical interactions determined early genetic code assignments. Naturwissenschaften 2020; 107:20. [PMID: 32367155; DOI: 10.1007/s00114-020-01676-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/15/2019] [Revised: 03/10/2020] [Accepted: 04/05/2020] [Indexed: 12/12/2022]
Abstract
Stereochemical nucleotide-amino acid interactions, in the form of noncovalent nucleotide-amino acid interactions, potentially produced the genetic code's codon-amino acid assignments. Empirical estimates of single nucleotide-amino acid affinities on surfaces and in solution are used to test whether trinucleotide-amino acid affinities determined genetic code assignments under the principle "first arrived, first served": presumed early amino acids have greater codon-amino acid affinities than later ones. Here, these single nucleotide affinities are used to approximate all 64 × 20 trinucleotide-amino acid affinities. Analyses show that (1) on surfaces, genetic code codon-amino acid assignments tend to match high affinities for the amino acids that were integrated earliest into the genetic code (according to Wong's metabolic coevolution hypothesis between nucleotides and amino acids) and (2) in solution, the same principle holds for the anticodon-amino acid assignments. Affinity analyses best match genetic code assignments when assuming that trinucleotides competed for amino acids, rather than amino acids for trinucleotides. Codon-amino acid affinities match genetic code assignments better than anticodon-amino acid affinities do. Presumably, two independent coding systems, on surfaces and in solution, converged and formed the current translation system. Proto-translation on surfaces by direct codon-amino acid interactions without tRNA-like adaptors coadapted with a system emerging in solution by proto-tRNA anticodon-amino acid interactions. These systems assigned identical or similar cognates to codons on surfaces and to anticodons in solution. Results indicate that a prebiotic metabolism predated genetic code self-organization.
Affiliation(s)
- Hervé Seligmann
  - The National Natural History Collections, The Hebrew University of Jerusalem, 91904, Jerusalem, Israel
  - Faculty of Medicine, Université Grenoble Alpes, Laboratory AGEIS EA 7407, Team Tools for e-Gnosis Medical & Labcom CNRS/UGA/OrangeLabs Telecoms4Health, F-38700, La Tronche, France
10
Błażej P, Wnetrzak M, Mackiewicz D, Mackiewicz P. Basic principles of the genetic code extension. Royal Society Open Science 2020; 7:191384. [PMID: 32257313; PMCID: PMC7062095; DOI: 10.1098/rsos.191384] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/22/2019] [Accepted: 01/09/2020] [Indexed: 05/08/2023]
Abstract
Compounds including non-canonical amino acids (ncAAs) or other artificially designed molecules can find a lot of applications in medicine, industry and biotechnology. They can be produced thanks to the modification or extension of the standard genetic code (SGC). Such peptides or proteins including the ncAAs can be constantly delivered in a stable way by organisms with the customized genetic code. Among several methods of engineering the code, using non-canonical base pairs is especially promising, because it enables generating many new codons, which can be used to encode any new amino acid. Since even one pair of new bases can extend the SGC up to 216 codons generated by a six-letter nucleotide alphabet, the extension of the SGC can be achieved in many ways. Here, we proposed a stepwise procedure of the SGC extension with one pair of non-canonical bases to minimize the consequences of point mutations. We reported relationships between codons in the framework of graph theory. All 216 codons were represented as nodes of the graph, whereas its edges were induced by all possible single nucleotide mutations occurring between codons. Therefore, every set of canonical and newly added codons induces a specific subgraph. We characterized the properties of the induced subgraphs generated by selected sets of codons. Thanks to that, we were able to describe a procedure for incremental addition of the set of meaningful codons up to the full coding system consisting of three pairs of bases. The procedure of gradual extension of the SGC makes the whole system robust to changing genetic information due to mutations and is compatible with the views assuming that codons and amino acids were added successively to the primordial SGC, which evolved minimizing harmful consequences of mutations or mistranslations of encoded proteins.
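The graph representation described in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' procedure: the letters X and Y are hypothetical placeholders for the one pair of non-canonical bases, and the helper names are invented here. Nodes are codons over the six-letter alphabet, and an edge joins two codons that differ by a single nucleotide.

```python
from itertools import product


def codon_graph(alphabet):
    """Return all codons over the alphabet and the undirected single-substitution edges."""
    codons = ["".join(c) for c in product(alphabet, repeat=3)]
    edges = set()
    for codon in codons:
        for pos in range(3):
            for base in alphabet:
                if base != codon[pos]:
                    neighbor = codon[:pos] + base + codon[pos + 1:]
                    edges.add(frozenset((codon, neighbor)))
    return codons, edges


def induced_edges(edges, subset):
    """Edges of the subgraph induced by a chosen set of codons."""
    subset = set(subset)
    return {edge for edge in edges if edge <= subset}


# "X" and "Y" are placeholder names for the added non-canonical base pair.
CODONS, EDGES = codon_graph("ACGTXY")
CANONICAL = [c for c in CODONS if set(c) <= set("ACGT")]
CANONICAL_EDGES = induced_edges(EDGES, CANONICAL)
```

In the full graph every codon has 3 × 5 = 15 neighbors, giving 216 × 15 / 2 = 1620 edges, while the subgraph induced by the 64 canonical codons has 64 × 9 / 2 = 288 edges. Comparing such induced subgraphs for different codon sets is the kind of analysis the abstract refers to.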
Affiliation(s)
- Paweł Błażej
  - Department of Bioinformatics and Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, Poland
11
Determining amino acid scores of the genetic code table: Complementarity, structure, function and evolution. Biosystems 2020; 187:104026. [DOI: 10.1016/j.biosystems.2019.104026] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 08/26/2019] [Accepted: 08/28/2019] [Indexed: 11/22/2022]
12
Optimization of the standard genetic code in terms of two mutation types: Point mutations and frameshifts. Biosystems 2019; 181:44-50. [DOI: 10.1016/j.biosystems.2019.04.012] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 03/18/2019] [Accepted: 04/27/2019] [Indexed: 02/08/2023]