1
Gardes J, Maldivi C, Boisset D, Aubourg T, Demongeot J. An Unsupervised Classifier for Whole-Genome Phylogenies, the Maxwell© Tool. Int J Mol Sci 2023; 24:16278. [PMID: 38003468 PMCID: PMC10671764 DOI: 10.3390/ijms242216278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2023] [Revised: 10/20/2023] [Accepted: 11/02/2023] [Indexed: 11/26/2023] [Open Access]
Abstract
The development of phylogenetic trees based on RNA or DNA sequences generally requires a precise and limited choice of important RNAs, e.g., messenger RNAs of essential proteins or ribosomal RNAs (like 16S), but rarely complete genomes, making it possible to explain evolution and speciation. In this article, we propose revisiting a classic phylogeny of archaea from only the information on the succession of nucleotides of their entire genome. For this purpose, we use a new tool, the unsupervised classifier Maxwell, whose principle lies in the Burrows-Wheeler compression transform, and we show its efficiency in clustering whole archaeal genomes.
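As a point of reference for the transform named in this abstract, here is a minimal sketch of the textbook Burrows-Wheeler transform and its inverse. It is not the Maxwell tool's implementation, which the abstract does not describe; the function names `bwt` and `inverse_bwt` and the `$` sentinel are illustrative assumptions, and the naive rotation sort is only practical for short strings.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive Burrows-Wheeler transform: sort all rotations, keep the last column.

    The sentinel is an assumed end-of-string marker that must not occur in text.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, sentinel: str = "$") -> str:
    """Invert the transform by repeatedly prepending the last column and sorting."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The original string is the rotation that ends with the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]
```

For example, `bwt("banana")` returns `"annb$aa"`: the transform tends to group identical characters into runs, which is the property that makes it useful as a compression step.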
Affiliation(s)
- Joël Gardes
- Orange Labs, 38229 Meylan, France; (J.G.); (C.M.); (D.B.)
- Denis Boisset
- Orange Labs, 38229 Meylan, France; (J.G.); (C.M.); (D.B.)
- Timothée Aubourg
- Faculty of Medicine, Université Grenoble Alpes, AGEIS EA 7407 Tools for e-Gnosis Medical, 38700 La Tronche, France
- Jacques Demongeot
- Faculty of Medicine, Université Grenoble Alpes, AGEIS EA 7407 Tools for e-Gnosis Medical, 38700 La Tronche, France
2
Factors in Protobiomonomer Selection for the Origin of the Standard Genetic Code. Acta Biotheor 2021; 69:745-767. [PMID: 34283307 DOI: 10.1007/s10441-021-09420-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2020] [Accepted: 07/01/2021] [Indexed: 10/20/2022]
Abstract
Natural selection of specific protobiomonomers during abiogenic development of the prototype genetic code is hindered by the diversity of structural, spatial, and rotational isomers that have identical elemental composition and molecular mass (M), but can vary significantly in their physicochemical characteristics, such as the melting temperature Tm, the Tm:M ratio, and the solubility in water, due to different positions of atoms in the molecule. These parameters differ between cis- and trans-isomers of dicarboxylic acids, spatial monosaccharide isomers, and structural isomers of α-, β-, and γ-amino acids. The stable planar heterocyclic molecules of the major nucleobases comprise four (C, H, N, O) or three (C, H, N) elements and contain a single -C=C bond and two nitrogen atoms in each heterocycle involved in C-N and C=N bonds. They exist as isomeric resonance hybrids of single and double bonds and as a mixture of tautomer forms due to the presence of -C=O and/or -NH2 side groups. They are thermostable, insoluble in water, and exhibit solid-state stability, which is of central importance for DNA molecules as carriers of genetic information. In M-Tm diagrams, proteinogenic amino acids and the corresponding codons are distributed fairly regularly relative to the distinct clusters of purine and pyrimidine bases, reflecting the correspondence between codons and amino acids that was established in different periods of genetic code development. The body of data on the evolution of the genetic code system indicates that the elemental composition and molecular structure of protobiomonomers, and their M, Tm, photostability, and aqueous solubility determined their selection in the emergence of the standard genetic code.
3
Vallée Y, Youssef-Saliba S. Sulfur Amino Acids: From Prebiotic Chemistry to Biology and Vice Versa. Synthesis 2021. [DOI: 10.1055/a-1472-7914] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Abstract
Two sulfur-containing amino acids are included in the list of the 20 classical protein amino acids. A methionine residue is introduced at the start of the synthesis of all current proteins. Cysteine, thanks to its thiol function, plays an essential role in a very large number of catalytic sites. Here we present what is known about the prebiotic synthesis of these two amino acids and homocysteine, and we discuss their introduction into primitive peptides and more elaborate proteins.
1 Introduction
2 Sulfur Sources
3 Prebiotic Synthesis of Cysteine
4 Prebiotic Synthesis of Methionine
5 Homocysteine and Its Thiolactone
6 Methionine and Cystine in Proteins
7 Prebiotic Scenarios Using Sulfur Amino Acids
8 Introduction of Cys and Met in the Genetic Code
9 Conclusion
4
Shenhav L, Zeevi D. Resource conservation manifests in the genetic code. Science 2020; 370:683-687. [PMID: 33154134 DOI: 10.1126/science.aaz9642] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 10/23/2019] [Revised: 05/04/2020] [Accepted: 09/11/2020] [Indexed: 12/31/2022]
Abstract
Nutrient limitation drives competition for resources across organisms. However, much is unknown about how selective pressures resulting from nutrient limitation shape microbial coding sequences. Here, we study this "resource-driven selection" by using metagenomic and single-cell data of marine microbes, alongside environmental measurements. We show that a significant portion of the selection exerted on microbes is explained by the environment and is associated with nitrogen availability. Notably, this resource conservation optimization is encoded in the structure of the standard genetic code, providing robustness against mutations that increase carbon and nitrogen incorporation into protein sequences. This robustness generalizes to codon choices from multiple taxa across all domains of life, including the human genome.
Affiliation(s)
- Liat Shenhav
- Center for Studies in Physics and Biology, Rockefeller University, New York, NY, USA
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, USA
- David Zeevi
- Center for Studies in Physics and Biology, Rockefeller University, New York, NY, USA
5
Energy mapping of the genetic code and genomic domains: implications for code evolution and molecular Darwinism. Q Rev Biophys 2020; 53:e11. [PMID: 33143792 DOI: 10.1017/s0033583520000098] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/07/2022]
Abstract
When the iconic DNA genetic code is expressed in terms of energy differentials, one observes that information embedded in chemical sequences, including some biological outcomes, correlate with distinctive free energy profiles. Specifically, we find correlations between codon usage and codon free energy, suggestive of a thermodynamic selection for codon usage. We also find correlations between what are considered ancient amino acids and high codon free energy values. Such correlations may be reflective of the sequence-based genetic code fundamentally mapping as an energy code. In such a perspective, one can envision the genetic code as composed of interlocking thermodynamic cycles that allow codons to 'evolve' from each other through a series of sequential transitions and transversions, which are influenced by an energy landscape modulated by both thermodynamic and kinetic factors. As such, early evolution of the genetic code may have been driven, in part, by differential energetics, as opposed exclusively by the functionality of any gene product. In such a scenario, evolutionary pressures can, in part, derive from the optimization of biophysical properties (e.g. relative stabilities and relative rates), in addition to the classic perspective of being driven by a phenotypical adaptive advantage (natural selection). Such differential energy mapping of the genetic code, as well as larger genomic domains, may reflect an energetically resolved and evolved genomic landscape, consistent with a type of differential, energy-driven 'molecular Darwinism'. It should not be surprising that evolution of the code was influenced by differential energetics, as thermodynamics is the most general and universal branch of science that operates over all time and length scales.
6
A search for the physical basis of the genetic code. Biosystems 2020; 195:104148. [DOI: 10.1016/j.biosystems.2020.104148] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/28/2020] [Revised: 04/09/2020] [Accepted: 04/09/2020] [Indexed: 01/01/2023]
7
Rogers SO. Evolution of the genetic code based on conservative changes of codons, amino acids, and aminoacyl tRNA synthetases. J Theor Biol 2019; 466:1-10. [PMID: 30658052 DOI: 10.1016/j.jtbi.2019.01.022] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 07/19/2018] [Revised: 01/10/2019] [Accepted: 01/14/2019] [Indexed: 11/30/2022]
Abstract
The genetic code, as arranged in the standard tabular form, displays a non-random structure relating to the characteristics of the amino acids. An alternative arrangement can be made by organizing the code according to aminoacyl-tRNA synthetases (aaRSs), codons, and reverse complement codons, which illuminates a coevolutionary process that led to the contemporary genetic code. As amino acids were added to the genetic code, they were recognized by aaRSs that interact with stereochemically similar amino acids. Single nucleotide changes in the codons and anticodons were favored over more extensive changes, such that there was a logical stepwise progression in the evolution of the genetic code. The model presented traces the evolution of the genetic code accounting for these steps. Amino acid frequencies in ancient proteins and the preponderance of GNN codons in mRNAs for ancient proteins indicate that the genetic code began with alanine, aspartate, glutamate, glycine, and valine, with alanine being in the highest proportions. In addition to being consistent in terms of conservative changes in codon nucleotides, the model also is consistent with respect to aaRS classes, aaRS attachment to the tRNA, amino acid stereochemistry, and to a large extent with amino acid physicochemistry, and biochemical pathways.
Affiliation(s)
- Scott O Rogers
- Department of Biological Sciences, Bowling Green State University, Bowling Green, OH, United States.
8
Abstract
We advocate for a tRNA- rather than an mRNA-centric model for evolution of the genetic code. The mechanism for evolution of cloverleaf tRNA provides a root sequence for radiation of tRNAs and suggests a simplified understanding of code evolution. To analyze code sectoring, rooted tRNAomes were compared for several archaeal and one bacterial species. Rooting of tRNAome trees reveals conserved structures, indicating how the code was shaped during evolution and suggesting a model for evolution of a LUCA tRNAome tree. We propose the polyglycine hypothesis that the initial product of the genetic code may have been short chain polyglycine to stabilize protocells. In order to describe how anticodons were allotted in evolution, the sectoring-degeneracy hypothesis is proposed. Based on sectoring, a simple stepwise model is developed, in which the code sectors from a 1→4→8→∼16 letter code. At initial stages of code evolution, we posit strong positive selection for wobble base ambiguity, supporting convergence to 4-codon sectors and ∼16 letters. In a later stage, ∼5–6 letters, including stops, were added through innovating at the anticodon wobble position. In archaea and bacteria, tRNA wobble adenine is negatively selected, shrinking the maximum size of the primordial genetic code to 48 anticodons. Because 64 codons are recognized in mRNA, tRNA-mRNA coevolution requires tRNA wobble position ambiguity leading to degeneracy of the code.
Affiliation(s)
- Daewoo Pak
- Center for Statistical Training and Consulting, Michigan State University, E. Lansing, MI 48824, USA
- Nan Du
- Computer Science and Engineering, Michigan State University, E. Lansing, MI 48824
- Yanni Sun
- Computer Science and Engineering, Michigan State University, E. Lansing, MI 48824
- Zachary F Burton
- Department of Biochemistry and Molecular Biology, Michigan State University, E. Lansing, MI 48824-1319
9
The evolution of the genetic code: Impasses and challenges. Biosystems 2018; 164:217-225. [DOI: 10.1016/j.biosystems.2017.10.006] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 07/16/2017] [Revised: 10/06/2017] [Accepted: 10/09/2017] [Indexed: 01/17/2023]
10
Triplet-Based Codon Organization Optimizes the Impact of Synonymous Mutation on Nucleic Acid Molecular Dynamics. J Mol Evol 2018; 86:91-102. [PMID: 29344693 PMCID: PMC5846835 DOI: 10.1007/s00239-018-9828-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 07/26/2017] [Accepted: 01/06/2018] [Indexed: 11/22/2022]
Abstract
Since the elucidation of the genetic code almost 50 years ago, many nonrandom aspects of its codon organization remain only partly resolved. Here, we investigate the recent hypothesis of ‘dual-use’ codons which proposes that in addition to allowing adjustment of codon optimization to tRNA abundance, the degeneracy in the triplet-based genetic code also multiplexes information regarding DNA’s helical shape and protein-binding dynamics while avoiding interference with other protein-level characteristics determined by amino acid properties. How such structural optimization of the code within eukaryotic chromatin could have arisen from an RNA world is a mystery, but would imply some preadaptation in an RNA context. We analyzed synonymous (protein-silent) and nonsynonymous (protein-altering) mutational impacts on molecular dynamics in 13823 identically degenerate alternative codon reorganizations, defined by codon transitions in 7680 GPU-accelerated molecular dynamic simulations of implicitly and explicitly solvated double-stranded aRNA and bDNA structures. When compared to all possible alternative codon assignments, the standard genetic code minimized the impact of synonymous mutations on the random atomic fluctuations and correlations of carbon backbone vector trajectories while facilitating the specific movements that contribute to DNA polymer flexibility. This trend was notably stronger in the context of RNA supporting the idea that dual-use codon optimization and informational multiplexing in DNA resulted from the preadaptation of the RNA duplex to resist changes to thermostability. The nonrandom and divergent molecular dynamics of synonymous mutations also imply that the triplet-based code may have resulted from adaptive functional expansion enabling a primordial doublet code to multiplex gene regulatory information via the shape and charge of the minor groove.
11
Affiliation(s)
- Eugene V. Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
- Artem S. Novozhilov
- Department of Mathematics, North Dakota State University, Fargo, North Dakota 58108, USA
12
Seligmann H, Warthi G. Genetic Code Optimization for Cotranslational Protein Folding: Codon Directional Asymmetry Correlates with Antiparallel Betasheets, tRNA Synthetase Classes. Comput Struct Biotechnol J 2017; 15:412-424. [PMID: 28924459 PMCID: PMC5591391 DOI: 10.1016/j.csbj.2017.08.001] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Received: 05/25/2017] [Revised: 07/20/2017] [Accepted: 08/05/2017] [Indexed: 12/14/2022] [Open Access]
Abstract
A new codon property, codon directional asymmetry in nucleotide content (CDA), reveals a biologically meaningful genetic code dimension: palindromic codons (first and last nucleotides identical, codon structure XZX) are symmetric (CDA = 0), codons with structures ZXX/XXZ are 5'/3' asymmetric (CDA = -1/1; CDA = -0.5/0.5 if Z and X are both purines or both pyrimidines; assigning negative/positive (-/+) signs is an arbitrary convention). Negative/positive CDAs associate with (a) Fujimoto's tetrahedral codon stereo-table; (b) tRNA synthetase class I/II (aminoacylate the 2'/3' hydroxyl group of the tRNA's last ribose, respectively); and (c) high/low antiparallel (not parallel) betasheet conformation parameters. Preliminary results suggest CDA-whole organism associations (body temperature, developmental stability, lifespan). Presumably, CDA impacts spatial kinetics of codon-anticodon interactions, affecting cotranslational protein folding. Some synonymous codons have opposite CDA sign (alanine, leucine, serine, and valine), putatively explaining how synonymous mutations sometimes affect protein function. Correlations between CDA and tRNA synthetase classes are weaker than between CDA and antiparallel betasheet conformation parameters. This effect is stronger for mitochondrial genetic codes, and potentially drives mitochondrial codon-amino acid reassignments. CDA reveals information ruling nucleotide-protein relations embedded in reversed (not reverse-complement) sequences (5'-ZXX-3'/5'-XXZ-3').
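The CDA rule stated in this abstract can be sketched in a few lines. This is only a partial reading: the abstract defines values for the XZX, ZXX and XXZ structures, so the hypothetical function `cda` below returns `None` for codons made of three different nucleotides, whose treatment the abstract does not spell out. The function and helper names are illustrative assumptions, not taken from the paper.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}


def same_class(x: str, z: str) -> bool:
    """True when both nucleotides are purines or both are pyrimidines."""
    return {x, z} <= PURINES or {x, z} <= PYRIMIDINES


def cda(codon: str):
    """Codon directional asymmetry for the cases given in the abstract.

    Returns 0 for XZX, -1 or -0.5 for ZXX, +1 or +0.5 for XXZ, and None for
    codons with three different nucleotides (not covered by the abstract).
    """
    c = codon.upper().replace("T", "U")  # accept DNA or RNA spelling
    if len(c) != 3 or any(b not in PURINES | PYRIMIDINES for b in c):
        raise ValueError(f"not a valid codon: {codon!r}")
    first, middle, last = c
    if first == last:            # XZX: symmetric
        return 0.0
    if middle == last:           # ZXX: odd nucleotide at the 5' end
        return -0.5 if same_class(first, middle) else -1.0
    if first == middle:          # XXZ: odd nucleotide at the 3' end
        return 0.5 if same_class(last, middle) else 1.0
    return None                  # three different nucleotides: undefined here
```

Under this reading, `cda("CAA")` gives -1.0 and `cda("AAG")` gives 0.5; the sign convention is arbitrary, as the abstract notes.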
Affiliation(s)
- Hervé Seligmann
- Aix-Marseille Univ, Unité de Recherche sur les Maladies Infectieuses et Tropicales Emergentes, UM 63, CNRS UMR7278, IRD 198, INSERM U1095, Institut Hospitalo-Universitaire Méditerranée-Infection, Marseille, Postal code 13385, France
- Dept. Ecol Evol Behav, Alexander Silberman Inst Life Sci, The Hebrew University of Jerusalem, IL-91904 Jerusalem, Israel
- Ganesh Warthi
- Aix-Marseille Univ, Unité de Recherche sur les Maladies Infectieuses et Tropicales Emergentes, UM 63, CNRS UMR7278, IRD 198, INSERM U1095, Institut Hospitalo-Universitaire Méditerranée-Infection, Marseille, Postal code 13385, France
13
Frozen Accident Pushing 50: Stereochemistry, Expansion, and Chance in the Evolution of the Genetic Code. Life (Basel) 2017; 7:life7020022. [PMID: 28545255 PMCID: PMC5492144 DOI: 10.3390/life7020022] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Received: 03/10/2017] [Revised: 05/19/2017] [Accepted: 05/20/2017] [Indexed: 12/31/2022] [Open Access]
Abstract
Nearly 50 years ago, Francis Crick propounded the frozen accident scenario for the evolution of the genetic code along with the hypothesis that the early translation system consisted primarily of RNA. Under the frozen accident perspective, the code is universal among modern life forms because any change in codon assignment would be highly deleterious. The frozen accident can be considered the default theory of code evolution because it does not imply any specific interactions between amino acids and the cognate codons or anticodons, or any particular properties of the code. The subsequent 49 years of code studies have elucidated notable features of the standard code, such as high robustness to errors, but failed to develop a compelling explanation for codon assignments. In particular, stereochemical affinity between amino acids and the cognate codons or anticodons does not seem to account for the origin and evolution of the code. Here, I expand Crick’s hypothesis on RNA-only translation system by presenting evidence that this early translation already attained high fidelity that allowed protein evolution. I outline an experimentally testable scenario for the evolution of the code that combines a distinct version of the stereochemical hypothesis, in which amino acids are recognized via unique sites in the tertiary structure of proto-tRNAs, rather than by anticodons, expansion of the code via proto-tRNA duplication, and the frozen accident.
14
Aggarwal N, Bandhu AV, Sengupta S. Finite population analysis of the effect of horizontal gene transfer on the origin of an universal and optimal genetic code. Phys Biol 2016; 13:036007. [DOI: 10.1088/1478-3975/13/3/036007] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 10/21/2022]
15
Pathways of Genetic Code Evolution in Ancient and Modern Organisms. J Mol Evol 2015; 80:229-43. [DOI: 10.1007/s00239-015-9686-8] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.1] [Received: 04/02/2015] [Accepted: 06/03/2015] [Indexed: 10/23/2022]
16
Guimarães RC. The Self-Referential Genetic Code is Biologic and Includes the Error Minimization Property. Orig Life Evol Biosph 2015; 45:69-75. [PMID: 25773583 DOI: 10.1007/s11084-015-9417-6] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 09/16/2014] [Accepted: 11/24/2014] [Indexed: 11/26/2022]
Abstract
The distribution of the triplet to amino acid correspondences in the genetic code matrix contains blocks of similarity. There are (a) groups of similar triplets coding for the same amino acid, which is called code degeneracy, and (b) clusters of similar amino acids corresponding to similar triplets. Processes that led to this regionalization have been investigated through a variety of perspectives but no consensus has been reached and no model has been convincing enough to drive experimental tests. Most traditional has been the hypothesis that the code was derived from the standard evolutionary processes of testing variations in the correspondences through the fitness measure of reaching distributions in the matrix space in an optimal manner so that the effects of mutations on protein phenotypes would be minimized, that is, with reduction of the intensity or of the deviant quality of the functional alterations associated with variations. In contrast, the self-referential model for the formation of the code is based on an original regionalization of characters through the concerted superposition of the two components of the encodings: the four modules of dimers of tRNAs are occupied sequentially by sets of amino acids that are also sequentially devoted to fulfilling specific functions in the protein sites and motifs to which they preferentially belong. Therewith, part (b) of the error-minimizing property follows. Part (a) of the property, the code degeneracy, is derived from the synthetase character of developing specificities directed initially to the principal dinucleotides of the triplets, resulting in tetracodonic degeneracy. This was later partly modified during evolution according to the developments of codon usage and the introduction of new amino acids.
Affiliation(s)
- Romeu Cardoso Guimarães
- Lab. Biodiversidade e Evolução Molecular, Dept. Biologia Geral, Inst. Ciências Biológicas, Univ. Federal de Minas Gerais, 31270.901, Belo Horizonte, MG, Brazil
17
Extraordinarily adaptive properties of the genetically encoded amino acids. Sci Rep 2015; 5:9414. [PMID: 25802223 PMCID: PMC4371090 DOI: 10.1038/srep09414] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.8] [Received: 10/29/2014] [Accepted: 02/12/2015] [Indexed: 02/02/2023] [Open Access]
Abstract
Using novel advances in computational chemistry, we demonstrate that the set of 20 genetically encoded amino acids, used nearly universally to construct all coded terrestrial proteins, has been highly influenced by natural selection. We defined an adaptive set of amino acids as one whose members thoroughly cover relevant physico-chemical properties, or “chemistry space.” Using this metric, we compared the encoded amino acid alphabet to random sets of amino acids. These random sets were drawn from a computationally generated compound library containing 1913 alternative amino acids that lie within the molecular weight range of the encoded amino acids. Sets that cover chemistry space better than the genetically encoded alphabet are extremely rare and energetically costly. Further analysis of more adaptive sets reveals common features and anomalies, and we explore their implications for synthetic biology. We present these computations as evidence that the set of 20 amino acids found within the standard genetic code is the result of considerable natural selection. The amino acids used for constructing coded proteins may represent a largely global optimum, such that any aqueous biochemistry would use a very similar set.
18
Babbitt GA, Alawad MA, Schulze KV, Hudson AO. Synonymous codon bias and functional constraint on GC3-related DNA backbone dynamics in the prokaryotic nucleoid. Nucleic Acids Res 2014; 42:10915-26. [PMID: 25200075 PMCID: PMC4176184 DOI: 10.1093/nar/gku811] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 01/01/2023] [Open Access]
Abstract
While mRNA stability has been demonstrated to control rates of translation, generating both global and local synonymous codon biases in many unicellular organisms, this explanation cannot adequately explain why codon bias strongly tracks neighboring intergene GC content; suggesting that structural dynamics of DNA might also influence codon choice. Because minor groove width is highly governed by 3-base periodicity in GC, the existence of triplet-based codons might imply a functional role for the optimization of local DNA molecular dynamics via GC content at synonymous sites (≈GC3). We confirm a strong association between GC3-related intrinsic DNA flexibility and codon bias across 24 different prokaryotic multiple whole-genome alignments. We develop a novel test of natural selection targeting synonymous sites and demonstrate that GC3-related DNA backbone dynamics have been subject to moderate selective pressure, perhaps contributing to our observation that many genes possess extreme DNA backbone dynamics for their given protein space. This dual function of codons may impose universal functional constraints affecting the evolution of synonymous and non-synonymous sites. We propose that synonymous sites may have evolved as an 'accessory' during an early expansion of a primordial genetic code, allowing for multiplexed protein coding and structural dynamic information within the same molecular context.
Affiliation(s)
- Gregory A Babbitt
- Thomas H. Gosnell School of Life Sciences, Rochester Institute of Technology, Rochester NY, USA 14623
- Mohammed A Alawad
- B. Thomas Golisano College of Computing and Information Sciences, Rochester Institute of Technology, Rochester NY, USA 14623
- Katharina V Schulze
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston TX, USA 77030
- André O Hudson
- Thomas H. Gosnell School of Life Sciences, Rochester Institute of Technology, Rochester NY, USA 14623
19
Salinas DG, Gallardo MO, Osorio MI. Probable relationship between partitions of the set of codons and the origin of the genetic code. Biosystems 2014; 117:77-81. [PMID: 24495914 DOI: 10.1016/j.biosystems.2014.01.007] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/17/2013] [Revised: 12/26/2013] [Accepted: 01/24/2014] [Indexed: 11/16/2022]
Abstract
Here we study the distribution of randomly generated partitions of the set of amino acid-coding codons. Some results are an application from a previous work, about the Stirling numbers of the second kind and triplet codes, both to the cases of triplet codes having four stop codons, as in mammalian mitochondrial genetic code, and hypothetical doublet codes. Extending previous results, in this work it is found that the most probable number of blocks of synonymous codons, in a genetic code, is similar to the number of amino acids when there are four stop codons, as well as it could be for a primigenious doublet code. Also it is studied the integer partitions associated to patterns of synonymous codons and it is shown, for the canonical code, that the standard deviation inside an integer partition is one of the most probable. We think that, in some early epoch, the genetic code might have had a maximum of the disorder or entropy, independent of the assignment between codons and amino acids, reaching a state similar to "code freeze" proposed by Francis Crick. In later stages, maybe deterministic rules have reassigned codons to amino acids, forming the natural codes, such as the canonical code, but keeping the numerical features describing the set partitions and the integer partitions, like a "fossil numbers"; both kinds of partitions about the set of amino acid-coding codons.
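The counting argument in this abstract can be illustrated with Stirling numbers of the second kind, S(n, k), which count the ways to partition n items into k non-empty blocks. The sketch below assumes that every set partition is equally likely, so the most probable number of blocks is the k that maximizes S(n, k). That assumption, the function names, and the codon counts used (60 sense codons for a triplet code with four stops, 16 for a hypothetical doublet code) are illustrative readings of the abstract rather than the authors' actual procedure.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n: int, k: int) -> int:
    """Stirling number of the second kind via S(n,k) = k*S(n-1,k) + S(n-1,k-1)."""
    if n == k:
        return 1          # includes S(0, 0) = 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def most_probable_blocks(n: int) -> int:
    """Block count k with the largest S(n, k), i.e. the mode under uniform sampling."""
    return max(range(1, n + 1), key=lambda k: stirling2(n, k))


if __name__ == "__main__":
    # 64 triplets minus 4 stop codons leaves 60 amino acid-coding codons.
    print("triplet code, 60 sense codons:", most_probable_blocks(60))
    # Hypothetical doublet code with all 16 doublets treated as sense codons.
    print("doublet code, 16 codons:", most_probable_blocks(16))
```

Python integers are arbitrary precision, so the very large values of S(60, k) are handled exactly without overflow.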
Affiliation(s)
- Dino G Salinas
- Centro de Investigación Biomédica, Facultad de Medicina, Universidad Diego Portales, Avda. Ejército 141, Santiago, Chile.
- Mauricio O Gallardo
- Centro de Investigación Biomédica, Facultad de Medicina, Universidad Diego Portales, Avda. Ejército 141, Santiago, Chile.
- Manuel I Osorio
- Centro de Investigación Biomédica, Facultad de Medicina, Universidad Diego Portales, Avda. Ejército 141, Santiago, Chile.
20
Bandhu AV, Aggarwal N, Sengupta S. Revisiting the physico-chemical hypothesis of code origin: an analysis based on code-sequence coevolution in a finite population. Orig Life Evol Biosph 2013; 43:465-89. [PMID: 24500541 DOI: 10.1007/s11084-014-9353-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 11/23/2013] [Accepted: 01/13/2014] [Indexed: 01/23/2023]
Abstract
The origin of the genetic code marked a major transition from a plausible RNA world to the world of DNA and proteins and is an important milestone in our understanding of the origin of life. We examine the efficacy of the physico-chemical hypothesis of code origin by carrying out simulations of code-sequence coevolution in finite populations in stages, leading first to the emergence of ten amino acid code(s) and subsequently to 14 amino acid code(s). We explore two different scenarios of primordial code evolution. In one scenario, competition occurs between populations of equilibrated code-sequence sets while in another scenario; new codes compete with existing codes as they are gradually introduced into the population with a finite probability. In either case, we find that natural selection between competing codes distinguished by differences in the degree of physico-chemical optimization is unable to explain the structure of the standard genetic code. The code whose structure is most consistent with the standard genetic code is often not among the codes that have a high fixation probability. However, we find that the composition of the code population affects the code fixation probability. A physico-chemically optimized code gets fixed with a significantly higher probability if it competes against a set of randomly generated codes. Our results suggest that physico-chemical optimization may not be the sole driving force in ensuring the emergence of the standard genetic code.
Affiliation(s)
- Ashutosh Vishwa Bandhu
- School of Computational & Integrative Sciences, Jawaharlal Nehru University, New Delhi, 110067, India
|
21
|
Rosandić M, Paar V, Glunčić M. Fundamental role of start/stop regulators in whole DNA and new trinucleotide classification. Gene 2013; 531:184-90. [PMID: 24042127 DOI: 10.1016/j.gene.2013.09.021] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 04/28/2013] [Revised: 08/31/2013] [Accepted: 09/05/2013] [Indexed: 10/26/2022]
Abstract
The origin and logic of the genetic code are two of the greatest mysteries of the life sciences. Analyzing DNA sequences, we showed that the start/stop trinucleotides have broader importance than just marking the start and stop of exons in coding DNA. On this basis, we introduce here a new classification of trinucleotides and show that all A+T rich trinucleotides consisting of three different nucleotides arise from start-ATG, stop-TGA and stop-TAG through their complement, reverse complement and reverse transformations. Through the same transformations during generations of crossing-over, they can switch from one form to the other. By a direct process, the start-ATG and stop-TAG can irreversibly transform into stop-TAA. By transformation into A+T rich trinucleotides and into 16 of the 32 C+G rich trinucleotides, they can lose the start/stop function and take the role of a sense codon in a reversible way. The remaining 16 C+G rich trinucleotides cannot directly transform into start/stop trinucleotides and thus remain a firm skeleton for structuring the C+G rich DNA. We showed that start/stops strongly enrich the A+T rich noncoding DNA through frequently extended forms. From the evolutionary viewpoint, the start/stops are the chief creators of the prevailing A+T rich noncoding DNA and of the more stable coding DNA. We propose that start/stops have a basic role as "seeds" in the trinucleotide evolution of noncoding and coding sequences and lead to an asymmetry between A+T and C+G rich DNA. Through dynamical transformations during evolution, they enabled pronounced phylogenetic broadness while keeping the regulator function.
Affiliation(s)
- Marija Rosandić
- Faculty of Science, University of Zagreb, Bijenička 32, 10000 Zagreb, Croatia
|
22
|
Povolotskaya IS, Kondrashov FA, Ledda A, Vlasov PK. Stop codons in bacteria are not selectively equivalent. Biol Direct 2012; 7:30. [PMID: 22974057 PMCID: PMC3549826 DOI: 10.1186/1745-6150-7-30] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 05/13/2012] [Accepted: 08/22/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The evolution and genomic frequencies of stop codons have not been rigorously studied, with the exception of the coding of non-canonical amino acids. Here we study the rate of evolution and frequency distribution of stop codons in bacterial genomes. RESULTS We show that in bacteria stop codons evolve more slowly than synonymous sites, suggesting the action of weak negative selection. However, the frequency of stop codons relative to genomic nucleotide content indicated that this selection regime is not straightforward. The frequency of TAA and TGA stop codons is GC-content dependent, with TAA decreasing and TGA increasing with GC-content, while TAG frequency is independent of GC-content. Applying a formal, analytical model to these data, we found that the relationship between stop codon frequencies and nucleotide content cannot be explained by mutational biases or selection on nucleotide content. However, with weak nucleotide content-dependent selection on TAG, -0.5 < Nes < 1.5, the model fits all of the data and recapitulates the relationship between TAG and nucleotide content. For biologically plausible rates of mutation we show that, in bacteria, the TAG stop codon is universally associated with lower fitness, with TAA being optimal for G-content < 16% while for G-content > 16% TGA has a higher fitness than TAG. CONCLUSIONS Our data indicate that the TAG codon is universally suboptimal in the bacterial lineage, such that TAA is likely to be the preferred stop codon for low GC content while TGA is the preferred stop codon for high GC content. The optimization of stop codon usage may therefore be useful in genome engineering or gene expression optimization applications.
Affiliation(s)
- Inna S Povolotskaya
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG) and UPF, 88 Dr. Aiguader, Barcelona 08003, Spain
|
23
|
de Crécy-Lagard V, Marck C, Grosjean H. Decoding in Candidatus Riesia pediculicola, close to a minimal tRNA modification set? TRENDS IN CELL & MOLECULAR BIOLOGY 2012; 7:11-34. [PMID: 23308034 PMCID: PMC3539174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2023]
Abstract
A comparative genomic analysis of Candidatus Riesia pediculicola, the recently sequenced unicellular endosymbiont of the human body louse with a reduced genome (582 Kb), revealed that it is the only known organism that might have lost all post-transcriptional base and ribose modifications of the tRNA body, retaining only modifications of the anticodon-stem-loop essential for mRNA decoding. Such a minimal tRNA modification set was not observed in other insect symbionts or in parasitic unicellular bacteria, such as Mycoplasma genitalium (580 Kb), that have also evolved by considerably reducing their genomes. This could be an example of a minimal tRNA modification set required for life, a question that has been at the center of the field for many years, especially for understanding the emergence and evolution of the genetic code.
Affiliation(s)
- Valérie de Crécy-Lagard
- Department of Microbiology and Cell Science, University of Florida, P.O. Box 110700, Gainesville, FL 32611-0700, USA
- Christian Marck
- Institut de Biologie et de Technologies de Saclay (iBiTec-S) Bât 144, CEA/Saclay, F-91191 Gif-sur-Yvette Cedex
- Henri Grosjean
- Centre de Génétique Moléculaire, UPR 3404, CNRS, Associée à l’Université Paris-Sud 11, FRC 3115, 91190 Gif-sur-Yvette, France
|
24
|
Mutuality in Discrete and Compositional Information: Perspectives for Synthetic Genetic Codes. Cognit Comput 2011. [DOI: 10.1007/s12559-011-9116-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/15/2022]
|
25
|
Vishnoi A, Sethupathy P, Simola D, Plotkin JB, Hannenhalli S. Genome-wide survey of natural selection on functional, structural, and network properties of polymorphic sites in Saccharomyces paradoxus. Mol Biol Evol 2011; 28:2615-27. [PMID: 21478372 DOI: 10.1093/molbev/msr085] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Indexed: 01/18/2023] Open
Abstract
BACKGROUND To characterize the genetic basis of phenotypic evolution, numerous studies have identified individual genes that have likely evolved under natural selection. However, phenotypic changes may represent the cumulative effect of similar evolutionary forces acting on functionally related groups of genes. Phylogenetic analyses of divergent yeast species have identified functional groups of genes that have evolved at significantly different rates, suggestive of differential selection on the functional properties. However, due to environmental heterogeneity over long evolutionary timescales, selection operating within a single lineage may be dramatically different, and it is not detectable via interspecific comparisons alone. Moreover, interspecific studies typically quantify selection on protein-coding regions using the D(n)/D(s) ratio, which cannot be extended easily to study selection on noncoding regions or synonymous sites. The population genetic-based analysis of selection operating within a single lineage ameliorates these limitations. FINDINGS We investigated selection on several properties associated with genes, promoters, or polymorphic sites, by analyzing the derived allele frequency spectrum of single nucleotide polymorphisms (SNPs) in 28 strains of Saccharomyces paradoxus. We found evidence for significant differential selection between many functionally relevant categories of SNPs, underscoring the utility of function-centric approaches for discovering signatures of natural selection. When comparable, our findings are largely consistent with previous studies based on interspecific comparisons, with one notable exception: our study finds that mutations from an ancient amino acid to a relatively new amino acid are selectively disfavored, whereas interspecific comparisons have found selection against ancient amino acids. 
Several of our findings have not been addressed through prior interspecific studies: we find that synonymous mutations from preferred to unpreferred codons are selected against and that synonymous SNPs in the linker regions of proteins are relatively less constrained than those within protein domains. CONCLUSIONS We present the first global survey of selection acting on various functional properties in S. paradoxus. We found that selection pressures previously detected over long evolutionary timescales have also shaped the evolution of S. paradoxus. Importantly, we also make novel discoveries untenable via conventional interspecific analyses.
|
26
|
Stability of the genetic code and optimal parameters of amino acids. J Theor Biol 2010; 269:57-63. [PMID: 20955716 DOI: 10.1016/j.jtbi.2010.10.015] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 07/08/2010] [Revised: 09/20/2010] [Accepted: 10/12/2010] [Indexed: 11/24/2022]
Abstract
The standard genetic code is known to be much more efficient in minimizing the adverse effects of misreading errors and one-point mutations than a random code having the same structure, i.e. the same number of codons coding for each particular amino acid. We study the inverse problem: how the code structure affects the optimal physico-chemical parameters of amino acids that ensure the highest stability of the genetic code. It is shown that the choice of two or more amino acids with given properties unambiguously determines all the others. In this sense, the code structure strictly determines the optimal parameters of amino acids, or the corresponding scales may be derived directly from the genetic code. In a code with the structure of the standard genetic code, the resulting values for hydrophobicity obtained in the "leave one out" scheme and in the scheme with fixed maximum and minimum parameters correlate significantly with the natural scale. The comparison of the optimal and natural parameters allows assessing the relative impact of physico-chemical and error-minimization factors during the evolution of the genetic code. As the resulting optimal scale depends on the choice of amino acids with given parameters, the technique can also be applied to testing various scenarios of code evolution with an increasing number of codified amino acids. Our results indicate the co-evolution of the genetic code and the physico-chemical properties of recruited amino acids.
|
27
|
Fidelity in archaeal information processing. ARCHAEA-AN INTERNATIONAL MICROBIOLOGICAL JOURNAL 2010; 2010. [PMID: 20871851 PMCID: PMC2943090 DOI: 10.1155/2010/960298] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 05/15/2010] [Accepted: 07/12/2010] [Indexed: 12/30/2022]
Abstract
A key element during the flow of genetic information in living systems is fidelity. The accuracy of DNA replication influences the genome size as well as the rate of genome evolution. The large amount of energy invested in gene expression implies that fidelity plays a major role in fitness. On the other hand, an increase in fidelity generally coincides with a decrease in velocity. Hence, an important determinant of the evolution of life has been the establishment of a delicate balance between fidelity and variability. This paper reviews the current knowledge on quality control in archaeal information processing. While the majority of these processes are homologous in Archaea, Bacteria, and Eukaryotes, examples are provided of nonorthologous factors and processes operating in the archaeal domain. In some instances, evidence for the existence of certain fidelity mechanisms has been provided, but the factors involved still remain to be identified.
|