1
Wesp V, Theißen G, Schuster S. Statistical analysis of synonymous and stop codons in pseudo-random and real sequences as a function of GC content. Sci Rep 2023; 13:22996. [PMID: 38151539 PMCID: PMC10752896 DOI: 10.1038/s41598-023-49626-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Accepted: 12/10/2023] [Indexed: 12/29/2023] Open
Abstract
Knowledge of the frequencies of synonymous triplets in protein-coding and non-coding DNA stretches can be used in gene finding. These frequencies depend on the GC content of the genome or parts of it. An example of interest is provided by stop codons. This is relevant for the definition of Open Reading Frames. A generic case is provided by pseudo-random sequences, especially when they code for complex proteins or when they are non-coding and not subject to selection pressure. Here, we calculate, for such sequences and for all 25 known genetic codes, the frequency of each amino acid and stop codon based on their set of codons and as a function of GC content. The amino acids can be classified into five groups according to the GC content where their expected frequency reaches its maximum. We determine the overall Shannon information based on groups of synonymous codons and show that it becomes maximum at a percent GC of 43.3% (for the standard code). This is in line with the observation that in most fungi, plants, and animals, this genomic parameter is in the range from 35 to 50%. By analysing natural sequences, we show that there is a clear bias for triplets corresponding to stop codons near the 5'- and 3'-splice sites in the introns of various clades.
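As a minimal sketch of the kind of calculation this abstract describes, the following Python snippet computes the expected frequency of each amino acid and of the stop signal in a pseudo-random sequence under the standard genetic code. It assumes independent nucleotides with P(G) = P(C) = GC/2 and P(A) = P(T) = (1 - GC)/2. That model and the names `STANDARD_CODE` and `group_frequencies` are illustrative assumptions, not the authors' implementation.

```python
from itertools import product

BASES = "TCAG"
# NCBI translation table 1 (standard code); codons ordered TTT, TTC, TTA, TTG, TCT, ...
STANDARD_CODE = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def group_frequencies(gc):
    """Expected frequency of each amino acid ('*' = stop) at GC fraction `gc`."""
    # Assumed model: independent positions, G and C equally likely, A and T equally likely.
    p = {"G": gc / 2, "C": gc / 2, "A": (1 - gc) / 2, "T": (1 - gc) / 2}
    freqs = {}
    for codon, label in zip(product(BASES, repeat=3), STANDARD_CODE):
        prob = p[codon[0]] * p[codon[1]] * p[codon[2]]
        # Sum the probabilities of all synonymous codons of one label.
        freqs[label] = freqs.get(label, 0.0) + prob
    return freqs


for gc in (0.3, 0.5, 0.7):
    print(gc, round(group_frequencies(gc)["*"], 4))
```

Under this assumed model the stop-signal frequency is 3/64 at 50% GC and rises as GC content falls, because TAA, TAG and TGA are AT-rich.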
Affiliation(s)
- Valentin Wesp
- Department of Bioinformatics, Matthias Schleiden Institute, Friedrich Schiller University Jena, Ernst-Abbe-Platz 2, 07743, Jena, Germany
- Günter Theißen
- Department of Genetics, Matthias Schleiden Institute, Friedrich Schiller University Jena, Philosophenweg 12, 07743, Jena, Germany
- Stefan Schuster
- Department of Bioinformatics, Matthias Schleiden Institute, Friedrich Schiller University Jena, Ernst-Abbe-Platz 2, 07743, Jena, Germany
2
Pawlak K, Błażej P, Mackiewicz D, Mackiewicz P. The Influence of the Selection at the Amino Acid Level on Synonymous Codon Usage from the Viewpoint of Alternative Genetic Codes. Int J Mol Sci 2023; 24:ijms24021185. [PMID: 36674703 PMCID: PMC9866869 DOI: 10.3390/ijms24021185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 12/19/2022] [Accepted: 12/30/2022] [Indexed: 01/11/2023] Open
Abstract
Synonymous codon usage can be influenced by mutations and/or selection, e.g., for speed of protein translation and correct folding. However, this codon bias can also be affected by a general selection at the amino acid level due to differences in the acceptance of the loss and generation of these codons. To assess the importance of this effect, we constructed a mutation-selection model, in which we generated almost 90,000 stationary nucleotide distributions produced by mutational processes and applied a selection based on differences in physicochemical properties of amino acids. Under these conditions, we calculated the usage of fourfold degenerated (4FD) codons and compared it with the usage characteristic of the pure mutations. We considered both the standard genetic code (SGC) and alternative genetic codes (AGCs). The analyses showed that a majority of AGCs produced a greater 4FD codon bias than the SGC. The mutations producing more thymine or adenine than guanine and cytosine increased the differences in usage. On the other hand, the mutational pressures generating a lot of cytosine or guanine with a low content of adenine and thymine decreased this bias because the nucleotide content of most 4FD codons stayed in the compositional equilibrium with these pressures. The comparison of the theoretical results with those for real protein coding sequences showed that the influence of selection at the amino acid level on the synonymous codon usage cannot be neglected. The analyses indicate that the effect of amino acid selection cannot be disregarded and that it can interfere with other selection factors influencing codon usage, especially in AT-rich genomes, in which AGCs are usually used.
3
Kachale A, Pavlíková Z, Nenarokova A, Roithová A, Durante IM, Miletínová P, Záhonová K, Nenarokov S, Votýpka J, Horáková E, Ross RL, Yurchenko V, Beznosková P, Paris Z, Valášek LS, Lukeš J. Short tRNA anticodon stem and mutant eRF1 allow stop codon reassignment. Nature 2023; 613:751-758. [PMID: 36631608 DOI: 10.1038/s41586-022-05584-2] [Citation(s) in RCA: 19] [Impact Index Per Article: 19.0] [Received: 01/14/2022] [Accepted: 11/18/2022] [Indexed: 01/13/2023]
Abstract
Cognate tRNAs deliver specific amino acids to translating ribosomes according to the standard genetic code, and three codons with no cognate tRNAs serve as stop codons. Some protists have reassigned all stop codons as sense codons, neglecting this fundamental principle (refs. 1-4). Here we analyse the in-frame stop codons in 7,259 predicted protein-coding genes of a previously undescribed trypanosomatid, Blastocrithidia nonstop. We reveal that in this species in-frame stop codons are underrepresented in genes expressed at high levels and that UAA serves as the only termination codon. Whereas new tRNAs(Glu) fully cognate to UAG and UAA evolved to reassign these stop codons, the UGA reassignment followed a different path through shortening the anticodon stem of tRNA(Trp)(CCA) from five to four base pairs (bp). The canonical 5-bp tRNA(Trp) recognizes UGG as dictated by the genetic code, whereas its shortened 4-bp variant incorporates tryptophan also into in-frame UGA. Mimicking this evolutionary twist by engineering both variants from B. nonstop, Trypanosoma brucei and Saccharomyces cerevisiae and expressing them in the last two species, we recorded a significantly higher readthrough for all 4-bp variants. Furthermore, a gene encoding B. nonstop release factor 1 acquired a mutation that specifically restricts UGA recognition, robustly potentiating the UGA reassignment. Virtually the same strategy has been adopted by the ciliate Condylostoma magnum. Hence, we describe a previously unknown, universal mechanism that has been exploited in unrelated eukaryotes with reassigned stop codons.
Affiliation(s)
- Ambar Kachale
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice, Czech Republic; Faculty of Sciences, University of South Bohemia, České Budějovice, Czech Republic
- Zuzana Pavlíková
- Institute of Microbiology, Czech Academy of Sciences, Prague, Czech Republic
- Anna Nenarokova
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice, Czech Republic; Faculty of Sciences, University of South Bohemia, České Budějovice, Czech Republic; School of Biological Sciences, University of Bristol, Bristol, UK
- Adriana Roithová
- Institute of Microbiology, Czech Academy of Sciences, Prague, Czech Republic
- Ignacio M Durante
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice, Czech Republic
- Petra Miletínová
- Institute of Microbiology, Czech Academy of Sciences, Prague, Czech Republic
- Kristína Záhonová
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice, Czech Republic; Faculty of Science, Charles University, BIOCEV, Prague, Czech Republic; Life Science Research Centre, Faculty of Science, University of Ostrava, Ostrava, Czech Republic
- Serafim Nenarokov
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice, Czech Republic; Faculty of Sciences, University of South Bohemia, České Budějovice, Czech Republic
- Jan Votýpka
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice, Czech Republic; Faculty of Science, Charles University, BIOCEV, Prague, Czech Republic
- Eva Horáková
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice, Czech Republic; Institute of Microbiology, Czech Academy of Sciences, Třeboň, Czech Republic
- Vyacheslav Yurchenko
- Life Science Research Centre, Faculty of Science, University of Ostrava, Ostrava, Czech Republic
- Petra Beznosková
- Institute of Microbiology, Czech Academy of Sciences, Prague, Czech Republic
- Zdeněk Paris
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice, Czech Republic; Faculty of Sciences, University of South Bohemia, České Budějovice, Czech Republic
- Julius Lukeš
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice, Czech Republic; Faculty of Sciences, University of South Bohemia, České Budějovice, Czech Republic
4
Matsuo E, Morita K, Nakayama T, Yazaki E, Sarai C, Takahashi K, Iwataki M, Inagaki Y. Comparative Plastid Genomics of Green-Colored Dinoflagellates Unveils Parallel Genome Compaction and RNA Editing. Front Plant Sci 2022; 13:918543. [PMID: 35898209 PMCID: PMC9309888 DOI: 10.3389/fpls.2022.918543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2022] [Accepted: 06/22/2022] [Indexed: 06/15/2023]
Abstract
Dinoflagellates possess plastids that are diverse in both pigmentation and evolutionary background. One of the plastid types found in dinoflagellates is pigmented with chlorophylls a and b (Chl a + b) and originated from endosymbionts belonging to a small group of green algae, Pedinophyceae. The Chl a + b-containing plastids have been found in three distantly related dinoflagellates, Lepidodinium spp., strain MGD, and strain TGD, and were proposed to be derived from separate partnerships between a dinoflagellate (host) and a pedinophycean green alga (endosymbiont). Prior to this study, a plastid genome sequence was only available for L. chlorophorum, which was reported to bear features not found in the plastid genome of the pedinophycean green alga Pedinomonas minor, a putative close relative of the endosymbiont that gave rise to the current Chl a + b-containing plastid. In this study, we sequenced the plastid genomes of strains MGD and TGD to compare with those of L. chlorophorum as well as pedinophycean green algae. The mapping of the RNA-seq reads on the corresponding plastid genome identified RNA editing on plastid gene transcripts in the three dinoflagellates. Further, the comparative plastid genomics revealed that the plastid genomes of the three dinoflagellates acquired, in parallel, several features that are absent from, or much less pronounced in, the pedinophycean plastid genomes determined to date.
Affiliation(s)
- Eriko Matsuo
- Graduate School of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Japan
- Kounosuke Morita
- Graduate School of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Japan
- Takuro Nakayama
- Graduate School of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Japan
- Center for Computational Sciences, University of Tsukuba, Tsukuba, Japan
- Chihiro Sarai
- Graduate School of Science and Engineering, Yamagata University, Yamagata, Japan
- Kazuya Takahashi
- Asian Natural Environmental Science Center, The University of Tokyo, Tokyo, Japan
- Mitsunori Iwataki
- Asian Natural Environmental Science Center, The University of Tokyo, Tokyo, Japan
- Yuji Inagaki
- Graduate School of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Japan
- Center for Computational Sciences, University of Tsukuba, Tsukuba, Japan
5
Christinaki AC, Kanellopoulos SG, Kortsinoglou AM, Andrikopoulos MΑ, Theelen B, Boekhout T, Kouvelis VN. Mitogenomics and mitochondrial gene phylogeny decipher the evolution of Saccharomycotina yeasts. Genome Biol Evol 2022; 14:6586520. [PMID: 35576568 PMCID: PMC9154068 DOI: 10.1093/gbe/evac073] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Accepted: 05/12/2022] [Indexed: 11/17/2022] Open
Abstract
Saccharomycotina yeasts belong to diverse clades within the kingdom of fungi and are important to human everyday life. This work investigates the evolutionary relationships among these yeasts from a mitochondrial (mt) genomic perspective. A comparative study of 155 yeast mt genomes representing all major phylogenetic lineages of Saccharomycotina was performed, including genome size and content variability, intron and intergenic regions’ diversity, genetic code alterations, and syntenic variation. Findings from this study suggest that mt genome size diversity is the result of a ceaseless random process, mainly based on genetic recombination and intron mobility. Gene order analysis revealed conserved syntenic units and many occurring rearrangements, which can be correlated with major evolutionary events as shown by the phylogenetic analysis of the concatenated mt protein matrix. For the first time, molecular dating indicated a slower mt genome divergence rate in the early stages of yeast evolution, in contrast with a faster rate in the late evolutionary stages, compared to their nuclear time divergence. Genetic code reassignments of mt genomes are a perpetual process happening in many different parallel evolutionary steps throughout the evolution of Saccharomycotina. Overall, this work shows that phylogenetic studies based on the mt genome of yeasts highlight major evolutionary events.
Affiliation(s)
- Anastasia C Christinaki
- National and Kapodistrian University of Athens, Faculty of Biology, Department of Genetics and Biotechnology, Athens, Greece
- Spyros G Kanellopoulos
- National and Kapodistrian University of Athens, Faculty of Biology, Department of Genetics and Biotechnology, Athens, Greece
- Alexandra M Kortsinoglou
- National and Kapodistrian University of Athens, Faculty of Biology, Department of Genetics and Biotechnology, Athens, Greece
- Marios Α Andrikopoulos
- National and Kapodistrian University of Athens, Faculty of Biology, Department of Genetics and Biotechnology, Athens, Greece
- Bart Theelen
- Westerdijk Fungal Biodiversity Institute, Utrecht, The Netherlands
- Teun Boekhout
- Westerdijk Fungal Biodiversity Institute, Utrecht, The Netherlands; University of Amsterdam, Institute of Biodiversity and Ecosystem Dynamics (IBED), Amsterdam, The Netherlands
- Vassili N Kouvelis
- National and Kapodistrian University of Athens, Faculty of Biology, Department of Genetics and Biotechnology, Athens, Greece
6
Factors in Protobiomonomer Selection for the Origin of the Standard Genetic Code. Acta Biotheor 2021; 69:745-767. [PMID: 34283307 DOI: 10.1007/s10441-021-09420-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2020] [Accepted: 07/01/2021] [Indexed: 10/20/2022]
Abstract
Natural selection of specific protobiomonomers during abiogenic development of the prototype genetic code is hindered by the diversity of structural, spatial, and rotational isomers that have identical elemental composition and molecular mass (M), but can vary significantly in their physicochemical characteristics, such as the melting temperature Tm, the Tm:M ratio, and the solubility in water, due to different positions of atoms in the molecule. These parameters differ between cis- and trans-isomers of dicarboxylic acids, spatial monosaccharide isomers, and structural isomers of α-, β-, and γ-amino acids. The stable planar heterocyclic molecules of the major nucleobases comprise four (C, H, N, O) or three (C, H, N) elements and contain a single -C=C bond and two nitrogen atoms in each heterocycle involved in C-N and C=N bonds. They exist as isomeric resonance hybrids of single and double bonds and as a mixture of tautomer forms due to the presence of -C=O and/or -NH2 side groups. They are thermostable, insoluble in water, and exhibit solid-state stability, which is of central importance for DNA molecules as carriers of genetic information. In M-Tm diagrams, proteinogenic amino acids and the corresponding codons are distributed fairly regularly relative to the distinct clusters of purine and pyrimidine bases, reflecting the correspondence between codons and amino acids that was established in different periods of genetic code development. The body of data on the evolution of the genetic code system indicates that the elemental composition and molecular structure of protobiomonomers, and their M, Tm, photostability, and aqueous solubility determined their selection in the emergence of the standard genetic code.
7
Shulgina Y, Eddy SR. A computational screen for alternative genetic codes in over 250,000 genomes. eLife 2021; 10:71402. [PMID: 34751130 PMCID: PMC8629427 DOI: 10.7554/elife.71402] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 06/18/2021] [Accepted: 10/26/2021] [Indexed: 11/25/2022] Open
Abstract
The genetic code has been proposed to be a ‘frozen accident,’ but the discovery of alternative genetic codes over the past four decades has shown that it can evolve to some degree. Since most examples were found anecdotally, it is difficult to draw general conclusions about the evolutionary trajectories of codon reassignment and why some codons are affected more frequently. To fill in the diversity of genetic codes, we developed Codetta, a computational method to predict the amino acid decoding of each codon from nucleotide sequence data. We surveyed the genetic code usage of over 250,000 bacterial and archaeal genome sequences in GenBank and discovered five new reassignments of arginine codons (AGG, CGA, and CGG), representing the first sense codon changes in bacteria. In a clade of uncultivated Bacilli, the reassignment of AGG to become the dominant methionine codon likely evolved by a change in the amino acid charging of an arginine tRNA. The reassignments of CGA and/or CGG were found in genomes with low GC content, an evolutionary force that likely helped drive these codons to low frequency and enable their reassignment.
All life forms rely on a ‘code’ to translate their genetic information into proteins. This code relies on limited permutations of three nucleotides – the building blocks that form DNA and other types of genetic information. Each ‘triplet’ of nucleotides – or codon – encodes a specific amino acid, the basic component of proteins. Reading the sequence of codons in the right order will let the cell know which amino acid to assemble next on a growing protein. For instance, the codon CGG – formed of the nucleotides guanine (G) and cytosine (C) – codes for the amino acid arginine. From bacteria to humans, most life forms rely on the same genetic code. Yet certain organisms have evolved to use slightly different codes, where one or several codons have an altered meaning.
To better understand how alternative genetic codes have evolved, Shulgina and Eddy set out to find more organisms featuring these altered codons, creating a new software called Codetta that can analyze the genome of a microorganism and predict the genetic code it uses. Codetta was then used to sift through the genetic information of 250,000 microorganisms. This was made possible by the sequencing, in recent years, of the genomes of hundreds of thousands of bacteria and other microorganisms – including many never studied before. These analyses revealed five groups of bacteria with alternative genetic codes, all of which had changes in the codons that code for arginine. Amongst these, four had genomes with a low proportion of guanine and cytosine nucleotides. This may have made some guanine and cytosine-rich arginine codons very rare in these organisms and, therefore, easier to be reassigned to encode another amino acid. The work by Shulgina and Eddy demonstrates that Codetta is a new, useful tool that scientists can use to understand how genetic codes evolve. In addition, it can also help to ensure the accuracy of widely used protein databases, which assume which genetic code organisms use to predict protein sequences from their genomes.
Affiliation(s)
- Sean R Eddy
- Molecular & Cellular Biology, Harvard University, Cambridge, United States
8
Demongeot J, Seligmann H. Codon assignment evolvability in theoretical minimal RNA rings. Gene 2020; 769:145208. [PMID: 33031892 DOI: 10.1016/j.gene.2020.145208] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/13/2020] [Revised: 09/28/2020] [Accepted: 09/29/2020] [Indexed: 12/28/2022]
Abstract
Genetic code codon-amino acid assignments evolve for 15 of the 64 codons (23.4%) across the 31 genetic codes (the NCBI list completed with recently suggested green algal mitochondrial genetic codes): AAA, AGA, AGG, ATA, CGG, CTA, CTG, CTC, CTT, TAA, TAG, TCA, TCG, TGA and TTA; GNN codons are notably absent. Their usage in 25 theoretical minimal RNA rings is examined. RNA rings are designed in silico to code once over the shortest length for all 22 coding signals (start and stop codons and each amino acid according to the standard genetic code). Though designed along coding constraints, RNA rings resemble ancestral tRNA loops, assigning to each RNA ring a putative anticodon, a cognate amino acid and an evolutionary genetic code integration rank for that cognate amino acid. Analyses here show (1) biases against/for evolvable codons in the first two vs the last thirds of RNA ring coding sequences, (2) RNA rings with evolvable codons have recent cognates, and (3) evolvable codon and cytosine numbers in RNA ring compositions are positively correlated. Applying alternative genetic codes to RNA rings designed for nonredundant coding according to the standard genetic code reveals unsuspected properties of the standard genetic code and of RNA rings, notably on codon assignment evolvability and the special role of cytosine in relation to codon assignment evolvability and of the genetic code's coding structure.
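The codon count and percentage quoted in this abstract can be checked with a few lines of Python. The codon list is copied from the abstract; the variable names are illustrative.

```python
# Codons whose assignment varies across genetic codes, as listed in the abstract.
evolvable = ["AAA", "AGA", "AGG", "ATA", "CGG", "CTA", "CTG", "CTC", "CTT",
             "TAA", "TAG", "TCA", "TCG", "TGA", "TTA"]

count = len(set(evolvable))        # number of distinct evolvable codons
share = 100 * count / 64           # percentage of all 64 codons
has_gnn = any(c.startswith("G") for c in evolvable)  # any codon beginning with G?

print(count, f"{share:.1f}%", has_gnn)  # prints: 15 23.4% False
```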
Affiliation(s)
- Jacques Demongeot
- Université Grenoble Alpes, Faculty of Medicine, Laboratory AGEIS EA 7407, Team Tools for e-Gnosis Medical, F-38700 La Tronche, France
- Hervé Seligmann
- The National Natural History Collections, The Hebrew University of Jerusalem, 91404 Jerusalem, Israel
9
Błażej P, Wnetrzak M, Mackiewicz D, Mackiewicz P. The influence of different types of translational inaccuracies on the genetic code structure. BMC Bioinformatics 2019; 20:114. [PMID: 30841864 PMCID: PMC6404327 DOI: 10.1186/s12859-019-2661-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 10/02/2018] [Accepted: 01/29/2019] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND: The standard genetic code is a recipe for assigning unambiguously 21 labels, i.e. amino acids and stop translation signal, to 64 codons. However, at early stages of the translational machinery development, the codons did not have to be read unambiguously and the early genetic codes could have contained some ambiguous assignments of codons to amino acids. Therefore, the goal of this work was to obtain the genetic code structures which could have evolved assuming different types of inaccuracy of the translational machinery starting from unambiguous assignments of codons to amino acids. RESULTS: We developed a theoretical model assuming that the level of uncertainty of codon assignments can gradually decrease during the simulations. Since it is postulated that the standard code has evolved to be robust against point mutations and mistranslations, we developed three simulation scenarios assuming that such errors can influence one, two or three codon positions. The simulated codes were selected using the evolutionary algorithm methodology to decrease coding ambiguity and increase their robustness against mistranslation. CONCLUSIONS: The results indicate that the typical codon block structure of the genetic code could have evolved to decrease the ambiguity of amino acid to codon assignments and to increase the fidelity of reading the genetic information. However, the robustness to errors was not the decisive factor that influenced the genetic code evolution because it is possible to find theoretical codes that minimize the reading errors better than the standard genetic code.
Affiliation(s)
- Paweł Błażej
- Department of Genomics, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, 50-383 Poland
- Małgorzata Wnetrzak
- Department of Genomics, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, 50-383 Poland
- Dorota Mackiewicz
- Department of Genomics, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, 50-383 Poland
- Paweł Mackiewicz
- Department of Genomics, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, 50-383 Poland
10
Many alternative and theoretical genetic codes are more robust to amino acid replacements than the standard genetic code. J Theor Biol 2019; 464:21-32. [DOI: 10.1016/j.jtbi.2018.12.030] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 10/01/2018] [Revised: 12/17/2018] [Accepted: 12/19/2018] [Indexed: 02/07/2023]
11
Noutahi E, Calderon V, Blanchette M, Lang FB, El-Mabrouk N. CoreTracker: accurate codon reassignment prediction, applied to mitochondrial genomes. Bioinformatics 2018; 33:3331-3339. [PMID: 28655158 DOI: 10.1093/bioinformatics/btx421] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 03/31/2017] [Accepted: 06/23/2017] [Indexed: 11/13/2022] Open
Abstract
Motivation: Codon reassignments have been reported across all domains of life. With the increasing number of sequenced genomes, the development of systematic approaches for genetic code detection is essential for accurate downstream analyses. Three automated prediction tools exist so far: FACIL, GenDecoder and Bagheera; the last two are restricted to metazoan mitochondrial genomes and to CUG reassignments in yeast nuclear genomes, respectively. These tools can only analyze a single genome at a time and are often not followed by a validation procedure, resulting in a high rate of false positives. Results: We present CoreTracker, a new algorithm for the inference of sense-to-sense codon reassignments. CoreTracker identifies potential codon reassignments in a set of related genomes, then uses statistical evaluations and a random forest classifier to predict those that are the most likely to be correct. Predicted reassignments are then validated through a phylogeny-aware step that evaluates the impact of the new genetic code on the protein alignment. Handling a set of genomes simultaneously in a phylogenetic framework allows tracing back the evolution of each reassignment, which provides information on its underlying mechanism. Applied to metazoan and yeast genomes, CoreTracker significantly outperforms existing methods on both precision and sensitivity. Availability and implementation: CoreTracker is written in Python and available at https://github.com/UdeM-LBIT/CoreTracker. Contact: mabrouk@iro.umontreal.ca. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Emmanuel Noutahi
- Département d'Informatique et de Recherche Opérationnelle (DIRO), Université de Montréal, Montréal, QC CP 6128, Canada
- Virginie Calderon
- Département d'Informatique et de Recherche Opérationnelle (DIRO), Université de Montréal, Montréal, QC CP 6128, Canada
- Mathieu Blanchette
- School of Computer Science, McGill University, McConnell Engineering Bldg., Montréal, QC H3A 0E9, Canada
- Franz B Lang
- Département de Biochimie, Centre Robert Cedergren, Université de Montréal, Montréal, QC CP 6128, Canada
- Nadia El-Mabrouk
- Département d'Informatique et de Recherche Opérationnelle (DIRO), Université de Montréal, Montréal, QC CP 6128, Canada
12
Affiliation(s)
- Eugene V. Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
- Artem S. Novozhilov
- Department of Mathematics, North Dakota State University, Fargo, North Dakota 58108, USA
13
Nagao A, Ohara M, Miyauchi K, Yokobori SI, Yamagishi A, Watanabe K, Suzuki T. Hydroxylation of a conserved tRNA modification establishes non-universal genetic code in echinoderm mitochondria. Nat Struct Mol Biol 2017; 24:778-782. [PMID: 28783151 DOI: 10.1038/nsmb.3449] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 04/27/2017] [Accepted: 07/11/2017] [Indexed: 12/13/2022]
Abstract
The genetic code is not frozen but still evolving, which can result in the acquisition of 'dialectal' codons that deviate from the universal genetic code. RNA modifications in the anticodon region of tRNAs play a critical role in establishing such non-universal genetic codes. In echinoderm mitochondria, the AAA codon specifies asparagine instead of lysine. By analyzing mitochondrial (mt-) tRNALys isolated from the sea urchin (Mesocentrotus nudus), we discovered a novel modified nucleoside, hydroxy-N6-threonylcarbamoyladenosine (ht6A), 3' adjacent to the anticodon (position 37). Biochemical analysis revealed that ht6A37 has the ability to prevent mt-tRNALys from misreading AAA as lysine, thereby indicating that hydroxylation of N6-threonylcarbamoyladenosine (t6A) contributes to the establishment of the non-universal genetic code in echinoderm mitochondria.
Affiliation(s)
- Asuteka Nagao
- Department of Chemistry and Biotechnology, Graduate School of Engineering, University of Tokyo, Tokyo, Japan
- Mitsuhiro Ohara
- Department of Chemistry and Biotechnology, Graduate School of Engineering, University of Tokyo, Tokyo, Japan
- Kenjyo Miyauchi
- Department of Chemistry and Biotechnology, Graduate School of Engineering, University of Tokyo, Tokyo, Japan
- Shin-Ichi Yokobori
- Department of Applied Life Sciences, School of Life Sciences, Tokyo University of Pharmacy and Life Sciences, Tokyo, Japan
- Akihiko Yamagishi
- Department of Applied Life Sciences, School of Life Sciences, Tokyo University of Pharmacy and Life Sciences, Tokyo, Japan
- Kimitsuna Watanabe
- Department of Applied Life Sciences, School of Life Sciences, Tokyo University of Pharmacy and Life Sciences, Tokyo, Japan
- Tsutomu Suzuki
- Department of Chemistry and Biotechnology, Graduate School of Engineering, University of Tokyo, Tokyo, Japan
Collapse
|
14
|
Frozen Accident Pushing 50: Stereochemistry, Expansion, and Chance in the Evolution of the Genetic Code. Life (Basel) 2017; 7:life7020022. [PMID: 28545255 PMCID: PMC5492144 DOI: 10.3390/life7020022] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Received: 03/10/2017] [Revised: 05/19/2017] [Accepted: 05/20/2017] [Indexed: 12/31/2022] [Open access]
Abstract
Nearly 50 years ago, Francis Crick propounded the frozen accident scenario for the evolution of the genetic code along with the hypothesis that the early translation system consisted primarily of RNA. Under the frozen accident perspective, the code is universal among modern life forms because any change in codon assignment would be highly deleterious. The frozen accident can be considered the default theory of code evolution because it does not imply any specific interactions between amino acids and the cognate codons or anticodons, or any particular properties of the code. The subsequent 49 years of code studies have elucidated notable features of the standard code, such as high robustness to errors, but failed to develop a compelling explanation for codon assignments. In particular, stereochemical affinity between amino acids and the cognate codons or anticodons does not seem to account for the origin and evolution of the code. Here, I expand Crick’s hypothesis on an RNA-only translation system by presenting evidence that this early translation system had already attained high fidelity that allowed protein evolution. I outline an experimentally testable scenario for the evolution of the code that combines a distinct version of the stereochemical hypothesis, in which amino acids are recognized via unique sites in the tertiary structure of proto-tRNAs, rather than by anticodons, expansion of the code via proto-tRNA duplication, and the frozen accident.
15
Kollmar M, Mühlhausen S. Nuclear codon reassignments in the genomics era and mechanisms behind their evolution. Bioessays 2017; 39. [PMID: 28318058 DOI: 10.1002/bies.201600221] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Indexed: 12/13/2022]
Abstract
The canonical genetic code ubiquitously translates nucleotide into peptide sequence with several alterations known in viruses, bacteria, mitochondria, plastids, and single-celled eukaryotes. A new hypothesis to explain genetic code changes, termed tRNA loss driven codon reassignment, was proposed recently when the polyphyly of the yeast codon reassignment events was uncovered. According to this hypothesis, the driving forces for genetic code changes are tRNA or translation termination factor loss-of-function mutations or loss-of-gene events. The free codon can subsequently be captured by all tRNAs that have an appropriately mutated anticodon and are efficiently charged. Thus, codon capture most likely happens by near-cognate tRNAs and tRNAs whose anticodons are not part of the recognition sites of the respective aminoacyl-tRNA-synthetases. This hypothesis comprehensively explains the CTG codon translation as alanine in Pachysolen yeast together with the long known translation of the same codon as serine in Candida albicans and related species, and can also be applied to most other known reassignments.
Affiliation(s)
- Martin Kollmar
- Group Systems Biology of Motor Proteins, Department of NMR-Based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
- Stefanie Mühlhausen
- Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, UK
16
Wei Y, Wang J, Xia X. Coevolution between Stop Codon Usage and Release Factors in Bacterial Species. Mol Biol Evol 2016; 33:2357-67. [PMID: 27297468 PMCID: PMC4989110 DOI: 10.1093/molbev/msw107] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Indexed: 12/11/2022] [Open access]
Abstract
Three stop codons in bacteria represent different translation termination signals, and their usage is expected to depend on their differences in translation termination efficiency, mutation bias, and relative abundance of release factors (RF1 decoding UAA and UAG, and RF2 decoding UAA and UGA). In 14 bacterial species (covering Proteobacteria, Firmicutes, Cyanobacteria, Actinobacteria and Spirochetes) with cellular RF1 and RF2 quantified, UAA is consistently over-represented in highly expressed genes (HEGs) relative to lowly expressed genes (LEGs), whereas UGA usage is the opposite even in species where RF2 is far more abundant than RF1. UGA usage relative to UAG increases significantly with PRF2 [=RF2/(RF1 + RF2)] as expected from adaptation between stop codons and their decoders. PRF2 is > 0.5 over a wide range of AT content (measured by PAT3 as the proportion of AT at third codon sites), but decreases rapidly toward zero at the high range of PAT3. This explains why bacterial lineages with high PAT3 often have UGA reassigned because of low RF2. There is no indication that UAG is a minor stop codon in bacteria as claimed in a recent publication. The claim is invalid because of the failure to apply the two key criteria in identifying a minor codon: (1) it is least preferred by HEGs (or most preferred by LEGs) and (2) it corresponds to the least abundant decoder. Our results suggest a more plausible explanation for why UAA usage increases, and UGA usage decreases, with PAT3, but UAG usage remains low over the entire PAT3 range.
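The two quantities defined in this abstract, PRF2 = RF2/(RF1 + RF2) and PAT3 (the proportion of AT at third codon sites), can be written out as a short Python sketch. The function names `prf2` and `pat3` and the example inputs are assumptions made for illustration; the abundances are arbitrary numbers, not measurements from the study.

```python
def prf2(rf1, rf2):
    """Proportion of RF2 among the two release factors: RF2 / (RF1 + RF2)."""
    return rf2 / (rf1 + rf2)

def pat3(cds):
    """Proportion of A or T at third codon positions of an in-frame coding sequence."""
    thirds = cds.upper()[2::3]  # every third base, starting at codon position 3
    return sum(base in "AT" for base in thirds) / len(thirds)

# Arbitrary example values, chosen only to show the arithmetic.
print(prf2(rf1=1.0, rf2=3.0))  # 0.75
print(pat3("ATGAAATAA"))       # third positions are G, A, A
```

For real data the sequence would be an annotated coding sequence in frame, and the release factor abundances would come from quantitative measurements such as those the abstract refers to.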
Affiliation(s)
- Yulong Wei
- Department of Biology, University of Ottawa, Ottawa, ON, Canada
- Juan Wang
- Department of Biology, University of Ottawa, Ottawa, ON, Canada
- Xuhua Xia
- Department of Biology, University of Ottawa, Ottawa, ON, Canada; Ottawa Institute of Systems Biology, Ottawa, ON, Canada
17
Aggarwal N, Bandhu AV, Sengupta S. Finite population analysis of the effect of horizontal gene transfer on the origin of an universal and optimal genetic code. Phys Biol 2016; 13:036007. [DOI: 10.1088/1478-3975/13/3/036007] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 10/21/2022]
18
Brandão MM, Spoladore L, Faria LCB, Rocha ASL, Silva-Filho MC, Palazzo R. Ancient DNA sequence revealed by error-correcting codes. Sci Rep 2015; 5:12051. [PMID: 26159228 PMCID: PMC4498232 DOI: 10.1038/srep12051] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 09/14/2014] [Accepted: 06/16/2015] [Indexed: 11/09/2022] [Open access]
Abstract
A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code.
Affiliation(s)
- Marcelo M Brandão
- [1] Centro de Biologia Molecular e Engenharia Genética, Universidade Estadual de Campinas, Campinas, SP, Brazil; [2] Departamento de Genética, Escola Superior de Agricultura Luiz de Queiroz, Universidade de São Paulo, 13400-918, Piracicaba, SP, Brazil
- Larissa Spoladore
- Departamento de Genética, Escola Superior de Agricultura Luiz de Queiroz, Universidade de São Paulo, 13400-918, Piracicaba, SP, Brazil
- Luzinete C B Faria
- Departamento de Telemática, Faculdade de Engenharia Elétrica e de Computação, Universidade Estadual de Campinas, 13081-970, Campinas, SP, Brazil
- Andréa S L Rocha
- Departamento de Telemática, Faculdade de Engenharia Elétrica e de Computação, Universidade Estadual de Campinas, 13081-970, Campinas, SP, Brazil
- Marcio C Silva-Filho
- Departamento de Genética, Escola Superior de Agricultura Luiz de Queiroz, Universidade de São Paulo, 13400-918, Piracicaba, SP, Brazil
- Reginaldo Palazzo
- Departamento de Telemática, Faculdade de Engenharia Elétrica e de Computação, Universidade Estadual de Campinas, 13081-970, Campinas, SP, Brazil
19
Pathways of Genetic Code Evolution in Ancient and Modern Organisms. J Mol Evol 2015; 80:229-43. [DOI: 10.1007/s00239-015-9686-8] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.1] [Received: 04/02/2015] [Accepted: 06/03/2015] [Indexed: 10/23/2022]
20
Phylogeny of genetic codes and punctuation codes within genetic codes. Biosystems 2015; 129:36-43. [DOI: 10.1016/j.biosystems.2015.01.003] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 10/10/2014] [Revised: 01/02/2015] [Accepted: 01/14/2015] [Indexed: 11/23/2022]
21
Bandhu AV, Aggarwal N, Sengupta S. Revisiting the physico-chemical hypothesis of code origin: an analysis based on code-sequence coevolution in a finite population. Orig Life Evol Biosph 2013; 43:465-89. [PMID: 24500541 DOI: 10.1007/s11084-014-9353-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 11/23/2013] [Accepted: 01/13/2014] [Indexed: 01/23/2023]
Abstract
The origin of the genetic code marked a major transition from a plausible RNA world to the world of DNA and proteins and is an important milestone in our understanding of the origin of life. We examine the efficacy of the physico-chemical hypothesis of code origin by carrying out simulations of code-sequence coevolution in finite populations in stages, leading first to the emergence of ten amino acid code(s) and subsequently to 14 amino acid code(s). We explore two different scenarios of primordial code evolution. In one scenario, competition occurs between populations of equilibrated code-sequence sets, while in the other scenario, new codes compete with existing codes as they are gradually introduced into the population with a finite probability. In either case, we find that natural selection between competing codes distinguished by differences in the degree of physico-chemical optimization is unable to explain the structure of the standard genetic code. The code whose structure is most consistent with the standard genetic code is often not among the codes that have a high fixation probability. However, we find that the composition of the code population affects the code fixation probability. A physico-chemically optimized code gets fixed with a significantly higher probability if it competes against a set of randomly generated codes. Our results suggest that physico-chemical optimization may not be the sole driving force in ensuring the emergence of the standard genetic code.
Affiliation(s)
- Ashutosh Vishwa Bandhu
- School of Computational & Integrative Sciences, Jawaharlal Nehru University, New Delhi, 110067, India
22
Molecular epidemiology, phylogeny and evolution of Candida albicans. Infect Genet Evol 2013; 21:166-78. [PMID: 24269341 DOI: 10.1016/j.meegid.2013.11.008] [Citation(s) in RCA: 87] [Impact Index Per Article: 7.9] [Received: 09/19/2013] [Revised: 10/31/2013] [Accepted: 11/01/2013] [Indexed: 11/21/2022]
Abstract
A small number of Candida species form part of the normal microbial flora of mucosal surfaces in humans and may give rise to opportunistic infections when host defences are impaired. Candida albicans is by far the most prevalent commensal and pathogenic Candida species. Several different molecular typing approaches including multilocus sequence typing, multilocus microsatellite typing and DNA fingerprinting using C. albicans-specific repetitive sequence-containing DNA probes have yielded a wealth of information regarding the epidemiology and population structure of this species. Such studies revealed that the C. albicans population structure consists of multiple major and minor clades, some of which exhibit geographical or phenotypic enrichment and that C. albicans reproduction is predominantly clonal. Despite this, losses of heterozygosity by recombination, the existence of a parasexual cycle, toleration of a wide range of aneuploidies and the recent description of viable haploid strains have all demonstrated the extensive plasticity of the C. albicans genome. Recombination and gross chromosomal rearrangements are more common under stressful environmental conditions, and have played a significant role in the evolution of this opportunistic pathogen. Surprisingly, Candida dubliniensis, the closest relative of C. albicans exhibits more karyotype variability than C. albicans, but is significantly less adaptable to unfavourable environments. This disparity most likely reflects the evolutionary processes that occurred during or soon after the divergence of both species from their common ancestor. Whilst C. dubliniensis underwent significant gene loss and pseudogenisation, C. albicans expanded gene families considered to be important in virulence. 
It is likely that technological developments in whole genome sequencing and data analysis in coming years will facilitate its routine use for population structure, epidemiological investigations, and phylogenetic analyses of Candida species. These are likely to reveal more minor C. albicans clades and to enhance our understanding of the population biology of this versatile organism.
23
Chen X, Shen YY, Zhang YP. [Review of mtDNA in molecular evolution studies]. Dong Wu Xue Yan Jiu = Zoological Research 2013; 33:566-73. [PMID: 23266975 DOI: 10.3724/sp.j.1141.2012.06566] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/25/2022]
Abstract
Mitochondria are ancient organelles found in most eukaryotic cells. Due to its rapid mutation rate, mitochondrial DNA (mtDNA) has been widely used as a DNA marker in molecular studies and has long been suggested to undergo neutral evolution or purifying selection. Mitochondria produce 95% of the adenosine triphosphate (ATP) needed for locomotion, as well as heat for thermoregulation. Recent studies have found that mitochondria play critical roles in energy metabolism and have shown that functional constraints acting on mitochondria, due to energy metabolism and/or thermoregulation, influence the evolution of mtDNA. This review summarizes mitochondrial genome composition, evolution, and its applications in molecular evolution studies (reconstruction of species phylogenies, the relationship between biological energy metabolism and mtDNA evolution, and how mtDNA codon reassignment influences adaptation in different organisms).
Affiliation(s)
- Xing Chen
- Laboratory for Conservation and Utilization of Bio-resources, Yunnan University, Kunming, China
24
Morgens DW, Cavalcanti ARO. An alternative look at code evolution: using non-canonical codes to evaluate adaptive and historic models for the origin of the genetic code. J Mol Evol 2013; 76:71-80. [PMID: 23344715 DOI: 10.1007/s00239-013-9542-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 05/21/2012] [Accepted: 01/15/2013] [Indexed: 10/27/2022]
Abstract
The canonical code has been shown many times to be highly robust against point mutations; that is, mutations that change a single nucleotide tend to result in similar amino acids more often than expected by chance. There are two major types of models for the origin of the code, which explain how this sophisticated structure evolved. Adaptive models state that the primitive code was specifically selected for error minimization, while historic models hypothesize that the robustness of the code is an artifact or by-product of the mechanism of code evolution. In this paper, we evaluated the levels of robustness in existing non-canonical codes as well as codes that differ in only one codon assignment from the standard code. We found that the level of robustness of many of these codes is comparable to or better than that of the standard code. Although these results do not preclude an adaptive origin of the genetic code, they suggest that the code was not selected for minimizing the effects of point mutations.
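To make the idea of robustness against point mutations concrete, the following Python sketch computes a deliberately crude proxy: the fraction of single-nucleotide substitutions in sense codons that leave the encoded amino acid unchanged. This is an assumption made for illustration and is not the measure used in the paper, which scores the similarity of the resulting amino acids. The names `synonymous_fraction` and `one_change` are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), STANDARD_AA)}

def synonymous_fraction(code):
    """Fraction of single-base substitutions in sense codons that keep the amino acid."""
    same = total = 0
    for codon, aa in code.items():
        if aa == "*":
            continue  # skip stop codons as starting points
        for pos, base in product(range(3), BASES):
            if base == codon[pos]:
                continue  # not a substitution
            mutant = codon[:pos] + base + codon[pos + 1:]
            total += 1
            same += code[mutant] == aa
    return same / total

# A code differing from the standard code in a single codon assignment.
one_change = dict(STANDARD, AAA="N")
print(synonymous_fraction(STANDARD), synonymous_fraction(one_change))
```

Replacing the equality test with an amino acid similarity score would bring the sketch closer to the kind of robustness comparison the abstract describes.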
Affiliation(s)
- David W Morgens
- Department of Biology, Pomona College, 175 W 6th Street, Claremont, CA, USA
25
Abascal F, Posada D, Zardoya R. The evolution of the mitochondrial genetic code in arthropods revisited. Mitochondrial DNA 2012; 23:84-91. [PMID: 22397376 DOI: 10.3109/19401736.2011.653801] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 11/13/2022]
Abstract
A variant of the invertebrate mitochondrial genetic code was previously identified in arthropods (Abascal et al. 2006a, PLoS Biol 4:e127) in which, instead of translating the AGG codon as serine, as in other invertebrates, some arthropods translate AGG as lysine. Here, we revisit the evolution of the genetic code in arthropods taking into account that (1) the number of arthropod mitochondrial genomes sequenced has tripled since the original findings were published; (2) the phylogeny of arthropods has been recently resolved with confidence for many groups; and (3) sophisticated probabilistic methods can be applied to analyze the evolution of the genetic code in arthropod mitochondria. According to our analyses, evolutionary shifts in the genetic code have been more common than previously inferred, with many taxonomic groups displaying two alternative codes. Ancestral character-state reconstruction using probabilistic methods confirmed that the arthropod ancestor most likely translated AGG as lysine. Point mutations at tRNA-Lys and tRNA-Ser correlated with the meaning of the AGG codon. In addition, we identified three variables (GC content, number of AGG codons, and taxonomic information) that best explain the use of each of the two alternative genetic codes.
Affiliation(s)
- Federico Abascal
- Departamento de Biodiversidad y Biología Evolutiva, Museo Nacional de Ciencias Naturales CSIC, José Gutiérrez Abascal 2, 28006 Madrid, Spain.
26
Cocquyt E, Gile GH, Leliaert F, Verbruggen H, Keeling PJ, De Clerck O. Complex phylogenetic distribution of a non-canonical genetic code in green algae. BMC Evol Biol 2010; 10:327. [PMID: 20977766 PMCID: PMC2984419 DOI: 10.1186/1471-2148-10-327] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Received: 06/30/2010] [Accepted: 10/26/2010] [Indexed: 11/10/2022] [Open access]
Abstract
BACKGROUND A non-canonical nuclear genetic code, in which TAG and TAA have been reassigned from stop codons to glutamine, has evolved independently in several eukaryotic lineages, including the ulvophycean green algal orders Dasycladales and Cladophorales. To study the phylogenetic distribution of the standard and non-canonical genetic codes, we generated sequence data of a representative set of ulvophycean green algae and used a robust green algal phylogeny to evaluate different evolutionary scenarios that may account for the origin of the non-canonical code. RESULTS This study demonstrates that the Dasycladales and Cladophorales share this alternative genetic code with the related order Trentepohliales and the genus Blastophysa, but not with the Bryopsidales, which is sister to the Dasycladales. This complex phylogenetic distribution whereby all but one representative of a single natural lineage possesses an identical deviant genetic code is unique. CONCLUSIONS We compare different evolutionary scenarios for the complex phylogenetic distribution of this non-canonical genetic code. A single transition to the non-canonical code followed by a reversal to the canonical code in the Bryopsidales is highly improbable due to the profound genetic changes that coincide with codon reassignment. Multiple independent gains of the non-canonical code, as hypothesized for ciliates, are also unlikely because the same deviant code has evolved in all lineages. Instead we favor a stepwise acquisition model, congruent with the ambiguous intermediate model, whereby the non-canonical code observed in these green algal orders has a single origin. We suggest that the final steps from an ambiguous intermediate situation to a non-canonical code have been completed in the Trentepohliales, Dasycladales, Cladophorales and Blastophysa but not in the Bryopsidales. 
We hypothesize that in the latter lineage an initial stage characterized by translational ambiguity was not followed by final reassignment of both stop codons to glutamine. Instead the standard code was retained by the disappearance of the ambiguously decoding tRNAs from the genome. We correlate the emergence of a non-canonical genetic code in the Ulvophyceae to their multinucleate nature.
Affiliation(s)
- Ellen Cocquyt
- Phycology Research Group and Center for Molecular Phylogenetics and Evolution, Ghent University, Krijgslaan 281 S8, 9000 Ghent, Belgium.
27
Sammet SG, Bastolla U, Porto M. Comparison of translation loads for standard and alternative genetic codes. BMC Evol Biol 2010; 10:178. [PMID: 20546599 PMCID: PMC2909233 DOI: 10.1186/1471-2148-10-178] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 08/07/2009] [Accepted: 06/14/2010] [Indexed: 11/25/2022] [Open access]
Abstract
Background The (almost) universality of the genetic code is one of the most intriguing properties of cellular life. Nevertheless, several variants of the standard genetic code have been observed, which differ in one or several of 64 codon assignments and occur mainly in mitochondrial genomes and in nuclear genomes of some bacterial and eukaryotic parasites. These variants are usually considered to be the result of non-adaptive evolution. It has been shown that the standard genetic code is preferential to randomly assembled codes for its ability to reduce the effects of errors in protein translation. Results Using a genotype-to-phenotype mapping based on a quantitative model of protein folding, we compare the standard genetic code to seven of its naturally occurring variants with respect to the fitness loss associated to mistranslation and mutation. These fitness losses are computed through computer simulations of protein evolution with mutations that are either neutral or lethal, and different mutation biases, which influence the balance between unfolding and misfolding stability. We show that the alternative codes may produce significantly different mutation and translation loads, particularly for genomes evolving with a rather large mutation bias. Most of the alternative genetic codes are found to be disadvantageous to the standard code, in agreement with the view that the change of genetic code is a mutationally driven event. Nevertheless, one of the studied alternative genetic codes is predicted to be preferable to the standard code for a broad range of mutation biases. Conclusions Our results show that, with one exception, the standard genetic code is generally better able to reduce the translation load than the naturally occurring variants studied here. Besides this exception, some of the other alternative genetic codes are predicted to be better adapted for extreme mutation biases. 
Hence, the fixation of alternative genetic codes might be a neutral or nearly-neutral event in the majority of the cases, but adaptation cannot be excluded for some of the studied cases.
Affiliation(s)
- Stefanie Gabriele Sammet
- Institut für Festkörperphysik, Technische Universität Darmstadt, Hochschulstr, 8, 64289 Darmstadt, Germany
28
A rationale for the symmetries by base substitutions of degeneracy in the genetic code. Biosystems 2010; 99:1-5. [DOI: 10.1016/j.biosystems.2009.07.009] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 06/03/2009] [Revised: 07/15/2009] [Accepted: 07/28/2009] [Indexed: 11/18/2022]
29
Certain non-standard coding tables appear to be more robust to error than the standard genetic code. J Mol Evol 2009; 70:13-28. [PMID: 20012032 DOI: 10.1007/s00239-009-9303-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 06/17/2009] [Accepted: 11/10/2009] [Indexed: 10/20/2022]
Abstract
Since the identification of the Standard Coding Table as a "universal" method to translate genetic information into amino acids, exceptions to this rule have been reported, and to date there are nearly 20 alternative genetic coding tables deployed by either nuclear genomes or organelles of organisms. Why are these codes still in use and why are new codon reassignments occurring? This present study aims to provide a new method to address these questions and to analyze whether these alternative codes present any advantages or disadvantages to the organisms or organelles in terms of robustness to error. We show that two of the alternative coding tables, The Ciliate, Dasycladacean and Hexamita Nuclear Code (CDH) and The Flatworm Mitochondrial Code (FMC), exhibit an advantage, while others such as The Yeast Mitochondrial Code (YMC) are at a significant disadvantage. We propose that the Standard Code is likely to have emerged as a "local minimum" and that the "coding landscape" is still being searched for a "global" minimum.
30
Baranov PV, Venin M, Provan G. Codon size reduction as the origin of the triplet genetic code. PLoS One 2009; 4:e5708. [PMID: 19479032 PMCID: PMC2682656 DOI: 10.1371/journal.pone.0005708] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.8] [Received: 02/10/2009] [Accepted: 04/22/2009] [Indexed: 11/26/2022] [Open access]
Abstract
The genetic code appears to be optimized in its robustness to missense errors and frameshift errors. In addition, the genetic code is near-optimal in terms of its ability to carry information in addition to the sequences of encoded proteins. As evolution has no foresight, optimality of the modern genetic code suggests that it evolved from less optimal code variants. The length of codons in the genetic code is also optimal, as three is the minimal nucleotide combination that can encode the twenty standard amino acids. The apparent impossibility of transitions between codon sizes in a discontinuous manner during evolution has resulted in an unbending view that the genetic code was always triplet. Yet, recent experimental evidence on quadruplet decoding, as well as the discovery of organisms with ambiguous and dual decoding, suggest that the possibility of the evolution of triplet decoding from living systems with non-triplet decoding merits reconsideration and further exploration. To explore this possibility we designed a mathematical model of the evolution of primitive digital coding systems which can decode nucleotide sequences into protein sequences. These coding systems can evolve their nucleotide sequences via genetic events of Darwinian evolution, such as point-mutations. The replication rates of such coding systems depend on the accuracy of the generated protein sequences. Computer simulations based on our model show that decoding systems with codons of length greater than three spontaneously evolve into predominantly triplet decoding systems. Our findings suggest a plausible scenario for the evolution of the triplet genetic code in a continuous manner. This scenario suggests an explanation of how protein synthesis could be accomplished by means of long RNA-RNA interactions prior to the emergence of the complex decoding machinery, such as the ribosome, that is required for stabilization and discrimination of otherwise weak triplet codon-anticodon interactions.
Affiliation(s)
- Pavel V Baranov
- Biochemistry Department, University College Cork, Cork, Ireland.
31
Higgs PG. A four-column theory for the origin of the genetic code: tracing the evolutionary pathways that gave rise to an optimized code. Biol Direct 2009; 4:16. [PMID: 19393096 PMCID: PMC2689856 DOI: 10.1186/1745-6150-4-16] [Citation(s) in RCA: 104] [Impact Index Per Article: 6.9] [Received: 03/31/2009] [Accepted: 04/24/2009] [Indexed: 11/18/2022] [Open access]
Abstract
Background: The arrangement of the amino acids in the genetic code is such that neighbouring codons are assigned to amino acids with similar physical properties. Hence, the effects of translational error are minimized with respect to randomly reshuffled codes. Further inspection reveals that it is amino acids in the same column of the code (i.e. same second base) that are similar, whereas those in the same row show no particular similarity. We propose a 'four-column' theory for the origin of the code that explains how the action of selection during the build-up of the code leads to a final code that has the observed properties. Results: The theory makes the following propositions. (i) The earliest amino acids in the code were those that are easiest to synthesize non-biologically, namely Gly, Ala, Asp, Glu and Val. (ii) These amino acids are assigned to codons with G at first position. Therefore the first code may have used only these codons. (iii) The code rapidly developed into a four-column code where all codons in the same column coded for the same amino acid: NUN = Val, NCN = Ala, NAN = Asp and/or Glu, and NGN = Gly. (iv) Later amino acids were added sequentially to the code by a process of subdivision of codon blocks in which a subset of the codons assigned to an early amino acid were reassigned to a later amino acid. (v) Later amino acids were added into positions formerly occupied by amino acids with similar properties because this can occur with minimal disruption to the proteins already encoded by the earlier code. As a result, the properties of the amino acids in the final code retain a four-column pattern that is a relic of the earliest stages of code evolution. Conclusion: The driving force during this process is not the minimization of translational error, but positive selection for the increased diversity and functionality of the proteins that can be made with a larger amino acid alphabet. Nevertheless, the code that results is one in which translational error is minimized. We define a cost function with which we can compare the fitness of codes with varying numbers of amino acids, and a barrier function, which measures the change in cost immediately after addition of a new amino acid. We show that the barrier is positive if an amino acid is added into a column with dissimilar properties, but negative if an amino acid is added into a column with similar physical properties. Thus, natural selection favours the assignment of amino acids to the positions that they occupy in the final code. Reviewers: This article was reviewed by David Ardell, Eugene Koonin and Stephen Freeland (nominated by Laurence Hurst).
Affiliation(s)
- Paul G Higgs
- Department of Physics and Astronomy, McMaster University, Hamilton, Ontario L8S 4M1, Canada.
|
32
|
The cost of wobble translation in fungal mitochondrial genomes: integration of two traditional hypotheses. BMC Evol Biol 2008; 8:211. [PMID: 18638409 PMCID: PMC2488353 DOI: 10.1186/1471-2148-8-211] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Received: 02/27/2008] [Accepted: 07/19/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Fungal and animal mitochondrial genomes typically have one tRNA for each synonymous codon family. The codon-anticodon adaptation hypothesis predicts that the wobble nucleotide of a tRNA anticodon should evolve towards maximizing Watson-Crick base pairing with the most frequently used codon within each synonymous codon family, whereas the wobble versatility hypothesis argues that the nucleotide at the wobble site should be occupied by a nucleotide most versatile in wobble pairing, i.e., the tRNA wobble nucleotide should be G for NNY codon families, and U for NNR and NNN codon families (where Y stands for C or U, R for A or G and N for any nucleotide). RESULTS: We here integrate these two traditional hypotheses on tRNA anticodons into a unified model based on an analysis of the wobble costs associated with different wobble base pairs. This novel approach allows the relative cost of wobble pairing to be qualitatively evaluated. A comprehensive study of 36 fungal genomes suggests very different costs between two kinds of U:G wobble pairs, i.e., (1) between a G at the wobble site of a tRNA anticodon and a U at the third codon position (designated MU3:G) and (2) between a U at the wobble site of a tRNA anticodon and a G at the third codon position (designated MG3:U). CONCLUSION: In general, MU3:G is much smaller than MG3:U, suggesting no selection against U-ending codons in NNY codon families with a wobble G in the tRNA anticodon but strong selection against G-ending codons in NNR codon families with a wobble U at the tRNA anticodon. This finding resolves several puzzling observations in fungal genomics and corroborates previous studies showing that U3:G wobble is energetically more favorable than G3:U wobble.
|
33
|
Shackelton LA, Holmes EC. The role of alternative genetic codes in viral evolution and emergence. J Theor Biol 2008; 254:128-34. [PMID: 18589455 DOI: 10.1016/j.jtbi.2008.05.024] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.7] [Received: 01/31/2008] [Revised: 03/17/2008] [Accepted: 05/20/2008] [Indexed: 10/22/2022]
Abstract
Although the 'universal' genetic code is widespread among life-forms, a number of diverse lineages have evolved unique codon reassignments. The proteomes of these organisms and organelles must, by necessity, use the same codon assignments. Likewise, for an exogenous genetic element, such as an infecting viral genome, to be accurately and completely expressed with the host's translation system, it must employ the same genetic code. This raises a number of intriguing questions regarding the origin and evolution of viruses. In particular, it is extremely unlikely that viruses of hosts utilizing the universal genetic code would emerge, via cross-species transmission, in hosts utilizing alternative codes, and vice versa. Consequently, more parsimonious scenarios for the origins of such viruses include the prolonged co-evolution of viruses with cellular life, or the escape of genetic material from host genomes. Further, we raise the possibility that emerging viruses provide the selection pressure favoring the use of alternative codes in potential hosts, such that the evolution of a variant genetic code acts as a unique and powerful antiviral strategy. As such, in the face of new emerging viruses, hosts with codon reassignments would have a significant selective advantage compared to hosts utilizing the universal code.
Affiliation(s)
- Laura A Shackelton
- Center for Infectious Disease Dynamics, Department of Biology, Mueller Laboratory, The Pennsylvania State University, University Park, PA 16802, USA
|
34
|
A statistical analysis of the robustness of alternate genetic coding tables. Int J Mol Sci 2008; 9:679-697. [PMID: 19325778 PMCID: PMC2635705 DOI: 10.3390/ijms9050679] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/18/2007] [Revised: 02/25/2008] [Accepted: 04/11/2008] [Indexed: 11/24/2022] Open
Abstract
The rules that specify how the information contained in DNA is translated into amino acid “language” during protein synthesis are called “the genetic code”, commonly called the “Standard” or “Universal” Genetic Code Table. As a matter of fact, this coding table is not at all “universal”: in addition to different genetic code tables used by different organisms, even within the same organism the nuclear and mitochondrial genes may be subject to two different coding tables. Results: In an attempt to understand the advantages and disadvantages these coding tables may bring to an organism, we have decided to analyze various coding tables on genes subject to mutations, and have estimated how these genes “survive” over generations. We have used this as indicative of the “evolutionary” success of that particular coding table. We find that the “standard” genetic code is not actually the most robust of all coding tables, and interestingly, Flatworm Mitochondrial Code (FMC) appears to be the highest ranking coding table given our assumptions. Conclusions: It is commonly hypothesized that the more robust a genetic code, the better suited it is for maintenance of the genome. Our study shows that, given the assumptions in our model, Standard Genetic Code is quite poor when compared to other alternate code tables in terms of robustness. This brings about the question of why Standard Code has been so widely accepted by a wider variety of organisms instead of FMC, which needs to be addressed for a thorough understanding of genetic code evolution.
|
35
|
Carullo M, Xia X. An extensive study of mutation and selection on the wobble nucleotide in tRNA anticodons in fungal mitochondrial genomes. J Mol Evol 2008; 66:484-93. [PMID: 18401633 DOI: 10.1007/s00239-008-9102-8] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Received: 09/24/2007] [Revised: 03/05/2008] [Accepted: 03/19/2008] [Indexed: 10/22/2022]
Abstract
Two alternative hypotheses aim to predict the wobble nucleotide of tRNA anticodons in mitochondrion. The codon-anticodon adaptation hypothesis predicts that the wobble nucleotide of tRNA anticodon should evolve toward maximizing the Watson-Crick base pairing with the most frequently used codon within each synonymous codon family. In contrast, the wobble versatility hypothesis argues that the nucleotide at the wobble site should be occupied by a nucleotide most versatile in wobble pairing, i.e., the wobble site of the tRNA anticodon should be G for NNY codon families and U for NNR and NNN codon families (where Y stands for C or U, R for A or G, and N for any nucleotide). We examined codon usage and anticodon wobble sites in 36 fungal genomes to evaluate these two alternative hypotheses and identify exceptional cases that deserve new explanations. While the wobble versatility hypothesis is generally supported, there are interesting exceptions involving tRNA(Arg) translating the CGN codon family, tRNA(Trp) translating the UGR codon family, and tRNA(Met) translating the AUR codon family. Our results suggest that the potential to suppress stop codons, the historical inertia, and the conflict between translation initiation and elongation can all contribute to determining the wobble nucleotide of tRNA anticodons.
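As an illustration only, the rule stated by the wobble versatility hypothesis can be written as a small lookup. This Python sketch is not taken from the paper: the names `THIRD_POSITION_BASES`, `VERSATILE_WOBBLE`, `predicted_wobble` and `third_position_bases` are hypothetical, and the mapping encodes only the rule quoted in the abstract above (G for NNY families, U for NNR and NNN families).

```python
# IUPAC ambiguity codes for the third codon position (RNA alphabet).
THIRD_POSITION_BASES = {
    "Y": {"C", "U"},
    "R": {"A", "G"},
    "N": {"A", "C", "G", "U"},
}

# Wobble nucleotide predicted by the wobble versatility hypothesis,
# keyed by the ambiguity code at the family's third position.
VERSATILE_WOBBLE = {"Y": "G", "R": "U", "N": "U"}


def predicted_wobble(family: str) -> str:
    """Return the predicted anticodon wobble nucleotide for a codon family
    written as, e.g., 'NNY', 'CGN' or 'UGR' (hypothetical helper)."""
    if len(family) != 3:
        raise ValueError("a codon family has three positions")
    third = family[2].upper()
    if third not in VERSATILE_WOBBLE:
        raise ValueError(f"unsupported third-position code: {third!r}")
    return VERSATILE_WOBBLE[third]


def third_position_bases(family: str) -> set:
    """Expand the third-position ambiguity code into concrete bases."""
    return set(THIRD_POSITION_BASES[family[2].upper()])
```

Under this rule `predicted_wobble("CGN")` returns `"U"`. The abstract lists tRNA(Arg) translating the CGN family among the exceptions, so the function gives the prediction of the hypothesis, not the observed nucleotide.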
Affiliation(s)
- Malisa Carullo
- Department of Biology, University of Ottawa, Ottawa, Ontario, Canada
|
36
|
Sengupta S, Yang X, Higgs PG. The mechanisms of codon reassignments in mitochondrial genetic codes. J Mol Evol 2007; 64:662-88. [PMID: 17541678 PMCID: PMC1894752 DOI: 10.1007/s00239-006-0284-7] [Citation(s) in RCA: 86] [Impact Index Per Article: 5.1] [Received: 04/17/2006] [Accepted: 03/07/2007] [Indexed: 11/26/2022]
Abstract
Many cases of nonstandard genetic codes are known in mitochondrial genomes. We carry out analysis of phylogeny and codon usage of organisms for which the complete mitochondrial genome is available, and we determine the most likely mechanism for codon reassignment in each case. Reassignment events can be classified according to the gain-loss framework. The “gain” represents the appearance of a new tRNA for the reassigned codon or the change of an existing tRNA such that it gains the ability to pair with the codon. The “loss” represents the deletion of a tRNA or the change in a tRNA so that it no longer translates the codon. One possible mechanism is codon disappearance (CD), where the codon disappears from the genome prior to the gain and loss events. In the alternative mechanisms the codon does not disappear. In the unassigned codon mechanism, the loss occurs first, whereas in the ambiguous intermediate mechanism, the gain occurs first. Codon usage analysis gives clear evidence of cases where the codon disappeared at the point of the reassignment and also cases where it did not disappear. CD is the probable explanation for stop to sense reassignments and a small number of reassignments of sense codons. However, the majority of sense-to-sense reassignments cannot be explained by CD. In the latter cases, by analysis of the presence or absence of tRNAs in the genome and of the changes in tRNA sequences, it is sometimes possible to distinguish between the unassigned codon and the ambiguous intermediate mechanisms. We emphasize that not all reassignments follow the same scenario and that it is necessary to consider the details of each case carefully.
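The three mechanisms named in this abstract differ only in the order of events, which a short sketch can make concrete. The Python function below is a hypothetical illustration, not the authors' method: the event labels `"disappearance"`, `"gain"` and `"loss"` and the function name `classify_reassignment` are assumptions introduced here, and the logic follows only the definitions given in the abstract.

```python
def classify_reassignment(events):
    """Classify a codon reassignment by the order of its events.

    `events` is an ordered sequence of labels (hypothetical encoding):
    'gain' and 'loss' are required; 'disappearance' is optional.
    """
    order = list(events)
    if "gain" not in order or "loss" not in order:
        raise ValueError("both 'gain' and 'loss' events are required")
    gain = order.index("gain")
    loss = order.index("loss")
    # Codon disappearance (CD): the codon vanishes from the genome
    # before both the gain and the loss.
    if "disappearance" in order and order.index("disappearance") < min(gain, loss):
        return "codon disappearance"
    # Otherwise the codon stays in use and the order of gain and loss decides.
    if loss < gain:
        return "unassigned codon"
    return "ambiguous intermediate"
```

For instance, `classify_reassignment(["loss", "gain"])` returns `"unassigned codon"`. In real data the order has to be inferred from phylogeny, codon usage and tRNA content, which the abstract notes is only sometimes possible.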
Affiliation(s)
- Supratim Sengupta
- Department of Physics and Astronomy, McMaster University, Hamilton, Ontario L8S 4M1 Canada
- Department of Physics and Atmospheric Science, Dalhousie University, Halifax, Nova Scotia B3H 3J5 Canada
- Xiaoguang Yang
- Department of Physics and Astronomy, McMaster University, Hamilton, Ontario L8S 4M1 Canada
- Paul G. Higgs
- Department of Physics and Astronomy, McMaster University, Hamilton, Ontario L8S 4M1 Canada
|
37
|
Delarue M. An asymmetric underlying rule in the assignment of codons: possible clue to a quick early evolution of the genetic code via successive binary choices. RNA (New York, N.Y.) 2007; 13:161-9. [PMID: 17164478 PMCID: PMC1781368 DOI: 10.1261/rna.257607] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.2] [Received: 08/07/2006] [Accepted: 10/26/2006] [Indexed: 05/13/2023]
Abstract
Aminoacyl-tRNA synthetases (aaRSs) are responsible for creating the pool of correctly charged aminoacyl-tRNAs that are necessary for the translation of genetic information (mRNA) by the ribosome. Each aaRS belongs to either one of only two classes with two different mechanisms of aminoacylation, making use of either the 2'OH (Class I) or the 3'OH (Class II) of the terminal A76 of the tRNA and approaching the tRNA either from the minor groove (2'OH) or the major groove (3'OH). Here, an asymmetric pattern typical of differentiation is uncovered in the partition of the codon repertoire, as defined by the mechanism of aminoacylation of each corresponding tRNA. This pattern can be reproduced in a unique cascade of successive binary decisions that progressively reduces codon ambiguity. The deduced order of differentiation is manifestly driven by the reduction of translation errors. A simple rule can be defined, decoding each codon sequence in its binary class, thereby providing both the code and the key to decode it. Assuming that the partition into two mechanisms of tRNA aminoacylation is a relic that dates back to the invention of the genetic code in the RNA World, a model for the assignment of amino acids in the codon table can be derived. The model implies that the stop codon was always there, as the codon whose tRNA cannot be charged with any amino acid, and makes the prediction of an ultimate differentiation step, which is found to correspond to the codon assignment of the 22nd amino acid pyrrolysine in archaebacteria.
Affiliation(s)
- Marc Delarue
- Unité de Dynamique Structurale des Macromolécules, URA 2185 du CNRS, Institut Pasteur, Paris, France.
|
38
|
Urbina D, Tang B, Higgs PG. The response of amino acid frequencies to directional mutation pressure in mitochondrial genome sequences is related to the physical properties of the amino acids and to the structure of the genetic code. J Mol Evol 2006; 62:340-61. [PMID: 16477524 DOI: 10.1007/s00239-005-0051-1] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 03/07/2005] [Accepted: 10/01/2005] [Indexed: 11/29/2022]
Abstract
The frequencies of A, C, G, and T in mitochondrial DNA vary among species due to unequal rates of mutation between the bases. The frequencies of bases at fourfold degenerate sites respond directly to mutation pressure. At first and second positions, selection reduces the degree of frequency variation. Using a simple evolutionary model, we show that first position sites are less constrained by selection than second position sites and, therefore, that the frequencies of bases at first position are more responsive to mutation pressure than those at second position. We define a measure of distance between amino acids that is dependent on eight measured physical properties and a similarity measure that is the inverse of this distance. Columns 1, 2, 3, and 4 of the genetic code correspond to codons with U, C, A, and G in their second position, respectively. The similarity of amino acids in the four columns decreases systematically from column 1 to column 2 to column 3 to column 4. We then show that the responsiveness of first position bases to mutation pressure is dependent on the second position base and follows the same decreasing trend through the four columns. Again, this shows the correlation between physical properties and responsiveness. We determine a proximity measure for each amino acid, which is the average similarity between an amino acid and all others that are accessible via single point mutations in the mitochondrial genetic code structure. We also define a responsiveness for each amino acid, which measures how rapidly an amino acid frequency changes as a result of mutation pressure acting on the base frequencies. We show that there is a strong correlation between responsiveness and proximity, and that both these quantities are also correlated with the mutability of amino acids estimated from the mtREV substitution rate matrix. We also consider the variation of base frequencies between strands and between genes on a strand. These trends are consistent with the patterns expected from analysis of the variation among genomes.
Affiliation(s)
- Daniel Urbina
- Department of Physics and Astronomy, McMaster University, Hamilton, Ontario, Canada
|