1
Giegé R, Eriani G. The tRNA identity landscape for aminoacylation and beyond. Nucleic Acids Res 2023; 51:1528-1570. [PMID: 36744444 PMCID: PMC9976931 DOI: 10.1093/nar/gkad007] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 04/27/2022] [Revised: 12/21/2022] [Accepted: 01/03/2023] [Indexed: 02/07/2023] Open
Abstract
tRNAs are key partners in ribosome-dependent protein synthesis. This process is highly dependent on the fidelity of tRNA aminoacylation by aminoacyl-tRNA synthetases and relies primarily on sets of identities within tRNA molecules composed of determinants and antideterminants preventing mischarging by non-cognate synthetases. Such identity sets were discovered in the tRNAs of a few model organisms, and their properties were generalized as universal identity rules. Since then, the panel of identity elements governing the accuracy of tRNA aminoacylation has expanded considerably, but the increasing number of reported functional idiosyncrasies has led to some confusion. In parallel, the description of other processes involving tRNAs, often well beyond aminoacylation, has progressed considerably, greatly expanding their interactome and uncovering multiple novel identities on the same tRNA molecule. This review highlights key findings on the mechanistics and evolution of tRNA and tRNA-like identities. In addition, new methods and their results for searching sets of multiple identities on a single tRNA are discussed. Taken together, this knowledge shows that a comprehensive understanding of the functional role of individual and collective nucleotide identity sets in tRNA molecules is needed for medical, biotechnological and other applications.
Affiliation(s)
- Richard Giegé
  - Correspondence may also be addressed to Richard Giegé.
2
Lawrence TJ, Hadi-Nezhad F, Grosse I, Ardell DH. tSFM 1.0: tRNA Structure-Function Mapper. Bioinformatics 2021; 37:3654-3656. [PMID: 33904572 PMCID: PMC8545343 DOI: 10.1093/bioinformatics/btab247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2020] [Revised: 02/28/2021] [Accepted: 04/20/2021] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION Structure-conditioned information statistics have proven useful to predict and visualize tRNA Class-Informative Features (CIFs) and their evolutionary divergences. Although permutation p-values can quantify the significance of CIF divergences between two taxa, their naive Monte Carlo approximation is slow and inaccurate. The Peaks-over-Threshold approach of Knijnenburg et al. (2009) promises improvements to both speed and accuracy of permutation p-values, but has no publicly available API. AVAILABILITY AND IMPLEMENTATION We present tRNA Structure-Function Mapper (tSFM) v1.0, an open-source, multi-threaded application that efficiently computes, visualizes, and assesses significance of single- and paired-site CIFs and their evolutionary divergences for any RNA, protein, gene or genomic element sequence family. Multiple estimators of permutation p-values for CIF evolutionary divergences are provided along with confidence intervals. tSFM is implemented in Python 3 with compiled C extensions and is freely available through GitHub (https://github.com/tlawrence3/tSFM) and PyPI. SUPPLEMENTARY INFORMATION Supplementary materials are available at Bioinformatics online.
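As a rough illustration of the naive Monte Carlo permutation p-value that the abstract describes as slow and inaccurate, the sketch below shuffles pooled observations and counts how often the permuted statistic reaches the observed one. It is not tSFM's code: the function name, the difference-in-means statistic, and the toy data are all assumptions chosen for brevity, and the Peaks-over-Threshold refinement is not shown.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Naive Monte Carlo permutation p-value (hypothetical helper).

    The statistic is the absolute difference in group means, a stand-in
    for whatever divergence measure a real tool would use.
    """
    rng = random.Random(seed)
    n_a = len(group_a)

    def statistic(a, b):
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = statistic(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random relabelling of the pooled observations
        if statistic(pooled[:n_a], pooled[n_a:]) >= observed:
            exceed += 1
    # Add-one correction so the estimate is never exactly zero.
    return (exceed + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Toy, made-up values for two taxa.
    taxon_1 = [0.9, 1.1, 1.0, 1.2, 0.8]
    taxon_2 = [1.9, 2.1, 2.0, 2.2, 1.8]
    print(permutation_p_value(taxon_1, taxon_2))
```

The smallest p-value this estimator can report is 1 / (n_permutations + 1), which is why resolving very small p-values by brute force requires very many permutations.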
Affiliation(s)
- Travis J Lawrence
  - Quantitative and Systems Biology Program, University of California, Merced, United States of America; Biosciences Division, Oak Ridge National Lab, Oak Ridge, Tennessee, 37830, United States of America
- Fatemeh Hadi-Nezhad
  - Quantitative and Systems Biology Program, University of California, Merced, United States of America
- Ivo Grosse
  - Institute of Computer Science, Martin Luther University Halle-Wittenberg, Halle, Germany; German Center of Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
- David H Ardell
  - Quantitative and Systems Biology Program, University of California, Merced, United States of America; Department of Molecular and Cell Biology, University of California, Merced, California 95343, United States of America
3
Phillips JB, Ardell DH. Structural and Genetic Determinants of Convergence in the Drosophila tRNA Structure-Function Map. J Mol Evol 2021; 89:103-116. [PMID: 33528599 PMCID: PMC7884595 DOI: 10.1007/s00239-021-09995-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2020] [Accepted: 01/11/2021] [Indexed: 10/29/2022]
Abstract
The evolution of tRNA multigene families remains poorly understood, exhibiting unusual phenomena such as functional conversions of tRNA genes through anticodon shift substitutions. We improved FlyBase tRNA gene annotations from twelve Drosophila species, incorporating previously identified ortholog sets to compare substitution rates across tRNA bodies at single-site and base-pair resolution. All rapidly evolving sites fell within the same metal ion-binding pocket that lies at the interface of the two major stacked helical domains. We applied our tRNA Structure-Function Mapper (tSFM) method independently to each Drosophila species and one outgroup species Musca domestica and found that, although predicted tRNA structure-function maps are generally highly conserved in flies, one tRNA Class-Informative Feature (CIF) within the rapidly evolving ion-binding pocket, Cytosine 17 (C17), ancestrally informative for lysylation identity, independently gained asparaginylation identity and substituted in parallel across tRNA(Asn) paralogs at least once, possibly multiple times, during evolution of the genus. In D. melanogaster, most tRNA(Lys) and tRNA(Asn) genes are co-arrayed in one large heterologous gene cluster, suggesting that heterologous gene conversion as well as structural similarities of tRNA-binding interfaces in the closely related asparaginyl-tRNA synthetase (AsnRS) and lysyl-tRNA synthetase (LysRS) proteins may have played a role in these changes. A previously identified Asn-to-Lys anticodon shift substitution in D. ananassae may have arisen to compensate for the convergent and parallel gains of C17 in tRNA(Asn) paralogs in that lineage. Our results underscore the functional and evolutionary relevance of our tRNA structure-function map predictions and illuminate multiple genomic and structural factors contributing to rapid, parallel and compensatory evolution of tRNA multigene families.
Affiliation(s)
- Julie Baker Phillips
  - Quantitative and Systems Biology Program, University of California, Merced, CA, 95343, USA
  - Department of Biology, Cumberland University, 1 Cumberland Square, Lebanon, TN, 37087, USA
- David H Ardell
  - Quantitative and Systems Biology Program, University of California, Merced, CA, 95343, USA
  - Department of Molecular and Cell Biology, University of California, Merced, CA, 95343, USA
4
Kelly P, Hadi-Nezhad F, Liu DY, Lawrence TJ, Linington RG, Ibba M, Ardell DH. Targeting tRNA-synthetase interactions towards novel therapeutic discovery against eukaryotic pathogens. PLoS Negl Trop Dis 2020; 14:e0007983. [PMID: 32106219 PMCID: PMC7046186 DOI: 10.1371/journal.pntd.0007983] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 07/22/2019] [Accepted: 12/10/2019] [Indexed: 12/22/2022] Open
Abstract
The development of chemotherapies against eukaryotic pathogens is especially challenging because of both the evolutionary conservation of drug targets between host and parasite, and the evolution of strain-dependent drug resistance. There is a strong need for new nontoxic drugs with broad-spectrum activity against trypanosome parasites such as Leishmania and Trypanosoma. A relatively untested approach is to target macromolecular interactions in parasites rather than small molecular interactions, under the hypothesis that the features specifying macromolecular interactions diverge more rapidly through coevolution. We computed tRNA Class-Informative Features in humans and independently in eight distinct clades of trypanosomes, identifying parasite-specific informative features, including base pairs and base mis-pairs, that are broadly conserved over approximately 250 million years of trypanosome evolution. Validating these observations, we demonstrated biochemically that tRNA:aminoacyl-tRNA synthetase (aaRS) interactions are a promising target for anti-trypanosomal drug discovery. From a marine natural products extract library, we identified several fractions with inhibitory activity toward Leishmania major alanyl-tRNA synthetase (AlaRS) but no activity against the human homolog. These marine natural products extracts showed cross-reactivity towards Trypanosoma cruzi AlaRS indicating the broad-spectrum potential of our network predictions. We also identified Leishmania major threonyl-tRNA synthetase (ThrRS) inhibitors from the same library. We discuss why chemotherapies targeting multiple aaRSs should be less prone to the evolution of resistance than monotherapeutic or synergistic combination chemotherapies targeting only one aaRS.
Trypanosome parasites pose a significant health risk worldwide. Conventional drug development strategies have proven challenging given the high conservation between humans and pathogens, with off-target toxicity being a common problem. Protein synthesis inhibitors have historically been an attractive target for antimicrobial discovery against bacteria, and more recently for eukaryotic pathogens. Here we propose that exploiting pathogen-specific tRNA-synthetase interactions offers the potential for highly targeted drug discovery. To this end, we improved tRNA gene annotations in trypanosome genomes, identified functionally informative trypanosome-specific tRNA features, and showed that these features are highly conserved over approximately 250 million years of trypanosome evolution. Highlighting the species-specific and broad-spectrum potential of our approach, we identified natural product inhibitors against the parasite translational machinery that have no effect on the homologous human enzyme.
Affiliation(s)
- Paul Kelly
  - The Ohio State University Molecular, Cellular and Developmental Biology Program, The Ohio State University, Columbus, Ohio, United States of America
  - Center for RNA Biology, The Ohio State University, Ohio, United States of America
- Fatemeh Hadi-Nezhad
  - Quantitative and Systems Biology Program, University of California, Merced, California, United States of America
- Dennis Y. Liu
  - Department of Chemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Travis J. Lawrence
  - Quantitative and Systems Biology Program, University of California, Merced, California, United States of America
  - Biosciences Division, Oak Ridge National Lab, Oak Ridge, Tennessee, United States of America
- Roger G. Linington
  - Department of Chemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Michael Ibba
  - The Ohio State University Molecular, Cellular and Developmental Biology Program, The Ohio State University, Columbus, Ohio, United States of America
  - Center for RNA Biology, The Ohio State University, Ohio, United States of America
  - Department of Microbiology, The Ohio State University, Columbus, Ohio, United States of America
  - * E-mail: (MI); (DHA)
- David H. Ardell
  - Quantitative and Systems Biology Program, University of California, Merced, California, United States of America
  - Department of Molecular & Cell Biology, University of California, Merced, California, United States of America
  - * E-mail: (MI); (DHA)
5
Lawrence TJ, Amrine KCH, Swingley WD, Ardell DH. tRNA functional signatures classify plastids as late-branching cyanobacteria. BMC Evol Biol 2019; 19:224. [PMID: 31818253 PMCID: PMC6902448 DOI: 10.1186/s12862-019-1552-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 06/20/2019] [Accepted: 11/29/2019] [Indexed: 01/25/2023] Open
Abstract
BACKGROUND Eukaryotes acquired the trait of oxygenic photosynthesis through endosymbiosis of the cyanobacterial progenitor of plastid organelles. Despite recent advances in the phylogenomics of Cyanobacteria, the phylogenetic root of plastids remains controversial. Although a single origin of plastids by endosymbiosis is broadly supported, recent phylogenomic studies are contradictory on whether plastids branch early or late within Cyanobacteria. One underlying cause may be poor fit of evolutionary models to complex phylogenomic data. RESULTS Using Posterior Predictive Analysis, we show that recently applied evolutionary models poorly fit three phylogenomic datasets curated from cyanobacteria and plastid genomes because of heterogeneities in both substitution processes across sites and of compositions across lineages. To circumvent these sources of bias, we developed CYANO-MLP, a machine learning algorithm that consistently and accurately phylogenetically classifies ("phyloclassifies") cyanobacterial genomes to their clade of origin based on bioinformatically predicted function-informative features in tRNA gene complements. Classification of cyanobacterial genomes with CYANO-MLP is accurate and robust to deletion of clades, unbalanced sampling, and compositional heterogeneity in input tRNA data. CYANO-MLP consistently classifies plastid genomes into a late-branching cyanobacterial sub-clade containing single-cell, starch-producing, nitrogen-fixing ecotypes, consistent with metabolic and gene transfer data. CONCLUSIONS Phylogenomic data of cyanobacteria and plastids exhibit both site-process heterogeneities and compositional heterogeneities across lineages. These aspects of the data require careful modeling to avoid bias in phylogenomic estimation. Furthermore, we show that amino acid recoding strategies may be insufficient to mitigate bias from compositional heterogeneities. 
However, the combination of our novel tRNA-specific strategy with machine learning in CYANO-MLP appears robust to these sources of bias with high accuracy in phyloclassification of cyanobacterial genomes. CYANO-MLP consistently classifies plastids as late-branching Cyanobacteria, consistent with independent evidence from signature-based approaches and some previous phylogenetic studies.
Affiliation(s)
- Travis J Lawrence
  - Biosciences Division, Oak Ridge National Laboratory, P.O. Box 2008, Oak Ridge, TN, 37831 USA
  - Quantitative and Systems Biology Program, University of California, Merced, 5200 North Lake Rd., Merced, CA, 95343 USA
- Katherine CH Amrine
  - Quantitative and Systems Biology Program, University of California, Merced, 5200 North Lake Rd., Merced, CA, 95343 USA
  - Insight Data Science, 500 3rd St., San Francisco, CA, 94107 USA
- Wesley D Swingley
  - Department of Biological Sciences, Northern Illinois University, 1425 Lincoln Hwy., DeKalb, IL, 60115 USA
- David H Ardell
  - Quantitative and Systems Biology Program, University of California, Merced, 5200 North Lake Rd., Merced, CA, 95343 USA
  - Molecular and Cell Biology, School of Natural Sciences, University of California, Merced, 5200 North Lake Rd., Merced, CA, 95343 USA
6
Collins-Hed AI, Ardell DH. Match fitness landscapes for macromolecular interaction networks: Selection for translational accuracy and rate can displace tRNA-binding interfaces of non-cognate aminoacyl-tRNA synthetases. Theor Popul Biol 2019; 129:68-80. [PMID: 31042487 DOI: 10.1016/j.tpb.2019.03.007] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 05/01/2018] [Revised: 01/26/2019] [Accepted: 03/13/2019] [Indexed: 12/21/2022]
Abstract
Advances in structural biology of aminoacyl-tRNA synthetases (aaRSs) have revealed incredible diversity in how aaRSs bind their tRNA substrates. The causes of this diversity remain mysterious. We developed a new class of highly rugged fitness landscape models called match landscapes, through which genes encode the assortative interactions of their gene products through the complementarity and identifiability of their structural features. We used results from coding theory to prove bounds and equalities on fitness in match landscapes assuming additive interaction energies, macroscopic aminoacylation kinetics including proofreading, site-specific modifiers of interaction, and selection for translational accuracy in multiple, perfectly encoded site-types. Using genotypes based on extended Hamming codes we show that over a wide array of interface sizes and numbers of encoded cognate pairs, selection for translational accuracy alone is insufficient to displace the tRNA-binding interfaces of aaRSs. Yet, under combined selection for translational accuracy and rate, site-specific modifiers are selected to adaptively displace the tRNA-binding interfaces of non-cognate aaRS-tRNA pairs. We describe a remarkable correspondence between the lengths of perfect RNA (quaternary) codes and the modal sizes of small non-coding RNA families.
Affiliation(s)
- Andrea I Collins-Hed
  - Quantitative and Systems Biology Program, University of California, Merced, CA, 95306, United States
- David H Ardell
  - Quantitative and Systems Biology Program, University of California, Merced, CA, 95306, United States; Molecular and Cell Biology Department, School of Natural Sciences, University of California, Merced, CA, 95306, United States
7
Abstract
Inhibition of tRNA aminoacylation has proven to be an effective antimicrobial strategy, impeding an essential step of protein synthesis. Mupirocin, the well-known selective inhibitor of bacterial isoleucyl-tRNA synthetase, is one of three aminoacylation inhibitors now approved for human or animal use. However, design of novel aminoacylation inhibitors is complicated by the steadfast requirement to avoid off-target inhibition of protein synthesis in human cells. Here we review available data regarding known aminoacylation inhibitors as well as key amino-acid residues in aminoacyl-tRNA synthetases (aaRSs) and nucleotides in tRNA that determine the specificity and strength of the aaRS-tRNA interaction. Unlike most ligand-protein interactions, the aaRS-tRNA recognition interaction represents coevolution of both the tRNA and aaRS structures to conserve the specificity of aminoacylation. This property means that many determinants of tRNA recognition in pathogens have diverged from those of humans, a phenomenon that provides a valuable source of data for antimicrobial drug development.
Affiliation(s)
- Joanne M Ho
  - Department of BioSciences, Rice University, Houston, TX, United States
- Dieter Söll
  - Departments of Molecular Biophysics & Biochemistry, Yale University, New Haven, CT, United States; Department of Chemistry, Yale University, New Haven, CT, United States
8
Abstract
Aminoacyl-tRNA synthetases (aaRSs) are modular enzymes globally conserved in the three kingdoms of life. All catalyze the same two-step reaction, i.e., the attachment of a proteinogenic amino acid on their cognate tRNAs, thereby mediating the correct expression of the genetic code. In addition, some aaRSs acquired other functions beyond this key role in translation. Genomics and X-ray crystallography have revealed great structural diversity in aaRSs (e.g., in oligomery and modularity, in ranking into two distinct groups each subdivided in 3 subgroups, by additional domains appended on the catalytic modules). AaRSs show huge structural plasticity related to function and limited idiosyncrasies that are kingdom or even species specific (e.g., the presence in many Bacteria of non discriminating aaRSs compensating for the absence of one or two specific aaRSs, notably AsnRS and/or GlnRS). Diversity, as well, occurs in the mechanisms of aaRS gene regulation that are not conserved in evolution, notably between distant groups such as Gram-positive and Gram-negative Bacteria. The review focuses on bacterial aaRSs (and their paralogs) and covers their structure, function, regulation, and evolution. Structure/function relationships are emphasized, notably the enzymology of tRNA aminoacylation and the editing mechanisms for correction of activation and charging errors. The huge amount of genomic and structural data that accumulated in last two decades is reviewed, showing how the field moved from essentially reductionist biology towards more global and integrated approaches. Likewise, the alternative functions of aaRSs and those of aaRS paralogs (e.g., during cell wall biogenesis and other metabolic processes in or outside protein synthesis) are reviewed. 
Since aaRS phylogenies present promiscuous bacterial, archaeal, and eukaryal features, similarities and differences in the properties of aaRSs from the three kingdoms of life are pinpointed throughout the review and distinctive characteristics of bacterium-like synthetases from organelles are outlined.
Affiliation(s)
- Richard Giegé
  - Architecture et Réactivité de l'ARN, Université de Strasbourg, CNRS, IBMC, 67084 Strasbourg, France
- Mathias Springer
  - Université Paris Diderot, Sorbonne Cité, UPR9073 CNRS, IBPC, 75005 Paris, France
9
Roca AI. ProfileGrids: a sequence alignment visualization paradigm that avoids the limitations of Sequence Logos. BMC Proc 2014; 8:S6. [PMID: 25237393 PMCID: PMC4155610 DOI: 10.1186/1753-6561-8-s2-s6] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 11/25/2022] Open
Abstract
Background: The 2013 BioVis Contest provided an opportunity to evaluate different paradigms for visualizing protein multiple sequence alignments. Such data sets are becoming extremely large and thus taxing current visualization paradigms. Sequence Logos represent consensus sequences but have limitations for protein alignments. As an alternative, ProfileGrids are a new protein sequence alignment visualization paradigm that represents an alignment as a color-coded matrix of the residue frequency occurring at every homologous position in the aligned protein family. Results: The JProfileGrid software program was used to analyze the BioVis contest data sets to generate figures for comparison with the Sequence Logo reference images. Conclusions: The ProfileGrid representation allows for the clear and effective analysis of protein multiple sequence alignments. This includes both a general overview of the conservation and diversity sequence patterns as well as the interactive ability to query the details of the protein residue distributions in the alignment. The JProfileGrid software is free and available from http://www.ProfileGrid.org.
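The matrix of residue frequencies at each aligned position, which the abstract describes as the basis of a ProfileGrid, can be sketched in a few lines. This is only an illustration of the underlying table, not the JProfileGrid implementation: the function name and the three-sequence toy alignment are hypothetical, and the color coding is omitted.

```python
from collections import Counter


def residue_frequencies(alignment):
    """Return one Counter of residue counts per alignment column.

    `alignment` is a list of equal-length aligned sequences
    (hypothetical input format; gaps are counted like any other symbol).
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [Counter(column) for column in zip(*alignment)]


if __name__ == "__main__":
    toy_alignment = ["MKV-L", "MRVAL", "MKIAL"]  # made-up example
    for position, counts in enumerate(residue_frequencies(toy_alignment), start=1):
        total = sum(counts.values())
        row = ", ".join(
            f"{residue}:{count / total:.2f}"
            for residue, count in sorted(counts.items())
        )
        print(position, row)
```

Each row corresponds to one homologous position; a visualization layer would then map the per-residue fractions to colors.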
Affiliation(s)
- Alberto I Roca
  - ProfileGrid.org, P.O. Box 6414, Irvine, California 92616, USA
10
Amrine KCH, Swingley WD, Ardell DH. tRNA signatures reveal a polyphyletic origin of SAR11 strains among alphaproteobacteria. PLoS Comput Biol 2014; 10:e1003454. [PMID: 24586126 PMCID: PMC3937112 DOI: 10.1371/journal.pcbi.1003454] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 05/01/2013] [Accepted: 12/10/2013] [Indexed: 12/18/2022] Open
Abstract
Molecular phylogenetics and phylogenomics are subject to noise from horizontal gene transfer (HGT) and bias from convergence in macromolecular compositions. Extensive variation in size, structure and base composition of alphaproteobacterial genomes has complicated their phylogenomics, sparking controversy over the origins and closest relatives of the SAR11 strains. SAR11 are highly abundant, cosmopolitan aquatic Alphaproteobacteria with streamlined, A+T-biased genomes. A dominant view holds that SAR11 are monophyletic and related to both Rickettsiales and the ancestor of mitochondria. Other studies dispute this, finding evidence of a polyphyletic origin of SAR11 with most strains distantly related to Rickettsiales. Although careful evolutionary modeling can reduce bias and noise in phylogenomic inference, entirely different approaches may be useful to extract robust phylogenetic signals from genomes. Here we develop simple phyloclassifiers from bioinformatically derived tRNA Class-Informative Features (CIFs), features predicted to target tRNAs for specific interactions within the tRNA interaction network. Our tRNA CIF-based model robustly and accurately classifies alphaproteobacterial genomes into one of seven undisputed monophyletic orders or families, despite great variability in tRNA gene complement sizes and base compositions. Our model robustly rejects monophyly of SAR11, classifying all but one strain as Rhizobiales with strong statistical support. Yet remarkably, conventional phylogenetic analysis of tRNAs classifies all SAR11 strains identically as Rickettsiales. We attribute this discrepancy to convergence of SAR11 and Rickettsiales tRNA base compositions. Thus, tRNA CIFs appear more robust to compositional convergence than tRNA sequences generally. Our results suggest that tRNA-CIF-based phyloclassification is robust to HGT of components of the tRNA interaction network, such as aminoacyl-tRNA synthetases. We explain why tRNAs are especially advantageous for prediction of traits governing macromolecular interactions from genomic data, and why such traits may be advantageous in the search for robust signals to address difficult problems in classification and phylogeny.
If gene products work well in the networks of foreign cells, their genes may transfer horizontally between unrelated genomes. What factors dictate the ability to integrate into foreign networks? Different RNAs and proteins must interact specifically in order to function well as a system. For example, tRNA functions are determined by the interactions they have with other macromolecules. We have developed ways to predict, from genomic data alone, how tRNAs distinguish themselves to their specific interaction partners. Here, as proof of concept, we built a robust computational model from these bioinformatic predictions in seven lineages of Alphaproteobacteria. We validated our model by classifying hundreds of diverse alphaproteobacterial taxa and tested it on eight strains of SAR11, a phylogenetically controversial group that is highly abundant in the world's oceans. We found that different strains of SAR11 are more distantly related, both to each other and to mitochondria, than widely believed. We explain conflicting results about SAR11 as an artifact of bias created by the variability in base contents of alphaproteobacterial genomes. While this bias affects tRNAs too, our classifier appears unexpectedly robust to it. More broadly, our results suggest that traits governing macromolecular interactions may be more faithfully vertically inherited than the macromolecules themselves.
Affiliation(s)
- Katherine C. H. Amrine
  - Program in Quantitative and Systems Biology, University of California, Merced, Merced, California, United States of America
- Wesley D. Swingley
  - Program in Quantitative and Systems Biology, University of California, Merced, Merced, California, United States of America
- David H. Ardell
  - Program in Quantitative and Systems Biology, University of California, Merced, Merced, California, United States of America
  - * E-mail:
11
Krishnakumar R, Prat L, Aerni HR, Ling J, Merryman C, Glass JI, Rinehart J, Söll D. Transfer RNA misidentification scrambles sense codon recoding. Chembiochem 2013; 14:1967-72. [PMID: 24000185 DOI: 10.1002/cbic.201300444] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Received: 07/06/2013] [Indexed: 12/22/2022]
Abstract
Sense codon recoding is the basis for genetic code expansion with more than two different noncanonical amino acids. It requires an unused (or rarely used) codon, and an orthogonal tRNA synthetase:tRNA pair with the complementary anticodon. The Mycoplasma capricolum genome contains just six CGG arginine codons, without a dedicated tRNA(Arg). We wanted to reassign this codon to pyrrolysine by providing M. capricolum with pyrrolysyl-tRNA synthetase, a synthetic tRNA with a CCG anticodon (tRNA(Pyl)(CCG)), and the genes for pyrrolysine biosynthesis. Here we show that tRNA(Pyl)(CCG) is efficiently recognized by the endogenous arginyl-tRNA synthetase, presumably at the anticodon. Mass spectrometry revealed that in the presence of tRNA(Pyl)(CCG), CGG codons are translated as arginine. This result is not unexpected as most tRNA synthetases use the anticodon as a recognition element. The data suggest that tRNA misidentification by endogenous aminoacyl-tRNA synthetases needs to be overcome for sense codon recoding.
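The pairing between the CGG codon and the CCG anticodon mentioned in the abstract follows from antiparallel Watson-Crick base pairing: the anticodon, read 5' to 3', is the reverse complement of the codon. The helper below is a hypothetical illustration that considers strict Watson-Crick pairs only and ignores wobble pairing and base modifications.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon):
    """Return the strict Watson-Crick anticodon (5'->3') for an RNA codon (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(codon.upper()))


if __name__ == "__main__":
    print(anticodon_for("CGG"))  # prints CCG
```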
Affiliation(s)
- Radha Krishnakumar
  - Synthetic Biology and Bioenergy, J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD 20850 (USA)
12
Zhang Z, Yu J. Does the genetic code have a eukaryotic origin? Genomics Proteomics Bioinformatics 2013; 11:41-55. [PMID: 23402863 PMCID: PMC4357656 DOI: 10.1016/j.gpb.2013.01.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 11/15/2012] [Revised: 01/09/2013] [Accepted: 01/11/2013] [Indexed: 11/29/2022]
Abstract
In the RNA world, RNA is assumed to be the dominant macromolecule performing most, if not all, core “house-keeping” functions. The ribo-cell hypothesis suggests that the genetic code and the translation machinery may both be born of the RNA world, and the introduction of DNA to ribo-cells may take over the informational role of RNA gradually, such as a mature set of genetic code and mechanism enabling stable inheritance of sequence and its variation. In this context, we modeled the genetic code in two content variables (GC and purine contents) of protein-coding sequences and measured the purine content sensitivities for each codon when the sensitivity (% usage) is plotted as a function of GC content variation. The analysis leads to a new pattern, the symmetric pattern, where the sensitivity of purine content variation shows diagonal symmetry in the codon table more significantly in the two GC content invariable quarters in addition to the two existing patterns where the table is divided into either four GC content sensitivity quarters or two amino acid diversity halves. The most insensitive codon sets are GUN (valine) and CAN (CAR for asparagine and CAY for aspartic acid) and the most biased amino acid is valine (always over-estimated) followed by alanine (always under-estimated). The unique position of valine and its codons suggests its key roles in the final recruitment of the complete codon set of the canonical table. The distinct choice may only be attributable to sequence signatures or signals of splice sites for spliceosomal introns shared by all extant eukaryotes.
Affiliation(s)
- Zhang Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
|
13
|
Abstract
Aminoacyl-tRNA synthetases (aaRSs) are modular enzymes globally conserved in the three kingdoms of life. All catalyze the same two-step reaction, i.e., the attachment of a proteinogenic amino acid on their cognate tRNAs, thereby mediating the correct expression of the genetic code. In addition, some aaRSs acquired other functions beyond this key role in translation. Genomics and X-ray crystallography have revealed great structural diversity in aaRSs (e.g., in oligomery and modularity, in ranking into two distinct groups each subdivided in 3 subgroups, by additional domains appended on the catalytic modules). AaRSs show huge structural plasticity related to function and limited idiosyncrasies that are kingdom or even species specific (e.g., the presence in many Bacteria of non-discriminating aaRSs compensating for the absence of one or two specific aaRSs, notably AsnRS and/or GlnRS). Diversity, as well, occurs in the mechanisms of aaRS gene regulation that are not conserved in evolution, notably between distant groups such as Gram-positive and Gram-negative Bacteria. The review focuses on bacterial aaRSs (and their paralogs) and covers their structure, function, regulation, and evolution. Structure/function relationships are emphasized, notably the enzymology of tRNA aminoacylation and the editing mechanisms for correction of activation and charging errors. The huge amount of genomic and structural data that accumulated in the last two decades is reviewed, showing how the field moved from essentially reductionist biology towards more global and integrated approaches. Likewise, the alternative functions of aaRSs and those of aaRS paralogs (e.g., during cell wall biogenesis and other metabolic processes in or outside protein synthesis) are reviewed.
Since aaRS phylogenies present promiscuous bacterial, archaeal, and eukaryal features, similarities and differences in the properties of aaRSs from the three kingdoms of life are pinpointed throughout the review and distinctive characteristics of bacterium-like synthetases from organelles are outlined.
|
14
|
Szenes A, Pál G. Mapping hidden potential identity elements by computing the average discriminating power of individual tRNA positions. DNA Res 2012; 19:245-58. [PMID: 22378766 PMCID: PMC3372374 DOI: 10.1093/dnares/dss008] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022] Open
Abstract
The recently published discrete mathematical method, extended consensus partition (ECP), identifies nucleotide types at each position that are strictly absent from a given sequence set, while occurring in other sets. These are defined as discriminating elements (DEs). In this study using the ECP approach, we mapped potential hidden identity elements that discriminate the 20 different tRNA identities. We filtered the tDNA data set for the obligatory presence of well-established tRNA features, and then separately for each identity set, the presence of already experimentally identified strictly present identity elements. The analysis was performed on the three kingdoms of life. We determined the number of DE, i.e. the number of sets discriminated by the given position, for each tRNA position of each tRNA identity set. Then, from the positional DE numbers obtained from the 380 pairwise comparisons of the 20 identity sets, we calculated the average excluding value (AEV) for each tRNA position. The AEV provides a measure on the overall discriminating power of each position. Using a statistical analysis, we show that positional AEVs correlate with the number of already identified identity elements. Positions having high AEV but lacking published identity elements predict hitherto undiscovered tRNA identity elements.
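The idea of discriminating elements and a per-position average over ordered set pairs can be illustrated with a minimal Python sketch. This is a simplified reading of the abstract, not the published ECP algorithm: the function names, the toy sequences, and the rule "position p discriminates B from A when some nucleotide seen in B at p is strictly absent from A at p" are all assumptions made for illustration. With 20 identity sets, the ordered pairs number 20 × 19 = 380, matching the count given in the abstract.

```python
from itertools import permutations

def observed(seqs, pos):
    """Nucleotide types seen at one aligned position of a sequence set."""
    return {s[pos] for s in seqs}

def discriminates(set_a, set_b, pos):
    """Assumed rule: True if some nucleotide occurring in set_b at `pos`
    is strictly absent from set_a at `pos`."""
    return bool(observed(set_b, pos) - observed(set_a, pos))

def average_excluding_value(identity_sets):
    """Fraction of ordered set pairs discriminated at each aligned position
    (a hypothetical stand-in for the AEV described in the abstract)."""
    names = list(identity_sets)
    length = len(identity_sets[names[0]][0])
    pairs = list(permutations(names, 2))  # ordered pairs: n * (n - 1)
    aev = []
    for pos in range(length):
        hits = sum(
            discriminates(identity_sets[a], identity_sets[b], pos)
            for a, b in pairs
        )
        aev.append(hits / len(pairs))
    return aev

# Toy, made-up aligned fragments (not real tRNA data).
toy_sets = {
    "Ala": ["GGCA", "GGCA"],
    "Asp": ["GACA", "GACC"],
    "His": ["GUCA", "GUCA"],
}
aev = average_excluding_value(toy_sets)
```

In this toy input the second position separates every pair of sets, while the first and third separate none, so a ranking by this score would flag the second position as the strongest candidate.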
Affiliation(s)
- Aron Szenes
- Department of Biochemistry, Eötvös University, Budapest, Hungary
|
15
|
Alexander RW, Eargle J, Luthey-Schulten Z. Experimental and computational determination of tRNA dynamics. FEBS Lett 2009; 584:376-86. [PMID: 19932098 DOI: 10.1016/j.febslet.2009.11.061] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.2] [Received: 10/30/2009] [Revised: 11/14/2009] [Accepted: 11/16/2009] [Indexed: 10/20/2022]
Abstract
As the molecular representation of the genetic code, tRNA plays a central role in the translational machinery where it interacts with several proteins and other RNAs during the course of protein synthesis. These interactions exploit the dynamic flexibility of tRNA. In this minireview, we discuss the effects of modified bases, ions, and proteins on tRNA structure and dynamics and the challenges of observing its motions over the cycle of translation.
Affiliation(s)
- Rebecca W Alexander
- Department of Chemistry, Wake Forest University, Winston-Salem, NC 27109-7486, United States.
|
16
|
Computational analysis of tRNA identity. FEBS Lett 2009; 584:325-33. [PMID: 19944694 DOI: 10.1016/j.febslet.2009.11.084] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Received: 11/03/2009] [Revised: 11/20/2009] [Accepted: 11/20/2009] [Indexed: 11/22/2022]
Abstract
I review recent developments in computational analysis of tRNA identity. I suggest that the tRNA-protein interaction network is hierarchically organized, and coevolutionarily flexible. Its functional specificity of recognition and discrimination persists despite generic structural constraints and perturbative evolutionary forces. This flexibility comes from its arbitrary nature as a self-recognizing shape code. A revisualization of predicted Proteobacterial tRNA identity highlights open research problems. tRNA identity elements and their coevolution with proteins must be mapped structurally over the Tree of Life. These traits can also resolve deep roots in the Tree. I show that histidylation identity elements phylogenetically reposition Pelagibacter ubique within alpha-Proteobacteria.
|
17
|
Yang ZR. Predicting sulfotyrosine sites using the random forest algorithm with significantly improved prediction accuracy. BMC Bioinformatics 2009; 10:361. [PMID: 19874585 PMCID: PMC2777180 DOI: 10.1186/1471-2105-10-361] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 08/04/2009] [Accepted: 10/29/2009] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND Tyrosine sulfation is one of the most important posttranslational modifications. Due to its relevance to various disease developments, tyrosine sulfation has become the target for drug design. In order to facilitate efficient drug design, accurate prediction of sulfotyrosine sites is desirable. A predictor published seven years ago has been very successful with claimed prediction accuracy of 98%. However, it has a particularly low sensitivity when predicting sulfotyrosine sites in some newly sequenced proteins. RESULTS A new approach has been developed for predicting sulfotyrosine sites using the random forest algorithm after a careful evaluation of seven machine learning algorithms. Peptides are formed by consecutive residues symmetrically flanking tyrosine sites. They are then encoded using an amino acid hydrophobicity scale. This new approach has increased the sensitivity by 22%, the specificity by 3%, and the total prediction accuracy by 10% compared with the previous predictor using the same blind data. Meanwhile, both negative and positive predictive powers have been increased by 9%. In addition, the random forest model has an excellent feature for ranking the residues flanking tyrosine sites, hence providing more information for further investigating the tyrosine sulfation mechanism. A web tool has been implemented at http://ecsb.ex.ac.uk/sulfotyrosine for public use. CONCLUSION The random forest algorithm is able to deliver a better model compared with the Hidden Markov Model, the support vector machine, artificial neural networks, and others for predicting sulfotyrosine sites. The success shows that the random forest algorithm together with an amino acid hydrophobicity scale encoding can be a good candidate for peptide classification.
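The feature construction described above (peptides symmetrically flanking each tyrosine, encoded with a hydrophobicity scale) might be sketched as follows. The abstract does not say which scale or window length was used, so the Kyte-Doolittle values and the `flank` parameter here are assumptions, and the helper names and example sequence are hypothetical. The resulting numeric vectors are the kind of input a random forest classifier could be trained on; that training step is not shown.

```python
# Kyte-Doolittle hydropathy values (an assumed choice of scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def tyrosine_windows(protein, flank):
    """Return (index, peptide) for every tyrosine that has `flank`
    residues available on both sides."""
    windows = []
    for i, residue in enumerate(protein):
        if residue == "Y" and i - flank >= 0 and i + flank < len(protein):
            windows.append((i, protein[i - flank : i + flank + 1]))
    return windows

def encode(peptide):
    """Map each residue of a peptide to its hydropathy value."""
    return [KYTE_DOOLITTLE[residue] for residue in peptide]

# Made-up sequence for illustration only.
windows = tyrosine_windows("MDEYEDYGAKL", flank=2)
features = [encode(peptide) for _, peptide in windows]
```

Tyrosines too close to either terminus are skipped in this sketch; padding the window instead would be an equally plausible design choice.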
Affiliation(s)
- Zheng Rong Yang
- School of Biosciences, University of Exeter, Exeter EX4 5DE, UK.
|
18
|
Giegé R. Toward a more complete view of tRNA biology. Nat Struct Mol Biol 2008; 15:1007-14. [PMID: 18836497 DOI: 10.1038/nsmb.1498] [Citation(s) in RCA: 92] [Impact Index Per Article: 5.8] [Received: 03/20/2008] [Accepted: 09/09/2008] [Indexed: 12/11/2022]
Abstract
Transfer RNAs are ancient molecules present in all domains of life. In addition to translating the genetic code into protein and defining the second genetic code together with aminoacyl-tRNA synthetases, tRNAs act in many other cellular functions. Robust phenomenological observations on the role of tRNAs in translation, together with massive sequence and crystallographic data, have led to a deeper physicochemical understanding of tRNA architecture, dynamics and identity. In vitro studies complemented by cell biology data already indicate how tRNA behaves in cellular environments, in particular in higher Eukarya. From an opposite approach, reverse evolution considerations suggest how tRNAs emerged as simplified structures from the RNA world. This perspective discusses what basic questions remain unanswered, how these answers can be obtained and how a more rational understanding of the function and dysfunction of tRNA can have applications in medicine and biotechnology.
Affiliation(s)
- Richard Giegé
- Département Machineries Traductionnelles, Institut de Biologie Moléculaire et Cellulaire du Centre National de la Recherche Scientifique & Université Louis Pasteur, Strasbourg, France.
|
19
|
Bailly M, Giannouli S, Blaise M, Stathopoulos C, Kern D, Becker HD. A single tRNA base pair mediates bacterial tRNA-dependent biosynthesis of asparagine. Nucleic Acids Res 2006; 34:6083-94. [PMID: 17074748 PMCID: PMC1635274 DOI: 10.1093/nar/gkl622] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.3] [Indexed: 11/13/2022] Open
Abstract
In many prokaryotes and in organelles asparagine and glutamine are formed by a tRNA-dependent amidotransferase (AdT) that catalyzes amidation of aspartate and glutamate, respectively, mischarged on tRNAAsn and tRNAGln. These pathways supply the deficiency of the organism in asparaginyl- and glutaminyl-tRNA synthetases and provide the translational machinery with Asn-tRNAAsn and Gln-tRNAGln. So far, nothing is known about the structural elements that confer to tRNA the role of a specific cofactor in the formation of the cognate amino acid. We show herein, using aspartylated tRNAAsn and tRNAAsp variants, that amidation of Asp acylating tRNAAsn is promoted by the base pair U1-A72 whereas the G1-C72 pair and presence of the supernumerary nucleotide U20A in the D-loop of tRNAAsp prevent amidation. We predict, based on comparison of tRNAGln and tRNAGlu sequence alignments from bacteria using the AdT-dependent pathway to form Gln-tRNAGln, that the same combination of nucleotides also rules specific tRNA-dependent formation of Gln. In contrast, we show that the tRNA-dependent conversion of Asp into Asn by archaeal AdT is mainly mediated by nucleotides G46 and U47 of the variable region. In the light of these results we propose that bacterial and archaeal AdTs use kingdom-specific signals to catalyze the tRNA-dependent formations of Asn and Gln.
MESH Headings
- Adenine/chemistry
- Asparagine/biosynthesis
- Base Sequence
- Kinetics
- Neisseria meningitidis/enzymology
- Nitrogenous Group Transferases/chemistry
- Nitrogenous Group Transferases/metabolism
- RNA, Archaeal/chemistry
- RNA, Archaeal/metabolism
- RNA, Bacterial/chemistry
- RNA, Bacterial/metabolism
- RNA, Transfer/chemistry
- RNA, Transfer/metabolism
- RNA, Transfer, Asn/chemistry
- RNA, Transfer, Asn/metabolism
- RNA, Transfer, Asp/chemistry
- RNA, Transfer, Asp/metabolism
- RNA, Transfer, Gln/chemistry
- RNA, Transfer, Gln/metabolism
- RNA, Transfer, Glu/chemistry
- RNA, Transfer, Glu/metabolism
- Sequence Alignment
- Species Specificity
- Substrate Specificity
- Uridine/chemistry
Affiliation(s)
- Stamatina Giannouli
- Department of Biochemistry and Biotechnology, University of Thessaly, 26 Ploutonos street, 41221 Larissa, Greece
- Constantinos Stathopoulos
- Department of Biochemistry and Biotechnology, University of Thessaly, 26 Ploutonos street, 41221 Larissa, Greece
- To whom correspondence should be addressed. Tel: +33 3 88 41 70 92; Fax: +33 3 88 60 22 18;
- Daniel Kern
- To whom correspondence should be addressed. Tel: +33 3 88 41 70 92; Fax: +33 3 88 60 22 18;
|
20
|
Ardell DH, Andersson SGE. TFAM detects co-evolution of tRNA identity rules with lateral transfer of histidyl-tRNA synthetase. Nucleic Acids Res 2006; 34:893-904. [PMID: 16473847 PMCID: PMC1363771 DOI: 10.1093/nar/gkj449] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.2] [Indexed: 12/02/2022] Open
Abstract
We present TFAM, an automated, statistical method to classify the identity of tRNAs. TFAM, currently optimized for bacteria, classifies initiator tRNAs and predicts the charging identity of both typical and atypical tRNAs such as suppressors with high confidence. We show statistical evidence for extensive variation in tRNA identity determinants among bacterial genomes due to variation in overall tDNA base content. With TFAM we have detected the first case of eukaryotic-like tRNA identity rules in bacteria. An α-proteobacterial clade encompassing Rhizobiales, Caulobacter crescentus and Silicibacter pomeroyi, unlike a sister clade containing the Rickettsiales, Zymomonas mobilis and Gluconobacter oxydans, uses the eukaryotic identity element A73 instead of the highly conserved prokaryotic element C73. We confirm divergence of bacterial histidylation rules by demonstrating perfect covariation of α-proteobacterial tRNAHis acceptor stems and residues in the motif IIb tRNA-binding pocket of their histidyl-tRNA synthetases (HisRS). Phylogenomic analysis supports lateral transfer of a eukaryotic-like HisRS into the α-proteobacteria followed by in situ adaptation of the bacterial tDNAHis and identity rule divergence. Our results demonstrate that TFAM is an effective tool for the bioinformatics, comparative genomics and evolutionary study of tRNA identity.
MESH Headings
- Alphaproteobacteria/classification
- Alphaproteobacteria/enzymology
- Alphaproteobacteria/genetics
- DNA, Bacterial/classification
- Databases, Nucleic Acid
- Evolution, Molecular
- Gene Transfer, Horizontal
- Genetic Variation
- Genome, Bacterial
- Genomics
- Histidine-tRNA Ligase/classification
- Histidine-tRNA Ligase/genetics
- Models, Statistical
- Phylogeny
- RNA, Transfer/classification
- RNA, Transfer/genetics
- RNA, Transfer, His/chemistry
- RNA, Transfer, His/classification
- RNA, Transfer, His/genetics
- RNA, Transfer, Met/classification
Affiliation(s)
- David H Ardell
- Department of Molecular Evolution, Evolutionary Biology Center Norbyvägen 18C Uppsala University SE-752 36 Uppsala Sweden.
|