1
Mandáková T, Krumpolcová A, Matyášek R, Volkov R, Lysak MA, Kovařík A. Uniparental silencing of 5S rRNA genes in plant allopolyploids - insights from Cardamine (Brassicaceae). THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2024. [PMID: 38838061 DOI: 10.1111/tpj.16850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2024] [Revised: 04/30/2024] [Accepted: 05/14/2024] [Indexed: 06/07/2024]
Abstract
While the phenomenon of uniparental silencing of 35S rDNA in interspecific hybrids and allopolyploids is well documented, there is a notable absence of information regarding whether such silencing extends to the 5S RNA component of ribosomes. To address this gap in knowledge, we analyzed the 5S and 35S rDNA expression in Cardamine (Brassicaceae) allopolyploids, namely C. × insueta (2n = 3x = 24, genome composition RRA), C. flexuosa (2n = 4x = 32, AAHH), and C. scutata (2n = 4x = 32, PPAA) which share a common diploid ancestor (AA). We employed high-throughput sequencing of transcriptomes and genomes and phylogenetic analyses of 5S rRNA variants. The genomic organization of rDNA was further scrutinized through clustering and fluorescence in situ hybridization. In the C. × insueta allotriploid, we observed uniparental dominant expression of 5S and 35S rDNA loci. In the C. flexuosa and C. scutata allotetraploids, the expression pattern differed, with the 35S rDNA being expressed from the A subgenome, whereas the 5S rDNA was expressed from the partner subgenome. Both C. flexuosa and C. scutata but not C. × insueta showed copy and locus number changes. We conclude that in stabilized allopolyploids, transcription of ribosomal RNA components occurs from different subgenomes. This phenomenon appears to result in the formation of chimeric ribosomes comprising rRNA molecules derived from distinct parental origins. We speculate that the interplay of epigenetic silencing and rDNA rearrangements introduces an additional layer of variation in multimolecule ribosomal complexes, potentially contributing to the evolutionary success of allopolyploids.
Affiliation(s)
- Terezie Mandáková
  - Central European Institute of Technology (CEITEC), Masaryk University, 625 00, Brno, Czech Republic
  - Department of Experimental Biology, Faculty of Science, Masaryk University, 611 37, Brno, Czech Republic
- Alice Krumpolcová
  - Department of Experimental Biology, Faculty of Science, Masaryk University, 611 37, Brno, Czech Republic
  - Department of Molecular Epigenetics, Institute of Biophysics, Academy of Sciences of the Czech Republic, 612 00, Brno, Czech Republic
- Roman Matyášek
  - Department of Molecular Epigenetics, Institute of Biophysics, Academy of Sciences of the Czech Republic, 612 00, Brno, Czech Republic
- Roman Volkov
  - Department of Molecular Genetics and Biotechnology, Yuriy Fedkovych Chernivtsi National University, 58012, Chernivtsi, Ukraine
- Martin A Lysak
  - Central European Institute of Technology (CEITEC), Masaryk University, 625 00, Brno, Czech Republic
  - Faculty of Science, National Centre for Biomolecular Research, Masaryk University, 625 00, Brno, Czech Republic
- Ales Kovařík
  - Department of Molecular Epigenetics, Institute of Biophysics, Academy of Sciences of the Czech Republic, 612 00, Brno, Czech Republic
2
Ariza-Mateos A, Briones C, Perales C, Sobrino F, Domingo E, Gómez J. Archaeological approaches to RNA virus evolution. J Physiol 2024; 602:2469-2478. [PMID: 37818797 DOI: 10.1113/jp284416] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Accepted: 09/25/2023] [Indexed: 10/13/2023]
Abstract
Studies with RNA enzymes (ribozymes) and protein enzymes have identified certain structural elements that are present in some cellular mRNAs and viral RNAs. These elements do not share a primary structure and, thus, are not phylogenetically related. However, they have common (secondary/tertiary) structural folds that, according to some lines of evidence, may have an ancient and common origin. The term 'mRNA archaeology' has been coined to refer to the search for such structural/functional relics that may be informative of early evolutionary developments in the cellular and viral worlds and have lasted to the present day. Such identified RNA elements may have developed as biological signals with structural and functional relevance (as if they were buried objects with archaeological value), and coexist with the standard linear information of nucleic acid molecules that is translated into proteins. However, there is a key difference between the methods that extract information from either the primary structure of mRNA or the signals provided by secondary and tertiary structures. The former (sequence comparison and phylogenetic analysis) requires strict continuity of the material vehicle of information during evolution, whereas the archaeological method does not require such continuity. The tools of RNA archaeology (including the use of ribozymes and enzymes to investigate the reactivity of the RNA elements) establish links between the concepts of communication and language theories that have not been incorporated into knowledge of virology, as well as experimental studies on the search for functionally relevant RNA structures.
Affiliation(s)
- Ascensión Ariza-Mateos
  - Laboratory of RNA Archaeology, Instituto de Parasitología y Biomedicina 'López-Neyra' (CSIC), Granada, Spain
  - Centro de Biología Molecular 'Severo Ochoa' (CSIC-UAM), Campus de Cantoblanco, Madrid, Spain
- Carlos Briones
  - Department of Molecular Evolution, Centro de Astrobiología (CSIC-INTA), Madrid, Spain
- Celia Perales
  - Centro de Biología Molecular 'Severo Ochoa' (CSIC-UAM), Campus de Cantoblanco, Madrid, Spain
  - Department of Clinical Microbiology, IIS-Fundación Jiménez Díaz, UAM, Madrid, Spain
- Francisco Sobrino
  - Centro de Biología Molecular 'Severo Ochoa' (CSIC-UAM), Campus de Cantoblanco, Madrid, Spain
- Esteban Domingo
  - Centro de Biología Molecular 'Severo Ochoa' (CSIC-UAM), Campus de Cantoblanco, Madrid, Spain
- Jordi Gómez
  - Laboratory of RNA Archaeology, Instituto de Parasitología y Biomedicina 'López-Neyra' (CSIC), Granada, Spain
3
Key J, Gispert S, Koepf G, Steinhoff-Wagner J, Reichlmeir M, Auburger G. Translation Fidelity and Respiration Deficits in CLPP-Deficient Tissues: Mechanistic Insights from Mitochondrial Complexome Profiling. Int J Mol Sci 2023; 24:17503. [PMID: 38139332 PMCID: PMC10743472 DOI: 10.3390/ijms242417503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Revised: 12/07/2023] [Accepted: 12/08/2023] [Indexed: 12/24/2023]
Abstract
The mitochondrial matrix peptidase CLPP is crucial during cell stress. Its loss causes Perrault syndrome type 3 (PRLTS3) with infertility, neurodegeneration, and a growth deficit. Its target proteins are disaggregated by CLPX, which also regulates heme biosynthesis via unfolding ALAS enzymes, providing access for pyridoxal-5'-phosphate (PLP). Despite efforts in diverse organisms with multiple techniques, CLPXP substrates remain controversial. Here, avoiding recombinant overexpression, we employed complexomics in mitochondria from three mouse tissues to identify endogenous targets. A CLPP absence caused the accumulation and dispersion of CLPX-VWA8 as AAA+ unfoldases, and of PLPBP. Similar changes and CLPX-VWA8 co-migration were evident for mitoribosomal central protuberance clusters, translation factors like GFM1-HARS2, the RNA granule components LRPPRC-SLIRP, and enzymes OAT-ALDH18A1. Mitochondrially translated proteins in testes showed reductions to <30% for MTCO1-3, the mis-assembly of the complex IV supercomplex, and accumulated metal-binding assembly factors COX15-SFXN4. Indeed, heavy metal levels were increased for iron, molybdenum, cobalt, and manganese. RT-qPCR showed compensatory downregulation only for Clpx mRNA; most accumulated proteins appeared transcriptionally upregulated. Immunoblots validated VWA8, MRPL38, MRPL18, GFM1, and OAT accumulation. Co-immunoprecipitation confirmed CLPX binding to MRPL38, GFM1, and OAT, so excess CLPX and PLP may affect their activity. Our data mechanistically elucidate the mitochondrial translation fidelity deficits which underlie progressive hearing impairment in PRLTS3.
Affiliation(s)
- Jana Key
  - Goethe University Frankfurt, University Hospital, Clinic of Neurology, Exp. Neurology, Heinrich Hoffmann Str. 7, 60590 Frankfurt am Main, Germany
- Suzana Gispert
  - Goethe University Frankfurt, University Hospital, Clinic of Neurology, Exp. Neurology, Heinrich Hoffmann Str. 7, 60590 Frankfurt am Main, Germany
- Gabriele Koepf
  - Goethe University Frankfurt, University Hospital, Clinic of Neurology, Exp. Neurology, Heinrich Hoffmann Str. 7, 60590 Frankfurt am Main, Germany
- Julia Steinhoff-Wagner
  - TUM School of Life Sciences, Animal Nutrition and Metabolism, Technical University of Munich, Liesel-Beckmann-Str. 2, 85354 Freising-Weihenstephan, Germany
- Marina Reichlmeir
  - Goethe University Frankfurt, University Hospital, Clinic of Neurology, Exp. Neurology, Heinrich Hoffmann Str. 7, 60590 Frankfurt am Main, Germany
- Georg Auburger
  - Goethe University Frankfurt, University Hospital, Clinic of Neurology, Exp. Neurology, Heinrich Hoffmann Str. 7, 60590 Frankfurt am Main, Germany
4
Zhou S, Van Bortle K. The Pol III transcriptome: Basic features, recurrent patterns, and emerging roles in cancer. WILEY INTERDISCIPLINARY REVIEWS. RNA 2023; 14:e1782. [PMID: 36754845 PMCID: PMC10498592 DOI: 10.1002/wrna.1782] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 09/06/2022] [Revised: 01/13/2023] [Accepted: 01/18/2023] [Indexed: 02/10/2023]
Abstract
The RNA polymerase III (Pol III) transcriptome is universally comprised of short, highly structured noncoding RNA (ncRNA). Through RNA-protein interactions, the Pol III transcriptome actuates functional activities ranging from nuclear gene regulation (7SK), splicing (U6, U6atac), and RNA maturation and stability (RMRP, RPPH1, Y RNA), to cytoplasmic protein targeting (7SL) and translation (tRNA, 5S rRNA). In higher eukaryotes, the Pol III transcriptome has expanded to include additional, recently evolved ncRNA species that effectively broaden the footprint of Pol III transcription to additional cellular activities. Newly evolved ncRNAs function as riboregulators of autophagy (vault), immune signaling cascades (nc886), and translation (Alu, BC200, snaR). Notably, upregulation of Pol III transcription is frequently observed in cancer, and multiple ncRNA species are linked to both cancer progression and poor survival outcomes among cancer patients. In this review, we outline the basic features and functions of the Pol III transcriptome, and the evidence for dysregulation and dysfunction for each ncRNA in cancer. When taken together, recurrent patterns emerge, ranging from shared functional motifs that include molecular scaffolding and protein sequestration, overlapping protein interactions, and immunostimulatory activities, to the biogenesis of analogous small RNA fragments and noncanonical miRNAs, augmenting the function of the Pol III transcriptome and further broadening its role in cancer. This article is categorized under: RNA in Disease and Development > RNA in Disease; RNA Processing > Processing of Small RNAs; RNA Interactions with Proteins and Other Molecules > Protein-RNA Interactions: Functional Implications.
Affiliation(s)
- Sihang Zhou
  - Department of Cell and Developmental Biology, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
- Kevin Van Bortle
  - Department of Cell and Developmental Biology, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
  - Cancer Center at Illinois, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
5
Tsurumaki M, Saito M, Tomita M, Kanai A. Features of smaller ribosomes in candidate phyla radiation (CPR) bacteria revealed with a molecular evolutionary analysis. RNA (NEW YORK, N.Y.) 2022; 28:1041-1057. [PMID: 35688647 PMCID: PMC9297842 DOI: 10.1261/rna.079103.122] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/07/2022] [Accepted: 06/06/2022] [Indexed: 06/01/2023]
Abstract
The candidate phyla radiation (CPR) is a large bacterial group consisting mainly of uncultured lineages. They have small cells and small genomes, and they often lack ribosomal proteins uL1, bL9, and/or uL30, which are basically ubiquitous in non-CPR bacteria. Here, we comprehensively analyzed the genomic information on CPR bacteria and identified their unique properties. The distribution of protein lengths in CPR bacteria peaks at around 100-150 amino acids, whereas the position of the peak varies in the range of 100-300 amino acids in free-living non-CPR bacteria, and at around 100-200 amino acids in most symbiotic non-CPR bacteria. These results show that the proteins of CPR bacteria are smaller, on average, than those of free-living non-CPR bacteria, like those of symbiotic non-CPR bacteria. We found that ribosomal proteins bL28, uL29, bL32, and bL33 have been lost in CPR bacteria in a taxonomic lineage-specific manner. Moreover, the sequences of approximately half of all ribosomal proteins of CPR differ, in part, from those of non-CPR bacteria, with missing regions or specifically added regions. We also found that several regions in the 16S, 23S, and 5S rRNAs of CPR bacteria are lacking, which presumably caused the total predicted lengths of the three rRNAs of CPR bacteria to be smaller than those of non-CPR bacteria. The regions missing in the CPR ribosomal proteins and rRNAs are located near the surface of the ribosome, and some are close to one another. These observations suggest that ribosomes are smaller in CPR bacteria than those in free-living non-CPR bacteria, with simplified surface structures.
Affiliation(s)
- Megumi Tsurumaki
  - Institute for Advanced Biosciences, Keio University, Tsuruoka 997-0017, Japan
  - Systems Biology Program, Graduate School of Media and Governance, Keio University, Fujisawa 252-0882, Japan
- Motofumi Saito
  - Institute for Advanced Biosciences, Keio University, Tsuruoka 997-0017, Japan
  - Systems Biology Program, Graduate School of Media and Governance, Keio University, Fujisawa 252-0882, Japan
- Masaru Tomita
  - Institute for Advanced Biosciences, Keio University, Tsuruoka 997-0017, Japan
  - Systems Biology Program, Graduate School of Media and Governance, Keio University, Fujisawa 252-0882, Japan
  - Faculty of Environment and Information Studies, Keio University, Fujisawa 252-0882, Japan
- Akio Kanai
  - Institute for Advanced Biosciences, Keio University, Tsuruoka 997-0017, Japan
  - Systems Biology Program, Graduate School of Media and Governance, Keio University, Fujisawa 252-0882, Japan
  - Faculty of Environment and Information Studies, Keio University, Fujisawa 252-0882, Japan
6
Tynkevich YO, Shelyfist AY, Kozub LV, Hemleben V, Panchuk II, Volkov RA. 5S Ribosomal DNA of Genus Solanum: Molecular Organization, Evolution, and Taxonomy. FRONTIERS IN PLANT SCIENCE 2022; 13:852406. [PMID: 35498650 PMCID: PMC9043955 DOI: 10.3389/fpls.2022.852406] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/11/2022] [Accepted: 02/25/2022] [Indexed: 06/14/2023]
Abstract
The Solanum genus, one of the largest among higher plants, is distributed worldwide and comprises about 1,200 species. The genus includes numerous agronomically important species such as Solanum tuberosum (potato), Solanum lycopersicum (tomato), and Solanum melongena (eggplant) as well as medicinal and ornamental plants. The huge Solanum genus is a convenient model for research in the field of molecular evolution and structural and functional genomics. Clear knowledge of evolutionary relationships in the Solanum genus is required to increase the effectiveness of breeding programs, but the phylogeny of the genus is still not fully understood. The rapidly evolving intergenic spacer region (IGS) of 5S rDNA has been successfully used for inferring interspecific relationships in several groups of angiosperms. Here, combining cloning and sequencing with bioinformatic analysis of genomic data available in the SRA database, we evaluate the molecular organization and diversity of IGS for 184 accessions, representing 137 species of the Solanum genus. It was found that the main mechanism of IGS molecular evolution was step-wise accumulation of single base substitutions or short indels, and that long indels and multiple base substitutions, which arose repeatedly during evolution, were mostly not conserved and were eliminated. The reason for this negative selection seems to be an association between indels/multiple base substitutions and pseudogenization of 5S rDNA. Comparison of IGS sequences allowed us to reconstruct the phylogeny of the Solanum genus. The obtained dendrograms are mainly congruent with published data: the same major and minor clades were found. However, relationships between these clades and the position of some species (S. cochoae, S. clivorum, S. macrocarpon, and S. spirale) were different from those of previous results and require further clarification. Our results show that 5S IGS represents a convenient molecular marker for phylogenetic studies on the Solanum genus. In particular, the simultaneous presence of several structural variants of rDNA in the genome enables the detection of reticular evolution, especially in the largest and economically most important sect. Petota. The origin of several polyploid species should be reconsidered.
Affiliation(s)
- Yurij O. Tynkevich
  - Department of Molecular Genetics and Biotechnology, Yuriy Fedkovych Chernivtsi National University, Chernivtsi, Ukraine
- Antonina Y. Shelyfist
  - Department of Molecular Genetics and Biotechnology, Yuriy Fedkovych Chernivtsi National University, Chernivtsi, Ukraine
- Liudmyla V. Kozub
  - Department of Molecular Genetics and Biotechnology, Yuriy Fedkovych Chernivtsi National University, Chernivtsi, Ukraine
- Vera Hemleben
  - Center of Plant Molecular Biology (ZMBP), Eberhard Karls University of Tübingen, Tübingen, Germany
- Irina I. Panchuk
  - Department of Molecular Genetics and Biotechnology, Yuriy Fedkovych Chernivtsi National University, Chernivtsi, Ukraine
  - Center of Plant Molecular Biology (ZMBP), Eberhard Karls University of Tübingen, Tübingen, Germany
- Roman A. Volkov
  - Department of Molecular Genetics and Biotechnology, Yuriy Fedkovych Chernivtsi National University, Chernivtsi, Ukraine
7
Sun F, Caetano-Anollés G. Menzerath-Altmann's Law of Syntax in RNA Accretion History. Life (Basel) 2021; 11:489. [PMID: 34071925 PMCID: PMC8228408 DOI: 10.3390/life11060489] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/05/2021] [Revised: 05/25/2021] [Accepted: 05/26/2021] [Indexed: 01/13/2023]
Abstract
RNA evolves by adding substructural parts to growing molecules. Molecular accretion history can be dissected with phylogenetic methods that exploit structural and functional evidence. Here, we explore the statistical behaviors of lengths of double-stranded and single-stranded segments of growing tRNA, 5S rRNA, RNase P RNA, and rRNA molecules. The reconstruction of character state changes along branches of phylogenetic trees of molecules and trees of substructures revealed strong pushes towards an economy of scale. In addition, statistically significant negative correlations and strong associations between the average lengths of helical double-stranded stems and their time of origin (age) were identified with the Pearson's correlation and Spearman's rho methods. The ages of substructures were derived directly from published rooted trees of substructures. A similar negative correlation was detected in unpaired segments of rRNA but not for the other molecules studied. These results suggest a principle of diminishing returns in RNA accretion history. We show this principle follows a tendency of substructural parts to decrease their size when molecular systems enlarge that follows the Menzerath-Altmann's law of language in full generality and without interference from the details of molecular growth.
Affiliation(s)
- Fengjie Sun
  - School of Science and Technology, Georgia Gwinnett College, Lawrenceville, GA 30043, USA
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, IL 61801, USA
- Gustavo Caetano-Anollés
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, IL 61801, USA
8
Stepanov VG, Fox GE. Expansion segments in bacterial and archaeal 5S ribosomal RNAs. RNA (NEW YORK, N.Y.) 2021; 27:133-150. [PMID: 33184227 PMCID: PMC7812874 DOI: 10.1261/rna.077123.120] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/28/2020] [Accepted: 11/09/2020] [Indexed: 05/10/2023]
Abstract
The large ribosomal RNAs of eukaryotes frequently contain expansion sequences that add to the size of the rRNAs but do not affect their overall structural layout and are compatible with major ribosomal function as an mRNA translation machine. The expansion of prokaryotic ribosomal RNAs is much less explored. In order to obtain more insight into the structural variability of these conserved molecules, we herein report the results of a comprehensive search for the expansion sequences in prokaryotic 5S rRNAs. Overall, 89 expanded 5S rRNAs of 15 structural types were identified in 15 archaeal and 36 bacterial genomes. Expansion segments ranging in length from 13 to 109 residues were found to be distributed among 17 insertion sites. The strains harboring the expanded 5S rRNAs belong to the bacterial orders Clostridiales, Halanaerobiales, Thermoanaerobacterales, and Alteromonadales as well as the archaeal order Halobacteriales. When several copies of a 5S rRNA gene are present in a genome, the expanded versions may coexist with normal 5S rRNA genes. The insertion sequences are typically capable of forming extended helices, which do not seemingly interfere with folding of the conserved core. The expanded 5S rRNAs have largely been overlooked in 5S rRNA databases.
MESH Headings
- Alteromonadaceae/classification
- Alteromonadaceae/genetics
- Alteromonadaceae/metabolism
- Base Pairing
- Base Sequence
- Clostridiales/classification
- Clostridiales/genetics
- Clostridiales/metabolism
- Firmicutes/classification
- Firmicutes/genetics
- Firmicutes/metabolism
- Genome, Archaeal
- Genome, Bacterial
- Halobacteriales/classification
- Halobacteriales/genetics
- Halobacteriales/metabolism
- Nucleic Acid Conformation
- Phylogeny
- RNA, Archaeal/chemistry
- RNA, Archaeal/genetics
- RNA, Archaeal/metabolism
- RNA, Bacterial/chemistry
- RNA, Bacterial/genetics
- RNA, Bacterial/metabolism
- RNA, Ribosomal, 5S/chemistry
- RNA, Ribosomal, 5S/genetics
- RNA, Ribosomal, 5S/metabolism
- Thermoanaerobacterium/classification
- Thermoanaerobacterium/genetics
- Thermoanaerobacterium/metabolism
Affiliation(s)
- Victor G Stepanov
  - Department of Biology and Biochemistry, University of Houston, Houston, Texas 77204-5001, USA
- George E Fox
  - Department of Biology and Biochemistry, University of Houston, Houston, Texas 77204-5001, USA
9
Long X, Xue H, Wong JTF. Descent of Bacteria and Eukarya From an Archaeal Root of Life. Evol Bioinform Online 2020; 16:1176934320908267. [PMID: 32636606 PMCID: PMC7313328 DOI: 10.1177/1176934320908267] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 12/11/2019] [Accepted: 01/30/2020] [Indexed: 02/05/2023]
Abstract
The 3 biological domains delineated based on small subunit ribosomal RNAs (SSU rRNAs) are confronted by uncertainties regarding the relationship between Archaea and Bacteria, and the origin of Eukarya. The similarities between the paralogous valyl-tRNA and isoleucyl-tRNA synthetases in 5398 species estimated by BLASTP, which decreased from Archaea to Bacteria and further to Eukarya, were consistent with vertical gene transmission from an archaeal root of life close to Methanopyrus kandleri through a Primitive Archaea Cluster to an Ancestral Bacteria Cluster, and to Eukarya. The predominant similarities of the ribosomal proteins (rProts) of eukaryotes toward archaeal rProts relative to bacterial rProts established that an archaeal parent rather than a bacterial parent underwent genome merger with bacteria to generate eukaryotes with mitochondria. Eukaryogenesis benefited from the predominantly archaeal accelerated gene adoption (AGA) phenotype pertaining to horizontally transferred genes from other prokaryotes and expedited genome evolution via both gene-content mutations and nucleotidyl mutations. Archaeons endowed with substantial AGA activity were accordingly favored as candidate archaeal parents. Based on the top similarity bitscores displayed by their proteomes toward the eukaryotic proteomes of Giardia and Trichomonas, and high AGA activity, the Aciduliprofundum archaea were identified as leading candidates of the archaeal parent. The Asgard archaeons and a number of bacterial species were among the foremost potential contributors of eukaryotic-like proteins to Eukarya.
Affiliation(s)
- Xi Long
  - Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong, China
- Hong Xue
  - Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong, China
- J Tze-Fei Wong
  - Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong, China
10
Caetano-Anollés G, Aziz MF, Mughal F, Gräter F, Koç I, Caetano-Anollés K, Caetano-Anollés D. Emergence of Hierarchical Modularity in Evolving Networks Uncovered by Phylogenomic Analysis. Evol Bioinform Online 2019; 15:1176934319872980. [PMID: 31523127 PMCID: PMC6728656 DOI: 10.1177/1176934319872980] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 08/06/2019] [Accepted: 08/08/2019] [Indexed: 01/15/2023]
Abstract
Networks describe how parts associate with each other to form integrated systems which often have modular and hierarchical structure. In biology, network growth involves two processes, one that unifies and the other that diversifies. Here, we propose a biphasic (bow-tie) theory of module emergence. In the first phase, parts are at first weakly linked and associate variously. As they diversify, they compete with each other and are often selected for performance. The emerging interactions constrain their structure and associations. This causes parts to self-organize into modules with tight linkage. In the second phase, variants of the modules diversify and become new parts for a new generative cycle of higher level organization. The paradigm predicts the rise of hierarchical modularity in evolving networks at different timescales and complexity levels. Remarkably, phylogenomic analyses uncover this emergence in the rewiring of metabolomic and transcriptome-informed metabolic networks, the nanosecond dynamics of proteins, and evolving networks of metabolism, elementary functionomes, and protein domain organization.
Affiliation(s)
- Gustavo Caetano-Anollés
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, C.R. Woese Institute for Genomic Biology, and Illinois Informatics Institute, University of Illinois, Urbana, IL, USA
- M Fayez Aziz
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, C.R. Woese Institute for Genomic Biology, and Illinois Informatics Institute, University of Illinois, Urbana, IL, USA
- Fizza Mughal
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, C.R. Woese Institute for Genomic Biology, and Illinois Informatics Institute, University of Illinois, Urbana, IL, USA
- Frauke Gräter
  - Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
- Ibrahim Koç
  - Department of Molecular Biology and Genetics, Gebze Technical University, Gebze, Turkey
- Kelsey Caetano-Anollés
  - Division of Biomedical Informatics, College of Medicine, Seoul National University, Seoul, Republic of Korea
11
Caetano-Anollés G, Nasir A, Kim KM, Caetano-Anollés D. Rooting Phylogenies and the Tree of Life While Minimizing Ad Hoc and Auxiliary Assumptions. Evol Bioinform Online 2018; 14:1176934318805101. [PMID: 30364468 PMCID: PMC6196624 DOI: 10.1177/1176934318805101] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 01/25/2018] [Accepted: 09/05/2018] [Indexed: 12/25/2022]
Abstract
Phylogenetic methods unearth evolutionary history when supported by three starting points of reason: (1) the continuity axiom begs the existence of a "model" of evolutionary change, (2) the singularity axiom defines the historical ground plan (phylogeny) in which biological entities (taxa) evolve, and (3) the memory axiom demands identification of biological attributes (characters) with historical information. Axiom consequences are interlinked, making the retrodiction enterprise an endeavor of reciprocal fulfillment. In particular, establishing direction of evolutionary change (character polarization) roots phylogenies and enables testing the existence of historical memory (homology). Unfortunately, rooting phylogenies, especially the "tree of life," generally follow narratives instead of integrating empirical and theoretical knowledge of retrodictive exploration. This stems mostly from a focus on molecular sequence analysis and uncertainties about rooting methods. Here, we review available rooting criteria, highlighting the need to minimize both ad hoc and auxiliary assumptions, especially argumentative ad hocness. We show that while the outgroup comparison method has been widely adopted, the generality criterion of nesting and additive phylogenetic change embodied in Weston rule offers the most powerful rooting approach. We also propose a change of focus, from phylogenies that describe the evolution of biological systems to those that describe the evolution of parts of those systems. This weakens violation of character independence, helps formalize the generality criterion of rooting, and provides new ways to study the problem of evolution.
Affiliation(s)
- Gustavo Caetano-Anollés
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Arshan Nasir
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL, USA
  - Department of Biosciences, COMSATS University Islamabad, Islamabad, Pakistan
- Kyung Mo Kim
  - Division of Polar Life Sciences, Korea Polar Research Institute, Incheon, Republic of Korea
- Derek Caetano-Anollés
  - Department of Evolutionary Genetics, Max-Planck-Institut für Evolutionsbiologie, Plön, Germany
|
12
|
Caetano-Anollés D, Caetano-Anollés K, Caetano-Anollés G. Evolution of macromolecular structure: a 'double tale' of biological accretion and diversification. Sci Prog 2018; 101:360-383. [PMID: 30296968 PMCID: PMC10365222 DOI: 10.3184/003685018x15379391431599] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 01/10/2023]
Abstract
The evolution of structure in biology is driven by accretion and diversification. Accretion brings together disparate parts to form bigger wholes. Diversification provides opportunities for growth and innovation. Here, we review patterns and processes that are responsible for a 'double tale' of accretion and diversification at various levels of complexity, from proteins and nucleic acids to high-rise building structures in cities. Parts are at first weakly linked and associate variously. As they diversify, they compete with each other and are selected for performance. The emerging interactions constrain their structure and associations. This causes parts to self-organise into modules with tight linkage. In a second phase, variants of the modules evolve and become new parts for a new generative cycle of higher-level organisation. Evolutionary genomics and network biology support the 'double tale' of structural module creation and validate an evolutionary principle of maximum abundance that drives the gain and loss of modules.
Affiliation(s)
- Derek Caetano-Anollés
  - Department of Evolutionary Genetics of the Max-Planck Institute for Evolutionary Biology, Plön, Germany. Developmental Biology from the University of Illinois, Urbana-Champaign
- Kelsey Caetano-Anollés
  - Division of Biomedical Informatics of Seoul National University College of Medicine, Republic of Korea. Animal Sciences from the University of Illinois, Urbana-Champaign
- Gustavo Caetano-Anollés
  - Department of Crop Sciences and Affiliate of the C.R. Woese Institute for Genomic Biology at the University of Illinois, Urbana-Champaign. University of La Plata in Argentina
|
13
|
Staley JT, Caetano-Anollés G. Archaea-First and the Co-Evolutionary Diversification of Domains of Life. Bioessays 2018; 40:e1800036. [PMID: 29944192 DOI: 10.1002/bies.201800036] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 02/19/2018] [Revised: 05/12/2018] [Indexed: 12/13/2022]
Abstract
The origins and evolution of the Archaea, Bacteria, and Eukarya remain controversial. Phylogenomic-wide studies of molecular features that are evolutionarily conserved, such as protein structural domains, suggest Archaea is the first domain of life to diversify from a stem line of descent. This line embodies the last universal common ancestor of cellular life. Here, we propose that ancestors of Euryarchaeota co-evolved with those of Bacteria prior to the diversification of Eukarya. This co-evolutionary scenario is supported by comparative genomic and phylogenomic analyses of the distributions of fold families of domains in the proteomes of free-living organisms, which show horizontal gene recruitments and informational process homologies. It also benefits from the molecular study of cell physiologies responsible for membrane phospholipids, methanogenesis, methane oxidation, cell division, gas vesicles, and the cell wall. Our theory, however, challenges popular cell fusion and two-domains-of-life scenarios derived from sequence analysis, demanding phylogenetic reconciliation. Also see the video abstract here: https://youtu.be/9yVWn_Q9faY.
Affiliation(s)
- James T Staley
  - Department of Microbiology and Astrobiology Program, University of Washington, Seattle, WA, 98195, USA
- Gustavo Caetano-Anollés
  - Department of Crop Sciences, C. R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
|
14
|
Coevolution Theory of the Genetic Code at Age Forty: Pathway to Translation and Synthetic Life. Life (Basel) 2016; 6:life6010012. [PMID: 26999216 PMCID: PMC4810243 DOI: 10.3390/life6010012] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.7] [Received: 01/08/2016] [Revised: 02/26/2016] [Accepted: 03/04/2016] [Indexed: 11/17/2022] Open
Abstract
The origins of the components of genetic coding are examined in the present study. Genetic information arose from replicator induction by metabolite in accordance with the metabolic expansion law. Messenger RNA and transfer RNA stemmed from a template for binding the aminoacyl-RNA synthetase ribozymes employed to synthesize peptide prosthetic groups on RNAs in the Peptidated RNA World. Coevolution of the genetic code with amino acid biosynthesis generated tRNA paralogs that identify a last universal common ancestor (LUCA) of extant life close to Methanopyrus, which in turn points to archaeal tRNA introns as the most primitive introns and the anticodon usage of Methanopyrus as an ancient mode of wobble. The prediction of the coevolution theory of the genetic code that the code should be a mutable code has led to the isolation of optional and mandatory synthetic life forms with altered protein alphabets.
|
15
|
Castro SI, Hleap JS, Cárdenas H, Blouin C. Molecular organization of the 5S rDNA gene type II in elasmobranchs. RNA Biol 2015; 13:391-9. [PMID: 26488198 PMCID: PMC4841605 DOI: 10.1080/15476286.2015.1100796] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/06/2015] [Accepted: 09/21/2015] [Indexed: 10/22/2022] Open
Abstract
The 5S rDNA gene encodes a non-coding RNA and can be found in 2 copies (type I and type II) in bony and cartilaginous fish. Previous studies have pointed out that the type II gene is a paralog derived from type I. We analyzed the molecular organization of 5S rDNA type II in elasmobranchs. Although the structure of the 5S rDNA is supposed to be highly conserved, our results show that the secondary structure in this group possesses some variability and differs from the consensus secondary structure. One of these differences in Selachii is an internal loop at nucleotides 7 and 112. These mutations observed in the transcribed region suggest an independent origin of the gene among Batoids and Selachii. All promoters were highly conserved with the exception of BoxA, possibly due to its affinity to polymerase III. This latter enzyme recognizes a dT4 sequence as a stop signal; however, in Rajiformes this signal was doubled in length to dT8. This could be an adaptation toward a higher efficiency in the termination process. Our results suggest that there is no TATA box in elasmobranchs in the NTS region. We also provide some evidence suggesting that the complexity of the microsatellites present in the NTS region plays an important role in the 5S rRNA gene since it is significantly correlated with the length of the NTS.
Affiliation(s)
- Sergio I. Castro
  - Grupo de Estudios en Genética Ecología Molecular y Fisiología Animal, Universidad del Valle, Cali, Colombia
  - Fundación Colombiana para la Investigación y Conservación de Tiburones y Rayas, SQUALUS. Cali, Colombia
- Jose S. Hleap
  - Grupo de Estudios en Genética Ecología Molecular y Fisiología Animal, Universidad del Valle, Cali, Colombia
  - Fundación Colombiana para la Investigación y Conservación de Tiburones y Rayas, SQUALUS. Cali, Colombia
  - Canadian Institute for Advanced Research, Program in Evolutionary Biology, Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Canada
- Heiber Cárdenas
  - Grupo de Estudios en Genética Ecología Molecular y Fisiología Animal, Universidad del Valle, Cali, Colombia
- Christian Blouin
  - Canadian Institute for Advanced Research, Program in Evolutionary Biology, Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Canada
  - Department of Computer Science, Dalhousie University, Halifax, Canada
|
16
|
Caetano-Anollés G, Caetano-Anollés D. Computing the origin and evolution of the ribosome from its structure - Uncovering processes of macromolecular accretion benefiting synthetic biology. Comput Struct Biotechnol J 2015; 13:427-47. [PMID: 27096056 PMCID: PMC4823900 DOI: 10.1016/j.csbj.2015.07.003] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 04/09/2015] [Revised: 07/16/2015] [Accepted: 07/19/2015] [Indexed: 12/11/2022] Open
Abstract
Accretion occurs pervasively in nature at widely different timeframes. The process also manifests in the evolution of macromolecules. Here we review recent computational and structural biology studies of evolutionary accretion that make use of the ideographic (historical, retrodictive) and nomothetic (universal, predictive) scientific frameworks. Computational studies uncover explicit timelines of accretion of structural parts in molecular repertoires and molecules. Phylogenetic trees of protein structural domains and proteomes and their molecular functions were built from a genomic census of millions of encoded proteins and associated terminal Gene Ontology terms. Trees reveal a ‘metabolic-first’ origin of proteins, the late development of translation, and a patchwork distribution of proteins in biological networks mediated by molecular recruitment. Similarly, the natural history of ancient RNA molecules inferred from trees of molecular substructures built from a census of molecular features shows patchwork-like accretion patterns. Ideographic analyses of ribosomal history uncover the early appearance of structures supporting mRNA decoding and tRNA translocation, the coevolution of ribosomal proteins and RNA, and a first evolutionary transition that brings ribosomal subunits together into a processive protein biosynthetic complex. Nomothetic structural biology studies of tertiary interactions and ancient insertions in rRNA complement these findings, once concentric layering assumptions are removed. Patterns of coaxial helical stacking reveal a frustrated dynamics of outward and inward ribosomal growth possibly mediated by structural grafting. The early rise of the ribosomal ‘turnstile’ suggests an evolutionary transition in natural biological computation. Results make explicit the need to understand processes of molecular growth and information transfer of macromolecules.
Affiliation(s)
- Gustavo Caetano-Anollés
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, 1101 W. Peabody Drive, Urbana, IL 61801, USA; C.R. Woese Institute for Genomic Biology, University of Illinois, Urbana, IL 61801, USA
- Derek Caetano-Anollés
  - C.R. Woese Institute for Genomic Biology, University of Illinois, Urbana, IL 61801, USA
|
17
|
Ancestral Insertions and Expansions of rRNA do not Support an Origin of the Ribosome in Its Peptidyl Transferase Center. J Mol Evol 2015; 80:162-5. [PMID: 25864085 PMCID: PMC4555209 DOI: 10.1007/s00239-015-9677-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 12/31/2014] [Accepted: 03/29/2015] [Indexed: 01/06/2023]
Abstract
Phylogenetic reconstruction of ribosomal history suggests that the ribonucleoprotein complex originated in structures supporting RNA decoding and ribosomal mechanics. A recent study of accretion of ancestral expansion segments of rRNA, however, contends that the large subunit of the ribosome originated in its peptidyl transferase center (PTC). Here I re-analyze the rRNA insertion data that supports this claim. Analysis of a crucial three-way junction connecting the long-helical coaxial branch that supports the PTC to the L1 stalk and its translocation functions reveals an incorrect branch-to-trunk insertion assignment that is in conflict with the PTC-centered accretion model. Instead, the insertion supports the ancestral origin of translocation. Similarly, an insertion linking a terminal coaxial trunk that holds the L7–12 stalk and its GTPase center to a seven-way junction of the molecule again questions the early origin of the PTC. Unwarranted assumptions, dismissals of conflicting data, structural insertion ambiguities, and lack of phylogenetic information compromise the construction of an unequivocal insertion-based model of macromolecular accretion. Results prompt integration of phylogenetic and structure-based models to address RNA junction growth and evolutionary constraints acting on ribosomal structure.
|
18
|
A phylogenomic census of molecular functions identifies modern thermophilic archaea as the most ancient form of cellular life. ARCHAEA-AN INTERNATIONAL MICROBIOLOGICAL JOURNAL 2014; 2014:706468. [PMID: 25249790 PMCID: PMC4164138 DOI: 10.1155/2014/706468] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 07/06/2013] [Revised: 11/20/2013] [Accepted: 01/17/2014] [Indexed: 12/30/2022]
Abstract
The origins of diversified life remain mysterious despite considerable efforts devoted to untangling the roots of the universal tree of life. Here we reconstructed phylogenies that described the evolution of molecular functions and the evolution of species directly from a genomic census of gene ontology (GO) definitions. We sampled 249 free-living genomes spanning organisms in the three superkingdoms of life, Archaea, Bacteria, and Eukarya, and used the abundance of GO terms as molecular characters to produce rooted phylogenetic trees. Results revealed an early thermophilic origin of Archaea that was followed by genome reduction events in microbial superkingdoms. Eukaryal genomes displayed extraordinary functional diversity and were enriched with hundreds of novel molecular activities not detected in the akaryotic microbial cells. Remarkably, the majority of these novel functions appeared quite late in evolution, synchronized with the diversification of the eukaryal superkingdom. The distribution of GO terms in superkingdoms confirms that Archaea appears to be the simplest and most ancient form of cellular life, while Eukarya is the most diverse and recent.
|
19
|
Kim KM, Nasir A, Hwang K, Caetano-Anollés G. A tree of cellular life inferred from a genomic census of molecular functions. J Mol Evol 2014; 79:240-62. [PMID: 25128982 DOI: 10.1007/s00239-014-9637-9] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 06/30/2014] [Accepted: 08/05/2014] [Indexed: 10/24/2022]
Abstract
Phylogenomics aims to describe evolutionary relatedness between organisms by analyzing genomic data. The common practice is to produce phylogenomic trees from molecular information in the sequence, order, and content of genes in genomes. These phylogenies describe the evolution of life and become valuable tools for taxonomy. The recent availability of structural and functional data for hundreds of genomes now offers the opportunity to study evolution using deeper, more conserved, and more reliable sets of molecular features. Here, we reconstruct trees of life from the functions of proteins. We start by inferring rooted phylogenomic trees and networks of organisms directly from Gene Ontology annotations. Phylogenies and networks yield novel insights into the emergence and evolution of cellular life. The ancestor of Archaea originated earlier than the ancestors of Bacteria and Eukarya and was thermophilic. In contrast, basal bacterial lineages were non-thermophilic. A close relationship between Plants and Metazoa was also identified that disagrees with the traditional Fungi-Metazoa grouping. While measures of evolutionary reticulation were minimum in Eukarya and maximum in Bacteria, the massive role of horizontal gene transfer in microbes did not materialize in phylogenomic networks. Phylogenies and networks also showed that the best reconstructions were recovered when problematic taxa (i.e., parasitic/symbiotic organisms) and horizontally transferred characters were excluded from analysis. Our results indicate that functionomic data represent a useful addition to the set of molecular characters used for tree reconstruction and that trees of cellular life carry in deep branches considerable predictive power to explain the evolution of living organisms.
Affiliation(s)
- Kyung Mo Kim
  - Microbial Resource Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, 305-806, Korea
|
20
|
Caetano-Anollés G, Nasir A, Zhou K, Caetano-Anollés D, Mittenthal JE, Sun FJ, Kim KM. Archaea: the first domain of diversified life. ARCHAEA (VANCOUVER, B.C.) 2014; 2014:590214. [PMID: 24987307 PMCID: PMC4060292 DOI: 10.1155/2014/590214] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 09/30/2013] [Revised: 02/15/2014] [Accepted: 03/25/2014] [Indexed: 01/23/2023]
Abstract
The study of the origin of diversified life has been plagued by technical and conceptual difficulties, controversy, and apriorism. It is now popularly accepted that the universal tree of life is rooted in the akaryotes and that Archaea and Eukarya are sister groups to each other. However, evolutionary studies have overwhelmingly focused on nucleic acid and protein sequences, which partially fulfill only two of the three main steps of phylogenetic analysis: formulation of realistic evolutionary models and optimization of tree reconstruction. In the absence of character polarization, that is, the ability to identify ancestral and derived character states, any statement about the rooting of the tree of life should be considered suspect. Here we show that macromolecular structure and a new phylogenetic framework of analysis that focuses on the parts of biological systems instead of the whole provide both deep and reliable phylogenetic signal and enable us to put forth hypotheses of origin. We review over a decade of phylogenomic studies, which mine information in a genomic census of millions of encoded proteins and RNAs. We show how the use of process models of molecular accumulation that comply with Weston's generality criterion supports a consistent phylogenomic scenario in which the origin of diversified life can be traced back to the early history of Archaea.
Affiliation(s)
- Gustavo Caetano-Anollés
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, Institute for Genomic Biology and Illinois Informatics Institute, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Arshan Nasir
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, Institute for Genomic Biology and Illinois Informatics Institute, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Kaiyue Zhou
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, Institute for Genomic Biology and Illinois Informatics Institute, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Derek Caetano-Anollés
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, Institute for Genomic Biology and Illinois Informatics Institute, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Jay E. Mittenthal
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, Institute for Genomic Biology and Illinois Informatics Institute, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Feng-Jie Sun
  - School of Science and Technology, Georgia Gwinnett College, Lawrenceville, GA 30043, USA
- Kyung Mo Kim
  - Microbial Resource Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon 305-806, Republic of Korea
|
21
|
Caetano-Anollés G, Sun FJ. The natural history of transfer RNA and its interactions with the ribosome. Front Genet 2014; 5:127. [PMID: 24847358 PMCID: PMC4023039 DOI: 10.3389/fgene.2014.00127] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 03/30/2014] [Accepted: 04/22/2014] [Indexed: 12/20/2022] Open
Affiliation(s)
- Gustavo Caetano-Anollés
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois Urbana-Champaign, IL, USA
- Feng-Jie Sun
  - School of Science and Technology, Georgia Gwinnett College Lawrenceville, GA, USA
|
22
|
Comparative analysis of proteomes and functionomes provides insights into origins of cellular diversification. ARCHAEA-AN INTERNATIONAL MICROBIOLOGICAL JOURNAL 2013; 2013:648746. [PMID: 24492748 PMCID: PMC3892558 DOI: 10.1155/2013/648746] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 09/30/2013] [Revised: 11/22/2013] [Accepted: 11/25/2013] [Indexed: 11/22/2022]
Abstract
Reconstructing the evolutionary history of modern species is a difficult problem complicated by the conceptual and technical limitations of phylogenetic tree building methods. Here, we propose a comparative proteomic and functionomic inferential framework for genome evolution that allows resolving the tripartite division of cells and sketching their history. Evolutionary inferences were derived from the spread of conserved molecular features, such as molecular structures and functions, in the proteomes and functionomes of contemporary organisms. Patterns of use and reuse of these traits yielded significant insights into the origins of cellular diversification. Results uncovered an unprecedented strong evolutionary association between Bacteria and Eukarya while revealing marked evolutionary reductive tendencies in the archaeal genomic repertoires. The effects of nonvertical evolutionary processes (e.g., HGT, convergent evolution) were found to be limited while reductive evolution and molecular innovation appeared to be prevalent during the evolution of cells. Our study revealed a strong vertical trace in the history of proteins and associated molecular functions, which was reliably recovered using the comparative genomics approach. The trace supported the existence of a stem line of descent and the very early appearance of Archaea as a diversified superkingdom, but failed to uncover a hidden canonical pattern in which Bacteria was the first superkingdom to deploy superkingdom-specific structures and functions.
|
23
|
Systematic application of DNA fiber-FISH technique in cotton. PLoS One 2013; 8:e75674. [PMID: 24086609 PMCID: PMC3785504 DOI: 10.1371/journal.pone.0075674] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 05/20/2013] [Accepted: 08/09/2013] [Indexed: 01/16/2023] Open
Abstract
Fluorescence in situ hybridization on extended DNA (fiber-FISH) is a powerful tool in high-resolution physical mapping. To introduce this technique into cotton, we developed the technique and tested it by deliberately mapping telomeres and 5S rDNA. Results showed that telomere length ranged from 0.80 kb to 37.86 kb in three species, G. hirsutum, G. herbaceum and G. arboreum. However, most of the telomeres (>91.0%) were below 10 kb. The length of 5S rDNA was revealed as 964 kb in G. herbaceum whereas, in G. arboreum, it was approximately three times longer (3.1 Mb). A fiber-FISH based immunofluorescence method was also described to assay DNA methylation. Using this technique, we revealed that both telomere and 5S rDNA were methylated at different levels. In addition, we developed a BAC molecule-based fiber-FISH technique. Using this technique, we can precisely map BAC clones on each other and evaluate the size and location of overlapped regions. The development and application of the fiber-FISH technique will facilitate high-resolution physical mapping and further directed sequencing projects for cotton.
|
24
|
Caetano-Anollés G, Wang M, Caetano-Anollés D. Structural phylogenomics retrodicts the origin of the genetic code and uncovers the evolutionary impact of protein flexibility. PLoS One 2013; 8:e72225. [PMID: 23991065 PMCID: PMC3749098 DOI: 10.1371/journal.pone.0072225] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Received: 03/28/2013] [Accepted: 07/07/2013] [Indexed: 11/18/2022] Open
Abstract
The genetic code shapes the genetic repository. Its origin has puzzled molecular scientists for over half a century and remains a long-standing mystery. Here we show that the origin of the genetic code is tightly coupled to the history of aminoacyl-tRNA synthetase enzymes and their interactions with tRNA. A timeline of evolutionary appearance of protein domain families derived from a structural census in hundreds of genomes reveals the early emergence of the 'operational' RNA code and the late implementation of the standard genetic code. The emergence of codon specificities and amino acid charging involved tight coevolution of aminoacyl-tRNA synthetases and tRNA structures as well as episodes of structural recruitment. Remarkably, amino acid and dipeptide compositions of single-domain proteins appearing before the standard code suggest archaic synthetases with structures homologous to catalytic domains of tyrosyl-tRNA and seryl-tRNA synthetases were capable of peptide bond formation and aminoacylation. Results reveal that genetics arose through coevolutionary interactions between polypeptides and nucleic acid cofactors as an exacting mechanism that favored flexibility and folding of the emergent proteins. These enhancements of phenotypic robustness were likely internalized into the emerging genetic system with the early rise of modern protein structure.
Affiliation(s)
- Gustavo Caetano-Anollés
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, Illinois, United States of America
- Minglei Wang
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, Illinois, United States of America
- Derek Caetano-Anollés
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, Illinois, United States of America
|
25
|
Romance of the three domains: how cladistics transformed the classification of cellular organisms. Protein Cell 2013; 4:664-76. [PMID: 23873078 PMCID: PMC4875529 DOI: 10.1007/s13238-013-3050-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 06/21/2013] [Accepted: 07/01/2013] [Indexed: 11/23/2022] Open
Abstract
Cladistics is a biological philosophy that uses genealogical relationship among species and an inferred sequence of divergence as the basis of classification. This review critically surveys the chronological development of biological classification from Aristotle through our postgenomic era with a central focus on cladistics. In 1957, Julian Huxley coined cladogenesis to denote splitting from subspeciation. In 1960, the English translation of Willi Hennig’s 1950 work, Systematic Phylogenetics, was published, which received strong opposition from pheneticists, such as numerical taxonomists Peter Sneath and Robert Sokal, and evolutionary taxonomist, Ernst Mayr, and sparked acrimonious debates in 1960–1980. In 1977–1990, Carl Woese pioneered the use of small subunit rRNA gene sequences to delimitate the three domains of cellular life and established major prokaryotic phyla. Cladistics has since dominated taxonomy. Despite being compatible with modern microbiological observations, i.e. organisms with unusual phenotypes, restricted expression of characteristics and occasionally being uncultivable, increasing recognition of pervasiveness and abundance of horizontal gene transfer has challenged relevance and validity of cladistics. The mosaic nature of eukaryotic and prokaryotic genomes was also gradually discovered. In the mid-2000s, high-throughput and whole-genome sequencing became routine and complex genealogies of organisms have led to the proposal of a reticulated web of life. While genomics only indirectly leads to understanding of functional adaptations to ecological niches, computational modeling of entire organisms is underway and the gap between genomics and phenetics may soon be bridged. Controversies are not expected to settle as taxonomic classifications shall remain subjective to serve the human scientist, not the classified.
|
26
|
Bukhari SA, Caetano-Anollés G. Origin and evolution of protein fold designs inferred from phylogenomic analysis of CATH domain structures in proteomes. PLoS Comput Biol 2013; 9:e1003009. [PMID: 23555236 PMCID: PMC3610613 DOI: 10.1371/journal.pcbi.1003009] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Received: 10/25/2012] [Accepted: 02/13/2013] [Indexed: 12/22/2022] Open
Abstract
The spatial arrangements of secondary structures in proteins, irrespective of their connectivity, depict the overall shape and organization of protein domains. These features have been used in the CATH and SCOP classifications to hierarchically partition fold space and define the architectural makeup of proteins. Here we use phylogenomic methods and a census of CATH structures in hundreds of genomes to study the origin and diversification of protein architectures (A) and their associated topologies (T) and superfamilies (H). Phylogenies that describe the evolution of domain structures and proteomes were reconstructed from the structural census and used to generate timelines of domain discovery. Phylogenies of CATH domains at T and H levels of structural abstraction and associated chronologies revealed patterns of reductive evolution, the early rise of Archaea, three epochs in the evolution of the protein world, and patterns of structural sharing between superkingdoms. Phylogenies of proteomes confirmed the early appearance of Archaea. While these findings are in agreement with previous phylogenomic studies based on the SCOP classification, phylogenies unveiled sharing patterns between Archaea and Eukarya that are recent and can explain the canonical bacterial rooting typically recovered from sequence analysis. Phylogenies of CATH domains at A level uncovered general patterns of architectural origin and diversification. The tree of A structures showed that ancient structural designs such as the 3-layer (αβα) sandwich (3.40) or the orthogonal bundle (1.10) are comparatively simpler in their makeup and are involved in basic cellular functions. In contrast, modern structural designs such as prisms, propellers, 2-solenoid, super-roll, clam, trefoil and box are not widely distributed and were probably adopted to perform specialized functions. Our timelines therefore uncover a universal tendency towards protein structural complexity that is remarkable.
Affiliation(s)
- Syed Abbas Bukhari
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, Illinois, United States of America
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, Illinois, United States of America
|
27
|
Caetano-Anollés K, Caetano-Anollés G. Structural phylogenomics reveals gradual evolutionary replacement of abiotic chemistries by protein enzymes in purine metabolism. PLoS One 2013; 8:e59300. [PMID: 23516625 PMCID: PMC3596326 DOI: 10.1371/journal.pone.0059300] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 07/11/2012] [Accepted: 02/13/2013] [Indexed: 11/30/2022] Open
Abstract
The origin of metabolism has been linked to abiotic chemistries that existed on our planet at the beginning of life. While plausible chemical pathways have been proposed, including the synthesis of nucleobases, ribose and ribonucleotides, the cooption of these reactions by modern enzymes remains shrouded in mystery. Here we study the emergence of purine metabolism. The ages of protein domains derived from a census of fold family structure in hundreds of genomes were mapped onto enzymes in metabolic diagrams. We find that the origin of the nucleotide interconversion pathway benefited most parsimoniously from the prebiotic formation of adenine nucleosides. In turn, pathways of nucleotide biosynthesis, catabolism and salvage originated ∼300 million years later by concerted enzymatic recruitments and gradual replacement of abiotic chemistries. Remarkably, this process led to the emergence of the fully enzymatic biosynthetic pathway ∼3 billion years ago, concurrently with the appearance of a functional ribosome. The simultaneous appearance of purine biosynthesis and the ribosome probably fulfilled the expanding matter-energy and processing needs of genomic information.
Affiliation(s)
- Kelsey Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
- Chicago School of Professional Psychology, Chicago, Illinois, United States of America
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
|
28
|
Korobeinikova AV, Garber MB, Gongadze GM. Ribosomal proteins: structure, function, and evolution. Biochemistry (Moscow) 2012; 77:562-74. [PMID: 22817455 DOI: 10.1134/s0006297912060028] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.0] [Indexed: 01/08/2023]
Abstract
The question of why ribosomal proteins are so varied, first raised more than 40 years ago, is still open. Ribosomes of modern organisms contain 50-80 individual proteins. Some are characteristic of all domains of life (universal ribosomal proteins), whereas others are specific to bacteria, archaea, or eukaryotes. Extensive information about ribosomal proteins has been obtained since that time. However, the role of most ribosomal proteins in the formation and functioning of the ribosome is still unclear. Based on recent experimental and bioinformatic data, this review presents a comprehensive evaluation of the structural conservation of ribosomal proteins from evolutionarily distant organisms. Considering the current knowledge about features of the structural organization of the universal proteins and their intermolecular contacts, a possible role of individual proteins and their structural elements in the formation and functioning of ribosomes is discussed. The structural and functional conservation of the majority of proteins of this group suggests that they were present in the ribosome already at the early stages of its evolution.
Affiliation(s)
- A V Korobeinikova
- Institute of Protein Research, Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia
|
29
|
Abstract
5S rRNA is an integral component of the ribosome of all living organisms. It is known that the ribosome without 5S rRNA is functionally inactive. However, the question about the specific role of this RNA in functioning of the translation apparatus is still open. This review presents a brief history of the discovery of 5S rRNA and studies of its origin and localization in the ribosome. The previously expressed hypotheses about the role of this RNA in the functioning of the ribosome are discussed considering the unique location of 5S rRNA in the ribosome and its intermolecular contacts. Based on analysis of the current data on ribosome structure and its functional complexes, the role of 5S rRNA as an intermediary between ribosome functional domains is discussed.
Affiliation(s)
- G M Gongadze
- Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow Region, Russia.
|
30
|
The Origin of the 5S Ribosomal RNA Molecule Could Have Been Caused by a Single Inverse Duplication: Strong Evidence from Its Sequences. J Mol Evol 2012; 74:170-86. [DOI: 10.1007/s00239-012-9497-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 10/28/2011] [Accepted: 03/23/2012] [Indexed: 10/28/2022]
|
31
|
Kim KM, Caetano-Anollés G. The evolutionary history of protein fold families and proteomes confirms that the archaeal ancestor is more ancient than the ancestors of other superkingdoms. BMC Evol Biol 2012; 12:13. [PMID: 22284070 PMCID: PMC3306197 DOI: 10.1186/1471-2148-12-13] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.8] [Received: 10/23/2011] [Accepted: 01/27/2012] [Indexed: 11/23/2022] Open
Abstract
Background The entire evolutionary history of life can be studied using myriad sequences generated by genomic research. This includes the appearance of the first cells and of superkingdoms Archaea, Bacteria, and Eukarya. However, the use of molecular sequence information for deep phylogenetic analyses is limited by mutational saturation, differential evolutionary rates, lack of sequence site independence, and other biological and technical constraints. In contrast, protein structures are evolutionary modules that are highly conserved and diverse enough to enable deep historical exploration. Results Here we build phylogenies that describe the evolution of proteins and proteomes. These phylogenetic trees are derived from a genomic census of protein domains defined at the fold family (FF) level of structural classification. Phylogenomic trees of FF structures were reconstructed from genomic abundance levels of 2,397 FFs in 420 proteomes of free-living organisms. These trees defined timelines of domain appearance, with time spanning from the origin of proteins to the present. Timelines are divided into five different evolutionary phases according to patterns of sharing of FFs among superkingdoms: (1) a primordial protein world, (2) reductive evolution and the rise of Archaea, (3) the rise of Bacteria from the common ancestor of Bacteria and Eukarya and early development of the three superkingdoms, (4) the rise of Eukarya and widespread organismal diversification, and (5) eukaryal diversification. The relative ancestry of the FFs shows that reductive evolution by domain loss is dominant in the first three phases and is responsible for both the diversification of life from a universal cellular ancestor and the appearance of superkingdoms. On the other hand, domain gains are predominant in the last two phases and are responsible for organismal diversification, especially in Bacteria and Eukarya. 
Conclusions The evolution of functions that are associated with corresponding FFs along the timeline reveals that primordial metabolic domains evolved earlier than informational domains involved in translation and transcription, supporting the metabolism-first hypothesis rather than the RNA world scenario. In addition, phylogenomic trees of proteomes reconstructed from FFs appearing in each of the five phases of the protein world show that trees reconstructed from ancient domain structures were consistently rooted in archaeal lineages, supporting the proposal that the archaeal ancestor is more ancient than the ancestors of other superkingdoms.
Affiliation(s)
- Kyung Mo Kim
- Evolutionary Bioinformatics Laboratory, Department of Crop Science, University of Illinois, Urbana, IL 61801, USA
|
32
|
Ciganda M, Williams N. Characterization of a novel association between two trypanosome-specific proteins and 5S rRNA. PLoS One 2012; 7:e30029. [PMID: 22253864 PMCID: PMC3257258 DOI: 10.1371/journal.pone.0030029] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 11/07/2011] [Accepted: 12/12/2011] [Indexed: 11/20/2022] Open
Abstract
P34 and P37 are two previously identified RNA binding proteins in the flagellate protozoan Trypanosoma brucei. RNA interference studies have determined that the proteins are essential and are involved in ribosome biogenesis. Here, we show that these proteins interact in vitro with the 5S rRNA with nearly identical binding characteristics in the absence of other cellular factors. The T. brucei 5S rRNA has a complex secondary structure and presents four accessible loops (A to D) for interactions with RNA-binding proteins. In other eukaryotes, loop C is bound by the L5 ribosomal protein and loop A mainly by TFIIIA. The binding of P34 and P37 to T. brucei 5S rRNA involves the loop A region of the RNA, but these proteins also protect the L5 binding site located on loop C.
Affiliation(s)
- Martin Ciganda
- Department of Microbiology and Immunology & Witebsky Center for Microbial Pathogenesis and Immunology, University at Buffalo, Buffalo, New York, United States of America
- Noreen Williams
- Department of Microbiology and Immunology & Witebsky Center for Microbial Pathogenesis and Immunology, University at Buffalo, Buffalo, New York, United States of America
|
33
|
The phylogenomic roots of modern biochemistry: origins of proteins, cofactors and protein biosynthesis. J Mol Evol 2012; 74:1-34. [PMID: 22210458 DOI: 10.1007/s00239-011-9480-1] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.7] [Received: 04/19/2011] [Accepted: 12/12/2011] [Indexed: 12/20/2022]
Abstract
The complexity of modern biochemistry developed gradually on early Earth as new molecules and structures populated the emerging cellular systems. Here, we generate a historical account of the gradual discovery of primordial proteins, cofactors, and molecular functions using phylogenomic information in the sequence of 420 genomes. We focus on structural and functional annotations of the 54 most ancient protein domains. We show how primordial functions are linked to folded structures and how their interaction with cofactors expanded the functional repertoire. We also reveal protocell membranes played a crucial role in early protein evolution and show translation started with RNA and thioester cofactor-mediated aminoacylation. Our findings allow elaboration of an evolutionary model of early biochemistry that is firmly grounded in phylogenomic information and biochemical, biophysical, and structural knowledge. The model describes how primordial α-helical bundles stabilized membranes, how these were decorated by layered arrangements of β-sheets and α-helices, and how these arrangements became globular. Ancient forms of aminoacyl-tRNA synthetase (aaRS) catalytic domains and ancient non-ribosomal protein synthetase (NRPS) modules gave rise to primordial protein synthesis and the ability to generate a code for specificity in their active sites. These structures diversified producing cofactor-binding molecular switches and barrel structures. Accretion of domains and molecules gave rise to modern aaRSs, NRPS, and ribosomal ensembles, first organized around novel emerging cofactors (tRNA and carrier proteins) and then more complex cofactor structures (rRNA). The model explains how the generation of protein structures acted as scaffold for nucleic acids and resulted in crystallization of modern translation.
|
34
|
Kim KM, Caetano-Anollés G. The proteomic complexity and rise of the primordial ancestor of diversified life. BMC Evol Biol 2011; 11:140. [PMID: 21612591 PMCID: PMC3123224 DOI: 10.1186/1471-2148-11-140] [Citation(s) in RCA: 77] [Impact Index Per Article: 5.5] [Received: 02/02/2011] [Accepted: 05/25/2011] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND The last universal common ancestor represents the primordial cellular organism from which diversified life was derived. This urancestor accumulated genetic information before the rise of organismal lineages and is considered to be either a simple 'progenote' organism with a rudimentary translational apparatus or a more complex 'cenancestor' with almost all essential biological processes. Recent comparative genomic studies support the latter model and propose that the urancestor was similar to modern organisms in terms of gene content. However, most of these studies were based on molecular sequences, which are fast evolving and of limited value for deep evolutionary explorations. RESULTS Here we engage in a phylogenomic study of protein domain structure in the proteomes of 420 free-living fully sequenced organisms. Domains were defined at the highly conserved fold superfamily (FSF) level of structural classification and an iterative phylogenomic approach was used to reconstruct max_set and min_set FSF repertoires as upper and lower bounds of the urancestral proteome. While the functional make up of the urancestral sets was complex, they represent only 5-11% of the 1,420 FSFs of extant proteomes and their make up and reuse was at least 5 and 3 times smaller than proteomes of free-living organisms, respectively. Trees of proteomes reconstructed directly from FSFs or from molecular functions, which included the max_set and min_set as artificial taxa, showed that urancestors were always placed at their base and rooted the tree of life in Archaea. Finally, a molecular clock of FSFs suggests the min_set reflects urancestral genetic make up more reliably and confirms diversified life emerged about 2.9 billion years ago during the start of planet oxygenation.
CONCLUSIONS The minimum urancestral FSF set reveals the urancestor had advanced metabolic capabilities, was especially rich in nucleotide metabolism enzymes, had pathways for the biosynthesis of membrane sn1,2 glycerol ester and ether lipids, and had crucial elements of translation, including a primordial ribosome with protein synthesis capabilities. It lacked however fundamental functions, including transcription, processes for extracellular communication, and enzymes for deoxyribonucleotide synthesis. Proteomic history reveals the urancestor is closer to a simple progenote organism but harbors a rather complex set of modern molecular functions.
Affiliation(s)
- Kyung Mo Kim
- Evolutionary Bioinformatics Laboratory, Department of Crop Science, University of Illinois, Urbana, IL 61801, USA
- Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology, 111 Gwahangno, Yuseong-gu, Daejeon 305-806, Korea
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Science, University of Illinois, Urbana, IL 61801, USA
|
35
|
Ciganda M, Williams N. Eukaryotic 5S rRNA biogenesis. Wiley Interdisciplinary Reviews: RNA 2011; 2:523-33. [PMID: 21957041 DOI: 10.1002/wrna.74] [Citation(s) in RCA: 103] [Impact Index Per Article: 7.4] [Indexed: 12/22/2022]
Abstract
The ribosome is a large complex containing both protein and RNA which must be assembled in a precise manner to allow proper functioning in the critical role of protein synthesis. 5S rRNA is the smallest of the RNA components of the ribosome, and although it has been studied for decades, we still do not have a clear understanding of its function within the complex ribosome machine. It is the only RNA species that binds ribosomal proteins prior to its assembly into the ribosome. Its transport into the nucleolus requires this interaction. Here we present an overview of some of the key findings concerning the structure and function of 5S rRNA and how its association with specific proteins impacts its localization and function.
Affiliation(s)
- Martin Ciganda
- Department of Microbiology and Immunology, University at Buffalo, Buffalo, NY, USA
|
36
|
Vizoso M, Vierna J, González-Tizón AM, Martínez-Lage A. The 5S rDNA Gene Family in Mollusks: Characterization of Transcriptional Regulatory Regions, Prediction of Secondary Structures, and Long-Term Evolution, with Special Attention to Mytilidae Mussels. J Hered 2011; 102:433-47. [DOI: 10.1093/jhered/esr046] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Indexed: 12/15/2022] Open
|
37
|
Proteome evolution and the metabolic origins of translation and cellular life. J Mol Evol 2010; 72:14-33. [PMID: 21082171 DOI: 10.1007/s00239-010-9400-9] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Received: 03/03/2010] [Accepted: 10/25/2010] [Indexed: 12/27/2022]
Abstract
The origin of life has puzzled molecular scientists for over half a century. Yet fundamental questions remain unanswered, including which came first, the metabolic machinery or the encoding nucleic acids. In this study we take a protein-centric view and explore the ancestral origins of proteins. Protein domain structures in proteomes are highly conserved and embody molecular functions and interactions that are needed for cellular and organismal processes. Here we use domain structure to study the evolution of molecular function in the protein world. Timelines describing the age and function of protein domains at fold, fold superfamily, and fold family levels of structural complexity were derived from a structural phylogenomic census in hundreds of fully sequenced genomes. These timelines unfold congruent hourglass patterns in rates of appearance of domain structures and functions, functional diversity, and hierarchical complexity, and revealed a gradual build up of protein repertoires associated with metabolism, translation and DNA, in that order. The most ancient domain architectures were hydrolase enzymes and the first translation domains had catalytic functions for the aminoacylation and the molecular switch-driven transport of RNA. Remarkably, the most ancient domains had metabolic roles, did not interact with RNA, and preceded the gradual build-up of translation. In fact, the first translation domains had also a metabolic origin and were only later followed by specialized translation machinery. Our results explain how the generation of structure in the protein world and the concurrent crystallization of translation and diversified cellular life created further opportunities for proteomic diversification.
|
38
|
Wang M, Jiang YY, Kim KM, Qu G, Ji HF, Mittenthal JE, Zhang HY, Caetano-Anollés G. A universal molecular clock of protein folds and its power in tracing the early history of aerobic metabolism and planet oxygenation. Mol Biol Evol 2010; 28:567-82. [PMID: 20805191 DOI: 10.1093/molbev/msq232] [Citation(s) in RCA: 97] [Impact Index Per Article: 6.5] [Indexed: 01/02/2023] Open
Abstract
The standard molecular clock describes a constant rate of molecular evolution and provides a powerful framework for evolutionary timescales. Here, we describe the existence and implications of a molecular clock of folds, a universal recurrence in the discovery of new structures in the world of proteins. Using a phylogenomic structural census in hundreds of proteomes, we build phylogenies and time lines of domains at fold and fold superfamily levels of structural complexity. These time lines correlate approximately linearly with geological timescales and were here used to date two crucial events in life history, planet oxygenation and organism diversification. We first dissected the structures and functions of enzymes in simulated metabolic networks. The placement of anaerobic and aerobic enzymes in the time line revealed that aerobic metabolism emerged about 2.9 billion years (giga-annum; Ga) ago and expanded during a period of about 400 My, reaching what is known as the Great Oxidation Event. During this period, enzymes recruited old and new folds for oxygen-mediated enzymatic activities. Remarkably, the first fold lost by a superkingdom disappeared in Archaea 2.6 Ga ago, within the span of oxygen rise, suggesting that oxygen also triggered diversification of life. The implications of a molecular clock of folds are many and important for the neutral theory of molecular evolution and for understanding the growth and diversity of the protein world. The clock also extends the standard concept that was specific to molecules and their timescales and turns it into a universal timescale-generating tool.
Affiliation(s)
- Minglei Wang
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana-Champaign, USA
|
39
|
A Model of the Origin of the 5S Ribosomal RNA Molecule. J Mol Evol 2010; 71:1-2. [DOI: 10.1007/s00239-010-9358-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 11/19/2009] [Accepted: 06/02/2010] [Indexed: 10/19/2022]
|
40
|
The Origin of Modern 5S rRNA: A Case of Relating Models of Structural History to Phylogenetic Data. J Mol Evol 2010; 71:3-5. [DOI: 10.1007/s00239-010-9359-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 12/21/2009] [Accepted: 06/02/2010] [Indexed: 11/26/2022]
|
41
|
Kim KM, Caetano-Anollés G. Emergence and evolution of modern molecular functions inferred from phylogenomic analysis of ontological data. Mol Biol Evol 2010; 27:1710-33. [PMID: 20418223 DOI: 10.1093/molbev/msq106] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.3] [Indexed: 11/13/2022] Open
Abstract
The biological processes that characterize the phenotypes of a living system are embodied in the function of molecules and hold the key to evolutionary history, delimiting natural selection and change. These processes and functions provide direct insight into the emergence, development, and organization of cellular life. However, detailed molecular functions make up a network-like hierarchy of relationships that tells little of evolutionary links between structure and function in biology. For example, Gene Ontology terms represent widely-used vocabularies of processes and functions with evolutionary relationships that are implicit but not defined. Here, we uncover patterns of global evolutionary history in ontological terms associated with the sequence of 38 genomes. These patterns unfold the metabolic origins of modern molecular functions and major biological transitions in evolution toward complex life. Phylogenies reveal the primordial appearance of hydrolases and transferases, with ATPase, GTPase, and helicase activities being the most ancient. This indicates that ancient catalysts were crucial for binding and transport, the emergence of nucleic acids and protein biopolymers, and the communication of primordial cells with the environment. Finally, the history of biological processes showed that cellular biopolymer metabolic processes preceded biopolymer biosynthesis and essential processes related to macromolecular formation, directly challenging the existence of an RNA world. Phylogenomic systematization of biological function takes the structure and function paradigm to a completely new level of abstraction, demonstrating a "metabolic first" origin of life. The approach uncovers patterns in the morphing of function that are unprecedented and necessary for systematic views in biology.
Affiliation(s)
- Kyung Mo Kim
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, IL, USA
|
42
|
Sun FJ, Caetano-Anollés G. The ancient history of the structure of ribonuclease P and the early origins of Archaea. BMC Bioinformatics 2010; 11:153. [PMID: 20334683 PMCID: PMC2858038 DOI: 10.1186/1471-2105-11-153] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.7] [Received: 02/26/2010] [Accepted: 03/24/2010] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND Ribonuclease P is an ancient endonuclease that cleaves precursor tRNA and generally consists of a catalytic RNA subunit (RPR) and one or more proteins (RPPs). It represents an important macromolecular complex and model system that is universally distributed in life. Its putative origins have inspired fundamental hypotheses, including the proposal of an ancient RNA world. RESULTS To study the evolution of this complex, we constructed rooted phylogenetic trees of RPR molecules and substructures and estimated RPP age using a cladistic method that embeds structure directly into phylogenetic analysis. The general approach was used previously to study the evolution of tRNA, SINE RNA and 5S rRNA, the origins of metabolism, and the evolution and complexity of the protein world, and revealed here remarkable evolutionary patterns. Trees of molecules uncovered the tripartite nature of life and the early origin of archaeal RPRs. Trees of substructures showed molecules originated in stem P12 and were accessorized with a catalytic P1-P4 core structure before the first substructure was lost in Archaea. This core currently interacts with RPPs and ancient segments of the tRNA molecule. Finally, a census of protein domain structure in hundreds of genomes established RPPs appeared after the rise of metabolic enzymes at the onset of the protein world. CONCLUSIONS The study provides a detailed account of the history and early diversification of a fundamental ribonucleoprotein and offers further evidence in support of the existence of a tripartite organismal world that originated by the segregation of archaeal lineages from an ancient community of primordial organisms.
Affiliation(s)
- Feng-Jie Sun
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Laboratory of Molecular Epigenetics of the Ministry of Education, School of Life Sciences, Northeast Normal University, Changchun 130024, Jilin Province, PR China
- W.M. Keck Center for Comparative and Functional Genomics, Roy J. Carver Biotechnology Center, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
|