1
Caetano-Anollés K, Aziz MF, Mughal F, Caetano-Anollés G. On Protein Loops, Prior Molecular States and Common Ancestors of Life. J Mol Evol 2024. [PMID: 38652291 DOI: 10.1007/s00239-024-10167-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Accepted: 03/22/2024] [Indexed: 04/25/2024]
Abstract
The principle of continuity demands the existence of prior molecular states and common ancestors responsible for extant macromolecular structure. Here, we focus on the emergence and evolution of loop prototypes - the elemental architects of protein domain structure. Phylogenomic reconstruction spanning superkingdoms and viruses generated an evolutionary chronology of prototypes with six distinct evolutionary phases defining a most parsimonious evolutionary progression of cellular life. Each phase was marked by strategic prototype accumulation shaping the structures and functions of common ancestors. The last universal common ancestor (LUCA) of cells and viruses and the last universal cellular ancestor (LUCellA) defined stem lines that were structurally and functionally complex. The evolutionary saga highlighted transformative forces. LUCA lacked biosynthetic ribosomal machinery, while the pivotal LUCellA lacked essential DNA biosynthesis and modern transcription. Early proteins therefore relied on RNA for genetic information storage but appeared initially decoupled from it, hinting at transformative shifts of genetic processing. Urancestral loop types suggest advanced folding designs were present at an early evolutionary stage. An exploration of loop geometric properties revealed gradual replacement of prototypes with α-helix and β-strand bracing structures over time, paving the way for the dominance of other loop types. AlphaFold2-generated atomic models of prototype accretion described patterns of fold emergence. Our findings favor a 'processual' model of evolving stem lines aligned with Woese's vision of a communal world. This model prompts discussing the 'problem of ancestors' and the challenges that lie ahead for research in taxonomy, evolution and complexity.
Affiliation(s)
- Kelsey Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Callout Biotech, Albuquerque, NM, 87112, USA
- M Fayez Aziz
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Fizza Mughal
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA.
2
Mughal F, Caetano-Anollés G. Evolution of Intrinsic Disorder in Protein Loops. Life (Basel) 2023; 13:2055. [PMID: 37895436 PMCID: PMC10608553 DOI: 10.3390/life13102055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Revised: 10/08/2023] [Accepted: 10/10/2023] [Indexed: 10/29/2023] Open
Abstract
Intrinsic disorder accounts for the flexibility of protein loops, molecular building blocks that are largely responsible for the processes and molecular functions of the living world. While loops likely represent early structural forms that served as intermediates in the emergence of protein structural domains, their origin and evolution remain poorly understood. Here, we conduct a phylogenomic survey of disorder in loop prototypes sourced from the ArchDB classification. Tracing prototypes associated with protein fold families along an evolutionary chronology revealed that ancient prototypes tended to be more disordered than their derived counterparts, with ordered prototypes developing later in evolution. This highlights the central evolutionary role of disorder and flexibility. While mean disorder increased with time, a minority of ordered prototypes exist that emerged early in evolutionary history, possibly driven by the need to preserve specific molecular functions. We also revealed the percolation of evolutionary constraints from higher to lower levels of organization. Percolation resulted in trade-offs between flexibility and rigidity that impacted prototype structure and geometry. Our findings provide a deep evolutionary view of the link between structure, disorder, flexibility, and function, as well as insights into the evolutionary role of intrinsic disorder in loops and their contribution to protein structure and function.
Affiliation(s)
- Fizza Mughal
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, IL 61801, USA
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, IL 61801, USA
- C.R. Woese Institute for Genomic Biology, University of Illinois, Urbana, IL 61801, USA
3
Aziz MF, Mughal F, Caetano-Anollés G. Tracing the birth of structural domains from loops during protein evolution. Sci Rep 2023; 13:14688. [PMID: 37673948 PMCID: PMC10482863 DOI: 10.1038/s41598-023-41556-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/25/2022] [Accepted: 08/28/2023] [Indexed: 09/08/2023] Open
Abstract
The structures and functions of proteins are embedded into the loop scaffolds of structural domains. Their origin and evolution remain mysterious. Here, we use a novel graph-theoretical approach to describe how modular and non-modular loop prototypes combine to form folded structures in protein domain evolution. Phylogenomic data-driven chronologies reoriented a bipartite network of loops and domains (and its projections) into 'waterfalls' depicting an evolving 'elementary functionome' (EF). Two primordial waves of functional innovation involving founder 'P-loop' and 'winged-helix' domains were accompanied by an ongoing emergence and reuse of structural and functional novelty. Metabolic pathways expanded before translation functionalities. A dual hourglass recruitment pattern transferred scale-free properties from loop to domain components of the EF network in generative cycles of hierarchical modularity. Modeling the evolutionary emergence of the oldest P-loop and winged-helix domains with AlphaFold2 uncovered rapid convergence towards folded structure, suggesting that a folding vocabulary exists in loops for protein fold repurposing and design.
Affiliation(s)
- M Fayez Aziz
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA
- Fizza Mughal
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA.
- C.R. Woese Institute for Genomic Biology, University of Illinois, Urbana, IL, 61801, USA.
4
Caetano-Anollés G, Claverie JM, Nasir A. A critical analysis of the current state of virus taxonomy. Front Microbiol 2023; 14:1240993. [PMID: 37601376 PMCID: PMC10435761 DOI: 10.3389/fmicb.2023.1240993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 07/20/2023] [Indexed: 08/22/2023] Open
Abstract
Taxonomical classification has preceded evolutionary understanding. For that reason, taxonomy has become a battleground fueled by knowledge gaps, technical limitations, and a priorism. Here we assess the current state of the challenging field, focusing on fallacies that are common in viral classification. We emphasize that viruses are crucial contributors to the genomic and functional makeup of holobionts, organismal communities that behave as units of biological organization. Consequently, viruses cannot be considered taxonomic units because they challenge crucial concepts of organismality and individuality. Instead, they should be considered processes that integrate virions and their hosts into life cycles. Viruses harbor phylogenetic signatures of genetic transfer that compromise monophyly and the validity of deep taxonomic ranks. A focus on building phylogenetic networks using alignment-free methodologies and molecular structure can help mitigate the impasse, at least in part. Finally, structural phylogenomic analysis challenges the polyphyletic scenario of multiple viral origins adopted by virus taxonomy, defeating a polyphyletic origin and supporting instead an ancient cellular origin of viruses. We therefore prompt abandoning deep ranks and urgently reevaluating the validity of taxonomic units and principles of virus classification.
Affiliation(s)
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences and C.R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, United States
- Jean-Michel Claverie
- Structural and Genomic Information Laboratory (UMR7256), Mediterranean Institute of Microbiology (FR3479), IM2B, IOM, Aix Marseille University, CNRS, Marseille, France
5
Dyson HJ. Vital for Viruses: Intrinsically Disordered Proteins. J Mol Biol 2023; 435:167860. [PMID: 37330280 PMCID: PMC10656058 DOI: 10.1016/j.jmb.2022.167860] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 09/14/2022] [Revised: 10/11/2022] [Accepted: 10/12/2022] [Indexed: 06/19/2023]
Abstract
Viruses infect all kingdoms of life; their genomes vary from DNA to RNA and in size from 2 kb to 1 Mb or more. Viruses frequently employ disordered proteins, that is, protein products of virus genes that do not themselves fold into independent three-dimensional structures, but rather constitute a versatile molecular toolkit to accomplish a range of functions necessary for viral infection, assembly, and proliferation. Interestingly, disordered proteins have been discovered in almost all viruses so far studied, whether the viral genome consists of DNA or RNA, and whatever the configuration of the viral capsid or other outer covering. In this review, I present a wide-ranging set of stories illustrating the range of functions of intrinsically disordered proteins (IDPs) in viruses. The field is rapidly expanding, and I have not tried to include everything. What is included is meant to be a survey of the variety of tasks that viruses accomplish using disordered proteins.
Affiliation(s)
- H Jane Dyson
- Department of Integrative Structural and Computational Biology and Skaggs Institute of Chemical Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, USA.
6
On thresholds: signs, symbols and significance. Journal of Documentation 2023. [DOI: 10.1108/jd-08-2022-0168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/13/2023]
Abstract
Purpose: This paper reviews research developments in semiosis (sign activity) as theorized by Peirce, Eco and Sebeok, focusing specifically on the current study of “semiotic threshold zones,” which range from the origins of life through various nonhuman life forms to artificial life forms, including those symbolic thresholds most familiar to library and information science (LIS) researchers. The intent is to illustrate potential opportunities for LIS research beyond its present boundaries. Design/methodology/approach: The paper provides a framework that describes six semiotic threshold zones (presemiotic, protosemiotic, phytosemiotic, zoosemiotic, symbolic and polysemiotic) and notable work being done by researchers in each. Findings: While semiotic researchers are still defining the continuum of semiotic thresholds, this focus on thresholds can provide a unifying framework for significance as human and nonhuman interpretations of a wide variety of signs accompanied by a better understanding of their relationships becomes more urgent in a rapidly changing global environment. Originality/value: Though a variety of semiotic-related topics have appeared in the LIS literature, semiotic thresholds and their potential relationships to LIS research have not been previously discussed there. LIS has traditionally tasked itself with the recording, dissemination and preservation of knowledge, and in a world that faces unprecedented environmental and global challenges for all species, the importance of these thresholds may well be considered as part of our professional obligations in potentially documenting and archiving the critical differences in semiosis that extend beyond purely human knowledge.
7
Khalifeh D, Neveu E, Fasshauer D. Megaviruses contain various genes encoding for eukaryotic vesicle trafficking factors. Traffic 2022; 23:414-425. [PMID: 35701729 PMCID: PMC9546365 DOI: 10.1111/tra.12860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2022] [Revised: 05/02/2022] [Accepted: 06/02/2022] [Indexed: 11/30/2022]
Abstract
Many intracellular pathogens, such as bacteria and large viruses, enter eukaryotic cells via phagocytosis, then replicate and proliferate inside the host. To avoid degradation in the phagosomes, they have developed strategies to modify vesicle trafficking. Although several strategies of bacteria have been characterized, it is not clear whether viruses also interfere with the vesicle trafficking of the host. Recently, we came across SNARE proteins encoded in the genomes of several bacteria of the order Legionellales. These pathogenic bacteria may use SNAREs to interfere with vesicle trafficking, since SNARE proteins are the core machinery for vesicle fusion during transport. They assemble into membrane-bridging SNARE complexes that bring membranes together. We have now also discovered SNARE proteins in the genomes of diverse giant viruses. Our biochemical experiments showed that these proteins are able to form SNARE complexes. We also found other key trafficking factors that work together with SNAREs, such as NSF, SM, and Rab proteins, encoded in the genomes of giant viruses, suggesting that viruses can make use of a large genetic repertoire of trafficking factors. Most giant viruses possess different collections, suggesting that these factors entered the viral genome multiple times. In the future, the molecular role of these factors during viral infection needs to be studied.
Affiliation(s)
- Dany Khalifeh
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Emilie Neveu
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Dirk Fasshauer
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
8
Flores R, Navarro B, Serra P, Di Serio F. A Scenario for the Emergence of Protoviroids in the RNA World and for Their Further Evolution into Viroids and Viroid-Like RNAs by Modular Recombinations and Mutations. Virus Evol 2022; 8:veab107. [PMID: 35223083 PMCID: PMC8865084 DOI: 10.1093/ve/veab107] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/01/2021] [Revised: 12/10/2021] [Accepted: 01/14/2022] [Indexed: 11/14/2022] Open
Abstract
Viroids are tiny, circular and non-coding RNAs that are able to replicate and systemically infect plants. The smallest known pathogens, they have been proposed to represent survivors from the RNA world that likely preceded the cellular world currently dominating life on the earth. Although the small, circular and compact nature of viroid genomes, some of which are also endowed with catalytic activity mediated by hammerhead ribozymes, supports this proposal, the lack of feasible evolutionary routes and the identification of hammerhead ribozymes in a large number of DNA genomes of organisms along the tree of life have led some to question it. Here, we reassess the origin and subsequent evolution of viroids by complementing phylogenetic reconstructions with molecular data, including the primary and higher-order structure of the genomic RNAs, their replication and recombination mechanisms and selected biological information. Features of some viroid-like RNAs found in plants, animals, and possibly fungi are also considered. The resulting evolutionary scenario supports the emergence of protoviroids in the RNA world, mainly as replicative modules, followed by further increase in genome complexity based on module/domain shuffling and combination, and mutation. Such a modular evolutionary scenario would have facilitated the inclusion in the protoviroid genomes of complex RNA structures (or coding sequences, as in the case of hepatitis δ virus and delta-like agents), likely needed for their adaptation from the RNA world to a life based on cells, thus generating the ancestors of current infectious viroids and viroid-like RNAs. Other non-infectious viroid-like RNAs, such as retroviroid-like RNA elements and retrozymes, could also be derived from protoviroids if their reverse transcription and integration into viral or eukaryotic DNA, respectively, are considered as a possible key step in their evolution.
Comparison of evidence supporting a general and modular evolutionary model for viroids and viroid-like RNAs with that favoring alternative scenarios provides sound reasons to keep alive the hypothesis that these small RNA pathogens may be relics of a precellular world.
Affiliation(s)
- Beatriz Navarro
- Istituto per la Protezione Sostenibile delle Piante, Consiglio Nazionale delle Ricerche, Via Amendola 122/D, Bari 70126, Italy
- Pedro Serra
- Instituto de Biología Molecular y Celular de Plantas, Consejo Superior de Investigaciones Científicas–Universidad Politécnica de Valencia, Ingeniero Fausto Elio s/n, Valencia 46022, Spain
9
Battaglia R, Alonzo R, Pennisi C, Caponnetto A, Ferrara C, Stella M, Barbagallo C, Barbagallo D, Ragusa M, Purrello M, Di Pietro C. MicroRNA-Mediated Regulation of the Virus Cycle and Pathogenesis in the SARS-CoV-2 Disease. Int J Mol Sci 2021; 22:13192. [PMID: 34947989 PMCID: PMC8715670 DOI: 10.3390/ijms222413192] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/13/2021] [Revised: 12/03/2021] [Accepted: 12/04/2021] [Indexed: 12/24/2022] Open
Abstract
In the last few years, microRNA-mediated regulation has been shown to be important in viral infections. In fact, viral microRNAs can alter cell physiology and act on the immune system; moreover, cellular microRNAs can regulate the virus cycle, influencing positively or negatively viral replication. Accordingly, microRNAs can represent diagnostic and prognostic biomarkers of infectious processes and a promising approach for designing targeted therapies. In the past 18 months, the COVID-19 infection from SARS-CoV-2 has engaged many researchers in the search for diagnostic and prognostic markers and the development of therapies. Although some research suggests that the SARS-CoV-2 genome can produce microRNAs and that host microRNAs may be involved in the cellular response to the virus, to date, not enough evidence has been provided. In this paper, using a focused bioinformatic approach exploring the SARS-CoV-2 genome, we propose that SARS-CoV-2 is able to produce microRNAs sharing a strong sequence homology with the human ones and also that human microRNAs may target viral RNA regulating the virus life cycle inside human cells. Interestingly, all viral miRNA sequences and some human miRNA target sites are conserved in more recent SARS-CoV-2 variants of concern (VOCs). Even if experimental evidence will be needed, in silico analysis represents a valuable source of information useful to understand the sophisticated molecular mechanisms of disease and to sustain biomedical applications.
10
Caetano-Anollés G, Aziz MF, Mughal F, Caetano-Anollés D. Tracing protein and proteome history with chronologies and networks: folding recapitulates evolution. Expert Rev Proteomics 2021; 18:863-880. [PMID: 34628994 DOI: 10.1080/14789450.2021.1992277] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/24/2023]
Abstract
Introduction: While the origin and evolution of proteins remain mysterious, advances in evolutionary genomics and systems biology are facilitating the historical exploration of the structure, function and organization of proteins and proteomes. Molecular chronologies are series of time events describing the history of biological systems and subsystems and the rise of biological innovations. Together with time-varying networks, these chronologies provide a window into the past. Areas covered: Here, we review molecular chronologies and networks built with modern methods of phylogeny reconstruction. We discuss how chronologies of structural domain families uncover the explosive emergence of metabolism, the late rise of translation, the co-evolution of ribosomal proteins and rRNA, and the late development of the ribosomal exit tunnel; events that coincided with a tendency to shorten folding time. Evolving networks described the early emergence of domains and a late 'big bang' of domain combinations. Expert opinion: Two processes, folding and recruitment, appear central to the evolutionary progression. The former increases protein persistence. The latter fosters diversity. Chronologically, protein evolution mirrors folding by combining supersecondary structures into domains, developing translation machinery to facilitate folding speed and stability, and enhancing structural complexity by establishing long-distance interactions in novel structural and architectural designs.
Affiliation(s)
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, Illinois, USA; C. R. Woese Institute for Genomic Biology, University of Illinois, Urbana, Illinois, USA
- M Fayez Aziz
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, Illinois, USA
- Fizza Mughal
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, Illinois, USA
- Derek Caetano-Anollés
- Data Science Platform, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
11
Nahalka J. Theoretical Analysis of S, M and N Structural Proteins by the Protein-RNA Recognition Code Leads to Genes/proteins that Are Relevant to the SARS-CoV-2 Life Cycle and Pathogenesis. Front Genet 2021; 12:763995. [PMID: 34659373 PMCID: PMC8511677 DOI: 10.3389/fgene.2021.763995] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/24/2021] [Accepted: 09/15/2021] [Indexed: 12/14/2022] Open
Abstract
In this conceptual review, based on the protein-RNA recognition code, some theoretical sequences were detected in the spike (S), membrane (M) and capsid (N) proteins that may post-transcriptionally regulate the host genes/proteins in immune homeostasis, pulmonary epithelial tissue homeostasis, and lipid homeostasis. According to the review of literature, the spectrum of identified genes/proteins shows that the virus promotes IL1α/β-IL1R1 signaling (type 1 immunity) and immunity defense against helminths and venoms (type 2 immunity). In the alteration of homeostasis in the pulmonary epithelial tissue, the virus blocks the function of cilia and the molecular programs that are involved in wound healing (EMT and MET). Additionally, the protein-RNA recognition method described here identifies compatible sequences in the S1A-domain for the post-transcriptional promotion of PIKFYVE, which is one of the critical factors for SARS-CoV-2 entry to the host cell, and for the post-transcriptional repression of xylulokinase XYLB. A decrease in XYLB product (Xu5P) in plasma was proposed as one of the potential metabolomics biomarkers of COVID-19. In summary, the protein-RNA recognition code leads to protein genes relevant to the SARS-CoV-2 life cycle and pathogenesis.
Affiliation(s)
- Jozef Nahalka
- Institute of Chemistry, Centre for Glycomics, Slovak Academy of Sciences, Bratislava, Slovakia
- Institute of Chemistry, Centre of Excellence for White-green Biotechnology, Slovak Academy of Sciences, Nitra, Slovakia
12
Caetano-Anollés G. The Compressed Vocabulary of Microbial Life. Front Microbiol 2021; 12:655990. [PMID: 34305827 PMCID: PMC8292947 DOI: 10.3389/fmicb.2021.655990] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/19/2021] [Accepted: 04/27/2021] [Indexed: 12/22/2022] Open
Abstract
Communication is an undisputed central activity of life that requires an evolving molecular language. It conveys meaning through messages and vocabularies. Here, I explore the existence of a growing vocabulary in the molecules and molecular functions of the microbial world. There are clear correspondences between the lexicon, syntax, semantics, and pragmatics of language organization and the module, structure, function, and fitness paradigms of molecular biology. These correspondences are constrained by universal laws and engineering principles. Macromolecular structure, for example, follows quantitative linguistic patterns arising from statistical laws that are likely universal, including Zipf's law, a special case of the scale-free distribution, Heaps' law describing sublinear growth typical of economies of scale, and the Menzerath-Altmann law, which imposes size-dependent patterns of decreasing returns. Trade-off solutions between principles of economy, flexibility, and robustness define a "triangle of persistence" describing the impact of the environment on a biological system. The pragmatic landscape of the triangle interfaces with the syntax and semantics of molecular languages, which together with comparative and evolutionary genomic data can explain global patterns of diversification of cellular life. The vocabularies of proteins (proteomes) and functions (functionomes) revealed a significant universal lexical core supporting a universal common ancestor, an ancestral evolutionary link between Bacteria and Eukarya, and distinct reductive evolutionary strategies of language compression in Archaea and Bacteria. A "causal" word cloud strategy inspired by the dependency grammar paradigm used in catenae unfolded the evolution of lexical units associated with Gene Ontology terms at different levels of ontological abstraction.
While Archaea holds the smallest, oldest, and most homogeneous vocabulary of all superkingdoms, Bacteria heterogeneously apportions a more complex vocabulary, and Eukarya pushes functional innovation through mechanisms of flexibility and robustness.
Affiliation(s)
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, and C. R. Woese Institute for Genomic Biology, University of Illinois, Urbana, IL, United States
13
The Unique, the Known, and the Unknown of Spumaretrovirus Assembly. Viruses 2021; 13:105. [PMID: 33451128 PMCID: PMC7828637 DOI: 10.3390/v13010105] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/23/2020] [Revised: 01/08/2021] [Accepted: 01/10/2021] [Indexed: 12/22/2022] Open
Abstract
Within the family of Retroviridae, foamy viruses (FVs) are unique and unconventional with respect to many aspects in their molecular biology, including assembly and release of enveloped viral particles. Both components of the minimal assembly and release machinery, Gag and Env, display significant differences in their molecular structures and functions compared to the other retroviruses. This led to the placement of FVs into a separate subfamily, the Spumaretrovirinae. Here, we describe the molecular differences in FV Gag and Env, as well as Pol, which is translated as a separate protein and not in an orthoretroviral manner as a Gag-Pol fusion protein. This feature further complicates FV assembly since a specialized Pol encapsidation strategy via a tripartite Gag-genome–Pol complex is used. We try to relate the different features and specific interaction patterns of the FV Gag, Pol, and Env proteins in order to develop a comprehensive and dynamic picture of particle assembly and release, but also other features that are indirectly affected. Since FVs are at the root of the retrovirus tree, we aim at dissecting the unique/specialized features from those shared among the Spuma- and Orthoretrovirinae. Such analyses may shed light on the evolution and characteristics of virus envelopment since related viruses within the Ortervirales, for instance LTR retrotransposons, are characterized by different levels of envelopment, thus affecting the capacity for intercellular transmission.
14
Nasir A, Romero-Severson E, Claverie JM. Investigating the Concept and Origin of Viruses. Trends Microbiol 2020; 28:959-967. [PMID: 33158732 PMCID: PMC7609044 DOI: 10.1016/j.tim.2020.08.003] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Received: 06/29/2020] [Revised: 08/25/2020] [Accepted: 08/27/2020] [Indexed: 12/21/2022]
Abstract
The ongoing COVID-19 pandemic has piqued public interest in the properties, evolution, and emergence of viruses. Here, we discuss how these basic questions have surprisingly remained disputed despite being increasingly within the reach of scientific analysis. We review recent data-driven efforts that shed light on the origin and evolution of viruses and explain factors that resist the widespread acceptance of new views and insights. We propose a new definition of viruses that is not restricted to the presence or absence of any genetic or physical feature, detail a scenario for how viruses likely originated from ancient cells, and explain technical and conceptual biases that limit our understanding of virus evolution. We note that the philosophical aspects of virus evolution also impact the way we might prepare for future outbreaks.
Affiliation(s)
- Arshan Nasir
- Theoretical Biology and Biophysics (T-6), Los Alamos National Laboratory, Los Alamos, NM, USA.
- Ethan Romero-Severson
- Theoretical Biology and Biophysics (T-6), Los Alamos National Laboratory, Los Alamos, NM, USA
- Jean-Michel Claverie
- Aix Marseille University, CNRS, IGS, Structural and Genomic Information Laboratory (UMR7256), Mediterranean Institute of Microbiology (FR3479), Marseille, France
15
Claverie JM. Fundamental Difficulties Prevent the Reconstruction of the Deep Phylogeny of Viruses. Viruses 2020; 12:1130. [PMID: 33036160 PMCID: PMC7600955 DOI: 10.3390/v12101130] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/17/2020] [Revised: 10/01/2020] [Accepted: 10/03/2020] [Indexed: 12/11/2022] Open
Abstract
The extension of virology beyond its traditional medical, veterinary, or agricultural applications, now called environmental virology, has shown that viruses are both the most numerous and diverse biological entities on Earth. In particular, virus isolations from unicellular eukaryotic hosts (heterotrophic and photosynthetic protozoans) revealed numerous viral types previously unexpected in terms of virion structure, gene content, or mode of replication. Complemented by large-scale metagenomic analyses, these discoveries have rekindled interest in the enigma of the origin of viruses, for which a description encompassing all their diversity remains unavailable. Several laboratories have repeatedly tackled the deep reconstruction of the evolutionary history of viruses, using various methods of molecular phylogeny applied to the few shared "core" genes detected in certain virus groups (e.g., the Nucleocytoviricota). Beyond the practical difficulties of establishing reliable homology relationships from extremely divergent sequences, I present here conceptual arguments highlighting several fundamental limitations plaguing the reconstruction of the deep evolutionary history of viruses, and even more so the identification of their unique or multiple origin(s). These arguments also underline the risk of establishing premature high-level viral taxonomic classifications. Those limitations are direct consequences of the random mechanisms governing the reductive/retrogressive evolution of all obligate intracellular parasites.
Affiliation(s)
- Jean-Michel Claverie
- Structural & Genomic Information Laboratory (IGS, UMR 7256), Mediterranean Institute of Microbiology (FR3479), Aix-Marseille University and CNRS, 13288 Marseille, France