1. Calaça Serrão A, Dänekamp FT, Meggyesi Z, Braun D. Replication elongates short DNA, reduces sequence bias and develops trimer structure. Nucleic Acids Res 2024; 52:1290-1297. [PMID: 38096089 PMCID: PMC10853772 DOI: 10.1093/nar/gkad1190] [Received: 07/13/2023] [Revised: 11/15/2023] [Accepted: 11/30/2023] [Indexed: 02/10/2024] [Open access]
Abstract
The origin of molecular evolution required the replication of short oligonucleotides to form longer polymers. Prebiotically plausible oligonucleotide pools tend to contain more of some nucleobases than others. It has been unclear whether this initial bias persists and how it affects replication. To investigate this, we examined the evolution of biased pools of short (12-mer) DNA using an enzymatic model system. This allowed us to study the long timescales involved in evolution, which is not yet possible with currently investigated prebiotic replication chemistries. Our analysis using next-generation sequencing from different time points revealed that the initial nucleotide bias of the pool disappeared in the elongated pool after isothermal replication. In contrast, the nucleotide composition at each position in the elongated sequences remained biased and varied with both position and initial bias. Furthermore, we observed the emergence of highly periodic dimer and trimer motifs in the rapidly elongated sequences. This shift in nucleotide composition and the emergence of structure through templated replication could help explain how biased prebiotic pools could undergo molecular evolution and lead to complex functional nucleic acids.
Affiliation(s)
- Adriana Calaça Serrão
  - Systems Biophysics, Physics Department, Center for NanoScience, Ludwig-Maximilians-Universität München, Amalienstraße 54, 80799 Munich, Germany
- Felix T Dänekamp
  - Systems Biophysics, Physics Department, Center for NanoScience, Ludwig-Maximilians-Universität München, Amalienstraße 54, 80799 Munich, Germany
- Zsófia Meggyesi
  - Systems Biophysics, Physics Department, Center for NanoScience, Ludwig-Maximilians-Universität München, Amalienstraße 54, 80799 Munich, Germany
- Dieter Braun
  - Systems Biophysics, Physics Department, Center for NanoScience, Ludwig-Maximilians-Universität München, Amalienstraße 54, 80799 Munich, Germany
2. Random and Natural Non-Coding RNA Have Similar Structural Motif Patterns but Differ in Bulge, Loop, and Bond Counts. Life (Basel) 2023; 13:life13030708. [PMID: 36983865 PMCID: PMC10054693 DOI: 10.3390/life13030708] [Received: 12/16/2022] [Revised: 02/15/2023] [Accepted: 02/27/2023] [Indexed: 03/08/2023] [Open access]
Abstract
An important question in evolutionary biology is whether (and in what ways) genotype–phenotype (GP) map biases can influence evolutionary trajectories. Untangling the relative roles of natural selection and biases (and other factors) in shaping phenotypes can be difficult. Because RNA secondary structure (SS) can be analyzed in detail mathematically and computationally, is biologically relevant, and is supported by a wealth of bioinformatic data, it offers a good model system for studying the role of bias. For quite short RNA (length L≤126), it has recently been shown that natural and random RNA types are structurally very similar, suggesting that bias strongly constrains evolutionary dynamics. Here, we extend these results with emphasis on much larger RNA with lengths up to 3000 nucleotides. By examining both abstract shapes and structural motif frequencies (i.e., the number of helices, bonds, bulges, junctions, and loops), we find that large natural and random structures are also very similar, especially when contrasted to typical structures sampled from the spaces of all possible RNA structures. Our motif frequency study yields another result: the frequencies of different motifs can be used in machine learning algorithms to classify random and natural RNA with high accuracy, especially for longer RNA (e.g., ROC AUC 0.86 for L = 1000). The most important motifs for classification are the numbers of bulges, loops, and bonds. This finding may be useful in using SS to detect candidates for functional RNA within ‘junk’ DNA regions.
3. Manrubia S. The simple emergence of complex molecular function. Philosophical Transactions. Series A, Mathematical, Physical, and Engineering Sciences 2022; 380:20200422. [PMID: 35599566 DOI: 10.1098/rsta.2020.0422] [Indexed: 05/07/2023]
Abstract
At odds with a traditional view of molecular evolution that seeks a descent-with-modification relationship between functional sequences, new functions can emerge de novo with relative ease. At early times of molecular evolution, random polymers could have sufficed for the appearance of incipient chemical activity, while the cellular environment harbours a myriad of proto-functional molecules. The emergence of function is facilitated by several mechanisms intrinsic to molecular organization, such as redundant mapping of sequences into structures, phenotypic plasticity, modularity or cooperative associations between genomic sequences. It is the availability of niches in the molecular ecology that filters new potentially functional proposals. New phenotypes and subsequent levels of molecular complexity could be attained through combinatorial explorations of currently available molecular variants. Natural selection does the rest. This article is part of the theme issue 'Emergent phenomena in complex physical and socio-technical systems: from cells to societies'.
Affiliation(s)
- Susanna Manrubia
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain
  - Systems Biology Department, National Biotechnology Centre (CSIC), c/Darwin 3, 28049 Madrid, Spain
4. Martin NS, Ahnert SE. Fast free-energy-based neutral set size estimates for the RNA genotype-phenotype map. J R Soc Interface 2022; 19:20220072. [PMID: 35702868 PMCID: PMC9198509 DOI: 10.1098/rsif.2022.0072] [Received: 01/24/2022] [Accepted: 05/23/2022] [Indexed: 12/30/2022] [Open access]
Abstract
The genotype-phenotype (GP) map of RNA secondary structure links each RNA sequence to its corresponding secondary structure. Previous research has shown that the large-scale structural properties of GP maps, such as the size of neutral sets in genotype space, can influence evolutionary outcomes. To make use of neutral set sizes, efficient and accurate computational methods are needed to compute them. Here, we propose a new method, which is based on free energy estimates and is much faster than existing sample-based methods. Moreover, this approach can give insight into the reasons behind neutral set size variations, for example, why structures with fewer stacks tend to have larger neutral set sizes. In addition, we generalize neutral set size calculations from the previously studied many-to-one framework, where each sequence folds into a single energetically preferred structure, to a fuller many-to-many framework, where several low-energy structures are included. We find that structures with large neutral sets in one framework also tend to have large neutral sets in the other framework for a range of parameters, and thus the choice of GP map does not fundamentally affect which structures have the largest neutral set sizes.
Affiliation(s)
- Nora S. Martin
  - Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge CB3 0HE, UK
  - Sainsbury Laboratory, University of Cambridge, Bateman Street, Cambridge CB2 1LR, UK
  - Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Parks Road, Oxford OX1 3PU, UK
- Sebastian E. Ahnert
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, UK
  - The Alan Turing Institute, British Library, Euston Road, London NW1 2DB, UK
5. Dingle K, Ghaddar F, Šulc P, Louis AA. Phenotype Bias Determines How Natural RNA Structures Occupy the Morphospace of All Possible Shapes. Mol Biol Evol 2022; 39:msab280. [PMID: 34542628 PMCID: PMC8763027 DOI: 10.1093/molbev/msab280] [Indexed: 12/15/2022] [Open access]
Abstract
Morphospaces (representations of phenotypic characteristics) are often populated unevenly, leaving large parts unoccupied. Such patterns are typically ascribed to contingency, or else to natural selection disfavoring certain parts of the morphospace. The extent to which developmental bias, the tendency of certain phenotypes to preferentially appear as potential variation, also explains these patterns is hotly debated. Here we demonstrate quantitatively that developmental bias is the primary explanation for the occupation of the morphospace of RNA secondary structure (SS) shapes. Upon random mutations, some RNA SS shapes (the frequent ones) are much more likely to appear than others. By using the RNAshapes method to define coarse-grained SS classes, we can directly compare the frequencies with which noncoding RNA SS shapes appear in the RNAcentral database to frequencies obtained upon a random sampling of sequences. We show that: 1) only the most frequent structures appear in nature; the vast majority of possible structures in the morphospace have not yet been explored; 2) remarkably small numbers of random sequences are needed to produce all the RNA SS shapes found in nature so far; and 3) perhaps most surprisingly, the natural frequencies are accurately predicted, over several orders of magnitude in variation, by the likelihood that structures appear upon a uniform random sampling of sequences. The ultimate cause of these patterns is not natural selection, but rather a strong phenotype bias in the RNA genotype-phenotype map, a type of developmental bias or "findability constraint," which limits evolutionary dynamics to a hugely reduced subset of structures that are easy to "find."
Affiliation(s)
- Kamaludin Dingle
  - Centre for Applied Mathematics and Bioinformatics, Department of Mathematics and Natural Sciences, Gulf University for Science and Technology, Hawally, Kuwait
- Fatme Ghaddar
  - Centre for Applied Mathematics and Bioinformatics, Department of Mathematics and Natural Sciences, Gulf University for Science and Technology, Hawally, Kuwait
- Petr Šulc
  - School of Molecular Sciences and Center for Molecular Design and Biomimetics at the Biodesign Institute, Arizona State University, Tempe, AZ, USA
- Ard A Louis
  - Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford, United Kingdom
6. Martin NS, Ahnert SE. Insertions and deletions in the RNA sequence-structure map. J R Soc Interface 2021; 18:20210380. [PMID: 34610259 PMCID: PMC8492174 DOI: 10.1098/rsif.2021.0380] [Received: 05/08/2021] [Accepted: 09/13/2021] [Indexed: 12/21/2022] [Open access]
Abstract
Genotype-phenotype maps link genetic changes to their fitness effect and are thus an essential component of evolutionary models. The map between RNA sequences and their secondary structures is a key example and has applications in functional RNA evolution. For this map, the structural effect of substitutions is well understood, but models usually assume a constant sequence length and do not consider insertions or deletions. Here, we expand the sequence-structure map to include single nucleotide insertions and deletions by using the RNAshapes concept. To quantify the structural effect of insertions and deletions, we generalize existing definitions for robustness and non-neutral mutation probabilities. We find striking similarities between substitutions, deletions and insertions: robustness to substitutions is correlated with robustness to insertions and, for most structures, to deletions. In addition, frequent structural changes after substitutions also tend to be common for insertions and deletions. This is consistent with the connection between energetically suboptimal folds and possible structural transitions. The similarities observed hold both for genotypic and phenotypic robustness and mutation probabilities, i.e. for individual sequences and for averages over sequences with the same structure. Our results could have implications for the rate of neutral and non-neutral evolution.
Affiliation(s)
- Nora S. Martin
  - Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge CB3 0HE, UK
  - Sainsbury Laboratory, University of Cambridge, Bateman Street, Cambridge CB2 1LR, UK
- Sebastian E. Ahnert
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, UK
  - The Alan Turing Institute, British Library, Euston Road, London NW1 2DB, UK
7. Manrubia S, Cuesta JA, Aguirre J, Ahnert SE, Altenberg L, Cano AV, Catalán P, Diaz-Uriarte R, Elena SF, García-Martín JA, Hogeweg P, Khatri BS, Krug J, Louis AA, Martin NS, Payne JL, Tarnowski MJ, Weiß M. From genotypes to organisms: State-of-the-art and perspectives of a cornerstone in evolutionary dynamics. Phys Life Rev 2021; 38:55-106. [PMID: 34088608 DOI: 10.1016/j.plrev.2021.03.004] [Received: 11/24/2020] [Accepted: 03/01/2021] [Indexed: 12/21/2022]
Abstract
Understanding how genotypes map onto phenotypes, fitness, and eventually organisms is arguably the next major missing piece in a fully predictive theory of evolution. We refer to this generally as the problem of the genotype-phenotype map. Though we are still far from achieving a complete picture of these relationships, our current understanding of simpler questions, such as the structure induced in the space of genotypes by sequences mapped to molecular structures, has revealed important facts that deeply affect the dynamical description of evolutionary processes. Empirical evidence supporting the fundamental relevance of features such as phenotypic bias is mounting as well, while the synthesis of conceptual and experimental progress leads to questioning current assumptions on the nature of evolutionary dynamics, with cancer progression models and synthetic biology approaches being notable examples. This work delves with a critical and constructive attitude into our current knowledge of how genotypes map onto molecular phenotypes and organismal functions, and discusses theoretical and empirical avenues to broaden and improve this comprehension. As a final goal, this community should aim at deriving an updated picture of evolutionary processes soundly relying on the structural properties of genotype spaces, as revealed by modern techniques of molecular and functional analysis.
Affiliation(s)
- Susanna Manrubia
  - Department of Systems Biology, Centro Nacional de Biotecnología (CSIC), Madrid, Spain; Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain
- José A Cuesta
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Departamento de Matemáticas, Universidad Carlos III de Madrid, Leganés, Spain; Instituto de Biocomputación y Física de Sistemas Complejos (BiFi), Universidad de Zaragoza, Spain; UC3M-Santander Big Data Institute (IBiDat), Getafe, Madrid, Spain
- Jacobo Aguirre
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Centro de Astrobiología, CSIC-INTA, ctra. de Ajalvir km 4, 28850 Torrejón de Ardoz, Madrid, Spain
- Sebastian E Ahnert
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, UK; The Alan Turing Institute, British Library, 96 Euston Road, London NW1 2DB, UK
- Alejandro V Cano
  - Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Pablo Catalán
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Departamento de Matemáticas, Universidad Carlos III de Madrid, Leganés, Spain
- Ramon Diaz-Uriarte
  - Department of Biochemistry, Universidad Autónoma de Madrid, Madrid, Spain; Instituto de Investigaciones Biomédicas "Alberto Sols" (UAM-CSIC), Madrid, Spain
- Santiago F Elena
  - Instituto de Biología Integrativa de Sistemas, I(2)SysBio (CSIC-UV), València, Spain; The Santa Fe Institute, Santa Fe, NM, USA
- Paulien Hogeweg
  - Theoretical Biology and Bioinformatics Group, Utrecht University, the Netherlands
- Bhavin S Khatri
  - The Francis Crick Institute, London, UK; Department of Life Sciences, Imperial College London, London, UK
- Joachim Krug
  - Institute for Biological Physics, University of Cologne, Köln, Germany
- Ard A Louis
  - Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford, UK
- Nora S Martin
  - Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, UK; Sainsbury Laboratory, University of Cambridge, Cambridge, UK
- Joshua L Payne
  - Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Marcel Weiß
  - Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, UK; Sainsbury Laboratory, University of Cambridge, Cambridge, UK
8. Oliver CG, Reinharz V, Waldispühl J. On the emergence of structural complexity in RNA replicators. RNA (New York, N.Y.) 2019; 25:1579-1591. [PMID: 31467146 PMCID: PMC6859851 DOI: 10.1261/rna.070391.119] [Received: 01/15/2019] [Accepted: 08/19/2019] [Indexed: 06/10/2023]
Abstract
The RNA world hypothesis relies on the ability of ribonucleic acids to spontaneously acquire complex structures capable of supporting essential biological functions. Multiple sophisticated evolutionary models have been proposed for their emergence, but they often assume specific conditions. In this work, we explore a simple and parsimonious scenario describing the emergence of complex molecular structures at the early stages of life. We show that at specific GC content regimes, an undirected replication model is sufficient to explain the appearance of multibranched RNA secondary structures, a structural signature of many essential ribozymes. We ran a large-scale computational study to map energetically stable structures on complete mutational networks of 50-nt-long RNA sequences. Our results reveal that the sequence landscape with stable structures is enriched with multibranched structures at a length scale coinciding with the appearance of complex structures in RNA databases. A random replication mechanism preserving a 50% GC content may suffice to explain a natural enrichment of stable complex structures in populations of functional RNAs. In contrast, an evolutionary mechanism eliciting the most stable folds at each generation appears to help reach multibranched structures at the highest GC content.
Affiliation(s)
- Carlos G Oliver
  - School of Computer Science, McGill University, Montreal, QC H3A 2B3, Canada
- Vladimir Reinharz
  - Center for Soft and Living Matter, Institute for Basic Science, Ulsan 34126, South Korea
- Jérôme Waldispühl
  - School of Computer Science, McGill University, Montreal, QC H3A 2B3, Canada
9. Grabow WW, Andrews GE. On the nature and origin of biological information: The curious case of RNA. Biosystems 2019; 185:104031. [PMID: 31525398 DOI: 10.1016/j.biosystems.2019.104031] [Received: 08/10/2019] [Revised: 09/11/2019] [Accepted: 09/12/2019] [Indexed: 11/18/2022]
Abstract
Biological information is most commonly thought of in terms of biology's Central Dogma where DNA is viewed as a linearized code used to synthesize proteins. Using DNA's chemical cousin, RNA, as a case study we consider how biological information operates outside the linear arrangement of its polymeric subunits. Much like individual pieces of a jigsaw puzzle, particular structures enable biomolecules to undergo precise molecular interactions with one another based on their respective shapes. By exploring the relationship between sequence and structure in RNA we argue that biological information finds its ultimate functional fulfillment in the three-dimensional structural arrangement of its atoms. We show how recurrent structural RNA motifs, operating at the tertiary level of a molecule, provide robust building blocks for the formation of new structural configurations and thereby convey the information required for emergent biological functions. We posit that these same RNA structures, guided by their respective thermodynamic stabilities, experience selective pressure to maintain particular three-dimensional architectures over and above pressures to maintain a particular sequence of nucleotides. Ultimately, this framework for understanding the nature of biological information provides a useful paradigm for understanding its origins and how biological information can result from chaotic prebiotic conditions.
Affiliation(s)
- Wade W Grabow
  - Department of Chemistry and Biochemistry, Seattle Pacific University, Seattle, WA, 98119-1997, USA
- Grace E Andrews
  - Department of Chemistry and Biochemistry, Seattle Pacific University, Seattle, WA, 98119-1997, USA
10. Catalán P, Elena SF, Cuesta JA, Manrubia S. Parsimonious Scenario for the Emergence of Viroid-Like Replicons De Novo. Viruses 2019; 11:v11050425. [PMID: 31075860 PMCID: PMC6563258 DOI: 10.3390/v11050425] [Received: 03/29/2019] [Revised: 04/30/2019] [Accepted: 05/02/2019] [Indexed: 01/12/2023] [Open access]
Abstract
Viroids are small, non-coding, circular RNA molecules that infect plants. Different hypotheses for their evolutionary origin have been put forward, such as an early emergence in a precellular RNA World or several de novo independent evolutionary origins in plants. Here, we discuss the plausibility of de novo emergence of viroid-like replicons by giving theoretical support to the likelihood of different steps along a parsimonious evolutionary pathway. While Avsunviroidae-like structures are relatively easy to obtain through evolution of a population of random RNA sequences of fixed length, rod-like structures typical of Pospiviroidae are difficult to fix. Using different quantitative approaches, we evaluated the likelihood that RNA sequences fold into a rod-like structure and bear specific sequence motifs facilitating interactions with other molecules, e.g., RNA polymerases, RNases, and ligases. By means of numerical simulations, we show that circular RNA replicons analogous to Pospiviroidae emerge if evolution is seeded with minimal circular RNAs that grow through the gradual addition of nucleotides. Further, these rod-like replicons often maintain their structure if independent functional modules are acquired that impose selective constraints. The evolutionary scenario we propose here is consistent with the structural and biochemical properties of viroids described to date.
Affiliation(s)
- Pablo Catalán
  - Biosciences, College of Life and Environmental Sciences, University of Exeter, Exeter EX4 4QD, UK
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain
- Santiago F Elena
  - Instituto de Biología Integrativa de Sistemas (I2SysBio), CSIC-Universitat de València, Paterna, 46980 València, Spain
  - The Santa Fe Institute, Santa Fe, NM 87501, USA
- José A Cuesta
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain
  - Departamento de Matemáticas, Universidad Carlos III de Madrid, 28911 Leganés, Spain
  - Instituto de Biocomputación y Física de Sistemas Complejos (BiFi), Universidad de Zaragoza, 50018 Zaragoza, Spain
  - Institute of Financial Big Data (IFiBiD), Universidad Carlos III de Madrid-Banco de Santander, 28903 Getafe, Spain
- Susanna Manrubia
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain
  - National Biotechnology Centre (CSIC), 28049 Madrid, Spain
11. Mutschler H, Taylor AI, Porebski BT, Lightowlers A, Houlihan G, Abramov M, Herdewijn P, Holliger P. Random-sequence genetic oligomer pools display an innate potential for ligation and recombination. eLife 2018; 7:43022. [PMID: 30461419 PMCID: PMC6289569 DOI: 10.7554/elife.43022] [Received: 10/21/2018] [Accepted: 11/16/2018] [Indexed: 02/06/2023] [Open access]
Abstract
Recombination, the exchange of information between different genetic polymer strands, is of fundamental importance in biology for genome maintenance and genetic diversification and is mediated by dedicated recombinase enzymes. Here, we describe an innate capacity for non-enzymatic recombination (and ligation) in random-sequence genetic oligomer pools. Specifically, we examine random and semi-random eicosamer (N20) pools of RNA, DNA and the unnatural genetic polymers ANA (arabino-), HNA (hexitol-) and AtNA (altritol-nucleic acids). While DNA, ANA and HNA pools proved inert, RNA (and to a lesser extent AtNA) pools displayed diverse modes of spontaneous intermolecular recombination, connecting recombination mechanistically to the vicinal ring cis-diol configuration shared by RNA and AtNA. Thus, the chemical constitution that renders both susceptible to hydrolysis emerges as the fundamental determinant of an innate capacity for recombination, which is shown to promote a concomitant increase in compositional, informational and structural pool complexity and hence evolutionary potential.
Affiliation(s)
- Mikhail Abramov
  - REGA Institute, Katholieke Universiteit Leuven, Leuven, Belgium
- Piet Herdewijn
  - REGA Institute, Katholieke Universiteit Leuven, Leuven, Belgium
12. García-Martín JA, Catalán P, Manrubia S, Cuesta JA. Statistical theory of phenotype abundance distributions: A test through exact enumeration of genotype spaces. EPL 2018; 123:28001. [DOI: 10.1209/0295-5075/123/28001] [Indexed: 11/20/2022]
13. Manrubia S, Cuesta JA. Distribution of genotype network sizes in sequence-to-structure genotype-phenotype maps. J R Soc Interface 2017; 14:rsif.2016.0976. [PMID: 28424303 DOI: 10.1098/rsif.2016.0976] [Received: 12/05/2016] [Accepted: 03/22/2017] [Indexed: 01/10/2023] [Open access]
Abstract
An essential quantity to ensure evolvability of populations is the navigability of the genotype space. Navigability, understood as the ease with which alternative phenotypes are reached, relies on the existence of sufficiently large and mutually attainable genotype networks. The size of genotype networks (e.g. the number of RNA sequences folding into a particular secondary structure or the number of DNA sequences coding for the same protein structure) is astronomically large in all functional molecules investigated: an exhaustive experimental or computational study of all RNA folds or all protein structures becomes impossible even for moderately long sequences. Here, we analytically derive the distribution of genotype network sizes for a hierarchy of models which successively incorporate features of increasingly realistic sequence-to-structure genotype-phenotype maps. The main feature of these models relies on the characterization of each phenotype through a prototypical sequence whose sites admit a variable fraction of letters of the alphabet. Our models interpolate between two limit distributions: a power-law distribution, when the ordering of sites in the prototypical sequence is strongly constrained, and a lognormal distribution, as suggested for RNA, when different orderings of the same set of sites yield different phenotypes. Our main result is the qualitative and quantitative identification of those features of sequence-to-structure maps that lead to different distributions of genotype network sizes.
Affiliation(s)
- Susanna Manrubia
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain
  - Departamento de Biología de Sistemas, Centro Nacional de Biotecnología (CSIC), Madrid, Spain
- José A Cuesta
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain
  - Departamento de Matemáticas, Universidad Carlos III de Madrid, Leganés, Madrid, Spain
  - Instituto de Biocomputación y Física de Sistemas Complejos (BIFI), Universidad de Zaragoza, Zaragoza, Spain
  - UC3M-BS Institute of Financial Big Data (IFiBiD), Universidad Carlos III de Madrid, Getafe, Madrid, Spain
14. Staroseletz Y, Nechaev S, Bichenkova E, Bryce RA, Watson C, Vlassov V, Zenkova M. Non-enzymatic recombination of RNA: Ligation in loops. Biochim Biophys Acta Gen Subj 2017; 1862:705-725. [PMID: 29097301 DOI: 10.1016/j.bbagen.2017.10.019] [Received: 08/02/2017] [Revised: 10/10/2017] [Accepted: 10/26/2017] [Indexed: 12/23/2022]
Abstract
BACKGROUND: While the RNA world hypothesis is widely accepted, it is still far from complete: the existence of a self-replicating ribozyme, consisting of potentially hundreds of nucleotides, is a core assumption of most RNA world models. The appearance of such long RNA molecules under prebiotic conditions is not self-evident. Recombination seems to be a plausible way of creating RNA diversity, resulting in the appearance of functional RNAs capable of self-replication.
METHODS: We report a study of the recombination process modelled with two 96-nt RNA fragments. Recombination products were detected by RT-PCR followed by TA cloning and Sanger sequencing.
RESULTS: A wide range of recombinant products was detected. We found that (i) the most efficient ligation was observed for RNA species forming bulges or internal loops, with ligation partners located within the loop; (ii) a strong preference was observed for formation of a few types of major products with a large variety of minor products; (iii) ligation could occur with participation of either 2',3'-cyclophosphate or 5'-ppp; (iv) the presence of key reaction components, i.e. 5'ppp-RNAs, enabled the formation of additional types of product; (v) molecular dynamics simulations of one of the most abundant products suggest that ligation preferentially forms 2'-5' rather than 3'-5' linkages.
CONCLUSIONS: The study demonstrates regularities in the formation of new RNA molecules through non-enzymatic recombination.
GENERAL SIGNIFICANCE: Our findings provide new data supporting the RNA World hypothesis and show how new RNA sequences could emerge under prebiotic conditions.
Affiliation(s)
- Yaroslav Staroseletz
  - Institute of Chemical Biology and Fundamental Medicine, Siberian Branch of Russian Academy of Sciences, 8 Lavrentiev Avenue, Novosibirsk 630090, Russia
- Sergey Nechaev
  - Institute of Chemical Biology and Fundamental Medicine, Siberian Branch of Russian Academy of Sciences, 8 Lavrentiev Avenue, Novosibirsk 630090, Russia
- Elena Bichenkova
  - Division of Pharmacy and Optometry, School of Health Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Oxford Road, Manchester, M13 9PT, UK
- Richard A Bryce
  - Division of Pharmacy and Optometry, School of Health Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Oxford Road, Manchester, M13 9PT, UK
- Catherine Watson
  - Division of Pharmacy and Optometry, School of Health Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Oxford Road, Manchester, M13 9PT, UK
- Valentin Vlassov
  - Institute of Chemical Biology and Fundamental Medicine, Siberian Branch of Russian Academy of Sciences, 8 Lavrentiev Avenue, Novosibirsk 630090, Russia
- Marina Zenkova
  - Institute of Chemical Biology and Fundamental Medicine, Siberian Branch of Russian Academy of Sciences, 8 Lavrentiev Avenue, Novosibirsk 630090, Russia

15
Catalán P, Arias CF, Cuesta JA, Manrubia S. Adaptive multiscapes: an up-to-date metaphor to visualize molecular adaptation. Biol Direct 2017; 12:7. [PMID: 28245845 PMCID: PMC5331743 DOI: 10.1186/s13062-017-0178-1] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 11/08/2016] [Accepted: 02/11/2017] [Indexed: 01/08/2023] Open
Abstract
Background: Wright’s metaphor of the fitness landscape has shaped and conditioned our view of the adaptation of populations for almost a century. Since its inception, and including criticism raised by Wright himself, the concept has been surrounded by controversy. Among others, the debate stems from the intrinsic difficulty to capture important features of the space of genotypes, such as its high dimensionality or the existence of abundant ridges, in a visually appealing two-dimensional picture. Two additional currently widespread observations come to further constrain the applicability of the original metaphor: the very skewed distribution of phenotype sizes (which may actively prevent, due to entropic effects, the achievement of fitness maxima), and functional promiscuity (i.e. the existence of secondary functions which entail partial adaptation to environments never encountered before by the population). Results: Here we revise some of the shortcomings of the fitness landscape metaphor and propose a new “scape” formed by interconnected layers, each layer containing the phenotypes viable in a given environment. Different phenotypes within a layer are accessible through mutations with selective value, while neutral mutations cause displacements of populations within a phenotype. A different environment is represented as a separated layer, where phenotypes may have new fitness values, other phenotypes may be viable, and the same genotype may yield a different phenotype, representing genotypic promiscuity. This scenario explicitly includes the many-to-many structure of the genotype-to-phenotype map. A number of empirical observations regarding the adaptation of populations in the light of adaptive multiscapes are reviewed. Conclusions: Several shortcomings of Wright’s visualization of fitness landscapes can be overcome through adaptive multiscapes. Relevant aspects of population adaptation, such as neutral drift, functional promiscuity or environment-dependent fitness, as well as entropic trapping and the concomitant impossibility to reach fitness peaks are visualized at once. Adaptive multiscapes should aid in the qualitative understanding of the multiple pathways involved in evolutionary dynamics. Reviewers: This article was reviewed by Eugene Koonin and Ricard Solé.
Affiliation(s)
- Pablo Catalán
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Departamento de Matemáticas, Universidad Carlos III de Madrid, Madrid, Spain
- Clemente F Arias
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain
- Jose A Cuesta
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Departamento de Matemáticas, Universidad Carlos III de Madrid, Madrid, Spain; Institute for Biocomputation and Physics of Complex Systems, Zaragoza, Spain; UC3M-BS Institute of Financial Big Data (IFiBiD), Madrid, Spain
- Susanna Manrubia
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; National Biotechnology Centre (CSIC), c/ Darwin 3, Madrid, 28049, Spain

16
Greenbury SF, Ahnert SE. The organization of biological sequences into constrained and unconstrained parts determines fundamental properties of genotype-phenotype maps. J R Soc Interface 2015; 12:20150724. [PMID: 26609063 PMCID: PMC4707848 DOI: 10.1098/rsif.2015.0724] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 08/12/2015] [Accepted: 10/30/2015] [Indexed: 11/12/2022] Open
Abstract
Biological information is stored in DNA, RNA and protein sequences, which can be understood as genotypes that are translated into phenotypes. The properties of genotype-phenotype (GP) maps have been studied in great detail for RNA secondary structure. These include a highly biased distribution of genotypes per phenotype, negative correlation of genotypic robustness and evolvability, positive correlation of phenotypic robustness and evolvability, shape-space covering, and a roughly logarithmic scaling of phenotypic robustness with phenotypic frequency. More recently similar properties have been discovered in other GP maps, suggesting that they may be fundamental to biological GP maps, in general, rather than specific to the RNA secondary structure map. Here we propose that the above properties arise from the fundamental organization of biological information into 'constrained' and 'unconstrained' sequences, in the broadest possible sense. As 'constrained' we describe sequences that affect the phenotype more immediately, and are therefore more sensitive to mutations, such as, e.g. protein-coding DNA or the stems in RNA secondary structure. 'Unconstrained' sequences, on the other hand, can mutate more freely without affecting the phenotype, such as, e.g. intronic or intergenic DNA or the loops in RNA secondary structure. To test our hypothesis we consider a highly simplified GP map that has genotypes with 'coding' and 'non-coding' parts. We term this the Fibonacci GP map, as it is equivalent to the Fibonacci code in information theory. Despite its simplicity the Fibonacci GP map exhibits all the above properties of much more complex and biologically realistic GP maps. These properties are therefore likely to be fundamental to many biological GP maps.
Affiliation(s)
- S F Greenbury
  - Theory of Condensed Matter, Cavendish Laboratory, University of Cambridge, Cambridge CB3 0HE, UK
- S E Ahnert
  - Theory of Condensed Matter, Cavendish Laboratory, University of Cambridge, Cambridge CB3 0HE, UK

17
Manrubia S, Cuesta JA. Evolution on neutral networks accelerates the ticking rate of the molecular clock. J R Soc Interface 2015; 12:20141010. [PMID: 25392402 DOI: 10.1098/rsif.2014.1010] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Indexed: 11/12/2022] Open
Abstract
Large sets of genotypes give rise to the same phenotype, because phenotypic expression is highly redundant. Accordingly, a population can accept mutations without altering its phenotype, as long as the genotype mutates into another one on the same set. By linking every pair of genotypes that are mutually accessible through mutation, genotypes organize themselves into neutral networks (NNs). These networks are known to be heterogeneous and assortative, and these properties affect the evolutionary dynamics of the population. By studying the dynamics of populations on NNs with arbitrary topology, we analyse the effect of assortativity, of NN (phenotype) fitness and of network size. We find that the probability that the population leaves the network is smaller the longer the time spent on it. This progressive 'phenotypic entrapment' entails a systematic increase in the overdispersion of the process with time and an acceleration in the fixation rate of neutral mutations. We also quantify the variation of these effects with the size of the phenotype and with its fitness relative to that of neighbouring alternatives.
Affiliation(s)
- Susanna Manrubia
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Systems Biology Programme, National Centre for Biotechnology (CSIC), c/ Darwin 3, 28049 Madrid, Spain
- José A Cuesta
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Department of Mathematics, Universidad Carlos III de Madrid, 28911 Leganés, Madrid, Spain; Instituto de Biocomputación y Física de Sistemas Complejos (BIFI), Universidad de Zaragoza, 50009 Zaragoza, Spain

18
Becchetti A. Empirically founded genotype-phenotype maps from mammalian cyclic nucleotide-gated ion channels. J Theor Biol 2014; 363:205-15. [PMID: 25172772 DOI: 10.1016/j.jtbi.2014.08.038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2014] [Revised: 07/22/2014] [Accepted: 08/20/2014] [Indexed: 10/24/2022]
Abstract
A major barrier between evolutionary and functional biology is the difficulty of determining appropriate genotype-phenotype-fitness maps, particularly in metazoans. Concrete perspectives towards unifying these approaches are offered by studies on the physiological systems that depend on ion channel dynamics. I focus on the cyclic nucleotide-gated (CNG) channels implicated in the photoreceptor's response to light. From an evolutionary standpoint, sensory systems offer interpretative advantages, as the relation between the sensory response and environment is relatively straightforward. For CNG and other ion channels, extensive data are available about the physiological consequences of scanning mutagenesis on sensitive protein domains, such as the conduction pore. Mutant ion channels can be easily studied in living cells, so that the relation between genotypes and phenotypes is less speculative than usual. By relying on relatively simple theoretical frameworks, I used these data to relate the sequence space with phenotypes at increasing hierarchical levels. These empirical genotype-phenotype and phenotype-phenotype landscapes became smoother at higher integration levels, especially in the heterozygous condition. The epistatic interaction between sites was analyzed from double mutant constructs. Magnitude epistasis was common. Moreover, evidence of reciprocal sign epistasis and the presence of permissive mutations were also observed, which suggest how adaptive regions can be connected across maladaptive valleys. The approach I describe suggests a way to better relate the evolutionary dynamics with the underlying physiology.
Affiliation(s)
- Andrea Becchetti
  - Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza della Scienza 2, 20126 Milano, Italy

19
Partha R, Raman K. Revisiting robustness and evolvability: evolution in weighted genotype spaces. PLoS One 2014; 9:e112792. [PMID: 25390641 PMCID: PMC4229248 DOI: 10.1371/journal.pone.0112792] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/02/2014] [Accepted: 10/16/2014] [Indexed: 12/30/2022] Open
Abstract
Robustness and evolvability are highly intertwined properties of biological systems. The relationship between these properties determines how biological systems are able to withstand mutations and show variation in response to them. Computational studies have explored the relationship between these two properties using neutral networks of RNA sequences (genotype) and their secondary structures (phenotype) as a model system. However, these studies have assumed every mutation to a sequence to be equally likely; the differences in the likelihood of the occurrence of various mutations, and the consequence of probabilistic nature of the mutations in such a system have previously been ignored. Associating probabilities to mutations essentially results in the weighting of genotype space. We here perform a comparative analysis of weighted and unweighted neutral networks of RNA sequences, and subsequently explore the relationship between robustness and evolvability. We show that assuming an equal likelihood for all mutations (as in an unweighted network), underestimates robustness and overestimates evolvability of a system. In spite of discarding this assumption, we observe that a negative correlation between sequence (genotype) robustness and sequence evolvability persists, and also that structure (phenotype) robustness promotes structure evolvability, as observed in earlier studies using unweighted networks. We also study the effects of base composition bias on robustness and evolvability. Particularly, we explore the association between robustness and evolvability in a sequence space that is AU-rich – sequences with an AU content of 80% or higher, compared to a normal (unbiased) sequence space. We find that evolvability of both sequences and structures in an AU-rich space is lesser compared to the normal space, and robustness higher. 
We also observe that AU-rich populations evolving on neutral networks of phenotypes, can access less phenotypic variation compared to normal populations evolving on neutral networks.
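The AU-rich criterion quoted in this abstract (an AU content of 80% or higher) can be illustrated with a minimal Python sketch. The function names `au_content` and `is_au_rich` are hypothetical and are not taken from the cited paper; the sketch only computes base composition and does not model neutral networks, robustness or evolvability.

```python
def au_content(seq: str) -> float:
    """Fraction of A and U bases in an RNA sequence (illustrative helper)."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return sum(base in "AU" for base in seq) / len(seq)


def is_au_rich(seq: str, threshold: float = 0.8) -> bool:
    """True if the AU fraction meets the threshold (0.8 mirrors the 80% in the abstract)."""
    return au_content(seq) >= threshold


if __name__ == "__main__":
    for s in ("AAUUG", "ACGU"):
        print(s, au_content(s), is_au_rich(s))
```

The threshold is left as a parameter so the same check can be reused for other composition biases.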
Affiliation(s)
- Raghavendran Partha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
- Karthik Raman
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India

20
Greenbury SF, Johnston IG, Louis AA, Ahnert SE. A tractable genotype-phenotype map modelling the self-assembly of protein quaternary structure. J R Soc Interface 2014; 11:20140249. [PMID: 24718456 PMCID: PMC4006268 DOI: 10.1098/rsif.2014.0249] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.9] [Indexed: 11/24/2022] Open
Abstract
The mapping between biological genotypes and phenotypes is central to the study of biological evolution. Here, we introduce a rich, intuitive and biologically realistic genotype–phenotype (GP) map that serves as a model of self-assembling biological structures, such as protein complexes, and remains computationally and analytically tractable. Our GP map arises naturally from the self-assembly of polyomino structures on a two-dimensional lattice and exhibits a number of properties: redundancy (genotypes vastly outnumber phenotypes), phenotype bias (genotypic redundancy varies greatly between phenotypes), genotype component disconnectivity (phenotypes consist of disconnected mutational networks) and shape space covering (most phenotypes can be reached in a small number of mutations). We also show that the mutational robustness of phenotypes scales very roughly logarithmically with phenotype redundancy and is positively correlated with phenotypic evolvability. Although our GP map describes the assembly of disconnected objects, it shares many properties with other popular GP maps for connected units, such as models for RNA secondary structure or the hydrophobic-polar (HP) lattice model for protein tertiary structure. The remarkable fact that these important properties similarly emerge from such different models suggests the possibility that universal features underlie a much wider class of biologically realistic GP maps.
Affiliation(s)
- Sam F Greenbury
  - Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, UK

21
Ruiz-Mirazo K, Briones C, de la Escosura A. Prebiotic Systems Chemistry: New Perspectives for the Origins of Life. Chem Rev 2013; 114:285-366. [DOI: 10.1021/cr2004844] [Citation(s) in RCA: 563] [Impact Index Per Article: 51.2] [Indexed: 02/06/2023]
Affiliation(s)
- Kepa Ruiz-Mirazo
  - Biophysics Unit (CSIC-UPV/EHU), Leioa, and Department of Logic and Philosophy of Science, University of the Basque Country, Avenida de Tolosa 70, 20080 Donostia-San Sebastián, Spain
- Carlos Briones
  - Department of Molecular Evolution, Centro de Astrobiología (CSIC-INTA, associated to the NASA Astrobiology Institute), Carretera de Ajalvir, Km 4, 28850 Torrejón de Ardoz, Madrid, Spain
- Andrés de la Escosura
  - Organic Chemistry Department, Universidad Autónoma de Madrid, Cantoblanco, 28049 Madrid, Spain

22
Takeuchi N, Hogeweg P. Reply to the commentaries on “Evolutionary dynamics of RNA-like replicator systems: A bioinformatic approach to the origin of life”. Phys Life Rev 2012. [DOI: 10.1016/j.plrev.2012.07.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2022]

23
A quantitative quasispecies theory-based model of virus escape mutation under immune selection. Proc Natl Acad Sci U S A 2012; 109:12980-5. [PMID: 22826258 DOI: 10.1073/pnas.1117201109] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.4] [Indexed: 12/29/2022] Open
Abstract
Viral infections involve a complex interplay of the immune response and escape mutation of the virus quasispecies inside a single host. Although fundamental aspects of such a balance of mutation and selection pressure have been established by the quasispecies theory decades ago, its implications have largely remained qualitative. Here, we present a quantitative approach to model the virus evolution under cytotoxic T-lymphocyte immune response. The virus quasispecies dynamics are explicitly represented by mutations in the combined sequence space of a set of epitopes within the viral genome. We stochastically simulated the growth of a viral population originating from a single wild-type founder virus and its recognition and clearance by the immune response, as well as the expansion of its genetic diversity. Applied to the immune escape of a simian immunodeficiency virus epitope, model predictions were quantitatively comparable to the experimental data. Within the model parameter space, we found two qualitatively different regimes of infectious disease pathogenesis, each representing alternative fates of the immune response: It can clear the infection in finite time or eventually be overwhelmed by viral growth and escape mutation. The latter regime exhibits the characteristic disease progression pattern of human immunodeficiency virus, while the former is bounded by maximum mutation rates that can be suppressed by the immune response. Our results demonstrate that, by explicitly representing epitope mutations and thus providing a genotype-phenotype map, the quasispecies theory can form the basis of a detailed sequence-specific model of real-world viral pathogens evolving under immune selection.

24
Manrubia SC. Neutral networks and chemical function in RNA: Comment on "Evolutionary dynamics of RNA-like replicator systems: A bioinformatic approach to the origin of life". Phys Life Rev 2012; 9:277-8; discussion 279-84. [PMID: 22698460 DOI: 10.1016/j.plrev.2012.06.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 03/29/2012] [Accepted: 06/06/2012] [Indexed: 12/24/2022]
Affiliation(s)
- Susanna C Manrubia
  - Centro de Astrobiología (INTA-CSIC), Ctra. de Ajalvir km. 4, 28850 Torrejón de Ardoz, Madrid, Spain

25
Derr J, Manapat ML, Rajamani S, Leu K, Xulvi-Brunet R, Joseph I, Nowak MA, Chen IA. Prebiotically plausible mechanisms increase compositional diversity of nucleic acid sequences. Nucleic Acids Res 2012; 40:4711-22. [PMID: 22319215 PMCID: PMC3378899 DOI: 10.1093/nar/gks065] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Indexed: 11/14/2022] Open
Abstract
During the origin of life, the biological information of nucleic acid polymers must have increased to encode functional molecules (the RNA world). Ribozymes tend to be compositionally unbiased, as is the vast majority of possible sequence space. However, ribonucleotides vary greatly in synthetic yield, reactivity and degradation rate, and their non-enzymatic polymerization results in compositionally biased sequences. While natural selection could lead to complex sequences, molecules with some activity are required to begin this process. Was the emergence of compositionally diverse sequences a matter of chance, or could prebiotically plausible reactions counter chemical biases to increase the probability of finding a ribozyme? Our in silico simulations using a two-letter alphabet show that template-directed ligation and high concatenation rates counter compositional bias and shift the pool toward longer sequences, permitting greater exploration of sequence space and stable folding. We verified experimentally that unbiased DNA sequences are more efficient templates for ligation, thus increasing the compositional diversity of the pool. Our work suggests that prebiotically plausible chemical mechanisms of nucleic acid polymerization and ligation could predispose toward a diverse pool of longer, potentially structured molecules. Such mechanisms could have set the stage for the appearance of functional activity very early in the emergence of life.
Affiliation(s)
- Julien Derr
  - FAS Center for Systems Biology, Harvard University, Cambridge, MA 02138, USA

26
Topological structure of the space of phenotypes: the case of RNA neutral networks. PLoS One 2011; 6:e26324. [PMID: 22028856 PMCID: PMC3196570 DOI: 10.1371/journal.pone.0026324] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.7] [Received: 07/29/2011] [Accepted: 09/23/2011] [Indexed: 11/19/2022] Open
Abstract
The evolution and adaptation of molecular populations is constrained by the diversity accessible through mutational processes. RNA is a paradigmatic example of biopolymer where genotype (sequence) and phenotype (approximated by the secondary structure fold) are identified in a single molecule. The extreme redundancy of the genotype-phenotype map leads to large ensembles of RNA sequences that fold into the same secondary structure and can be connected through single-point mutations. These ensembles define neutral networks of phenotypes in sequence space. Here we analyze the topological properties of neutral networks formed by 12-nucleotide RNA sequences, obtained through the exhaustive folding of sequence space. A total of 4^12 sequences fragments into 645 subnetworks that correspond to 57 different secondary structures. The topological analysis reveals that each subnetwork is far from being random: it has a degree distribution with a well-defined average and a small dispersion, a high clustering coefficient, and an average shortest path between nodes close to its minimum possible value, i.e. the Hamming distance between sequences. RNA neutral networks are assortative due to the correlation in the composition of neighboring sequences, a feature that together with the symmetries inherent to the folding process explains the existence of communities. Several topological relationships can be analytically derived attending to structural restrictions and generic properties of the folding process. The average degree of these phenotypic networks grows logarithmically with their size, such that abundant phenotypes have the additional advantage of being more robust to mutations. This property prevents fragmentation of neutral networks and thus enhances the navigability of sequence space. In summary, RNA neutral networks show unique topological properties, unknown to other networks previously described.
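As a rough illustration of the quantities named in this abstract, the following Python sketch computes the size of the 12-nucleotide sequence space, the Hamming distance between two sequences, and the single-point-mutation neighbours that would form the candidate edges of a neutral network. All function names are hypothetical, and the sketch deliberately omits the folding step: deciding which neighbours share a secondary structure would require an RNA folding program, which is not assumed here.

```python
ALPHABET = "ACGU"


def sequence_space_size(length: int, alphabet: str = ALPHABET) -> int:
    """Number of distinct sequences of the given length (4**12 for 12-mers)."""
    return len(alphabet) ** length


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def point_mutants(seq: str, alphabet: str = ALPHABET) -> list[str]:
    """All sequences reachable from seq by exactly one substitution."""
    return [
        seq[:i] + base + seq[i + 1:]
        for i in range(len(seq))
        for base in alphabet
        if base != seq[i]
    ]


if __name__ == "__main__":
    seq = "ACGUACGUACGU"
    print(sequence_space_size(12))   # size of the 12-mer sequence space
    print(len(point_mutants(seq)))   # 3 substitutions at each of 12 positions
```

In a full neutral-network construction, two sequences at Hamming distance 1 would be linked only if they fold into the same secondary structure.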

27
Stich M, Manrubia SC. Motif frequency and evolutionary search times in RNA populations. J Theor Biol 2011; 280:117-26. [PMID: 21419782 DOI: 10.1016/j.jtbi.2011.03.010] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 08/11/2010] [Revised: 01/26/2011] [Accepted: 03/10/2011] [Indexed: 02/07/2023]
Abstract
RNA molecules, through their dual identity as sequence and structure, are an appropriate experimental and theoretical model to study the genotype-phenotype map and evolutionary processes taking place in simple replicator populations. In this computational study, we relate properties of the sequence-structure map, in particular the abundance of a given secondary structure in a random pool, with the number of replicative events that an initially random population of sequences needs to find that structure through mutation and selection. For common structures, this search process turns out to be much faster than for rare structures. Furthermore, search and fixation processes are more efficient in a wider range of mutation rates for common structures, thus indicating that evolvability of RNA populations is not simply determined by abundance. We also find significant differences in the search and fixation processes for structures of same abundance, and relate them with the number of base pairs forming the structure. Moreover, the influence of the nucleotide content of the RNA sequences on the search process is studied. Our results advance in the understanding of the distribution and attainability of RNA secondary structures. They hint at the fact that, beyond sequence length and sequence-to-function redundancy, the mutation rate that permits localization and fixation of a given phenotype strongly depends on its relative abundance and global, in general non-uniform, distribution in sequence space.
Affiliation(s)
- Michael Stich
  - Centro de Astrobiología (CSIC-INTA), Ctra de Ajalvir km 4, 28850 Torrejón de Ardoz, Madrid, Spain

28
Ma W, Yu C, Zhang W, Hu J. A simple template-dependent ligase ribozyme as the RNA replicase emerging first in the RNA world. Astrobiology 2010; 10:437-447. [PMID: 20528198 DOI: 10.1089/ast.2009.0385] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 05/29/2023]
Abstract
The "RNA world" hypothesis has offered a framework for both experimental and theoretical work in the field of the origin of life. An important concern about the hypothesis is how the RNA world could originate. It has long been speculated that a template-dependent RNA synthetase ribozyme, which catalyzed its own replication (thus, an "RNA replicase"), should have emerged first. However, experimental searches for such a replicase have so far been unsuccessful. This is primarily because of the large sequence length of candidate ribozymes, which mainly work in a polymerase-like way. Here, we propose that the replicase that emerged first would be a simple template-dependent ligase ribozyme, which loosely binds to template RNA and has a relatively low efficiency of catalyzing the formation of phosphodiester bonds between adjacently aligned nucleotides or oligonucleotides. We conducted a computer simulation to support this proposal and considered the factors that might affect the emergence of the ribozyme based on the parameter analysis in the simulation. We conclude that (1) a template-dependent ligase may be more likely than a template-dependent polymerase as an early replicase in the emergence of RNA-based replication; (2) such a ligase ribozyme could emerge and be stable against parasites under a broad range of parameters in our model; (3) the conditions shown to favor the initial appearance of a template-dependent ligase ribozyme do not favor its spread.
Affiliation(s)
- Wentao Ma
  - College of Life Sciences, Wuhan University, Wuhan, People's Republic of China

29
Pigliucci M. Genotype-phenotype mapping and the end of the 'genes as blueprint' metaphor. Philos Trans R Soc Lond B Biol Sci 2010; 365:557-66. [PMID: 20083632 DOI: 10.1098/rstb.2009.0241] [Citation(s) in RCA: 179] [Impact Index Per Article: 12.8] [Indexed: 11/12/2022] Open
Abstract
In a now classic paper published in 1991, Alberch introduced the concept of genotype-phenotype (G→P) mapping to provide a framework for a more sophisticated discussion of the integration between genetics and developmental biology that was then available. The advent of evo-devo first and of the genomic era later would seem to have superseded talk of transitions in phenotypic space and the like, central to Alberch's approach. On the contrary, this paper shows that recent empirical and theoretical advances have only sharpened the need for a different conceptual treatment of how phenotypes are produced. Old-fashioned metaphors like genetic blueprint and genetic programme are not only woefully inadequate but positively misleading about the nature of G→P, and are being replaced by an algorithmic approach emerging from the study of a variety of actual G→P maps. These include RNA folding, protein function and the study of evolvable software. Some generalities are emerging from these disparate fields of analysis, and I suggest that the concept of 'developmental encoding' (as opposed to the classical one of genetic encoding) provides a promising computational-theoretical underpinning to coherently integrate ideas on evolvability, modularity and robustness and foster a fruitful framing of the G→P mapping problem.
Affiliation(s)
- Massimo Pigliucci
  - Department of Philosophy, City University of New York-Lehman, NY, USA

30
Stich M, Lázaro E, Manrubia SC. Phenotypic effect of mutations in evolving populations of RNA molecules. BMC Evol Biol 2010; 10:46. [PMID: 20163698 PMCID: PMC2841169 DOI: 10.1186/1471-2148-10-46] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Received: 05/14/2009] [Accepted: 02/17/2010] [Indexed: 11/17/2022] Open
Abstract
Background: The secondary structure of folded RNA sequences is a good model to map phenotype onto genotype, as represented by the RNA sequence. Computational studies of the evolution of ensembles of RNA molecules towards target secondary structures yield valuable clues to the mechanisms behind adaptation of complex populations. The relationship between the space of sequences and structures, the organization of RNA ensembles at mutation-selection equilibrium, the time of adaptation as a function of the population parameters, the presence of collective effects in quasispecies, or the optimal mutation rates to promote adaptation all are issues that can be explored within this framework. Results: We investigate the effect of microscopic mutations on the phenotype of RNA molecules during their in silico evolution and adaptation. We calculate the distribution of the effects of mutations on fitness, the relative fractions of beneficial and deleterious mutations and the corresponding selection coefficients for populations evolving under different mutation rates. Three different situations are explored: the mutation-selection equilibrium (optimized population) in three different fitness landscapes, the dynamics during adaptation towards a goal structure (adapting population), and the behavior under periodic population bottlenecks (perturbed population). Conclusions: The ratio between the number of beneficial and deleterious mutations experienced by a population of RNA sequences increases with the value of the mutation rate μ at which evolution proceeds. In contrast, the selective value of mutations remains almost constant, independent of μ, indicating that adaptation occurs through an increase in the number of beneficial mutations, with little variation in the average effect they have on fitness. Statistical analyses of the distribution of fitness effects reveal that small effects, either beneficial or deleterious, are well described by a Pareto distribution. These results are robust under changes in the fitness landscape, remarkably when, in addition to selecting a target secondary structure, specific subsequences or low-energy folds are required. A population perturbed by bottlenecks behaves similarly to an adapting population, struggling to return to the optimized state. Whether it can survive in the long run or whether it goes extinct depends critically on the length of the time interval between bottlenecks.
|
31
|
Abstract
In vitro selection of RNA aptamers that bind to a specific ligand usually begins with a random pool of RNA sequences. We propose a computational approach for designing a starting pool of RNA sequences for the selection of RNA aptamers for specific analyte binding. Our approach consists of three steps: (i) selection of RNA sequences based on their secondary structure, (ii) generating a library of three-dimensional (3D) structures of RNA molecules and (iii) high-throughput virtual screening of this library to select aptamers with binding affinity to a desired small molecule. We developed a set of criteria that allows one to select a sequence with potential binding affinity from a pool of random sequences and developed a protocol for RNA 3D structure prediction. As verification, we tested the performance of in silico selection on a set of six known aptamer–ligand complexes. The structures of the native sequences for the ligands in the testing set were among the top 5% of the selected structures. The proposed approach reduces the RNA sequences search space by four to five orders of magnitude—significantly accelerating the experimental screening and selection of high-affinity aptamers.
Affiliation(s)
- Yaroslav Chushak
- Biotechnology HPC Software Applications Institute, Telemedicine and Advanced Technology Research Center, US Army Medical Research and Materiel Command, Fort Detrick, MD 21702, USA.
|
32
|
Briones C, Stich M, Manrubia SC. The dawn of the RNA World: toward functional complexity through ligation of random RNA oligomers. RNA (NEW YORK, N.Y.) 2009; 15:743-9. [PMID: 19318464 PMCID: PMC2673073 DOI: 10.1261/rna.1488609] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.9] [Received: 11/27/2008] [Accepted: 01/31/2009] [Indexed: 05/23/2023]
Abstract
A major unsolved problem in the RNA World scenario for the origin of life is how a template-dependent RNA polymerase ribozyme emerged from short RNA oligomers obtained by random polymerization on mineral surfaces. A number of computational studies have shown that the structural repertoire yielded by that process is dominated by topologically simple structures, notably hairpin-like ones. A fraction of these could display RNA ligase activity and catalyze the assembly of larger, eventually functional RNA molecules retaining their previous modular structure: molecular complexity increases but template replication is absent. This allows us to build up a stepwise model of ligation-based, modular evolution that could pave the way to the emergence of a ribozyme with RNA replicase activity, the step at which information-driven Darwinian evolution would be triggered.
Affiliation(s)
- Carlos Briones
- Centro de Astrobiología (CSIC-INTA), 28850 Torrejón de Ardoz, Madrid, Spain.
|