1
Lipsh-Sokolik R, Fleishman SJ. Addressing epistasis in the design of protein function. Proc Natl Acad Sci U S A 2024; 121:e2314999121. [PMID: 39133844 PMCID: PMC11348311 DOI: 10.1073/pnas.2314999121] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/29/2024] Open
Abstract
Mutations in protein active sites can dramatically improve function. The active site, however, is densely packed and extremely sensitive to mutations. Therefore, some mutations may only be tolerated in combination with others in a phenomenon known as epistasis. Epistasis reduces the likelihood of obtaining improved functional variants and dramatically slows natural and lab evolutionary processes. Research has shed light on the molecular origins of epistasis and its role in shaping evolutionary trajectories and outcomes. In addition, sequence- and AI-based strategies that infer epistatic relationships from mutational patterns in natural or experimental evolution data have been used to design functional protein variants. In recent years, combinations of such approaches and atomistic design calculations have successfully predicted highly functional combinatorial mutations in active sites. These were used to design thousands of functional active-site variants, demonstrating that, while our understanding of epistasis remains incomplete, some of the determinants that are critical for accurate design are now sufficiently understood. We conclude that the space of active-site variants that has been explored by evolution may be expanded dramatically to enhance natural activities or discover new ones. Furthermore, design opens the way to systematically exploring sequence and structure space and mutational impacts on function, deepening our understanding and control over protein activity.
Affiliation(s)
- Rosalie Lipsh-Sokolik: Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 7610001, Israel
- Sarel J Fleishman: Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 7610001, Israel
2
Metzger BPH, Park Y, Starr TN, Thornton JW. Epistasis facilitates functional evolution in an ancient transcription factor. eLife 2024; 12:RP88737. [PMID: 38767330 PMCID: PMC11105156 DOI: 10.7554/elife.88737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024] Open
Abstract
A protein's genetic architecture - the set of causal rules by which its sequence produces its functions - also determines its possible evolutionary trajectories. Prior research has proposed that the genetic architecture of proteins is very complex, with pervasive epistatic interactions that constrain evolution and make function difficult to predict from sequence. Most of this work has analyzed only the direct paths between two proteins of interest - excluding the vast majority of possible genotypes and evolutionary trajectories - and has considered only a single protein function, leaving unaddressed the genetic architecture of functional specificity and its impact on the evolution of new functions. Here, we develop a new method based on ordinal logistic regression to directly characterize the global genetic determinants of multiple protein functions from 20-state combinatorial deep mutational scanning (DMS) experiments. We use it to dissect the genetic architecture and evolution of a transcription factor's specificity for DNA, using data from a combinatorial DMS of an ancient steroid hormone receptor's capacity to activate transcription from two biologically relevant DNA elements. We show that the genetic architecture of DNA recognition consists of a dense set of main and pairwise effects that involve virtually every possible amino acid state in the protein-DNA interface, but higher-order epistasis plays only a tiny role. Pairwise interactions enlarge the set of functional sequences and are the primary determinants of specificity for different DNA elements. They also massively expand the number of opportunities for single-residue mutations to switch specificity from one DNA target to another. By bringing variants with different functions close together in sequence space, pairwise epistasis therefore facilitates rather than constrains the evolution of new functions.
Affiliation(s)
- Brian PH Metzger: Department of Ecology and Evolution, University of Chicago, Chicago, United States
- Yeonwoo Park: Program in Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, United States
- Tyler N Starr: Department of Biochemistry and Molecular Biophysics, University of Chicago, Chicago, United States
- Joseph W Thornton: Department of Ecology and Evolution, University of Chicago, Chicago, United States; Department of Human Genetics, University of Chicago, Chicago, United States
3
Rozhoňová H, Martí-Gómez C, McCandlish DM, Payne JL. Robust genetic codes enhance protein evolvability. PLoS Biol 2024; 22:e3002594. [PMID: 38754362 PMCID: PMC11098591 DOI: 10.1371/journal.pbio.3002594] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Accepted: 03/19/2024] [Indexed: 05/18/2024] Open
Abstract
The standard genetic code defines the rules of translation for nearly every life form on Earth. It also determines the amino acid changes accessible via single-nucleotide mutations, thus influencing protein evolvability-the ability of mutation to bring forth adaptive variation in protein function. One of the most striking features of the standard genetic code is its robustness to mutation, yet it remains an open question whether such robustness facilitates or frustrates protein evolvability. To answer this question, we use data from massively parallel sequence-to-function assays to construct and analyze 6 empirical adaptive landscapes under hundreds of thousands of rewired genetic codes, including those of codon compression schemes relevant to protein engineering and synthetic biology. We find that robust genetic codes tend to enhance protein evolvability by rendering smooth adaptive landscapes with few peaks, which are readily accessible from throughout sequence space. However, the standard genetic code is rarely exceptional in this regard, because many alternative codes render smoother landscapes than the standard code. By constructing low-dimensional visualizations of these landscapes, which each comprise more than 16 million mRNA sequences, we show that such alternative codes radically alter the topological features of the network of high-fitness genotypes. Whereas the genetic codes that optimize evolvability depend to some extent on the detailed relationship between amino acid sequence and protein function, we also uncover general design principles for engineering nonstandard genetic codes for enhanced and diminished evolvability, which may facilitate directed protein evolution experiments and the bio-containment of synthetic organisms, respectively.
Affiliation(s)
- Hana Rozhoňová: Institute of Integrative Biology, ETH Zürich, Zürich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Carlos Martí-Gómez: Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- David M. McCandlish: Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Joshua L. Payne: Institute of Integrative Biology, ETH Zürich, Zürich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
4
Sudakow I, Reinitz J, Vakulenko SA, Grigoriev D. Evolution of biological cooperation: an algorithmic approach. Sci Rep 2024; 14:1468. [PMID: 38233462 PMCID: PMC10794236 DOI: 10.1038/s41598-024-52028-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Accepted: 01/12/2024] [Indexed: 01/19/2024] Open
Abstract
This manuscript presents an algorithmic approach to cooperation in biological systems, drawing on fundamental ideas from statistical mechanics and probability theory. Fisher's geometric model of adaptation suggests that the evolution of organisms well adapted to multiple constraints comes at a significant complexity cost. By utilizing combinatorial models of fitness, we demonstrate that the probability of adapting to all constraints decreases exponentially with the number of constraints, thereby generalizing Fisher's result. Our main focus is understanding how cooperation can overcome this adaptivity barrier. Through these combinatorial models, we demonstrate that when an organism needs to adapt to a multitude of environmental variables, division of labor emerges as the only viable evolutionary strategy.
Affiliation(s)
- Ivan Sudakow: School of Mathematics and Statistics, The Open University, Milton Keynes, MK7 6AA, UK
- John Reinitz: Departments of Statistics, Ecology and Evolution, Molecular Genetics and Cell Biology, University of Chicago, Chicago, 10587, IL, USA
- Sergey A Vakulenko: Institute for Problems in Mechanical Engineering, Russian Academy of Sciences, Saint Petersburg, 199178, Russia; Saint Petersburg Electrotechnical University, Saint Petersburg, 197022, Russia
- Dima Grigoriev: CNRS, Mathématiques, Université de Lille, Villeneuve d'Ascq, Lille, 59655, France
5
Papkou A, Garcia-Pastor L, Escudero JA, Wagner A. A rugged yet easily navigable fitness landscape. Science 2023; 382:eadh3860. [PMID: 37995212 DOI: 10.1126/science.adh3860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Accepted: 09/29/2023] [Indexed: 11/25/2023]
Abstract
Fitness landscape theory predicts that rugged landscapes with multiple peaks impair Darwinian evolution, but experimental evidence is limited. In this study, we used genome editing to map the fitness of >260,000 genotypes of the key metabolic enzyme dihydrofolate reductase in the presence of the antibiotic trimethoprim, which targets this enzyme. The resulting landscape is highly rugged and harbors 514 fitness peaks. However, its highest peaks are accessible to evolving populations via abundant fitness-increasing paths. Different peaks share large basins of attraction that render the outcome of adaptive evolution highly contingent on chance events. Our work shows that ruggedness need not be an obstacle to Darwinian evolution but can reduce its predictability. If true in general, the complexity of optimization problems on realistic landscapes may require reappraisal.
Affiliation(s)
- Andrei Papkou: Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
- Lucia Garcia-Pastor: Departamento de Sanidad Animal and VISAVET Health Surveillance Centre, Universidad Complutense de Madrid, Madrid, Spain
- José Antonio Escudero: Departamento de Sanidad Animal and VISAVET Health Surveillance Centre, Universidad Complutense de Madrid, Madrid, Spain
- Andreas Wagner: Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland; The Santa Fe Institute, Santa Fe, NM, USA
6
Barker-Clarke R, Weaver DT, Scott JG. Graph 'texture' features as novel metrics that can summarize complex biological graphs. Phys Med Biol 2023; 68:174001. [PMID: 37385267 PMCID: PMC10598684 DOI: 10.1088/1361-6560/ace305] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2022] [Revised: 06/08/2023] [Accepted: 06/29/2023] [Indexed: 07/01/2023]
Abstract
Objective. Image texture features, such as those derived by Haralick et al, are a powerful metric for image classification and are used across fields including cancer research. Our aim is to demonstrate how analogous texture features can be derived for graphs and networks. We also aim to illustrate how these new metrics summarize graphs, may aid comparative graph studies, may help classify biological graphs, and might assist in detecting dysregulation in cancer. Approach. We generate the first analogies of image texture for graphs and networks. Co-occurrence matrices for graphs are generated by summing over all pairs of neighboring nodes in the graph. We generate metrics for fitness landscapes, gene co-expression and regulatory networks, and protein interaction networks. To assess metric sensitivity we varied discretization parameters and noise. To examine these metrics in the cancer context we compare metrics for both simulated and publicly available experimental gene expression and build random forest classifiers for cancer cell lineage. Main results. Our novel graph 'texture' features are shown to be informative of graph structure and node label distributions. The metrics are sensitive to discretization parameters and noise in node labels. We demonstrate that graph texture features vary across different biological graph topologies and node labelings. We show how our texture metrics can be used to classify cell line expression by lineage, demonstrating classifiers with 82% and 89% accuracy. Significance. New metrics provide opportunities for better comparative analyses and new models for classification. Our texture features are novel second-order graph features for networks or graphs with ordered node labels. In the complex cancer informatics setting, evolutionary analyses and drug response prediction are two examples where new network science approaches like this may prove fruitful.
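The co-occurrence construction described in this abstract (summing over all pairs of neighboring nodes) can be illustrated with a minimal sketch. This is not the authors' code: the function names `cooccurrence` and `contrast`, the undirected symmetric counting, and the choice of a Haralick-style contrast as the example feature are all assumptions made for illustration, and node labels are assumed to be already discretized to integers in `range(levels)`.

```python
def cooccurrence(edges, labels, levels):
    """Count label pairs over neighboring nodes of an undirected graph.

    edges  : iterable of (u, v) node pairs
    labels : dict mapping node -> discretized label in range(levels)
    Each edge is counted in both directions, so the matrix is symmetric
    (an assumption of this sketch).
    """
    C = [[0] * levels for _ in range(levels)]
    for u, v in edges:
        a, b = labels[u], labels[v]
        C[a][b] += 1
        C[b][a] += 1
    return C


def contrast(C):
    """Haralick-style contrast: mean squared label difference over pairs."""
    total = sum(map(sum, C))
    if total == 0:
        return 0.0
    n = len(C)
    return sum(C[i][j] * (i - j) ** 2 for i in range(n) for j in range(n)) / total


# Hypothetical three-node path graph with two label levels.
edges = [(0, 1), (1, 2)]
labels = {0: 0, 1: 1, 2: 1}
C = cooccurrence(edges, labels, levels=2)
print(C, contrast(C))
```

Changing `levels` or perturbing `labels` in this toy example is one way to see the sensitivity to discretization and label noise that the abstract reports.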
Affiliation(s)
- R Barker-Clarke: Department of Translational Hematology and Oncology Research, Lerner Research Institute, Cleveland, OH 44195, United States of America
- D T Weaver: Department of Translational Hematology and Oncology Research, Lerner Research Institute, Cleveland, OH 44195, United States of America; School of Medicine, Case Western Reserve University, Cleveland, OH 44195, United States of America
- J G Scott: Department of Translational Hematology and Oncology Research, Lerner Research Institute, Cleveland, OH 44195, United States of America; School of Medicine, Case Western Reserve University, Cleveland, OH 44195, United States of America
7
Servajean R, Bitbol AF. Impact of population size on early adaptation in rugged fitness landscapes. Philos Trans R Soc Lond B Biol Sci 2023; 378:20220045. [PMID: 37004726 PMCID: PMC10067268 DOI: 10.1098/rstb.2022.0045] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/11/2022] [Accepted: 01/12/2023] [Indexed: 04/04/2023] Open
Abstract
Owing to stochastic fluctuations arising from finite population size, known as genetic drift, the ability of a population to explore a rugged fitness landscape depends on its size. In the weak mutation regime, while the mean steady-state fitness increases with population size, we find that the height of the first fitness peak encountered when starting from a random genotype displays various behaviours versus population size, even among small and simple rugged landscapes. We show that the accessibility of the different fitness peaks is key to determining whether this height overall increases or decreases with population size. Furthermore, there is often a finite population size that maximizes the height of the first fitness peak encountered when starting from a random genotype. This holds across various classes of model rugged landscapes with sparse peaks, and in some experimental and experimentally inspired ones. Thus, early adaptation in rugged fitness landscapes can be more efficient and predictable for relatively small population sizes than in the large-size limit. This article is part of the theme issue 'Interdisciplinary approaches to predicting evolutionary biology'.
Affiliation(s)
- Richard Servajean: Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Anne-Florence Bitbol: Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
8
Venkataram S, Kryazhimskiy S. Evolutionary repeatability of emergent properties of ecological communities. Philos Trans R Soc Lond B Biol Sci 2023; 378:20220047. [PMID: 37004728 PMCID: PMC10067272 DOI: 10.1098/rstb.2022.0047] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/12/2022] [Accepted: 12/07/2022] [Indexed: 04/04/2023] Open
Abstract
Most species belong to ecological communities where their interactions give rise to emergent community-level properties, such as diversity and productivity. Understanding and predicting how these properties change over time has been a major goal in ecology, with important practical implications for sustainability and human health. Less attention has been paid to the fact that community-level properties can also change because member species evolve. Yet, our ability to predict long-term eco-evolutionary dynamics hinges on how repeatably community-level properties change as a result of species evolution. Here, we review studies of evolution of both natural and experimental communities and make the case that community-level properties at least sometimes evolve repeatably. We discuss challenges faced in investigations of evolutionary repeatability. In particular, only a handful of studies enable us to quantify repeatability. We argue that quantifying repeatability at the community level is critical for approaching what we see as three major open questions in the field: (i) Is the observed degree of repeatability surprising? (ii) How is evolutionary repeatability at the community level related to repeatability at the level of traits of member species? (iii) What factors affect repeatability? We outline some theoretical and empirical approaches to addressing these questions. Advances in these directions will not only enrich our basic understanding of evolution and ecology but will also help us predict eco-evolutionary dynamics. This article is part of the theme issue 'Interdisciplinary approaches to predicting evolutionary biology'.
Affiliation(s)
- Sandeep Venkataram: Department of Ecology, Behavior and Evolution, UC San Diego, La Jolla, CA 92093, USA
- Sergey Kryazhimskiy: Department of Ecology, Behavior and Evolution, UC San Diego, La Jolla, CA 92093, USA
9
Novelty Search Promotes Antigenic Diversity in Microbial Pathogens. Pathogens 2023; 12:388. [PMID: 36986310 PMCID: PMC10053453 DOI: 10.3390/pathogens12030388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2022] [Revised: 02/12/2023] [Accepted: 02/21/2023] [Indexed: 03/05/2023] Open
Abstract
Driven by host–pathogen coevolution, cell surface antigens are often the fastest evolving parts of a microbial pathogen. The persistent evolutionary impetus for novel antigen variants suggests the utility of novelty-seeking algorithms in predicting antigen diversification in microbial pathogens. In contrast to traditional genetic algorithms maximizing variant fitness, novelty-seeking algorithms optimize variant novelty. Here, we designed and implemented three evolutionary algorithms (fitness-seeking, novelty-seeking, and hybrid) and evaluated their performances in 10 simulated and 2 empirically derived antigen fitness landscapes. The hybrid walks combining fitness- and novelty-seeking strategies overcame the limitations of each algorithm alone, and consistently reached global fitness peaks. Thus, hybrid walks provide a model for microbial pathogens escaping host immunity without compromising variant fitness. Biological processes facilitating novelty-seeking evolution in natural pathogen populations include hypermutability, recombination, wide dispersal, and immune-compromised hosts. The high efficiency of the hybrid algorithm improves the evolutionary predictability of novel antigen variants. We propose the design of escape-proof vaccines based on high-fitness variants covering a majority of the basins of attraction on the fitness landscape representing all potential variants of a microbial antigen.
10
Schmiegelt B, Krug J. Accessibility percolation on Cartesian power graphs. J Math Biol 2023; 86:46. [PMID: 36790641 PMCID: PMC9931871 DOI: 10.1007/s00285-023-01882-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2021] [Revised: 01/12/2023] [Accepted: 01/31/2023] [Indexed: 02/16/2023]
Abstract
A fitness landscape is a mapping from a space of discrete genotypes to the real numbers. A path in a fitness landscape is a sequence of genotypes connected by single mutational steps. Such a path is said to be accessible if the fitness values of the genotypes encountered along the path increase monotonically. We study accessible paths on random fitness landscapes of the House-of-Cards type, on which fitness values are independent, identically and continuously distributed random variables. The genotype space is taken to be a Cartesian power graph [Formula: see text], where [Formula: see text] is the number of genetic loci and the allele graph [Formula: see text] encodes the possible allelic states and mutational transitions on one locus. The probability of existence of accessible paths between two genotypes at a distance linear in [Formula: see text] displays a transition from 0 to a positive value at a threshold [Formula: see text] for the fitness difference between the initial and final genotype. We derive a lower bound on [Formula: see text] for general [Formula: see text] and show that this bound is tight for a large class of allele graphs. Our results generalize previous results for accessibility percolation on the biallelic hypercube, and compare favorably to published numerical results for multiallelic Hamming graphs.
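The definitions in this abstract (single-step mutational paths, monotonically increasing fitness, independent random fitness values) can be made concrete with a minimal sketch. It assumes the simplest special case mentioned at the end of the abstract, the biallelic hypercube, rather than a general Cartesian power graph; the function names and the uniform fitness distribution are illustrative assumptions, not taken from the paper.

```python
import itertools
import random


def house_of_cards(num_loci, rng):
    """Assign i.i.d. uniform fitness values to all 2**num_loci binary genotypes."""
    return {g: rng.random() for g in itertools.product((0, 1), repeat=num_loci)}


def neighbors(g):
    """Genotypes exactly one single-locus mutation away from g."""
    for i in range(len(g)):
        yield g[:i] + (1 - g[i],) + g[i + 1:]


def has_accessible_path(fitness, start, end):
    """True if some mutational path from start to end has strictly increasing fitness."""
    stack, seen = [start], {start}
    while stack:
        g = stack.pop()
        if g == end:
            return True
        for n in neighbors(g):
            # Only follow fitness-increasing steps.
            if n not in seen and fitness[n] > fitness[g]:
                seen.add(n)
                stack.append(n)
    return False


# Estimate how often the all-ones genotype is accessible from all-zeros.
rng = random.Random(1)
L = 6
trials = 200
hits = sum(
    has_accessible_path(house_of_cards(L, rng), (0,) * L, (1,) * L)
    for _ in range(trials)
)
print(hits / trials)
```

The search allows paths of any length, including ones that revert earlier mutations, as long as fitness increases at every step. Conditioning the landscapes on a fixed fitness difference between the initial and final genotype would be one way to probe the threshold behaviour the abstract describes.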
Affiliation(s)
- Joachim Krug: Institute for Biological Physics, University of Cologne, Köln, Germany
11
Cirne D, Campos PRA. Rate of environmental variation impacts the predictability in evolution. Phys Rev E 2022; 106:064408. [PMID: 36671169 DOI: 10.1103/physreve.106.064408] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2022] [Accepted: 12/09/2022] [Indexed: 12/24/2022]
Abstract
In the last two decades, we have improved our understanding of the adaptive evolution of natural populations under constant and stable environments. For instance, experimental methods from evolutionary biology have allowed us to explore the structure of fitness landscapes and survey how the landscape properties can constrain the adaptation process. However, understanding how environmental changes can affect adaptation remains challenging. Very little progress has been made with respect to time-varying fitness landscapes. Using the adaptive-walk approximation, we survey the evolutionary process of populations under a scenario of environmental variation. In particular, we investigate how the rate of environmental variation influences the predictability in evolution. We observe that the rate of environmental variation not only changes the duration of adaptive walks towards fitness peaks of the fitness landscape, but also affects the degree of repeatability of both outcomes and evolutionary paths. In general, slower environmental variation increases the predictability in evolution. The accessibility of endpoints is greatly influenced by the ecological dynamics. The dependence of these quantities on the genome size and number of traits is also addressed. To our knowledge, this contribution is the first to use the predictive approach to quantify and understand the impact of the speed of environmental variation on the degree of parallelism of the evolutionary process.
Affiliation(s)
- Diego Cirne: Departamento de Física, Universidade Federal de Pernambuco, 50740-560 Recife-PE, Brazil
- Paulo R A Campos: Departamento de Física, Universidade Federal de Pernambuco, 50740-560 Recife-PE, Brazil
12
Dingle K, Novev JK, Ahnert SE, Louis AA. Predicting phenotype transition probabilities via conditional algorithmic probability approximations. J R Soc Interface 2022; 19:20220694. [PMID: 36514888 PMCID: PMC9748496 DOI: 10.1098/rsif.2022.0694] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2022] [Accepted: 11/18/2022] [Indexed: 12/15/2022] Open
Abstract
Unravelling the structure of genotype-phenotype (GP) maps is an important problem in biology. Recently, arguments inspired by algorithmic information theory (AIT) and Kolmogorov complexity have been invoked to uncover simplicity bias in GP maps, an exponentially decaying upper bound in phenotype probability with increasing phenotype descriptional complexity. This means that phenotypes with many genotypes assigned via the GP map must be simple, while complex phenotypes must have few genotypes assigned. Here, we use similar arguments to bound the probability P(x → y) that phenotype x, upon random genetic mutation, transitions to phenotype y. The bound is [Formula: see text], where [Formula: see text] is the estimated conditional complexity of y given x, quantifying how much extra information is required to make y given access to x. This upper bound is related to the conditional form of algorithmic probability from AIT. We demonstrate the practical applicability of our derived bound by predicting phenotype transition probabilities (and other related quantities) in simulations of RNA and protein secondary structures. Our work contributes to a general mathematical understanding of GP maps and may facilitate the prediction of transition probabilities directly from examining phenotypes themselves, without utilizing detailed knowledge of the GP map.
Affiliation(s)
- Kamaludin Dingle: Department of Chemical Engineering and Biotechnology, Cambridge University, Cambridge CB2 1TN, UK; Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA 91125, USA; Department of Mathematics and Natural Sciences, Centre for Applied Mathematics and Bioinformatics (CAMB), Gulf University for Science and Technology, 32093, Kuwait
- Javor K. Novev: Department of Chemical Engineering and Biotechnology, Cambridge University, Cambridge CB2 1TN, UK
- Sebastian E. Ahnert: Department of Chemical Engineering and Biotechnology, Cambridge University, Cambridge CB2 1TN, UK
- Ard A. Louis: Department of Physics, Rudolf Peierls Centre for Theoretical Physics, Oxford University, Oxford OX1 2JD, UK
13
Conflicting effects of recombination on the evolvability and robustness in neutrally evolving populations. PLoS Comput Biol 2022; 18:e1010710. [DOI: 10.1371/journal.pcbi.1010710] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2022] [Revised: 12/05/2022] [Accepted: 11/04/2022] [Indexed: 11/22/2022] Open
Abstract
Understanding the benefits and costs of recombination under different scenarios of evolutionary adaptation remains an open problem for theoretical and experimental research. In this study, we focus on finite populations evolving on neutral networks comprising viable and unfit genotypes. We provide a comprehensive overview of the effects of recombination by jointly considering different measures of evolvability and mutational robustness over a broad parameter range, such that many evolutionary regimes are covered. We find that several of these measures vary non-monotonically with the rates of mutation and recombination. Moreover, the presence of unfit genotypes that introduce inhomogeneities in the network of viable states qualitatively alters the effects of recombination. We conclude that conflicting trends induced by recombination can be explained by an emerging trade-off between evolvability on the one hand, and mutational robustness on the other. Finally, we discuss how different implementations of the recombination scheme in theoretical models can affect the observed dependence on recombination rate through a coupling between recombination and genetic drift.
14
Getting higher on rugged landscapes: Inversion mutations open access to fitter adaptive peaks in NK fitness landscapes. PLoS Comput Biol 2022; 18:e1010647. [PMID: 36315581 PMCID: PMC9648849 DOI: 10.1371/journal.pcbi.1010647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2022] [Revised: 11/10/2022] [Accepted: 10/09/2022] [Indexed: 11/12/2022] Open
Abstract
Molecular evolution is often conceptualised as adaptive walks on rugged fitness landscapes, driven by mutations and constrained by incremental fitness selection. It is well known that epistasis shapes the ruggedness of the landscape’s surface, outlining its topography (with high-fitness peaks separated by valleys of lower fitness genotypes). However, within the strong selection weak mutation (SSWM) limit, once an adaptive walk reaches a local peak, natural selection restricts passage through downstream paths and hampers any possibility of reaching higher fitness values. Here, in addition to the widely used point mutations, we introduce a minimal model of sequence inversions to simulate adaptive walks. We use the well known NK model to instantiate rugged landscapes. We show that adaptive walks can reach higher fitness values through inversion mutations, which, compared to point mutations, allow the evolutionary process to escape local fitness peaks. To elucidate the effects of this chromosomal rearrangement, we use a graph-theoretical representation of accessible mutants and show how new evolutionary paths are uncovered. The present model suggests a simple mechanistic rationale to analyse escapes from local fitness peaks in molecular evolution driven by (intragenic) structural inversions and reveals some consequences of the limits of point mutations for simulations of molecular evolution.
Ninety years ago, Wright translated Darwin’s core idea of survival of the fittest into rugged landscapes, a highly influential metaphor, with peaks representing high values of fitness separated by valleys of lower fitness. In this picture, once a population has reached a local peak, the adaptive dynamics may stall as further adaptation requires crossing a valley. At the DNA level, adaptation is often modelled as a space of genotypes that is explored through point mutations. Therefore, once a local peak is reached, any genotype fitter than that of the peak will be away from the neighbourhood of genotypes accessible through point mutations. Here we present a simple computational model for inversion mutations, one of the most frequent structural variations, and show that adaptive processes in rugged landscapes can escape from local peaks through intragenic inversion mutations. This new escape mechanism reveals the innovative role of inversions at the DNA level and provides a step towards more realistic models of adaptive dynamics, beyond the dominance of point mutations in theories of molecular evolution.
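The idea summarised in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the function names are hypothetical, each locus interacts with its K right-hand neighbours on a circular genome, the walk is greedy (steepest ascent) rather than a stochastic SSWM process, and an inversion is modelled as reversing the order of a contiguous segment of a binary sequence.

```python
import itertools
import random


def make_nk_landscape(n, k, rng):
    """Build an NK fitness function with random lookup tables (illustrative)."""
    # One table per locus, indexed by the state of the locus and its k neighbours.
    tables = [
        {bits: rng.random() for bits in itertools.product((0, 1), repeat=k + 1)}
        for _ in range(n)
    ]

    def fitness(genotype):
        total = 0.0
        for i in range(n):
            key = tuple(genotype[(i + d) % n] for d in range(k + 1))
            total += tables[i][key]
        return total / n

    return fitness


def point_neighbours(g):
    """All genotypes one bit flip away from g."""
    for i in range(len(g)):
        yield g[:i] + (1 - g[i],) + g[i + 1:]


def inversion_neighbours(g):
    """All distinct genotypes obtained by reversing one contiguous segment."""
    n = len(g)
    for i in range(n):
        for j in range(i + 2, n + 1):
            inverted = g[:i] + g[i:j][::-1] + g[j:]
            if inverted != g:
                yield inverted


def adaptive_walk(start, fitness, neighbours):
    """Greedy walk: move to the fittest neighbour until none is fitter."""
    current = start
    while True:
        best = max(neighbours(current), key=fitness, default=current)
        if fitness(best) <= fitness(current):
            return current
        current = best


rng = random.Random(0)
fitness = make_nk_landscape(n=12, k=3, rng=rng)
start = tuple(rng.randint(0, 1) for _ in range(12))

# First climb with point mutations only, then allow inversions as well.
peak = adaptive_walk(start, fitness, point_neighbours)
escaped = adaptive_walk(
    peak,
    fitness,
    lambda g: itertools.chain(point_neighbours(g), inversion_neighbours(g)),
)
print(fitness(peak), fitness(escaped))
```

By construction the second walk can never end lower than the point-mutation peak. Whether it ends strictly higher depends on the landscape drawn, so the sketch only shows how an enlarged mutational neighbourhood may open additional uphill steps.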
|
15
|
The structure of genotype-phenotype maps makes fitness landscapes navigable. Nat Ecol Evol 2022; 6:1742-1752. [PMID: 36175543 DOI: 10.1038/s41559-022-01867-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 09/21/2021] [Accepted: 08/01/2022] [Indexed: 11/09/2022]
Abstract
Fitness landscapes are often described in terms of 'peaks' and 'valleys', indicating an intuitive low-dimensional landscape of the kind encountered in everyday experience. The space of genotypes, however, is extremely high dimensional, which results in counter-intuitive structural properties of genotype-phenotype maps. Here we show that these properties, such as the presence of pervasive neutral networks, make fitness landscapes navigable. For three biologically realistic genotype-phenotype map models (RNA secondary structure, protein tertiary structure and protein complexes), we find that, even under random fitness assignment, fitness maxima can be reached from almost any other phenotype without passing through fitness valleys. This in turn indicates that true fitness valleys are very rare. By considering evolutionary simulations between pairs of real examples of functional RNA sequences, we show that accessible paths are also likely to be used under evolutionary dynamics. Our findings have broad implications for the prediction of natural evolutionary outcomes and for directed evolution.
|
16
|
Life finds a way. Nat Ecol Evol 2022; 6:1599-1600. [PMID: 36175542 DOI: 10.1038/s41559-022-01877-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
|
17
|
Srivastava M, Payne JL. On the incongruence of genotype-phenotype and fitness landscapes. PLoS Comput Biol 2022; 18:e1010524. [PMID: 36121840 PMCID: PMC9521842 DOI: 10.1371/journal.pcbi.1010524] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/05/2022] [Revised: 09/29/2022] [Accepted: 08/30/2022] [Indexed: 11/22/2022] Open
Abstract
The mapping from genotype to phenotype to fitness typically involves multiple nonlinearities that can transform the effects of mutations. For example, mutations may contribute additively to a phenotype, but their effects on fitness may combine non-additively because selection favors a low or intermediate value of that phenotype. This can cause incongruence between the topographical properties of a fitness landscape and its underlying genotype-phenotype landscape. Yet, genotype-phenotype landscapes are often used as a proxy for fitness landscapes to study the dynamics and predictability of evolution. Here, we use theoretical models and empirical data on transcription factor-DNA interactions to systematically study the incongruence of genotype-phenotype and fitness landscapes when selection favors a low or intermediate phenotypic value. Using the theoretical models, we prove a number of fundamental results. For example, selection for low or intermediate phenotypic values does not change simple sign epistasis into reciprocal sign epistasis, implying that genotype-phenotype landscapes with only simple sign epistasis motifs will always give rise to single-peaked fitness landscapes under such selection. More broadly, we show that such selection tends to create fitness landscapes that are more rugged than the underlying genotype-phenotype landscape, but this increased ruggedness typically does not frustrate adaptive evolution because the local adaptive peaks in the fitness landscape tend to be nearly as tall as the global peak. Many of these results carry forward to the empirical genotype-phenotype landscapes, which may help to explain why low- and intermediate-affinity transcription factor-DNA interactions are so prevalent in eukaryotic gene regulation.
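The epistasis motifs named in this abstract can be made concrete for two biallelic loci. The sketch below is illustrative only: the function name `classify_epistasis` and the quadratic fitness function around an intermediate optimum are assumptions, not taken from the paper. It classifies a two-locus landscape as having no epistasis, magnitude epistasis, simple sign epistasis, or reciprocal sign epistasis, and then shows how an additive phenotype can yield a non-additive fitness landscape when selection favours an intermediate phenotypic value.

```python
def classify_epistasis(w00, w10, w01, w11, tol=1e-12):
    """Classify the interaction between two biallelic loci from four values."""
    # Deviation from additivity of the double mutant.
    deviation = w11 - w10 - w01 + w00
    if abs(deviation) <= tol:
        return "none"
    # Does the effect of locus A change sign between backgrounds of locus B?
    a_flips = (w10 - w00) * (w11 - w01) < 0
    # Does the effect of locus B change sign between backgrounds of locus A?
    b_flips = (w01 - w00) * (w11 - w10) < 0
    if a_flips and b_flips:
        return "reciprocal sign"
    if a_flips or b_flips:
        return "simple sign"
    return "magnitude"


# Additive phenotype: each mutation adds 1 to the phenotypic value.
phenotype = {"00": 0, "10": 1, "01": 1, "11": 2}
print(classify_epistasis(*(phenotype[g] for g in ("00", "10", "01", "11"))))

# Assumed fitness function: quadratic penalty around an intermediate optimum.
optimum = 1
fitness = {g: -(p - optimum) ** 2 for g, p in phenotype.items()}
print(classify_epistasis(*(fitness[g] for g in ("00", "10", "01", "11"))))
```

In this toy case the phenotype landscape is additive, while the fitness landscape shows reciprocal sign epistasis because each single mutant sits at the optimum and the double mutant overshoots it.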
Affiliation(s)
- Malvika Srivastava
  - Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Joshua L. Payne
  - Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
18
|
Patton AH, Richards EJ, Gould KJ, Buie LK, Martin CH. Hybridization alters the shape of the genotypic fitness landscape, increasing access to novel fitness peaks during adaptive radiation. eLife 2022; 11:e72905. [PMID: 35616528 PMCID: PMC9135402 DOI: 10.7554/elife.72905] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/09/2021] [Accepted: 04/14/2022] [Indexed: 12/30/2022] Open
Abstract
Estimating the complex relationship between fitness and genotype or phenotype (i.e. the adaptive landscape) is one of the central goals of evolutionary biology. However, adaptive walks connecting genotypes to organismal fitness, speciation, and novel ecological niches are still poorly understood and processes for surmounting fitness valleys remain controversial. One outstanding system for addressing these connections is a recent adaptive radiation of ecologically and morphologically novel pupfishes (a generalist, molluscivore, and scale-eater) endemic to San Salvador Island, Bahamas. We leveraged whole-genome sequencing of 139 hybrids from two independent field fitness experiments to identify the genomic basis of fitness, estimate genotypic fitness networks, and measure the accessibility of adaptive walks on the fitness landscape. We identified 132 single nucleotide polymorphisms (SNPs) that were significantly associated with fitness in field enclosures. Six out of the 13 regions most strongly associated with fitness contained differentially expressed genes and fixed SNPs between trophic specialists; one gene (mettl21e) was also misexpressed in lab-reared hybrids, suggesting a potential intrinsic genetic incompatibility. We then constructed genotypic fitness networks from adaptive alleles and show that scale-eating specialists are the most isolated of the three species on these networks. Intriguingly, introgressed and de novo variants reduced fitness landscape ruggedness as compared to standing variation, increasing the accessibility of genotypic fitness paths from generalist to specialists. Our results suggest that adaptive introgression and de novo mutations alter the shape of the fitness landscape, providing key connections in adaptive walks circumventing fitness valleys and triggering the evolution of novelty during adaptive radiation.
Affiliation(s)
- Austin H Patton
  - Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
  - Museum of Vertebrate Zoology, University of California, Berkeley, Berkeley, United States
- Emilie J Richards
  - Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
  - Museum of Vertebrate Zoology, University of California, Berkeley, Berkeley, United States
- Katelyn J Gould
  - Department of Biology, University of North Carolina, Chapel Hill, United States
- Logan K Buie
  - Department of Biology, University of North Carolina, Chapel Hill, United States
- Christopher H Martin
  - Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
  - Museum of Vertebrate Zoology, University of California, Berkeley, Berkeley, United States
|
19
|
Yang CH, Scarpino SV. A Family of Fitness Landscapes Modeled through Gene Regulatory Networks. Entropy (Basel, Switzerland) 2022; 24:622. [PMID: 35626507 PMCID: PMC9141513 DOI: 10.3390/e24050622] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/02/2021] [Revised: 04/11/2022] [Accepted: 04/26/2022] [Indexed: 02/01/2023]
Abstract
Fitness landscapes are a powerful metaphor for understanding the evolution of biological systems. These landscapes describe how genotypes are connected to each other through mutation and related through fitness. Empirical studies of fitness landscapes have increasingly revealed conserved topographical features across diverse taxa, e.g., the accessibility of genotypes and "ruggedness". As a result, theoretical studies are needed to investigate how evolution proceeds on fitness landscapes with such conserved features. Here, we develop and study a model of evolution on fitness landscapes using the lens of Gene Regulatory Networks (GRNs), where the regulatory products are computed from multiple genes and collectively treated as phenotypes. With the assumption that regulation is a binary process, we prove the existence of empirically observed, topographical features such as accessibility and connectivity. We further show that these results hold across arbitrary fitness functions and that a trade-off between accessibility and ruggedness need not exist. Then, using graph theory and a coarse-graining approach, we deduce a mesoscopic structure underlying GRN fitness landscapes where the information necessary to predict a population's evolutionary trajectory is retained with minimal complexity. Using this coarse-graining, we develop a bottom-up algorithm to construct such mesoscopic backbones, which does not require computing the genotype network and is therefore far more efficient than brute-force approaches. Altogether, this work provides mathematical results of high-dimensional fitness landscapes and a path toward connecting theory to empirical studies.
Affiliation(s)
- Chia-Hung Yang
  - Network Science Institute, Northeastern University, Boston, MA 02115, USA
- Samuel V. Scarpino
  - Network Science Institute, Northeastern University, Boston, MA 02115, USA
  - Physics Department, Northeastern University, Boston, MA 02115, USA
  - Roux Institute, Northeastern University, Boston, MA 02115, USA
  - Institute for Experiential AI, Northeastern University, Boston, MA 02115, USA
  - Santa Fe Institute, Santa Fe, NM 87501, USA
  - Vermont Complex Systems Center, University of Vermont, Burlington, VT 05405, USA
|
20
|
Abstract
How do mutational biases influence the process of adaptation? A common assumption is that selection alone determines the course of adaptation from abundant preexisting variation. Yet, theoretical work shows broad conditions under which the mutation rate to a given type of variant strongly influences its probability of contributing to adaptation. Here we introduce a statistical approach to analyzing how mutation shapes protein sequence adaptation. Using large datasets from three different species, we show that the mutation spectrum has a proportional influence on the types of changes fixed in adaptation. We also show via computer simulations that a variety of factors can influence how closely the spectrum of adaptive substitutions reflects the spectrum of variants introduced by mutation.
Evolutionary adaptation often occurs by the fixation of beneficial mutations. This mode of adaptation can be characterized quantitatively by a spectrum of adaptive substitutions, i.e., a distribution for types of changes fixed in adaptation. Recent work establishes that the changes involved in adaptation reflect common types of mutations, raising the question of how strongly the mutation spectrum shapes the spectrum of adaptive substitutions. We address this question with a codon-based model for the spectrum of adaptive amino acid substitutions, applied to three large datasets covering thousands of amino acid changes identified in natural and experimental adaptation in Saccharomyces cerevisiae, Escherichia coli, and Mycobacterium tuberculosis. Using species-specific mutation spectra based on prior knowledge, we find that the mutation spectrum has a proportional influence on the spectrum of adaptive substitutions in all three species. Indeed, we find that by inferring the mutation rates that best explain the spectrum of adaptive substitutions, we can accurately recover the species-specific mutation spectra. However, we also find that the predictive power of the model differs substantially between the three species. To better understand these differences, we use population simulations to explore the factors that influence how closely the spectrum of adaptive substitutions mirrors the mutation spectrum. The results show that the influence of the mutation spectrum decreases with increasing mutational supply (Nμ) and that predictive power is strongly affected by the number and diversity of beneficial mutations.
|
21
|
Phaneuf PV, Zielinski DC, Yurkovich JT, Johnsen J, Szubin R, Yang L, Kim SH, Schulz S, Wu M, Dalldorf C, Ozdemir E, Lennen RM, Palsson BO, Feist AM. Escherichia coli Data-Driven Strain Design Using Aggregated Adaptive Laboratory Evolution Mutational Data. ACS Synth Biol 2021; 10:3379-3395. [PMID: 34762392 PMCID: PMC8870144 DOI: 10.1021/acssynbio.1c00337] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/24/2022]
Abstract
Microbes are being engineered for an increasingly large and diverse set of applications. However, the designing of microbial genomes remains challenging due to the general complexity of biological systems. Adaptive Laboratory Evolution (ALE) leverages nature’s problem-solving processes to generate optimized genotypes currently inaccessible to rational methods. The large amount of public ALE data now represents a new opportunity for data-driven strain design. This study describes how novel strain designs, or genome sequences not yet observed in ALE experiments or published designs, can be extracted from aggregated ALE data and demonstrates this by designing, building, and testing three novel Escherichia coli strains with fitnesses comparable to ALE mutants. These designs were achieved through a meta-analysis of aggregated ALE mutations data (63 Escherichia coli K-12 MG1655 based ALE experiments, described by 93 unique environmental conditions, 357 independent evolutions, and 13 957 observed mutations), which additionally revealed global ALE mutation trends that inform on ALE-derived strain design principles. Such informative trends anticipate ALE-derived strain designs as largely gene-centric, as opposed to noncoding, and composed of a relatively small number of beneficial variants (approximately 6). These results demonstrate how strain design efforts can be enhanced by the meta-analysis of aggregated ALE data.
Affiliation(s)
- Patrick V. Phaneuf
  - Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, California 92093, United States
- Daniel C. Zielinski
  - Department of Bioengineering, University of California, San Diego, La Jolla, California 92093, United States
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kgs. Lyngby, Denmark
- James T. Yurkovich
  - Department of Bioengineering, University of California, San Diego, La Jolla, California 92093, United States
- Josefin Johnsen
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kgs. Lyngby, Denmark
- Richard Szubin
  - Department of Bioengineering, University of California, San Diego, La Jolla, California 92093, United States
- Lei Yang
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kgs. Lyngby, Denmark
- Se Hyeuk Kim
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kgs. Lyngby, Denmark
- Sebastian Schulz
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kgs. Lyngby, Denmark
- Muyao Wu
  - Department of Bioengineering, University of California, San Diego, La Jolla, California 92093, United States
- Christopher Dalldorf
  - Department of Bioengineering, University of California, San Diego, La Jolla, California 92093, United States
- Emre Ozdemir
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kgs. Lyngby, Denmark
- Rebecca M. Lennen
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kgs. Lyngby, Denmark
- Bernhard O. Palsson
  - Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, California 92093, United States
  - Department of Bioengineering, University of California, San Diego, La Jolla, California 92093, United States
  - Department of Pediatrics, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, United States
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kgs. Lyngby, Denmark
- Adam M. Feist
  - Department of Bioengineering, University of California, San Diego, La Jolla, California 92093, United States
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kgs. Lyngby, Denmark
|
22
|
Song S, Zhang J. Unbiased inference of the fitness landscape ruggedness from imprecise fitness estimates. Evolution 2021; 75:2658-2671. [PMID: 34554581 DOI: 10.1111/evo.14363] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/22/2020] [Accepted: 09/14/2021] [Indexed: 01/17/2023]
Abstract
Fitness landscapes map genotypes to their corresponding fitness under given environments and allow explaining and predicting evolutionary trajectories. Of particular interest is the landscape ruggedness or the unevenness of the landscape, because it impacts many aspects of evolution such as the likelihood that a population is trapped in a local fitness peak. Although the ruggedness has been inferred from a number of empirically mapped fitness landscapes, it is unclear to what extent this inference is affected by fitness estimation error, which is inevitable in the experimental determination of fitness landscapes. Here, we address this question by simulating fitness landscapes under various theoretical models, with or without fitness estimation error. We find that all eight examined measures of landscape ruggedness are overestimated due to imprecise fitness quantification, but different measures are affected to different degrees. We devise a method to use replicate fitness measures to correct this bias and show that our method performs well under realistic conditions. We conclude that previously reported fitness landscape ruggedness is likely upward biased owing to the negligence of fitness estimation error and advise that future fitness landscape mapping should include at least three biological replicates to permit an unbiased inference of the ruggedness.
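The direction of the bias described in this abstract can be seen in a small simulation. The sketch below is an assumption-laden toy and does not reproduce the paper's eight ruggedness measures or its correction method: it uses a purely additive landscape, Gaussian measurement error, and the number of local peaks as the only ruggedness measure, with hypothetical function names throughout.

```python
import itertools
import random


def additive_landscape(effects):
    """Map every binary genotype to the sum of the effects of its mutations."""
    n = len(effects)
    return {
        g: sum(e * bit for e, bit in zip(effects, g))
        for g in itertools.product((0, 1), repeat=n)
    }


def add_noise(landscape, sd, rng):
    """Return a copy of the landscape with Gaussian estimation error added."""
    return {g: w + rng.gauss(0.0, sd) for g, w in landscape.items()}


def count_local_peaks(landscape):
    """Count genotypes that are strictly fitter than all one-mutant neighbours."""
    peaks = 0
    for g, w in landscape.items():
        neighbours = (g[:i] + (1 - g[i],) + g[i + 1:] for i in range(len(g)))
        if all(w > landscape[nb] for nb in neighbours):
            peaks += 1
    return peaks


rng = random.Random(0)
true_landscape = additive_landscape([0.3, -0.2, 0.5, 0.1, 0.4, -0.1])
measured = add_noise(true_landscape, sd=0.5, rng=rng)

print("peaks in true landscape:", count_local_peaks(true_landscape))
print("peaks in noisy estimate:", count_local_peaks(measured))
```

An additive landscape with nonzero effects has exactly one peak, so any additional peaks in the noisy copy are produced by estimation error alone. Averaging replicate measurements before counting peaks would be one simple, though not necessarily unbiased, way to reduce this inflation.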
Affiliation(s)
- Siliang Song
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan, 48109
- Jianzhi Zhang
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan, 48109
|
23
|
Bendixsen DP, Pollock TB, Peri G, Hayden EJ. Experimental Resurrection of Ancestral Mammalian CPEB3 Ribozymes Reveals Deep Functional Conservation. Mol Biol Evol 2021; 38:2843-2853. [PMID: 33720319 PMCID: PMC8233481 DOI: 10.1093/molbev/msab074] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/14/2022] Open
Abstract
Self-cleaving ribozymes are genetic elements found in all domains of life, but their evolution remains poorly understood. A ribozyme located in the second intron of the cytoplasmic polyadenylation binding protein 3 gene (CPEB3) shows high sequence conservation in mammals, but little is known about the functional conservation of self-cleaving ribozyme activity across the mammalian tree of life or during the course of mammalian evolution. Here, we use a phylogenetic approach to design a mutational library and a deep sequencing assay to evaluate the in vitro self-cleavage activity of numerous extant and resurrected CPEB3 ribozymes that span over 100 My of mammalian evolution. We found that the predicted sequence at the divergence of placentals and marsupials is highly active, and this activity has been conserved in most lineages. A reduction in ribozyme activity appears to have occurred multiple different times throughout the mammalian tree of life. The in vitro activity data allow an evaluation of the predicted mutational pathways leading to extant ribozyme as well as the mutational landscape surrounding these ribozymes. The results demonstrate that in addition to sequence conservation, the self-cleavage activity of the CPEB3 ribozyme has persisted over millions of years of mammalian evolution.
Affiliation(s)
- Devin P. Bendixsen
  - Biomolecular Sciences Graduate Programs, Boise State University, Boise, ID, USA
- Tanner B. Pollock
  - Department of Biological Science, Boise State University, Boise, ID, USA
- Gianluca Peri
  - Biomolecular Sciences Graduate Programs, Boise State University, Boise, ID, USA
- Eric J. Hayden
  - Biomolecular Sciences Graduate Programs, Boise State University, Boise, ID, USA
  - Department of Biological Science, Boise State University, Boise, ID, USA
|
24
|
Technow F, Podlich D, Cooper M. Back to the future: Implications of genetic complexity for the structure of hybrid breeding programs. G3 (Bethesda, Md.) 2021; 11:6265599. [PMID: 33950172 PMCID: PMC8495936 DOI: 10.1093/g3journal/jkab153] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/15/2021] [Accepted: 04/28/2021] [Indexed: 11/14/2022]
Abstract
Commercial hybrid breeding operations can be described as decentralized networks of smaller, more or less isolated breeding programs. There is further a tendency for the disproportionate use of successful inbred lines for generating the next generation of recombinants, which has led to a series of significant bottlenecks, particularly in the history of the North American and European maize germplasm. Both the decentralization and the disproportionate contribution of inbred lines reduce effective population size and constrain the accessible genetic space. Under these conditions, long-term response to selection is not expected to be optimal under the classical infinitesimal model of quantitative genetics. In this study, we therefore aim to propose a rationale for the success of large breeding operations in the context of genetic complexity arising from the structure and properties of interactive genetic networks. For this, we use simulations based on the NK model of genetic architecture. We indeed found that constraining genetic space through program decentralization and disproportionate contribution of parental inbred lines, is required to expose additive genetic variation and thus facilitate heritable genetic gains under high levels of genetic complexity. These results introduce new insights into why the historically grown structure of hybrid breeding programs was successful in improving the yield potential of hybrid crops over the last century. We also hope that a renewed appreciation for “why things worked” in the past can guide the adoption of novel technologies and the design of future breeding strategies for navigating biological complexity.
Affiliation(s)
- Frank Technow
  - Plant Breeding, Corteva Agriscience, Tavistock, ON, N0B 2R0, Canada
- Dean Podlich
  - Systems and Innovation for Breeding and Seed Products, Corteva Agriscience, Johnston, IA, 50131, USA
- Mark Cooper
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St Lucia, QLD, 4067, Australia
|
25
|
Smerlak M. Quasi-species evolution maximizes genotypic reproductive value (not fitness or flatness). J Theor Biol 2021; 522:110699. [PMID: 33794289 DOI: 10.1016/j.jtbi.2021.110699] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/16/2020] [Revised: 02/23/2021] [Accepted: 03/23/2021] [Indexed: 10/21/2022]
Abstract
Growing efforts to measure fitness landscapes in molecular and microbial systems are motivated by a longstanding goal to predict future evolutionary trajectories. Sometimes under-appreciated, however, is that the fitness landscape and its topography do not by themselves determine the direction of evolution: under sufficiently high mutation rates, populations can climb the closest fitness peak (survival of the fittest), settle in lower regions with higher mutational robustness (survival of the flattest), or even fail to adapt altogether (error catastrophes). I show that another measure of reproductive success, Fisher's reproductive value, resolves the trade-off between fitness and robustness in the quasi-species regime of evolution: to forecast the motion of a population in genotype space, one should look for peaks in the (mutation-rate dependent) landscape of genotypic reproductive values, whether or not these peaks correspond to local fitness maxima or flat fitness plateaus. This new landscape picture turns quasi-species dynamics into an instance of non-equilibrium dynamics, in the physical sense of Markovian processes, potential landscapes, entropy production, etc.
Affiliation(s)
- Matteo Smerlak
  - Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
|
26
|
Balance between promiscuity and specificity in phage λ host range. ISME Journal 2021; 15:2195-2205. [PMID: 33589767 DOI: 10.1038/s41396-021-00912-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/24/2020] [Revised: 01/18/2021] [Accepted: 01/25/2021] [Indexed: 01/21/2023]
Abstract
As hosts acquire resistance to viruses, viruses must overcome that resistance to re-establish infectivity, or go extinct. Despite the significant hurdles associated with adapting to a resistant host, viruses are evolutionarily successful and maintain stable coevolutionary relationships with their hosts. To investigate the factors underlying how pathogens adapt to their hosts, we performed a deep mutational scan of the region of the λ tail fiber tip protein that mediates contact with the receptor on λ's host, Escherichia coli. Phages harboring amino acid substitutions were subjected to selection for infectivity on wild type E. coli, revealing a highly restrictive fitness landscape, in which most substitutions completely abrogate function. A subset of positions that are tolerant of mutation in this assay, but diverse over evolutionary time, are associated with host range expansion. Imposing selection for phage infectivity on three λ-resistant hosts, each harboring a different missense mutation in the λ receptor, reveals hundreds of adaptive variants in λ. We distinguish λ variants that confer promiscuity, a general ability to overcome host resistance, from those that drive host-specific infectivity. Both processes may be important in driving adaptation to a novel host.
|
27
|
Genotype networks of 80 quantitative Arabidopsis thaliana phenotypes reveal phenotypic evolvability despite pervasive epistasis. PLoS Comput Biol 2020; 16:e1008082. [PMID: 32790763 PMCID: PMC7447023 DOI: 10.1371/journal.pcbi.1008082] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/19/2019] [Revised: 08/25/2020] [Accepted: 06/22/2020] [Indexed: 12/23/2022] Open
Abstract
We study the genotype-phenotype maps of 80 quantitative phenotypes in the model plant Arabidopsis thaliana, by representing the genotypes affecting each phenotype as a genotype network. In such a network, each vertex or node corresponds to an individual's genotype at all those genomic loci that affect a given phenotype. Two vertices are connected by an edge if the associated genotypes differ in exactly one nucleotide. The 80 genotype networks we analyze are based on data from genome-wide association studies of 199 A. thaliana accessions. They form connected graphs whose topography differs substantially among phenotypes. We focus our analysis on the incidence of epistasis (non-additive interactions among mutations) because a high incidence of epistasis can reduce the accessibility of evolutionary paths towards high or low phenotypic values. We find epistatic interactions in 67 phenotypes, and in 51 phenotypes every pairwise mutant interaction is epistatic. Moreover, we find phenotype-specific differences in the fraction of accessible mutational paths to maximum phenotypic values. However, even though epistasis affects the accessibility of maximum phenotypic values, the relationships between genotypic and phenotypic change of our analyzed phenotypes are sufficiently smooth that some evolutionary paths remain accessible for most phenotypes, even where epistasis is pervasive. The genotype network representation we use can complement existing approaches to understand the genetic architecture of polygenic traits in many different organisms.
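The network construction described in this abstract (one vertex per genotype, an edge between genotypes that differ in exactly one nucleotide) can be sketched in a few lines of Python. The function names and the toy genotype strings below are hypothetical and serve only to illustrate the definition, not the study's data or pipeline.

```python
from collections import deque
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length genotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def genotype_network(genotypes):
    """Adjacency sets linking genotypes that differ in exactly one nucleotide."""
    nodes = sorted(set(genotypes))
    adjacency = {g: set() for g in nodes}
    for a, b in combinations(nodes, 2):
        if len(a) == len(b) and hamming(a, b) == 1:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def is_connected(adjacency):
    """Breadth-first search check that every vertex is reachable from the first."""
    if not adjacency:
        return True
    start = next(iter(adjacency))
    seen = {start}
    queue = deque([start])
    while queue:
        for neighbour in adjacency[queue.popleft()]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return len(seen) == len(adjacency)


# Toy genotypes at three hypothetical loci.
network = genotype_network(["AAT", "ACT", "ACG", "GGG"])
print({g: sorted(nbs) for g, nbs in network.items()})
print(is_connected(network))
```

Here `GGG` differs from every other genotype at more than one position, so it stays isolated and the toy network is not connected. The pairwise comparison is quadratic in the number of genotypes, which is acceptable for a network built from a few hundred accessions.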
|
28
|
Das SG, Direito SOL, Waclaw B, Allen RJ, Krug J. Predictable properties of fitness landscapes induced by adaptational tradeoffs. eLife 2020; 9:e55155. [PMID: 32423531 PMCID: PMC7297540 DOI: 10.7554/elife.55155] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 01/14/2020] [Accepted: 05/05/2020] [Indexed: 02/06/2023] Open
Abstract
Fitness effects of mutations depend on environmental parameters. For example, mutations that increase fitness of bacteria at high antibiotic concentration often decrease fitness in the absence of antibiotic, exemplifying a tradeoff between adaptation to environmental extremes. We develop a mathematical model for fitness landscapes generated by such tradeoffs, based on experiments that determine the antibiotic dose-response curves of Escherichia coli strains, and previous observations on antibiotic resistance mutations. Our model generates a succession of landscapes with predictable properties as antibiotic concentration is varied. The landscape is nearly smooth at low and high concentrations, but the tradeoff induces a high ruggedness at intermediate antibiotic concentrations. Despite this high ruggedness, however, all the fitness maxima in the landscapes are evolutionarily accessible from the wild type. This implies that selection for antibiotic resistance in multiple mutational steps is relatively facile despite the complexity of the underlying landscape.
Affiliation(s)
- Suman G Das
- Institute for Biological Physics, University of Cologne, Cologne, Germany
| | - Susana OL Direito
- School of Physics and Astronomy, University of Edinburgh, Edinburgh, United Kingdom
| | - Bartlomiej Waclaw
- School of Physics and Astronomy, University of Edinburgh, Edinburgh, United Kingdom
| | - Rosalind J Allen
- School of Physics and Astronomy, University of Edinburgh, Edinburgh, United Kingdom
| | - Joachim Krug
- Institute for Biological Physics, University of Cologne, Cologne, Germany
| |
|
29
|
Reia SM, Campos PRA. Analysis of statistical correlations between properties of adaptive walks in fitness landscapes. R Soc Open Sci 2020; 7:192118. [PMID: 32218986 PMCID: PMC7029893 DOI: 10.1098/rsos.192118] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/03/2019] [Accepted: 01/13/2020] [Indexed: 06/10/2023]
Abstract
The fitness landscape metaphor has been central in our way of thinking about adaptation. In this scenario, adaptive walks are idealized dynamics that mimic the uphill movement of an evolving population towards a fitness peak of the landscape. Recent works in experimental evolution have demonstrated that the constraints imposed by epistasis are responsible for reducing the number of accessible mutational pathways towards fitness peaks. Here, we exhaustively analyse the statistical properties of adaptive walks for two empirical fitness landscapes and theoretical NK landscapes. Some general conclusions can be drawn from our simulation study. Regardless of the dynamics, we observe that the shortest paths are more regularly used. Although the accessibility of a given fitness peak is reasonably correlated to the number of monotonic pathways towards it, the two quantities are not exactly proportional. A negative correlation between predictability and mean path divergence is established, and so the decrease of the number of effective mutational pathways ensures the convergence of the attraction basin of fitness peaks. On the other hand, other features are not conserved among fitness landscapes, such as the relationship between accessibility and predictability.
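The notion of monotonic pathways towards a peak, mentioned in this abstract, can be sketched as follows. This is a minimal illustration, assuming binary genotype strings and counting only direct (shortest) fitness-increasing paths; the function name and the toy fitness values are hypothetical and not taken from the paper.

```python
from functools import lru_cache

def count_accessible_paths(fitness, start, target):
    """Count direct mutational paths from start to target along which
    fitness increases at every step."""
    @lru_cache(maxsize=None)
    def paths(g):
        if g == target:
            return 1
        total = 0
        for i in range(len(g)):
            if g[i] != target[i]:
                # Mutate site i towards the target allele.
                nxt = g[:i] + target[i] + g[i + 1:]
                if fitness[nxt] > fitness[g]:
                    total += paths(nxt)
        return total
    return paths(start)

# Toy landscapes (made-up values): one additive, one with sign epistasis.
additive = {"00": 1.0, "01": 2.0, "10": 2.0, "11": 3.0}
epistatic = {"00": 1.0, "01": 1.2, "10": 0.9, "11": 1.5}
print(count_accessible_paths(additive, "00", "11"))
print(count_accessible_paths(epistatic, "00", "11"))
```

On the additive landscape both orders of mutation are accessible; on the epistatic one, the path through the less fit intermediate is blocked.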
Affiliation(s)
- Sandro M. Reia
- Instituto de Física de São Carlos, Universidade de São Paulo, Caixa Postal 369, 13560-970 São Carlos, São Paulo, Brazil
| | - Paulo R. A. Campos
- Evolutionary Dynamics Lab, Physics Department, Federal University of Pernambuco, Recife, Brazil
| |
|
30
|
The effect of spatiotemporal antibiotic inhomogeneities on the evolution of resistance. J Theor Biol 2019; 486:110077. [PMID: 31715181 DOI: 10.1016/j.jtbi.2019.110077] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 05/08/2019] [Revised: 10/31/2019] [Accepted: 11/09/2019] [Indexed: 12/30/2022]
Abstract
Combating the evolution of widespread antibiotic resistance is one of the most pressing challenges facing modern medicine. Recent research has demonstrated that the evolution of pathogens with high levels of resistance can be accelerated by spatial and temporal inhomogeneities in antibiotic concentration, which frequently arise in patients and the environment. Strategies to predict and counteract the effects of such inhomogeneities will be critical in the fight against resistance. In this paper we develop a mechanistic framework for modelling the adaptive evolution of resistance in the presence of spatiotemporal antibiotic concentrations, which treats the adaptive process as an interaction between two mutually orthogonal forces; the first returns cells to their wild-type state in the absence of antibiotic selection, and the second selects for increased coping ability in the presence of an antibiotic. We apply our model to investigate laboratory adaptation experiments, and then extend it to consider the case in which multiple strategies for resistance undergo competitive evolution.
|
31
|
Nonoyama T, Chiba S. Phenotypic determinism and contingency in the evolution of hypothetical tree-like organisms. PLoS One 2019; 14:e0211671. [PMID: 31671104 PMCID: PMC6822745 DOI: 10.1371/journal.pone.0211671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2019] [Accepted: 10/01/2019] [Indexed: 11/19/2022] Open
Abstract
Whether evolutionary history is mostly contingent or deterministic has received much attention in evolutionary biology. Studies addressing this issue have been conducted theoretically, based on models, and experimentally, based on microcosms. It has been argued that the shape of the adaptive landscape and mutation rate are major determinants of replicated phenotypic evolution. In the present study, to incorporate the effects of phenotypic plasticity, we constructed a model using tree-like organisms. In this model, the basic rules used to develop trees are genetically determined, but tree shape (described by the number and aspect ratio of the branches) is determined by both genetic components and plasticity. The results of the simulation show that tree shapes became more deterministic under higher mutation rates, whereas tree shape was most contingent and diverse at the lower mutation rate. In the latter situation, the variances of the genetically determined characters were low, but the variance of the tree shape was rather high, suggesting that phenotypic plasticity produces this contingency and diversity of tree shape. The present findings suggest that plasticity cannot be ignored as a factor that increases contingency and diversity of evolutionary outcomes.
Affiliation(s)
- Tomonobu Nonoyama
- Graduate School of Life Sciences, Tohoku University, Katahira, Aoba-ku, Sendai, Japan
| | - Satoshi Chiba
- Graduate School of Life Sciences, Tohoku University, Katahira, Aoba-ku, Sendai, Japan
- Center for Northeast Asian Studies, Tohoku University, Kawauchi, Aoba-ku, Sendai, Japan
| |
|
32
|
Abstract
We consider evolution of a large population, where fitness of each organism is defined by many phenotypical traits. These traits result from expression of many genes. Under some assumptions on fitness we prove that such model organisms are capable, to some extent, of recognizing the fitness landscape. This fitness landscape learning sharply reduces the number of mutations needed for adaptation. Moreover, this learning increases phenotype robustness with respect to mutations, i.e., canalizes the phenotype. We show that learning and canalization work only when evolution is gradual. Organisms can be adapted to many constraints associated with a hard environment, if that environment becomes harder step by step. Our results explain why evolution can involve genetic changes of a relatively large effect and why the total number of changes is surprisingly small.
Affiliation(s)
- John Reinitz
- Departments of Statistics, Ecology and Evolution, Molecular Genetics and Cell Biology, University of Chicago, Chicago, IL, USA
| | - Sergey Vakulenko
- Saint Petersburg National Research University of Information Technologies, Mechanics and Optics, Saint Petersburg, Russian Federation
| | - Dmitri Grigoriev
- CNRS, Mathématiques, Université de Lille, Villeneuve d'Ascq, France
| | - Andreas Weber
- Department of Computer Science, University of Bonn, Bonn, Germany
| |
|
33
|
Klug A, Park SC, Krug J. Recombination and mutational robustness in neutral fitness landscapes. PLoS Comput Biol 2019; 15:e1006884. [PMID: 31415555 PMCID: PMC6711544 DOI: 10.1371/journal.pcbi.1006884] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 02/18/2019] [Revised: 08/27/2019] [Accepted: 07/09/2019] [Indexed: 11/19/2022] Open
Abstract
Mutational robustness quantifies the effect of random mutations on fitness. When mutational robustness is high, most mutations do not change fitness or have only a minor effect on it. From the point of view of fitness landscapes, robust genotypes form neutral networks of almost equal fitness. Using deterministic population models it has been shown that selection favors genotypes inside such networks, which results in increased mutational robustness. Here we demonstrate that this effect is massively enhanced by recombination. Our results are based on a detailed analysis of mesa-shaped fitness landscapes, where we derive precise expressions for the dependence of the robustness on the landscape parameters for recombining and non-recombining populations. In addition, we carry out numerical simulations on different types of random holey landscapes as well as on an empirical fitness landscape. We show that the mutational robustness of a genotype generally correlates with its recombination weight, a new measure that quantifies the likelihood for the genotype to arise from recombination. We argue that the favorable effect of recombination on mutational robustness is a highly universal feature that may have played an important role in the emergence and maintenance of mechanisms of genetic exchange.
Affiliation(s)
- Alexander Klug
- Institute for Biological Physics, University of Cologne, Cologne, Germany
| | - Su-Chan Park
- Department of Physics, The Catholic University of Korea, Bucheon, Republic of Korea
| | - Joachim Krug
- Institute for Biological Physics, University of Cologne, Cologne, Germany
| |
|
34
|
Diaz-Uriarte R, Vasallo C. Every which way? On predicting tumor evolution using cancer progression models. PLoS Comput Biol 2019; 15:e1007246. [PMID: 31374072 PMCID: PMC6693785 DOI: 10.1371/journal.pcbi.1007246] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 11/28/2018] [Revised: 08/14/2019] [Accepted: 07/05/2019] [Indexed: 11/18/2022] Open
Abstract
Successful prediction of the likely paths of tumor progression is valuable for diagnostic, prognostic, and treatment purposes. Cancer progression models (CPMs) use cross-sectional samples to identify restrictions in the order of accumulation of driver mutations and thus CPMs encode the paths of tumor progression. Here we analyze the performance of four CPMs to examine whether they can be used to predict the true distribution of paths of tumor progression and to estimate evolutionary unpredictability. Employing simulations we show that if fitness landscapes are single peaked (have a single fitness maximum) there is good agreement between true and predicted distributions of paths of tumor progression when sample sizes are large, but performance is poor with the currently common much smaller sample sizes. Under multi-peaked fitness landscapes (i.e., those with multiple fitness maxima), performance is poor and improves only slightly with sample size. In all cases, detection regime (when tumors are sampled) is a key determinant of performance. Estimates of evolutionary unpredictability from the best performing CPM, among the four examined, tend to overestimate the true unpredictability and the bias is affected by detection regime; CPMs could be useful for estimating upper bounds to the true evolutionary unpredictability. Analysis of twenty-two cancer data sets shows low evolutionary unpredictability for several of the data sets. But most of the predictions of paths of tumor progression are very unreliable, and unreliability increases with the number of features analyzed. Our results indicate that CPMs could be valuable tools for predicting cancer progression but that, currently, obtaining useful predictions of paths of tumor progression from CPMs is dubious, and emphasize the need for methodological work that can account for the probably multi-peaked fitness landscapes in cancer.
Affiliation(s)
- Ramon Diaz-Uriarte
- Department of Biochemistry, Universidad Autónoma de Madrid, Madrid, Spain
- Instituto de Investigaciones Biomédicas “Alberto Sols” (UAM-CSIC), Madrid, Spain
| | - Claudia Vasallo
- Department of Biochemistry, Universidad Autónoma de Madrid, Madrid, Spain
- Instituto de Investigaciones Biomédicas “Alberto Sols” (UAM-CSIC), Madrid, Spain
| |
|
35
|
Accessibility percolation on random rooted labeled trees. J Appl Probab 2019. [DOI: 10.1017/jpr.2019.29] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Abstract
The accessibility percolation model is investigated on random rooted labeled trees. More precisely, the number of accessible leaves (i.e. increasing paths) Zn and the number of accessible vertices Cn in a random rooted labeled tree of size n are jointly considered in this work. As n → ∞, we prove that (Zn, Cn) converges in distribution to a random vector whose probability generating function is given in an explicit form. In particular, we obtain that the asymptotic distributions of Zn + 1 and Cn are geometric distributions with parameters e/(1 + e) and 1/e, respectively. Much of our analysis is performed in the context of local weak convergence of random rooted labeled trees.
|
36
|
Tendler A, Zimmer A, Mayo A, Alon U. Noise-precision tradeoff in predicting combinations of mutations and drugs. PLoS Comput Biol 2019; 15:e1006956. [PMID: 31116755 PMCID: PMC6548401 DOI: 10.1371/journal.pcbi.1006956] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 10/07/2018] [Revised: 06/04/2019] [Accepted: 03/18/2019] [Indexed: 02/06/2023] Open
Abstract
Many biological problems involve the response to multiple perturbations. Examples include response to combinations of many drugs, and the effects of combinations of many mutations. Such problems have an exponentially large space of combinations, which makes it infeasible to cover the entire space experimentally. To overcome this problem, several formulae that predict the effect of drug combinations or fitness landscape values have been proposed. These formulae use the effects of single perturbations and pairs of perturbations to predict triplets and higher order combinations. Interestingly, different formulae perform best on different datasets. Here we use Pareto optimality theory to quantitatively explain why no formula is optimal for all datasets, due to an inherent bias-variance (noise-precision) tradeoff. We calculate the Pareto front of log-linear formulae and find that the optimal formula depends on properties of the dataset: the typical interaction strength and the experimental noise. This study provides an approach to choose a suitable prediction formula for a given dataset, in order to best overcome the combinatorial explosion problem.
Sometimes a combination of drugs works much better than each drug alone. Finding such drug cocktails is a pressing challenge in order to combat drug resistance and to improve drug effects. However, it is impossible to test all combinations of multiple drugs experimentally. Therefore, researchers are looking for computational rather than experimental approaches to overcome this problem. One approach is to measure the effect of a few drugs and plug it into a formula that predicts the effect of many drugs together. Existing prediction formulae typically perform best on the dataset that they were developed on, but less well on other datasets. Here we explain this observation and give a guide for the choice of an optimal prediction formula for a given dataset. The optimal formula depends on two main properties of the dataset: 1) the interaction strength between the drugs and 2) the experimental noise in the data. This study may help researchers discover effective combinations of multiple drugs and multiple perturbations in general.
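As an illustration of what a log-linear prediction formula built from singles and pairs looks like, the sketch below uses the standard second-order truncation, log f_123 = Σ log f_ij − Σ log f_i, with the unperturbed effect normalized to 1. This is only one member of the log-linear family and is an assumption chosen for illustration, not necessarily a formula the paper evaluates or recommends; the function name and the data values are hypothetical.

```python
import math

def predict_triple(singles, pairs):
    """Predict the effect of a triple combination from single and pair
    effects: log f_123 = sum(log f_ij) - sum(log f_i).
    Assumes all effects are positive and the unperturbed effect is 1."""
    log_f = (sum(math.log(v) for v in pairs.values())
             - sum(math.log(v) for v in singles.values()))
    return math.exp(log_f)

# Made-up effects: pairs equal the products of singles (no interaction).
singles = {1: 0.5, 2: 0.8, 3: 0.4}
pairs = {(1, 2): 0.4, (1, 3): 0.2, (2, 3): 0.32}
prediction = predict_triple(singles, pairs)
print(round(prediction, 6))
```

With no pairwise interaction, the prediction reduces to the product of the three single effects, which is a quick sanity check on any formula of this type.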
Affiliation(s)
- Avichai Tendler
- Dept. Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
| | - Anat Zimmer
- Dept. Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
| | - Avi Mayo
- Dept. Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
| | - Uri Alon
- Dept. Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
| |
|
38
|
Computational Complexity as an Ultimate Constraint on Evolution. Genetics 2019; 212:245-265. [PMID: 30833289 DOI: 10.1534/genetics.119.302000] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 07/24/2018] [Accepted: 02/22/2019] [Indexed: 01/28/2023] Open
Abstract
Experiments show that evolutionary fitness landscapes can have a rich combinatorial structure due to epistasis. For some landscapes, this structure can produce a computational constraint that prevents evolution from finding local fitness optima, thus overturning the traditional assumption that local fitness peaks can always be reached quickly if no other evolutionary forces challenge natural selection. Here, I introduce a distinction between easy landscapes of traditional theory where local fitness peaks can be found in a moderate number of steps, and hard landscapes where finding local optima requires an infeasible amount of time. Hard examples exist even among landscapes with no reciprocal sign epistasis; on these semismooth fitness landscapes, strong selection weak mutation dynamics cannot find the unique peak in polynomial time. More generally, on hard rugged fitness landscapes that include reciprocal sign epistasis, no evolutionary dynamics, even ones that do not follow adaptive paths, can find a local fitness optimum quickly. Moreover, on hard landscapes, the fitness advantage of nearby mutants cannot drop off exponentially fast but has to follow a power-law that long-term evolution experiments have associated with unbounded growth in fitness. Thus, the constraint of computational complexity enables open-ended evolution on finite landscapes. Knowing this constraint allows us to use the tools of theoretical computer science and combinatorial optimization to characterize the fitness landscapes that we expect to see in nature. I present candidates for hard landscapes at scales from single genes, to microbes, to complex organisms with costly learning (Baldwin effect) or maintained cooperation (Hankshaw effect). Just how ubiquitous hard landscapes (and the corresponding ultimate constraint on evolution) are in nature becomes an open empirical question.
|
39
|
Blanco C, Janzen E, Pressman A, Saha R, Chen IA. Molecular Fitness Landscapes from High-Coverage Sequence Profiling. Annu Rev Biophys 2019; 48:1-18. [PMID: 30601678 DOI: 10.1146/annurev-biophys-052118-115333] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Indexed: 11/09/2022]
Abstract
The function of fitness (or molecular activity) in the space of all possible sequences is known as the fitness landscape. Evolution is a random walk on the fitness landscape, with a bias toward climbing hills. Mapping the topography of real fitness landscapes is fundamental to understanding evolution, but previous efforts were hampered by the difficulty of obtaining large, quantitative data sets. The accessibility of high-throughput sequencing (HTS) has transformed this study, enabling large-scale enumeration of fitness for many mutants and even complete sequence spaces in some cases. We review the progress of high-throughput studies in mapping molecular fitness landscapes, both in vitro and in vivo, as well as opportunities for future research. Such studies are rapidly growing in number. HTS is expected to have a profound effect on the understanding of real molecular fitness landscapes.
Affiliation(s)
- Celia Blanco
- Department of Chemistry and Biochemistry, University of California, Santa Barbara, California 93106, USA
| | - Evan Janzen
- Department of Chemistry and Biochemistry, University of California, Santa Barbara, California 93106, USA
- Biomolecular Science and Engineering Program, University of California, Santa Barbara, California 93106, USA
| | - Abe Pressman
- Department of Chemistry and Biochemistry, University of California, Santa Barbara, California 93106, USA
- Department of Chemical Engineering, University of California, Santa Barbara, California 93106, USA
| | - Ranajay Saha
- Department of Chemistry and Biochemistry, University of California, Santa Barbara, California 93106, USA
| | - Irene A Chen
- Biomolecular Science and Engineering Program, University of California, Santa Barbara, California 93106, USA
| |
|
40
|
Fragata I, Blanckaert A, Dias Louro MA, Liberles DA, Bank C. Evolution in the light of fitness landscape theory. Trends Ecol Evol 2019; 34:69-82. [DOI: 10.1016/j.tree.2018.10.009] [Citation(s) in RCA: 84] [Impact Index Per Article: 16.8] [Received: 06/07/2018] [Revised: 10/16/2018] [Accepted: 10/17/2018] [Indexed: 01/28/2023]
|
41
|
Diaz-Uriarte R. Cancer progression models and fitness landscapes: a many-to-many relationship. Bioinformatics 2018; 34:836-844. [PMID: 29048486 PMCID: PMC6031050 DOI: 10.1093/bioinformatics/btx663] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 08/04/2017] [Accepted: 10/17/2017] [Indexed: 11/13/2022] Open
Abstract
Motivation: The identification of constraints, due to gene interactions, in the order of accumulation of mutations during cancer progression can allow us to single out therapeutic targets. Cancer progression models (CPMs) use genotype frequency data from cross-sectional samples to identify these constraints, and return Directed Acyclic Graphs (DAGs) of restrictions where arrows indicate dependencies or constraints. On the other hand, fitness landscapes, which map genotypes to fitness, contain all possible paths of tumor progression. Thus, we expect a correspondence between DAGs from CPMs and the fitness landscapes where evolution happened. But many fitness landscapes (e.g. those with reciprocal sign epistasis) cannot be represented by CPMs.
Results: Using simulated data under 500 fitness landscapes, I show that CPMs' performance (prediction of genotypes that can exist) degrades with reciprocal sign epistasis. There is large variability in the DAGs inferred from each landscape, which is also affected by mutation rate, detection regime and fitness landscape features, in ways that depend on CPM method. Using three cancer datasets, I show that these problems strongly affect the analysis of empirical data: fitness landscapes that are widely different from each other produce data similar to the empirically observed ones and lead to DAGs that infer very different restrictions. Because reciprocal sign epistasis can be common in cancer, these results question the use and interpretation of CPMs.
Availability and implementation: Code available from Supplementary Material.
Contact: ramon.diaz@iib.uam.es.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ramon Diaz-Uriarte
- Department of Biochemistry, Universidad Autónoma de Madrid, Instituto de Investigaciones Biomédicas "Alberto Sols" (UAM-CSIC), Madrid 28029, Spain
| |
|
42
|
Kundu M, Mukherjee S, Biswas S. Record-breaking statistics near second-order phase transitions. Phys Rev E 2018; 98:022103. [PMID: 30253517 DOI: 10.1103/physreve.98.022103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2018] [Indexed: 06/08/2023]
Abstract
When a quantity reaches a value higher (or lower) than its value at any time before, it is said to have made a record. We numerically study the statistical properties of records in the time series of order parameters in different models near their critical points. Specifically, we choose the transversely driven Edwards-Wilkinson model for interface depinning in (1+1) dimensions and the Ising model in two dimensions, as paradigmatic and simple examples of nonequilibrium and equilibrium critical behaviors, respectively. The total number of record-breaking events in the time series of the order parameters of the models show maxima when the system is near criticality. The number of record-breaking events and associated quantities, such as the distribution of the waiting time between successive record events, show power-law scaling near the critical point. The exponent values are specific to the universality classes of the respective models. Such behaviors near criticality can be used as a precursor to imminent criticality, i.e., abrupt and catastrophic changes in the system. Due to the extreme nature of the records, its measurements are relatively free of detection errors and thus provide a clear signal regarding the state of the system in which they are measured.
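The definition of a record given at the start of this abstract translates directly into code. The sketch below is a minimal illustration for upper records in a generic time series; the function names and the sample series are hypothetical, and it does not reproduce the paper's simulations of the Edwards-Wilkinson or Ising models.

```python
def record_times(series):
    """Indices at which the series exceeds every earlier value."""
    records = []
    best = float("-inf")
    for t, x in enumerate(series):
        if x > best:  # strict inequality: ties do not count as records
            records.append(t)
            best = x
    return records

def waiting_times(records):
    """Gaps between successive record events."""
    return [b - a for a, b in zip(records, records[1:])]

# Made-up time series for illustration.
series = [0.2, 0.1, 0.5, 0.4, 0.5, 0.9, 0.3]
recs = record_times(series)
print(recs, waiting_times(recs))
```

The number of records is `len(recs)`; the abstract's scaling statements concern how this count and the waiting-time distribution behave near criticality.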
Affiliation(s)
- Mily Kundu
- Condensed Matter Physics Division, Saha Institute of Nuclear Physics, 1/AF Bidhannagar, Kolkata 700064, India
| | - Sudip Mukherjee
- Condensed Matter Physics Division, Saha Institute of Nuclear Physics, 1/AF Bidhannagar, Kolkata 700064, India
- Barasat Government College, Barasat, Kolkata 700124, India
| | - Soumyajyoti Biswas
- Max Planck Institute for Dynamics and Self-Organization, Am Fassberg 17, 37077 Göttingen, Germany
| |
|
43
|
Evolutionary constraints in fitness landscapes. Heredity (Edinb) 2018; 121:466-481. [PMID: 29993041 DOI: 10.1038/s41437-018-0110-1] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 01/29/2018] [Revised: 06/01/2018] [Accepted: 06/03/2018] [Indexed: 12/29/2022] Open
Abstract
In recent years, several genotypic fitness landscapes (combinations of a small number of mutations) have been experimentally resolved. To learn about the general properties of "real" fitness landscapes, it is key to characterize these experimental landscapes via simple measures of their structure, related to evolutionary features. Some of the most relevant measures are based on the selectively accessible paths and their properties. In this paper, we present some measures of evolutionary constraints based on (i) the similarity between accessible paths and (ii) the abundance and characteristics of "chains" of obligatory mutations, that is, paths going through genotypes with a single fitter neighbor. These measures have a clear evolutionary interpretation. Furthermore, we show that chains are only weakly correlated to classical measures of epistasis. In fact, some of these measures of constraint are non-monotonic in the amount of epistatic interactions, but have instead a maximum for intermediate values. Finally, we show how these measures shed light on evolutionary constraints and predictability in experimentally resolved landscapes.
|
44
|
Abstract
Stochastic phenotype switching has been suggested to play a beneficial role in microbial populations by leading to the division of labour among cells, or ensuring that at least some of the population survives an unexpected change in environmental conditions. Here we use a computational model to investigate an alternative possible function of stochastic phenotype switching: as a way to adapt more quickly even in a static environment. We show that when a genetic mutation causes a population to become less fit, switching to an alternative phenotype with higher fitness (growth rate) may give the population enough time to develop compensatory mutations that increase the fitness again. The possibility of switching phenotypes can reduce the time to adaptation by orders of magnitude if the “fitness valley” caused by the deleterious mutation is deep enough. Our work has important implications for the emergence of antibiotic-resistant bacteria. In line with recent experimental findings, we hypothesise that switching to a slower growing — but less sensitive — phenotype helps bacteria to develop resistance by providing alternative, faster evolutionary routes to resistance.
|
45
|
Obolski U, Ram Y, Hadany L. Key issues review: evolution on rugged adaptive landscapes. Rep Prog Phys 2018; 81:012602. [PMID: 29051394 DOI: 10.1088/1361-6633/aa94d4] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 06/07/2023]
Abstract
Adaptive landscapes represent a mapping between genotype and fitness. Rugged adaptive landscapes contain two or more adaptive peaks: allele combinations with higher fitness than any of their neighbors in the genetic space. How do populations evolve on such rugged landscapes? Evolutionary biologists have struggled with this question since it was first introduced in the 1930s by Sewall Wright. Discoveries in the fields of genetics and biochemistry inspired various mathematical models of adaptive landscapes. The development of landscape models led to numerous theoretical studies analyzing evolution on rugged landscapes under different biological conditions. The large body of theoretical work suggests that adaptive landscapes are major determinants of the progress and outcome of evolutionary processes. Recent technological advances in molecular biology and microbiology allow experimenters to measure adaptive values of large sets of allele combinations and construct empirical adaptive landscapes for the first time. Such empirical landscapes have already been generated in bacteria, yeast, viruses, and fungi, and are contributing to new insights about evolution on adaptive landscapes. In this Key Issues Review we will: (i) introduce the concept of adaptive landscapes; (ii) review the major theoretical studies of evolution on rugged landscapes; (iii) review some of the recently obtained empirical adaptive landscapes; (iv) discuss recent mathematical and statistical analyses motivated by empirical adaptive landscapes, as well as provide the reader with instructions and source code to implement simulations of evolution on adaptive landscapes; and (v) discuss possible future directions for this exciting field.
|
46
|
Crona K, Gavryushkin A, Greene D, Beerenwinkel N. Inferring genetic interactions from comparative fitness data. eLife 2017; 6. [PMID: 29260711 PMCID: PMC5737811 DOI: 10.7554/elife.28629] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 05/15/2017] [Accepted: 11/21/2017] [Indexed: 01/13/2023] Open
Abstract
Darwinian fitness is a central concept in evolutionary biology. In practice, however, it is hardly possible to measure fitness for all genotypes in a natural population. Here, we present quantitative tools to make inferences about epistatic gene interactions when the fitness landscape is only incompletely determined due to imprecise measurements or missing observations. We demonstrate that genetic interactions can often be inferred from fitness rank orders, where all genotypes are ordered according to fitness, and even from partial fitness orders. We provide a complete characterization of rank orders that imply higher order epistasis. Our theory applies to all common types of gene interactions and facilitates comprehensive investigations of diverse genetic interactions. We analyzed various genetic systems comprising HIV-1, the malaria-causing parasite Plasmodium vivax, the fungus Aspergillus niger, and the TEM-family of β-lactamase associated with antibiotic resistance. For all systems, our approach revealed higher order interactions among mutations.
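To make the rank-order idea concrete, here is a simplified Python sketch for the two-locus case only. It is an illustration under stated assumptions, not the authors' software: epistasis is taken as u = w00 + w11 - w01 - w10, ties in the ranking are assumed absent, and the function name is hypothetical. The check walks the ranking from fittest to least fit and asks whether every negative-sign genotype can be paired with a fitter positive-sign genotype (or vice versa); if so, the sign of u follows from the order alone.

```python
def epistasis_sign_from_rank(order):
    """Infer the sign of u = w00 + w11 - w01 - w10 from a strict rank order.

    `order` lists the four genotypes ("00", "01", "10", "11") from fittest
    to least fit. Returns +1 or -1 when the order implies the sign of u,
    and 0 when the order alone leaves the sign undetermined.
    """
    # Genotypes with an even number of mutations enter u with a plus sign.
    signs = [1 if g.count("1") % 2 == 0 else -1 for g in order]

    def prefix_sums_nonnegative(seq):
        running = 0
        for s in seq:
            running += s
            if running < 0:
                return False
        return True

    if prefix_sums_nonnegative(signs):
        return 1
    if prefix_sums_nonnegative([-s for s in signs]):
        return -1
    return 0
```

For example, the order `["00", "01", "11", "10"]` gives w00 > w01 and w11 > w10, so u must be positive, whereas `["00", "01", "10", "11"]` is compatible with either sign.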
Affiliation(s)
- Kristina Crona
- Department of Mathematics and Statistics, American University, Washington, DC, United States
- Alex Gavryushkin
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Devin Greene
- Department of Mathematics and Statistics, American University, Washington, DC, United States
- Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Basel, Switzerland
|
47
|
Miller CR, Van Leuven JT, Wichman HA, Joyce P. Selecting among three basic fitness landscape models: Additive, multiplicative and stickbreaking. Theor Popul Biol 2017; 122:97-109. [PMID: 29198859 DOI: 10.1016/j.tpb.2017.10.006] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 03/08/2017] [Revised: 10/26/2017] [Accepted: 10/27/2017] [Indexed: 10/18/2022]
Abstract
Fitness landscapes map genotypes to organismal fitness. Their topographies depend on how mutational effects interact (epistasis) and are important for understanding evolutionary processes such as speciation, the rate of adaptation, the advantage of recombination, and the predictability versus stochasticity of evolution. The growing amount of data has made it possible to better test landscape models empirically. We argue that this endeavor will benefit from the development and use of meaningful basic models against which to compare more complex models. Here we develop statistical and computational methods for fitting fitness data from mutation combinatorial networks to three simple models: additive, multiplicative and stickbreaking. We employ a Bayesian framework for doing model selection. Using simulations, we demonstrate that our methods work and we explore their statistical performance: bias, error, and the power to discriminate among models. We then illustrate our approach and its flexibility by analyzing several previously published datasets. An R-package that implements our methods is available in the CRAN repository under the name Stickbreaker.
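The three model forms named in this abstract can be sketched in a few lines of Python. This is not the Stickbreaker R package and does not reproduce its fitting or model-selection procedure; the function names and parameterization are assumptions made for illustration. In particular, the stickbreaking form below assumes that each mutation closes a fixed fraction of the remaining distance between the current fitness and an upper boundary located `d` above the wild type.

```python
from math import prod


def additive(w_wt, effects):
    """Each mutation adds a fixed amount to wild-type fitness."""
    return w_wt + sum(effects)


def multiplicative(w_wt, coefficients):
    """Each mutation scales fitness by (1 + s_i)."""
    return w_wt * prod(1 + s for s in coefficients)


def stickbreaking(w_wt, d, fractions):
    """Each mutation closes a fraction u_i of the remaining gap to the boundary.

    `d` is the assumed distance from wild-type fitness to the boundary.
    """
    return w_wt + d * (1 - prod(1 - u for u in fractions))


# Hypothetical parameters for a double mutant under each model.
w_add = additive(1.0, [0.1, 0.2])
w_mult = multiplicative(1.0, [0.1, 0.2])
w_stick = stickbreaking(1.0, 1.0, [0.5, 0.5])
```

Under the stickbreaking form the second mutation gains less than the first, because less of the gap remains to be closed; under the additive form the two gains are independent of order and background.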
Affiliation(s)
- Craig R Miller
- Center for Modeling Complex Interactions, University of Idaho, Moscow, ID 84844, United States; Department of Biological Sciences, University of Idaho, Moscow, ID 83844, United States; Department of Mathematics, University of Idaho, Moscow, ID 83844, United States
- James T Van Leuven
- Center for Modeling Complex Interactions, University of Idaho, Moscow, ID 84844, United States
- Holly A Wichman
- Center for Modeling Complex Interactions, University of Idaho, Moscow, ID 84844, United States; Department of Biological Sciences, University of Idaho, Moscow, ID 83844, United States
- Paul Joyce
- Department of Mathematics, University of Idaho, Moscow, ID 83844, United States
|
48
|
Kumar A, Natarajan C, Moriyama H, Witt CC, Weber RE, Fago A, Storz JF. Stability-Mediated Epistasis Restricts Accessible Mutational Pathways in the Functional Evolution of Avian Hemoglobin. Mol Biol Evol 2017; 34:1240-1251. [PMID: 28201714 PMCID: PMC5400398 DOI: 10.1093/molbev/msx085] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Indexed: 01/11/2023] Open
Abstract
If the fitness effects of amino acid mutations are conditional on genetic background, then mutations can have different effects depending on the sequential order in which they occur during evolutionary transitions in protein function. A key question concerns the fraction of possible mutational pathways connecting alternative functional states that involve transient reductions in fitness. Here we examine the functional effects of multiple amino acid substitutions that contributed to an evolutionary transition in the oxygenation properties of avian hemoglobin (Hb). The set of causative changes included mutations at intradimer interfaces of the Hb tetramer. Replacements at such sites may be especially likely to have epistatic effects on Hb function since residues at intersubunit interfaces are enmeshed in networks of salt bridges and hydrogen bonds between like and unlike subunits; mutational reconfigurations of these atomic contacts can affect allosteric transitions in quaternary structure and the propensity for tetramer-dimer dissociation. We used ancestral protein resurrection in conjunction with a combinatorial protein engineering approach to synthesize genotypes representing the complete set of mutational intermediates in all possible forward pathways that connect functionally distinct ancestral and descendent genotypes. The experiments revealed that 1/2 of all possible forward pathways included mutational intermediates with aberrant functional properties because particular combinations of mutations promoted tetramer-dimer dissociation. The subset of mutational pathways with unstable intermediates may be selectively inaccessible, representing evolutionary roads not taken. The experimental results also demonstrate how epistasis for particular functional properties of proteins may be mediated indirectly by mutational effects on quaternary structural stability.
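The pathway analysis described in this abstract can be illustrated with a short Python sketch that enumerates every forward pathway between an ancestral genotype (all 0) and a derived genotype (all 1) and reports the fraction that pass through a flagged intermediate. The function names, the three-site example, and the set of aberrant intermediates are hypothetical; they are not the genotypes or measurements from the study.

```python
from itertools import permutations


def forward_pathways(n_sites):
    """Yield each forward pathway as a list of genotypes (tuples of 0/1).

    A pathway fixes the n_sites substitutions one at a time, in some order,
    without reversions, so there are n_sites! pathways in total.
    """
    for order in permutations(range(n_sites)):
        genotype = [0] * n_sites
        path = [tuple(genotype)]
        for site in order:
            genotype[site] = 1
            path.append(tuple(genotype))
        yield path


def fraction_blocked(n_sites, is_aberrant):
    """Fraction of forward pathways containing an aberrant intermediate."""
    paths = list(forward_pathways(n_sites))
    blocked = sum(
        any(is_aberrant(g) for g in path[1:-1])  # skip the two endpoints
        for path in paths
    )
    return blocked / len(paths)


# Hypothetical example: two intermediates are flagged as unstable.
unstable = {(1, 0, 0), (1, 1, 0)}
share = fraction_blocked(3, lambda g: g in unstable)
```

In this toy case three of the six pathways are blocked; the value depends entirely on which intermediates are flagged.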
Affiliation(s)
- Amit Kumar
- School of Biological Sciences, University of Nebraska, Lincoln, NE
- Hideaki Moriyama
- School of Biological Sciences, University of Nebraska, Lincoln, NE
- Christopher C. Witt
- Department of Biology, University of New Mexico, Albuquerque, NM
- Museum of Southwestern Biology, University of New Mexico, Albuquerque, NM
- Roy E. Weber
- Zoophysiology, Department of Bioscience, Aarhus University, Aarhus, Denmark
- Angela Fago
- Zoophysiology, Department of Bioscience, Aarhus University, Aarhus, Denmark
- Jay F. Storz
- School of Biological Sciences, University of Nebraska, Lincoln, NE
|
49
|
Genotypic Complexity of Fisher's Geometric Model. Genetics 2017; 206:1049-1079. [PMID: 28450460 DOI: 10.1534/genetics.116.199497] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 12/22/2016] [Accepted: 04/15/2017] [Indexed: 01/30/2023] Open
Abstract
Fisher's geometric model was originally introduced to argue that complex adaptations must occur in small steps because of pleiotropic constraints. When supplemented with the assumption of additivity of mutational effects on phenotypic traits, it provides a simple mechanism for the emergence of genotypic epistasis from the nonlinear mapping of phenotypes to fitness. Of particular interest is the occurrence of reciprocal sign epistasis, which is a necessary condition for multipeaked genotypic fitness landscapes. Here we compute the probability that a pair of randomly chosen mutations interacts sign epistatically, which is found to decrease with increasing phenotypic dimension n, and varies nonmonotonically with the distance from the phenotypic optimum. We then derive expressions for the mean number of fitness maxima in genotypic landscapes comprised of all combinations of L random mutations. This number increases exponentially with L, and the corresponding growth rate is used as a measure of the complexity of the landscape. The dependence of the complexity on the model parameters is found to be surprisingly rich, and three distinct phases characterized by different landscape structures are identified. Our analysis shows that the phenotypic dimension, which is often referred to as phenotypic complexity, does not generally correlate with the complexity of fitness landscapes and that even organisms with a single phenotypic trait can have complex landscapes. Our results further inform the interpretation of experiments where the parameters of Fisher's model have been inferred from data, and help to elucidate which features of empirical fitness landscapes can be described by this model.
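A small Python sketch can show how additive phenotypic effects produce reciprocal sign epistasis under a nonlinear phenotype-to-fitness map, even with a single trait. The Gaussian fitness function, the function names, and the numerical values are assumptions chosen for illustration; they are not taken from the paper's derivations.

```python
import math


def fitness(z):
    """Assumed Gaussian fitness with the optimum at the origin."""
    return math.exp(-sum(x * x for x in z))


def two_mutation_fitnesses(z_wt, m1, m2):
    """Return (w00, w10, w01, w11) for mutations that add in phenotype space."""
    def shift(z, m):
        return tuple(a + b for a, b in zip(z, m))

    return (
        fitness(z_wt),
        fitness(shift(z_wt, m1)),
        fitness(shift(z_wt, m2)),
        fitness(shift(shift(z_wt, m1), m2)),
    )


def reciprocal_sign_epistasis(w00, w10, w01, w11):
    """True if each mutation's effect changes sign with the other's presence."""
    flips_1 = (w10 - w00) * (w11 - w01) < 0
    flips_2 = (w01 - w00) * (w11 - w10) < 0
    return flips_1 and flips_2


# Hypothetical one-trait example: each mutation moves the phenotype from -1
# to 0.5 (closer to the optimum), but together they overshoot to 2.
ws = two_mutation_fitnesses((-1.0,), (1.5,), (1.5,))
overshoot = reciprocal_sign_epistasis(*ws)
```

Here each mutation is beneficial on the wild-type background and deleterious on the background of the other, which is the pattern the abstract identifies as necessary for multipeaked landscapes.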
|
50
|
Aliakbari A, Manshour P, Salehi MJ. Records in fractal stochastic processes. Chaos (Woodbury, N.Y.) 2017; 27:033116. [PMID: 28364750 DOI: 10.1063/1.4979348] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/07/2023]
Abstract
Record statistics in stationary and non-stationary fractal time series are studied extensively. We compute several quantities from record dynamics. For stationary fractional Gaussian noises, we observe universal behavior across the whole range of Hurst exponents. For non-stationary fractional Brownian motions, however, the record dynamics depend crucially on the memory, which here plays the role of a non-stationarity index: the deviation from the stationary-case results grows as the Hurst exponent increases. We demonstrate that memory governs the dynamics of the records as long as it causes non-stationarity in fractal stochastic processes; otherwise, it has no impact on the record statistics.
Affiliation(s)
- A Aliakbari
- Department of Physics, Faculty of Sciences, Persian Gulf University, 75169 Bushehr, Iran
- P Manshour
- Department of Physics, Faculty of Sciences, Persian Gulf University, 75169 Bushehr, Iran
- M J Salehi
- Department of Physics, Faculty of Sciences, Persian Gulf University, 75169 Bushehr, Iran
|