51
Chan YH, Venev SV, Zeldovich KB, Matthews CR. Correlation of fitness landscapes from three orthologous TIM barrels originates from sequence and structure constraints. Nat Commun 2017; 8:14614. [PMID: 28262665 PMCID: PMC5343507 DOI: 10.1038/ncomms14614] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 09/06/2016] [Accepted: 01/11/2017] [Indexed: 02/07/2023] Open
Abstract
Sequence divergence of orthologous proteins enables adaptation to environmental stresses and promotes evolution of novel functions. Limits on evolution imposed by constraints on sequence and structure were explored using a model TIM barrel protein, indole-3-glycerol phosphate synthase (IGPS). Fitness effects of point mutations in three phylogenetically divergent IGPS proteins during adaptation to temperature stress were probed by auxotrophic complementation of yeast with prokaryotic, thermophilic IGPS. Analysis of beneficial mutations pointed to an unexpected, long-range allosteric pathway towards the active site of the protein. Significant correlations between the fitness landscapes of distant orthologues implicate both sequence and structure as primary forces in defining the TIM barrel fitness landscape and suggest that fitness landscapes can be translocated in sequence space. Exploration of fitness landscapes in the context of a protein fold provides a strategy for elucidating the sequence-structure-fitness relationships in other common motifs. The TIM barrel fold is an evolutionarily conserved motif found in proteins with a variety of enzymatic functions. Here the authors explore the fitness landscape of the TIM barrel protein IGPS and uncover evolutionary constraints on both sequence and structure, accompanied by long range allosteric interactions.
Affiliation(s)
- Yvonne H Chan
  - Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, 364 Plantation Street, Worcester, Massachusetts 01605, USA
- Sergey V Venev
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 368 Plantation Street, Worcester, Massachusetts 01605, USA
- Konstantin B Zeldovich
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 368 Plantation Street, Worcester, Massachusetts 01605, USA
- C Robert Matthews
  - Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, 364 Plantation Street, Worcester, Massachusetts 01605, USA

52
Bastolla U, Dehouck Y, Echave J. What evolution tells us about protein physics, and protein physics tells us about evolution. Curr Opin Struct Biol 2017; 42:59-66. [DOI: 10.1016/j.sbi.2016.10.020] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 08/23/2016] [Revised: 10/19/2016] [Accepted: 10/24/2016] [Indexed: 12/21/2022]

53
Bloom JD. Identification of positive selection in genes is greatly improved by using experimentally informed site-specific models. Biol Direct 2017; 12:1. [PMID: 28095902 PMCID: PMC5240389 DOI: 10.1186/s13062-016-0172-z] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 08/15/2016] [Accepted: 12/14/2016] [Indexed: 12/23/2022] Open
Abstract
Background Sites of positive selection are identified by comparing observed evolutionary patterns to those expected under a null model for evolution in the absence of such selection. For protein-coding genes, the most common null model is that nonsynonymous and synonymous mutations fix at equal rates; this unrealistic model has limited power to detect many interesting forms of selection. Results I describe a new approach that uses a null model based on experimental measurements of a gene’s site-specific amino-acid preferences generated by deep mutational scanning in the lab. This null model makes it possible to identify both diversifying selection for repeated amino-acid change and differential selection for mutations to amino acids that are unexpected given the measurements made in the lab. I show that this approach identifies sites of adaptive substitutions in four genes (lactamase, Gal4, influenza nucleoprotein, and influenza hemagglutinin) far better than a comparable method that simply compares the rates of nonsynonymous and synonymous substitutions. Conclusions As rapid increases in biological data enable increasingly nuanced descriptions of the constraints on individual protein sites, approaches like the one here can improve our ability to identify many interesting forms of selection in natural sequences. Reviewers This article was reviewed by Sebastian Maurer-Stroh, Olivier Tenaillon, and Tal Pupko. All three reviewers are members of the Biology Direct editorial board. Electronic supplementary material The online version of this article (doi:10.1186/s13062-016-0172-z) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Jesse D Bloom
  - Division of Basic Sciences and Computational Biology Program, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, 98109, WA, USA.

54
Zanini F, Puller V, Brodin J, Albert J, Neher RA. In vivo mutation rates and the landscape of fitness costs of HIV-1. Virus Evol 2017; 3:vex003. [PMID: 28458914 PMCID: PMC5399928 DOI: 10.1093/ve/vex003] [Citation(s) in RCA: 52] [Impact Index Per Article: 7.4] [Indexed: 01/06/2023] Open
Abstract
Mutation rates and fitness costs of deleterious mutations are difficult to measure in vivo but essential for a quantitative understanding of evolution. Using whole genome deep sequencing data from longitudinal samples during untreated HIV-1 infection, we estimated mutation rates and fitness costs in HIV-1 from the dynamics of genetic variation. At approximately neutral sites, mutations accumulate with a rate of 1.2 × 10⁻⁵ per site per day, in agreement with the rate measured in cell cultures. We estimated the rate from G to A to be the largest, followed by the other transitions C to T, T to C, and A to G, while transversions are less frequent. At other sites, mutations tend to reduce virus replication. We estimated the fitness cost of mutations at every site in the HIV-1 genome using a model of mutation selection balance. About half of all non-synonymous mutations have large fitness costs (>10 percent), while most synonymous mutations have costs <1 percent. The cost of synonymous mutations is especially low in most of pol where we could not detect measurable costs for the majority of synonymous mutations. In contrast, we find high costs for synonymous mutations in important RNA structures and regulatory regions. The intra-patient fitness cost estimates are consistent across multiple patients, indicating that the deleterious part of the fitness landscape is universal and explains a large fraction of global HIV-1 group M diversity.
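The mutation-selection balance reasoning in this abstract can be illustrated with a minimal sketch. It assumes the simplest deterministic model, in which a deleterious variant produced at rate mu and removed by selection at cost s settles near an equilibrium frequency of about mu/s, so an observed average frequency x gives s ≈ mu/x. This is not the authors' estimator: the function name and the example frequencies are hypothetical, and only the rate of 1.2 × 10⁻⁵ per site per day comes from the abstract (where it is the total rate at approximately neutral sites, not a rate for one specific mutation).

```python
# Minimal sketch of mutation-selection balance (assumed model: x_eq ≈ mu / s).
# Only the mutation rate below is taken from the abstract; everything else
# (function name, example frequencies) is hypothetical and for illustration.

MU_PER_SITE_PER_DAY = 1.2e-5  # rate at approximately neutral sites (abstract)


def fitness_cost_from_frequency(mean_freq, mu=MU_PER_SITE_PER_DAY):
    """Return the selection cost s (per day) implied by x_eq ≈ mu / s."""
    if mean_freq <= 0:
        raise ValueError("mean_freq must be positive")
    return mu / mean_freq


# Hypothetical time-averaged minor-variant frequencies at three sites.
example_freqs = {"site_A": 1e-2, "site_B": 1e-3, "site_C": 1e-4}

costs = {site: fitness_cost_from_frequency(freq)
         for site, freq in example_freqs.items()}

for site, cost in costs.items():
    print(f"{site}: estimated cost per day = {cost:.4g}")
```

Under this simplified model, rarer variants imply larger costs, which is the qualitative link between intra-patient diversity and fitness cost that the abstract describes.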
Affiliation(s)
- Fabio Zanini
  - Max Planck Institute for Developmental Biology, Tübingen 72076, Germany
  - Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Vadim Puller
  - Max Planck Institute for Developmental Biology, Tübingen 72076, Germany
- Johanna Brodin
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institute, SE-171 76 Stockholm, Sweden
- Jan Albert
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institute, SE-171 76 Stockholm, Sweden
  - Department of Clinical Microbiology, Karolinska Institute, SE-171 76, Stockholm, Sweden
- Richard A. Neher
  - Max Planck Institute for Developmental Biology, Tübingen 72076, Germany

55
Haddox HK, Dingens AS, Bloom JD. Experimental Estimation of the Effects of All Amino-Acid Mutations to HIV's Envelope Protein on Viral Replication in Cell Culture. PLoS Pathog 2016; 12:e1006114. [PMID: 27959955 PMCID: PMC5189966 DOI: 10.1371/journal.ppat.1006114] [Citation(s) in RCA: 70] [Impact Index Per Article: 8.8] [Received: 10/06/2016] [Revised: 12/27/2016] [Accepted: 12/07/2016] [Indexed: 11/18/2022] Open
Abstract
HIV is notorious for its capacity to evade immunity and anti-viral drugs through rapid sequence evolution. Knowledge of the functional effects of mutations to HIV is critical for understanding this evolution. HIV's most rapidly evolving protein is its envelope (Env). Here we use deep mutational scanning to experimentally estimate the effects of all amino-acid mutations to Env on viral replication in cell culture. Most mutations are under purifying selection in our experiments, although a few sites experience strong selection for mutations that enhance HIV's replication in cell culture. We compare our experimental measurements of each site's preference for each amino acid to the actual frequencies of these amino acids in naturally occurring HIV sequences. Our measured amino-acid preferences correlate with amino-acid frequencies in natural sequences for most sites. However, our measured preferences are less concordant with natural amino-acid frequencies at surface-exposed sites that are subject to pressures absent from our experiments such as antibody selection. Our data enable us to quantify the inherent mutational tolerance of each site in Env. We show that the epitopes of broadly neutralizing antibodies have a significantly reduced inherent capacity to tolerate mutations, rigorously validating a pervasive idea in the field. Overall, our results help disentangle the role of inherent functional constraints and external selection pressures in shaping Env's evolution.
Affiliation(s)
- Hugh K. Haddox
  - Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
  - Molecular and Cellular Biology PhD Program, University of Washington, Seattle, Washington, United States of America
- Adam S. Dingens
  - Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
  - Molecular and Cellular Biology PhD Program, University of Washington, Seattle, Washington, United States of America
- Jesse D. Bloom
  - Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America

56
Bershtein S, Serohijos AW, Shakhnovich EI. Bridging the physical scales in evolutionary biology: from protein sequence space to fitness of organisms and populations. Curr Opin Struct Biol 2016; 42:31-40. [PMID: 27810574 DOI: 10.1016/j.sbi.2016.10.013] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.8] [Received: 08/25/2016] [Accepted: 10/14/2016] [Indexed: 01/11/2023]
Abstract
Bridging the gap between the molecular properties of proteins and organismal/population fitness is essential for understanding evolutionary processes. This task requires the integration of the several physical scales of biological organization, each defined by a distinct set of mechanisms and constraints, into a single unifying model. The molecular scale is dominated by the constraints imposed by the physico-chemical properties of proteins and their substrates, which give rise to trade-offs and epistatic (non-additive) effects of mutations. At the systems scale, biological networks modulate protein expression and can either buffer or enhance the fitness effects of mutations. The population scale is influenced by the mutational input, selection regimes, and stochastic changes affecting the size and structure of populations, which eventually determine the evolutionary fate of mutations. Here, we summarize the recent advances in theory, computer simulations, and experiments that advance our understanding of the links between various physical scales in biology.
Affiliation(s)
- Shimon Bershtein
  - Department of Life Sciences, Ben-Gurion University of the Negev, Beer-Sheva 84501, Israel
- Adrian WR Serohijos
  - Département de Biochimie, Centre Robert-Cedergren en Bioinformatique & Génomique, Université de Montréal, Montréal, QC H3T 1J4, Canada
- Eugene I Shakhnovich
  - Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, United States.

57
The power of multiplexed functional analysis of genetic variants. Nat Protoc 2016; 11:1782-7. [PMID: 27583640 DOI: 10.1038/nprot.2016.135] [Citation(s) in RCA: 98] [Impact Index Per Article: 12.3] [Received: 06/08/2016] [Accepted: 07/13/2016] [Indexed: 12/30/2022]
Abstract
New technologies have recently enabled saturation mutagenesis and functional analysis of nearly all possible variants of regulatory elements or proteins of interest in single experiments. Here we discuss the past, present, and future of such multiplexed (functional) assays for variant effects (MAVEs). MAVEs provide detailed insight into sequence-function relationships, and they may prove critical for the prospective clinical interpretation of genetic variants.

58
Abstract
A virus’ mutational robustness is described in terms of the strength and distribution of the mutational fitness effects, or MFE. The distribution of MFE is central to many questions in evolutionary theory and is a key parameter in models of molecular evolution. Here we define the mutational fitness effects in influenza A virus by generating 128 viruses, each with a single nucleotide mutation. In contrast to mutational scanning approaches, this strategy allowed us to unambiguously assign fitness values to individual mutations. The presence of each desired mutation and the absence of additional mutations were verified by next generation sequencing of each stock. A mutation was considered lethal only after we failed to rescue virus in three independent transfections. We measured the fitness of each viable mutant relative to the wild type by quantitative RT-PCR following direct competition on A549 cells. We found that 31.6% of the mutations in the genome-wide dataset were lethal and that the lethal fraction did not differ appreciably between the HA- and NA-encoding segments and the rest of the genome. Of the viable mutants, the fitness mean and standard deviation were 0.80 and 0.22 in the genome-wide dataset and best modeled as a beta distribution. The fitness impact of mutation was marginally lower in the segments coding for HA and NA (0.88 ± 0.16) than in the other 6 segments (0.78 ± 0.24), and their respective beta distributions had slightly different shape parameters. The results for influenza A virus are remarkably similar to our own analysis of CirSeq-derived fitness values from poliovirus and previously published data from other small, single stranded DNA and RNA viruses. These data suggest that genome size, and not nucleic acid type or mode of replication, is the main determinant of viral mutational fitness effects.
Like other RNA viruses, influenza virus has a very high mutation rate. While high mutation rates may increase the rate at which influenza virus will adapt to a new host, acquire a new route of transmission, or escape from host immune surveillance, data from model systems suggest that most new viral mutations are either lethal or highly detrimental. Mutational robustness refers to the ability of a virus to tolerate, or buffer, these mutations. The mutational robustness of a virus will determine which mutations are maintained in a population and may have a greater impact on viral evolution than mutation rate. We defined the mutational robustness of influenza A virus by measuring the fitness of a large number of viruses, each with a single point mutation. We found that the overall robustness of influenza was similar to that of poliovirus and other viruses of similar size. Interestingly, mutations appeared to be more easily accommodated in hemagglutinin and neuraminidase than elsewhere in the genome. This work will inform models of influenza evolution at the global and molecular scale.
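To make the reported beta-distribution summary concrete, the sketch below converts a mean and standard deviation into beta shape parameters by the method of moments. This is an assumed, illustrative approach: the abstract does not state how the authors fitted their distributions, so the resulting shape parameters should not be read as the published ones. The function name is hypothetical; the input means and standard deviations are those quoted in the abstract.

```python
# Method-of-moments sketch: beta shape parameters from a mean and SD.
# Assumption: fitness values lie in (0, 1) and the variance is below
# mean * (1 - mean); the fitting method is NOT taken from the abstract.

def beta_from_moments(mean, sd):
    """Return (alpha, beta) of a beta distribution with the given mean and SD."""
    var = sd ** 2
    if not 0 < mean < 1:
        raise ValueError("mean must lie strictly between 0 and 1")
    if var >= mean * (1 - mean):
        raise ValueError("variance too large for a beta distribution")
    common = mean * (1 - mean) / var - 1
    return mean * common, (1 - mean) * common


# Summary statistics quoted in the abstract (viable mutants only).
reported = {
    "genome-wide": (0.80, 0.22),
    "HA and NA segments": (0.88, 0.16),
    "other six segments": (0.78, 0.24),
}

shapes = {label: beta_from_moments(m, s) for label, (m, s) in reported.items()}

for label, (a, b) in shapes.items():
    print(f"{label}: alpha = {a:.3f}, beta = {b:.3f}")
```

In all three cases the second shape parameter comes out below 1, which corresponds to a density that rises toward fitness 1, consistent with most viable mutants being close to wild type.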

59
Spielman SJ, Wilke CO. Extensively Parameterized Mutation-Selection Models Reliably Capture Site-Specific Selective Constraint. Mol Biol Evol 2016; 33:2990-3002. [PMID: 27512115 DOI: 10.1093/molbev/msw171] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Indexed: 02/07/2023] Open
Abstract
The mutation-selection model of coding sequence evolution has received renewed attention for its use in estimating site-specific amino acid propensities and selection coefficient distributions. Two computationally tractable mutation-selection inference frameworks have been introduced: One framework employs a fixed-effects, highly parameterized maximum likelihood approach, whereas the other employs a random-effects Bayesian Dirichlet Process approach. While both implementations follow the same model, they appear to make distinct predictions about the distribution of selection coefficients. The fixed-effects framework estimates a large proportion of highly deleterious substitutions, whereas the random-effects framework estimates that all substitutions are either nearly neutral or weakly deleterious. It remains unknown, however, how accurately each method infers evolutionary constraints at individual sites. Indeed, selection coefficient distributions pool all site-specific inferences, thereby obscuring a precise assessment of site-specific estimates. Therefore, in this study, we use a simulation-based strategy to determine how accurately each approach recapitulates the selective constraint at individual sites. We find that the fixed-effects approach, despite its extensive parameterization, consistently and accurately estimates site-specific evolutionary constraint. By contrast, the random-effects Bayesian approach systematically underestimates the strength of natural selection, particularly for slowly evolving sites. We also find that, despite the strong differences between their inferred selection coefficient distributions, the fixed- and random-effects approaches yield surprisingly similar inferences of site-specific selective constraint. We conclude that the fixed-effects mutation-selection framework provides the more reliable software platform for model application and future development.
Affiliation(s)
- Stephanie J Spielman
  - Department of Integrative Biology, Center for Computational Biology and Bioinformatics, The University of Texas at Austin, Austin, TX
  - Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, TX
  - Present address: Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
- Claus O Wilke
  - Department of Integrative Biology, Center for Computational Biology and Bioinformatics, The University of Texas at Austin, Austin, TX
  - Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, TX

60
Starr TN, Thornton JW. Epistasis in protein evolution. Protein Sci 2016; 25:1204-18. [PMID: 26833806 PMCID: PMC4918427 DOI: 10.1002/pro.2897] [Citation(s) in RCA: 312] [Impact Index Per Article: 39.0] [Received: 12/02/2015] [Revised: 01/25/2016] [Accepted: 01/27/2016] [Indexed: 01/18/2023]
Abstract
The structure, function, and evolution of proteins depend on physical and genetic interactions among amino acids. Recent studies have used new strategies to explore the prevalence, biochemical mechanisms, and evolutionary implications of these interactions, called epistasis, within proteins. Here we describe an emerging picture of pervasive epistasis in which the physical and biological effects of mutations change over the course of evolution in a lineage-specific fashion. Epistasis can restrict the trajectories available to an evolving protein or open new paths to sequences and functions that would otherwise have been inaccessible. We describe two broad classes of epistatic interactions, which arise from different physical mechanisms and have different effects on evolutionary processes. Specific epistasis, in which one mutation influences the phenotypic effect of few other mutations, is caused by direct and indirect physical interactions between mutations, which nonadditively change the protein's physical properties, such as conformation, stability, or affinity for ligands. In contrast, nonspecific epistasis describes mutations that modify the effect of many others; these typically behave additively with respect to the physical properties of a protein but exhibit epistasis because of a nonlinear relationship between the physical properties and their biological effects, such as function or fitness. Both types of interaction are rampant, but specific epistasis has stronger effects on the rate and outcomes of evolution, because it imposes stricter constraints and modulates evolutionary potential more dramatically; it therefore makes evolution more contingent on low-probability historical events and leaves stronger marks on the sequences, structures, and functions of protein families.
Affiliation(s)
- Tyler N Starr
  - Graduate Program in Biochemistry and Molecular Biophysics, University of Chicago, Chicago, Illinois, 60637
- Joseph W Thornton
  - Departments of Ecology and Evolution and Human Genetics, University of Chicago, Chicago, Illinois, 60637

61
Abriata LA, Bovigny C, Dal Peraro M. Detection and sequence/structure mapping of biophysical constraints to protein variation in saturated mutational libraries and protein sequence alignments with a dedicated server. BMC Bioinformatics 2016; 17:242. [PMID: 27315797 PMCID: PMC4912743 DOI: 10.1186/s12859-016-1124-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 03/01/2016] [Accepted: 06/07/2016] [Indexed: 11/21/2022] Open
Abstract
Background Protein variability can now be studied by measuring high-resolution tolerance-to-substitution maps and fitness landscapes in saturated mutational libraries. But these rich and expensive datasets are typically interpreted coarsely, restricting detailed analyses to positions of extremely high or low variability or dubbed important beforehand based on existing knowledge about active sites, interaction surfaces, (de)stabilizing mutations, etc. Results Our new webserver PsychoProt (freely available without registration at http://psychoprot.epfl.ch or at http://lucianoabriata.altervista.org/psychoprot/index.html) helps to detect, quantify, and sequence/structure map the biophysical and biochemical traits that shape amino acid preferences throughout a protein as determined by deep-sequencing of saturated mutational libraries or from large alignments of naturally occurring variants. Discussion We exemplify how PsychoProt helps to (i) unveil protein structure-function relationships from experiments and from alignments that are consistent with structures according to coevolution analysis, (ii) recall global information about structural and functional features and identify hitherto unknown constraints to variation in alignments, and (iii) point at different sources of variation among related experimental datasets or between experimental and alignment-based data. Remarkably, metabolic costs of the amino acids pose strong constraints to variability at protein surfaces in nature but not in the laboratory. This and other differences call for caution when extrapolating results from in vitro experiments to natural scenarios in, for example, studies of protein evolution. Conclusion We show through examples how PsychoProt can be a useful tool for the broad communities of structural biology and molecular evolution, particularly for studies about protein modeling, evolution and design. 
Electronic supplementary material The online version of this article (doi:10.1186/s12859-016-1124-4) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Luciano A Abriata
  - Laboratory for Biomolecular Modeling, Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne, and Swiss Institute of Bioinformatics, AAB014 Station 19, Lausanne, 1015, Switzerland
- Christophe Bovigny
  - Laboratory for Biomolecular Modeling, Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne, and Swiss Institute of Bioinformatics, AAB014 Station 19, Lausanne, 1015, Switzerland
  - Present address: Molecular Modeling Group, Swiss Institute of Bioinformatics, UNIL, Bâtiment Génopode, Lausanne, 1015, Switzerland
- Matteo Dal Peraro
  - Laboratory for Biomolecular Modeling, Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne, and Swiss Institute of Bioinformatics, AAB014 Station 19, Lausanne, 1015, Switzerland

62
Accurate Measurement of the Effects of All Amino-Acid Mutations on Influenza Hemagglutinin. Viruses 2016; 8:v8060155. [PMID: 27271655 PMCID: PMC4926175 DOI: 10.3390/v8060155] [Citation(s) in RCA: 141] [Impact Index Per Article: 17.6] [Received: 04/23/2016] [Revised: 05/21/2016] [Accepted: 05/25/2016] [Indexed: 12/17/2022] Open
Abstract
Influenza genes evolve mostly via point mutations, and so knowing the effect of every amino-acid mutation provides information about evolutionary paths available to the virus. We and others have combined high-throughput mutagenesis with deep sequencing to estimate the effects of large numbers of mutations to influenza genes. However, these measurements have suffered from substantial experimental noise due to a variety of technical problems, the most prominent of which is bottlenecking during the generation of mutant viruses from plasmids. Here we describe advances that ameliorate these problems, enabling us to measure with greatly improved accuracy and reproducibility the effects of all amino-acid mutations to an H1 influenza hemagglutinin on viral replication in cell culture. The largest improvements come from using a helper virus to reduce bottlenecks when generating viruses from plasmids. Our measurements confirm at much higher resolution the results of previous studies suggesting that antigenic sites on the globular head of hemagglutinin are highly tolerant of mutations. We also show that other regions of hemagglutinin—including the stalk epitopes targeted by broadly neutralizing antibodies—have a much lower inherent capacity to tolerate point mutations. The ability to accurately measure the effects of all influenza mutations should enhance efforts to understand and predict viral evolution.

63
Epistasis and the Dynamics of Reversion in Molecular Evolution. Genetics 2016; 203:1335-51. [PMID: 27194749 DOI: 10.1534/genetics.116.188961] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.9] [Received: 03/08/2016] [Accepted: 04/27/2016] [Indexed: 12/27/2022] Open
Abstract
Recent studies of protein evolution contend that the longer an amino acid substitution is present at a site, the less likely it is to revert to the amino acid previously occupying that site. Here we study this phenomenon of decreasing reversion rates rigorously and in a much more general context. We show that, under weak mutation and for arbitrary fitness landscapes, reversion rates decrease with time for any site that is involved in at least one epistatic interaction. Specifically, we prove that, at stationarity, the hazard function of the distribution of waiting times until reversion is strictly decreasing for any such site. Thus, in the presence of epistasis, the longer a particular character has been absent from a site, the less likely the site will revert to its prior state. We also explore several examples of this general result, which share a common pattern whereby the probability of having reverted increases rapidly at short times to some substantial value before becoming almost flat after a few substitutions at other sites. This pattern indicates a characteristic tendency for reversion to occur either almost immediately after the initial substitution or only after a very long time.

64
Echave J, Spielman SJ, Wilke CO. Causes of evolutionary rate variation among protein sites. Nat Rev Genet 2016; 17:109-21. [PMID: 26781812 DOI: 10.1038/nrg.2015.18] [Citation(s) in RCA: 176] [Impact Index Per Article: 22.0] [Indexed: 12/13/2022]
Abstract
It has long been recognized that certain sites within a protein, such as sites in the protein core or catalytic residues in enzymes, are evolutionarily more conserved than other sites. However, our understanding of rate variation among sites remains surprisingly limited. Recent progress to address this includes the development of a wide array of reliable methods to estimate site-specific substitution rates from sequence alignments. In addition, several molecular traits have been identified that correlate with site-specific mutation rates, and novel mechanistic biophysical models have been proposed to explain the observed correlations. Nonetheless, current models explain, at best, approximately 60% of the observed variance, highlighting the limitations of current methods and models and the need for new research directions.
Affiliation(s)
- Julian Echave
  - Escuela de Ciencia y Tecnología, Universidad Nacional de San Martín, 1650 San Martín, Buenos Aires, Argentina
- Stephanie J Spielman
  - Department of Integrative Biology, Center for Computational Biology and Bioinformatics, and Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, Texas 78712, USA
- Claus O Wilke
  - Department of Integrative Biology, Center for Computational Biology and Bioinformatics, and Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, Texas 78712, USA

65
Wu NC, Du Y, Le S, Young AP, Zhang TH, Wang Y, Zhou J, Yoshizawa JM, Dong L, Li X, Wu TT, Sun R. Coupling high-throughput genetics with phylogenetic information reveals an epistatic interaction on the influenza A virus M segment. BMC Genomics 2016; 17:46. [PMID: 26754751 PMCID: PMC4710013 DOI: 10.1186/s12864-015-2358-7] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 07/17/2015] [Accepted: 12/28/2015] [Indexed: 12/15/2022] Open
Abstract
Background: Epistasis is one of the central themes in viral evolution due to its importance in drug resistance, immune escape, and interspecies transmission. However, there is a lack of experimental approach to systematically probe for epistatic residues. Results: By utilizing the information from natural occurring sequences and high-throughput genetics, this study established a novel strategy to identify epistatic residues. The rationale is that a substitution that is deleterious in one strain may be prevalent in nature due to the presence of a naturally occurring compensatory substitution. Here, high-throughput genetics was applied to influenza A virus M segment to systematically identify deleterious substitutions. Comparison with natural sequence variation showed that a deleterious substitution M1 Q214H was prevalent in circulating strains. A coevolution analysis was then performed and indicated that M1 residues 121, 207, 209, and 214 naturally coevolved as a group. Subsequently, we experimentally validated that M1 A209T was a compensatory substitution for M1 Q214H. Conclusions: This work provided a proof-of-concept to identify epistatic residues by coupling high-throughput genetics with phylogenetic information. In particular, we were able to identify an epistatic interaction between M1 substitutions A209T and Q214H. This analytic strategy can potentially be adapted to study any protein of interest, provided that the information on natural sequence variants is available.
Affiliation(s)
- Nicholas C Wu
- Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
- Molecular Biology Institute, University of California, Los Angeles, 90095, CA, USA.
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, 92037, CA, USA.
- Yushen Du
- Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
- Shuai Le
- Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
- Department of Microbiology, Third Military Medical University, Chongqing, 400038, China.
- Arthur P Young
- Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
- Tian-Hao Zhang
- Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
- Yuanyuan Wang
- Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
- Jian Zhou
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
- Janice M Yoshizawa
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
- Ling Dong
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
- Xinmin Li
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
- Ting-Ting Wu
- Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
- Ren Sun
- Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
66
Zanini F, Brodin J, Thebo L, Lanz C, Bratt G, Albert J, Neher RA. Population genomics of intrapatient HIV-1 evolution. eLife 2015; 4:e11282. [PMID: 26652000 PMCID: PMC4718817 DOI: 10.7554/elife.11282] [Citation(s) in RCA: 139] [Impact Index Per Article: 15.4] [Received: 09/01/2015] [Accepted: 12/08/2015] [Indexed: 12/18/2022]
Abstract
Many microbial populations rapidly adapt to changing environments with multiple variants competing for survival. To quantify such complex evolutionary dynamics in vivo, time resolved and genome wide data including rare variants are essential. We performed whole-genome deep sequencing of HIV-1 populations in 9 untreated patients, with 6-12 longitudinal samples per patient spanning 5-8 years of infection. The data can be accessed and explored via an interactive web application. We show that patterns of minor diversity are reproducible between patients and mirror global HIV-1 diversity, suggesting a universal landscape of fitness costs that control diversity. Reversions towards the ancestral HIV-1 sequence are observed throughout infection and account for almost one third of all sequence changes. Reversion rates depend strongly on conservation. Frequent recombination limits linkage disequilibrium to about 100 bp in most of the genome, but strong hitch-hiking due to short range linkage limits diversity.
Affiliation(s)
- Fabio Zanini
- Evolutionary Dynamics and Biophysics, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Johanna Brodin
- Department of Microbiology, Tumor and Cell Biology, Karolinska Institute, Stockholm, Sweden
- Lina Thebo
- Department of Microbiology, Tumor and Cell Biology, Karolinska Institute, Stockholm, Sweden
- Christa Lanz
- Evolutionary Dynamics and Biophysics, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Göran Bratt
- Department of Clinical Science and Education, Stockholm South General Hospital, Stockholm, Sweden
- Jan Albert
- Department of Microbiology, Tumor and Cell Biology, Karolinska Institute, Stockholm, Sweden
- Department of Clinical Microbiology, Karolinska University Hospital, Stockholm, Sweden
- Richard A Neher
- Evolutionary Dynamics and Biophysics, Max Planck Institute for Developmental Biology, Tübingen, Germany