1
Metzger BPH, Park Y, Starr TN, Thornton JW. Epistasis facilitates functional evolution in an ancient transcription factor. eLife 2024; 12:RP88737. [PMID: 38767330 PMCID: PMC11105156 DOI: 10.7554/elife.88737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024]
Abstract
A protein's genetic architecture - the set of causal rules by which its sequence produces its functions - also determines its possible evolutionary trajectories. Prior research has proposed that the genetic architecture of proteins is very complex, with pervasive epistatic interactions that constrain evolution and make function difficult to predict from sequence. Most of this work has analyzed only the direct paths between two proteins of interest - excluding the vast majority of possible genotypes and evolutionary trajectories - and has considered only a single protein function, leaving unaddressed the genetic architecture of functional specificity and its impact on the evolution of new functions. Here, we develop a new method based on ordinal logistic regression to directly characterize the global genetic determinants of multiple protein functions from 20-state combinatorial deep mutational scanning (DMS) experiments. We use it to dissect the genetic architecture and evolution of a transcription factor's specificity for DNA, using data from a combinatorial DMS of an ancient steroid hormone receptor's capacity to activate transcription from two biologically relevant DNA elements. We show that the genetic architecture of DNA recognition consists of a dense set of main and pairwise effects that involve virtually every possible amino acid state in the protein-DNA interface, but higher-order epistasis plays only a tiny role. Pairwise interactions enlarge the set of functional sequences and are the primary determinants of specificity for different DNA elements. They also massively expand the number of opportunities for single-residue mutations to switch specificity from one DNA target to another. By bringing variants with different functions close together in sequence space, pairwise epistasis therefore facilitates rather than constrains the evolution of new functions.
Affiliation(s)
- Brian PH Metzger
  - Department of Ecology and Evolution, University of Chicago, Chicago, United States
- Yeonwoo Park
  - Program in Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, United States
- Tyler N Starr
  - Department of Biochemistry and Molecular Biophysics, University of Chicago, Chicago, United States
- Joseph W Thornton
  - Department of Ecology and Evolution, University of Chicago, Chicago, United States
  - Department of Human Genetics, University of Chicago, Chicago, United States
2
Morel M, Zhukova A, Lemoine F, Gascuel O. Accurate Detection of Convergent Mutations in Large Protein Alignments With ConDor. Genome Biol Evol 2024; 16:evae040. [PMID: 38451738 PMCID: PMC10986858 DOI: 10.1093/gbe/evae040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 01/30/2024] [Accepted: 02/22/2024] [Indexed: 03/09/2024]
Abstract
Evolutionary convergences are observed at all levels, from phenotype to DNA and protein sequences, and changes at these different levels tend to be correlated. Notably, convergent mutations can lead to convergent changes in phenotype, such as changes in metabolism, drug resistance, and other adaptations to changing environments. We propose a two-component approach to detect mutations subject to convergent evolution in protein alignments. The "Emergence" component selects mutations that emerge more often than expected, while the "Correlation" component selects mutations that correlate with the convergent phenotype under study. With regard to Emergence, a phylogeny deduced from the alignment is provided by the user and is used to simulate the evolution of each alignment position. These simulations allow us to estimate the expected number of mutations in a neutral model, which is compared to the observed number of mutations in the data studied. In Correlation, a comparative phylogenetic approach is used to measure whether the presence of each of the observed mutations is correlated with the convergent phenotype. Each component can be used on its own, for example Emergence when no phenotype is available. Our method is implemented in a standalone workflow and a webserver, called ConDor. We evaluate the properties of ConDor using simulated data, and we apply it to three real datasets: sedge PEPC proteins, HIV reverse transcriptase, and fish rhodopsin. The results show that the two components of ConDor complement each other, with an overall accuracy that compares favorably to other available tools, especially on large datasets.
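The comparison of observed and simulated mutation counts described in this abstract can be illustrated with a small sketch. This is not ConDor's implementation: the toy neutral model (each branch gives rise to the mutation independently with a fixed probability), the function names, and all parameter values are assumptions made for illustration only. The sketch shows how an empirical one-sided p-value could be obtained once simulated emergence counts are available.

```python
import random


def simulate_emergence_counts(n_branches, per_branch_prob, n_sims, seed=0):
    """Toy neutral model (an assumption, not ConDor's simulator): each branch
    independently gives rise to the mutation with probability per_branch_prob.
    Returns one simulated emergence count per simulation."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < per_branch_prob for _ in range(n_branches))
        for _ in range(n_sims)
    ]


def emergence_p_value(observed, simulated):
    """Empirical one-sided p-value: share of simulations with at least as many
    emergences as observed, with the usual +1 correction."""
    exceed = sum(1 for s in simulated if s >= observed)
    return (exceed + 1) / (len(simulated) + 1)


# Hypothetical numbers, chosen only to exercise the functions.
simulated = simulate_emergence_counts(n_branches=200, per_branch_prob=0.01, n_sims=1000)
p = emergence_p_value(observed=8, simulated=simulated)
print(f"empirical p-value: {p:.4f}")
```

A small p-value here would mean the mutation emerged more often than the toy neutral model predicts. In a real analysis the simulated counts would come from evolving each alignment position along the user-supplied phylogeny, as the abstract describes.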
Affiliation(s)
- Marie Morel
  - Institut Pasteur, Université Paris Cité, Unité Bioinformatique Evolutive, Paris, France
  - Université Claude Bernard Lyon 1, LBBE, UMR 5558, CNRS, VAS, Villeurbanne, 69100, France
- Anna Zhukova
  - Institut Pasteur, Université Paris Cité, Unité Bioinformatique Evolutive, Paris, France
  - Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, Paris, France
- Frédéric Lemoine
  - Institut Pasteur, Université Paris Cité, Unité Bioinformatique Evolutive, Paris, France
  - Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, Paris, France
  - Institut Pasteur, Université Paris Cité, CNR Virus Des Infections Respiratoires, Paris, France
- Olivier Gascuel
  - Institut Pasteur, Université Paris Cité, Unité Bioinformatique Evolutive, Paris, France
  - Institut de Systématique, Evolution, Biodiversité (UMR 7205 - CNRS, Muséum National d’Histoire Naturelle, SU, EPHE, UA), Paris, France
3
Latrille T, Rodrigue N, Lartillot N. Genes and sites under adaptation at the phylogenetic scale also exhibit adaptation at the population-genetic scale. Proc Natl Acad Sci U S A 2023; 120:e2214977120. [PMID: 36897968 PMCID: PMC10089192 DOI: 10.1073/pnas.2214977120] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/26/2022] [Accepted: 02/11/2023] [Indexed: 03/12/2023]
Abstract
Adaptation in protein-coding sequences can be detected from multiple sequence alignments across species or alternatively by leveraging polymorphism data within a population. Across species, quantification of the adaptive rate relies on phylogenetic codon models, classically formulated in terms of the ratio of nonsynonymous over synonymous substitution rates. Evidence of an accelerated nonsynonymous substitution rate is considered a signature of pervasive adaptation. However, because of the background of purifying selection, these models are potentially limited in their sensitivity. Recent developments have led to more sophisticated mutation-selection codon models aimed at making a more detailed quantitative assessment of the interplay between mutation, purifying, and positive selection. In this study, we conducted a large-scale exome-wide analysis of placental mammals with mutation-selection models, assessing their performance at detecting proteins and sites under adaptation. Importantly, mutation-selection codon models are based on a population-genetic formalism and thus are directly comparable to the McDonald and Kreitman test at the population level to quantify adaptation. Taking advantage of this relationship between phylogenetic and population genetics analyses, we integrated divergence and polymorphism data across the entire exome for 29 populations across 7 genera and showed that proteins and sites detected to be under adaptation at the phylogenetic scale are also under adaptation at the population-genetic scale. Altogether, our exome-wide analysis shows that phylogenetic mutation-selection codon models and the population-genetic test of adaptation can be reconciled and are congruent, paving the way for integrative models and analyses across individuals and populations.
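For readers unfamiliar with the McDonald and Kreitman test mentioned in this abstract, the following is a minimal sketch of its textbook form, not the authors' pipeline. It takes counts of nonsynonymous and synonymous substitutions between species (Dn, Ds) and of nonsynonymous and synonymous polymorphisms within a population (Pn, Ps), and estimates the proportion of adaptive nonsynonymous substitutions as α = 1 − (Ds·Pn)/(Dn·Ps). The function names and the example counts are illustrative assumptions.

```python
def neutrality_index(dn, ds, pn, ps):
    """Neutrality index NI = (Pn/Ps) / (Dn/Ds).
    NI < 1 suggests an excess of nonsynonymous divergence."""
    if dn == 0 or ds == 0 or ps == 0:
        raise ValueError("dn, ds and ps must be non-zero to compute the index")
    return (pn / ps) / (dn / ds)


def mk_alpha(dn, ds, pn, ps):
    """Estimated proportion of adaptive nonsynonymous substitutions:
    alpha = 1 - (Ds * Pn) / (Dn * Ps), which equals 1 - NI."""
    return 1.0 - neutrality_index(dn, ds, pn, ps)


# Hypothetical counts, chosen only for illustration.
alpha = mk_alpha(dn=7, ds=17, pn=2, ps=42)
print(f"alpha = {alpha:.3f}")
```

An α near zero is consistent with neutral evolution of nonsynonymous sites, while a positive α points to adaptive substitutions. This simple estimator is known to be biased by slightly deleterious polymorphisms, which is one reason more elaborate models exist.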
Affiliation(s)
- Thibault Latrille
  - Université de Lyon, Université Lyon 1, CNRS, VetAgro Sup, Laboratoire de Biométrie et Biologie Evolutive, UMR5558, 69100 Villeurbanne, France
  - École Normale Supérieure de Lyon, Université de Lyon, 69342 Lyon, France
  - Department of Computational Biology, Université de Lausanne, 1015 Lausanne, Switzerland
- Nicolas Rodrigue
  - Department of Biology, Institute of Biochemistry, and School of Mathematics and Statistics, Carleton University, K1S 5B6 Ottawa, Canada
- Nicolas Lartillot
  - Université de Lyon, Université Lyon 1, CNRS, VetAgro Sup, Laboratoire de Biométrie et Biologie Evolutive, UMR5558, 69100 Villeurbanne, France
4
Hu Y, Wang X, Xu Y, Yang H, Tong Z, Tian R, Xu S, Yu L, Guo Y, Shi P, Huang S, Yang G, Shi S, Wei F. Molecular mechanisms of adaptive evolution in wild animals and plants. Sci China Life Sci 2023; 66:453-495. [PMID: 36648611 PMCID: PMC9843154 DOI: 10.1007/s11427-022-2233-x] [Citation(s) in RCA: 29] [Impact Index Per Article: 29.0] [Received: 08/07/2022] [Accepted: 08/30/2022] [Indexed: 01/18/2023]
Abstract
Wild animals and plants have developed a variety of adaptive traits driven by adaptive evolution, an important strategy for species survival and persistence. Uncovering the molecular mechanisms of adaptive evolution is the key to understanding species diversification, phenotypic convergence, and inter-species interaction. As the genome sequences of more and more non-model organisms become available, the focus of studies on molecular mechanisms of adaptive evolution has shifted from the candidate gene method to genetic mapping based on genome-wide scanning. In this study, we reviewed the latest research advances in wild animals and plants, focusing on adaptive traits, convergent evolution, and coevolution. Firstly, we focused on the adaptive evolution of morphological, behavioral, and physiological traits. Secondly, we reviewed phenotypic convergence in life history traits and in responses to environmental pressures, along with the underlying molecular convergence mechanisms. Thirdly, we summarized advances in the study of coevolution, including the four main types: mutualism, parasitism, predation and competition. Overall, these latest advances greatly increase our understanding of the underlying molecular mechanisms for diverse adaptive traits and species interaction, demonstrating that the development of evolutionary biology has been greatly accelerated by multi-omics technologies. Finally, we highlighted the emerging trends and future prospects around the above three aspects of adaptive evolution.
Affiliation(s)
- Yibo Hu
  - CAS Key Lab of Animal Ecology and Conservation Biology, Chinese Academy of Sciences, Beijing, 100101, China
- Xiaoping Wang
  - State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, School of Life Sciences, Yunnan University, Kunming, 650091, China
- Yongchao Xu
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
- Hui Yang
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650201, China
- Zeyu Tong
  - Institute of Evolution and Ecology, School of Life Sciences, Central China Normal University, Wuhan, 430079, China
- Ran Tian
  - College of Life Sciences, Nanjing Normal University, Nanjing, 210023, China
- Shaohua Xu
  - State Key Laboratory of Biocontrol, Guangdong Key Lab of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, China
- Li Yu
  - State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, School of Life Sciences, Yunnan University, Kunming, 650091, China
- Yalong Guo
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
- Peng Shi
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650201, China
- Shuangquan Huang
  - Institute of Evolution and Ecology, School of Life Sciences, Central China Normal University, Wuhan, 430079, China
- Guang Yang
  - Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Guangzhou, 511458, China
  - College of Life Sciences, Nanjing Normal University, Nanjing, 210023, China
- Suhua Shi
  - State Key Laboratory of Biocontrol, Guangdong Key Lab of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, China
- Fuwen Wei
  - CAS Key Lab of Animal Ecology and Conservation Biology, Chinese Academy of Sciences, Beijing, 100101, China
  - Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Guangzhou, 511458, China
5
Detecting macroevolutionary genotype-phenotype associations using error-corrected rates of protein convergence. Nat Ecol Evol 2023; 7:155-170. [PMID: 36604553 PMCID: PMC9834058 DOI: 10.1038/s41559-022-01932-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 03/31/2022] [Accepted: 10/12/2022] [Indexed: 01/07/2023]
Abstract
On macroevolutionary timescales, extensive mutations and phylogenetic uncertainty mask the signals of genotype-phenotype associations underlying convergent evolution. To overcome this problem, we extended the widely used framework of non-synonymous to synonymous substitution rate ratios and developed the novel metric ωC, which measures the error-corrected convergence rate of protein evolution. While ωC distinguishes natural selection from genetic noise and phylogenetic errors in simulation and real examples, its accuracy allows an exploratory genome-wide search of adaptive molecular convergence without phenotypic hypothesis or candidate genes. Using gene expression data, we explored over 20 million branch combinations in vertebrate genes and identified the joint convergence of expression patterns and protein sequences with amino acid substitutions in functionally important sites, providing hypotheses on undiscovered phenotypes. We further extended our method with a heuristic algorithm to detect highly repetitive convergence among computationally non-trivial higher-order phylogenetic combinations. Our approach allows bidirectional searches for genotype-phenotype associations, even in lineages that diverged for hundreds of millions of years.
6
Debray R, De Luna N, Koskella B. Historical contingency drives compensatory evolution and rare reversal of phage resistance. Mol Biol Evol 2022; 39:6673247. [PMID: 35994371 PMCID: PMC9447851 DOI: 10.1093/molbev/msac182] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/23/2022]
Abstract
Bacteria and lytic viruses (phages) engage in highly dynamic coevolutionary interactions over time, yet we have little idea of how transient selection by phages might shape the future evolutionary trajectories of their host populations. To explore this question, we generated genetically diverse phage-resistant mutants of the bacterium Pseudomonas syringae. We subjected the panel of mutants to prolonged experimental evolution in the absence of phages. Some populations re-evolved phage sensitivity, whereas others acquired compensatory mutations that reduced the costs of resistance without altering resistance levels. To ask whether these outcomes were driven by the initial genetic mechanisms of resistance, we next evolved independent replicates of each individual mutant in the absence of phages. We found a strong signature of historical contingency: some mutations were highly reversible across replicate populations, whereas others were highly entrenched. Through whole-genome sequencing of bacteria over time, we also found that populations with the same resistance gene acquired more parallel sets of mutations than populations with different resistance genes, suggesting that compensatory adaptation is also contingent on how resistance initially evolved. Our study identifies an evolutionary ratchet in bacteria–phage coevolution and may explain previous observations that resistance persists over time in some bacterial populations but is lost in others. We add to a growing body of work describing the key role of phages in the ecological and evolutionary dynamics of their host communities. Beyond this specific trait, our study provides a new insight into the genetic architecture of historical contingency, a crucial component of interpreting and predicting evolution.
Affiliation(s)
- Reena Debray
  - Department of Integrative Biology, University of California, Berkeley, Berkeley, CA, USA
- Nina De Luna
  - Department of Immunology, Pennsylvania State University, State College, PA, USA
- Britt Koskella
  - Department of Integrative Biology, University of California, Berkeley, Berkeley, CA, USA
  - Chan Zuckerberg BioHub, San Francisco, CA, USA
7
Mohammadi S, Herrera-Álvarez S, Yang L, Rodríguez-Ordoñez MDP, Zhang K, Storz JF, Dobler S, Crawford AJ, Andolfatto P. Constraints on the evolution of toxin-resistant Na,K-ATPases have limited dependence on sequence divergence. PLoS Genet 2022; 18:e1010323. [PMID: 35972957 DOI: 10.1101/2021.11.29.470343] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/31/2022] [Revised: 09/09/2022] [Accepted: 07/04/2022] [Indexed: 05/25/2023]
Abstract
A growing body of theoretical and experimental evidence suggests that intramolecular epistasis is a major determinant of rates and patterns of protein evolution and imposes a substantial constraint on the evolution of novel protein functions. Here, we examine the role of intramolecular epistasis in the recurrent evolution of resistance to cardiotonic steroids (CTS) across tetrapods, which occurs via specific amino acid substitutions to the α-subunit family of Na,K-ATPases (ATP1A). After identifying a series of recurrent substitutions at two key sites of ATP1A that are predicted to confer CTS resistance in diverse tetrapods, we then performed protein engineering experiments to test the functional consequences of introducing these substitutions onto divergent species backgrounds. In line with previous results, we find that substitutions at these sites can have substantial background-dependent effects on CTS resistance. Globally, however, these substitutions also have pleiotropic effects that are consistent with additive rather than background-dependent effects. Moreover, the magnitude of a substitution's effect on activity does not depend on the overall extent of ATP1A sequence divergence between species. Our results suggest that epistatic constraints on the evolution of CTS-resistant forms of Na,K-ATPase likely depend on a small number of sites, with little dependence on overall levels of protein divergence. We propose that dependence on a limited number of sites may account for the observation of convergent CTS resistance substitutions among taxa with highly divergent Na,K-ATPases (See S1 Text for Spanish translation).
Affiliation(s)
- Shabnam Mohammadi
  - School of Biological Sciences, University of Nebraska, Lincoln, Nebraska, United States of America
  - Molecular Evolutionary Biology, Institut für Zell- und Systembiologie der Tiere, Universität Hamburg, Hamburg, Germany
- Santiago Herrera-Álvarez
  - Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
  - Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
- Lu Yang
  - Department of Ecology and Evolution, Princeton University, Princeton, New Jersey, United States of America
- Karen Zhang
  - Department of Ecology and Evolution, Princeton University, Princeton, New Jersey, United States of America
- Jay F Storz
  - School of Biological Sciences, University of Nebraska, Lincoln, Nebraska, United States of America
- Susanne Dobler
  - Molecular Evolutionary Biology, Institut für Zell- und Systembiologie der Tiere, Universität Hamburg, Hamburg, Germany
- Andrew J Crawford
  - Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
- Peter Andolfatto
  - Department of Biological Sciences, Columbia University, New York City, New York, United States of America
8
Mohammadi S, Herrera-Álvarez S, Yang L, Rodríguez-Ordoñez MDP, Zhang K, Storz JF, Dobler S, Crawford AJ, Andolfatto P. Constraints on the evolution of toxin-resistant Na,K-ATPases have limited dependence on sequence divergence. PLoS Genet 2022; 18:e1010323. [PMID: 35972957 PMCID: PMC9462791 DOI: 10.1371/journal.pgen.1010323] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/31/2022] [Revised: 09/09/2022] [Accepted: 07/04/2022] [Indexed: 11/19/2022]
Abstract
A growing body of theoretical and experimental evidence suggests that intramolecular epistasis is a major determinant of rates and patterns of protein evolution and imposes a substantial constraint on the evolution of novel protein functions. Here, we examine the role of intramolecular epistasis in the recurrent evolution of resistance to cardiotonic steroids (CTS) across tetrapods, which occurs via specific amino acid substitutions to the α-subunit family of Na,K-ATPases (ATP1A). After identifying a series of recurrent substitutions at two key sites of ATP1A that are predicted to confer CTS resistance in diverse tetrapods, we then performed protein engineering experiments to test the functional consequences of introducing these substitutions onto divergent species backgrounds. In line with previous results, we find that substitutions at these sites can have substantial background-dependent effects on CTS resistance. Globally, however, these substitutions also have pleiotropic effects that are consistent with additive rather than background-dependent effects. Moreover, the magnitude of a substitution’s effect on activity does not depend on the overall extent of ATP1A sequence divergence between species. Our results suggest that epistatic constraints on the evolution of CTS-resistant forms of Na,K-ATPase likely depend on a small number of sites, with little dependence on overall levels of protein divergence. We propose that dependence on a limited number of sites may account for the observation of convergent CTS resistance substitutions among taxa with highly divergent Na,K-ATPases (See S1 Text for Spanish translation).
Individual amino acids within a protein work in concert to produce a functionally coherent structure that must be maintained as a protein diverges over time. Given this structure-function relationship, we expect the effects of new mutations to depend on amino acid states at other sites throughout the protein (i.e., background dependence) and that identical mutations will have more similar effects in more closely related species, for which orthologous proteins will be less diverged. We tested this hypothesis by performing protein-engineering experiments on ATP1A, a protein that mediates resistance to toxins known as cardiotonic steroids (CTS), to reveal the extent of background-dependence across representative tetrapods. We find that, while the effects of mutations at two key sites implicated in CTS-resistance are indeed often background-dependent, the magnitude of these effects does not correlate with overall levels of ATP1A divergence. Our results instead suggest that background-dependent effects are determined by amino acid states at a small number of sites throughout the protein. Evolutionary constraints imposed by relatively few sites may explain the frequent occurrence of identical or similar CTS-resistance substitutions among ATP1A proteins of highly divergent animals (See S1 Text for Spanish translation).
Affiliation(s)
- Shabnam Mohammadi
  - School of Biological Sciences, University of Nebraska, Lincoln, Nebraska, United States of America
  - Molecular Evolutionary Biology, Institut für Zell- und Systembiologie der Tiere, Universität Hamburg, Hamburg, Germany
- Santiago Herrera-Álvarez
  - Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
  - Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
- Lu Yang
  - Department of Ecology and Evolution, Princeton University, Princeton, New Jersey, United States of America
- Karen Zhang
  - Department of Ecology and Evolution, Princeton University, Princeton, New Jersey, United States of America
- Jay F. Storz
  - School of Biological Sciences, University of Nebraska, Lincoln, Nebraska, United States of America
- Susanne Dobler
  - Molecular Evolutionary Biology, Institut für Zell- und Systembiologie der Tiere, Universität Hamburg, Hamburg, Germany
- Andrew J. Crawford
  - Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
- Peter Andolfatto
  - Department of Biological Sciences, Columbia University, New York City, New York, United States of America
9
A high-quality genome of the dobsonfly Neoneuromus ignobilis reveals molecular convergences in aquatic insects. Genomics 2022; 114:110437. [PMID: 35902070 DOI: 10.1016/j.ygeno.2022.110437] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2022] [Revised: 07/03/2022] [Accepted: 07/21/2022] [Indexed: 11/22/2022]
Abstract
Neoneuromus ignobilis is an archaic holometabolous aquatic predatory insect. However, a lack of genomic resources hinders the use of whole-genome sequencing to explore the genetic basis and molecular mechanisms of its adaptive evolution. Here, we provide a high-contiguity, chromosome-level genome assembly of N. ignobilis using high-coverage Nanopore and PacBio reads with the Hi-C technique. The final assembly is 480.67 Mb in size, containing 12 telomere-ended pseudochromosomes with only 17 gaps. We compared 42 hexapod species genomes including six independent lineages comprising 11 aquatic insects, and found convergent expansions of long wavelength-sensitive and blue-sensitive opsins, thermal stress response TRP channels, and sulfotransferases in aquatic insects, which may be related to their aquatic adaptation. We also detected strong nonrandom signals of convergent amino acid substitutions in aquatic insects. Collectively, our comparative genomic analysis revealed evidence of molecular convergence in aquatic insects in both gene family evolution and amino acid substitutions.
10
Phylogenetic inference of changes in amino acid propensities with single-position resolution. PLoS Comput Biol 2022; 18:e1009878. [PMID: 35180226 PMCID: PMC9106220 DOI: 10.1371/journal.pcbi.1009878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2021] [Revised: 05/13/2022] [Accepted: 01/28/2022] [Indexed: 11/19/2022]
Abstract
Fitness conferred by the same allele may differ between genotypes and environments, and these differences shape variation and evolution. Changes in amino acid propensities at protein sites over the course of evolution have been inferred from sequence alignments statistically, but the existing methods are data-intensive and aggregate multiple sites. Here, we develop an approach to detect individual amino acids that confer different fitness in different groups of species from combined sequence and phylogenetic data. Using the fact that the probability of a substitution to an amino acid depends on its fitness, our method looks for amino acids such that substitutions to them occur more frequently in one group of lineages than in another. We validate our method using simulated evolution of a protein site under different scenarios and show that it has high specificity for a wide range of assumptions regarding the underlying changes in selection, while its sensitivity differs between scenarios. We apply our method to the env gene of two HIV-1 subtypes, A and B, and to the HA gene of two influenza A subtypes, H1 and H3, and show that the inferred fitness changes are consistent with the fitness differences observed in deep mutational scanning experiments. We find that changes in relative fitness of different amino acid variants within a site do not always trigger episodes of positive selection and therefore may not result in an overall increase in the frequency of substitutions, but can still be detected from changes in relative frequencies of different substitutions.
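The core idea in this abstract, that substitutions to a given amino acid occur more often in one group of lineages than in another, can be illustrated with a deliberately simplified stand-in. The sketch below is not the authors' phylogenetic method. It swaps in a plain one-sided Fisher's exact test on a 2×2 table of substitution counts, ignores tree structure entirely, and uses hypothetical function names and counts.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: substitutions TO the focal amino acid in lineage group 1
    b: substitutions to any other amino acid in group 1
    c: substitutions TO the focal amino acid in group 2
    d: substitutions to any other amino acid in group 2

    Returns P(X >= a) under the hypergeometric null, i.e. the probability of
    seeing at least this many focal substitutions in group 1 by chance.
    """
    row1 = a + b          # all substitutions in group 1
    col1 = a + c          # all substitutions to the focal amino acid
    total = a + b + c + d
    upper = min(row1, col1)
    tail = sum(
        comb(col1, k) * comb(total - col1, row1 - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(total, row1)


# Hypothetical counts, for illustration only.
p = fisher_one_sided(a=9, b=11, c=2, d=18)
print(f"one-sided p-value: {p:.4f}")
```

A small p-value would flag the focal amino acid as a candidate for higher relative fitness in group 1. Substitution counts on a phylogeny are not independent draws, which is why a method like the one described in the abstract works with the tree itself.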
11
Youssef N, Susko E, Roger AJ, Bielawski JP. Evolution of amino acid propensities under stability-mediated epistasis. Mol Biol Evol 2022; 39:6522130. [PMID: 35134997 PMCID: PMC8896634 DOI: 10.1093/molbev/msac030] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/12/2022]
Abstract
Site-specific amino acid preferences are influenced by the genetic background of the protein. The preferences for resident amino acids are expected to, on average, increase over time because of replacements at other sites - a nonadaptive phenomenon referred to as the 'evolutionary Stokes shift'. Alternatively, decreases in resident amino acid propensity have recently been viewed as evidence of adaptations to external environmental changes. Using population genetics theory and thermodynamic stability constraints, we show that nonadaptive evolution can lead to both positive and negative shifts in propensities following the fixation of an amino acid, emphasizing that the detection of negative shifts is not conclusive evidence of adaptation. Considering shifts in propensities over windows between substitutions at a focal site, we find that following ≈ 50% of substitutions the propensity for the new resident amino acid decreases over time, and both positive and negative shifts were comparable in magnitude. Preferences were often conserved via a significant negative autocorrelation in propensity changes: increases in propensities were often followed by decreases, and vice versa. Lastly, we explore the underlying mechanisms that lead propensities to fluctuate. We observe that stabilizing replacements increase the mutational tolerance at a site and in doing so decrease the propensity for the resident amino acid. In contrast, destabilizing substitutions result in more rugged fitness landscapes that tend to favor the resident amino acid. In summary, our results characterize propensity trajectories under nonadaptive stability-constrained evolution against which evidence of adaptations should be calibrated.
Affiliation(s)
- Noor Youssef
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Edward Susko
  - Department of Mathematics and Statistics, Dalhousie University, Halifax, NS, Canada
- Andrew J Roger
  - Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, NS, Canada
- Joseph P Bielawski
  - Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada
  - Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
|
12
|
Positive selection in noncoding genomic regions of vocal learning birds is associated with genes implicated in vocal learning and speech functions in humans. Genome Res 2021; 31:2035-2049. [PMID: 34667117 PMCID: PMC8559704 DOI: 10.1101/gr.275989.121] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/12/2021] [Accepted: 08/17/2021] [Indexed: 11/25/2022]
Abstract
Vocal learning, the ability to imitate sounds from conspecifics and the environment, is a key component of human spoken language and learned song in three independently evolved avian groups—oscine songbirds, parrots, and hummingbirds. Humans and each of these three bird clades exhibit specialized behavioral, neuroanatomical, and brain gene expression convergence related to vocal learning, speech, and song. To understand the evolutionary basis of vocal learning gene specializations and convergence, we searched for and identified accelerated genomic regions (ARs), a marker of positive selection, specific to vocal learning birds. We found avian vocal learner-specific ARs, and they were enriched in noncoding regions near genes with known speech functions or brain gene expression specializations in humans and vocal learning birds, including FOXP2, NEUROD6, ZEB2, and MEF2C, and near genes with major neurodevelopmental functions, including NR2F1, NRP2, and BCL11B. We also found enrichment near the SFARI class S genes associated with syndromic vocal communication forms of autism spectrum disorders. These findings reveal strong candidate noncoding regions near genes for the evolutionary adaptations that distinguish vocal learning species from their close vocal nonlearning relatives and provide further evidence of molecular convergence between birdsong and human spoken language.
|
13
|
Youssef N, Susko E, Roger AJ, Bielawski JP. Shifts in amino acid preferences as proteins evolve: A synthesis of experimental and theoretical work. Protein Sci 2021; 30:2009-2028. [PMID: 34322924 PMCID: PMC8442975 DOI: 10.1002/pro.4161] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/13/2021] [Revised: 07/19/2021] [Accepted: 07/26/2021] [Indexed: 11/08/2022]
Abstract
Amino acid preferences vary across sites and time. While variation across sites is widely accepted, the extent and frequency of temporal shifts are contentious. Our understanding of the drivers of amino acid preference change is incomplete: To what extent are temporal shifts driven by adaptive versus nonadaptive evolutionary processes? We review phenomena that cause preferences to vary (e.g., evolutionary Stokes shift, contingency, and entrenchment) and clarify how they differ. To determine the extent and prevalence of shifted preferences, we review experimental and theoretical studies. Analyses of natural sequence alignments often detect decreases in homoplasy (convergence and reversions) rates, and variation in replacement rates with time, signals that are consistent with temporally changing preferences. While approaches inferring shifts in preferences from patterns in natural alignments are valuable, they are indirect since multiple mechanisms (both adaptive and nonadaptive) could lead to the observed signal. Alternatively, site-directed mutagenesis experiments allow for a more direct assessment of shifted preferences. They corroborate evidence from multiple sequence alignments, revealing that the preference for an amino acid at a site varies depending on the background sequence. However, shifts in preferences are usually minor in magnitude and sites with significantly shifted preferences are low in frequency. The small yet consistent perturbations in preferences could, nevertheless, jeopardize the accuracy of inference procedures, which assume constant preferences. We conclude by discussing if and how such shifts in preferences might influence widely used time-homogeneous inference procedures and potential ways to mitigate such effects.
Affiliation(s)
- Noor Youssef
  - Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- Edward Susko
  - Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
- Andrew J. Roger
  - Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- Joseph P. Bielawski
  - Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada
  - Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
|
14
|
Xu S, Wang J, Guo Z, He Z, Shi S. Genomic Convergence in the Adaptation to Extreme Environments. Plant Communications 2020; 1:100117. [PMID: 33367270 PMCID: PMC7747959 DOI: 10.1016/j.xplc.2020.100117] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 08/23/2020] [Revised: 10/12/2020] [Accepted: 10/28/2020] [Indexed: 05/08/2023]
Abstract
Convergent evolution is especially common in plants that have independently adapted to the same extreme environments (i.e., extremophile plants). The recent burst of omics data has alleviated many limitations that have hampered molecular convergence studies of non-model extremophile plants. In this review, we summarize cases of genomic convergence in these taxa to examine the extent and type of genomic convergence during the process of adaptation to extreme environments. Despite being well studied by candidate gene approaches, convergent evolution at individual sites is rare and often has a high false-positive rate when assessed in whole genomes. By contrast, genomic convergence at higher genetic levels has been detected during adaptation to the same extreme environments. Examples include the convergence of biological pathways and changes in gene expression, gene copy number, amino acid usage, and GC content. Higher convergence levels play important roles in the adaptive evolution of extremophiles and may be more frequent and involve more genes. In several cases, multiple types of convergence events have been found to co-occur. However, empirical and theoretical studies of this higher level convergent evolution are still limited. In conclusion, both the development of powerful approaches and the detection of convergence at various genetic levels are needed to further reveal the genetic mechanisms of plant adaptation to extreme environments.
Affiliation(s)
- Shaohua Xu
  - State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory of Plant Resources, Key Laboratory of Biodiversity Dynamics and Conservation of Guangdong Higher Education Institutes, School of Life Sciences, Sun Yat-sen University, Guangzhou, Guangdong, China
- Jiayan Wang
  - State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory of Plant Resources, Key Laboratory of Biodiversity Dynamics and Conservation of Guangdong Higher Education Institutes, School of Life Sciences, Sun Yat-sen University, Guangzhou, Guangdong, China
- Zixiao Guo
  - State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory of Plant Resources, Key Laboratory of Biodiversity Dynamics and Conservation of Guangdong Higher Education Institutes, School of Life Sciences, Sun Yat-sen University, Guangzhou, Guangdong, China
- Ziwen He
  - State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory of Plant Resources, Key Laboratory of Biodiversity Dynamics and Conservation of Guangdong Higher Education Institutes, School of Life Sciences, Sun Yat-sen University, Guangzhou, Guangdong, China
- Suhua Shi
  - State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory of Plant Resources, Key Laboratory of Biodiversity Dynamics and Conservation of Guangdong Higher Education Institutes, School of Life Sciences, Sun Yat-sen University, Guangzhou, Guangdong, China
|
15
|
Stolyarova AV, Nabieva E, Ptushenko VV, Favorov AV, Popova AV, Neverov AD, Bazykin GA. Senescence and entrenchment in evolution of amino acid sites. Nat Commun 2020; 11:4603. [PMID: 32929079 PMCID: PMC7490271 DOI: 10.1038/s41467-020-18366-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/10/2019] [Accepted: 08/20/2020] [Indexed: 01/01/2023] Open
Abstract
Amino acid propensities at a site change in the course of protein evolution. This may happen for two reasons. Changes may be triggered by substitutions at epistatically interacting sites elsewhere in the genome. Alternatively, they may arise due to environmental changes that are external to the genome. Here, we design a framework for distinguishing between these alternatives. Using analytical modelling and simulations, we show that they cause opposite dynamics of the fitness of the allele currently occupying the site: it tends to increase with the time since its origin due to epistasis ("entrenchment"), but to decrease due to random environmental fluctuations ("senescence"). By analysing the genomes of vertebrates and insects, we show that the amino acids originating at negatively selected sites experience strong entrenchment. By contrast, the amino acids originating at positively selected sites experience senescence. We propose that senescence of the current allele is a cause of adaptive evolution.
Affiliation(s)
- A V Stolyarova
  - Center of Life Sciences, Skolkovo Institute of Science and Technology, Skolkovo, 143028, Russia
- E Nabieva
  - Center of Life Sciences, Skolkovo Institute of Science and Technology, Skolkovo, 143028, Russia
  - Institute for Information Transmission Problems (Kharkevich Institute), Russian Academy of Sciences, Moscow, 127051, Russia
- V V Ptushenko
  - Department of Photochemistry and Photobiology, N. M. Emanuel Institute of Biochemical Physics of Russian Academy of Sciences, Moscow, 119334, Russia
  - A. N. Belozersky Institute of Physical-Chemical Biology, M. V. Lomonosov Moscow State University, Moscow, 119992, Russia
- A V Favorov
  - Division of Biostatistics and Bioinformatics, Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA
  - Laboratory of System Biology and Computational Genetics, Vavilov Institute of General Genetics, Moscow, 119991, Russia
- A V Popova
  - Department of Molecular Diagnostics, Central Research Institute for Epidemiology, Moscow, 111123, Russia
- A D Neverov
  - Department of Molecular Diagnostics, Central Research Institute for Epidemiology, Moscow, 111123, Russia
- G A Bazykin
  - Center of Life Sciences, Skolkovo Institute of Science and Technology, Skolkovo, 143028, Russia
  - Institute for Information Transmission Problems (Kharkevich Institute), Russian Academy of Sciences, Moscow, 127051, Russia
|
16
|
Youssef N, Susko E, Bielawski JP. Consequences of Stability-Induced Epistasis for Substitution Rates. Mol Biol Evol 2020; 37:3131-3148. [DOI: 10.1093/molbev/msaa151] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 01/18/2023] Open
Abstract
Do interactions between residues in a protein (i.e., epistasis) significantly alter evolutionary dynamics? If so, what consequences might they have on inference from traditional codon substitution models which assume site-independence for the sake of computational tractability? To investigate the effects of epistasis on substitution rates, we employed a mechanistic mutation-selection model in conjunction with a fitness framework derived from protein stability. We refer to this as the stability-informed site-dependent (S-SD) model and developed a new stability-informed site-independent (S-SI) model that captures the average effect of stability constraints on individual sites of a protein. Comparison of S-SI and S-SD offers a novel and direct method for investigating the consequences of stability-induced epistasis on protein evolution. We developed S-SI and S-SD models for three natural proteins and showed that they generate sequences consistent with real alignments. Our analyses revealed that epistasis tends to increase substitution rates compared with the rates under site-independent evolution. We then assessed the epistatic sensitivity of individual sites and discovered a counterintuitive effect: Highly connected sites were less influenced by epistasis relative to exposed sites. Lastly, we show that, despite the unrealistic assumptions, traditional models perform comparably well in the presence and absence of epistasis and provide reasonable summaries of average selection intensities. We conclude that epistatic models are critical to understanding protein evolutionary dynamics, but epistasis might not be required for reasonable inference of selection pressure when averaging over time and sites.
Affiliation(s)
- Noor Youssef
  - Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada
  - Centre for Genomics and Evolutionary Bioinformatics, Dalhousie University, Halifax, Nova Scotia, Canada
- Edward Susko
  - Centre for Genomics and Evolutionary Bioinformatics, Dalhousie University, Halifax, Nova Scotia, Canada
  - Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
- Joseph P Bielawski
  - Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada
  - Centre for Genomics and Evolutionary Bioinformatics, Dalhousie University, Halifax, Nova Scotia, Canada
  - Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
|
17
|
Těšický M, Velová H, Novotný M, Kreisinger J, Beneš V, Vinkler M. Positive selection and convergent evolution shape molecular phenotypic traits of innate immunity receptors in tits (Paridae). Mol Ecol 2020; 29:3056-3070. [PMID: 32652716 DOI: 10.1111/mec.15547] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/04/2020] [Revised: 06/09/2020] [Accepted: 06/26/2020] [Indexed: 01/04/2023]
Abstract
Despite widespread variability and redundancy abounding in animal immunity, little is currently known about the rate of evolutionary convergence (functionally analogous traits not inherited from a common ancestor) in host molecular adaptations to parasite selective pressures. Toll-like receptors (TLRs) provide the molecular interface allowing hosts to recognize pathogenic structures and trigger early danger signals initiating an immune response. Using a novel combination of bioinformatic approaches, here we explore genetic variation in ligand-binding regions of bacteria-sensing TLR4 and TLR5 in 29 species belonging to the tit family of passerine birds (Aves: Paridae). Three out of the four consensual positively selected sites in TLR4 and six out of 14 positively selected positions in TLR5 were located on the receptor surface near the functionally important sites, and based on the phylogenetic pattern evolved in a convergent (parallel) manner. This type of evolution was also seen at one N-glycosylation site and two positively selected phosphorylation sites, providing the first evidence of convergence in post-translational modifications in evolutionary immunology. Finally, the overall mismatch between phylogeny and the clustering of surface charge distribution demonstrates that convergence is common in overall TLR4 and TLR5 molecular phenotypes involved in ligand binding. Our analysis did not reveal any broad ecological traits explaining the convergence observed in electrostatic potentials, suggesting that information on microbial symbionts may be needed to explain TLR evolution. Adopting state-of-the-art predictive structural bioinformatics, we have outlined a new broadly applicable methodological approach to estimate the functional significance of positively selected variation and test for the adaptive molecular convergence in protein-coding polymorphisms.
Affiliation(s)
- Martin Těšický
  - Department of Zoology, Faculty of Science, Charles University, Prague, Czech Republic
- Hana Velová
  - Department of Zoology, Faculty of Science, Charles University, Prague, Czech Republic
- Marian Novotný
  - Department of Cell Biology, Faculty of Science, Charles University, Prague, Czech Republic
- Jakub Kreisinger
  - Department of Zoology, Faculty of Science, Charles University, Prague, Czech Republic
- Vladimír Beneš
  - European Molecular Biology Laboratory Heidelberg, Heidelberg, Germany
- Michal Vinkler
  - Department of Zoology, Faculty of Science, Charles University, Prague, Czech Republic
|
18
|
He Z, Xu S, Shi S. Adaptive convergence at the genomic level: prevalent, uncommon or very rare? Natl Sci Rev 2020; 7:947-951. [PMID: 34692116 PMCID: PMC8289048 DOI: 10.1093/nsr/nwaa076] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/20/2019] [Revised: 02/17/2020] [Accepted: 04/21/2020] [Indexed: 12/31/2022] Open
Affiliation(s)
- Ziwen He
  - School of Life Sciences, Sun Yat-sen University, China
- Shaohua Xu
  - School of Life Sciences, Sun Yat-sen University, China
- Suhua Shi
  - School of Life Sciences, Sun Yat-sen University, China
|
19
|
Rey C, Lanore V, Veber P, Guéguen L, Lartillot N, Sémon M, Boussau B. Detecting adaptive convergent amino acid evolution. Philos Trans R Soc Lond B Biol Sci 2019; 374:20180234. [PMID: 31154974 PMCID: PMC6560273 DOI: 10.1098/rstb.2018.0234] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Accepted: 02/25/2019] [Indexed: 11/12/2022] Open
Abstract
In evolutionary genomics, researchers have taken an interest in identifying substitutions that subtend convergent phenotypic adaptations. This is a difficult question that requires distinguishing foreground convergent substitutions that are involved in the convergent phenotype from background convergent substitutions. Those may be linked to other adaptations, may be neutral or may be the consequence of mutational biases. Furthermore, there is no generally accepted definition of convergent substitutions. Various methods that use different definitions have been proposed in the literature, resulting in different sets of candidate foreground convergent substitutions. In this article, we first describe the processes that can generate foreground convergent substitutions in coding sequences, separating adaptive from non-adaptive processes. Second, we review methods that have been proposed to detect foreground convergent substitutions in coding sequences and expose the assumptions that underlie them. Finally, we examine their power on simulations of convergent changes (including in the presence of a change in the efficacy of selection) and on empirical alignments. This article is part of the theme issue 'Convergent evolution in the genomics era: new insights and directions'.
Affiliation(s)
- Carine Rey
  - ENS de Lyon, CNRS UMR 5239, INSERM U1210, LBMC, Univ Lyon, Université Claude Bernard Lyon 1, F-69007 Lyon, France
- Vincent Lanore
  - CNRS UMR 5558, LBBE, Univ Lyon, Université Claude Bernard Lyon 1, F-69100 Villeurbanne, France
- Philippe Veber
  - CNRS UMR 5558, LBBE, Univ Lyon, Université Claude Bernard Lyon 1, F-69100 Villeurbanne, France
- Laurent Guéguen
  - CNRS UMR 5558, LBBE, Univ Lyon, Université Claude Bernard Lyon 1, F-69100 Villeurbanne, France
- Nicolas Lartillot
  - CNRS UMR 5558, LBBE, Univ Lyon, Université Claude Bernard Lyon 1, F-69100 Villeurbanne, France
- Marie Sémon
  - ENS de Lyon, CNRS UMR 5239, INSERM U1210, LBMC, Univ Lyon, Université Claude Bernard Lyon 1, F-69007 Lyon, France
- Bastien Boussau
  - CNRS UMR 5558, LBBE, Univ Lyon, Université Claude Bernard Lyon 1, F-69100 Villeurbanne, France
|
20
|
Mendes FK, Livera AP, Hahn MW. The perils of intralocus recombination for inferences of molecular convergence. Philos Trans R Soc Lond B Biol Sci 2019; 374:20180244. [PMID: 31154973 DOI: 10.1098/rstb.2018.0244] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 01/10/2023] Open
Abstract
Accurate inferences of convergence require that the appropriate tree topology be used. If there is a mismatch between the tree a trait has evolved along and the tree used for analysis, then false inferences of convergence ('hemiplasy') can occur. To avoid problems of hemiplasy when there are high levels of gene tree discordance with the species tree, researchers have begun to construct tree topologies from individual loci. However, due to intralocus recombination, even locus-specific trees may contain multiple topologies within them. This implies that the use of individual tree topologies discordant with the species tree can still lead to incorrect inferences about molecular convergence. Here, we examine the frequency with which single exons and single protein-coding genes contain multiple underlying tree topologies, in primates and Drosophila, and quantify the effects of hemiplasy when using trees inferred from individual loci. In both clades, we find that there are most often multiple diagnosable topologies within single exons and whole genes, with 91% of Drosophila protein-coding genes containing multiple topologies. Because of this underlying topological heterogeneity, even using trees inferred from individual protein-coding genes results in 25% and 38% of substitutions falsely labelled as convergent in primates and Drosophila, respectively. While constructing local trees can reduce the problem of hemiplasy, our results suggest that it will be difficult to completely avoid false inferences of convergence. We conclude by suggesting several ways forward in the analysis of convergent evolution, for both molecular and morphological characters. This article is part of the theme issue 'Convergent evolution in the genomics era: new insights and directions'.
Affiliation(s)
- Fábio K Mendes
  - Department of Computer Science, The University of Auckland, Auckland 1010, New Zealand
  - Department of Biology, Indiana University, Bloomington, IN 47405, USA
- Andrew P Livera
  - Department of Biology, Indiana University, Bloomington, IN 47405, USA
- Matthew W Hahn
  - Department of Biology, Indiana University, Bloomington, IN 47405, USA
  - Department of Computer Science, Indiana University, Bloomington, IN 47405, USA
|
21
|
Musil M, Konegger H, Hon J, Bednar D, Damborsky J. Computational Design of Stable and Soluble Biocatalysts. ACS Catal 2018. [DOI: 10.1021/acscatal.8b03613] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Indexed: 01/04/2023]
Affiliation(s)
- Milos Musil
  - Loschmidt Laboratories, Centre for Toxic Compounds in the Environment (RECETOX), and Department of Experimental Biology, Faculty of Science, Masaryk University, 625 00 Brno, Czech Republic
  - IT4Innovations Centre of Excellence, Faculty of Information Technology, Brno University of Technology, 612 66 Brno, Czech Republic
  - International Clinical Research Center, St. Anne’s University Hospital, Pekarska 53, 656 91 Brno, Czech Republic
- Hannes Konegger
  - Loschmidt Laboratories, Centre for Toxic Compounds in the Environment (RECETOX), and Department of Experimental Biology, Faculty of Science, Masaryk University, 625 00 Brno, Czech Republic
  - International Clinical Research Center, St. Anne’s University Hospital, Pekarska 53, 656 91 Brno, Czech Republic
- Jiri Hon
  - Loschmidt Laboratories, Centre for Toxic Compounds in the Environment (RECETOX), and Department of Experimental Biology, Faculty of Science, Masaryk University, 625 00 Brno, Czech Republic
  - IT4Innovations Centre of Excellence, Faculty of Information Technology, Brno University of Technology, 612 66 Brno, Czech Republic
  - International Clinical Research Center, St. Anne’s University Hospital, Pekarska 53, 656 91 Brno, Czech Republic
- David Bednar
  - Loschmidt Laboratories, Centre for Toxic Compounds in the Environment (RECETOX), and Department of Experimental Biology, Faculty of Science, Masaryk University, 625 00 Brno, Czech Republic
  - International Clinical Research Center, St. Anne’s University Hospital, Pekarska 53, 656 91 Brno, Czech Republic
- Jiri Damborsky
  - Loschmidt Laboratories, Centre for Toxic Compounds in the Environment (RECETOX), and Department of Experimental Biology, Faculty of Science, Masaryk University, 625 00 Brno, Czech Republic
  - International Clinical Research Center, St. Anne’s University Hospital, Pekarska 53, 656 91 Brno, Czech Republic
|
22
|
Storz JF. Compensatory mutations and epistasis for protein function. Curr Opin Struct Biol 2018; 50:18-25. [PMID: 29100081 PMCID: PMC5936477 DOI: 10.1016/j.sbi.2017.10.009] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 09/06/2017] [Revised: 10/05/2017] [Accepted: 10/12/2017] [Indexed: 01/09/2023]
Abstract
Adaptive protein evolution may be facilitated by neutral amino acid mutations that confer no benefit when they first arise but which potentiate subsequent function-altering mutations via direct or indirect structural mechanisms. Theoretical and empirical results indicate that such compensatory interactions (intramolecular epistasis) can exert a strong influence on trajectories of protein evolution. For this reason, assessing the form and prevalence of intramolecular epistasis and characterizing biophysical mechanisms of compensatory interaction are important research goals at the nexus of structural biology and molecular evolution. Here I review recent insights derived from protein-engineering studies, and I describe an approach for identifying and characterizing mechanisms of epistasis that integrates experimental data on structure-function relationships with analyses of comparative sequence data.
Affiliation(s)
- Jay F Storz
  - University of Nebraska, School of Biological Sciences, Lincoln, NE 68588-0114, United States
|
23
|
Pervasive contingency and entrenchment in a billion years of Hsp90 evolution. Proc Natl Acad Sci U S A 2018; 115:4453-4458. [PMID: 29626131 DOI: 10.1073/pnas.1718133115] [Citation(s) in RCA: 60] [Impact Index Per Article: 10.0] [Indexed: 01/06/2023] Open
Abstract
Interactions among mutations within a protein have the potential to make molecular evolution contingent and irreversible, but the extent to which epistasis actually shaped historical evolutionary trajectories is unclear. To address this question, we experimentally measured how the fitness effects of historical sequence substitutions changed during the billion-year evolutionary history of the heat shock protein 90 (Hsp90) ATPase domain beginning from a deep eukaryotic ancestor to modern Saccharomyces cerevisiae. We found a pervasive influence of epistasis. Of 98 derived amino acid states that evolved along this lineage, about half compromise fitness when introduced into the reconstructed ancestral Hsp90. And the vast majority of ancestral states reduce fitness when introduced into the extant S. cerevisiae Hsp90. Overall, more than 75% of historical substitutions were contingent on permissive substitutions that rendered the derived state nondeleterious, became entrenched by subsequent restrictive substitutions that made the ancestral state deleterious, or both. This epistasis was primarily caused by specific interactions among sites rather than a general effect on the protein's tolerance to mutation. Our results show that epistasis continually opened and closed windows of mutational opportunity over evolutionary timescales, producing histories and biological states that reflect the transient internal constraints imposed by the protein's fleeting sequence states.
|
24
|
Dungan SZ, Chang BSW. Epistatic interactions influence terrestrial-marine functional shifts in cetacean rhodopsin. Proc Biol Sci 2018; 284:rspb.2016.2743. [PMID: 28250185 DOI: 10.1098/rspb.2016.2743] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 12/10/2016] [Accepted: 02/03/2017] [Indexed: 12/12/2022] Open
Abstract
Like many aquatic vertebrates, whales have blue-shifting spectral tuning substitutions in the dim-light visual pigment, rhodopsin, that are thought to increase photosensitivity in underwater environments. We have discovered that known spectral tuning substitutions also have surprising epistatic effects on another function of rhodopsin, the kinetic rates associated with light-activated intermediates. By using absorbance spectroscopy and fluorescence-based retinal release assays on heterologously expressed rhodopsin, we assessed both spectral and kinetic differences between cetaceans (killer whale) and terrestrial outgroups (hippo, bovine). Mutation experiments revealed that killer whale rhodopsin is unusually resilient to pleiotropic effects on retinal release from key blue-shifting substitutions (D83N and A292S), largely due to a surprisingly specific epistatic interaction between D83N and the background residue, S299. Ancestral sequence reconstruction indicated that S299 is an ancestral residue that predates the evolution of blue-shifting substitutions at the origins of Cetacea. Based on these results, we hypothesize that intramolecular epistasis helped to conserve rhodopsin's kinetic properties while enabling blue-shifting spectral tuning substitutions as cetaceans adapted to aquatic environments. Trade-offs between different aspects of molecular function are rarely considered in protein evolution, but in cetacean and other vertebrate rhodopsins, may underlie multiple evolutionary scenarios for the selection of specific amino acid substitutions.
Affiliation(s)
- Sarah Z Dungan
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada M5S 3B2
- Belinda S W Chang
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada M5S 3B2
  - Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, ON, Canada M5S 3B2
  - Department of Cell and Systems Biology, University of Toronto, Toronto, ON, Canada M5S 3G5
|
25
|
Klink GV, Bazykin GA. Parallel Evolution of Metazoan Mitochondrial Proteins. Genome Biol Evol 2018; 9:1341-1350. [PMID: 28595327 PMCID: PMC5520408 DOI: 10.1093/gbe/evx025] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Accepted: 02/06/2017] [Indexed: 12/11/2022] Open
Abstract
Amino acid propensities at amino acid sites change with time due to epistatic interactions or changing environment, affecting the probabilities of fixation of different amino acids. Such changes should lead to an increased rate of homoplasies (reversals, parallelisms, and convergences) at closely related species. Here, we reconstruct the phylogeny of twelve mitochondrial proteins from several thousand metazoan species, and measure the phylogenetic distances between branches at which either the same allele originated repeatedly due to homoplasies, or different alleles originated due to divergent substitutions. The mean phylogenetic distance between parallel substitutions is ∼20% lower than the mean phylogenetic distance between divergent substitutions, indicating that a variant fixed in a species is more likely to be deleterious in a more phylogenetically remote species, compared with a more closely related species. These findings are robust to artefacts of phylogenetic reconstruction or of pooling of sites from different conservation classes or functional groups, and imply that single-position fitness landscapes change at rates similar to rates of amino acid changes.
Affiliation(s)
- Galya V Klink
- Institute for Information Transmission Problems (Kharkevich Institute) of the Russian Academy of Sciences, Moscow, Russia
- Georgii A Bazykin
- Institute for Information Transmission Problems (Kharkevich Institute) of the Russian Academy of Sciences, Moscow, Russia; Skolkovo Institute of Science and Technology, Skolkovo, Russia
|
26
|
Klink GV, Golovin AV, Bazykin GA. Substitutions into amino acids that are pathogenic in human mitochondrial proteins are more frequent in lineages closely related to human than in distant lineages. PeerJ 2017; 5:e4143. [PMID: 29250469 PMCID: PMC5731343 DOI: 10.7717/peerj.4143] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 08/30/2017] [Accepted: 11/16/2017] [Indexed: 11/23/2022] Open
Abstract
Propensities for different amino acids within a protein site change in the course of evolution, so that an amino acid deleterious in a particular species may be acceptable at the same site in a different species. Here, we study the amino acid-changing variants in human mitochondrial genes, and analyze their occurrence in non-human species. We show that substitutions giving rise to such variants tend to occur in lineages closely related to human more frequently than in more distantly related lineages, indicating that a human variant is more likely to be deleterious in more distant species. Unexpectedly, substitutions giving rise to amino acids that correspond to alleles pathogenic in humans also more frequently occur in more closely related lineages. Therefore, a pathogenic variant still tends to be more acceptable in human mitochondria than a variant that may only be fit after a substantial perturbation of the protein structure.
Affiliation(s)
- Galya V. Klink
- Sector of Molecular Evolution, Institute for Information Transmission Problems (Kharkevich Institute) of the Russian Academy of Sciences, Moscow, Russian Federation
- Andrey V. Golovin
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russian Federation
- Georgii A. Bazykin
- Sector of Molecular Evolution, Institute for Information Transmission Problems (Kharkevich Institute) of the Russian Academy of Sciences, Moscow, Russian Federation
- Center for Data-Intensive Biomedicine and Biotechnology, Skolkovo Institute of Science and Technology, Skolkovo, Russian Federation
|
27
|
Zou Z, Zhang J. Gene Tree Discordance Does Not Explain Away the Temporal Decline of Convergence in Mammalian Protein Sequence Evolution. Mol Biol Evol 2017; 34:1682-1688. [PMID: 28379570 DOI: 10.1093/molbev/msx109] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 12/28/2022] Open
Abstract
Several authors reported lower frequencies of protein sequence convergence between more distantly related evolutionary lineages and attributed this trend to epistasis, which renders the acceptable amino acids at a site more different and convergence less likely in more divergent lineages. A recent primate study, however, suggested that this trend is at least partially and potentially entirely an artifact of gene tree discordance (GTD). Here, we demonstrate in a genome-wide data set from 17 mammals that the temporal trend remains (1) upon the control of the GTD level, (2) in genes whose genealogies are concordant with the species tree, and (3) for convergent changes, which are extremely unlikely to be caused by GTD. Similar results are observed in a comparable data set of 12 fruit flies in some but not all of these tests. We conclude that, at least in some cases, the temporal decline of convergence is genuine, reflecting an impact of epistasis on protein evolution.
Affiliation(s)
- Zhengting Zou
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI
- Jianzhi Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI
|
28
|
Goldstein RA, Pollock DD. Sequence entropy of folding and the absolute rate of amino acid substitutions. Nat Ecol Evol 2017; 1:1923-1930. [PMID: 29062121 PMCID: PMC5701738 DOI: 10.1038/s41559-017-0338-9] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Received: 01/12/2017] [Accepted: 09/05/2017] [Indexed: 12/01/2022]
Abstract
Adequate representations of protein evolution should consider how the acceptance of mutations depends on the sequence context in which they arise. However, epistatic interactions among sites in a protein result in time and spatial substitution rate heterogeneity beyond the capabilities of current models. Here, we exploit parallels between amino acid substitutions and chemical reaction kinetics to develop an improved theory of protein evolution. We constructed a mechanistic framework for modelling amino acid substitution rates that employs the formalisms of statistical mechanics, with population genetics principles underlying the analysis. Theoretical analyses and computer simulations of proteins under purifying selection for thermodynamic stability show that substitution rates and the stabilisation of resident amino acids (the ‘evolutionary Stokes shift’) can be predicted from biophysics and the effect of sequence entropy alone. Furthermore, we demonstrate that substitutions predominantly occur when epistatic interactions result in near neutrality; substitution rates are determined by how often epistasis results in such nearly neutral conditions. This theory provides a general framework for modelling protein sequence change under purifying selection, potentially explains patterns of convergence and mutation rates in real proteins that are incompatible with previous models, and provides a better null model for the detection of adaptive changes.
Affiliation(s)
- Richard A Goldstein
- Division of Infection and Immunity, University College London, London, WC1E 6BT, UK
- David D Pollock
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
|
29
|
Thomas GWC, Hahn MW, Hahn Y. The Effects of Increasing the Number of Taxa on Inferences of Molecular Convergence. Genome Biol Evol 2017; 9:213-221. [PMID: 28057728 PMCID: PMC5381636 DOI: 10.1093/gbe/evw306] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Accepted: 01/01/2017] [Indexed: 12/27/2022] Open
Abstract
Convergent evolution provides insight into the link between phenotype and genotype. Recently, large-scale comparative studies of convergent evolution have become possible, but researchers are still trying to determine the best way to design these types of analyses. One aspect of molecular convergence studies that has not yet been investigated is how taxonomic sample size affects inferences of molecular convergence. Here we show that increased sample size decreases the amount of inferred molecular convergence associated with the three convergent transitions to a marine environment in mammals. The sampling of more taxa, both with and without the convergent phenotype, reveals that alleles associated only with marine mammals in small datasets are actually more widespread, or are not shared by all marine species. The sampling of more taxa also allows finer resolution of ancestral substitutions, revealing that they are not in fact on lineages leading to solely marine species. We revisit a previous study on marine mammals and find that only 7 of the reported 43 genes with convergent substitutions still show signs of convergence with a larger number of background species. However, four of those seven genes also showed signs of positive selection in the original analysis and may still be good candidates for adaptive convergence. Though our study is framed around the convergence of marine mammals, we expect our conclusions on taxonomic sampling are generalizable to any study of molecular convergence.
Affiliation(s)
- Gregg W C Thomas
- Department of Biology and School of Informatics and Computing, Indiana University, Bloomington, Indiana
- Matthew W Hahn
- Department of Biology and School of Informatics and Computing, Indiana University, Bloomington, Indiana
- Yoonsoo Hahn
- Department of Life Science, Research Center for Biomolecules and Biosystems, Chung-Ang University, Seoul, Republic of Korea
|
30
|
Xu S, He Z, Guo Z, Zhang Z, Wyckoff GJ, Greenberg A, Wu CI, Shi S. Genome-Wide Convergence during Evolution of Mangroves from Woody Plants. Mol Biol Evol 2017; 34:1008-1015. [PMID: 28087771 DOI: 10.1093/molbev/msw277] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Indexed: 12/14/2022] Open
Abstract
When living organisms independently invade a new environment, the evolution of similar phenotypic traits is often observed. An interesting but contentious issue is whether the underlying molecular biology also converges in the new habitat. Independent invasions of tropical intertidal zones by woody plants, collectively referred to as mangrove trees, represent some dramatic examples. The high salinity, hypoxia, and other stressors in the new habitat might have affected both genomic features and protein structures. Here, we developed a new method for detecting convergence at conservative Sites (CCS) and applied it to the genomic sequences of mangroves. In simulations, the CCS method drastically reduces random convergence at rapidly evolving sites as well as falsely inferred convergence caused by the misinferences of the ancestral character. In mangrove genomes, we estimated ∼400 genes that have experienced convergence over the background level of convergence in the nonmangrove relatives. The convergent genes are enriched in pathways related to stress response and embryo development, which could be important for mangroves' adaptation to the new habitat.
Affiliation(s)
- Shaohua Xu
- State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory of Plant Resources, Key Laboratory of Biodiversity Dynamics and Conservation of Guangdong Higher Education Institutes, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China
- Ziwen He
- State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory of Plant Resources, Key Laboratory of Biodiversity Dynamics and Conservation of Guangdong Higher Education Institutes, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China
- Zixiao Guo
- State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory of Plant Resources, Key Laboratory of Biodiversity Dynamics and Conservation of Guangdong Higher Education Institutes, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China
- Zhang Zhang
- State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory of Plant Resources, Key Laboratory of Biodiversity Dynamics and Conservation of Guangdong Higher Education Institutes, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China
- Gerald J Wyckoff
- Molecular Biology and Biochemistry, University of Missouri-Kansas City, Kansas City, MO
- Chung-I Wu
- State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory of Plant Resources, Key Laboratory of Biodiversity Dynamics and Conservation of Guangdong Higher Education Institutes, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China; Department of Ecology and Evolution, University of Chicago, Chicago, IL
- Suhua Shi
- State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory of Plant Resources, Key Laboratory of Biodiversity Dynamics and Conservation of Guangdong Higher Education Institutes, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China
|
31
|
Shen XX, Hittinger CT, Rokas A. Contentious relationships in phylogenomic studies can be driven by a handful of genes. Nat Ecol Evol 2017; 1:126. [PMID: 28812701 PMCID: PMC5560076 DOI: 10.1038/s41559-017-0126] [Citation(s) in RCA: 256] [Impact Index Per Article: 36.6] [Received: 09/25/2016] [Accepted: 03/01/2017] [Indexed: 01/05/2023]
Abstract
Phylogenomic studies have resolved countless branches of the tree of life, but remain strongly contradictory on certain, contentious relationships. Here, we use a maximum likelihood framework to quantify the distribution of phylogenetic signal among genes and sites for 17 contentious branches and 6 well-established control branches in plant, animal and fungal phylogenomic data matrices. We find that resolution in some of these 17 branches rests on a single gene or a few sites, and that removal of a single gene in concatenation analyses or a single site from every gene in coalescence-based analyses diminishes support and can alter the inferred topology. These results suggest that tiny subsets of very large data matrices drive the resolution of specific internodes, providing a dissection of the distribution of support and observed incongruence in phylogenomic analyses. We submit that quantifying the distribution of phylogenetic signal in phylogenomic data is essential for evaluating whether branches, especially contentious ones, are truly resolved. Finally, we offer one detailed example of such an evaluation for the controversy regarding the earliest-branching metazoan phylum, for which examination of the distributions of gene-wise and site-wise phylogenetic signal across eight data matrices consistently supports ctenophores as the sister group to all other metazoans.
Affiliation(s)
- Xing-Xing Shen
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
- Chris Todd Hittinger
- Laboratory of Genetics, Genome Center of Wisconsin, DOE Great Lakes Bioenergy Research Center, Wisconsin Energy Institute, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA
- Antonis Rokas
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
|
32
|
Fukushima K, Fang X, Alvarez-Ponce D, Cai H, Carretero-Paulet L, Chen C, Chang TH, Farr KM, Fujita T, Hiwatashi Y, Hoshi Y, Imai T, Kasahara M, Librado P, Mao L, Mori H, Nishiyama T, Nozawa M, Pálfalvi G, Pollard ST, Rozas J, Sánchez-Gracia A, Sankoff D, Shibata TF, Shigenobu S, Sumikawa N, Uzawa T, Xie M, Zheng C, Pollock DD, Albert VA, Li S, Hasebe M. Genome of the pitcher plant Cephalotus reveals genetic changes associated with carnivory. Nat Ecol Evol 2017; 1:59. [DOI: 10.1038/s41559-016-0059] [Citation(s) in RCA: 71] [Impact Index Per Article: 10.1] [Received: 07/12/2016] [Accepted: 12/16/2016] [Indexed: 11/09/2022]
|
33
|
Teufel AI, Wilke CO. Accelerated simulation of evolutionary trajectories in origin-fixation models. J R Soc Interface 2017; 14:20160906. [PMID: 28228542 PMCID: PMC5332577 DOI: 10.1098/rsif.2016.0906] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 11/10/2016] [Accepted: 01/31/2017] [Indexed: 11/12/2022] Open
Abstract
We present an accelerated algorithm to forward-simulate origin-fixation models. Our algorithm requires, on average, only about two fitness evaluations per fixed mutation, whereas traditional algorithms require, per one fixed mutation, a number of fitness evaluations of the order of the effective population size, Ne. Our accelerated algorithm yields the exact same steady state as the original algorithm but produces a different order of fixed mutations. By comparing several relevant evolutionary metrics, such as the distribution of fixed selection coefficients and the probability of reversion, we find that the two algorithms behave equivalently in many respects. However, the accelerated algorithm yields less variance in fixed selection coefficients. Notably, we are able to recover the expected amount of variance by rescaling population size, and we find a linear relationship between the rescaled population size and the population size used by the original algorithm. Considering the widespread usage of origin-fixation simulations across many areas of evolutionary biology, we introduce our accelerated algorithm as a useful tool for increasing the computational complexity of fitness functions without sacrificing much in terms of accuracy of the evolutionary simulation.
Affiliation(s)
- Ashley I Teufel
- Department of Integrative Biology, Institute for Cellular and Molecular Biology, and Center for Computational Biology and Bioinformatics, The University of Texas at Austin, Austin, TX 78712, USA
- Claus O Wilke
- Department of Integrative Biology, Institute for Cellular and Molecular Biology, and Center for Computational Biology and Bioinformatics, The University of Texas at Austin, Austin, TX 78712, USA
|
34
|
Bazykin GA. Changing preferences: deformation of single position amino acid fitness landscapes and evolution of proteins. Biol Lett 2016; 11:rsbl.2015.0315. [PMID: 26445980 DOI: 10.1098/rsbl.2015.0315] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Indexed: 01/20/2023] Open
Abstract
The fitness landscape (the function that relates genotypes to fitness) and its role in directing evolution are a central object of evolutionary biology. However, its huge dimensionality precludes understanding of even the basic aspects of its shape. One way to approach it is to ask a simpler question: what are the properties of a function that assigns fitness to each possible variant at just one particular site (a single position fitness landscape), and how does it change in the course of evolution? Analyses of genomic data from multiple species and multiple individuals within a species have proved beyond reasonable doubt that fitness functions of positions throughout the genome do themselves change with time, thus shaping protein evolution. Here, I will briefly review the literature that addresses these dynamics, focusing on recent genome-scale analyses of fitness functions of amino acid sites, i.e. vectors of fitnesses of 20 individual amino acid variants at a given position of a protein. The set of amino acids that confer high fitness at a particular position changes with time, and the rate of this change is comparable with the rate at which a position evolves, implying that this process plays a major role in evolutionary dynamics. However, the causes of these changes remain largely unclear.
Affiliation(s)
- Georgii A Bazykin
- Institute for Information Transmission Problems (Kharkevich Institute) of the Russian Academy of Sciences, Moscow 127051, Russia; Faculty of Bioengineering and Bioinformatics and Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow 119234, Russia; Pirogov Russian National Research Medical University, Moscow 117997, Russia
|
35
|
Mendes FK, Hahn Y, Hahn MW. Gene Tree Discordance Can Generate Patterns of Diminishing Convergence over Time. Mol Biol Evol 2016; 33:3299-3307. [PMID: 27634870 DOI: 10.1093/molbev/msw197] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Indexed: 11/13/2022] Open
Abstract
Phenotypic convergence is an exciting outcome of adaptive evolution, occurring when different species find similar solutions to the same problem. Unraveling the molecular basis of convergence provides a way to link genotype to adaptive phenotypes, but can also shed light on the extent to which molecular evolution is repeatable and predictable. Many recent genome-wide studies have uncovered a striking pattern of diminishing convergence over time, ascribing this pattern to the presence of intramolecular epistatic interactions. Here, we consider gene tree discordance as an alternative cause of changes in convergence levels over time in a primate dataset. We demonstrate that gene tree discordance can produce patterns of diminishing convergence by itself, and that controlling for discordance as a cause of apparent convergence makes the pattern disappear. We also show that synonymous substitutions, where neither selection nor epistasis should be prevalent, have the same diminishing pattern of molecular convergence in primates. Finally, we demonstrate that even in situations where biological discordance is not possible, discordance due to errors in species tree inference can drive similar patterns. Though intramolecular epistasis could in principle create a pattern of declining convergence over time, our results suggest a possible alternative explanation for this widespread pattern. These results contribute to a growing appreciation not just of the presence of gene tree discordance, but of the unpredictable effects this discordance can have on analyses of molecular evolution.
Affiliation(s)
- Fábio K Mendes
- Department of Biology, Indiana University, Bloomington, IN
- Yoonsoo Hahn
- Department of Life Science, Research Center for Biomolecules and Biosystems, Chung-Ang University, Seoul, Republic of Korea
- Matthew W Hahn
- Department of Biology, Indiana University, Bloomington, IN; School of Informatics and Computing, Indiana University, Bloomington, IN
|
36
|
Starr TN, Thornton JW. Epistasis in protein evolution. Protein Sci 2016; 25:1204-18. [PMID: 26833806 PMCID: PMC4918427 DOI: 10.1002/pro.2897] [Citation(s) in RCA: 289] [Impact Index Per Article: 36.1] [Received: 12/02/2015] [Revised: 01/25/2016] [Accepted: 01/27/2016] [Indexed: 01/18/2023]
Abstract
The structure, function, and evolution of proteins depend on physical and genetic interactions among amino acids. Recent studies have used new strategies to explore the prevalence, biochemical mechanisms, and evolutionary implications of these interactions, called epistasis, within proteins. Here we describe an emerging picture of pervasive epistasis in which the physical and biological effects of mutations change over the course of evolution in a lineage-specific fashion. Epistasis can restrict the trajectories available to an evolving protein or open new paths to sequences and functions that would otherwise have been inaccessible. We describe two broad classes of epistatic interactions, which arise from different physical mechanisms and have different effects on evolutionary processes. Specific epistasis, in which one mutation influences the phenotypic effect of few other mutations, is caused by direct and indirect physical interactions between mutations, which nonadditively change the protein's physical properties, such as conformation, stability, or affinity for ligands. In contrast, nonspecific epistasis describes mutations that modify the effect of many others; these typically behave additively with respect to the physical properties of a protein but exhibit epistasis because of a nonlinear relationship between the physical properties and their biological effects, such as function or fitness. Both types of interaction are rampant, but specific epistasis has stronger effects on the rate and outcomes of evolution, because it imposes stricter constraints and modulates evolutionary potential more dramatically; it therefore makes evolution more contingent on low-probability historical events and leaves stronger marks on the sequences, structures, and functions of protein families.
Affiliation(s)
- Tyler N Starr
- Graduate Program in Biochemistry and Molecular Biophysics, University of Chicago, Chicago, Illinois, 60637
- Joseph W Thornton
- Departments of Ecology and Evolution and Human Genetics, University of Chicago, Chicago, Illinois, 60637
|
37
|
Wheeler LC, Lim SA, Marqusee S, Harms MJ. The thermostability and specificity of ancient proteins. Curr Opin Struct Biol 2016; 38:37-43. [PMID: 27288744 DOI: 10.1016/j.sbi.2016.05.015] [Citation(s) in RCA: 85] [Impact Index Per Article: 10.6] [Received: 03/03/2016] [Revised: 05/18/2016] [Accepted: 05/24/2016] [Indexed: 11/16/2022]
Abstract
Were ancient proteins systematically different than modern proteins? The answer to this question is profoundly important, shaping how we understand the origins of protein biochemical, biophysical, and functional properties. Ancestral sequence reconstruction (ASR), a phylogenetic approach to infer the sequences of ancestral proteins, may reveal such trends. We discuss two proposed trends: a transition from higher to lower thermostability and a tendency for proteins to acquire higher specificity over time. We review the evidence for elevated ancestral thermostability and discuss its possible origins in a changing environmental temperature and/or reconstruction bias. We also conclude that there is, as yet, insufficient data to support a trend from promiscuity to specificity. Finally, we propose future work to understand these proposed evolutionary trends.
Affiliation(s)
- Lucas C Wheeler
- Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR, United States; Institute of Molecular Biology, University of Oregon, Eugene, OR, United States
- Shion A Lim
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, United States; Institute for Quantitative Biosciences (QB3), University of California, Berkeley, Berkeley, CA, United States
- Susan Marqusee
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, United States; Institute for Quantitative Biosciences (QB3), University of California, Berkeley, Berkeley, CA, United States
- Michael J Harms
- Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR, United States; Institute of Molecular Biology, University of Oregon, Eugene, OR, United States
|
38
|
Epistasis and the Dynamics of Reversion in Molecular Evolution. Genetics 2016; 203:1335-51. [PMID: 27194749 DOI: 10.1534/genetics.116.188961] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.9] [Received: 03/08/2016] [Accepted: 04/27/2016] [Indexed: 12/27/2022] Open
Abstract
Recent studies of protein evolution contend that the longer an amino acid substitution is present at a site, the less likely it is to revert to the amino acid previously occupying that site. Here we study this phenomenon of decreasing reversion rates rigorously and in a much more general context. We show that, under weak mutation and for arbitrary fitness landscapes, reversion rates decrease with time for any site that is involved in at least one epistatic interaction. Specifically, we prove that, at stationarity, the hazard function of the distribution of waiting times until reversion is strictly decreasing for any such site. Thus, in the presence of epistasis, the longer a particular character has been absent from a site, the less likely the site will revert to its prior state. We also explore several examples of this general result, which share a common pattern whereby the probability of having reverted increases rapidly at short times to some substantial value before becoming almost flat after a few substitutions at other sites. This pattern indicates a characteristic tendency for reversion to occur either almost immediately after the initial substitution or only after a very long time.
|
39
|
Goldstein RA, Pollock DD. The tangled bank of amino acids. Protein Sci 2016; 25:1354-62. [PMID: 27028523 PMCID: PMC4918418 DOI: 10.1002/pro.2930] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 01/18/2016] [Revised: 03/24/2016] [Accepted: 03/24/2016] [Indexed: 12/01/2022]
Abstract
The use of amino acid substitution matrices to model protein evolution has yielded important insights into both the evolutionary process and the properties of specific protein families. In order to make these models tractable, standard substitution matrices represent the average results of the evolutionary process rather than the underlying molecular biophysics and population genetics, treating proteins as a set of independently evolving sites rather than as an integrated biomolecular entity. With advances in computing and the increasing availability of sequence data, we now have an opportunity to move beyond current substitution matrices to more interpretable mechanistic models with greater fidelity to the evolutionary process of mutation and selection and the holistic nature of the selective constraints. As part of this endeavour, we consider how epistatic interactions induce spatial and temporal rate heterogeneity, and demonstrate how these generally ignored factors can reconcile standard substitution rate matrices and the underlying biology, allowing us to better understand the meaning of these substitution rates. Using computational simulations of protein evolution, we can demonstrate the importance of both spatial and temporal heterogeneity in modelling protein evolution.
Affiliation(s)
- Richard A Goldstein: Division of Infection and Immunity, University College London, London, WC1E 6BT, UK
- David D Pollock: Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, Colorado, 80045
|
40
|
Abstract
To what extent is the convergent evolution of protein function attributable to convergent or parallel changes at the amino acid level? The mutations that contribute to adaptive protein evolution may represent a biased subset of all possible beneficial mutations owing to mutation bias and/or variation in the magnitude of deleterious pleiotropy. A key finding is that the fitness effects of amino acid mutations are often conditional on genetic background. This context dependence (epistasis) can reduce the probability of convergence and parallelism because it reduces the number of possible mutations that are unconditionally acceptable in divergent genetic backgrounds. Here, I review factors that influence the probability of replicated evolution at the molecular level.
Affiliation(s)
- Jay F Storz: School of Biological Sciences, University of Nebraska, Lincoln, Nebraska 68588, USA
|
41
|
The first whole genome and transcriptome of the cinereous vulture reveals adaptation in the gastric and immune defense systems and possible convergent evolution between the Old and New World vultures. Genome Biol 2015; 16:215. [PMID: 26486310 PMCID: PMC4618389 DOI: 10.1186/s13059-015-0780-4] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 05/26/2015] [Accepted: 09/15/2015] [Indexed: 12/28/2022] Open
Abstract
BACKGROUND: The cinereous vulture, Aegypius monachus, is the largest bird of prey and plays a key role in the ecosystem by removing carcasses, thus preventing the spread of diseases. Its feeding habits force it to cope with constant exposure to pathogens, making this species an interesting target for discovering functionally selected genetic variants. Furthermore, the presence of two independently evolved vulture groups, Old World and New World vultures, provides a natural experiment in which to investigate convergent evolution due to obligate scavenging. RESULTS: We sequenced the genome of a cinereous vulture, and mapped it to the bald eagle reference genome, a close relative with a divergence time of 18 million years. By comparing the cinereous vulture to other avian genomes, we find positively selected genetic variations in this species associated with respiration, likely linked to their ability of immune defense responses and gastric acid secretion, consistent with their ability to digest carcasses. Comparisons between the Old World and New World vulture groups suggest convergent gene evolution. We assemble the cinereous vulture blood transcriptome from a second individual, and annotate genes. Finally, we infer the demographic history of the cinereous vulture which shows marked fluctuations in effective population size during the late Pleistocene. CONCLUSIONS: We present the first genome and transcriptome analyses of the cinereous vulture compared to other avian genomes and transcriptomes, revealing genetic signatures of dietary and environmental adaptations accompanied by possible convergent evolution between the Old World and New World vultures.
|
42
|
Contingency and entrenchment in protein evolution under purifying selection. Proc Natl Acad Sci U S A 2015; 112:E3226-35. [PMID: 26056312 DOI: 10.1073/pnas.1412933112] [Citation(s) in RCA: 140] [Impact Index Per Article: 15.6] [Indexed: 01/06/2023] Open
Abstract
The phenotypic effect of an allele at one genetic site may depend on alleles at other sites, a phenomenon known as epistasis. Epistasis can profoundly influence the process of evolution in populations and shape the patterns of protein divergence across species. Whereas epistasis between adaptive substitutions has been studied extensively, relatively little is known about epistasis under purifying selection. Here we use computational models of thermodynamic stability in a ligand-binding protein to explore the structure of epistasis in simulations of protein sequence evolution. Even though the predicted effects on stability of random mutations are almost completely additive, the mutations that fix under purifying selection are enriched for epistasis. In particular, the mutations that fix are contingent on previous substitutions: Although nearly neutral at their time of fixation, these mutations would be deleterious in the absence of preceding substitutions. Conversely, substitutions under purifying selection are subsequently entrenched by epistasis with later substitutions: They become increasingly deleterious to revert over time. Our results imply that, even under purifying selection, protein sequence evolution is often contingent on history and so it cannot be predicted by the phenotypic effects of mutations assayed in the ancestral background.
|
43
|
Evaluation of Ancestral Sequence Reconstruction Methods to Infer Nonstationary Patterns of Nucleotide Substitution. Genetics 2015; 200:873-90. [PMID: 25948563 DOI: 10.1534/genetics.115.177386] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 02/09/2015] [Accepted: 04/28/2015] [Indexed: 01/07/2023] Open
Abstract
Inference of gene sequences in ancestral species has been widely used to test hypotheses concerning the process of molecular sequence evolution. However, the approach may produce spurious results, mainly because using the single best reconstruction while ignoring the suboptimal ones creates systematic biases. Here we implement methods to correct for such biases and use computer simulation to evaluate their performance when the substitution process is nonstationary. The methods we evaluated include parsimony and likelihood using the single best reconstruction (SBR), averaging over reconstructions weighted by the posterior probabilities (AWP), and a new method called expected Markov counting (EMC) that produces maximum-likelihood estimates of substitution counts for any branch under a nonstationary Markov model. We simulated base composition evolution on a phylogeny for six species, with different selective pressures on G+C content among lineages, and compared the counts of nucleotide substitutions recorded during simulation with the inference by different methods. We found that large systematic biases resulted from (i) the use of parsimony or likelihood with SBR, (ii) the use of a stationary model when the substitution process is nonstationary, and (iii) the use of the Hasegawa-Kishino-Yano (HKY) model, which is too simple to adequately describe the substitution process. The nonstationary general time reversible (GTR) model, used with AWP or EMC, accurately recovered the substitution counts, even in cases of complex parameter fluctuations. We discuss model complexity and the compromise between bias and variance and suggest that the new methods may be useful for studying complex patterns of nucleotide substitution in large genomic data sets.
|
44
|
Zou Z, Zhang J. Are Convergent and Parallel Amino Acid Substitutions in Protein Evolution More Prevalent Than Neutral Expectations? Mol Biol Evol 2015; 32:2085-96. [PMID: 25862140 DOI: 10.1093/molbev/msv091] [Citation(s) in RCA: 85] [Impact Index Per Article: 9.4] [Indexed: 11/12/2022] Open
Abstract
Convergent and parallel amino acid substitutions in protein evolution, collectively referred to as molecular convergence here, have small probabilities under neutral evolution. For this reason, molecular convergence is commonly viewed as evidence for similar adaptations of different species. The surge in the number of reports of molecular convergence in the last decade raises the intriguing question of whether molecular convergence occurs substantially more frequently than expected under neutral evolution. We here address this question using all one-to-one orthologous proteins encoded by the genomes of 12 fruit fly species and those encoded by 17 mammals. We found that the expected amount of molecular convergence varies greatly depending on the specific neutral substitution model assumed at each amino acid site and that the observed amount of molecular convergence is explainable by neutral models incorporating site-specific information of acceptable amino acids. Interestingly, the total number of convergent and parallel substitutions between two lineages, relative to the neutral expectation, decreases with the genetic distance between the two lineages, regardless of the model used in computing the neutral expectation. We hypothesize that this trend results from differences in the amino acids acceptable at a given site among different clades of a phylogeny, due to prevalent epistasis, and provide simulation as well as empirical evidence for this hypothesis. Together, our study finds no genomic evidence for higher-than-neutral levels of molecular convergence, but suggests the presence of abundant epistasis that decreases the likelihood of molecular convergence between distantly related lineages.
Affiliation(s)
- Zhengting Zou: Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor
- Jianzhi Zhang: Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor
|