1
Rojas Chávez RA, Fili M, Han C, Rahman SA, Bicar IGL, Gregory S, Helverson A, Hu G, Darbro BW, Das J, Brown GD, Haim H. Mapping the Evolutionary Space of SARS-CoV-2 Variants to Anticipate Emergence of Subvariants Resistant to COVID-19 Therapeutics. PLoS Comput Biol 2024; 20:e1012215. [PMID: 38857308 PMCID: PMC11192331 DOI: 10.1371/journal.pcbi.1012215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Revised: 06/21/2024] [Accepted: 05/30/2024] [Indexed: 06/12/2024] Open
Abstract
New sublineages of SARS-CoV-2 variants-of-concern (VOCs) continuously emerge with mutations in the spike glycoprotein. In most cases, the sublineage-defining mutations vary between the VOCs. It is unclear whether these differences reflect lineage-specific likelihoods for mutations at each spike position or the stochastic nature of their appearance. Here we show that SARS-CoV-2 lineages have distinct evolutionary spaces (a probabilistic definition of the sequence states that can be occupied by expanding virus subpopulations). This space can be accurately inferred from the patterns of amino acid variability at the whole-protein level. Robust networks of co-variable sites identify the highest-likelihood mutations in new VOC sublineages and predict remarkably well the emergence of subvariants with resistance mutations to COVID-19 therapeutics. Our studies reveal the contribution of low frequency variant patterns at heterologous sites across the protein to accurate prediction of the changes at each position of interest.
Affiliation(s)
- Mohammad Fili
  - Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, Iowa, United States of America
- Changze Han
  - Department of Microbiology and Immunology, The University of Iowa, Iowa City, Iowa, United States of America
- Syed A. Rahman
  - Center for Systems Immunology, Departments of Immunology and Computational & Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, United States of America
- Isaiah G. L. Bicar
  - Department of Microbiology and Immunology, The University of Iowa, Iowa City, Iowa, United States of America
- Sullivan Gregory
  - Department of Microbiology and Immunology, The University of Iowa, Iowa City, Iowa, United States of America
- Annika Helverson
  - Department of Biostatistics, College of Public Health, The University of Iowa, Iowa City, Iowa, United States of America
- Guiping Hu
  - Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, Iowa, United States of America
- Benjamin W. Darbro
  - Department of Pediatrics, University of Iowa Hospitals and Clinics, Iowa City, Iowa, United States of America
- Jishnu Das
  - Center for Systems Immunology, Departments of Immunology and Computational & Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, United States of America
- Grant D. Brown
  - Department of Biostatistics, College of Public Health, The University of Iowa, Iowa City, Iowa, United States of America
- Hillel Haim
  - Department of Microbiology and Immunology, The University of Iowa, Iowa City, Iowa, United States of America
2
McDiarmid CS, Hooper DM, Stier A, Griffith SC. Mitonuclear interactions impact aerobic metabolism in hybrids and may explain mitonuclear discordance in young, naturally hybridizing bird lineages. Mol Ecol 2024; 33:e17374. [PMID: 38727686 DOI: 10.1111/mec.17374] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Revised: 02/26/2024] [Accepted: 03/20/2024] [Indexed: 06/07/2024]
Abstract
Understanding genetic incompatibilities and genetic introgression between incipient species are major goals in evolutionary biology. Mitochondrial genes evolve rapidly and exist in dense gene networks with coevolved nuclear genes, suggesting that mitochondrial respiration may be particularly susceptible to disruption in hybrid organisms. Mitonuclear interactions have been demonstrated to contribute to hybrid dysfunction between deeply divergent taxa crossed in the laboratory, but there are few empirical examples of mitonuclear interactions between younger lineages that naturally hybridize. Here, we use controlled hybrid crosses and high-resolution respirometry to provide the first experimental evidence in a bird that inter-lineage mitonuclear interactions impact mitochondrial aerobic metabolism. Specifically, respiration capacity of the two mitodiscordant backcrosses (with mismatched mitonuclear combinations) differs from one another, although they do not differ significantly from the parental groups or mitoconcordant backcrosses as we would expect of mitonuclear disruptions. In the wild hybrid zone between these subspecies, the mitochondrial cline centre is shifted west of the nuclear cline centre, which is consistent with the direction of our experimental results. Our results therefore demonstrate asymmetric mitonuclear interactions that impact the capacity of cellular mitochondrial respiration and may help to explain the geographic discordance between mitochondrial and nuclear genomes observed in the wild.
Affiliation(s)
- Callum S McDiarmid
  - School of Natural Sciences, Macquarie University, Sydney, New South Wales, Australia
- Daniel M Hooper
  - Institute for Comparative Genomics and Richard Gilder Graduate School, American Museum of Natural History, New York, New York, USA
- Antoine Stier
  - Department of Biology, University of Turku, Turku, Finland
  - Institut Pluridisciplinaire Hubert Curien, UMR7178, Université de Strasbourg, CNRS, Strasbourg, France
- Simon C Griffith
  - School of Natural Sciences, Macquarie University, Sydney, New South Wales, Australia
3
Mohammadi S, Herrera-Álvarez S, Yang L, Rodríguez-Ordoñez MDP, Zhang K, Storz JF, Dobler S, Crawford AJ, Andolfatto P. Constraints on the evolution of toxin-resistant Na,K-ATPases have limited dependence on sequence divergence. PLoS Genet 2022; 18:e1010323. [PMID: 35972957 DOI: 10.1101/2021.11.29.470343] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/31/2022] [Revised: 09/09/2022] [Accepted: 07/04/2022] [Indexed: 05/25/2023] Open
Abstract
A growing body of theoretical and experimental evidence suggests that intramolecular epistasis is a major determinant of rates and patterns of protein evolution and imposes a substantial constraint on the evolution of novel protein functions. Here, we examine the role of intramolecular epistasis in the recurrent evolution of resistance to cardiotonic steroids (CTS) across tetrapods, which occurs via specific amino acid substitutions to the α-subunit family of Na,K-ATPases (ATP1A). After identifying a series of recurrent substitutions at two key sites of ATP1A that are predicted to confer CTS resistance in diverse tetrapods, we then performed protein engineering experiments to test the functional consequences of introducing these substitutions onto divergent species backgrounds. In line with previous results, we find that substitutions at these sites can have substantial background-dependent effects on CTS resistance. Globally, however, these substitutions also have pleiotropic effects that are consistent with additive rather than background-dependent effects. Moreover, the magnitude of a substitution's effect on activity does not depend on the overall extent of ATP1A sequence divergence between species. Our results suggest that epistatic constraints on the evolution of CTS-resistant forms of Na,K-ATPase likely depend on a small number of sites, with little dependence on overall levels of protein divergence. We propose that dependence on a limited number of sites may account for the observation of convergent CTS resistance substitutions observed among taxa with highly divergent Na,K-ATPases (See S1 Text for Spanish translation).
Affiliation(s)
- Shabnam Mohammadi
  - School of Biological Sciences, University of Nebraska, Lincoln, Nebraska, United States of America
  - Molecular Evolutionary Biology, Institut für Zell- und Systembiologie der Tiere, Universität Hamburg, Hamburg, Germany
- Santiago Herrera-Álvarez
  - Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
  - Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
- Lu Yang
  - Department of Ecology and Evolution, Princeton University, Princeton, New Jersey, United States of America
- Karen Zhang
  - Department of Ecology and Evolution, Princeton University, Princeton, New Jersey, United States of America
- Jay F Storz
  - School of Biological Sciences, University of Nebraska, Lincoln, Nebraska, United States of America
- Susanne Dobler
  - Molecular Evolutionary Biology, Institut für Zell- und Systembiologie der Tiere, Universität Hamburg, Hamburg, Germany
- Andrew J Crawford
  - Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
- Peter Andolfatto
  - Department of Biological Sciences, Columbia University, New York City, New York, United States of America
4
Mohammadi S, Herrera-Álvarez S, Yang L, Rodríguez-Ordoñez MDP, Zhang K, Storz JF, Dobler S, Crawford AJ, Andolfatto P. Constraints on the evolution of toxin-resistant Na,K-ATPases have limited dependence on sequence divergence. PLoS Genet 2022; 18:e1010323. [PMID: 35972957 PMCID: PMC9462791 DOI: 10.1371/journal.pgen.1010323] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/31/2022] [Revised: 09/09/2022] [Accepted: 07/04/2022] [Indexed: 11/19/2022] Open
Abstract
A growing body of theoretical and experimental evidence suggests that intramolecular epistasis is a major determinant of rates and patterns of protein evolution and imposes a substantial constraint on the evolution of novel protein functions. Here, we examine the role of intramolecular epistasis in the recurrent evolution of resistance to cardiotonic steroids (CTS) across tetrapods, which occurs via specific amino acid substitutions to the α-subunit family of Na,K-ATPases (ATP1A). After identifying a series of recurrent substitutions at two key sites of ATP1A that are predicted to confer CTS resistance in diverse tetrapods, we then performed protein engineering experiments to test the functional consequences of introducing these substitutions onto divergent species backgrounds. In line with previous results, we find that substitutions at these sites can have substantial background-dependent effects on CTS resistance. Globally, however, these substitutions also have pleiotropic effects that are consistent with additive rather than background-dependent effects. Moreover, the magnitude of a substitution's effect on activity does not depend on the overall extent of ATP1A sequence divergence between species. Our results suggest that epistatic constraints on the evolution of CTS-resistant forms of Na,K-ATPase likely depend on a small number of sites, with little dependence on overall levels of protein divergence. We propose that dependence on a limited number of sites may account for the observation of convergent CTS resistance substitutions observed among taxa with highly divergent Na,K-ATPases (See S1 Text for Spanish translation).
Affiliation(s)
- Shabnam Mohammadi
  - School of Biological Sciences, University of Nebraska, Lincoln, Nebraska, United States of America
  - Molecular Evolutionary Biology, Institut für Zell- und Systembiologie der Tiere, Universität Hamburg, Hamburg, Germany
- Santiago Herrera-Álvarez
  - Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
  - Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
- Lu Yang
  - Department of Ecology and Evolution, Princeton University, Princeton, New Jersey, United States of America
- Karen Zhang
  - Department of Ecology and Evolution, Princeton University, Princeton, New Jersey, United States of America
- Jay F. Storz
  - School of Biological Sciences, University of Nebraska, Lincoln, Nebraska, United States of America
- Susanne Dobler
  - Molecular Evolutionary Biology, Institut für Zell- und Systembiologie der Tiere, Universität Hamburg, Hamburg, Germany
- Andrew J. Crawford
  - Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
- Peter Andolfatto
  - Department of Biological Sciences, Columbia University, New York City, New York, United States of America
5
Chen DS, Clark AG, Wolfner MF. Octopaminergic/tyraminergic Tdc2 neurons regulate biased sperm usage in female Drosophila melanogaster. Genetics 2022; 221:6613932. [PMID: 35736370 DOI: 10.1093/genetics/iyac097] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/25/2021] [Accepted: 06/04/2022] [Indexed: 11/14/2022] Open
Abstract
In polyandrous internally fertilizing species, a multiply-mated female can use stored sperm from different males in a biased manner to fertilize her eggs. The female's ability to assess sperm quality and compatibility is essential for her reproductive success, and represents an important aspect of postcopulatory sexual selection. In Drosophila melanogaster, previous studies demonstrated that the female nervous system plays an active role in influencing progeny paternity proportion, and suggested a role for octopaminergic/tyraminergic Tdc2 neurons in this process. Here, we report that inhibiting Tdc2 neuronal activity causes females to produce a higher-than-normal proportion of first-male progeny. This difference is not due to differences in sperm storage or release, but instead is attributable to the suppression of second-male sperm usage bias that normally occurs in control females. We further show that a subset of Tdc2 neurons innervating the female reproductive tract is largely responsible for the progeny proportion phenotype that is observed when Tdc2 neurons are inhibited globally. On the contrary, overactivation of Tdc2 neurons does not further affect sperm storage and release or progeny proportion. These results suggest that octopaminergic/tyraminergic signaling allows a multiply-mated female to bias sperm usage, and identify a new role for the female nervous system in postcopulatory sexual selection.
Affiliation(s)
- Dawn S Chen
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Andrew G Clark
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Mariana F Wolfner
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
6
Stolyarova AV, Neretina TV, Zvyagina EA, Fedotova AV, Kondrashov A, Bazykin GA. Complex fitness landscape shapes variation in a hyperpolymorphic species. eLife 2022; 11:76073. [PMID: 35532122 PMCID: PMC9187340 DOI: 10.7554/elife.76073] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/03/2021] [Accepted: 05/09/2022] [Indexed: 11/13/2022] Open
Abstract
It is natural to assume that patterns of genetic variation in hyperpolymorphic species can reveal large-scale properties of the fitness landscape that are hard to detect by studying species with ordinary levels of genetic variation. Here, we study such patterns in a fungus Schizophyllum commune, the most polymorphic species known. Throughout the genome, short-range linkage disequilibrium (LD) caused by attraction of minor alleles is higher between pairs of nonsynonymous than of synonymous variants. This effect is especially pronounced for pairs of sites that are located within the same gene, especially if a large fraction of the gene is covered by haploblocks, genome segments where the gene pool consists of two highly divergent haplotypes, which is a signature of balancing selection. Haploblocks are usually shorter than 1000 nucleotides, and collectively cover about 10% of the S. commune genome. LD tends to be substantially higher for pairs of nonsynonymous variants encoding amino acids that interact within the protein. There is a substantial correlation between LDs at the same pairs of nonsynonymous mutations in the USA and the Russian populations. These patterns indicate that selection in S. commune involves positive epistasis due to compensatory interactions between nonsynonymous alleles. When less polymorphic species are studied, analogous patterns can be detected only through interspecific comparisons.

Changes to DNA known as mutations may alter how the proteins and other components of a cell work, and thus play an important role in allowing living things to evolve new traits and abilities over many generations. Whether a mutation is beneficial or harmful may differ depending on the genetic background of the individual – that is, depending on other mutations present in other positions within the same gene – due to a phenomenon called epistasis.
Epistasis is known to affect how various species accumulate differences in their DNA compared to each other over time. For example, a mutation that is rare in humans and known to cause disease may be widespread in other primates because its negative effect is canceled out by another mutation that is standard for these species but absent in humans. However, it remains unclear whether epistasis plays a significant part in shaping genetic differences between individuals of the same species. A type of fungus known as Schizophyllum commune lives on rotting wood and is found across the world. It is one of the most genetically diverse species currently known, so there is a higher chance of pairs of compensatory mutations occurring and persisting for a long time in S. commune than in most other species, providing a unique opportunity to study epistasis. Here, Stolyarova et al. studied two distinct populations of S. commune, one from the USA and one from Russia. The team found that – unlike in humans, flies and other less genetically diverse species – epistasis maintains combinations of mutations in S. commune that individually would be harmful to the fungus but together compensate for each other. For example, pairs of mutations affecting specific molecules known as amino acids – the building blocks of proteins – that physically interact with each other tended to be found together in the same individuals. One potential downside of having pairs of compensatory mutations in the genome is that when the organism reproduces, the process of making sex cells may split up these pairs so that harmful mutations are inherited without their partner mutations. Thus, epistasis may have helped shape the way S. commune and other genetically diverse species have evolved.
Affiliation(s)
- Tatiana V Neretina
  - Biological Faculty, Lomonosov Moscow State University, Moscow, Russian Federation
- Elena A Zvyagina
  - Biological Faculty, Lomonosov Moscow State University, Moscow, Russian Federation
- Anna V Fedotova
  - Skolkovo Institute of Science and Technology, Moscow, Russian Federation
- Alexey Kondrashov
  - Department of Ecology and Evolutionary Biology, University of Michigan-Ann Arbor, Ann Arbor, United States
- Georgii A Bazykin
  - Institute for Information Transmission Problems (Kharkevich Institute), Russian Academy of Sciences, Moscow, Russian Federation
7
Lee YCG. Synergistic epistasis of the deleterious effects of transposable elements. Genetics 2022; 220:iyab211. [PMID: 34888644 PMCID: PMC9097265 DOI: 10.1093/genetics/iyab211] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/14/2021] [Accepted: 11/10/2021] [Indexed: 11/12/2022] Open
Abstract
The replicative nature and generally deleterious effects of transposable elements (TEs) raise an outstanding question about how TE copy number is stably contained in host populations. Classic theoretical analyses predict that, when the decline in fitness due to each additional TE insertion is greater than linear, or when there is synergistic epistasis, selection against TEs can result in a stable equilibrium of TE copy number. While several mechanisms are predicted to yield synergistic deleterious effects of TEs, we lack empirical investigations of the presence of such epistatic interactions. Purifying selection with synergistic epistasis generates repulsion linkage between deleterious alleles. We investigated this population genetic signal in the likely ancestral Drosophila melanogaster population and found evidence supporting the presence of synergistic epistasis among TE insertions, especially TEs expected to exert large fitness impacts. Even though synergistic epistasis of TEs has been predicted to arise through ectopic recombination and TE-mediated epigenetic silencing mechanisms, we only found mixed support for the associated predictions. We observed signals of synergistic epistasis for a large number of TE families, which is consistent with the expectation that such epistatic interaction mainly happens among copies of the same family. Curiously, significant repulsion linkage was also found among TE insertions from different families, suggesting the possibility that synergism of TEs' deleterious fitness effects could arise above the family level and through mechanisms similar to those of simple mutations. Our findings set the stage for investigating the prevalence and importance of epistatic interactions in the evolutionary dynamics of TEs.
Affiliation(s)
- Yuh Chwen G Lee
  - Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA 92697, USA
8
Wang Y, Lei R, Nourmohammad A, Wu NC. Antigenic evolution of human influenza H3N2 neuraminidase is constrained by charge balancing. eLife 2021; 10:e72516. [PMID: 34878407 PMCID: PMC8683081 DOI: 10.7554/elife.72516] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 07/27/2021] [Accepted: 12/07/2021] [Indexed: 11/13/2022] Open
Abstract
As one of the main influenza antigens, neuraminidase (NA) in H3N2 virus has evolved extensively for more than 50 years due to continuous immune pressure. While NA has recently emerged as an effective vaccine target, biophysical constraints on the antigenic evolution of NA remain largely elusive. Here, we apply combinatorial mutagenesis and next-generation sequencing to characterize the local fitness landscape in an antigenic region of NA in six different human H3N2 strains that were isolated around 10 years apart. The local fitness landscape correlates well among strains and the pairwise epistasis is highly conserved. Our analysis further demonstrates that local net charge governs the pairwise epistasis in this antigenic region. In addition, we show that residue coevolution in this antigenic region is correlated with the pairwise epistasis between charge states. Overall, this study demonstrates the importance of quantifying epistasis and the underlying biophysical constraint for building a model of influenza evolution.
Affiliation(s)
- Yiquan Wang
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, United States
- Ruipeng Lei
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, United States
- Armita Nourmohammad
  - Department of Physics, University of Washington, Seattle, United States
  - Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
  - Fred Hutchinson Cancer Research Center, Seattle, United States
- Nicholas C Wu
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, United States
  - Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, United States
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, United States
  - Carle Illinois College of Medicine, University of Illinois at Urbana-Champaign, Urbana, United States
9
Neverov AD, Popova AV, Fedonin GG, Cheremukhin EA, Klink GV, Bazykin GA. Episodic evolution of coadapted sets of amino acid sites in mitochondrial proteins. PLoS Genet 2021; 17:e1008711. [PMID: 33493156 PMCID: PMC7861529 DOI: 10.1371/journal.pgen.1008711] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 03/03/2020] [Revised: 02/04/2021] [Accepted: 12/07/2020] [Indexed: 11/19/2022] Open
Abstract
The rate of evolution differs between protein sites and changes with time. However, the link between these two phenomena remains poorly understood. Here, we design a phylogenetic approach for distinguishing pairs of amino acid sites that evolve concordantly, i.e., such that substitutions at one site trigger subsequent substitutions at the other; and also pairs of sites that evolve discordantly, so that substitutions at one site impede subsequent substitutions at the other. We distinguish groups of amino acid sites that undergo coordinated evolution and evolve discordantly from other such groups. In mitochondrion-encoded proteins of metazoans and fungi, we show that concordantly evolving sites are clustered in protein structures. By analysing the phylogenetic patterns of substitutions at concordantly and discordantly evolving site pairs, we find that concordant evolution has two distinct causes: epistatic interactions between amino acid substitutions and episodes of selection independently affecting substitutions at different sites. The rate of substitutions at concordantly evolving groups of protein sites changes in the course of evolution, indicating episodes of selection limited to some of the lineages. The phylogenetic positions of these changes are consistent between proteins, suggesting common selective forces underlying them.

The mode and rate of evolution of a protein site depends on the effect of its mutations on protein fitness. The fitness effect of a mutation itself can change in the course of evolution for at least two reasons. First, it can be modulated by substitutions occurring at other sites, a phenomenon called epistasis. Second, changes in selection can be non-epistatic, affecting sites independently of one another. Here, we analyse substitutions accumulated by the evolving lineages of the five proteins encoded by the mitochondrial genomes of thousands of species of metazoans and fungi. We show that substitutions at different amino acid sites occur in a coordinated fashion, and this coordination is caused both by epistasis and by episodes of selection affecting groups of sites. We partition each protein into several groups of concordantly evolving sites such that evolution of sites from different groups is discordant, and show that the proteins encoded by the mitochondrial genome consist of coevolving structural blocks. Some of these blocks have a clear functional specialization, e.g. are associated with interfaces between proteins composing respiratory complexes. Together, our results reveal a previously unrecognized complexity in the causes of variation in evolutionary rates between protein sites.
Affiliation(s)
- Alexey D. Neverov
  - Department of Molecular Diagnostics, Central Research Institute for Epidemiology, Moscow, Russia
- Anfisa V. Popova
  - Department of Molecular Diagnostics, Central Research Institute for Epidemiology, Moscow, Russia
- Gennady G. Fedonin
  - Department of Molecular Diagnostics, Central Research Institute for Epidemiology, Moscow, Russia
  - Institute for Information Transmission Problems (Kharkevich Institute), Russian Academy of Sciences, Moscow, Russia
  - Moscow Institute of Physics and Technology, Dolgoprudny, Moscow region, Russia
- Galya V. Klink
  - Institute for Information Transmission Problems (Kharkevich Institute), Russian Academy of Sciences, Moscow, Russia
- Georgii A. Bazykin
  - Institute for Information Transmission Problems (Kharkevich Institute), Russian Academy of Sciences, Moscow, Russia
  - Skolkovo Institute of Science and Technology, Skolkovo, Russia
10
Dench J, Hinz A, Aris‐Brosou S, Kassen R. Identifying the drivers of computationally detected correlated evolution among sites under antibiotic selection. Evol Appl 2020; 13:781-793. [PMID: 32211067 PMCID: PMC7086105 DOI: 10.1111/eva.12900] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/17/2019] [Revised: 10/02/2019] [Accepted: 11/14/2019] [Indexed: 11/29/2022] Open
Abstract
The ultimate causes of correlated evolution among sites in a genome remain difficult to tease apart. To address this problem directly, we performed a high-throughput search for correlated evolution among sites associated with resistance to a fluoroquinolone antibiotic using whole-genome data from clinical strains of Pseudomonas aeruginosa, before validating our computational predictions experimentally. We show that for at least two sites, this correlation is underlain by epistasis. Our analysis also revealed eight additional pairs of synonymous substitutions displaying correlated evolution underlain by physical linkage, rather than selection associated with antibiotic resistance. Our results provide direct evidence that both epistasis and physical linkage among sites can drive the correlated evolution identified by high-throughput computational tools. In other words, the observation of correlated evolution is not by itself sufficient evidence to guarantee that the sites in question are epistatic; such a claim requires additional evidence, ideally coming from direct estimates of epistasis, based on experimental evidence.
Affiliation(s)
- Jonathan Dench
  - Department of Biology, University of Ottawa, Ottawa, Ontario, Canada
- Aaron Hinz
  - Department of Biology, University of Ottawa, Ottawa, Ontario, Canada
- Stéphane Aris‐Brosou
  - Department of Biology, University of Ottawa, Ottawa, Ontario, Canada
  - Department of Mathematics and Statistics, University of Ottawa, Ottawa, Ontario, Canada
- Rees Kassen
  - Department of Biology, University of Ottawa, Ottawa, Ontario, Canada
11
Arnold B, Sohail M, Wadsworth C, Corander J, Hanage WP, Sunyaev S, Grad YH. Fine-Scale Haplotype Structure Reveals Strong Signatures of Positive Selection in a Recombining Bacterial Pathogen. Mol Biol Evol 2020; 37:417-428. [PMID: 31589312 PMCID: PMC6993868 DOI: 10.1093/molbev/msz225] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 12/13/2022] Open
Abstract
Identifying genetic variation in bacteria that has been shaped by ecological differences remains an important challenge. For recombining bacteria, the sign and strength of linkage provide a unique lens into ongoing selection. We show that derived alleles <300 bp apart in Neisseria gonorrhoeae exhibit more coupling linkage than repulsion linkage, a pattern that cannot be explained by limited recombination or neutrality as these couplings are significantly stronger for nonsynonymous alleles than synonymous alleles. This general pattern is driven by a small fraction of highly diverse genes, many of which exhibit evidence of interspecies horizontal gene transfer and an excess of intermediate frequency alleles. Extensive simulations show that two distinct forms of positive selection can create these patterns of genetic variation: directional selection on horizontally transferred alleles or balancing selection that maintains distinct haplotypes in the presence of recombination. Our results establish a framework for identifying patterns of selection in fine-scale haplotype structure that indicate specific ecological processes in species that recombine with distantly related lineages or possess coexisting adaptive haplotypes.
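As a rough illustration of the coupling-versus-repulsion distinction drawn in this abstract, the sketch below computes the textbook linkage disequilibrium coefficient D = p_AB − p_A·p_B for derived alleles at two biallelic sites and reports its sign. This is an assumption made for illustration only: the abstract does not state which statistic the authors used, and the function name `linkage_sign` and the 0/1 haplotype encoding are hypothetical.

```python
def linkage_sign(haplotypes):
    """Return (D, label) for a pair of biallelic sites.

    haplotypes: iterable of (a, b) tuples, one per sampled chromosome,
    where 1 marks the derived allele and 0 the ancestral allele.
    D > 0 means derived alleles co-occur more often than expected
    (coupling); D < 0 means they co-occur less often (repulsion).
    """
    haps = list(haplotypes)
    n = len(haps)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(a for a, _ in haps) / n          # derived frequency at site A
    p_b = sum(b for _, b in haps) / n          # derived frequency at site B
    p_ab = sum(1 for a, b in haps if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    if d > 0:
        label = "coupling"
    elif d < 0:
        label = "repulsion"
    else:
        label = "none"
    return d, label


# Derived alleles always found together: coupling linkage.
print(linkage_sign([(1, 1), (1, 1), (0, 0), (0, 0)]))
# Derived alleles never found together: repulsion linkage.
print(linkage_sign([(1, 0), (0, 1), (1, 0), (0, 1)]))
```

A real analysis would also need to polarize alleles as ancestral or derived and restrict pairs by physical distance (the abstract considers pairs less than 300 bp apart); neither step is shown here.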
Affiliation(s)
- Brian Arnold
  - Division of Informatics, Faculty of Arts and Sciences, Harvard University, Cambridge, MA
  - Center for Communicable Disease Dynamics, Harvard T. H. Chan School of Public Health, Boston, MA
- Mashaal Sohail
  - Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA
- Crista Wadsworth
  - Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA
- Jukka Corander
  - Department of Biostatistics, University of Oslo, Oslo, Norway
  - Department of Computer Science, Helsinki Institute for Information Technology HIIT, University of Helsinki, Helsinki, Finland
- William P Hanage
  - Center for Communicable Disease Dynamics, Harvard T. H. Chan School of Public Health, Boston, MA
- Shamil Sunyaev
  - Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA
- Yonatan H Grad
  - Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA
  - Division of Infectious Diseases, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
12
Belinky F, Sela I, Rogozin IB, Koonin EV. Crossing fitness valleys via double substitutions within codons. BMC Biol 2019; 17:105. [PMID: 31842858 PMCID: PMC6916188 DOI: 10.1186/s12915-019-0727-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 08/26/2019] [Accepted: 11/20/2019] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Single nucleotide substitutions in protein-coding genes can be divided into synonymous (S), with little fitness effect, and non-synonymous (N) ones that alter amino acids and thus generally have a greater effect. Most of the N substitutions are affected by purifying selection that eliminates them from evolving populations. However, additional mutations of nearby bases potentially could alleviate the deleterious effect of single substitutions, making them subject to positive selection. To elucidate the effects of selection on double substitutions in all codons, it is critical to differentiate selection from mutational biases. RESULTS We addressed the evolutionary regimes of within-codon double substitutions in 37 groups of closely related prokaryotic genomes from diverse phyla by comparing the fractions of double substitutions within codons to those of the equivalent double S substitutions in adjacent codons. Under the assumption that substitutions occur one at a time, all within-codon double substitutions can be represented as "ancestral-intermediate-final" sequences (where "intermediate" refers to the first single substitution and "final" refers to the second substitution) and can be partitioned into four classes: (1) SS, S intermediate-S final; (2) SN, S intermediate-N final; (3) NS, N intermediate-S final; and (4) NN, N intermediate-N final. We found that the selective pressure on the second substitution markedly differs among these classes of double substitutions. Analogous to single S (synonymous) substitutions, SS double substitutions evolve neutrally, whereas analogous to single N (non-synonymous) substitutions, SN double substitutions are subject to purifying selection. In contrast, NS show positive selection on the second step because the original amino acid is recovered. 
The NN double substitutions are heterogeneous and can be subject to either purifying or positive selection, or evolve neutrally, depending on the amino acid similarity between the final or intermediate and the ancestral states. CONCLUSIONS The results of the present, comprehensive analysis of the evolutionary landscape of within-codon double substitutions reaffirm the largely conservative regime of protein evolution. However, the second step of a double substitution can be subject to positive selection when the first step is deleterious. Such positive selection can result in frequent crossing of valleys on the fitness landscape.
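The four-way partition described in this abstract (SS, SN, NS, NN) can be sketched in a few lines. The sketch assumes the standard genetic code and assumes that both the intermediate and the final codon are compared with the ancestral amino acid, which is how the abstract's remark that in NS "the original amino acid is recovered" reads. The function names `classify_path` and `paths` are hypothetical, and the treatment of stop codons (counted as non-synonymous) is a simplification rather than the authors' procedure.

```python
# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_path(ancestral, intermediate, final):
    """Label one ancestral -> intermediate -> final path as SS, SN, NS or NN.

    Each letter says whether that state encodes the same amino acid as the
    ancestral codon (S) or a different one (N). Stop codons count as N here.
    """
    aa0 = CODON_TABLE[ancestral]
    first = "S" if CODON_TABLE[intermediate] == aa0 else "N"
    second = "S" if CODON_TABLE[final] == aa0 else "N"
    return first + second


def paths(ancestral, final):
    """List both single-step orders connecting two codons that differ at two positions."""
    diff = [i for i in range(3) if ancestral[i] != final[i]]
    if len(diff) != 2:
        raise ValueError("codons must differ at exactly two positions")
    result = []
    for i in diff:
        # Apply only the change at position i first.
        intermediate = ancestral[:i] + final[i] + ancestral[i + 1:]
        result.append((intermediate, classify_path(ancestral, intermediate, final)))
    return result


# Serine AGT -> serine TCT must pass through Cys (TGT) or Thr (ACT): both paths are NS.
print(paths("AGT", "TCT"))
```

The serine example shows the valley-crossing case the abstract highlights: either single step changes the amino acid, and the second step restores it.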
Affiliation(s)
- Frida Belinky
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Itamar Sela
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Igor B Rogozin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Eugene V Koonin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
13
Ben Slimen H, Schaschl H, Knauer F, Suchentrunk F. Selection on the mitochondrial ATP synthase 6 and the NADH dehydrogenase 2 genes in hares (Lepus capensis L., 1758) from a steep ecological gradient in North Africa. BMC Evol Biol 2017; 17:46. [PMID: 28173765 PMCID: PMC5297179 DOI: 10.1186/s12862-017-0896-0] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 08/11/2016] [Accepted: 01/26/2017] [Indexed: 11/30/2022] Open
Abstract
Background Recent studies of selection on mitochondrial (mt) OXPHOS genes suggest adaptation due mainly to environmental variation. In this context, Tunisian hares that display several external phenotypes with a phylogenetically rather homogeneous gene pool and shallow population structure provide a good precondition to detect positive selection on mt genes related to environmental/climatic variation, specifically ambient temperature and precipitation. Results We used codon-based methods along with population genetic data to test for positive selection on ATP synthase 6 (ATP6) and NADH dehydrogenase 2 (ND2) of cape hares (Lepus capensis) collected along a steep ecological gradient in Tunisia. We found significantly higher differentiation at the ATP6 locus across Tunisia, with sub-humid Mediterranean, semi-arid, and arid Sahara climate than for fourteen unlinked supposedly neutrally evolving nuclear microsatellites and mt control region sequences. This suggested positive selection on ATP6 sequences, which was confirmed by several codon-based tests for one sequence site that together with a second site translated into four different amino acids. Positive selection on ND2 sequences was also confirmed by several codon-based tests. The corresponding frequencies of the two most prevalent variants at each locus varied significantly across climate regions, and our logistic general linear models of occurrence of those proteins indicated significant effects of mean annual temperature for ATP6 and mean minimum temperature of the coldest month of the year for ND2, independent of geographical location, annual precipitation, and the respective co-occurring protein at the second locus. Moreover, presence of the ancestral ATP6 protein, as inferred from phylogenetic networks, was positively affected by the simultaneous presence of the derived ND2 protein and vice versa, independent of temperature, precipitation, or geographic location.
Finally, we obtained a significant coevolution signal for the ancestral ATP6 and derived ND2 sequences and vice versa. Conclusions Positive selection was strongly suggested by the population genetic approach and the codon-based tests in both mtDNA genes. Moreover, the two most prevalent proteins at the ATP6 locus were distributed at significantly varying frequencies across the study area with a significant effect of mean annual temperature on the occurrence of the ATP6 proteins independent of geographical coordinates and the co-occurring ND2 protein variant. For ND2, occurrence of the two most frequent protein variants was significantly influenced by the mean minimum temperature of the coldest month, independent of the co-occurring ATP6 protein variant and geographical coordinates. This strongly suggests direct involvement of ambient temperature in the adaptation of the studied mtOXPHOS genes.
Affiliation(s)
- Hichem Ben Slimen
  - UR Génomique des Insectes Ravageurs des Cultures d'Intérêt Agronomique (GIRC), Université de Tunis El-Manar, 2092, El Manar, Tunisia
  - Institut Supérieur de Biotechnologie de Béja, Beja, 9000, Tunisia
- Helmut Schaschl
  - Department of Anthropology, University of Vienna, Althanstrasse 14, 1090, Vienna, Austria
- Felix Knauer
  - Research Institute of Wildlife Ecology, University of Veterinary Medicine Vienna, Savoyenstrasse 1, 1160, Vienna, Austria
- Franz Suchentrunk
  - Research Institute of Wildlife Ecology, University of Veterinary Medicine Vienna, Savoyenstrasse 1, 1160, Vienna, Austria
14
Harpak A, Bhaskar A, Pritchard JK. Mutation Rate Variation is a Primary Determinant of the Distribution of Allele Frequencies in Humans. PLoS Genet 2016; 12:e1006489. [PMID: 27977673 PMCID: PMC5157949 DOI: 10.1371/journal.pgen.1006489] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 04/12/2016] [Accepted: 11/16/2016] [Indexed: 01/06/2023] Open
Abstract
The site frequency spectrum (SFS) has long been used to study demographic history and natural selection. Here, we extend this summary by examining the SFS conditional on the alleles found at the same site in other species. We refer to this extension as the "phylogenetically-conditioned SFS" or cSFS. Using recent large-sample data from the Exome Aggregation Consortium (ExAC), combined with primate genome sequences, we find that human variants that occurred independently in closely related primate lineages are at higher frequencies in humans than variants with parallel substitutions in more distant primates. We show that this effect is largely due to sites with elevated mutation rates causing significant departures from the widely-used infinite sites mutation model. Our analysis also suggests substantial variation in mutation rates even among mutations involving the same nucleotide changes. In summary, we show that variable mutation rates are key determinants of the SFS in humans.
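For readers unfamiliar with the summary statistic named in this abstract, the sketch below tabulates a site frequency spectrum from per-site derived allele counts and then splits it by a per-site label, which is the general idea of conditioning the spectrum on what is observed at the same site in other species. The function names and the labels `"close"` and `"distant"` are hypothetical placeholders; the authors' actual conditioning scheme is not specified in the abstract.

```python
from collections import Counter


def site_frequency_spectrum(derived_counts, n):
    """Return the unfolded SFS for a sample of n chromosomes.

    derived_counts: iterable with one derived-allele count per segregating site.
    The result is a list whose entry k-1 is the number of sites at which
    the derived allele appears exactly k times, for k = 1 .. n-1.
    """
    tally = Counter(derived_counts)
    return [tally.get(k, 0) for k in range(1, n)]


def conditioned_sfs(sites, n):
    """Compute one SFS per label.

    sites: iterable of (derived_count, label) pairs, where the label is any
    hashable category assigned to the site (hypothetical here).
    """
    groups = {}
    for count, label in sites:
        groups.setdefault(label, []).append(count)
    return {label: site_frequency_spectrum(counts, n)
            for label, counts in groups.items()}


# Five sites in a sample of five chromosomes.
print(site_frequency_spectrum([1, 1, 2, 3, 1], 5))
# The same idea, split by a made-up per-site category.
print(conditioned_sfs([(1, "close"), (3, "close"), (1, "distant"), (1, "distant")], 4))
```

Comparing the per-label spectra is then a matter of checking whether one category is shifted toward higher derived allele counts than another.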
Affiliation(s)
- Arbel Harpak
  - Department of Biology, Stanford University, Stanford, California, United States of America
- Anand Bhaskar
  - Department of Genetics, Stanford University, Stanford, California, United States of America
  - Howard Hughes Medical Institute, Stanford University, Stanford, California, United States of America
- Jonathan K. Pritchard
  - Department of Biology, Stanford University, Stanford, California, United States of America
  - Department of Genetics, Stanford University, Stanford, California, United States of America
  - Howard Hughes Medical Institute, Stanford University, Stanford, California, United States of America
15
Bazykin GA. Changing preferences: deformation of single position amino acid fitness landscapes and evolution of proteins. Biol Lett 2016; 11:rsbl.2015.0315. [PMID: 26445980 DOI: 10.1098/rsbl.2015.0315] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Indexed: 01/20/2023] Open
Abstract
The fitness landscape (the function that relates genotypes to fitness) and its role in directing evolution are a central object of evolutionary biology. However, its huge dimensionality precludes understanding of even the basic aspects of its shape. One way to approach it is to ask a simpler question: what are the properties of a function that assigns fitness to each possible variant at just one particular site (a single position fitness landscape), and how does it change in the course of evolution? Analyses of genomic data from multiple species and multiple individuals within a species have proved beyond reasonable doubt that fitness functions of positions throughout the genome do themselves change with time, thus shaping protein evolution. Here, I will briefly review the literature that addresses these dynamics, focusing on recent genome-scale analyses of fitness functions of amino acid sites, i.e. vectors of fitnesses of 20 individual amino acid variants at a given position of a protein. The set of amino acids that confer high fitness at a particular position changes with time, and the rate of this change is comparable with the rate at which a position evolves, implying that this process plays a major role in evolutionary dynamics. However, the causes of these changes remain largely unclear.
Affiliation(s)
- Georgii A Bazykin
  - Institute for Information Transmission Problems (Kharkevich Institute) of the Russian Academy of Sciences, Moscow 127051, Russia
  - Faculty of Bioengineering and Bioinformatics and Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow 119234, Russia
  - Pirogov Russian National Research Medical University, Moscow 117997, Russia
16
Elyashiv E, Sattath S, Hu TT, Strutsovsky A, McVicker G, Andolfatto P, Coop G, Sella G. A Genomic Map of the Effects of Linked Selection in Drosophila. PLoS Genet 2016; 12:e1006130. [PMID: 27536991 PMCID: PMC4990265 DOI: 10.1371/journal.pgen.1006130] [Citation(s) in RCA: 93] [Impact Index Per Article: 10.3] [Received: 01/05/2015] [Accepted: 05/26/2016] [Indexed: 01/23/2023] Open
Abstract
Natural selection at one site shapes patterns of genetic variation at linked sites. Quantifying the effects of "linked selection" on levels of genetic diversity is key to making reliable inference about demography, building a null model in scans for targets of adaptation, and learning about the dynamics of natural selection. Here, we introduce the first method that jointly infers parameters of distinct modes of linked selection, notably background selection and selective sweeps, from genome-wide diversity data, functional annotations and genetic maps. The central idea is to calculate the probability that a neutral site is polymorphic given local annotations, substitution patterns, and recombination rates. Information is then combined across sites and samples using composite likelihood in order to estimate genome-wide parameters of distinct modes of selection. In addition to parameter estimation, this approach yields a map of the expected neutral diversity levels along the genome. To illustrate the utility of our approach, we apply it to genome-wide resequencing data from 125 lines in Drosophila melanogaster and reliably predict diversity levels at the 1Mb scale. Our results corroborate estimates of a high fraction of beneficial substitutions in proteins and untranslated regions (UTR). They allow us to distinguish between the contribution of sweeps and other modes of selection around amino acid substitutions and to uncover evidence for pervasive sweeps in untranslated regions (UTRs). Our inference further suggests a substantial effect of other modes of linked selection and of adaptation in particular. More generally, we demonstrate that linked selection has had a larger effect in reducing diversity levels and increasing their variance in D. melanogaster than previously appreciated.
Affiliation(s)
- Eyal Elyashiv
  - Department of Ecology, Evolution, and Behavior, Hebrew University of Jerusalem, Jerusalem, Israel
  - Department of Biological Sciences, Columbia University, New York, New York, United States of America
- Shmuel Sattath
  - Department of Ecology, Evolution, and Behavior, Hebrew University of Jerusalem, Jerusalem, Israel
- Tina T. Hu
  - Department of Ecology and Evolutionary Biology and the Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
- Alon Strutsovsky
  - Department of Ecology, Evolution, and Behavior, Hebrew University of Jerusalem, Jerusalem, Israel
- Graham McVicker
  - The Laboratory of Genetics and The Integrative Biology Laboratory, Salk Institute for Biological Studies, La Jolla, California, United States of America
- Peter Andolfatto
  - Department of Ecology and Evolutionary Biology and the Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
- Graham Coop
  - Department of Evolution and Ecology, University of California, Davis, Davis, California, United States of America
- Guy Sella
  - Department of Biological Sciences, Columbia University, New York, New York, United States of America
17
Zhu W, Cooper DN, Zhao Q, Wang Y, Liu R, Li Q, Férec C, Wang Y, Chen JM. Concurrent nucleotide substitution mutations in the human genome are characterized by a significantly decreased transition/transversion ratio. Hum Mutat 2015; 36:333-41. [PMID: 25546635 DOI: 10.1002/humu.22749] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 08/05/2014] [Accepted: 12/17/2014] [Indexed: 01/16/2023]
Abstract
There is accumulating evidence that the number of multiple-nucleotide substitutions (MNS) occurring in closely spaced sites in eukaryotic genomes is significantly higher than would be predicted from the random accumulation of independently generated single-nucleotide substitutions (SNS). Although this excess can in principle be accounted for by the concept of transient hypermutability, a general mutational signature of concurrent MNS mutations has not so far been evident. Employing a dataset (N = 449) of "concurrent" double MNS mutations causing human inherited disease, we have identified just such a mutational signature: concurrently generated double MNS mutations exhibit a more than twofold lower transition/transversion ratio (termed R_Ts/Tv) than independently generated de novo SNS mutations (<0.80 vs. 2.10; P = 2.69 × 10^-14). We replicated this novel finding through a similar analysis employing two double MNS variant datasets with differing abundances of concurrent events (150,521 variants with both substitutions on the same haplotypic lineage vs. 94,875 variants whose component substitutions were on different haplotypic lineages) plus 5,430,874 SNS variants, all being derived from the whole-genome sequencing of seven Chinese individuals. Evaluation of the newly observed mutational signature in diverse contexts provides solid support for the postulated role of translesion synthesis DNA polymerases in transient hypermutability.
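The transition/transversion ratio at the centre of this abstract is simple to compute from a list of observed substitutions. The sketch below uses the standard definitions (a transition swaps purine for purine or pyrimidine for pyrimidine; everything else is a transversion). The function names are hypothetical and the input is made-up example data, not values from the study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for A<->G or C<->T changes, False for any transversion."""
    if ref == alt:
        raise ValueError("ref and alt must differ")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(substitutions):
    """Ratio of transitions to transversions in an iterable of (ref, alt) pairs."""
    ts = tv = 0
    for ref, alt in substitutions:
        if is_transition(ref, alt):
            ts += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return ts / tv


# Two transitions and one transversion give a ratio of 2.0.
print(ts_tv_ratio([("A", "G"), ("C", "T"), ("A", "C")]))
```

Comparing this ratio between two sets of substitutions, such as concurrent double substitutions versus independent single substitutions, is the kind of contrast the abstract reports.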
Affiliation(s)
- Wenjuan Zhu
  - Beijing Genomics Institute (BGI)-Shenzhen, Shenzhen, China
18
Vedanayagam JP, Garrigan D. The effects of natural selection across molecular pathways in Drosophila melanogaster. BMC Evol Biol 2015; 15:203. [PMID: 26391223 PMCID: PMC4578789 DOI: 10.1186/s12862-015-0472-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 05/13/2015] [Accepted: 08/30/2015] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND Whole-genome RNA interference post-transcriptional silencing (RNAi) is a widely used method for studying the phenotypic effects of knocking down individual genes. In this study, we use a population genomic approach to characterize the rate of evolution for proteins affecting 26 RNAi knockdown phenotypes in Drosophila melanogaster. RESULTS We find that only two of the 26 RNAi knockdown phenotypes are enriched for rapidly evolving proteins: innate immunity and regulation of Hedgehog signaling. Among all genes associated with an RNAi knockdown phenotype, we note examples in which the adaptively evolving proteins play a well-defined role in a given molecular pathway. However, most adaptively evolving proteins are found to perform more general cellular functions. When RNAi phenotypes are grouped into categories according to cellular function, we find that genes involved in the greatest number of phenotypic categories are also significantly more likely to have a history of rapid protein evolution. CONCLUSIONS We show that genes that have been demonstrated to have a measurable effect on multiple molecular phenotypes show higher rates of protein evolution than genes having an effect on a single category of phenotype. Defining pleiotropy in this way yields very different results than previous studies that define pleiotropy by the number of physical interactions, which show highly connected proteins tend to evolve more slowly than lowly connected proteins. We suggest that a high degree of pleiotropy may increase the likelihood of compensatory substitution, consistent with modern theoretical work on adaptation.
Affiliation(s)
- Daniel Garrigan
  - Department of Biology, University of Rochester, Rochester, New York, 14627, USA
19
Seplyarskiy VB, Bazykin GA, Soldatov RA. Polymerase ζ Activity Is Linked to Replication Timing in Humans: Evidence from Mutational Signatures. Mol Biol Evol 2015; 32:3158-72. [PMID: 26376651 DOI: 10.1093/molbev/msv184] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 12/15/2022] Open
Abstract
Replication timing is an important determinant of germline mutation patterns, with a higher rate of point mutations in late replicating regions. Mechanisms underlying this association remain elusive. One of the suggested explanations is the activity of error-prone DNA polymerases in late-replicating regions. Polymerase zeta (pol ζ), an essential error-prone polymerase biased toward transversions, also has a tendency to produce dinucleotide mutations (DNMs), complex mutational events that simultaneously affect two adjacent nucleotides. Experimental studies have shown that pol ζ is strongly biased toward GC→AA/TT DNMs. Using primate divergence data, we show that the GC→AA/TT pol ζ mutational signature is the most frequent among DNMs, and its rate exceeds the mean rate of other DNM types by a factor of approximately 10. Unlike the overall rate of DNMs, the pol ζ signature drastically increases with the replication time in the human genome. Finally, the pol ζ signature is enriched in transcribed regions, and there is a strong prevalence of GC→TT over GC→AA DNMs on the nontemplate strand, indicating association with transcription. A recurrently occurring GC→TT DNM in the HRAS and SOD1 genes causes Costello syndrome and amyotrophic lateral sclerosis, respectively; we observe an approximately 1 kb long mutation hotspot enriched in transversions near these DNMs in both cases, suggesting a link between these diseases and pol ζ activity. This study uncovers the genomic preferences of pol ζ, shedding light on a novel cause of mutational heterogeneity along the genome.
Affiliation(s)
- Vladimir B Seplyarskiy
  - Institute of Information Transmission Problems (Kharkevich Institute) of the Russian Academy of Sciences, Moscow, Russia
  - Department of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
  - Pirogov Russian National Research Medical University, Moscow, Russia
- Georgii A Bazykin
  - Institute of Information Transmission Problems (Kharkevich Institute) of the Russian Academy of Sciences, Moscow, Russia
  - Department of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
  - Pirogov Russian National Research Medical University, Moscow, Russia
- Ruslan A Soldatov
  - Institute of Information Transmission Problems (Kharkevich Institute) of the Russian Academy of Sciences, Moscow, Russia
  - Department of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
20
Ivankov DN, Finkelstein AV, Kondrashov FA. A structural perspective of compensatory evolution. Curr Opin Struct Biol 2014; 26:104-12. [PMID: 24981969 PMCID: PMC4141909 DOI: 10.1016/j.sbi.2014.05.004] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Received: 02/03/2014] [Revised: 04/11/2014] [Accepted: 05/16/2014] [Indexed: 11/25/2022]
Abstract
The study of molecular evolution is important because it reveals how protein functions emerge and evolve. Recently, several types of studies indicated that substitutions in molecular evolution occur in a compensatory manner, whereby the occurrence of a substitution depends on the amino acid residues at other sites. However, a molecular or structural basis behind the compensation often remains obscure. Here, we review studies on the interface of structural biology and molecular evolution that revealed novel aspects of compensatory evolution. In many cases structural studies benefit from evolutionary data while structural data often add a functional dimension to the study of molecular evolution.
Affiliation(s)
- Dmitry N Ivankov
  - Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), 88 Dr. Aiguader, 08003 Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
  - Laboratory of Protein Physics, Institute of Protein Research of the Russian Academy of Sciences, 4 Institutskaya str., Pushchino, Moscow Region, 142290, Russia
- Alexei V Finkelstein
  - Laboratory of Protein Physics, Institute of Protein Research of the Russian Academy of Sciences, 4 Institutskaya str., Pushchino, Moscow Region, 142290, Russia
- Fyodor A Kondrashov
  - Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), 88 Dr. Aiguader, 08003 Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), 23 Pg. Lluís Companys, 08010 Barcelona, Spain
21
Background selection as baseline for nucleotide variation across the Drosophila genome. PLoS Genet 2014; 10:e1004434. [PMID: 24968283 PMCID: PMC4072542 DOI: 10.1371/journal.pgen.1004434] [Citation(s) in RCA: 88] [Impact Index Per Article: 8.0] [Received: 10/01/2013] [Accepted: 04/28/2014] [Indexed: 11/21/2022] Open
Abstract
The constant removal of deleterious mutations by natural selection causes a reduction in neutral diversity and efficacy of selection at genetically linked sites (a process called Background Selection, BGS). Population genetic studies, however, often ignore BGS effects when investigating demographic events or the presence of other types of selection. To obtain a more realistic evolutionary expectation that incorporates the unavoidable consequences of deleterious mutations, we generated high-resolution landscapes of variation across the Drosophila melanogaster genome under a BGS scenario independent of polymorphism data. We find that BGS plays a significant role in shaping levels of variation across the entire genome, including long introns and intergenic regions distant from annotated genes. We also find that a very large percentage of the observed variation in diversity across autosomes can be explained by BGS alone, up to 70% across individual chromosome arms at 100-kb scale, thus indicating that BGS predictions can be used as baseline to infer additional types of selection and demographic events. This approach allows detecting several outlier regions with signal of recent adaptive events and selective sweeps. The use of a BGS baseline, however, is particularly appropriate to investigate the presence of balancing selection and our study exposes numerous genomic regions with the predicted signature of higher polymorphism than expected when a BGS context is taken into account. Importantly, we show that these conclusions are robust to the mutation and selection parameters of the BGS model. Finally, analyses of protein evolution together with previous comparisons of genetic maps between Drosophila species, suggest temporally variable recombination landscapes and, thus, local BGS effects that may differ between extant and past phases. 
Because genome-wide BGS and temporal changes in linkage effects can skew approaches to estimate demographic and selective events, future analyses should incorporate BGS predictions and capture local recombination variation across genomes and along lineages.
Author Summary
The removal of deleterious mutations from natural populations has potential consequences on patterns of variation across genomes. Population genetic analyses, however, often assume that such effects are negligible across recombining regions of species like Drosophila. We use simple models of purifying selection and current knowledge of recombination rates and gene distribution across the genome to obtain a baseline of variation predicted by the constant input and removal of deleterious mutations. We find that purifying selection alone can explain a major fraction of the observed variance in nucleotide diversity across the genome. The use of a baseline of variation predicted by linkage to deleterious mutations as null expectation exposes genomic regions under other selective regimes, including more regions showing the signature of balancing selection than would be evident when using traditional approaches. Our study also indicates that most, if not all, nucleotides across the D. melanogaster genome are significantly influenced by the removal of deleterious mutations, even when located in the middle of highly recombining regions and distant from genes. Additionally, the study of rates of protein evolution confirms previous analyses suggesting that the recombination landscape across the genome has changed in the recent history of D. melanogaster. All these reported factors can skew current analyses designed to capture demographic events or estimate the strength and frequency of adaptive mutations, and illustrate the need for new and more realistic theoretical and modeling approaches to study naturally occurring genetic variation.
|
22
|
Lee YCG, Langley CH, Begun DJ. Differential strengths of positive selection revealed by hitchhiking effects at small physical scales in Drosophila melanogaster. Mol Biol Evol 2013; 31:804-16. [PMID: 24361994 DOI: 10.1093/molbev/mst270] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 01/22/2023]
Abstract
The long time scale of adaptive evolution makes it difficult to directly observe the spread of most beneficial mutations through natural populations. Therefore, inferring attributes of beneficial mutations by studying the genomic signals left by directional selection is an important component of population genetics research. One kind of signal is a trough in nearby neutral genetic variation due to selective fixation of initially rare alleles, a phenomenon known as "genetic hitchhiking." Accumulated evidence suggests that a considerable fraction of substitutions in the Drosophila genome results from positive selection, most of which are expected to have small selection coefficients and influence the population genetics of sites in the immediate vicinity. Using Drosophila melanogaster population genomic data, we found that the heterogeneity in synonymous polymorphism surrounding different categories of coding fixations is readily observable even within 25 bp of focal substitutions, which we interpret as the result of small-scale hitchhiking effects. The strength of natural selection on different sites appears to be quite heterogeneous. Particularly, neighboring fixations that changed amino acid polarities in a way that maintained the overall polarities of a protein were under stronger selection than other categories of fixations. Interestingly, we found that substitutions in slow-evolving genes are associated with stronger hitchhiking effects. This is consistent with the idea that adaptive evolution may involve few substitutions with large effects or many substitutions with small effects. Because our approach only weakly depends on the numbers of recent nonsynonymous substitutions, it can provide a complementary view to the adaptive evolution inferred by other divergence-based evolutionary genetic methods.
Affiliation(s)
- Yuh Chwen G Lee
- Department of Evolution and Ecology and Center for Population Biology, University of California, Davis
|
23
|
Melamed D, Young DL, Gamble CE, Miller CR, Fields S. Deep mutational scanning of an RRM domain of the Saccharomyces cerevisiae poly(A)-binding protein. RNA (New York, N.Y.) 2013; 19:1537-51. [PMID: 24064791 PMCID: PMC3851721 DOI: 10.1261/rna.040709.113] [Citation(s) in RCA: 147] [Impact Index Per Article: 12.3] [Indexed: 05/12/2023]
Abstract
The RNA recognition motif (RRM) is the most common RNA-binding domain in eukaryotes. Differences in RRM sequences dictate, in part, both RNA and protein-binding specificities and affinities. We used a deep mutational scanning approach to study the sequence-function relationship of the RRM2 domain of the Saccharomyces cerevisiae poly(A)-binding protein (Pab1). By scoring the activity of more than 100,000 unique Pab1 variants, including 1246 with single amino acid substitutions, we delineated the mutational constraints on each residue. Clustering of residues with similar mutational patterns reveals three major classes, composed principally of RNA-binding residues, of hydrophobic core residues, and of the remaining residues. The first class also includes a highly conserved residue not involved in RNA binding, G150, which can be mutated to destabilize Pab1. A comparison of the mutational sensitivity of yeast Pab1 residues to their evolutionary conservation reveals that most residues tolerate more substitutions than are present in the natural sequences, although other residues that tolerate fewer substitutions may point to specialized functions in yeast. An analysis of ∼40,000 double mutants indicates a preference for a short distance between two mutations that display an epistatic interaction. As examples of interactions, the mutations N139T, N139S, and I157L suppress other mutations that interfere with RNA binding and protein stability. Overall, this study demonstrates that living cells can be subjected to a single assay to analyze hundreds of thousands of protein variants in parallel.
Affiliation(s)
- Daniel Melamed
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
- David L. Young
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
- Caitlin E. Gamble
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
- Christina R. Miller
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
- Stanley Fields
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
- Department of Medicine, University of Washington, Seattle, Washington 98195, USA
|
24
|
Cutter AD, Jovelin R, Dey A. Molecular hyperdiversity and evolution in very large populations. Mol Ecol 2013; 22:2074-95. [PMID: 23506466 PMCID: PMC4065115 DOI: 10.1111/mec.12281] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Received: 10/26/2012] [Revised: 01/24/2013] [Accepted: 01/29/2013] [Indexed: 02/06/2023]
Abstract
The genomic density of sequence polymorphisms critically affects the sensitivity of inferences about ongoing sequence evolution, function and demographic history. Most animal and plant genomes have relatively low densities of polymorphisms, but some species are hyperdiverse with neutral nucleotide heterozygosity exceeding 5%. Eukaryotes with extremely large populations, mimicking bacterial and viral populations, present novel opportunities for studying molecular evolution in sexually reproducing taxa with complex development. In particular, hyperdiverse species can help answer controversial questions about the evolution of genome complexity, the limits of natural selection, modes of adaptation and subtleties of the mutation process. However, such systems have some inherent complications and here we identify topics in need of theoretical developments. Close relatives of the model organisms Caenorhabditis elegans and Drosophila melanogaster provide known examples of hyperdiverse eukaryotes, encouraging functional dissection of resulting molecular evolutionary patterns. We recommend how best to exploit hyperdiverse populations for analysis, for example, in quantifying the impact of noncrossover recombination in genomes and for determining the identity and micro-evolutionary selective pressures on noncoding regulatory elements.
Affiliation(s)
- Asher D Cutter
- Department of Ecology & Evolutionary Biology, University of Toronto, Toronto, Ontario, Canada.
|
25
|
Terekhanova NV, Bazykin GA, Neverov A, Kondrashov AS, Seplyarskiy VB. Prevalence of multinucleotide replacements in evolution of primates and Drosophila. Mol Biol Evol 2013; 30:1315-25. [PMID: 23447710 PMCID: PMC3649671 DOI: 10.1093/molbev/mst036] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Indexed: 12/15/2022]
Abstract
Evolution of sequences mostly involves independent changes at different sites. However, substitutions at neighboring sites may co-occur as multinucleotide replacement events (MNRs). Here, we compare noncoding sequences of several species of primates, and of three species of Drosophila fruit flies, in a phylogenetic analysis of the replacements that occurred between species at nearby nucleotide sites. Both in primates and in Drosophila, the frequency of single-nucleotide replacements is substantially elevated within 10 nucleotides from other replacements that occurred on the same lineage but not on another lineage. The data imply that dinucleotide replacements (DNRs) affecting sites at distances of up to 10 nucleotides from each other are responsible for 2.3% of single-nucleotide replacements in primate genomes and for 5.6% in Drosophila genomes. Among these DNRs, 26% and 69%, respectively, are in fact parts of replacements of three or more nucleotides (TNRs). The plurality of MNRs affect nearby nucleotides, so that at least six times as many DNRs affect two adjacent nucleotide sites as affect sites 10 nucleotides apart. Still, approximately 60% of DNRs, and approximately 90% of TNRs, span distances more than two (or three) nucleotides. MNRs make a major contribution to the observed clustering of substitutions: In the human–chimpanzee comparison, DNRs are responsible for 50% of cases when two nearby replacements are observed on the human lineage, and TNRs are responsible for 83% of cases when three replacements at three immediately adjacent sites are observed on the human lineage. The prevalence of MNRs matches that observed in data on de novo mutations and is also observed in the regions with the lowest sequence conservation, suggesting that MNRs mainly have mutational origin; however, epistatic selection and/or gene conversion may also play a role.
Affiliation(s)
- Nadezhda V Terekhanova
- Department of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
|
26
|
Tóth-Petróczy Á, Tawfik DS. Protein Insertions and Deletions Enabled by Neutral Roaming in Sequence Space. Mol Biol Evol 2013; 30:761-71. [DOI: 10.1093/molbev/mst003] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Indexed: 01/09/2023]
|
27
|
The emergence of protein complexes: quaternary structure, dynamics and allostery. Colworth Medal Lecture. Biochem Soc Trans 2012; 40:475-91. [PMID: 22616857 DOI: 10.1042/bst20120056] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.9] [Indexed: 01/27/2023]
Abstract
All proteins require physical interactions with other proteins in order to perform their functions. Most of them oligomerize into homomers, and a vast majority of these homomers interact with other proteins, at least part of the time, forming transient or obligate heteromers. In the present paper, we review the structural, biophysical and evolutionary aspects of these protein interactions. We discuss how protein function and stability benefit from oligomerization, as well as evolutionary pathways by which oligomers emerge, mostly from the perspective of homomers. Finally, we emphasize the specificities of heteromeric complexes and their structure and evolution. We also discuss two analytical approaches increasingly being used to study protein structures as well as their interactions. First, we review the use of the biological networks and graph theory for analysis of protein interactions and structure. Secondly, we discuss recent advances in techniques for detecting correlated mutations, with the emphasis on their role in identifying pathways of allosteric communication.
|
28
|
Hu TT, Eisen MB, Thornton KR, Andolfatto P. A second-generation assembly of the Drosophila simulans genome provides new insights into patterns of lineage-specific divergence. Genome Res 2012; 23:89-98. [PMID: 22936249 PMCID: PMC3530686 DOI: 10.1101/gr.141689.112] [Citation(s) in RCA: 119] [Impact Index Per Article: 9.2] [Indexed: 01/29/2023]
Abstract
We create a new assembly of the Drosophila simulans genome using 142 million paired short-read sequences and previously published data for strain w501. Our assembly represents a higher-quality genomic sequence with greater coverage, fewer misassemblies, and, by several indexes, fewer sequence errors. Evolutionary analysis of this genome reference sequence reveals interesting patterns of lineage-specific divergence that are different from those previously reported. Specifically, we find that Drosophila melanogaster evolves faster than D. simulans at all annotated classes of sites, including putatively neutrally evolving sites found in minimal introns. While this may be partly explained by a higher mutation rate in D. melanogaster, we also find significant heterogeneity in rates of evolution across classes of sites, consistent with historical differences in the effective population size for the two species. Also contrary to previous findings, we find that the X chromosome is evolving significantly faster than autosomes for nonsynonymous and most noncoding DNA sites and significantly slower for synonymous sites. The absence of a X/A difference for putatively neutral sites and the robustness of the pattern to Gene Ontology and sex-biased expression suggest that partly recessive beneficial mutations may comprise a substantial fraction of noncoding DNA divergence observed between species. Our results have more general implications for the interpretation of evolutionary analyses of genomes of different quality.
Affiliation(s)
- Tina T Hu
- Department of Ecology and Evolutionary Biology and the Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544, USA.
|
29
|
Bullaughey K. Multidimensional adaptive evolution of a feed-forward network and the illusion of compensation. Evolution 2012; 67:49-65. [PMID: 23289561 DOI: 10.1111/j.1558-5646.2012.01735.x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/19/2022]
Abstract
When multiple substitutions affect a trait in opposing ways, they are often assumed to be compensatory, not only with respect to the trait, but also with respect to fitness. This type of compensatory evolution has been suggested to underlie the evolution of protein structures and interactions, RNA secondary structures, and gene regulatory modules and networks. The possibility for compensatory evolution results from epistasis. Yet if epistasis is widespread, then it is also possible that the opposing substitutions are individually adaptive. I term this possibility an adaptive reversal. Although possible for arbitrary phenotype-fitness mappings, it has not yet been investigated whether such epistasis is prevalent in a biologically realistic setting. I investigate a particular regulatory circuit, the type I coherent feed-forward loop, which is ubiquitous in natural systems and is accurately described by a simple mathematical model. I show that such reversals are common during adaptive evolution, can result solely from the topology of the fitness landscape, and can occur even when adaptation follows a modest environmental change and the network was well adapted to the original environment. The possibility of adaptive reversals warrants a systems perspective when interpreting substitution patterns in gene regulatory networks.
Affiliation(s)
- Kevin Bullaughey
- Department of Ecology & Evolution, University of Chicago, Chicago, IL 60637, USA.
|
30
|
Bazykin GA, Kondrashov AS. Major role of positive selection in the evolution of conservative segments of Drosophila proteins. Proc Biol Sci 2012; 279:3409-17. [PMID: 22673359 PMCID: PMC3396909 DOI: 10.1098/rspb.2012.0776] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Indexed: 01/14/2023]
Abstract
Slow evolution of conservative segments of coding and non-coding DNA is caused by the action of negative selection, which removes new mutations. However, the mode of selection that affects the few substitutions that do occur within such segments remains unclear. Here, we show that the fraction of allele replacements that were driven by positive selection, and the strength of this selection, is the highest within the conservative segments of Drosophila protein-coding genes. The McDonald–Kreitman test, applied to the data on variation in Drosophila melanogaster and in Drosophila simulans, indicates that within the most conservative protein segments, approximately 72 per cent (approx. 80%) of allele replacements were driven by positive selection, as opposed to only approximately 44 per cent (approx. 53%) at rapidly evolving segments. Data on multiple non-synonymous substitutions at a codon lead to the same conclusion and additionally indicate that positive selection driving allele replacements at conservative sites is the strongest, as it accelerates evolution by a factor of approximately 40, as opposed to a factor of approximately 5 at rapidly evolving sites. Thus, random drift plays only a minor role in the evolution of conservative DNA segments, and those relatively rare allele replacements that occur within such segments are mostly driven by substantial positive selection.
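As a rough illustration of how a McDonald–Kreitman comparison yields a "fraction of replacements driven by positive selection", the sketch below uses the widely cited estimator alpha = 1 - (Ds * Pn) / (Dn * Ps). This is an assumption for illustration: the abstract does not state which estimator or corrections the authors applied, and the function name `mk_alpha` and all counts are hypothetical, not data from the study.

```python
def mk_alpha(dn: int, ds: int, pn: int, ps: int) -> float:
    """Estimate the fraction of nonsynonymous substitutions fixed by
    positive selection from McDonald-Kreitman counts.

    dn, ds: nonsynonymous / synonymous differences fixed between species
    pn, ps: nonsynonymous / synonymous polymorphisms within a species
    """
    if dn <= 0 or ps <= 0:
        raise ValueError("Dn and Ps must be positive to compute alpha")
    # Under strict neutrality Dn/Ds is expected to equal Pn/Ps, giving alpha = 0.
    # An excess of nonsynonymous divergence pushes alpha above 0.
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts, chosen only to show the arithmetic.
alpha = mk_alpha(dn=100, ds=200, pn=50, ps=250)
print(f"alpha = {alpha:.2f}")
```

A negative value would indicate an excess of nonsynonymous polymorphism, which is commonly attributed to segregating slightly deleterious variants.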
Affiliation(s)
- Georgii A Bazykin
- Department of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Vorbyevy Gory 1-73, Moscow 119992, Russia
|
31
|
Leushkin EV, Bazykin GA, Kondrashov AS. Insertions and deletions trigger adaptive walks in Drosophila proteins. Proc Biol Sci 2012; 279:3075-82. [PMID: 22456880 PMCID: PMC3385466 DOI: 10.1098/rspb.2011.2571] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Indexed: 11/12/2022]
Abstract
Maps that relate all possible genotypes or phenotypes to fitness—fitness landscapes—are central to the evolution of life, but remain poorly known. An insertion or a deletion (indel) of one or several amino acids constitutes a substantial leap of a protein within the space of amino acid sequences, and it is unlikely that after such a leap the new sequence corresponds precisely to a fitness peak. Thus, one can expect an indel in the protein-coding sequence that gets fixed in a population to be followed by some number of adaptive amino acid substitutions, which move the new sequence towards a nearby fitness peak. Here, we study substitutions that occur after a frame-preserving indel in evolving proteins of Drosophila. An insertion triggers 1.03 ± 0.75 amino acid substitutions within the protein region centred at the site of insertion, and a deletion triggers 4.77 ± 1.03 substitutions within such a region. The difference between these values is probably owing to a higher fraction of effectively neutral insertions. Almost all of the triggered amino acid substitutions can be attributed to positive selection, and most of them occur relatively soon after the triggering indel and take place upstream of its site. A high fraction of substitutions that follow an indel occur at previously conserved sites, suggesting that an indel substantially changes selection that shapes the protein region around it. Thus, an indel is often followed by an adaptive walk of length that is in agreement with the theory of molecular adaptation.
Affiliation(s)
- Evgeny V Leushkin
- Department of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Leninskye Gory 1-73, Moscow 119991, Russia.
|
32
|
Abstract
Information is a key concept in evolutionary biology. Information stored in a biological organism's genome is used to generate the organism and to maintain and control it. Information is also that which evolves. When a population adapts to a local environment, information about this environment is fixed in a representative genome. However, when an environment changes, information can be lost. At the same time, information is processed by animal brains to survive in complex environments, and the capacity for information processing also evolves. Here, I review applications of information theory to the evolution of proteins and to the evolution of information processing in simulated agents that adapt to perform a complex task.
Affiliation(s)
- Christoph Adami
- Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, Michigan 48824, USA.
|
33
|
Callahan BJ. The length scale of selection in protein evolution. Fly (Austin) 2011; 6:16-20. [PMID: 22198524 DOI: 10.4161/fly.18305] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Abstract
Central to the study of molecular evolution, and an area of long-standing debate, is the appropriate model for the fitness landscape of proteins. Much of this debate has focused on the strength and frequency of positive and purifying selection, but the form and frequency of selective correlations is also a vital element. The constituent amino acids within a protein generically interact and share selective pressures in predictable ways, which conflicts with the selective independence assumed by common caricatures of the fitness landscape. Here, I discuss a recent study by myself and coauthors that used whole-genome comparisons of orthologous molecular sequences from closely related Drosophilids to explore the form of the selective correlations and selective interactions (epistasis) between the amino acids within a protein. I outline our results and highlight our finding of a selective length scale of ten amino acids within which individual amino acids are substantially and generically more likely to share selective pressures and interact epistatically. I then focus on the evidence presented in our study supporting a substantial role for epistasis in the process of molecular evolution, and discuss further the implications of this widespread epistasis on the overdispersion of the molecular clock and the efficacy of common tests for positive selection.
|
34
|
Wilson DJ, Hernandez RD, Andolfatto P, Przeworski M. A population genetics-phylogenetics approach to inferring natural selection in coding sequences. PLoS Genet 2011; 7:e1002395. [PMID: 22144911 PMCID: PMC3228810 DOI: 10.1371/journal.pgen.1002395] [Citation(s) in RCA: 73] [Impact Index Per Article: 5.2] [Received: 08/20/2010] [Accepted: 10/08/2011] [Indexed: 01/23/2023]
Abstract
Through an analysis of polymorphism within and divergence between species, we can hope to learn about the distribution of selective effects of mutations in the genome, changes in the fitness landscape that occur over time, and the location of sites involved in key adaptations that distinguish modern-day species. We introduce a novel method for the analysis of variation in selection pressures within and between species, spatially along the genome and temporally between lineages. We model codon evolution explicitly using a joint population genetics-phylogenetics approach that we developed for the construction of multiallelic models with mutation, selection, and drift. Our approach has the advantage of performing direct inference on coding sequences, inferring ancestral states probabilistically, utilizing allele frequency information, and generalizing to multiple species. We use a Bayesian sliding window model for intragenic variation in selection coefficients that efficiently combines information across sites and captures spatial clustering within the genome. To demonstrate the utility of the method, we infer selective pressures acting in Drosophila melanogaster and D. simulans from polymorphism and divergence data for 100 X-linked coding regions.
Affiliation(s)
- Daniel J Wilson
- Department of Human Genetics and Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, USA.
|
35
|
Schneider A, Charlesworth B, Eyre-Walker A, Keightley PD. A method for inferring the rate of occurrence and fitness effects of advantageous mutations. Genetics 2011; 189:1427-37. [PMID: 21954160 PMCID: PMC3241409 DOI: 10.1534/genetics.111.131730] [Citation(s) in RCA: 84] [Impact Index Per Article: 6.0] [Received: 06/16/2011] [Accepted: 09/24/2011] [Indexed: 11/18/2022]
Abstract
The distribution of fitness effects (DFE) of new mutations is of fundamental importance in evolutionary genetics. Recently, methods have been developed for inferring the DFE that use information from the allele frequency distributions of putatively neutral and selected nucleotide polymorphic variants in a population sample. Here, we extend an existing maximum-likelihood method that estimates the DFE under the assumption that mutational effects are unconditionally deleterious, by including a fraction of positively selected mutations. We allow one or more classes of positive selection coefficients in the model and estimate both the fraction of mutations that are advantageous and the strength of selection acting on them. We show by simulations that the method is capable of recovering the parameters of the DFE under a range of conditions. We apply the method to two data sets on multiple protein-coding genes from African populations of Drosophila melanogaster. We use a probabilistic reconstruction of the ancestral states of the polymorphic sites to distinguish between derived and ancestral states at polymorphic nucleotide sites. In both data sets, we see a significant improvement in the fit when a category of positively selected amino acid mutations is included, but no further improvement if additional categories are added. We estimate that between 1% and 2% of new nonsynonymous mutations in D. melanogaster are positively selected, with a scaled selection coefficient representing the product of the effective population size, N(e), and the strength of selection on heterozygous carriers of ∼2.5.
Affiliation(s)
- Adrian Schneider
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom
- Brian Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom
- Adam Eyre-Walker
- School of Life Sciences, University of Sussex, Brighton BN1 9QG, United Kingdom
- Peter D. Keightley
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom
|
36
|
Huang YF, Golding GB. Inferring sequence regions under functional divergence in duplicate genes. Bioinformatics 2011; 28:176-83. [DOI: 10.1093/bioinformatics/btr635] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/13/2022]
|
37
|
Weighing the evidence for adaptation at the molecular level. Trends Genet 2011; 27:343-9. [PMID: 21775012 DOI: 10.1016/j.tig.2011.06.003] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.9] [Received: 04/04/2011] [Revised: 06/10/2011] [Accepted: 06/10/2011] [Indexed: 11/24/2022]
Abstract
The abundance of genome polymorphism and divergence data has provided unprecedented insight into how mutation, drift and natural selection shape genome evolution. Application of the McDonald-Kreitman (MK) test to such data indicates a pervasive influence of positive selection, particularly in Drosophila species. However, evidence for positive selection in other species ranging from yeast to humans is often weak or absent. Although evidence for positive selection could be obscured in some species, there is also reason to believe that the frequency of adaptive substitutions could be overestimated as a result of epistatic fitness effects or hitchhiking of deleterious mutations. Based on these considerations it is argued that the common assumption of independence among sites must be relaxed before abandoning the neutral theory of molecular evolution.
|