1
Notin P, Kollasch AW, Ritter D, van Niekerk L, Paul S, Spinner H, Rollins N, Shaw A, Weitzman R, Frazer J, Dias M, Franceschi D, Orenbuch R, Gal Y, Marks DS. ProteinGym: Large-Scale Benchmarks for Protein Design and Fitness Prediction. bioRxiv 2023:2023.12.07.570727. [PMID: 38106144 PMCID: PMC10723403 DOI: 10.1101/2023.12.07.570727] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Indexed: 12/19/2023]
Abstract
Predicting the effects of mutations in proteins is critical to many applications, from understanding genetic disease to designing novel proteins that can address our most pressing challenges in climate, agriculture and healthcare. Despite a surge in machine learning-based protein models to tackle these questions, an assessment of their respective benefits is challenging due to the use of distinct, often contrived, experimental datasets, and the variable performance of models across different protein families. Addressing these challenges requires scale. To that end we introduce ProteinGym, a large-scale and holistic set of benchmarks specifically designed for protein fitness prediction and design. It encompasses both a broad collection of over 250 standardized deep mutational scanning assays, spanning millions of mutated sequences, and curated clinical datasets providing high-quality expert annotations about mutation effects. We devise a robust evaluation framework that combines metrics for both fitness prediction and design, factors in known limitations of the underlying experimental methods, and covers both zero-shot and supervised settings. We report the performance of a diverse set of over 70 high-performing models from various subfields (e.g., alignment-based, inverse folding) in a unified benchmark suite. We open source the corresponding codebase, datasets, MSAs, structures, and model predictions, and develop a user-friendly website that facilitates data access and analysis.
Affiliation(s)
- Ada Shaw
- Applied Mathematics, Harvard University
- Mafalda Dias
- Centre for Genomic Regulation, Universitat Pompeu Fabra
- Yarin Gal
- Computer Science, University of Oxford
2
Qu Y, Niu Z, Ding Q, Zhao T, Kong T, Bai B, Ma J, Zhao Y, Zheng J. Ensemble Learning with Supervised Methods Based on Large-Scale Protein Language Models for Protein Mutation Effects Prediction. Int J Mol Sci 2023; 24:16496. [PMID: 38003686 PMCID: PMC10671426 DOI: 10.3390/ijms242216496] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/13/2023] [Revised: 11/11/2023] [Accepted: 11/17/2023] [Indexed: 11/26/2023] Open
Abstract
Machine learning is increasingly used in protein engineering, and predicting the effects of protein mutations has attracted growing attention. So far, the best results have been achieved by methods based on protein language models, which are trained on large numbers of unlabeled protein sequences to capture the evolutionary rules hidden in those sequences and can therefore predict fitness from sequence alone. Although many such models and methods have been successfully employed in practical protein engineering, most studies have focused on constructing more complex language models that capture richer sequence features and on using those features for unsupervised fitness prediction. Considerable potential in the existing models remains untapped, such as whether prediction accuracy can be further improved by integrating different models. Furthermore, because the relationship between protein fitness and the quantification of specific functionalities is nonlinear, how to use large-scale models to predict mutational effects on quantifiable protein properties has yet to be explored thoroughly. In this study, we propose an ensemble learning approach for predicting mutational effects that integrates protein sequence features extracted from multiple large protein language models with evolutionarily coupled features extracted from homologous sequences, and we compare linear regression and deep learning models for mapping these features to quantifiable functional changes. We tested our approach on a dataset of 17 protein deep mutational scans and found that the integrated approach combined with linear regression gives the models higher prediction accuracy and better generalization. Moreover, we further illustrate the reliability of the integrated approach by exploring differences in predictive performance across species and protein sequence lengths, and by visualizing the clustering of ensemble and non-ensemble features.
Affiliation(s)
- Yang Qu
- Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo 315300, China; (Y.Q.); (Z.N.); (Q.D.); (T.Z.)
- Cixi Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo 315300, China; (T.K.); (B.B.); (J.M.)
- Zitong Niu
- Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo 315300, China; (Y.Q.); (Z.N.); (Q.D.); (T.Z.)
- Cixi Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo 315300, China; (T.K.); (B.B.); (J.M.)
- Qiaojiao Ding
- Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo 315300, China; (Y.Q.); (Z.N.); (Q.D.); (T.Z.)
- Cixi Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo 315300, China; (T.K.); (B.B.); (J.M.)
- Taowa Zhao
- Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo 315300, China; (Y.Q.); (Z.N.); (Q.D.); (T.Z.)
- Cixi Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo 315300, China; (T.K.); (B.B.); (J.M.)
- Tong Kong
- Cixi Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo 315300, China; (T.K.); (B.B.); (J.M.)
- Bing Bai
- Cixi Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo 315300, China; (T.K.); (B.B.); (J.M.)
- Jianwei Ma
- Cixi Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo 315300, China; (T.K.); (B.B.); (J.M.)
- Yitian Zhao
- Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo 315300, China; (Y.Q.); (Z.N.); (Q.D.); (T.Z.)
- Cixi Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo 315300, China; (T.K.); (B.B.); (J.M.)
- Jianping Zheng
- Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo 315300, China; (Y.Q.); (Z.N.); (Q.D.); (T.Z.)
- Cixi Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo 315300, China; (T.K.); (B.B.); (J.M.)
3
Flynn J, Samant N, Schneider-Nachum G, Tenzin T, Bolon DNA. Mutational fitness landscape and drug resistance. Curr Opin Struct Biol 2023; 78:102525. [PMID: 36621152 PMCID: PMC10243218 DOI: 10.1016/j.sbi.2022.102525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2022] [Revised: 11/29/2022] [Accepted: 12/06/2022] [Indexed: 01/08/2023]
Abstract
Robust technology has been developed to systematically quantify fitness landscapes that provide valuable opportunities to improve our understanding of drug resistance and define new avenues to develop drugs with reduced resistance susceptibility. We outline the critical importance of drug resistance studies and the potential for fitness landscape approaches to contribute to this effort. We describe the major technical advancements in mutational scanning, which is the primary approach used to quantify protein fitness landscapes. There are many complex steps to consider in planning and executing mutational scanning projects including developing a selection scheme, generating mutant libraries, tracking the frequency of variants using next-generation sequencing, and processing and interpreting the data. Key experimental parameters impacting each of these steps are discussed to aid in planning fitness landscape studies. There is a strong need for improved understanding of drug resistance, and fitness landscapes provide a promising new approach.
Affiliation(s)
- Julia Flynn
- Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Neha Samant
- Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Gily Schneider-Nachum
- Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Tsepal Tenzin
- Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Daniel N A Bolon
- Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
4
Lansch‐Justen L, Cusseddu D, Schmitz MA, Bank C. The extinction time under mutational meltdown driven by high mutation rates. Ecol Evol 2022; 12:e9046. [PMID: 35813923 PMCID: PMC9257376 DOI: 10.1002/ece3.9046] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/25/2022] [Revised: 06/01/2022] [Accepted: 06/04/2022] [Indexed: 01/15/2023] Open
Abstract
Mutational meltdown describes an eco-evolutionary process in which the accumulation of deleterious mutations causes a fitness decline that eventually leads to the extinction of a population. Possible applications of this concept include medical treatment of RNA virus infections based on mutagenic drugs that increase the mutation rate of the pathogen. To determine the usefulness and expected success of such an antiviral treatment, estimates of the expected time to mutational meltdown are necessary. Here, we compute the extinction time of a population under high mutation rates, using both analytical approaches and stochastic simulations. Extinction is the result of three consecutive processes: (a) initial accumulation of deleterious mutations due to the increased mutation pressure; (b) consecutive loss of the fittest haplotype due to Muller's ratchet; (c) rapid population decline toward extinction. We find accurate analytical results for the mean extinction time, which show that the deleterious mutation rate has the strongest effect on the extinction time. We confirm that intermediate-sized deleterious selection coefficients minimize the extinction time. Finally, our simulations show that the variation in extinction time, given a set of parameters, is surprisingly small.
Affiliation(s)
- Lucy Lansch‐Justen
- Instituto Gulbenkian de Ciência, Oeiras, Portugal
- Institute of Evolution and Ecology, University of Edinburgh, Edinburgh, UK
- Davide Cusseddu
- Instituto Gulbenkian de Ciência, Oeiras, Portugal
- Grupo Física‐Matemática, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal
- Claudia Bank
- Instituto Gulbenkian de Ciência, Oeiras, Portugal
- Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
5
Identification of a permissive secondary mutation that restores the enzymatic activity of oseltamivir resistance mutation H275Y. J Virol 2022; 96:e0198221. [PMID: 35045267 DOI: 10.1128/jvi.01982-21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
Abstract
Many oseltamivir resistance mutations exhibit fitness defects in the absence of drug pressure that hinder their propagation in hosts. Secondary permissive mutations can rescue fitness defects and facilitate the segregation of resistance mutations in viral populations. Previous studies have identified a panel of permissive or compensatory mutations in neuraminidase (NA) that restore the growth defect of the predominant oseltamivir resistance mutation (H275Y) in H1N1 influenza A. In prior work, we identified a hyperactive mutation (Y276F) that increased NA activity by approximately 70%. While Y276F had not been previously identified as a permissive mutation, we hypothesized that Y276F may counteract the defects caused by H275Y by buffering its reduced NA expression and enzyme activity. In this study we measured the relative fitness, NA activity, and surface expression, as well as sensitivity to oseltamivir, for several oseltamivir resistance mutations including H275Y in the wildtype or Y276F genetic background. Our results demonstrate that Y276F selectively rescues the fitness defect of H275Y by restoring its NA surface expression and enzymatic activity, elucidating the local compensatory structural impacts of Y276F on the adjacent H275Y.
Importance
The potential for influenza A virus (IAV) to cause pandemics makes understanding evolutionary mechanisms that impact drug resistance critical for developing surveillance and treatment strategies. Oseltamivir is the most widely used therapeutic strategy to treat IAV infections, but mutations in IAV can lead to drug resistance. The main oseltamivir resistance mutation, H275Y, occurs in the neuraminidase (NA) protein of IAV and reduces drug binding as well as NA function. Here, we identify a new helper mutation, Y276F, that can rescue the functional defects of H275Y and contribute to the evolution of drug resistance in IAV.
6
Flynn JM, Rossouw A, Cote-Hammarlof P, Fragata I, Mavor D, Hollins C, Bank C, Bolon DN. Comprehensive fitness maps of Hsp90 show widespread environmental dependence. eLife 2020; 9:53810. [PMID: 32129763 PMCID: PMC7069724 DOI: 10.7554/elife.53810] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 11/21/2019] [Accepted: 03/03/2020] [Indexed: 12/29/2022] Open
Abstract
Gene-environment interactions have long been theorized to influence molecular evolution. However, the environmental dependence of most mutations remains unknown. Using deep mutational scanning, we engineered yeast with all 44,604 single codon changes encoding 14,160 amino acid variants in Hsp90 and quantified growth effects under standard conditions and under five stress conditions. To our knowledge, these are the largest determined comprehensive fitness maps of point mutants. The growth of many variants differed between conditions, indicating that environment can have a large impact on Hsp90 evolution. Multiple variants provided growth advantages under individual conditions; however, these variants tended to exhibit growth defects in other environments. The diversity of Hsp90 sequences observed in extant eukaryotes preferentially contains variants that supported robust growth under all tested conditions. Rather than favoring substitutions in individual conditions, the long-term selective pressure on Hsp90 may have been that of fluctuating environments, leading to robustness under a variety of conditions.
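The two library sizes quoted in this abstract can be cross-checked with simple arithmetic. The sketch below rests on two assumptions that the abstract does not state: that each mutated position was replaced by all 63 alternative codons, and that the 20 distinct outcomes per position are the 19 other amino acids plus a stop codon. The function name `implied_positions` is hypothetical and used only for illustration.

```python
# Figures quoted in the abstract.
CODON_CHANGES = 44_604
AA_VARIANTS = 14_160

# Assumptions (not stated in the abstract): each position receives every
# alternative codon, and outcomes are 19 other amino acids plus a stop.
ALT_CODONS_PER_POSITION = 4 ** 3 - 1   # 64 codons minus the wild-type codon
OUTCOMES_PER_POSITION = 19 + 1         # 19 substitutions + 1 stop


def implied_positions(total: int, per_position: int) -> int:
    """Return total / per_position, raising if it is not a whole number."""
    positions, remainder = divmod(total, per_position)
    if remainder:
        raise ValueError(f"{total} is not a multiple of {per_position}")
    return positions


positions_from_codons = implied_positions(CODON_CHANGES, ALT_CODONS_PER_POSITION)
positions_from_variants = implied_positions(AA_VARIANTS, OUTCOMES_PER_POSITION)
print(positions_from_codons, positions_from_variants)
```

Under these assumptions both divisions are exact and give the same number of positions (708), so the two totals are mutually consistent. This is an inference from the arithmetic, not a figure reported in the abstract.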
Affiliation(s)
- Julia M Flynn
- Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, United States
- Ammeret Rossouw
- Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, United States
- Pamela Cote-Hammarlof
- Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, United States
- Inês Fragata
- Instituto Gulbenkian de Ciência, Oeiras, Portugal
- David Mavor
- Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, United States
- Carl Hollins
- Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, United States
- Claudia Bank
- Instituto Gulbenkian de Ciência, Oeiras, Portugal
- Daniel N A Bolon
- Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, United States
7
Esposito D, Weile J, Shendure J, Starita LM, Papenfuss AT, Roth FP, Fowler DM, Rubin AF. MaveDB: an open-source platform to distribute and interpret data from multiplexed assays of variant effect. Genome Biol 2019; 20:223. [PMID: 31679514 PMCID: PMC6827219 DOI: 10.1186/s13059-019-1845-6] [Citation(s) in RCA: 138] [Impact Index Per Article: 23.0] [Received: 03/31/2019] [Accepted: 10/01/2019] [Indexed: 11/10/2022] Open
Abstract
Multiplex assays of variant effect (MAVEs), such as deep mutational scans and massively parallel reporter assays, test thousands of sequence variants in a single experiment. Despite the importance of MAVE data for basic and clinical research, there is no standard resource for their discovery and distribution. Here, we present MaveDB (https://www.mavedb.org), a public repository for large-scale measurements of sequence variant impact, designed for interoperability with applications to interpret these datasets. We also describe the first such application, MaveVis, which retrieves, visualizes, and contextualizes variant effect maps. Together, the database and applications will empower the community to mine these powerful datasets.
Affiliation(s)
- Daniel Esposito
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Jochen Weile
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Jay Shendure
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Lea M Starita
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- Anthony T Papenfuss
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Bioinformatics and Cancer Genomics Laboratory, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
- Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, VIC, Australia
- Department of Mathematics and Statistics, University of Melbourne, Melbourne, VIC, Australia
- Frederick P Roth
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Canadian Institute for Advanced Research, Toronto, ON, Canada
- Douglas M Fowler
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Canadian Institute for Advanced Research, Toronto, ON, Canada
- Department of Bioengineering, University of Washington, Seattle, WA, USA
- Alan F Rubin
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Bioinformatics and Cancer Genomics Laboratory, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
8
Kemble H, Nghe P, Tenaillon O. Recent insights into the genotype-phenotype relationship from massively parallel genetic assays. Evol Appl 2019; 12:1721-1742. [PMID: 31548853 PMCID: PMC6752143 DOI: 10.1111/eva.12846] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 03/27/2019] [Revised: 06/21/2019] [Accepted: 07/02/2019] [Indexed: 12/20/2022] Open
Abstract
With the molecular revolution in biology, a mechanistic understanding of the genotype-phenotype relationship became possible. Recently, advances in DNA synthesis and sequencing have enabled the development of deep mutational scanning assays, capable of scoring comprehensive libraries of genotypes for fitness and a variety of phenotypes in massively parallel fashion. The resulting empirical genotype-fitness maps pave the way to predictive models, potentially accelerating our ability to anticipate the behaviour of pathogen and cancerous cell populations from sequencing data. Besides cellular fitness, phenotypes of direct application in industry (e.g. enzyme activity) and medicine (e.g. antibody binding) can be quantified and even selected directly by these assays. This review discusses the technological basis of and recent developments in massively parallel genetics, along with the trends it is uncovering in the genotype-phenotype relationship (distribution of mutation effects, epistasis), their possible mechanistic bases and future directions for advancing towards the goal of predictive genetics.
Affiliation(s)
- Harry Kemble
- Infection, Antimicrobials, Modelling, Evolution, INSERM, Unité Mixte de Recherche 1137, Université Paris Diderot, Université Paris Nord, Paris, France
- École Supérieure de Physique et de Chimie Industrielles de la Ville de Paris (ESPCI Paris), UMR CNRS‐ESPCI CBI 8231, PSL Research University, Paris Cedex 05, France
- Philippe Nghe
- École Supérieure de Physique et de Chimie Industrielles de la Ville de Paris (ESPCI Paris), UMR CNRS‐ESPCI CBI 8231, PSL Research University, Paris Cedex 05, France
- Olivier Tenaillon
- Infection, Antimicrobials, Modelling, Evolution, INSERM, Unité Mixte de Recherche 1137, Université Paris Diderot, Université Paris Nord, Paris, France
9
Soh YS, Moncla LH, Eguia R, Bedford T, Bloom JD. Comprehensive mapping of adaptation of the avian influenza polymerase protein PB2 to humans. eLife 2019; 8:45079. [PMID: 31038123 PMCID: PMC6491042 DOI: 10.7554/elife.45079] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 01/11/2019] [Accepted: 03/31/2019] [Indexed: 12/11/2022] Open
Abstract
Viruses like influenza are infamous for their ability to adapt to new hosts. Retrospective studies of natural zoonoses and passaging in the lab have identified a modest number of host-adaptive mutations. However, it is unclear if these mutations represent all ways that influenza can adapt to a new host. Here we take a prospective approach to this question by completely mapping amino-acid mutations to the avian influenza virus polymerase protein PB2 that enhance growth in human cells. We identify numerous previously uncharacterized human-adaptive mutations. These mutations cluster on PB2’s surface, highlighting potential interfaces with host factors. Some previously uncharacterized adaptive mutations occur in avian-to-human transmission of H7N9 influenza, showing their importance for natural virus evolution. But other adaptive mutations do not occur in nature because they are inaccessible via single-nucleotide mutations. Overall, our work shows how selection at key molecular surfaces combines with evolutionary accessibility to shape viral host adaptation.
Viruses copy themselves by hijacking the cells of an infected host, but this comes with some limitations. Cells from different species have different molecular machinery and so viruses often have to specialize to a narrow group of species. This specialization consists largely of fine-tuning the way that viral proteins interact with host proteins. For instance, in bird flu viruses, a protein known as PB2 does not interact well with the machinery in human cells. Because PB2 proteins form part of the viral polymerase (the structure that copies the viral genome), this prevents bird flu viruses from replicating efficiently in humans. Sometimes however, changes in the PB2 protein allow bird flu viruses to better replicate in humans, potentially leading to deadly flu pandemics.
To understand exactly how this happens, researchers have previously used two approaches: examining the changes that have happened in past flu viruses, and monitoring the evolution of bird flu viruses grown in human cells in the lab. However, these approaches can only look at a small number of the many possible genetic changes to the virus. This makes it hard to anticipate the new ways that flu might adapt to human cells in the future. To overcome this problem, Soh et al. systematically created all of the single changes to the bird flu PB2, altering every element of the protein sequence one-by-one. They then tested which of the changes to PB2 helped the virus grow better in human cells. The modifications that made the viruses thrive were on the surface of the protein, suggesting that they might improve interaction with the cell machinery of the host. Some changes have been found in bird flu viruses that have recently jumped into humans in nature, although fortunately none of these viruses have yet spread widely to cause a pandemic. Many factors affect the evolution of viruses, and their ability to infect new species. Understanding which changes in proteins help these microbes adapt to new hosts is an important element that scientists could consider to assess future risks of pandemics.
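The abstract's point that some adaptive mutations are "inaccessible via single-nucleotide mutations" can be illustrated with the standard genetic code. The sketch below is not the authors' analysis: it simply enumerates, for one codon, which amino acids are one nucleotide change away and which need two or more. The function names are hypothetical, and the codon ATG is an arbitrary example rather than a PB2 site.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_nucleotide_neighbors(codon: str) -> set[str]:
    """Amino acids reachable from `codon` by changing exactly one nucleotide.

    Synonymous changes and stop codons are excluded.
    """
    codon = codon.upper().replace("U", "T")
    wild_type = CODON_TABLE[codon]
    reachable = set()
    for position in range(3):
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            aa = CODON_TABLE[mutant]
            if aa not in (wild_type, "*"):
                reachable.add(aa)
    return reachable


def inaccessible_amino_acids(codon: str) -> set[str]:
    """Amino acids that need two or more nucleotide changes from `codon`."""
    all_amino_acids = set(AMINO_ACIDS) - {"*"}
    wild_type = CODON_TABLE[codon.upper().replace("U", "T")]
    return all_amino_acids - single_nucleotide_neighbors(codon) - {wild_type}


print(sorted(single_nucleotide_neighbors("ATG")))
print(sorted(inaccessible_amino_acids("ATG")))
```

For the methionine codon ATG, only six amino acids (I, K, L, R, T, V) are one nucleotide change away, so the remaining thirteen would require at least two changes. A beneficial substitution in that second group is much less likely to arise in nature even when a comprehensive mutant library shows that it improves growth.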
Affiliation(s)
- YQ Shirleen Soh
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, United States
- Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, United States
- Louise H Moncla
- Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, United States
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, United States
- Rachel Eguia
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, United States
- Trevor Bedford
- Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, United States
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, United States
- Jesse D Bloom
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, United States
- Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, United States
- Howard Hughes Medical Institute, Seattle, United States
10
Ecological and Evolutionary Processes Shaping Viral Genetic Diversity. Viruses 2019; 11:v11030220. [PMID: 30841497 PMCID: PMC6466605 DOI: 10.3390/v11030220] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 01/24/2019] [Revised: 02/22/2019] [Accepted: 02/27/2019] [Indexed: 02/07/2023] Open
Abstract
The contemporary genomic diversity of viruses is a result of the continuous and dynamic interaction of past ecological and evolutionary processes. Thus, genome sequences of viruses can be a valuable source of information about these processes. In this review, we first describe the relevant processes shaping viral genomic variation, with a focus on the role of host–virus coevolution and its potential to give rise to eco-evolutionary feedback loops. We further give a brief overview of available methodology designed to extract information about these processes from genomic data. Short generation times and small genomes make viruses ideal model systems to study the joint effect of complex coevolutionary and eco-evolutionary interactions on genetic evolution. This complexity, together with the diverse array of lifetime and reproductive strategies in viruses, calls for extensions of existing inference methods, for example by integrating multiple information sources. Such integration can broaden the applicability of genetic inference methods and thus further improve our understanding of the role viruses play in biological communities.
11
Starbæk SMR, Brogaard L, Dawson HD, Smith AD, Heegaard PMH, Larsen LE, Jungersen G, Skovgaard K. Animal Models for Influenza A Virus Infection Incorporating the Involvement of Innate Host Defenses: Enhanced Translational Value of the Porcine Model. ILAR J 2018; 59:323-337. [PMID: 30476076 DOI: 10.1093/ilar/ily009] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 11/06/2017] [Revised: 06/19/2018] [Indexed: 01/05/2025] Open
Abstract
Influenza is a viral respiratory disease having a major impact on public health. Influenza A virus (IAV) usually causes mild transitory disease in humans. However, in specific groups of individuals such as the severely obese, the elderly, and individuals with underlying inflammatory conditions, IAV can cause severe illness or death. In this review, relevant small and large animal models for human IAV infection, including the pig, ferret, and mouse, are discussed. The focus is on the pig as a large animal model for human IAV infection as well as on the associated innate immune response. Pigs are natural hosts for the same IAV subtypes as humans, they develop clinical disease mirroring human symptoms, they have similar lung anatomy, and their respiratory physiology and immune responses to IAV infection are remarkably similar to what is observed in humans. The pig model shows high face and target validity for human IAV infection, making it suitable for modeling many aspects of influenza, including increased risk of severe disease and impaired vaccine response due to underlying pathologies such as low-grade inflammation. Comparative analysis of proteins involved in viral pattern recognition, interferon responses, and regulation of interferon-stimulated genes reveals a significantly higher degree of similarity between pig, ferret, and human compared with mice. It is concluded that the pig is a promising animal model displaying substantial human translational value with the ability to provide essential insights into IAV infection, pathogenesis, and immunity.
Affiliation(s)
- Sofie M R Starbæk: Department of Biotechnology and Biomedicine, Technical University of Denmark, Kongens Lyngby, Denmark
- Louise Brogaard: Department of Biotechnology and Biomedicine, Technical University of Denmark, Kongens Lyngby, Denmark
- Harry D Dawson: Beltsville Human Nutrition Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, Maryland
- Allen D Smith: Beltsville Human Nutrition Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, Maryland
- Peter M H Heegaard: Department of Biotechnology and Biomedicine, Technical University of Denmark, Kongens Lyngby, Denmark
- Lars E Larsen: National Veterinary Institute, Technical University of Denmark, Kongens Lyngby, Denmark
- Gregers Jungersen: Department of Biotechnology and Biomedicine, Technical University of Denmark, Kongens Lyngby, Denmark
- Kerstin Skovgaard: Department of Biotechnology and Biomedicine, Technical University of Denmark, Kongens Lyngby, Denmark
12
Kinetic, Thermodynamic, and Structural Analysis of Drug Resistance Mutations in Neuraminidase from the 2009 Pandemic Influenza Virus. Viruses 2018; 10:v10070339. [PMID: 29933553 PMCID: PMC6071225 DOI: 10.3390/v10070339] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 05/06/2018] [Revised: 06/14/2018] [Accepted: 06/19/2018] [Indexed: 12/25/2022]
Abstract
Neuraminidase is the main target for current influenza drugs. Reduced susceptibility to oseltamivir, the most widely prescribed neuraminidase inhibitor, has been repeatedly reported. The resistance substitutions I223V and S247N, alone or in combination with the major oseltamivir-resistance mutation H275Y, have been observed in 2009 pandemic H1N1 viruses. We overexpressed and purified the ectodomain of wild-type neuraminidase from the A/California/07/2009 (H1N1) influenza virus, as well as variants containing H275Y, I223V, and S247N single mutations and H275Y/I223V and H275Y/S247N double mutations. We performed enzymological and thermodynamic analyses and structurally examined the resistance mechanism. Our results reveal that the I223V or S247N substitution alone confers only a moderate reduction in oseltamivir affinity. In contrast, the major oseltamivir resistance mutation H275Y causes a significant decrease in the enzyme’s ability to bind this drug. Combination of H275Y with an I223V or S247N mutation results in extreme impairment of oseltamivir’s inhibition potency. Our structural analyses revealed that the H275Y substitution has a major effect on the oseltamivir binding pose within the active site while the influence of other studied mutations is much less prominent. Our crystal structures also helped explain the augmenting effect on resistance of combining H275Y with both substitutions.
13
Abstract
The deterministic force of natural selection and stochastic influence of drift shape RNA virus evolution. New deep-sequencing and microfluidics technologies allow us to quantify the effect of mutations and trace the evolution of viral populations with single-genome and single-nucleotide resolution. Such experiments can reveal the topography of the genotype-fitness landscapes that shape the path of viral evolution. By combining historical analyses, like phylogenetic approaches, with high-throughput and high-resolution evolutionary experiments, we can observe parallel patterns of evolution that drive important phenotypic transitions. These developments provide a framework for quantifying and anticipating potential evolutionary events. Here, we examine emerging technologies that can map the selective landscapes of viruses, focusing on their application to pathogenic viruses. We identify areas where these technologies can bolster our ability to study the evolution of viruses and to anticipate and possibly intervene in evolutionary events and prevent viral disease.
Affiliation(s)
- Patrick T Dolan: Department of Biology, Stanford University, E200 Clark Center, 318 Campus Drive, Stanford, CA 94305, USA; Department of Microbiology and Immunology, University of California, San Francisco, 600 16th Street, GH-S572, UCSF Box 2280, San Francisco, CA 94143-2280, USA
- Zachary J Whitfield: Department of Microbiology and Immunology, University of California, San Francisco, 600 16th Street, GH-S572, UCSF Box 2280, San Francisco, CA 94143-2280, USA
- Raul Andino: Department of Microbiology and Immunology, University of California, San Francisco, 600 16th Street, GH-S572, UCSF Box 2280, San Francisco, CA 94143-2280, USA
14
Canale AS, Venev SV, Whitfield TW, Caffrey DR, Marasco WA, Schiffer CA, Kowalik TF, Jensen JD, Finberg RW, Zeldovich KB, Wang JP, Bolon DNA. Synonymous Mutations at the Beginning of the Influenza A Virus Hemagglutinin Gene Impact Experimental Fitness. J Mol Biol 2018; 430:1098-1115. [PMID: 29466705 DOI: 10.1016/j.jmb.2018.02.009] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 10/20/2017] [Revised: 01/19/2018] [Accepted: 02/05/2018] [Indexed: 01/15/2023]
Abstract
The fitness effects of synonymous mutations can provide insights into biological and evolutionary mechanisms. We analyzed the experimental fitness effects of all single-nucleotide mutations, including synonymous substitutions, at the beginning of the influenza A virus hemagglutinin (HA) gene. Many synonymous substitutions were deleterious both in bulk competition and for individually isolated clones. Investigating protein and RNA levels of a subset of individually expressed HA variants revealed that multiple biochemical properties contribute to the observed experimental fitness effects. Our results indicate that a structural element in the HA segment viral RNA may influence fitness. Examination of naturally evolved sequences in human hosts indicates a preference for the unfolded state of this structural element compared to that found in swine hosts. Our overall results reveal that synonymous mutations may have greater fitness consequences than indicated by simple models of sequence conservation, and we discuss the implications of this finding for commonly used evolutionary tests and analyses.
Affiliation(s)
- Aneth S Canale: Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, MA 01655, USA
- Sergey V Venev: Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01655, USA
- Troy W Whitfield: Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01655, USA; Department of Medicine, University of Massachusetts Medical School, Worcester, MA 01655, USA
- Daniel R Caffrey: Department of Medicine, University of Massachusetts Medical School, Worcester, MA 01655, USA
- Wayne A Marasco: Department of Cancer Immunology & Virology, Dana-Farber Cancer Institute, Harvard Medical School, 450 Brookline Avenue, Boston, MA 02215, USA
- Celia A Schiffer: Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, MA 01655, USA
- Timothy F Kowalik: Department of Microbiology and Physiological Systems, University of Massachusetts Medical School, Worcester, MA 01655, USA
- Jeffrey D Jensen: School of Life Sciences, Center for Evolution & Medicine, Arizona State University, Tempe, AZ 85281, USA
- Robert W Finberg: Department of Medicine, University of Massachusetts Medical School, Worcester, MA 01655, USA
- Konstantin B Zeldovich: Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01655, USA
- Jennifer P Wang: Department of Medicine, University of Massachusetts Medical School, Worcester, MA 01655, USA
- Daniel N A Bolon: Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, MA 01655, USA
15
Rubin AF, Gelman H, Lucas N, Bajjalieh SM, Papenfuss AT, Speed TP, Fowler DM. Correction to: A statistical framework for analyzing deep mutational scanning data. Genome Biol 2018; 19:17. [PMID: 29415752 PMCID: PMC5803959 DOI: 10.1186/s13059-018-1391-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Abstract
Correction: After publication of our article [1], it was brought to our attention that a line of code was missing from our program to combine the within-replicate variance and between-replicate variance. This led to an overestimation of the standard errors calculated using the Enrich2 random-effects model.
Affiliation(s)
- Alan F Rubin: Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Australia; Department of Medical Biology, University of Melbourne, Melbourne, Australia; Bioinformatics and Cancer Genomics Laboratory, Peter MacCallum Cancer Centre, Melbourne, Australia; Department of Genome Sciences, University of Washington, Seattle, USA
- Hannah Gelman: Department of Genome Sciences, University of Washington, Seattle, USA; Institute for Protein Design, University of Washington, Seattle, USA
- Nathan Lucas: Department of Pathology, University of Washington, Seattle, USA
- Anthony T Papenfuss: Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Australia; Department of Medical Biology, University of Melbourne, Melbourne, Australia; Bioinformatics and Cancer Genomics Laboratory, Peter MacCallum Cancer Centre, Melbourne, Australia; Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Australia; Department of Mathematics and Statistics, University of Melbourne, Melbourne, Australia
- Terence P Speed: Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Australia; Department of Mathematics and Statistics, University of Melbourne, Melbourne, Australia
- Douglas M Fowler: Department of Genome Sciences, University of Washington, Seattle, USA; Department of Bioengineering, University of Washington, Seattle, USA
16
Evolutionary mechanisms studied through protein fitness landscapes. Curr Opin Struct Biol 2018; 48:141-148. [DOI: 10.1016/j.sbi.2018.01.001] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 11/16/2017] [Revised: 12/26/2017] [Accepted: 01/01/2018] [Indexed: 12/15/2022]
17
Rubin AF, Gelman H, Lucas N, Bajjalieh SM, Papenfuss AT, Speed TP, Fowler DM. A statistical framework for analyzing deep mutational scanning data. Genome Biol 2017; 18:150. [PMID: 28784151 PMCID: PMC5547491 DOI: 10.1186/s13059-017-1272-5] [Citation(s) in RCA: 132] [Impact Index Per Article: 16.5] [Received: 04/18/2017] [Accepted: 07/06/2017] [Indexed: 11/10/2022]
Abstract
Deep mutational scanning is a widely used method for multiplex measurement of functional consequences of protein variants. We developed a new deep mutational scanning statistical model that generates error estimates for each measurement, capturing both sampling error and consistency between replicates. We apply our model to one novel and five published datasets comprising 243,732 variants and demonstrate its superiority in removing noisy variants and conducting hypothesis testing. Simulations show our model applies to scans based on cell growth or binding and handles common experimental errors. We implemented our model in Enrich2, software that can empower researchers analyzing deep mutational scanning data.
Affiliation(s)
- Alan F Rubin: Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia; Department of Medical Biology, University of Melbourne, Melbourne, VIC, 3010, Australia; Bioinformatics and Cancer Genomics Laboratory, Peter MacCallum Cancer Centre, Melbourne, VIC, 3000, Australia; Department of Genome Sciences, University of Washington, Seattle, WA, 98195, USA
- Hannah Gelman: Department of Genome Sciences, University of Washington, Seattle, WA, 98195, USA; Institute for Protein Design, University of Washington, Seattle, WA, 98195, USA
- Nathan Lucas: Department of Pathology, University of Washington, Seattle, WA, 98195, USA
- Sandra M Bajjalieh: Department of Pathology, University of Washington, Seattle, WA, 98195, USA
- Anthony T Papenfuss: Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia; Department of Medical Biology, University of Melbourne, Melbourne, VIC, 3010, Australia; Bioinformatics and Cancer Genomics Laboratory, Peter MacCallum Cancer Centre, Melbourne, VIC, 3000, Australia; Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, VIC, 3010, Australia; Department of Mathematics and Statistics, University of Melbourne, Melbourne, VIC, 3010, Australia
- Terence P Speed: Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia; Department of Mathematics and Statistics, University of Melbourne, Melbourne, VIC, 3010, Australia
- Douglas M Fowler: Department of Genome Sciences, University of Washington, Seattle, WA, 98195, USA; Department of Bioengineering, University of Washington, Seattle, WA, 98195, USA
18
Ashenberg O, Padmakumar J, Doud MB, Bloom JD. Deep mutational scanning identifies sites in influenza nucleoprotein that affect viral inhibition by MxA. PLoS Pathog 2017; 13:e1006288. [PMID: 28346537 PMCID: PMC5383324 DOI: 10.1371/journal.ppat.1006288] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.8] [Received: 08/28/2016] [Revised: 04/06/2017] [Accepted: 03/10/2017] [Indexed: 01/24/2023]
Abstract
The innate-immune restriction factor MxA inhibits influenza replication by targeting the viral nucleoprotein (NP). Human influenza virus is more resistant than avian influenza virus to inhibition by human MxA, and prior work has compared human and avian viral strains to identify amino-acid differences in NP that affect sensitivity to MxA. However, this strategy is limited to identifying sites in NP where mutations that affect MxA sensitivity have fixed during the small number of documented zoonotic transmissions of influenza to humans. Here we use an unbiased deep mutational scanning approach to quantify how all single amino-acid mutations to NP affect MxA sensitivity in the context of replication-competent virus. We both identify new sites in NP where mutations affect MxA resistance and re-identify mutations known to have increased MxA resistance during historical adaptations of influenza to humans. Most of the sites where mutations have the greatest effect are almost completely conserved across all influenza A viruses, and the amino acids at these sites confer relatively high resistance to MxA. These sites cluster in regions of NP that appear to be important for its recognition by MxA. Overall, our work systematically identifies the sites in influenza nucleoprotein where mutations affect sensitivity to MxA. We also demonstrate a powerful new strategy for identifying regions of viral proteins that affect inhibition by host factors.
During viral infection, human cells express proteins that can restrict virus replication. However, in many cases it remains unclear what determines the sensitivity of a given viral strain to a particular restriction factor. Here we use a high-throughput approach to measure how all amino-acid mutations to the nucleoprotein of influenza virus affect restriction by the human protein MxA. We find several dozen sites where mutations substantially affect the sensitivity of influenza virus to MxA. While a few of these sites are known to have fixed mutations during past adaptations of influenza virus to humans, most of the sites are broadly conserved across all influenza strains and have never previously been described as affecting MxA resistance. Our results therefore show that the known historical evolution of influenza has only involved substitutions at a small fraction of the sites where mutations can in principle affect MxA resistance. We suggest that this is because many sites are already broadly fixed at amino acids that confer high resistance.
Affiliation(s)
- Orr Ashenberg: Division of Basic Sciences and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Jai Padmakumar: Division of Basic Sciences and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Michael B. Doud: Division of Basic Sciences and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, USA; Department of Genome Sciences, University of Washington, Seattle, WA, USA; Medical Scientist Training Program, University of Washington School of Medicine, Seattle, WA, USA
- Jesse D. Bloom: Division of Basic Sciences and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, USA; Department of Genome Sciences, University of Washington, Seattle, WA, USA
19
Prachanronarong KL, Özen A, Thayer KM, Yilmaz LS, Zeldovich KB, Bolon DN, Kowalik TF, Jensen JD, Finberg RW, Wang JP, Kurt-Yilmaz N, Schiffer CA. Molecular Basis for Differential Patterns of Drug Resistance in Influenza N1 and N2 Neuraminidase. J Chem Theory Comput 2016; 12:6098-6108. [PMID: 27951676 DOI: 10.1021/acs.jctc.6b00703] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Indexed: 12/12/2022]
Abstract
Neuraminidase (NA) inhibitors are used for the prevention and treatment of influenza A virus infections. Two subtypes of NA, N1 and N2, predominate in viruses that infect humans, but differential patterns of drug resistance have emerged in each subtype despite highly homologous active sites. To understand the molecular basis for the selection of these drug resistance mutations, structural and dynamic analyses on complexes of N1 and N2 NA with substrates and inhibitors were performed. Comparison of dynamic substrate and inhibitor envelopes and interactions at the active site revealed how differential patterns of drug resistance have emerged for specific drug resistance mutations, at residues I222, S246, and H274 in N1 and E119 in N2. Our results show that the differences in intermolecular interactions, especially van der Waals contacts, of the inhibitors versus substrates at the NA active site effectively explain the selection of resistance mutations in the two subtypes. Avoiding such contacts that render inhibitors vulnerable to resistance by better mimicking the dynamics and intermolecular interactions of substrates can lead to the development of novel inhibitors that avoid drug resistance in both subtypes.
Affiliation(s)
- Jeffrey D Jensen: School of Life Sciences, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
20
On the importance of skewed offspring distributions and background selection in virus population genetics. Heredity (Edinb) 2016; 117:393-399. [PMID: 27649621 DOI: 10.1038/hdy.2016.58] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 04/12/2016] [Accepted: 06/08/2016] [Indexed: 12/16/2022]
Abstract
Many features of virus populations make them excellent candidates for population genetic study, including a very high rate of mutation, high levels of nucleotide diversity, exceptionally large census population sizes, and frequent positive selection. However, these attributes also mean that special care must be taken in population genetic inference. For example, highly skewed offspring distributions, frequent and severe population bottleneck events associated with infection and compartmentalization, and strong purifying selection all affect the distribution of genetic variation but are often not taken into account. Here, we draw particular attention to multiple-merger coalescent events and background selection, discuss potential misinference associated with these processes, and highlight potential avenues for better incorporating them into future population genetic analyses.
21
A Statistical Guide to the Design of Deep Mutational Scanning Experiments. Genetics 2016; 204:77-87. [PMID: 27412710 DOI: 10.1534/genetics.116.190462] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 04/14/2016] [Accepted: 06/29/2016] [Indexed: 12/21/2022]
Abstract
The characterization of the distribution of mutational effects is a key goal in evolutionary biology. Recently developed deep-sequencing approaches allow for accurate and simultaneous estimation of the fitness effects of hundreds of engineered mutations by monitoring their relative abundance across time points in a single bulk competition. Naturally, the achievable resolution of the estimated fitness effects depends on the specific experimental setup, the organism and type of mutations studied, and the sequencing technology utilized, among other factors. By means of analytical approximations and simulations, we provide guidelines for optimizing time-sampled deep-sequencing bulk competition experiments, focusing on the number of mutants, the sequencing depth, and the number of sampled time points. Our analytical results show that sampling more time points together with extending the duration of the experiment improves the achievable precision disproportionately compared with increasing the sequencing depth or reducing the number of competing mutants. Even if the duration of the experiment is fixed, sampling more time points and clustering these at the beginning and the end of the experiment increase experimental power and allow for efficient and precise assessment of the entire range of selection coefficients. Finally, we provide a formula for calculating the 95%-confidence interval for the measurement error estimate, which we implement as an interactive web tool. This allows for quantification of the maximum expected a priori precision of the experimental setup, as well as for a statistical threshold for determining deviations from neutrality for specific selection coefficient estimates.
22
Accurate Measurement of the Effects of All Amino-Acid Mutations on Influenza Hemagglutinin. Viruses 2016; 8:v8060155. [PMID: 27271655 PMCID: PMC4926175 DOI: 10.3390/v8060155] [Citation(s) in RCA: 141] [Impact Index Per Article: 15.7] [Received: 04/23/2016] [Revised: 05/21/2016] [Accepted: 05/25/2016] [Indexed: 12/17/2022]
Abstract
Influenza genes evolve mostly via point mutations, and so knowing the effect of every amino-acid mutation provides information about evolutionary paths available to the virus. We and others have combined high-throughput mutagenesis with deep sequencing to estimate the effects of large numbers of mutations to influenza genes. However, these measurements have suffered from substantial experimental noise due to a variety of technical problems, the most prominent of which is bottlenecking during the generation of mutant viruses from plasmids. Here we describe advances that ameliorate these problems, enabling us to measure with greatly improved accuracy and reproducibility the effects of all amino-acid mutations to an H1 influenza hemagglutinin on viral replication in cell culture. The largest improvements come from using a helper virus to reduce bottlenecks when generating viruses from plasmids. Our measurements confirm at much higher resolution the results of previous studies suggesting that antigenic sites on the globular head of hemagglutinin are highly tolerant of mutations. We also show that other regions of hemagglutinin—including the stalk epitopes targeted by broadly neutralizing antibodies—have a much lower inherent capacity to tolerate point mutations. The ability to accurately measure the effects of all influenza mutations should enhance efforts to understand and predict viral evolution.
23
Boucher JI, Bolon DNA, Tawfik DS. Quantifying and understanding the fitness effects of protein mutations: Laboratory versus nature. Protein Sci 2016; 25:1219-26. [PMID: 27010590 DOI: 10.1002/pro.2928] [Citation(s) in RCA: 57] [Impact Index Per Article: 6.3] [Received: 01/10/2016] [Revised: 03/21/2016] [Accepted: 03/21/2016] [Indexed: 11/11/2022]
Abstract
The last decade has seen a growing number of experiments aimed at systematically mapping the effects of mutations in different proteins, and of attempting to correlate their biophysical and biochemical effects with organismal fitness. While insightful, systematic laboratory measurements of fitness effects present challenges and difficulties. Here, we discuss the limitations associated with such measurements, and in particular the challenge of correlating the effects of mutations at the single protein level ("protein fitness") with their effects on organismal fitness. A variety of experimental setups are used, with some measuring the direct effects on protein function and others monitoring the growth rate of a model organism carrying the protein mutants. The manners by which fitness effects are calculated and presented also vary, and the conclusions, including the derived distributions of fitness effects of mutations, vary accordingly. The comparison of the effects of mutations in the laboratory to the natural protein diversity, namely to amino acid changes that have fixed in the course of millions of years of evolution, is also debatable. The results of laboratory experiments may, therefore, be less relevant to understanding long-term inter-species variations yet insightful with regard to short-term polymorphism, for example, in the study of the effects of human SNPs.
Affiliation(s)
- Jeffrey I Boucher: Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, Massachusetts, 01605
- Daniel N A Bolon: Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, Massachusetts, 01605
- Dan S Tawfik: Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, 76100, Israel
24
Phillips AM, Shoulders MD. The Path of Least Resistance: Mechanisms to Reduce Influenza's Sensitivity to Oseltamivir. J Mol Biol 2016; 428:533-537. [PMID: 26748011 DOI: 10.1016/j.jmb.2015.12.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
Affiliation(s)
- Angela M Phillips: Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Matthew D Shoulders: Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA