101
Lyons DM, Zou Z, Xu H, Zhang J. Idiosyncratic epistasis creates universals in mutational effects and evolutionary trajectories. Nat Ecol Evol 2020; 4:1685-1693. [PMID: 32895516 PMCID: PMC7710555 DOI: 10.1038/s41559-020-01286-y] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 02/15/2020] [Accepted: 07/23/2020] [Indexed: 01/06/2023]
Abstract
Patterns of epistasis and shapes of fitness landscapes are of wide interest because of their bearings on a number of evolutionary theories. The common phenomena of slowing fitness increases during adaptations and diminishing returns from beneficial mutations are believed to reflect a concave fitness landscape and a preponderance of negative epistasis. Paradoxically, fitness decreases tend to decelerate and harm from deleterious mutations shrinks during the accumulation of random mutations, patterns thought to indicate a convex fitness landscape and a predominance of positive epistasis. Current theories cannot resolve this apparent contradiction. Here, we show that the phenotypic effect of a mutation varies substantially depending on the specific genetic background and that this idiosyncrasy in epistasis creates all of the above trends without requiring a biased distribution of epistasis. The idiosyncratic epistasis theory explains the universalities in mutational effects and evolutionary trajectories as emerging from randomness due to biological complexity.
Affiliation(s)
- Jianzhi Zhang
- Correspondence to Jianzhi Zhang, Department of Ecology and Evolutionary Biology, University of Michigan, 4018 Biological Sciences Building, 1105 North University Avenue, Ann Arbor, MI 48109, USA, Phone: 734-763-0527
102
Zurek PJ, Knyphausen P, Neufeld K, Pushpanath A, Hollfelder F. UMI-linked consensus sequencing enables phylogenetic analysis of directed evolution. Nat Commun 2020; 11:6023. [PMID: 33243970 PMCID: PMC7691348 DOI: 10.1038/s41467-020-19687-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 05/22/2020] [Accepted: 10/12/2020] [Indexed: 11/09/2022] Open
Abstract
The success of protein evolution campaigns is strongly dependent on the sequence context in which mutations are introduced, stemming from pervasive non-additive interactions between a protein's amino acids ('intra-gene epistasis'). Our limited understanding of such epistasis hinders the correct prediction of the functional contributions and adaptive potential of mutations. Here we present a straightforward unique molecular identifier (UMI)-linked consensus sequencing workflow (UMIC-seq) that simplifies mapping of evolutionary trajectories based on full-length sequences. Attaching UMIs to gene variants allows accurate consensus generation for closely related genes with nanopore sequencing. We exemplify the utility of this approach by reconstructing the artificial phylogeny emerging in three rounds of directed evolution of an amine dehydrogenase biocatalyst via ultrahigh throughput droplet screening. Uniquely, we are able to identify lineages and their founding variant, as well as non-additive interactions between mutations within a full gene showing sign epistasis. Access to deep and accurate long reads will facilitate prediction of key beneficial mutations and adaptive potential based on in silico analysis of large sequence datasets.
Affiliation(s)
- Paul Jannis Zurek
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK
- Johnson Matthey Plc, Cambridge, CB4 0WE, UK
- Philipp Knyphausen
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK
- Katharina Neufeld
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK
- Johnson Matthey Plc, Cambridge, CB4 0WE, UK
- Florian Hollfelder
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK.
103
Alejaldre L, Lemay-St-Denis C, Perez Lopez C, Sancho Jodar F, Guallar V, Pelletier JN. Known Evolutionary Paths Are Accessible to Engineered β-Lactamases Having Altered Protein Motions at the Timescale of Catalytic Turnover. Front Mol Biosci 2020; 7:599298. [PMID: 33330628 PMCID: PMC7716773 DOI: 10.3389/fmolb.2020.599298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2020] [Accepted: 10/23/2020] [Indexed: 11/26/2022] Open
Abstract
The evolution of new protein functions is dependent upon inherent biophysical features of proteins. Whereas it has been shown that changes in protein dynamics can occur in the course of directed molecular evolution trajectories and contribute to new function, it is not known whether varying protein dynamics modify the course of evolution. We investigate this question using three related β-lactamases displaying dynamics that differ broadly at the slow timescale that corresponds to catalytic turnover yet have similar fast dynamics, thermal stability, catalytic, and substrate recognition profiles. Introduction of substitutions E104K and G238S, which are known to have a synergistic effect on function in the parent β-lactamase, showed similar increases in catalytic efficiency toward cefotaxime in the related β-lactamases. Molecular simulations using Protein Energy Landscape Exploration reveal that this results from stabilizing the catalytically-productive conformations, demonstrating the dominance of the synergistic effect of the E104K and G238S substitutions in vitro in contexts that vary in terms of sequence and dynamics. Furthermore, three rounds of directed molecular evolution demonstrated that known cefotaximase-enhancing mutations were accessible regardless of the differences in dynamics. Interestingly, specific sequence differences between the related β-lactamases were shown to have a higher effect in evolutionary outcomes than did differences in dynamics. Overall, these β-lactamase models show tolerance to protein dynamics at the timescale of catalytic turnover in the evolution of a new function.
Affiliation(s)
- Lorea Alejaldre
- Biochemistry Department, Université de Montréal, Montréal, QC, Canada
- PROTEO, The Québec Network for Research on Protein, Function, Engineering and Applications, Quebec City, QC, Canada
- CGCC, Center in Green Chemistry and Catalysis, Montréal, QC, Canada
- Claudèle Lemay-St-Denis
- Biochemistry Department, Université de Montréal, Montréal, QC, Canada
- PROTEO, The Québec Network for Research on Protein, Function, Engineering and Applications, Quebec City, QC, Canada
- CGCC, Center in Green Chemistry and Catalysis, Montréal, QC, Canada
- Victor Guallar
- Barcelona Supercomputing Center, Barcelona, Spain
- ICREA: Institució Catalana de Recerca i Estudis Avancats, Barcelona, Spain
- Joelle N. Pelletier
- Biochemistry Department, Université de Montréal, Montréal, QC, Canada
- PROTEO, The Québec Network for Research on Protein, Function, Engineering and Applications, Quebec City, QC, Canada
- CGCC, Center in Green Chemistry and Catalysis, Montréal, QC, Canada
- Chemistry Department, Université de Montréal, Montréal, QC, Canada
104
Song H, Bremer BJ, Hinds EC, Raskutti G, Romero PA. Inferring Protein Sequence-Function Relationships with Large-Scale Positive-Unlabeled Learning. Cell Syst 2020; 12:92-101.e8. [PMID: 33212013 DOI: 10.1016/j.cels.2020.10.007] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 11/02/2019] [Revised: 08/13/2020] [Accepted: 10/22/2020] [Indexed: 10/22/2022]
Abstract
Machine learning can infer how protein sequence maps to function without requiring a detailed understanding of the underlying physical or biological mechanisms. It is challenging to apply existing supervised learning frameworks to large-scale experimental data generated by deep mutational scanning (DMS) and related methods. DMS data often contain high-dimensional and correlated sequence variables, experimental sampling error and bias, and the presence of missing data. Notably, most DMS data do not contain examples of negative sequences, making it challenging to directly estimate how sequence affects function. Here, we develop a positive-unlabeled (PU) learning framework to infer sequence-function relationships from large-scale DMS data. Our PU learning method displays excellent predictive performance across ten large-scale sequence-function datasets, representing proteins of different folds, functions, and library types. The estimated parameters pinpoint key residues that dictate protein structure and function. Finally, we apply our statistical sequence-function model to design highly stabilized enzymes.
Affiliation(s)
- Hyebin Song
- Department of Statistics, The Pennsylvania State University, State College, PA 16802, USA; Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, USA
- Bennett J Bremer
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Emily C Hinds
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Garvesh Raskutti
- Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, USA
- Philip A Romero
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA; Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, WI 53706, USA.
105
High-Throughput Protein Engineering by Massively Parallel Combinatorial Mutagenesis. Methods Mol Biol 2020. [PMID: 33125641 DOI: 10.1007/978-1-0716-0892-0_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
Exploring how combinatorial mutations can be combined to optimize protein functions is important to guide protein engineering. Given the vast combinatorial space of changing multiple amino acids, identifying the top-performing variants from a large number of mutants might not be possible without a high-throughput gene assembly and screening strategy. Here we describe the CombiSEAL platform, a strategy that allows for modularization of any protein sequence into multiple segments for mutagenesis and barcoding, and seamless single-pot ligations of different segments to generate a library of combination mutants linked with concatenated barcodes at one end. By reading the barcodes using next-generation sequencing, activities of each protein variant during the protein selection process can be easily tracked in a high-throughput manner. CombiSEAL not only allows the identification of better protein variants but also enables the systematic analyses to distinguish the beneficial, deleterious, and neutral effects of combining different mutations on protein functions.
106
Lite TLV, Grant RA, Nocedal I, Littlehale ML, Guo MS, Laub MT. Uncovering the basis of protein-protein interaction specificity with a combinatorially complete library. eLife 2020; 9:e60924. [PMID: 33107822 PMCID: PMC7669267 DOI: 10.7554/elife.60924] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 07/10/2020] [Accepted: 10/26/2020] [Indexed: 12/27/2022] Open
Abstract
Protein-protein interaction specificity is often encoded at the primary sequence level. However, the contributions of individual residues to specificity are usually poorly understood and often obscured by mutational robustness, sequence degeneracy, and epistasis. Using bacterial toxin-antitoxin systems as a model, we screened a combinatorially complete library of antitoxin variants at three key positions against two toxins. This library enabled us to measure the effect of individual substitutions on specificity in hundreds of genetic backgrounds. These distributions allow inferences about the general nature of interface residues in promoting specificity. We find that positive and negative contributions to specificity are neither inherently coupled nor mutually exclusive. Further, a wild-type antitoxin appears optimized for specificity as no substitutions improve discrimination between cognate and non-cognate partners. By comparing crystal structures of paralogous complexes, we provide a rationale for our observations. Collectively, this work provides a generalizable approach to understanding the logic of molecular recognition.
Affiliation(s)
- Thuy-Lan V Lite
- Department of Biology, Massachusetts Institute of Technology, Cambridge, United States
- Robert A Grant
- Department of Biology, Massachusetts Institute of Technology, Cambridge, United States
- Isabel Nocedal
- Department of Biology, Massachusetts Institute of Technology, Cambridge, United States
- Megan L Littlehale
- Department of Biology, Massachusetts Institute of Technology, Cambridge, United States
- Monica S Guo
- Department of Biology, Massachusetts Institute of Technology, Cambridge, United States
- Michael T Laub
- Department of Biology, Massachusetts Institute of Technology, Cambridge, United States
- Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, United States
107
Vihinen M. Functional effects of protein variants. Biochimie 2020; 180:104-120. [PMID: 33164889 DOI: 10.1016/j.biochi.2020.10.009] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 01/30/2020] [Revised: 10/15/2020] [Accepted: 10/19/2020] [Indexed: 12/11/2022]
Abstract
Genetic and other variations frequently affect protein functions. Scientific articles can contain confusing descriptions about which function or property is affected, and in many cases the statements are pure speculation without any experimental evidence. To clarify functional effects of protein variations of genetic or non-genetic origin, a systematic conceptualisation and framework are introduced. This framework describes protein functional effects on abundance, activity, specificity and affinity, along with countermeasures, which allow cells, tissues and organisms to tolerate, avoid, repair, attenuate or resist (TARAR) the effects. Effects on abundance discussed include gene dosage, restricted expression, mis-localisation and degradation. Enzymopathies, effects on kinetics, allostery and regulation of protein activity are subtopics for the effects of variants on activity. Variation outcomes on specificity and affinity comprise promiscuity, specificity, affinity and moonlighting. TARAR mechanisms redress variations with active and passive processes including chaperones, redundancy, robustness, canalisation and metabolic and signalling rewiring. A framework for pragmatic protein function analysis and presentation is introduced. All of the mechanisms and effects are described along with representative examples, most often in relation to diseases. In addition, protein function is discussed from evolutionary point of view. Application of the presented framework facilitates unambiguous, detailed and specific description of functional effects and their systematic study.
Affiliation(s)
- Mauno Vihinen
- Department of Experimental Medical Science, BMC B13, Lund University, SE-22 184, Lund, Sweden.
108
Li X, Lehner B. Biophysical ambiguities prevent accurate genetic prediction. Nat Commun 2020; 11:4923. [PMID: 33004824 PMCID: PMC7529754 DOI: 10.1038/s41467-020-18694-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 04/22/2020] [Accepted: 09/04/2020] [Indexed: 12/27/2022] Open
Abstract
A goal of biology is to predict how mutations combine to alter phenotypes, fitness and disease. It is often assumed that mutations combine additively or with interactions that can be predicted. Here, we show using simulations that, even for the simple example of the lambda phage transcription factor CI repressing a gene, this assumption is incorrect and that perfect measurements of the effects of mutations on a trait and mechanistic understanding can be insufficient to predict what happens when two mutations are combined. This apparent paradox arises because mutations can have different biophysical effects to cause the same change in a phenotype and the outcome in a double mutant depends upon what these hidden biophysical changes actually are. Pleiotropy and non-monotonic functions further confound prediction of how mutations interact. Accurate prediction of phenotypes and disease will sometimes not be possible unless these biophysical ambiguities can be resolved using additional measurements.
Affiliation(s)
- Xianghua Li
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain
- Ben Lehner
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; ICREA, Pg. Luis Companys 23, Barcelona, 08010, Spain.
109
Zhang TH, Dai L, Barton JP, Du Y, Tan Y, Pang W, Chakraborty AK, Lloyd-Smith JO, Sun R. Predominance of positive epistasis among drug resistance-associated mutations in HIV-1 protease. PLoS Genet 2020; 16:e1009009. [PMID: 33085662 PMCID: PMC7605711 DOI: 10.1371/journal.pgen.1009009] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 11/13/2019] [Revised: 11/02/2020] [Accepted: 07/24/2020] [Indexed: 12/12/2022] Open
Abstract
Drug-resistant mutations often have deleterious impacts on replication fitness, posing a fitness cost that can only be overcome by compensatory mutations. However, the role of fitness cost in the evolution of drug resistance has often been overlooked in clinical studies or in vitro selection experiments, as these observations only capture the outcome of drug selection. In this study, we systematically profile the fitness landscape of resistance-associated sites in HIV-1 protease using deep mutational scanning. We construct a mutant library covering combinations of mutations at 11 sites in HIV-1 protease, all of which are associated with resistance to protease inhibitors in clinic. Using deep sequencing, we quantify the fitness of thousands of HIV-1 protease mutants after multiple cycles of replication in human T cells. Although the majority of resistance-associated mutations have deleterious effects on viral replication, we find that epistasis among resistance-associated mutations is predominantly positive. Furthermore, our fitness data are consistent with genetic interactions inferred directly from HIV sequence data of patients. Fitness valleys formed by strong positive epistasis reduce the likelihood of reversal of drug resistance mutations. Overall, our results support the view that strong compensatory effects are involved in the emergence of clinically observed resistance mutations and provide insights to understanding fitness barriers in the evolution and reversion of drug resistance.
Affiliation(s)
- Tian-hao Zhang
- Molecular Biology Institute, University of California, Los Angeles, CA 90095, USA
- Lei Dai
- CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- John P. Barton
- Department of Physics and Astronomy, University of California, Riverside, CA 92521, USA
- Yushen Du
- School of Medicine, ZheJiang University, Hangzhou, 210000, China
- Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
- Yuxiang Tan
- CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Wenwen Pang
- Department of Public Health Laboratory Science, West China School of Public Health, Sichuan University, Chengdu 610041, China
- Arup K. Chakraborty
- Institute for Medical Engineering and Science, Departments of Chemical Engineering, Physics, & Chemistry, Massachusetts Institute of Technology, MA 21309, USA
- Ragon Institute of MGH, MIT, & Harvard, Cambridge, MA 21309, USA
- James O. Lloyd-Smith
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095, USA
- Ren Sun
- Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
110
Abstract
Living systems evolve one mutation at a time, but a single mutation can alter the effect of subsequent mutations. The underlying mechanistic determinants of such epistasis are unclear. Here, we demonstrate that the physical dynamics of a biological system can generically constrain epistasis. We analyze models and experimental data on proteins and regulatory networks. In each, we find that if the long-time physical dynamics is dominated by a slow, collective mode, then the dimensionality of mutational effects is reduced. Consequently, epistatic coefficients for different combinations of mutations are no longer independent, even if individually strong. Such epistasis can be summarized as resulting from a global nonlinearity applied to an underlying linear trait, that is, as global epistasis. This constraint, in turn, reduces the ruggedness of the sequence-to-function map. By providing a generic mechanistic origin for experimentally observed global epistasis, our work suggests that slow collective physical modes can make biological systems evolvable.
Affiliation(s)
- Kabir Husain
- Department of Physics, University of Chicago, Chicago, IL
- Arvind Murugan
- Department of Physics, University of Chicago, Chicago, IL
111
Jakobson CM, Jarosz DF. What Has a Century of Quantitative Genetics Taught Us About Nature's Genetic Tool Kit? Annu Rev Genet 2020; 54:439-464. [PMID: 32897739 DOI: 10.1146/annurev-genet-021920-102037] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 11/09/2022]
Abstract
The complexity of heredity has been appreciated for decades: Many traits are controlled not by a single genetic locus but instead by polymorphisms throughout the genome. The importance of complex traits in biology and medicine has motivated diverse approaches to understanding their detailed genetic bases. Here, we focus on recent systematic studies, many in budding yeast, which have revealed that large numbers of all kinds of molecular variation, from noncoding to synonymous variants, can make significant contributions to phenotype. Variants can affect different traits in opposing directions, and their contributions can be modified by both the environment and the epigenetic state of the cell. The integration of prospective (synthesizing and analyzing variants) and retrospective (examining standing variation) approaches promises to reveal how natural selection shapes quantitative traits. Only by comprehensively understanding nature's genetic tool kit can we predict how phenotypes arise from the complex ensembles of genetic variants in living organisms.
Affiliation(s)
- Christopher M Jakobson
- Department of Chemical and Systems Biology, Stanford University School of Medicine, Stanford, California 94305, USA
- Daniel F Jarosz
- Department of Chemical and Systems Biology, Stanford University School of Medicine, Stanford, California 94305, USA; Department of Developmental Biology, Stanford University School of Medicine, Stanford, California 94305, USA
112
Procko E. Deep mutagenesis in the study of COVID-19: a technical overview for the proteomics community. Expert Rev Proteomics 2020; 17:633-638. [PMID: 33084449 PMCID: PMC7594187 DOI: 10.1080/14789450.2020.1833721] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/26/2020] [Accepted: 10/05/2020] [Indexed: 12/14/2022]
Abstract
INTRODUCTION: The spike (S) of SARS coronavirus 2 (SARS-CoV-2) engages angiotensin-converting enzyme 2 (ACE2) on a host cell to trigger viral-cell membrane fusion and infection. The extracellular region of ACE2 can be administered as a soluble decoy to compete for binding sites on the receptor-binding domain (RBD) of S, but it has only moderate affinity and efficacy. The RBD, which is targeted by neutralizing antibodies, may also change and adapt through mutation as SARS-CoV-2 becomes endemic, posing challenges for therapeutic and vaccine development. AREAS COVERED: Deep mutagenesis is a Big Data approach to characterizing sequence variants. A deep mutational scan of ACE2 expressed on human cells identified mutations that increase S affinity and guided the engineering of a potent and broad soluble receptor decoy. A deep mutational scan of the RBD displayed on the surface of yeast has revealed residues tolerant of mutational changes that may act as a source for drug resistance and antigenic drift. EXPERT OPINION: Deep mutagenesis requires a selection of diverse sequence variants; an in vitro evolution experiment that is tracked with next-generation sequencing. The choice of expression system, diversity of the variant library and selection strategy have important consequences for data quality and interpretation.
Affiliation(s)
- Erik Procko
- Department of Biochemistry and Cancer Center at Illinois, University of Illinois, Urbana, IL, USA
113
DiMSum: an error model and pipeline for analyzing deep mutational scanning data and diagnosing common experimental pathologies. Genome Biol 2020; 21:207. [PMID: 32799905 PMCID: PMC7429474 DOI: 10.1186/s13059-020-02091-3] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 12/04/2019] [Accepted: 07/05/2020] [Indexed: 12/30/2022] Open
Abstract
Deep mutational scanning (DMS) enables multiplexed measurement of the effects of thousands of variants of proteins, RNAs, and regulatory elements. Here, we present a customizable pipeline, DiMSum, that represents an end-to-end solution for obtaining variant fitness and error estimates from raw sequencing data. A key innovation of DiMSum is the use of an interpretable error model that captures the main sources of variability arising in DMS workflows, outperforming previous methods. DiMSum is available as an R/Bioconda package and provides summary reports to help researchers diagnose common DMS pathologies and take remedial steps in their analyses.
114
Yang G, Miton CM, Tokuriki N. A mechanistic view of enzyme evolution. Protein Sci 2020; 29:1724-1747. [PMID: 32557882 PMCID: PMC7380680 DOI: 10.1002/pro.3901] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 05/05/2020] [Revised: 06/14/2020] [Accepted: 06/16/2020] [Indexed: 12/15/2022]
Abstract
New enzyme functions often evolve through the recruitment and optimization of latent promiscuous activities. How do mutations alter the molecular architecture of enzymes to enhance their activities? Can we infer general mechanisms that are common to most enzymes, or does each enzyme require a unique optimization process? The ability to predict the location and type of mutations necessary to enhance an enzyme's activity is critical to protein engineering and rational design. In this review, via the detailed examination of recent studies that have shed new light on the molecular changes underlying the optimization of enzyme function, we provide a mechanistic perspective of enzyme evolution. We first present a global survey of the prevalence of activity-enhancing mutations and their distribution within protein structures. We then delve into the molecular solutions that mediate functional optimization, specifically highlighting several common mechanisms that have been observed across multiple examples. As distinct protein sequences encounter different evolutionary bottlenecks, different mechanisms are likely to emerge along evolutionary trajectories toward improved function. Identifying the specific mechanism(s) that need to be improved upon, and tailoring our engineering efforts to each sequence, may considerably improve our chances to succeed in generating highly efficient catalysts in the future.
Affiliation(s)
- Gloria Yang
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Charlotte M. Miton
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Nobuhiko Tokuriki
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
115
Rogers JM. Peptide Folding and Binding Probed by Systematic Non-canonical Mutagenesis. Front Mol Biosci 2020; 7:100. [PMID: 32671094 PMCID: PMC7326784 DOI: 10.3389/fmolb.2020.00100] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/25/2020] [Accepted: 05/04/2020] [Indexed: 12/20/2022] Open
Abstract
Many proteins and peptides fold upon binding another protein. Mutagenesis has proved an essential tool in the study of these multi-step molecular recognition processes. By comparing the biophysical behavior of carefully selected mutants, the concert of interactions and conformational changes that occur during folding and binding can be separated and assessed. Recently, this mutagenesis approach has been radically expanded by deep mutational scanning methods, which allow for many thousands of mutations to be examined in parallel. Furthermore, these high-throughput mutagenesis methods have been expanded to include mutations to non-canonical amino acids, returning peptide structure-activity relationships with unprecedented depth and detail. These developments are timely, as the insights they provide can guide the optimization of de novo cyclic peptides, a promising new modality for chemical probes and therapeutic agents.
Affiliation(s)
- Joseph M Rogers
- Department of Drug Design and Pharmacology, University of Copenhagen, Copenhagen, Denmark
116
Wu NC, Thompson AJ, Lee JM, Su W, Arlian BM, Xie J, Lerner RA, Yen HL, Bloom JD, Wilson IA. Different genetic barriers for resistance to HA stem antibodies in influenza H3 and H1 viruses. Science 2020; 368:1335-1340. [PMID: 32554590 PMCID: PMC7412937 DOI: 10.1126/science.aaz5143] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Received: 09/16/2019] [Accepted: 04/14/2020] [Indexed: 12/19/2022]
Abstract
The discovery and characterization of broadly neutralizing human antibodies (bnAbs) to the highly conserved stem region of influenza hemagglutinin (HA) have contributed to considerations of a universal influenza vaccine. However, the potential for resistance to stem bnAbs also needs to be more thoroughly evaluated. Using deep mutational scanning, with a focus on epitope residues, we found that the genetic barrier to resistance to stem bnAbs is low for the H3 subtype but substantially higher for the H1 subtype owing to structural differences in the HA stem. Several strong resistance mutations in H3 can be observed in naturally circulating strains and do not reduce in vitro viral fitness and in vivo pathogenicity. This study highlights a potential challenge for development of a truly universal influenza vaccine.
Affiliation(s)
- Nicholas C Wu
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Andrew J Thompson
- Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
- Juhye M Lee
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Medical Scientist Training Program, University of Washington, Seattle, WA 98195, USA
- Wen Su
- School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Britni M Arlian
- Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
- Jia Xie
- Department of Chemistry, The Scripps Research Institute, La Jolla, CA 92037, USA
- Richard A Lerner
- Department of Chemistry, The Scripps Research Institute, La Jolla, CA 92037, USA
- The Skaggs Institute for Chemical Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Hui-Ling Yen
- School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Jesse D Bloom
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Ian A Wilson
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- The Skaggs Institute for Chemical Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
|
117
|
Allostery and Epistasis: Emergent Properties of Anisotropic Networks. Entropy 2020; 22:e22060667. [PMID: 33286439 PMCID: PMC7517209 DOI: 10.3390/e22060667] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 04/24/2020] [Revised: 06/02/2020] [Accepted: 06/08/2020] [Indexed: 11/17/2022]
Abstract
Understanding the underlying mechanisms behind protein allostery and non-additivity of substitution outcomes (i.e., epistasis) is critical when attempting to predict the functional impact of mutations, particularly at non-conserved sites. In an effort to model these two biological properties, we extend the framework of our metric for calculating dynamic coupling between residues, the Dynamic Coupling Index (DCI), to two new metrics: (i) EpiScore, which quantifies the difference between the residue fluctuation response of a functional site when two other positions are perturbed with random Brownian kicks simultaneously versus individually, capturing the degree of cooperativity of these two positions in modulating the dynamics of the functional site, and (ii) DCIasym, which measures the degree of asymmetry between the residue fluctuation response of two sites when one or the other is perturbed with a random force. Applied to four independent systems, we successfully show that EpiScore and DCIasym can capture important biophysical properties in dual mutant substitution outcomes. We propose that allosteric regulation and the mechanisms underlying non-additive amino acid substitution outcomes (i.e., epistasis) can be understood as emergent properties of an anisotropic network of interactions where the inclusion of the full network of interactions is critical for accurate modeling. Consequently, mutations which drive towards a new function may require a fine balance between functional site asymmetry and strength of dynamic coupling with the functional sites. These two tools will provide mechanistic insight into both understanding and predicting the outcome of dual mutations.
|
118
|
Du Y, Hultquist JF, Zhou Q, Olson A, Tseng Y, Zhang TH, Hong M, Tang K, Chen L, Meng X, McGregor MJ, Dai L, Gong D, Martin-Sancho L, Chanda S, Li X, Bensenger S, Krogan NJ, Sun R. mRNA display with library of even-distribution reveals cellular interactors of influenza virus NS1. Nat Commun 2020; 11:2449. [PMID: 32415096 PMCID: PMC7229031 DOI: 10.1038/s41467-020-16140-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/19/2019] [Accepted: 04/13/2020] [Indexed: 12/19/2022]
Abstract
A comprehensive examination of protein-protein interactions (PPIs) is fundamental for the understanding of cellular machineries. However, limitations in current methodologies often prevent the detection of PPIs with low abundance proteins. To overcome this challenge, we develop an mRNA display with library of even-distribution (md-LED) method that facilitates the detection of low abundance binders with high specificity and sensitivity. As a proof-of-principle, we apply md-LED to IAV NS1 protein. Complementary to AP-MS, md-LED enables us to validate previously described PPIs as well as to identify novel NS1 interactors. We show that interacting with FASN allows NS1 to directly regulate the synthesis of cellular fatty acids. We also use md-LED to identify a mutant of NS1, D92Y, that results in a loss of interaction with CPSF1. The use of high-throughput sequencing as the readout for md-LED enables sensitive quantification of interactions, ultimately enabling massively parallel experimentation for the investigation of PPIs.
Affiliation(s)
- Yushen Du
- Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA, 90095, USA
- Cancer Institute, ZJU-UCLA Joint Center for Medical Education and Research, School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Judd F Hultquist
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA, 94158, USA
- California Institute for Quantitative Biosciences, QB3, University of California, San Francisco, San Francisco, CA, 94158, USA
- J. David Gladstone Institutes, San Francisco, CA, 94158, USA
- Division of Infectious Diseases, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
- Quan Zhou
- Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA, 90095, USA
- Anders Olson
- Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA, 90095, USA
- Yenwen Tseng
- Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA, 90095, USA
- Tian-Hao Zhang
- Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, University of California, Los Angeles, CA, 90095, USA
- Mengying Hong
- Cancer Institute, ZJU-UCLA Joint Center for Medical Education and Research, School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Kejun Tang
- Cancer Institute, ZJU-UCLA Joint Center for Medical Education and Research, School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Liubo Chen
- Cancer Institute, ZJU-UCLA Joint Center for Medical Education and Research, School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Xiangzhi Meng
- Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA, 90095, USA
- Michael J McGregor
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA, 94158, USA
- California Institute for Quantitative Biosciences, QB3, University of California, San Francisco, San Francisco, CA, 94158, USA
- J. David Gladstone Institutes, San Francisco, CA, 94158, USA
- Lei Dai
- Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA, 90095, USA
- Danyang Gong
- Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA, 90095, USA
- Laura Martin-Sancho
- Sanford Burnham Prebys Medical Discovery Institute, 10901 North Torrey Pines Road, La Jolla, CA, 92037, USA
- Sumit Chanda
- Sanford Burnham Prebys Medical Discovery Institute, 10901 North Torrey Pines Road, La Jolla, CA, 92037, USA
- Xinming Li
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA, 90095, USA
- Steve Bensenger
- Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, University of California, Los Angeles, CA, 90095, USA
- Nevan J Krogan
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA, 94158, USA
- California Institute for Quantitative Biosciences, QB3, University of California, San Francisco, San Francisco, CA, 94158, USA
- J. David Gladstone Institutes, San Francisco, CA, 94158, USA
- Ren Sun
- Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, University of California, Los Angeles, CA, 90095, USA
|
119
|
Nagano M, Suga H. Expansion of Modality: Peptides to Pseudo-Natural Macrocyclic Peptides. J Syn Org Chem Jpn 2020. [DOI: 10.5059/yukigoseikyokaishi.78.516] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Hiroaki Suga
- Department of Chemistry, Graduate School of Science, The University of Tokyo
|
120
|
Lesk AM. Not Enough Natural Data? Sequence and Ye Shall Find. Front Mol Biosci 2020; 7:65. [PMID: 32373628 PMCID: PMC7186298 DOI: 10.3389/fmolb.2020.00065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2020] [Accepted: 03/25/2020] [Indexed: 11/28/2022]
Affiliation(s)
- Arthur M Lesk
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, United States
|
121
|
Zhou J, McCandlish DM. Minimum epistasis interpolation for sequence-function relationships. Nat Commun 2020; 11:1782. [PMID: 32286265 PMCID: PMC7156698 DOI: 10.1038/s41467-020-15512-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 08/02/2019] [Accepted: 03/12/2020] [Indexed: 12/17/2022]
Abstract
Massively parallel phenotyping assays have provided unprecedented insight into how multiple mutations combine to determine biological function. While such assays can measure phenotypes for thousands to millions of genotypes in a single experiment, in practice these measurements are not exhaustive, so that there is a need for techniques to impute values for genotypes whose phenotypes have not been directly assayed. Here, we present an imputation method based on inferring the least epistatic possible sequence-function relationship compatible with the data. In particular, we infer the reconstruction where mutational effects change as little as possible across adjacent genetic backgrounds. The resulting models can capture complex higher-order genetic interactions near the data, but approach additivity where data is sparse or absent. We apply the method to high-throughput transcription factor binding assays and use it to explore a fitness landscape for protein G.
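As a rough illustration of the idea in this abstract (not the authors' implementation), the sketch below imputes a single unmeasured genotype on a small biallelic landscape by choosing the value that makes the squared local epistatic coefficients as small as possible. The function names, the 0/1 tuple encoding of genotypes, and the example effect sizes are all assumptions made for this example. For one missing genotype, the least-squares answer reduces to averaging the additive prediction from every pair of loci.

```python
from itertools import combinations


def flip(genotype, locus):
    """Return a copy of a 0/1 genotype tuple with one locus toggled."""
    return genotype[:locus] + (1 - genotype[locus],) + genotype[locus + 1:]


def impute_one_missing(phenotypes, missing):
    """Least-squares value for a single unmeasured genotype (toy version).

    Each pair of loci (i, j) defines a face of the hypercube that contains
    `missing`. Up to sign, the epistatic coefficient on that face is
        x - f(a) - f(b) + f(c),
    where a and b are the single-flip neighbours and c is the double flip.
    Minimising the sum of squares over these faces gives the mean of
    f(a) + f(b) - f(c).
    """
    estimates = []
    for i, j in combinations(range(len(missing)), 2):
        a = flip(missing, i)
        b = flip(missing, j)
        c = flip(a, j)
        estimates.append(phenotypes[a] + phenotypes[b] - phenotypes[c])
    return sum(estimates) / len(estimates)


# Hypothetical, purely additive landscape over three biallelic loci.
effects = (1.0, 2.0, 0.5)
landscape = {}
for g0 in (0, 1):
    for g1 in (0, 1):
        for g2 in (0, 1):
            g = (g0, g1, g2)
            landscape[g] = sum(e * s for e, s in zip(effects, g))

held_out = (1, 1, 1)
true_value = landscape.pop(held_out)  # pretend this genotype was not assayed
imputed = impute_one_missing(landscape, held_out)
```

On an additive landscape like this one, the imputed value coincides with the held-out value, which matches the abstract's statement that the reconstruction approaches additivity where data are absent. The sketch needs at least two loci and handles only one missing genotype; the general problem with many missing genotypes requires solving a joint least-squares system.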
Affiliation(s)
- Juannan Zhou
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
- David M McCandlish
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
|
122
|
Fantini M, Lisi S, De Los Rios P, Cattaneo A, Pastore A. Protein Structural Information and Evolutionary Landscape by In Vitro Evolution. Mol Biol Evol 2020; 37:1179-1192. [PMID: 31670785 PMCID: PMC7086169 DOI: 10.1093/molbev/msz256] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 11/13/2022]
Abstract
Protein structure is tightly intertwined with function according to the laws of evolution. Understanding how structure determines function has been the aim of structural biology for decades. Here, we have wondered instead whether it is possible to exploit the function for which a protein was evolutionarily selected to gain information on protein structure and on the landscape explored during the early stages of molecular and natural evolution. To answer this question, we developed a new methodology, which we named CAMELS (Coupling Analysis by Molecular Evolution Library Sequencing), that is able to obtain the in vitro evolution of a protein from an artificial selection based on function. We were able to observe with CAMELS many features of the TEM-1 beta-lactamase local fold exclusively by generating and sequencing large libraries of mutational variants. We demonstrated that we can, whenever a functional phenotypic selection of a protein is available, sketch the structural and evolutionary landscape of a protein without utilizing purified proteins, collecting physical measurements, or relying on the pool of natural protein variants.
Affiliation(s)
- Marco Fantini
- BioSNS Laboratory of Biology, Scuola Normale Superiore (SNS), Pisa, Italy
- Simonetta Lisi
- BioSNS Laboratory of Biology, Scuola Normale Superiore (SNS), Pisa, Italy
- Paolo De Los Rios
- Institute of Physics, School of Basic Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Antonino Cattaneo
- BioSNS Laboratory of Biology, Scuola Normale Superiore (SNS), Pisa, Italy
- European Brain Research Institute, Rome, Italy
- Annalisa Pastore
- Department of Clinical and Basic Neuroscience, Maurice Wohl Institute, King's College London, London, United Kingdom
- Dementia Research Institute, King's College London, London, United Kingdom
|
123
|
Blanco C, Verbanic S, Seelig B, Chen IA. High throughput sequencing of in vitro selections of mRNA-displayed peptides: data analysis and applications. Phys Chem Chem Phys 2020; 22:6492-6506. [PMID: 31967131 PMCID: PMC8219182 DOI: 10.1039/c9cp05912a] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/14/2022]
Abstract
In vitro selection using mRNA display is currently a widely used method to isolate functional peptides with desired properties. The analysis of high throughput sequencing (HTS) data from in vitro evolution experiments has proven to be a powerful technique but only recently has it been applied to mRNA display selections. In this Perspective, we introduce aspects of mRNA display and HTS that may be of interest to physical chemists. We highlight the potential of HTS to analyze in vitro selections of peptides and review recent advances in the application of HTS analysis to mRNA display experiments. We discuss some possible issues involved with HTS analysis and summarize some strategies to alleviate them. Finally, the potential for future impact of advancing HTS analysis on mRNA display experiments is discussed.
Affiliation(s)
- Celia Blanco
- Department of Chemistry and Biochemistry, University of California, Santa Barbara, CA 93106, USA
|
124
|
Bravi B, Ravasio R, Brito C, Wyart M. Direct coupling analysis of epistasis in allosteric materials. PLoS Comput Biol 2020; 16:e1007630. [PMID: 32119660 PMCID: PMC7067494 DOI: 10.1371/journal.pcbi.1007630] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 01/15/2019] [Revised: 03/12/2020] [Accepted: 01/03/2020] [Indexed: 11/22/2022]
Abstract
In allosteric proteins, the binding of a ligand modifies function at a distant active site. Such allosteric pathways can be used as targets for drug design, generating considerable interest in inferring them from sequence alignment data. Currently, different methods lead to conflicting results, in particular on the existence of long-range evolutionary couplings between distant amino acids mediating allostery. Here we propose a resolution of this conundrum, by studying epistasis and its inference in models where an allosteric material is evolved in silico to perform a mechanical task. We find in our model the four types of epistasis (Synergistic, Sign, Antagonistic, Saturation), which can be either short- or long-range and have a simple mechanical interpretation. We perform a Direct Coupling Analysis (DCA) and find that DCA predicts well the cost of point mutations but is a rather poor generative model. Strikingly, it can predict short-range epistasis but fails to capture long-range epistasis, consistent with empirical findings. We propose that such failure is generic when function requires subparts to work in concert. We illustrate this idea with a simple model, which suggests that other methods may be better suited to capture long-range effects.

Allostery in proteins is the property of highly specific responses to ligand binding at a distant site. To inform protocols of de novo drug design, it is fundamental to understand the impact of mutations on allosteric regulation and whether it can be predicted from evolutionary correlations. In this work we consider allosteric architectures artificially evolved to optimize the cooperativity of binding at the allosteric and active sites. We first characterize the emergent pattern of epistasis as well as the underlying mechanical phenomena, finding the four types of epistasis (Synergistic, Sign, Antagonistic, Saturation), which can be either short- or long-range. The numerical evolution of these allosteric architectures allows us to benchmark Direct Coupling Analysis, a method which relies on co-evolution in sequence data to infer direct evolutionary couplings, in connection to allostery. We show that Direct Coupling Analysis quantitatively predicts point mutation costs but underestimates strong long-range epistasis. We provide an argument, based on a simplified model, illustrating the reasons for this discrepancy. Our analysis suggests neural networks as a more promising tool to measure epistasis.
Affiliation(s)
- Barbara Bravi
- Institute of Physics, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Riccardo Ravasio
- Institute of Physics, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Carolina Brito
- Instituto de Física, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Matthieu Wyart
- Institute of Physics, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
|
125
|
Penn WD, McKee AG, Kuntz CP, Woods H, Nash V, Gruenhagen TC, Roushar FJ, Chandak M, Hemmerich C, Rusch DB, Meiler J, Schlebach JP. Probing biophysical sequence constraints within the transmembrane domains of rhodopsin by deep mutational scanning. Sci Adv 2020; 6:eaay7505. [PMID: 32181350 PMCID: PMC7056298 DOI: 10.1126/sciadv.aay7505] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 07/15/2019] [Accepted: 12/11/2019] [Indexed: 05/15/2023]
Abstract
Membrane proteins must balance the sequence constraints associated with folding and function against the hydrophobicity required for solvation within the bilayer. We recently found the expression and maturation of rhodopsin are limited by the hydrophobicity of its seventh transmembrane domain (TM7), which contains polar residues that are essential for function. On the basis of these observations, we hypothesized that rhodopsin's expression should be less tolerant of mutations in TM7 relative to those within hydrophobic TM domains. To test this hypothesis, we used deep mutational scanning to compare the effects of 808 missense mutations on the plasma membrane expression of rhodopsin in HEK293T cells. Our results confirm that a higher proportion of mutations within TM7 (37%) decrease rhodopsin's plasma membrane expression relative to those within a hydrophobic TM domain (TM2, 25%). These results in conjunction with an evolutionary analysis suggest solvation energetics likely restricts the evolutionary sequence space of polar TM domains.
Affiliation(s)
- Wesley D. Penn
- Department of Chemistry, Indiana University, Bloomington, IN 47405, USA
- Andrew G. McKee
- Department of Chemistry, Indiana University, Bloomington, IN 47405, USA
- Charles P. Kuntz
- Department of Chemistry, Indiana University, Bloomington, IN 47405, USA
- Hope Woods
- Department of Chemistry, Vanderbilt University, Nashville, TN 37235, USA
- Chemical and Physical Biology Program, Vanderbilt University, Nashville, TN 37235, USA
- Veronica Nash
- Department of Chemistry, Indiana University, Bloomington, IN 47405, USA
- Mahesh Chandak
- Department of Chemistry, Indiana University, Bloomington, IN 47405, USA
- Chris Hemmerich
- Center for Genomics and Bioinformatics, Indiana University, Bloomington, IN 47405, USA
- Douglas B. Rusch
- Center for Genomics and Bioinformatics, Indiana University, Bloomington, IN 47405, USA
- Jens Meiler
- Department of Chemistry, Vanderbilt University, Nashville, TN 37235, USA
- Jonathan P. Schlebach
- Department of Chemistry, Indiana University, Bloomington, IN 47405, USA
- Corresponding author.
|
126
|
Affiliation(s)
- Melissa Chiasson
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Douglas M Fowler
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Department of Bioengineering, University of Washington, Seattle, WA, USA
- Genetic Networks Program, CIFAR, Toronto, Ontario, Canada
|
127
|
Miton CM, Chen JZ, Ost K, Anderson DW, Tokuriki N. Statistical analysis of mutational epistasis to reveal intramolecular interaction networks in proteins. Methods Enzymol 2020; 643:243-280. [DOI: 10.1016/bs.mie.2020.07.012] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/22/2023]
|
128
|
Atkinson JT, Jones AM, Nanda V, Silberg JJ. Protein tolerance to random circular permutation correlates with thermostability and local energetics of residue-residue contacts. Protein Eng Des Sel 2019; 32:489-501. [PMID: 32626892 PMCID: PMC7462040 DOI: 10.1093/protein/gzaa012] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 02/18/2020] [Revised: 04/13/2020] [Accepted: 04/15/2020] [Indexed: 01/08/2023]
Abstract
Adenylate kinase (AK) orthologs with a range of thermostabilities were subjected to random circular permutation, and deep mutational scanning was used to evaluate where new protein termini were nondisruptive to activity. The fraction of circularly permuted variants that retained function in each library correlated with AK thermostability. In addition, analysis of the positional tolerance to new termini, which increase local conformational flexibility, showed that bonds were either functionally sensitive to cleavage across all homologs, differentially sensitive, or uniformly tolerant. The mobile AMP-binding domain, which displays the highest calculated contact energies, presented the greatest tolerance to new termini across all AKs. In contrast, retention of function in the lid and core domains was more dependent upon AK melting temperature. These results show that family permutation profiling identifies primary structure that has been selected by evolution for dynamics that are critical to activity within an enzyme family. These findings also illustrate how deep mutational scanning can be applied to protein homologs in parallel to differentiate how topology, stability, and local energetics govern mutational tolerance.
Affiliation(s)
- Joshua T Atkinson
- Systems, Synthetic, and Physical Biology Graduate Program, Rice University, 6100 Main Street, MS-180, Houston, TX 77005, USA
- Department of BioSciences, Rice University, 6100 Main Street, MS-140, Houston, TX 77005, USA
- Alicia M Jones
- Biochemistry and Cell Biology Graduate Program, Rice University, 6100 Main Street, MS-140, Houston, TX 77005, USA
- Vikas Nanda
- Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Jonathan J Silberg
- Department of BioSciences, Rice University, 6100 Main Street, MS-140, Houston, TX 77005, USA
- Department of Bioengineering, Rice University, 6100 Main Street, MS-142, Houston, TX 77005, USA
- Department of Chemical and Biomolecular Engineering, Rice University, 6100 Main Street, MS-362, Houston, TX 77005, USA
|
129
|
Esposito D, Weile J, Shendure J, Starita LM, Papenfuss AT, Roth FP, Fowler DM, Rubin AF. MaveDB: an open-source platform to distribute and interpret data from multiplexed assays of variant effect. Genome Biol 2019; 20:223. [PMID: 31679514 PMCID: PMC6827219 DOI: 10.1186/s13059-019-1845-6] [Citation(s) in RCA: 125] [Impact Index Per Article: 25.0] [Received: 03/31/2019] [Accepted: 10/01/2019] [Indexed: 11/10/2022]
Abstract
Multiplex assays of variant effect (MAVEs), such as deep mutational scans and massively parallel reporter assays, test thousands of sequence variants in a single experiment. Despite the importance of MAVE data for basic and clinical research, there is no standard resource for their discovery and distribution. Here, we present MaveDB (https://www.mavedb.org), a public repository for large-scale measurements of sequence variant impact, designed for interoperability with applications to interpret these datasets. We also describe the first such application, MaveVis, which retrieves, visualizes, and contextualizes variant effect maps. Together, the database and applications will empower the community to mine these powerful datasets.
Affiliation(s)
- Daniel Esposito
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Jochen Weile
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Jay Shendure
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Lea M Starita
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- Anthony T Papenfuss
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Bioinformatics and Cancer Genomics Laboratory, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
- Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, VIC, Australia
- Department of Mathematics and Statistics, University of Melbourne, Melbourne, VIC, Australia
- Frederick P Roth
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Canadian Institute for Advanced Research, Toronto, ON, Canada
- Douglas M Fowler
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Canadian Institute for Advanced Research, Toronto, ON, Canada
- Department of Bioengineering, University of Washington, Seattle, WA, USA
- Alan F Rubin
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Bioinformatics and Cancer Genomics Laboratory, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
|
130
|
Kemble H, Nghe P, Tenaillon O. Recent insights into the genotype-phenotype relationship from massively parallel genetic assays. Evol Appl 2019; 12:1721-1742. [PMID: 31548853 PMCID: PMC6752143 DOI: 10.1111/eva.12846] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 03/27/2019] [Revised: 06/21/2019] [Accepted: 07/02/2019] [Indexed: 12/20/2022]
Abstract
With the molecular revolution in biology, a mechanistic understanding of the genotype-phenotype relationship became possible. Recently, advances in DNA synthesis and sequencing have enabled the development of deep mutational scanning assays, capable of scoring comprehensive libraries of genotypes for fitness and a variety of phenotypes in massively parallel fashion. The resulting empirical genotype-fitness maps pave the way to predictive models, potentially accelerating our ability to anticipate the behaviour of pathogen and cancerous cell populations from sequencing data. Besides cellular fitness, phenotypes of direct application in industry (e.g. enzyme activity) and medicine (e.g. antibody binding) can be quantified and even selected directly by these assays. This review discusses the technological basis of and recent developments in massively parallel genetics, along with the trends it is uncovering in the genotype-phenotype relationship (distribution of mutation effects, epistasis), their possible mechanistic bases and future directions for advancing towards the goal of predictive genetics.
Affiliation(s)
- Harry Kemble
- Infection, Antimicrobials, Modelling, Evolution, INSERM, Unité Mixte de Recherche 1137, Université Paris Diderot, Université Paris Nord, Paris, France
- École Supérieure de Physique et de Chimie Industrielles de la Ville de Paris (ESPCI Paris), UMR CNRS‐ESPCI CBI 8231, PSL Research University, Paris Cedex 05, France
| | - Philippe Nghe
- École Supérieure de Physique et de Chimie Industrielles de la Ville de Paris (ESPCI Paris), UMR CNRS‐ESPCI CBI 8231, PSL Research University, Paris Cedex 05, France
| | - Olivier Tenaillon
- Infection, Antimicrobials, Modelling, Evolution, INSERM, Unité Mixte de Recherche 1137, Université Paris Diderot, Université Paris Nord, Paris, France
|
131
|
Wang S, Dai L. Evolving generalists in switching rugged landscapes. PLoS Comput Biol 2019; 15:e1007320. [PMID: 31574088 PMCID: PMC6771975 DOI: 10.1371/journal.pcbi.1007320] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 01/17/2019] [Accepted: 08/02/2019] [Indexed: 01/05/2023] Open
Abstract
Evolving systems, be it an antibody repertoire in the face of mutating pathogens or a microbial population exposed to varied antibiotics, constantly search for adaptive solutions in time-varying fitness landscapes. Generalists refer to genotypes that remain fit across diverse selective pressures; while multi-drug resistant microbes are undesired yet prevalent, broadly-neutralizing antibodies are much wanted but rare. However, little is known about under what conditions such generalists with a high capacity to adapt can be efficiently discovered by evolution. In addition, can epistasis, the source of landscape ruggedness and path constraints, play a different role if the environment varies in a non-random way? We present a generative model to estimate the propensity of evolving generalists in rugged landscapes that are tunably related and alternating relatively slowly. We find that environmental cycling can substantially facilitate the search for fit generalists by dynamically enlarging their effective basins of attraction. Importantly, these high performers are most likely to emerge at intermediate levels of ruggedness and environmental relatedness. Our approach allows one to estimate correlations across environments from the topography of experimental fitness landscapes. Our work provides a conceptual framework to study evolution in time-correlated complex environments, and offers statistical understanding that suggests general strategies for eliciting broadly neutralizing antibodies or preventing microbes from evolving multi-drug resistance.
Affiliation(s)
- Shenshen Wang
- Department of Physics and Astronomy, University of California, Los Angeles, Los Angeles, California, United States of America
| | - Lei Dai
- Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
|
132
|
Li X, Lalić J, Baeza-Centurion P, Dhar R, Lehner B. Changes in gene expression predictably shift and switch genetic interactions. Nat Commun 2019; 10:3886. [PMID: 31467279 PMCID: PMC6715729 DOI: 10.1038/s41467-019-11735-3] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 04/10/2019] [Accepted: 07/29/2019] [Indexed: 11/18/2022] Open
Abstract
Non-additive interactions between mutations occur extensively and also change across conditions, making genetic prediction a difficult challenge. To better understand the plasticity of genetic interactions (epistasis), we combine mutations in a single protein performing a single function (a transcriptional repressor inhibiting a target gene). Even in this minimal system, genetic interactions switch from positive (suppressive) to negative (enhancing) as the expression of the gene changes. These seemingly complicated changes can be predicted using a mathematical model that propagates the effects of mutations on protein folding to the cellular phenotype. More generally, changes in gene expression should be expected to alter the effects of mutations and how they interact whenever the relationship between expression and a phenotype is nonlinear, which is the case for most genes. These results have important implications for understanding genotype-phenotype maps and illustrate how changes in genetic interactions can often—but not always—be predicted by hierarchical mechanistic models. Non-additive genetic interactions are plastic and can complicate genetic prediction. Here, using deep mutagenesis of the lambda repressor, Li et al. reveal that changes in gene expression can alter the strength and direction of genetic interactions between mutations in many genes and develop mathematical models for predicting them.
Affiliation(s)
- Xianghua Li
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain
| | - Jasna Lalić
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain
| | - Pablo Baeza-Centurion
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain
| | - Riddhiman Dhar
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain
| | - Ben Lehner
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain.
- Universitat Pompeu Fabra (UPF), Barcelona, Spain.
- ICREA, Pg. Luis Companys 23, Barcelona, 08010, Spain.
|
133
|
Ferrada E. Gene Families, Epistasis and the Amino Acid Preferences of Protein Homologs. Evol Bioinform Online 2019; 15:1176934319870485. [PMID: 31452598 PMCID: PMC6698995 DOI: 10.1177/1176934319870485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2019] [Accepted: 07/27/2019] [Indexed: 11/16/2022] Open
Abstract
In order to preserve structure and function, proteins tend to preferentially conserve amino acids at particular sites along the sequence. Because mutations can affect structure and function, the question arises whether the preference of a protein site for a particular amino acid varies between protein homologs, and to what extent that variation depends on sequence divergence. Answering these questions can help in the development of models of sequence evolution, as well as provide insights on the dependence of the fitness effects of mutations on the genetic background of sequences, a phenomenon known as epistasis. Here, I comment on recent computational work providing a systematic analysis of the extent to which the amino acid preferences of proteins depend on the background mutations of protein homologs.
Affiliation(s)
- Evandro Ferrada
- Center for Genomics and Bioinformatics, Faculty of Science, Universidad Mayor, Santiago, Chile
|
134
|
Atkinson JT, Jones AM, Zhou Q, Silberg JJ. Circular permutation profiling by deep sequencing libraries created using transposon mutagenesis. Nucleic Acids Res 2019; 46:e76. [PMID: 29912470 PMCID: PMC6061844 DOI: 10.1093/nar/gky255] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 12/04/2017] [Accepted: 03/28/2018] [Indexed: 12/17/2022] Open
Abstract
Deep mutational scanning has been used to create high-resolution DNA sequence maps that illustrate the functional consequences of large numbers of point mutations. However, this approach has not yet been applied to libraries of genes created by random circular permutation, an engineering strategy that is used to create open reading frames that express proteins with altered contact order. We describe a new method, termed circular permutation profiling with DNA sequencing (CPP-seq), which combines a one-step transposon mutagenesis protocol for creating libraries with a functional selection, deep sequencing and computational analysis to obtain unbiased insight into a protein's tolerance to circular permutation. Application of this method to an adenylate kinase revealed that CPP-seq creates two types of vectors encoding each circularly permuted gene, which differ in their ability to express proteins. Functional selection of this library revealed that >65% of the sampled vectors that express proteins are enriched relative to those that cannot translate proteins. Mapping enriched sequences onto structure revealed that the mobile AMP binding and rigid core domains display greater tolerance to backbone fragmentation than the mobile lid domain, illustrating how CPP-seq can be used to relate a protein's biophysical characteristics to the retention of activity upon permutation.
Affiliation(s)
- Joshua T Atkinson
- Systems, Synthetic, and Physical Biology Graduate Program, Rice University, 6100 Main MS-180, Houston, TX 77005, USA
| | - Alicia M Jones
- Department of BioSciences, Rice University, MS-140, 6100 Main Street, Houston, TX 77005, USA
| | - Quan Zhou
- Department of Statistics, Rice University, 6100 Main Street, Houston, TX 77005, USA
| | - Jonathan J Silberg
- Department of BioSciences, Rice University, MS-140, 6100 Main Street, Houston, TX 77005, USA
- Department of Bioengineering, Rice University, 6100 Main Street, Houston, TX 77005, USA
|
135
|
Choi GCG, Zhou P, Yuen CTL, Chan BKC, Xu F, Bao S, Chu HY, Thean D, Tan K, Wong KH, Zheng Z, Wong ASL. Combinatorial mutagenesis en masse optimizes the genome editing activities of SpCas9. Nat Methods 2019; 16:722-730. [PMID: 31308554 DOI: 10.1038/s41592-019-0473-0] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 07/27/2018] [Accepted: 06/03/2019] [Indexed: 01/01/2023]
Abstract
The combined effect of multiple mutations on protein function is hard to predict; thus, the ability to functionally assess a vast number of protein sequence variants would be practically useful for protein engineering. Here we present a high-throughput platform that enables scalable assembly and parallel characterization of barcoded protein variants with combinatorial modifications. We demonstrate this platform, which we name CombiSEAL, by systematically characterizing a library of 948 combination mutants of the widely used Streptococcus pyogenes Cas9 (SpCas9) nuclease to optimize its genome-editing activity in human cells. The ease with which the editing activities of the pool of SpCas9 variants can be assessed at multiple on- and off-target sites accelerates the identification of optimized variants and facilitates the study of mutational epistasis. We successfully identify Opti-SpCas9, which possesses enhanced editing specificity without sacrificing potency and broad targeting range. This platform is broadly applicable for engineering proteins through combinatorial modifications en masse.
Affiliation(s)
- Gigi C G Choi
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Hong Kong, China
| | - Peng Zhou
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Hong Kong, China
| | - Chaya T L Yuen
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Hong Kong, China
| | - Becky K C Chan
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Hong Kong, China
| | - Feng Xu
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Hong Kong, China
| | - Siyu Bao
- Ming Wai Lau Centre for Reparative Medicine, Karolinska Institutet, Hong Kong, China
| | - Hoi Yee Chu
- Ming Wai Lau Centre for Reparative Medicine, Karolinska Institutet, Hong Kong, China
| | - Dawn Thean
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Hong Kong, China
| | - Kaeling Tan
- Faculty of Health Sciences, University of Macau, Macau, China
- Genomics, Bioinformatics and Single Cell Analysis Core, Faculty of Health Sciences, University of Macau, Macau, China
| | - Koon Ho Wong
- Faculty of Health Sciences, University of Macau, Macau, China
- Institute of Translational Medicine, University of Macau, Macau, China
| | - Zongli Zheng
- Ming Wai Lau Centre for Reparative Medicine, Karolinska Institutet, Hong Kong, China
- Department of Biomedical Sciences, City University of Hong Kong, Hong Kong, China
- Biotechnology and Health Centre, City University of Hong Kong Shenzhen Research Institute, Shenzhen, China
| | - Alan S L Wong
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Hong Kong, China.
- Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, China.
|
136
|
Protein stability engineering insights revealed by domain-wide comprehensive mutagenesis. Proc Natl Acad Sci U S A 2019; 116:16367-16377. [PMID: 31371509 DOI: 10.1073/pnas.1903888116] [Citation(s) in RCA: 108] [Impact Index Per Article: 21.6] [Indexed: 12/21/2022] Open
Abstract
The accurate prediction of protein stability upon sequence mutation is an important but unsolved challenge in protein engineering. Large mutational datasets are required to train computational predictors, but traditional methods for collecting stability data are either low-throughput or measure protein stability indirectly. Here, we develop an automated method to generate thermodynamic stability data for nearly every single mutant in a small 56-residue protein. Analysis reveals that most single mutants have a neutral effect on stability, mutational sensitivity is largely governed by residue burial, and unexpectedly, hydrophobics are the best tolerated amino acid type. Correlating the output of various stability-prediction algorithms against our data shows that nearly all perform better on boundary and surface positions than for those in the core and are better at predicting large-to-small mutations than small-to-large ones. We show that the most stable variants in the single-mutant landscape are better identified using combinations of 2 prediction algorithms and including more algorithms can provide diminishing returns. In most cases, poor in silico predictions were tied to compositional differences between the data being analyzed and the datasets used to train the algorithm. Finally, we find that strategies to extract stabilities from high-throughput fitness data such as deep mutational scanning are promising and that data produced by these methods may be applicable toward training future stability-prediction tools.
|
137
|
Abstract
Evolvability is the ability of a biological system to produce phenotypic variation that is both heritable and adaptive. It has long been the subject of anecdotal observations and theoretical work. In recent years, however, the molecular causes of evolvability have been an increasing focus of experimental work. Here, we review recent experimental progress in areas as different as the evolution of drug resistance in cancer cells and the rewiring of transcriptional regulation circuits in vertebrates. This research reveals the importance of three major themes: multiple genetic and non-genetic mechanisms to generate phenotypic diversity, robustness in genetic systems, and adaptive landscape topography. We also discuss the mounting evidence that evolvability can evolve and the question of whether it evolves adaptively.
|
138
|
Rollins NJ, Brock KP, Poelwijk FJ, Stiffler MA, Gauthier NP, Sander C, Marks DS. Inferring protein 3D structure from deep mutation scans. Nat Genet 2019; 51:1170-1176. [PMID: 31209393 PMCID: PMC7295002 DOI: 10.1038/s41588-019-0432-9] [Citation(s) in RCA: 90] [Impact Index Per Article: 18.0] [Received: 10/09/2018] [Accepted: 04/29/2019] [Indexed: 11/09/2022]
Abstract
We describe an experimental method of three-dimensional (3D) structure determination that exploits the increasing ease of high-throughput mutational scans. Inspired by the success of using natural, evolutionary sequence covariation to compute protein and RNA folds, we explored whether 'laboratory', synthetic sequence variation might also yield 3D structures. We analyzed five large-scale mutational scans and discovered that the pairs of residues with the largest positive epistasis in the experiments are sufficient to determine the 3D fold. We show that the strongest epistatic pairings from genetic screens of three proteins, a ribozyme and a protein interaction reveal 3D contacts within and between macromolecules. Using these experimental epistatic pairs, we compute ab initio folds for a GB1 domain (within 1.8 Å of the crystal structure) and a WW domain (2.1 Å). We propose strategies that reduce the number of mutants needed for contact prediction, suggesting that genomics-based techniques can efficiently predict 3D structure.
Affiliation(s)
- Nathan J Rollins
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Kelly P Brock
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Department of Cell Biology, Harvard Medical School, Boston, MA, USA
| | - Frank J Poelwijk
- cBio Center, Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Michael A Stiffler
- cBio Center, Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Nicholas P Gauthier
- Department of Cell Biology, Harvard Medical School, Boston, MA, USA
- cBio Center, Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Chris Sander
- Department of Cell Biology, Harvard Medical School, Boston, MA, USA
- cBio Center, Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Debora S Marks
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA.
- Broad Institute of Harvard and MIT, Cambridge, MA, USA.
|
139
|
Schmiedel JM, Lehner B. Determining protein structures using deep mutagenesis. Nat Genet 2019; 51:1177-1186. [PMID: 31209395 PMCID: PMC7610650 DOI: 10.1038/s41588-019-0431-x] [Citation(s) in RCA: 94] [Impact Index Per Article: 18.8] [Received: 10/09/2018] [Accepted: 04/29/2019] [Indexed: 12/12/2022]
Abstract
Determining the three-dimensional structures of macromolecules is a major goal of biological research, because of the close relationship between structure and function; however, thousands of protein domains still have unknown structures. Structure determination usually relies on physical techniques including X-ray crystallography, NMR spectroscopy and cryo-electron microscopy. Here we present a method that allows the high-resolution three-dimensional backbone structure of a biological macromolecule to be determined only from measurements of the activity of mutant variants of the molecule. This genetic approach to structure determination relies on the quantification of genetic interactions (epistasis) between mutations and the discrimination of direct from indirect interactions. This provides an alternative experimental strategy for structure determination, with the potential to reveal functional and in vivo structures.
Affiliation(s)
- Jörn M Schmiedel
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Ben Lehner
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain.
- Universitat Pompeu Fabra (UPF), Barcelona, Spain.
- ICREA, Barcelona, Spain.
|
140
|
Jakobson CM, Jarosz DF. Molecular Origins of Complex Heritability in Natural Genotype-to-Phenotype Relationships. Cell Syst 2019; 8:363-379.e3. [PMID: 31054809 PMCID: PMC6560647 DOI: 10.1016/j.cels.2019.04.002] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 01/24/2019] [Revised: 03/25/2019] [Accepted: 04/05/2019] [Indexed: 01/09/2023]
Abstract
The statistical complexity of heredity has long been evident, but its molecular origins remain elusive. To investigate, we charted 90 comprehensive genotype-to-phenotype maps in a large population of wild diploid yeast. In contrast to long-standing assumptions, all types of genetic variation contributed similarly to phenotype. Causal synonymous and regulatory variants exhibited distinct molecular signatures, as did nonlinearities in heterozygote fitness that likely contribute to hybrid vigor. Highly pleiotropic variants altered disordered sequences within signaling hubs, and their effects correlated across environments, even when antagonistic, suggesting that large fitness gains bring concomitant costs. Natural genetic networks defined by the causal loci differed from those determined by precise gene deletions or protein-protein interactions. Finally, we found that traits that would appear omnigenic in less powered studies do in fact have finite genetic determinants. Integrating these molecular principles will be crucial as genome reading and writing become routine in research, industry, and medicine.
Affiliation(s)
- Christopher M Jakobson
- Department of Chemical and Systems Biology, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Daniel F Jarosz
- Department of Chemical and Systems Biology, Stanford University School of Medicine, Stanford, CA 94305, USA; Department of Developmental Biology, Stanford University School of Medicine, Stanford, CA 94305, USA.
|
141
|
Kinney JB, McCandlish DM. Massively Parallel Assays and Quantitative Sequence-Function Relationships. Annu Rev Genomics Hum Genet 2019; 20:99-127. [PMID: 31091417 DOI: 10.1146/annurev-genom-083118-014845] [Citation(s) in RCA: 81] [Impact Index Per Article: 16.2] [Indexed: 11/09/2022]
Abstract
Over the last decade, a rich variety of massively parallel assays have revolutionized our understanding of how biological sequences encode quantitative molecular phenotypes. These assays include deep mutational scanning, high-throughput SELEX, and massively parallel reporter assays. Here, we review these experimental methods and how the data they produce can be used to quantitatively model sequence-function relationships. In doing so, we touch on a diverse range of topics, including the identification of clinically relevant genomic variants, the modeling of transcription factor binding to DNA, the functional and evolutionary landscapes of proteins, and cis-regulatory mechanisms in both transcription and mRNA splicing. We further describe a unified conceptual framework and a core set of mathematical modeling strategies that studies in these diverse areas can make use of. Finally, we highlight key aspects of experimental design and mathematical modeling that are important for the results of such studies to be interpretable and reproducible.
Affiliation(s)
- Justin B Kinney
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
| | - David M McCandlish
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
|
142
|
Domingo J, Baeza-Centurion P, Lehner B. The Causes and Consequences of Genetic Interactions (Epistasis). Annu Rev Genomics Hum Genet 2019; 20:433-460. [PMID: 31082279 DOI: 10.1146/annurev-genom-083118-014857] [Citation(s) in RCA: 124] [Impact Index Per Article: 24.8] [Indexed: 11/09/2022]
Abstract
The same mutation can have different effects in different individuals. One important reason for this is that the outcome of a mutation can depend on the genetic context in which it occurs. This dependency is known as epistasis. In recent years, there has been a concerted effort to quantify the extent of pairwise and higher-order genetic interactions between mutations through deep mutagenesis of proteins and RNAs. This research has revealed two major components of epistasis: nonspecific genetic interactions caused by nonlinearities in genotype-to-phenotype maps, and specific interactions between particular mutations. Here, we provide an overview of our current understanding of the mechanisms causing epistasis at the molecular level, the consequences of genetic interactions for evolution and genetic prediction, and the applications of epistasis for understanding biology and determining macromolecular structures.
Affiliation(s)
- Júlia Domingo
- Systems Biology Program, Centre for Genomic Regulation, Barcelona Institute of Science and Technology, 08003 Barcelona, Spain
| | - Pablo Baeza-Centurion
- Systems Biology Program, Centre for Genomic Regulation, Barcelona Institute of Science and Technology, 08003 Barcelona, Spain
| | - Ben Lehner
- Systems Biology Program, Centre for Genomic Regulation, Barcelona Institute of Science and Technology, 08003 Barcelona, Spain
- Universitat Pompeu Fabra, 08003 Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), 08010 Barcelona, Spain
|
143
|
Gaiha GD, Rossin EJ, Urbach J, Landeros C, Collins DR, Nwonu C, Muzhingi I, Anahtar MN, Waring OM, Piechocka-Trocha A, Waring M, Worrall DP, Ghebremichael MS, Newman RM, Power KA, Allen TM, Chodosh J, Walker BD. Structural topology defines protective CD8+ T cell epitopes in the HIV proteome. Science 2019; 364:480-484. [PMID: 31048489 PMCID: PMC6855781 DOI: 10.1126/science.aav5095] [Citation(s) in RCA: 98] [Impact Index Per Article: 19.6] [Received: 09/23/2018] [Accepted: 03/25/2019] [Indexed: 12/26/2022]
Abstract
Mutationally constrained epitopes of variable pathogens represent promising targets for vaccine design but are not reliably identified by sequence conservation. In this study, we employed structure-based network analysis, which applies network theory to HIV protein structure data to quantitate the topological importance of individual amino acid residues. Mutation of residues at important network positions disproportionately impaired viral replication and occurred with high frequency in epitopes presented by protective human leukocyte antigen (HLA) class I alleles. Moreover, CD8+ T cell targeting of highly networked epitopes distinguished individuals who naturally control HIV, even in the absence of protective HLA alleles. This approach thereby provides a mechanistic basis for immune control and a means to identify CD8+ T cell epitopes of topological importance for rational immunogen design, including a T cell-based HIV vaccine.
Affiliation(s)
- Gaurav D Gaiha
- Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
- Gastrointestinal Unit, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Elizabeth J Rossin
- The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Ophthalmology, Massachusetts Eye and Ear, Boston, MA 02114, USA
| | - Jonathan Urbach
- Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
| | | | - David R Collins
- Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
- Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
| | - Chioma Nwonu
- Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
| | - Itai Muzhingi
- Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
| | - Melis N Anahtar
- Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
- Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Olivia M Waring
- Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Alicja Piechocka-Trocha
- Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
- Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
| | - Michael Waring
- Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
- Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
| | - Daniel P Worrall
- Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
| | | | - Ruchi M Newman
- Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
| | - Karen A Power
- Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
| | - Todd M Allen
- Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
| | - James Chodosh
- Department of Ophthalmology, Massachusetts Eye and Ear, Boston, MA 02114, USA
| | - Bruce D Walker
- Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA.
- The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
|
144
|
Abstract
For nearly a century adaptive landscapes have provided overviews of the evolutionary process and yet they remain metaphors. We redefine adaptive landscapes in terms of biological processes rather than descriptive phenomenology. We focus on the underlying mechanisms that generate emergent properties such as epistasis, dominance, trade-offs and adaptive peaks. We illustrate the utility of landscapes in predicting the course of adaptation and the distribution of fitness effects. We abandon aged arguments concerning landscape ruggedness in favor of empirically determining landscape architecture. In so doing, we transform the landscape metaphor into a scientific framework within which causal hypotheses can be tested.
Affiliation(s)
- Xiao Yi
- BioTechnology Institute, University of Minnesota, St. Paul, MN
| | - Antony M Dean
- BioTechnology Institute, University of Minnesota, St. Paul, MN
- Department of Ecology, Evolution, and Behavior, University of Minnesota, St. Paul, MN
|
145
|
Horovitz A, Fleisher RC, Mondal T. Double-mutant cycles: new directions and applications. Curr Opin Struct Biol 2019; 58:10-17. [PMID: 31029859 DOI: 10.1016/j.sbi.2019.03.025] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 02/17/2019] [Accepted: 03/20/2019] [Indexed: 11/17/2022]
Abstract
Double-mutant cycle (DMC) analysis is a powerful approach for detecting and quantifying the energetics of both direct and long-range interactions in proteins and other chemical systems. It can also be used to unravel higher-order interactions (e.g. three-body effects) that lead to cooperativity in protein folding and function. In this review, we describe new applications of DMC analysis based on advances in native mass spectrometry and high-throughput methods such as next generation sequencing and protein complementation assays. These developments have facilitated carrying out high-throughput DMC analysis, which can be used to characterize increasingly higher-order interactions and very large interaction networks in proteins. Such studies have provided insights into the extent of cooperativity (epistasis) in protein structures. High-throughput DMC studies have also been used to validate correlated mutation analysis and can provide restraints for protein docking.
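As a minimal sketch of the quantity a double-mutant cycle measures, the snippet below computes a pairwise coupling energy from four free energies. The function name, the sign convention, and the numeric values are illustrative assumptions, not taken from the review.

```python
def coupling_energy(dg_wt, dg_a, dg_b, dg_ab):
    """Pairwise coupling energy of a double-mutant cycle.

    Arguments are free energies in the same units (e.g. kcal/mol) for the
    wild type, the two single mutants, and the double mutant. Zero means the
    two mutations act independently (their effects are additive).
    Sign convention (an assumption): effect of A in the B background minus
    effect of A in the wild-type background.
    """
    effect_of_a_in_wt = dg_a - dg_wt
    effect_of_a_in_b = dg_ab - dg_b
    return effect_of_a_in_b - effect_of_a_in_wt


# Hypothetical free energies, chosen only to illustrate the two cases.
additive = coupling_energy(-10.0, -8.5, -9.0, -7.5)  # independent residues
coupled = coupling_energy(-10.0, -8.5, -9.0, -9.5)   # interacting residues
print(additive, coupled)
```

A nonzero result indicates that the two positions are energetically coupled; its sign depends on the convention chosen above.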
Affiliation(s)
- Amnon Horovitz
  - Department of Structural Biology, Weizmann Institute of Science, Rehovot 7610001, Israel
- Rachel C Fleisher
  - Department of Structural Biology, Weizmann Institute of Science, Rehovot 7610001, Israel
- Tridib Mondal
  - Department of Structural Biology, Weizmann Institute of Science, Rehovot 7610001, Israel
|
146
|
Pokusaeva VO, Usmanova DR, Putintseva EV, Espinar L, Sarkisyan KS, Mishin AS, Bogatyreva NS, Ivankov DN, Akopyan AV, Avvakumov SY, Povolotskaya IS, Filion GJ, Carey LB, Kondrashov FA. An experimental assay of the interactions of amino acids from orthologous sequences shaping a complex fitness landscape. PLoS Genet 2019; 15:e1008079. [PMID: 30969963 PMCID: PMC6476524 DOI: 10.1371/journal.pgen.1008079] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 08/21/2018] [Revised: 04/22/2019] [Accepted: 03/11/2019] [Indexed: 11/18/2022] Open
Abstract
Characterizing the fitness landscape, a representation of fitness for a large set of genotypes, is key to understanding how genetic information is interpreted to create functional organisms. Here we determined the evolutionarily-relevant segment of the fitness landscape of His3, a gene coding for an enzyme in the histidine synthesis pathway, focusing on combinations of amino acid states found at orthologous sites of extant species. Just 15% of amino acids found in yeast His3 orthologues were always neutral while the impact on fitness of the remaining 85% depended on the genetic background. Furthermore, at 67% of sites, amino acid replacements were under sign epistasis, having both strongly positive and negative effects in different genetic backgrounds. 46% of sites were under reciprocal sign epistasis. The fitness impact of amino acid replacements was influenced by only a few genetic backgrounds but involved interaction of multiple sites, shaping a rugged fitness landscape in which many of the shortest paths between highly fit genotypes are inaccessible.
An intuitive understanding of protein evolution dictates that, with the exception of adaptive substitutions, amino acid states should be freely exchangeable between the same gene from different species. However, the extent to which this assertion holds true has not been tested in a controlled experiment. Here, we show that whether an amino acid state can be exchanged between orthologues depends on other amino acid states in the same protein. Furthermore, we show that the mode of interaction of amino acid states is multidimensional. Assuming that amino acid replacements influence the protein in several independent ways substantially improves our ability to predict the effect of an amino acid state in a protein sequence that has not been observed in nature.
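The distinction between sign and reciprocal sign epistasis can be sketched for the simplest case of two sites with two states each. This is the textbook two-locus definition, not the site-level procedure used in the study; the function name, tolerance, and fitness values are hypothetical.

```python
def classify_epistasis(f00, f10, f01, f11, tol=1e-9):
    """Classify epistasis between two biallelic sites.

    f00 is the fitness of the reference genotype, f10 and f01 the single
    mutants, f11 the double mutant (all values hypothetical).
    """
    # Effect of the first mutation on each background of the second site.
    d1_on_0 = f10 - f00
    d1_on_1 = f11 - f01
    # Effect of the second mutation on each background of the first site.
    d2_on_0 = f01 - f00
    d2_on_1 = f11 - f10

    flips_1 = d1_on_0 * d1_on_1 < 0  # first mutation changes sign
    flips_2 = d2_on_0 * d2_on_1 < 0  # second mutation changes sign

    if flips_1 and flips_2:
        return "reciprocal sign"
    if flips_1 or flips_2:
        return "sign"
    if abs(d1_on_1 - d1_on_0) > tol:
        return "magnitude"
    return "none"


print(classify_epistasis(1.0, 0.8, 0.8, 1.2))  # both effects flip sign
print(classify_epistasis(1.0, 1.2, 0.9, 0.8))  # only one effect flips sign
print(classify_epistasis(1.0, 1.1, 1.2, 1.5))  # same sign, different size
print(classify_epistasis(1.0, 1.1, 1.2, 1.3))  # purely additive
```

In the reciprocal case each single mutant is less fit than both the reference and the double mutant, which is the configuration that makes direct paths between fit genotypes inaccessible.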
Affiliation(s)
- Dinara R. Usmanova
  - Department of Systems Biology, Columbia University, New York, NY, United States of America
- Lorena Espinar
  - Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), 88 Dr. Aiguader, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Karen S. Sarkisyan
  - Institute of Science and Technology Austria, Am Campus 1, Klosterneuburg, Austria
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia
  - Medical Research Council London Institute of Medical Sciences, Imperial College London, London, United Kingdom
- Natalya S. Bogatyreva
  - Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), 88 Dr. Aiguader, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - Laboratory of Protein Physics, Institute of Protein Research of the Russian Academy of Sciences, Pushchino, Moscow region, Russia
- Dmitry N. Ivankov
  - Institute of Science and Technology Austria, Am Campus 1, Klosterneuburg, Austria
  - Laboratory of Protein Physics, Institute of Protein Research of the Russian Academy of Sciences, Pushchino, Moscow region, Russia
- Arseniy V. Akopyan
  - Institute of Science and Technology Austria, Am Campus 1, Klosterneuburg, Austria
- Sergey Ya. Avvakumov
  - Institute of Science and Technology Austria, Am Campus 1, Klosterneuburg, Austria
- Inna S. Povolotskaya
  - Veltischev Research and Clinical Institute for Pediatrics of the Pirogov Russian National Research Medical University, Moscow, Russia
- Guillaume J. Filion
  - Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), 88 Dr. Aiguader, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Lucas B. Carey
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - Center for Quantitative Biology and Peking-Tsinghua Joint Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
  - * E-mail: (LBC); (FAK)
- Fyodor A. Kondrashov
  - Institute of Science and Technology Austria, Am Campus 1, Klosterneuburg, Austria
  - * E-mail: (LBC); (FAK)
|
147
|
Gonzalez CE, Ostermeier M. Pervasive Pairwise Intragenic Epistasis among Sequential Mutations in TEM-1 β-Lactamase. J Mol Biol 2019; 431:1981-1992. [PMID: 30922874 DOI: 10.1016/j.jmb.2019.03.020] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 12/14/2018] [Revised: 02/25/2019] [Accepted: 03/13/2019] [Indexed: 12/25/2022]
Abstract
Interactions between mutations play a central role in shaping the fitness landscape, but a clear picture of intragenic epistasis has yet to emerge. To further reveal the prevalence and patterns of intragenic epistasis, we present a survey of epistatic interactions between sequential mutations in TEM-1 β-lactamase. We measured the fitness effect of ~12,000 pairs of consecutive amino acid substitutions and used our previous study of the fitness effects of single amino acid substitutions to calculate epistasis for over 8000 mutation pairs. Since sequential mutations are prone to physically interact, we postulated that our study would be surveying specific epistasis instead of nonspecific epistasis. We found widespread negative epistasis, especially in beta-strands, and a high frequency of negative sign epistasis among individually beneficial mutations. Negative epistasis (52%) occurred 7.6 times as frequently as positive epistasis (6.8%). Buried residues experienced more negative epistasis than surface-exposed residues. However, TEM-1 exhibited a couple of hotspots for positive epistasis, most notably L221/R222 at which many combinations of mutations positively interacted. This study is the first to systematically examine pairwise epistasis throughout an entire protein performing its native function in its native host.
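One common way to calculate pairwise epistasis from single-mutant and double-mutant fitness is sketched below. The log-scale (multiplicative) null model is an assumption here and may differ from the definition used in the study; the fitness values are hypothetical. The last lines simply check the ratio quoted in the abstract.

```python
import math


def log_epistasis(w_a, w_b, w_ab, w_wt=1.0):
    """Epistasis on a log scale, relative to wild-type fitness w_wt.

    Negative values mean the double mutant is less fit than expected if the
    two single-mutant effects combined multiplicatively.
    """
    expected = math.log(w_a / w_wt) + math.log(w_b / w_wt)
    observed = math.log(w_ab / w_wt)
    return observed - expected


# Hypothetical relative fitness values.
eps_negative = log_epistasis(0.9, 0.8, 0.5)   # observed 0.5 < expected 0.72
eps_none = log_epistasis(0.9, 0.8, 0.72)      # matches the expectation

# Ratio of the reported frequencies of negative and positive epistasis.
ratio = 52 / 6.8
print(eps_negative, eps_none, round(ratio, 1))
```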
Affiliation(s)
- Courtney E Gonzalez
  - Department of Chemical and Biomolecular Engineering, Johns Hopkins University, 3400 N. Charles St., Baltimore, MD 21218, USA
- Marc Ostermeier
  - Department of Chemical and Biomolecular Engineering, Johns Hopkins University, 3400 N. Charles St., Baltimore, MD 21218, USA
|
148
|
Adams RM, Kinney JB, Walczak AM, Mora T. Epistasis in a Fitness Landscape Defined by Antibody-Antigen Binding Free Energy. Cell Syst 2019; 8:86-93.e3. [PMID: 30611676 PMCID: PMC6487650 DOI: 10.1016/j.cels.2018.12.004] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 05/30/2018] [Revised: 10/12/2018] [Accepted: 12/07/2018] [Indexed: 12/16/2022]
Abstract
Epistasis is the phenomenon by which the effect of a mutation depends on its genetic background. While it is usually defined in terms of organismal fitness, for single proteins, it must reflect physical interactions among residues. Here, we systematically extract the specific contribution pairwise epistasis makes to the physical affinity of antibody-antigen binding relevant to affinity maturation, a process of accelerated Darwinian evolution. We find that, among competing definitions of affinity, the binding free energy is the most appropriate to describe epistasis. We show that epistasis is pervasive, accounting for 25%-35% of variability, of which a large fraction is beneficial. This work suggests that epistasis both constrains, through negative epistasis, and enlarges, through positive epistasis, the set of possible evolutionary paths that can produce high-affinity sequences during repeated rounds of mutation and selection.
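Why the choice of affinity scale matters can be illustrated with a small sketch: two mutations that each change the dissociation constant by an independent fold factor show no epistasis in binding free energy but apparent epistasis on the raw dissociation-constant scale. The dissociation constants, the temperature, and the helper names are illustrative assumptions, not values from the paper.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)
TEMPERATURE = 298.15  # assumed temperature in kelvin


def binding_free_energy(kd_molar):
    """Binding free energy (kcal/mol) from a dissociation constant in molar."""
    return R_KCAL * TEMPERATURE * math.log(kd_molar)


def pairwise_epistasis(v_wt, v_a, v_b, v_ab):
    """Deviation of the double mutant from additivity on a given scale."""
    return v_ab - v_a - v_b + v_wt


# Hypothetical dissociation constants: each mutation acts as an independent
# fold change on Kd, so the double mutant is the product of both fold changes.
kd_wt, kd_a, kd_b = 1e-9, 1e-8, 1e-7
kd_ab = kd_a * kd_b / kd_wt

eps_kd = pairwise_epistasis(kd_wt, kd_a, kd_b, kd_ab)
eps_dg = pairwise_epistasis(
    *(binding_free_energy(kd) for kd in (kd_wt, kd_a, kd_b, kd_ab))
)
print(eps_kd, eps_dg)
```

On the free-energy scale the deviation vanishes up to rounding error, while on the Kd scale the same data look strongly epistatic.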
Affiliation(s)
- Rhys M Adams
  - CNRS, Laboratoire de Physique Théorique, UPMC (Sorbonne University), and École Normale Supérieure (PSL), 24 rue Lhomond, Paris 75005, France; Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, 1 Bungtown Rd., Cold Spring Harbor, NY 11724, USA
- Justin B Kinney
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, 1 Bungtown Rd., Cold Spring Harbor, NY 11724, USA
- Aleksandra M Walczak
  - CNRS, Laboratoire de Physique Théorique, UPMC (Sorbonne University), and École Normale Supérieure (PSL), 24 rue Lhomond, Paris 75005, France
- Thierry Mora
  - CNRS, Laboratoire de Physique Statistique, UPMC (Sorbonne University), Paris-Diderot University, and École Normale Supérieure (PSL), 24, rue Lhomond, Paris 75005, France
|
149
|
Baeza-Centurion P, Miñana B, Schmiedel JM, Valcárcel J, Lehner B. Combinatorial Genetics Reveals a Scaling Law for the Effects of Mutations on Splicing. Cell 2019; 176:549-563.e23. [PMID: 30661752 DOI: 10.1016/j.cell.2018.12.010] [Citation(s) in RCA: 68] [Impact Index Per Article: 13.6] [Received: 02/26/2018] [Revised: 08/29/2018] [Accepted: 12/07/2018] [Indexed: 02/08/2023]
Abstract
Despite a wealth of molecular knowledge, quantitative laws for accurate prediction of biological phenomena remain rare. Alternative pre-mRNA splicing is an important regulated step in gene expression frequently perturbed in human disease. To understand the combined effects of mutations during evolution, we quantified the effects of all possible combinations of exonic mutations accumulated during the emergence of an alternatively spliced human exon. This revealed that mutation effects scale non-monotonically with the inclusion level of an exon, with each mutation having maximum effect at a predictable intermediate inclusion level. This scaling is observed genome-wide for cis and trans perturbations of splicing, including for natural and disease-associated variants. Mathematical modeling suggests that competition between alternative splice sites is sufficient to cause this non-linearity in the genotype-phenotype map. Combining the global scaling law with specific pairwise interactions between neighboring mutations allows accurate prediction of the effects of complex genotype changes involving >10 mutations.
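The non-monotonic scaling can be illustrated with a toy model in which a mutation adds a fixed amount to a latent quantity that maps to exon inclusion through a sigmoid. The logistic form, the function names, and the effect size are assumptions made for illustration; the paper derives its scaling from competition between splice sites, which is not reproduced here.

```python
import math


def inclusion(latent):
    """Map a latent splicing strength to an inclusion level between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-latent))


def mutation_effect(start_inclusion, delta):
    """Change in inclusion when a mutation shifts the latent value by delta."""
    latent = math.log(start_inclusion / (1.0 - start_inclusion))
    return inclusion(latent + delta) - start_inclusion


# The same hypothetical mutation (delta = 1.0) applied at three starting levels.
effects = {psi: mutation_effect(psi, 1.0) for psi in (0.01, 0.5, 0.99)}
for psi, effect in effects.items():
    print(f"start {psi:.2f}: change {effect:.3f}")
```

In this toy model the identical mutation has a large effect at an intermediate starting inclusion level and almost none near 0 or 1, which is the qualitative pattern the abstract describes.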
Affiliation(s)
- Pablo Baeza-Centurion
  - Systems Biology Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003 Barcelona, Spain
- Belén Miñana
  - Gene Regulation, Stem Cells and Cancer Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003 Barcelona, Spain
- Jörn M Schmiedel
  - Systems Biology Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003 Barcelona, Spain
- Juan Valcárcel
  - Gene Regulation, Stem Cells and Cancer Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Pg. Lluís Companys 23, 08010 Barcelona, Spain
- Ben Lehner
  - Systems Biology Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003 Barcelona, Spain; Gene Regulation, Stem Cells and Cancer Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003 Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Pg. Lluís Companys 23, 08010 Barcelona, Spain
|
150
|
Khare S, Bhasin M, Sahoo A, Varadarajan R. Protein model discrimination attempts using mutational sensitivity, predicted secondary structure, and model quality information. Proteins 2019; 87:326-336. [PMID: 30615225 DOI: 10.1002/prot.25654] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/13/2018] [Revised: 12/22/2018] [Accepted: 01/02/2019] [Indexed: 01/02/2023]
Abstract
Structure prediction methods often generate a large number of models for a target sequence. Even if the correct fold for the target sequence is sampled in this dataset, it is difficult to distinguish it from other decoy structures. An attempt to solve this problem using experimental mutational sensitivity data for the CcdB protein was described previously by exploiting the correlation of residue depth with mutational sensitivity (r ~ 0.6). We now show that such a correlation extends to four other proteins with localized active sites, and for which saturation mutagenesis datasets exist. We also examine whether incorporation of predicted secondary structure information and the DOPE model quality assessment score, in addition to mutational sensitivity, improves the accuracy of model discrimination using a decoy dataset of 163 targets from CASP. Although most CASP models would have been subjected to model quality assessment prior to submission, we find that the DOPE score makes a substantial contribution to the observed improvement. We therefore also applied the approach to CcdB and four other proteins for which reliable experimental mutational data exist and observe that inclusion of experimental mutational data results in a small qualitative improvement in model discrimination relative to that seen with just the DOPE score. This is largely because of our limited ability to quantitatively predict effects of point mutations on in vivo protein activity. Further improvements in the methodology are required to facilitate improved utilization of single mutant data.
Affiliation(s)
- Shruti Khare
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
- Munmun Bhasin
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
- Anusmita Sahoo
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
- Raghavan Varadarajan
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
  - Chemical Biology Unit, Jawaharlal Nehru Center for Advanced Scientific Research, Bangalore, India
|