1
Bendel AM, Skendo K, Klein D, Shimada K, Kauneckaite-Griguole K, Diss G. Optimization of a deep mutational scanning workflow to improve quantification of mutation effects on protein-protein interactions. BMC Genomics 2024; 25:630. [PMID: 38914936 PMCID: PMC11194945 DOI: 10.1186/s12864-024-10524-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Accepted: 06/14/2024] [Indexed: 06/26/2024] Open
Abstract
Deep Mutational Scanning (DMS) assays are powerful tools to study sequence-function relationships by measuring the effects of thousands of sequence variants on protein function. During a DMS experiment, several technical artefacts can distort the resulting functional scores non-linearly, potentially biasing the interpretation of the results. We therefore tested several technical parameters in the deepPCA workflow, a DMS assay for protein-protein interactions, to identify technical sources of non-linearity. We found that parameters common to many DMS assays, such as the amount of transformed DNA, the timepoint of harvest and the library composition, can cause non-linearities in the data. Designing experiments to minimize these non-linear effects will improve the quantification and interpretation of mutation effects.
Affiliation(s)
- Alexandra M Bendel
  - Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
  - University of Basel, Basel, Switzerland
- Dominique Klein
  - Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
- Kenji Shimada
  - Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
- Kotryna Kauneckaite-Griguole
  - Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
  - University of Basel, Basel, Switzerland
- Guillaume Diss
  - Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
2
Metzger BPH, Park Y, Starr TN, Thornton JW. Epistasis facilitates functional evolution in an ancient transcription factor. eLife 2024; 12:RP88737. [PMID: 38767330 PMCID: PMC11105156 DOI: 10.7554/elife.88737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024] Open
Abstract
A protein's genetic architecture - the set of causal rules by which its sequence produces its functions - also determines its possible evolutionary trajectories. Prior research has proposed that the genetic architecture of proteins is very complex, with pervasive epistatic interactions that constrain evolution and make function difficult to predict from sequence. Most of this work has analyzed only the direct paths between two proteins of interest - excluding the vast majority of possible genotypes and evolutionary trajectories - and has considered only a single protein function, leaving unaddressed the genetic architecture of functional specificity and its impact on the evolution of new functions. Here, we develop a new method based on ordinal logistic regression to directly characterize the global genetic determinants of multiple protein functions from 20-state combinatorial deep mutational scanning (DMS) experiments. We use it to dissect the genetic architecture and evolution of a transcription factor's specificity for DNA, using data from a combinatorial DMS of an ancient steroid hormone receptor's capacity to activate transcription from two biologically relevant DNA elements. We show that the genetic architecture of DNA recognition consists of a dense set of main and pairwise effects that involve virtually every possible amino acid state in the protein-DNA interface, but higher-order epistasis plays only a tiny role. Pairwise interactions enlarge the set of functional sequences and are the primary determinants of specificity for different DNA elements. They also massively expand the number of opportunities for single-residue mutations to switch specificity from one DNA target to another. By bringing variants with different functions close together in sequence space, pairwise epistasis therefore facilitates rather than constrains the evolution of new functions.
Affiliation(s)
- Brian PH Metzger
  - Department of Ecology and Evolution, University of Chicago, Chicago, United States
- Yeonwoo Park
  - Program in Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, United States
- Tyler N Starr
  - Department of Biochemistry and Molecular Biophysics, University of Chicago, Chicago, United States
- Joseph W Thornton
  - Department of Ecology and Evolution, University of Chicago, Chicago, United States
  - Department of Human Genetics, University of Chicago, Chicago, United States
3
Wagner A. Genotype sampling for deep-learning assisted experimental mapping of a combinatorially complete fitness landscape. Bioinformatics 2024; 40:btae317. [PMID: 38745436 PMCID: PMC11132821 DOI: 10.1093/bioinformatics/btae317] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Revised: 03/21/2024] [Accepted: 05/14/2024] [Indexed: 05/16/2024] Open
Abstract
MOTIVATION: Experimental characterization of fitness landscapes, which map genotypes onto fitness, is important for both evolutionary biology and protein engineering. It faces a fundamental obstacle in the astronomical number of genotypes whose fitness needs to be measured for any one protein. Deep learning may help to predict the fitness of many genotypes from a smaller neural network training sample of genotypes with experimentally measured fitness. Here I use a recently published experimentally mapped fitness landscape of more than 260 000 protein genotypes to ask how such sampling is best performed. RESULTS: I show that multilayer perceptrons, recurrent neural networks, convolutional networks, and transformers can explain more than 90% of fitness variance in the data. In addition, 90% of this performance is reached with a training sample comprising merely ≈10³ sequences. Generalization to unseen test data is best when training data is sampled randomly and uniformly, or sampled to minimize the number of synonymous sequences. In contrast, sampling to maximize sequence diversity or codon usage bias reduces performance substantially. These observations hold for more than one network architecture. Simple sampling strategies may perform best when training deep learning neural networks to map fitness landscapes from experimental data. AVAILABILITY AND IMPLEMENTATION: The fitness landscape data analyzed here is publicly available as described previously (Papkou et al. 2023). All code used to analyze this landscape is publicly available at https://github.com/andreas-wagner-uzh/fitness_landscape_sampling.
Affiliation(s)
- Andreas Wagner
  - Department of Evolutionary Biology and Environmental Studies, University of Zurich, 8057 Zurich, Switzerland
  - Swiss Institute of Bioinformatics, Quartier Sorge-Batiment Genopode, 1015 Lausanne, Switzerland
  - The Santa Fe Institute, Santa Fe, 87501 NM, United States
4
Kohlmayr JM, Grabner GF, Nusser A, Höll A, Manojlović V, Halwachs B, Masser S, Jany-Luig E, Engelke H, Zimmermann R, Stelzl U. Mutational scanning pinpoints distinct binding sites of key ATGL regulators in lipolysis. Nat Commun 2024; 15:2516. [PMID: 38514628 PMCID: PMC10958042 DOI: 10.1038/s41467-024-46937-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 03/14/2024] [Indexed: 03/23/2024] Open
Abstract
ATGL is a key enzyme in intracellular lipolysis and plays an important role in metabolic and cardiovascular diseases. ATGL is tightly regulated by a known set of protein-protein interaction partners with activating or inhibiting functions in the control of lipolysis. Here, we use deep mutational protein interaction perturbation scanning and generate comprehensive profiles of single amino acid variants that affect the interactions of ATGL with its regulatory partners: CGI-58, G0S2, PLIN1, PLIN5 and CIDEC. Twenty-three ATGL amino acid variants yield a specific interaction perturbation pattern when validated in co-immunoprecipitation experiments in mammalian cells. We identify and characterize eleven highly selective ATGL switch mutations which affect the interaction of one of the five partners without affecting the others. Switch mutations thus provide distinct interaction determinants for ATGL's key regulatory proteins at an amino acid resolution. When we test triglyceride hydrolase activity in vitro and lipolysis in cells, the activity patterns of the ATGL switch variants trace to their protein interaction profile. In the context of structural data, the integration of variant binding and activity profiles provides insights into the regulation of lipolysis and the impact of mutations in human disease.
Affiliation(s)
- Johanna M Kohlmayr
  - Institute of Pharmaceutical Sciences, Pharmaceutical Chemistry, University of Graz, Graz, Austria
- Gernot F Grabner
  - Institute of Molecular Biosciences, Biochemistry, University of Graz, Graz, Austria
  - Gottfried Schatz Research Center, Molecular Biology and Biochemistry, Medical University of Graz, Graz, Austria
- Anna Nusser
  - Institute of Pharmaceutical Sciences, Pharmaceutical Chemistry, University of Graz, Graz, Austria
- Anna Höll
  - Institute of Pharmaceutical Sciences, Pharmaceutical Chemistry, University of Graz, Graz, Austria
- Verina Manojlović
  - Institute of Pharmaceutical Sciences, Pharmaceutical Chemistry, University of Graz, Graz, Austria
- Bettina Halwachs
  - Institute of Pharmaceutical Sciences, Pharmaceutical Chemistry, University of Graz, Graz, Austria
  - Field of Excellence BioHealth - University of Graz, Graz, Austria
- Sarah Masser
  - Institute of Pharmaceutical Sciences, Pharmaceutical Chemistry, University of Graz, Graz, Austria
  - BioTechMed-Graz, Graz, Austria
- Evelyne Jany-Luig
  - Institute of Pharmaceutical Sciences, Pharmaceutical Chemistry, University of Graz, Graz, Austria
- Hanna Engelke
  - Institute of Pharmaceutical Sciences, Pharmaceutical Chemistry, University of Graz, Graz, Austria
  - Field of Excellence BioHealth - University of Graz, Graz, Austria
- Robert Zimmermann
  - Institute of Molecular Biosciences, Biochemistry, University of Graz, Graz, Austria
  - Field of Excellence BioHealth - University of Graz, Graz, Austria
  - BioTechMed-Graz, Graz, Austria
- Ulrich Stelzl
  - Institute of Pharmaceutical Sciences, Pharmaceutical Chemistry, University of Graz, Graz, Austria
  - Field of Excellence BioHealth - University of Graz, Graz, Austria
  - BioTechMed-Graz, Graz, Austria
5
Judge A, Sankaran B, Hu L, Palaniappan M, Birgy A, Prasad BVV, Palzkill T. Network of epistatic interactions in an enzyme active site revealed by large-scale deep mutational scanning. Proc Natl Acad Sci U S A 2024; 121:e2313513121. [PMID: 38483989 PMCID: PMC10962969 DOI: 10.1073/pnas.2313513121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Accepted: 02/14/2024] [Indexed: 03/19/2024] Open
Abstract
Cooperative interactions between amino acids are critical for protein function. A genetic reflection of cooperativity is epistasis, which is when a change in the amino acid at one position changes the sequence requirements at another position. To assess epistasis within an enzyme active site, we utilized CTX-M β-lactamase as a model system. CTX-M hydrolyzes β-lactam antibiotics to provide antibiotic resistance, allowing a simple functional selection for rapid sorting of modified enzymes. We created all pairwise mutations across 17 active site positions in the β-lactamase enzyme and quantitated the function of variants against two β-lactam antibiotics using next-generation sequencing. Context-dependent sequence requirements were determined by comparing the antibiotic resistance function of double mutations across the CTX-M active site to their predicted function based on the constituent single mutations, revealing both positive epistasis (synergistic interactions) and negative epistasis (antagonistic interactions) between amino acid substitutions. The resulting trends demonstrate that positive epistasis is present throughout the active site, that epistasis between residues is mediated through substrate interactions, and that residues more tolerant to substitutions serve as generic compensators which are responsible for many cases of positive epistasis. Additionally, we show that a key catalytic residue (Glu166) is amenable to compensatory mutations, and we characterize one such double mutant (E166Y/N170G) that acts by an altered catalytic mechanism. These findings shed light on the unique biochemical factors that drive epistasis within an enzyme active site and will inform enzyme engineering efforts by bridging the gap between amino acid sequence and catalytic function.
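The comparison this abstract describes, a double mutant's measured function against the function predicted from its two single mutants, can be illustrated with a minimal sketch. It assumes an additive null model on log-scale scores, which the abstract does not specify; the function names and the example numbers are hypothetical and are not taken from the paper.

```python
def additive_expectation(score_wt, score_a, score_b):
    """Expected double-mutant score if the two single-mutant effects simply add.

    Assumption: scores are on a log scale (e.g. log enrichment), so that
    independent effects are additive.
    """
    return score_a + score_b - score_wt


def pairwise_epistasis(score_wt, score_a, score_b, score_ab):
    """Observed double-mutant score minus the additive expectation.

    Positive values indicate positive (synergistic) epistasis,
    negative values indicate negative (antagonistic) epistasis.
    """
    return score_ab - additive_expectation(score_wt, score_a, score_b)


# Hypothetical log-scale scores, for illustration only.
eps = pairwise_epistasis(score_wt=0.0, score_a=-1.0, score_b=-2.0, score_ab=-1.5)
print(eps)  # 1.5: the double mutant performs better than the singles predict
```

Under this null model, a double mutant that retains more function than the sum of its single-mutant defects would be scored as positive epistasis, the pattern the abstract associates with compensatory substitutions.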
Affiliation(s)
- Allison Judge
  - Verna and Marrs McLean Department of Biochemistry and Molecular Pharmacology, Baylor College of Medicine, Houston, TX 77030
- Banumathi Sankaran
  - Department of Molecular Biophysics and Integrated Bioimaging, Berkeley Center for Structural Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720
- Liya Hu
  - Verna and Marrs McLean Department of Biochemistry and Molecular Pharmacology, Baylor College of Medicine, Houston, TX 77030
- Murugesan Palaniappan
  - Department of Pathology and Immunology, Center for Drug Discovery, Baylor College of Medicine, Houston, TX 77030
- André Birgy
  - Verna and Marrs McLean Department of Biochemistry and Molecular Pharmacology, Baylor College of Medicine, Houston, TX 77030
  - Infections, Antimicrobials, Modelling, Evolution, UMR 1137, French Institute for Medical Research (INSERM), Faculty of Health, Université Paris Cité, Paris 75006, France
- B. V. Venkataram Prasad
  - Verna and Marrs McLean Department of Biochemistry and Molecular Pharmacology, Baylor College of Medicine, Houston, TX 77030
- Timothy Palzkill
  - Verna and Marrs McLean Department of Biochemistry and Molecular Pharmacology, Baylor College of Medicine, Houston, TX 77030
6
Ding D, Shaw AY, Sinai S, Rollins N, Prywes N, Savage DF, Laub MT, Marks DS. Protein design using structure-based residue preferences. Nat Commun 2024; 15:1639. [PMID: 38388493 PMCID: PMC10884402 DOI: 10.1038/s41467-024-45621-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Accepted: 01/29/2024] [Indexed: 02/24/2024] Open
Abstract
Recent developments in protein design rely on large neural networks with up to hundreds of millions of parameters, yet it is unclear which residue dependencies are critical for determining protein function. Here, we show that amino acid preferences at individual residues (without accounting for mutation interactions) explain much and sometimes virtually all of the combinatorial mutation effects across 8 datasets (R² ~ 78-98%). Hence, few observations (~100 times the number of mutated residues) enable accurate prediction of held-out variant effects (Pearson r > 0.80). We hypothesized that the local structural contexts around a residue could be sufficient to predict mutation preferences, and develop an unsupervised approach termed CoVES (Combinatorial Variant Effects from Structure). Our results suggest that CoVES not only outperforms model-free methods but also performs similarly to complex models for creating functional and diverse protein variants. CoVES offers an effective alternative to complicated models for identifying functional protein mutations.
Affiliation(s)
- David Ding
  - Innovative Genomics Institute, University of California, Berkeley, CA, 94720, USA
- Ada Y Shaw
  - Department of Systems Biology, Harvard Medical School, Boston, MA, 02115, USA
- Sam Sinai
  - Dyno Therapeutics, Watertown, MA, 02472, USA
- Nathan Rollins
  - Seismic Therapeutics, Lab Central, Cambridge, MA, 02142, USA
- Noam Prywes
  - Innovative Genomics Institute, University of California, Berkeley, CA, 94720, USA
- David F Savage
  - Innovative Genomics Institute, University of California, Berkeley, CA, 94720, USA
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
  - Howard Hughes Medical Institute, University of California, Berkeley, CA, 94720, USA
- Michael T Laub
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
  - Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Debora S Marks
  - Department of Systems Biology, Harvard Medical School, Boston, MA, 02115, USA
7
Park Y, Metzger BP, Thornton JW. The simplicity of protein sequence-function relationships. bioRxiv 2024:2023.09.02.556057. [PMID: 37732229 PMCID: PMC10508729 DOI: 10.1101/2023.09.02.556057] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 09/22/2023]
Abstract
How complicated is the genetic architecture of proteins - the set of causal effects by which sequence determines function? High-order epistatic interactions among residues are thought to be pervasive, making a protein's function difficult to predict or understand from its sequence. Most studies, however, used methods that overestimate epistasis, because they analyze genetic architecture relative to a designated reference sequence - causing measurement noise and small local idiosyncrasies to propagate into pervasive high-order interactions - or have not effectively accounted for global nonlinearity in the sequence-function relationship. Here we present a new reference-free method that jointly estimates global nonlinearity and specific epistatic interactions across a protein's entire genotype-phenotype map. This method yields a maximally efficient explanation of a protein's genetic architecture and is more robust than existing methods to measurement noise, partial sampling, and model misspecification. We reanalyze 20 combinatorial mutagenesis experiments from a diverse set of proteins and find that additive and pairwise effects, along with a simple nonlinearity to account for limited dynamic range, explain a median of 96% of total variance in measured phenotypes (and >92% in every case). Only a tiny fraction of genotypes are strongly affected by third- or higher-order epistasis. Genetic architecture is also sparse: the number of terms required to explain the vast majority of variance is smaller than the number of genotypes by many orders of magnitude. The sequence-function relationship in most proteins is therefore far simpler than previously thought, opening the way for new and tractable approaches to characterize it.
Affiliation(s)
- Yeonwoo Park
  - Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL 60637
  - Current affiliation: Center for RNA Research, Institute for Basic Science, Seoul, Republic of Korea 08826
- Brian P.H. Metzger
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637
  - Current affiliation: Department of Biological Sciences, Purdue University, West Lafayette, IN 47907
- Joseph W. Thornton
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637
  - Department of Human Genetics, University of Chicago, Chicago, IL 60637
8
Nemoto T, Ocari T, Planul A, Tekinsoy M, Zin EA, Dalkara D, Ferrari U. ACIDES: on-line monitoring of forward genetic screens for protein engineering. Nat Commun 2023; 14:8504. [PMID: 38148337 PMCID: PMC10751290 DOI: 10.1038/s41467-023-43967-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2023] [Accepted: 11/24/2023] [Indexed: 12/28/2023] Open
Abstract
Forward genetic screens of mutated variants are a versatile strategy for protein engineering and investigation, which has been successfully applied to various studies like directed evolution (DE) and deep mutational scanning (DMS). While next-generation sequencing can track millions of variants during the screening rounds, the vast and noisy nature of the sequencing data impedes the estimation of the performance of individual variants. Here, we propose ACIDES that combines statistical inference and in-silico simulations to improve performance estimation in the library selection process by attributing accurate statistical scores to individual variants. We tested ACIDES first on a random-peptide-insertion experiment and then on multiple public datasets from DE and DMS studies. ACIDES allows experimentalists to reliably estimate variant performance on the fly and can aid protein engineering and research pipelines in a range of applications, including gene therapy.
Affiliation(s)
- Takahiro Nemoto
  - Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France
  - Graduate School of Informatics, Kyoto University, Yoshida Hon-machi, Sakyo-ku, Kyoto, 606-8501, Japan
  - Premium Research Institute for Human Metaverse Medicine (WPI-PRIMe), Osaka University, Suita, Osaka, 565-0871, Japan
- Tommaso Ocari
  - Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France
- Arthur Planul
  - Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France
- Muge Tekinsoy
  - Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France
- Emilia A Zin
  - Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France
- Deniz Dalkara
  - Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France
- Ulisse Ferrari
  - Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France
9
Xi C, Diao J, Moon TS. Advances in ligand-specific biosensing for structurally similar molecules. Cell Syst 2023; 14:1024-1043. [PMID: 38128482 PMCID: PMC10751988 DOI: 10.1016/j.cels.2023.10.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2023] [Revised: 08/23/2023] [Accepted: 10/19/2023] [Indexed: 12/23/2023]
Abstract
The specificity of biological systems makes it possible to develop biosensors targeting specific metabolites, toxins, and pollutants in complex medical or environmental samples without interference from structurally similar compounds. For the last two decades, great efforts have been devoted to creating proteins or nucleic acids with novel properties through synthetic biology strategies. Beyond augmenting biocatalytic activity, expanding target substrate scopes, and enhancing enzymes' enantioselectivity and stability, an increasing research area is the enhancement of molecular specificity for genetically encoded biosensors. Here, we summarize recent advances in the development of highly specific biosensor systems and their essential applications. First, we describe the rational design principles required to create libraries containing potential mutants with less promiscuity or better specificity. Next, we review the emerging high-throughput screening techniques to engineer biosensing specificity for the desired target. Finally, we examine the computer-aided evaluation and prediction methods to facilitate the construction of ligand-specific biosensors.
Affiliation(s)
- Chenggang Xi
  - Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, St. Louis, MO, USA
- Jinjin Diao
  - Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, St. Louis, MO, USA
- Tae Seok Moon
  - Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, St. Louis, MO, USA
  - Division of Biology and Biomedical Sciences, Washington University in St. Louis, St. Louis, MO, USA
10
Maes S, Deploey N, Peelman F, Eyckerman S. Deep mutational scanning of proteins in mammalian cells. Cell Rep Methods 2023; 3:100641. [PMID: 37963462 PMCID: PMC10694495 DOI: 10.1016/j.crmeth.2023.100641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Revised: 07/06/2023] [Accepted: 10/20/2023] [Indexed: 11/16/2023]
Abstract
Protein mutagenesis is essential for unveiling the molecular mechanisms underlying protein function in health, disease, and evolution. In the past decade, deep mutational scanning methods have evolved to support the functional analysis of nearly all possible single-amino acid changes in a protein of interest. While historically these methods were developed in lower organisms such as E. coli and yeast, recent technological advancements have resulted in the increased use of mammalian cells, particularly for studying proteins involved in human disease. These advancements will aid significantly in the classification and interpretation of variants of unknown significance, which are being discovered at large scale due to the current surge in the use of whole-genome sequencing in clinical contexts. Here, we explore the experimental aspects of deep mutational scanning studies in mammalian cells and report the different methods used in each step of the workflow, ultimately providing a useful guide toward the design of such studies.
Affiliation(s)
- Stefanie Maes
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
  - Department of Biochemistry and Microbiology, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Nick Deploey
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Frank Peelman
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Sven Eyckerman
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
11
Xie X, Sun X, Wang Y, Lehner B, Li X. Dominance vs epistasis: the biophysical origins and plasticity of genetic interactions within and between alleles. Nat Commun 2023; 14:5551. [PMID: 37689712 PMCID: PMC10492795 DOI: 10.1038/s41467-023-41188-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/08/2022] [Accepted: 08/25/2023] [Indexed: 09/11/2023] Open
Abstract
An important challenge in genetics, evolution and biotechnology is to understand and predict how mutations combine to alter phenotypes, including molecular activities, fitness and disease. In diploids, mutations in a gene can combine on the same chromosome or on different chromosomes as a "heteroallelic combination". However, a direct comparison of the extent, sign, and stability of the genetic interactions between variants within and between alleles is lacking. Here we use thermodynamic models of protein folding and ligand-binding to show that interactions between mutations within and between alleles are expected in even very simple biophysical systems. Protein folding alone generates within-allele interactions and a single molecular interaction is sufficient to cause between-allele interactions and dominance. These interactions change differently, quantitatively and qualitatively as a system becomes more complex. Altering the concentration of a ligand can, for example, switch alleles from dominant to recessive. Our results show that intra-molecular epistasis and dominance should be widely expected in even the simplest biological systems but also reinforce the view that they are plastic system properties and so a formidable challenge to predict. Accurate prediction of both intra-molecular epistasis and dominance will require either detailed mechanistic understanding and experimental parameterization or brute-force measurement and learning.
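The claim that protein folding alone generates within-allele interactions can be illustrated with a generic two-state folding model. The sketch below is an illustration only and is not the authors' model or code: it assumes that mutations add in folding free energy and that the phenotype is the fraction of folded protein. The function names, the energies and the value of RT are assumptions made for this example.

```python
import math

RT = 0.593  # kcal/mol at roughly 298 K (assumed temperature)


def fraction_folded(dg_fold):
    """Two-state model: fraction of molecules in the folded state.

    dg_fold is the folding free energy in kcal/mol (negative = stable).
    """
    return 1.0 / (1.0 + math.exp(dg_fold / RT))


def folding_epistasis(dg_wt, ddg_a, ddg_b):
    """Deviation of the double mutant from additivity on the phenotype scale.

    The two mutations are assumed to be perfectly additive in free energy,
    so any non-zero result arises purely from the sigmoidal folding curve.
    """
    f_wt = fraction_folded(dg_wt)
    f_a = fraction_folded(dg_wt + ddg_a)
    f_b = fraction_folded(dg_wt + ddg_b)
    f_ab = fraction_folded(dg_wt + ddg_a + ddg_b)
    expected = f_a + f_b - f_wt  # additive expectation for the double mutant
    return f_ab - expected


# Hypothetical energies: a stable wild type and two destabilizing mutations.
print(folding_epistasis(dg_wt=-2.0, ddg_a=1.5, ddg_b=1.5))  # negative value
```

With these assumed values the double mutant is less folded than the additive expectation, so two mutations that are independent in energy still interact at the phenotype level.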
Affiliation(s)
- Xuan Xie
  - Zhejiang University - University of Edinburgh Institute, Zhejiang University School of Medicine, Haining, 314400, P. R. China
- Xia Sun
  - Zhejiang University - University of Edinburgh Institute, Zhejiang University School of Medicine, Haining, 314400, P. R. China
  - Deanery of Biomedical Sciences, College of Medicine & Veterinary Medicine, University of Edinburgh, Edinburgh, EH8 9XD, UK
- Yuheng Wang
  - Zhejiang University - University of Edinburgh Institute, Zhejiang University School of Medicine, Haining, 314400, P. R. China
  - Deanery of Biomedical Sciences, College of Medicine & Veterinary Medicine, University of Edinburgh, Edinburgh, EH8 9XD, UK
- Ben Lehner
  - Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, 08003, Spain
  - ICREA, Pg. Luis Companys 23, Barcelona, 08010, Spain
  - Wellcome Sanger Institute, Wellcome Genome Campus Hinxton, Cambridge, CB10 1SA, UK
- Xianghua Li
  - Zhejiang University - University of Edinburgh Institute, Zhejiang University School of Medicine, Haining, 314400, P. R. China
  - Wellcome Sanger Institute, Wellcome Genome Campus Hinxton, Cambridge, CB10 1SA, UK
  - Deanery of Biomedical Sciences, College of Medicine & Veterinary Medicine, University of Edinburgh, Edinburgh, EH8 9XD, UK
  - Biomedical and Health Translational Centre of Zhejiang Province, Haizhou East Road 718, Haining, 314400, P. R. China
12
Baryshev A, La Fleur A, Groves B, Michel C, Baker D, Ljubetič A, Seelig G. Massively parallel protein-protein interaction measurement by sequencing (MP3-seq) enables rapid screening of protein heterodimers. bioRxiv 2023:2023.02.08.527770. [PMID: 36798377 PMCID: PMC9934699 DOI: 10.1101/2023.02.08.527770] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 02/11/2023]
Abstract
Protein-protein interactions (PPIs) regulate many cellular processes, and engineered PPIs have cell and gene therapy applications. Here we introduce massively parallel protein-protein interaction measurement by sequencing (MP3-seq), an easy-to-use and highly scalable yeast-two-hybrid approach for measuring PPIs. In MP3-seq, DNA barcodes are associated with specific protein pairs, and barcode enrichment can be read by sequencing to provide a direct measure of interaction strength. We show that MP3-seq is highly quantitative and scales to over 100,000 interactions. We apply MP3-seq to characterize interactions between families of rationally designed heterodimers and to investigate elements conferring specificity to coiled-coil interactions. Finally, we predict coiled heterodimer structures using AlphaFold-Multimer (AF-M) and train linear models on physics simulation energy terms to predict MP3-seq values. We find that AF-M and AF-M complex prediction-based models could be valuable for pre-screening interactions, but that measuring interactions experimentally remains necessary to rank their strengths quantitatively.
Affiliation(s)
- Alexander Baryshev
- Department of Electrical & Computer Engineering, University of Washington, Seattle, WA 98195, USA
| | - Alyssa La Fleur
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA 98195, USA
| | - Benjamin Groves
- Department of Electrical & Computer Engineering, University of Washington, Seattle, WA 98195, USA
| | - Cirstyn Michel
- Department of Bioengineering, University of Washington, Seattle, WA 98195, USA
| | - David Baker
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
- Department of Bioengineering, University of Washington, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
| | - Ajasja Ljubetič
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
- Department for Synthetic Biology and Immunology, National Institute of Chemistry, Ljubljana SI-1000, Slovenia
| | - Georg Seelig
- Department of Electrical & Computer Engineering, University of Washington, Seattle, WA 98195, USA
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA 98195, USA
| |
|
13
|
Chen L, Zhang Z, Li Z, Li R, Huo R, Chen L, Wang D, Luo X, Chen K, Liao C, Zheng M. Learning protein fitness landscapes with deep mutational scanning data from multiple sources. Cell Syst 2023; 14:706-721.e5. [PMID: 37591206 DOI: 10.1016/j.cels.2023.07.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2023] [Revised: 05/30/2023] [Accepted: 07/18/2023] [Indexed: 08/19/2023]
Abstract
One of the key points of machine learning-assisted directed evolution (MLDE) is the accurate learning of the fitness landscape, a conceptual mapping from sequence variants to the desired function. Here, we describe a multi-protein training scheme that leverages the existing deep mutational scanning data from diverse proteins to aid in understanding the fitness landscape of a new protein. Proof-of-concept trials are designed to validate this training scheme in three aspects: random and positional extrapolation for single-variant effects, zero-shot fitness predictions for new proteins, and extrapolation for higher-order variant effects from single-variant effects. Moreover, our study identified previously overlooked strong baselines, and their unexpectedly good performance brings our attention to the pitfalls of MLDE. Overall, these results may improve our understanding of the association between different protein fitness profiles and shed light on developing better machine learning-assisted approaches to the directed evolution of proteins. A record of this paper's transparent peer review process is included in the supplemental information.
Affiliation(s)
- Lin Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Zehong Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Zhenghao Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; Shanghai Institute for Advanced Immunochemical Studies, School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
| | - Rui Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; School of Pharmacy, China Pharmaceutical University, Nanjing 211198, China
| | - Ruifeng Huo
- School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China
| | - Lifan Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | | | - Xiaomin Luo
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Kaixian Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; University of Chinese Academy of Sciences, Beijing 100049, China; School of Pharmacy, China Pharmaceutical University, Nanjing 211198, China
| | - Cangsong Liao
- University of Chinese Academy of Sciences, Beijing 100049, China; Chemical Biology Research Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China.
| | - Mingyue Zheng
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; University of Chinese Academy of Sciences, Beijing 100049, China; School of Pharmacy, China Pharmaceutical University, Nanjing 211198, China; School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China.
| |
|
14
|
Boldridge WC, Ljubetič A, Kim H, Lubock N, Szilágyi D, Lee J, Brodnik A, Jerala R, Kosuri S. A multiplexed bacterial two-hybrid for rapid characterization of protein-protein interactions and iterative protein design. Nat Commun 2023; 14:4636. [PMID: 37532706 PMCID: PMC10397247 DOI: 10.1038/s41467-023-38697-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/04/2021] [Accepted: 05/11/2023] [Indexed: 08/04/2023] Open
Abstract
Protein-protein interactions (PPIs) are crucial for biological functions and have applications ranging from drug design to synthetic cell circuits. Coiled-coils have been used as a model to study the sequence determinants of specificity. However, building well-behaved sets of orthogonal pairs of coiled-coils remains challenging due to inaccurate predictions of orthogonality and difficulties in testing at scale. To address this, we develop the next-generation bacterial two-hybrid (NGB2H) method, which allows for the rapid exploration of interactions of programmed protein libraries in a quantitative and scalable way using next-generation sequencing readout. We design, build, and test large sets of orthogonal synthetic coiled-coils, assay over 8,000 PPIs, and use the dataset to train a more accurate coiled-coil scoring algorithm (iCipa). After characterizing nearly 18,000 new PPIs, we identify, to the best of our knowledge, the largest set of orthogonal coiled-coils to date, with fifteen on-target interactions. Our approach provides a powerful tool for the design of orthogonal PPIs.
Affiliation(s)
- W Clifford Boldridge
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, 90095, USA
| | - Ajasja Ljubetič
- Department of Synthetic Biology and Immunology, National Institute of Chemistry, 1000, Ljubljana, Slovenia.
- EN-FIST Centre of Excellence, 1000, Ljubljana, Slovenia.
| | - Hwangbeom Kim
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, 90095, USA
- Samsung Biologics, Incheon, Republic of Korea
| | - Nathan Lubock
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, 90095, USA
- Octant Inc, Emeryville, CA, 94608, USA
| | | | - Jonathan Lee
- Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, CA, 90095, USA
- Keck School of Medicine, University of Southern California, Los Angeles, CA, 90033, USA
| | | | - Roman Jerala
- Department of Synthetic Biology and Immunology, National Institute of Chemistry, 1000, Ljubljana, Slovenia.
- EN-FIST Centre of Excellence, 1000, Ljubljana, Slovenia.
| | - Sriram Kosuri
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, 90095, USA.
- UCLA-DOE Institute for Genomics and Proteomics, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
- Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
- Jonsson Comprehensive Cancer Center, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
- Octant Inc, Emeryville, CA, 94608, USA.
| |
|
15
|
Wagner A. Evolvability-enhancing mutations in the fitness landscapes of an RNA and a protein. Nat Commun 2023; 14:3624. [PMID: 37336901 PMCID: PMC10279741 DOI: 10.1038/s41467-023-39321-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Accepted: 06/05/2023] [Indexed: 06/21/2023] Open
Abstract
Can evolvability (the ability to produce adaptive heritable variation) itself evolve through adaptive Darwinian evolution? If so, then Darwinian evolution may help create the conditions that enable Darwinian evolution. Here I propose a framework that is suitable to address this question with available experimental data on adaptive landscapes. I introduce the notion of an evolvability-enhancing mutation, which increases the likelihood that subsequent mutations in an evolving organism, protein, or RNA molecule are adaptive. I search for such mutations in the experimentally characterized and combinatorially complete fitness landscapes of a protein and an RNA molecule. I find that such evolvability-enhancing mutations indeed exist. They constitute a small fraction of all mutations, which shift the distribution of fitness effects of subsequent mutations towards less deleterious mutations, and increase the incidence of beneficial mutations. Evolving populations which experience such mutations can evolve significantly higher fitness. The study of evolvability-enhancing mutations opens many avenues of investigation into the evolution of evolvability.
Affiliation(s)
- Andreas Wagner
- Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland.
- Swiss Institute of Bioinformatics, Quartier Sorge-Batiment Genopode, Lausanne, Switzerland.
- The Santa Fe Institute, Santa Fe, NM, USA.
| |
|
16
|
Soneson C, Bendel AM, Diss G, Stadler MB. mutscan - a flexible R package for efficient end-to-end analysis of multiplexed assays of variant effect data. Genome Biol 2023; 24:132. [PMID: 37264470 DOI: 10.1186/s13059-023-02967-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Accepted: 05/10/2023] [Indexed: 06/03/2023] Open
Abstract
Multiplexed assays of variant effect (MAVE) experimentally measure the effect of large numbers of sequence variants by selective enrichment of sequences with desirable properties followed by quantification by sequencing. mutscan is an R package for flexible analysis of such experiments, covering the entire workflow from raw reads up to statistical analysis and visualization. The core components are implemented in C++ for efficiency. Various experimental designs are supported, including single or paired reads with optional unique molecular identifiers. To find variants with changed relative abundance, mutscan employs established statistical models provided in the edgeR and limma packages. mutscan is available from https://github.com/fmicompbio/mutscan.
Affiliation(s)
- Charlotte Soneson
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland.
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland.
| | - Alexandra M Bendel
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
| | - Guillaume Diss
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
| | - Michael B Stadler
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland.
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland.
- University of Basel, Basel, Switzerland.
| |
|
17
|
Baier F, Gauye F, Perez-Carrasco R, Payne JL, Schaerli Y. Environment-dependent epistasis increases phenotypic diversity in gene regulatory networks. Sci Adv 2023; 9:eadf1773. [PMID: 37224262 DOI: 10.1126/sciadv.adf1773] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/07/2022] [Accepted: 04/17/2023] [Indexed: 05/26/2023]
Abstract
Mutations to gene regulatory networks can be maladaptive or a source of evolutionary novelty. Epistasis confounds our understanding of how mutations affect the expression patterns of gene regulatory networks, a challenge exacerbated by the dependence of epistasis on the environment. We used the toolkit of synthetic biology to systematically assay the effects of pairwise and triplet combinations of mutant genotypes on the expression pattern of a gene regulatory network expressed in Escherichia coli that interprets an inducer gradient across a spatial domain. We uncovered a preponderance of epistasis that can switch in magnitude and sign across the inducer gradient to produce a greater diversity of expression pattern phenotypes than would be possible in the absence of such environment-dependent epistasis. We discuss our findings in the context of the evolution of hybrid incompatibilities and evolutionary novelties.
Affiliation(s)
- Florian Baier
- Department of Fundamental Microbiology, University of Lausanne, Biophore Building, 1015 Lausanne, Switzerland
| | - Florence Gauye
- Department of Fundamental Microbiology, University of Lausanne, Biophore Building, 1015 Lausanne, Switzerland
| | | | - Joshua L Payne
- Institute of Integrative Biology, ETH Zurich, 8092 Zurich, Switzerland
| | - Yolanda Schaerli
- Department of Fundamental Microbiology, University of Lausanne, Biophore Building, 1015 Lausanne, Switzerland
| |
|
18
|
Bragina EY, Puzyrev VP. Genetic outline of the hermeneutics of the diseases connection phenomenon in human. Vavilovskii Zhurnal Genet Selektsii 2023; 27:7-17. [PMID: 36923482 PMCID: PMC10009484 DOI: 10.18699/vjgb-23-03] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2022] [Revised: 12/25/2022] [Accepted: 12/26/2022] [Indexed: 03/11/2023] Open
Abstract
The structure of diseases in humans is heterogeneous, which is manifested by various combinations of diseases, including comorbidities associated with a common pathogenetic mechanism, as well as diseases that rarely manifest together. Recently, there has been a growing interest in studying the patterns of development not of individual diseases but of entire families associated with common pathogenetic mechanisms and common genes involved in their development. Studies of this problem make it possible to isolate an essential genetic component that controls the formation of disease conglomerates in a complex way through functionally interacting modules of individual genes in gene networks. The purpose of this study is an analytical review of studies on various aspects of the combination of diseases. The review uses the metaphor of a hermeneutic circle to understand the structure of regular relationships between diseases, and provides a conceptual framework related to the study of multiple diseases in an individual. The existing terminology is considered in relation to them, including multimorbidity, polypathies, comorbidity, conglomerates, families, "second diseases", syntropy and others. Here we summarize the key results that are extremely useful, primarily for describing the genetic architecture of diseases of a multifactorial nature. Summaries of the research problem of the disease connection phenomenon allow us to approach the systematization and natural classification of diseases. From a practical healthcare perspective, the description of the disease connection phenomenon is crucial for expanding the clinician's interpretive horizon and moving beyond narrow, disease-specific therapeutic decisions.
Affiliation(s)
- E Yu Bragina
- Research Institute of Medical Genetics, Tomsk National Research Medical Center of the Russian Academy of Sciences, Tomsk, Russia
| | - V P Puzyrev
- Research Institute of Medical Genetics, Tomsk National Research Medical Center of the Russian Academy of Sciences, Tomsk, Russia; Siberian State Medical University, Tomsk, Russia
| |
|
19
|
Conti MM, Li R, Narváez Ramos MA, Zhu LJ, Fazzio TG, Benanti JA. Phosphosite Scanning reveals a complex phosphorylation code underlying CDK-dependent activation of Hcm1. Nat Commun 2023; 14:310. [PMID: 36658165 PMCID: PMC9852432 DOI: 10.1038/s41467-023-36035-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/01/2022] [Accepted: 01/11/2023] [Indexed: 01/20/2023] Open
Abstract
Ordered cell cycle progression is coordinated by cyclin dependent kinases (CDKs). CDKs often phosphorylate substrates at multiple sites clustered within disordered regions. However, for most substrates, it is not known which phosphosites are functionally important. We developed a high-throughput approach, Phosphosite Scanning, that tests the importance of each phosphosite within a multisite phosphorylated domain. We show that Phosphosite Scanning identifies multiple combinations of phosphosites that can regulate protein function and reveals specific phosphorylations that are required for phosphorylation at additional sites within a domain. We applied this approach to the yeast transcription factor Hcm1, a conserved regulator of mitotic genes that is critical for accurate chromosome segregation. Phosphosite Scanning revealed a complex CDK-regulatory circuit that mediates Cks1-dependent phosphorylation of key activating sites in vivo. These results illuminate the mechanism of Hcm1 activation by CDK and establish Phosphosite Scanning as a powerful tool for decoding multisite phosphorylated domains.
Affiliation(s)
- Michelle M Conti
- Department of Molecular, Cell and Cancer Biology, University of Massachusetts Chan Medical School, Worcester, MA, 01605, USA
| | - Rui Li
- Department of Molecular, Cell and Cancer Biology, University of Massachusetts Chan Medical School, Worcester, MA, 01605, USA
| | - Michelle A Narváez Ramos
- Department of Molecular, Cell and Cancer Biology, University of Massachusetts Chan Medical School, Worcester, MA, 01605, USA
| | - Lihua Julie Zhu
- Department of Molecular, Cell and Cancer Biology, University of Massachusetts Chan Medical School, Worcester, MA, 01605, USA; Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, 01605, USA
| | - Thomas G Fazzio
- Department of Molecular, Cell and Cancer Biology, University of Massachusetts Chan Medical School, Worcester, MA, 01605, USA
| | - Jennifer A Benanti
- Department of Molecular, Cell and Cancer Biology, University of Massachusetts Chan Medical School, Worcester, MA, 01605, USA.
| |
|
20
|
Wei H, Li X. Deep mutational scanning: A versatile tool in systematically mapping genotypes to phenotypes. Front Genet 2023; 14:1087267. [PMID: 36713072 PMCID: PMC9878224 DOI: 10.3389/fgene.2023.1087267] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/02/2022] [Accepted: 01/02/2023] [Indexed: 01/13/2023] Open
Abstract
Unveiling how genetic variations lead to phenotypic variations is one of the key questions in evolutionary biology, genetics, and biomedical research. Deep mutational scanning (DMS) technology has allowed the mapping of tens of thousands of genetic variations to phenotypic variations efficiently and economically. Since its first systematic introduction about a decade ago, we have witnessed the use of deep mutational scanning in many research areas leading to scientific breakthroughs. Also, the methods in each step of deep mutational scanning have become much more versatile thanks to oligo-synthesizing technology, high-throughput phenotyping methods and deep sequencing technology. However, each possible step of deep mutational scanning has its pros and cons, and some limitations still await further technological development. Here, we discuss recent scientific accomplishments achieved through deep mutational scanning and describe widely used methods in each step of deep mutational scanning. We also compare these different methods and analyze their advantages and disadvantages, providing insight into how to design a deep mutational scanning study that best suits the aims of the readers' projects.
Affiliation(s)
- Huijin Wei
- Zhejiang University-University of Edinburgh Institute, Zhejiang University, Haining, Zhejiang, China
| | - Xianghua Li
- Zhejiang University-University of Edinburgh Institute, Zhejiang University, Haining, Zhejiang, China; Deanery of Biomedical Sciences, University of Edinburgh, Edinburgh, United Kingdom; The Second Affiliated Hospital of Zhejiang University, Hangzhou, Zhejiang, China; Biomedical and Health Translational Centre of Zhejiang Province, Haining, Zhejiang, China
| |
|
21
|
Fu Y, Bedő J, Papenfuss AT, Rubin AF. Integrating deep mutational scanning and low-throughput mutagenesis data to predict the impact of amino acid variants. Gigascience 2022; 12:giad073. [PMID: 37721410 PMCID: PMC10506130 DOI: 10.1093/gigascience/giad073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Revised: 07/02/2023] [Accepted: 08/23/2023] [Indexed: 09/19/2023] Open
Abstract
BACKGROUND: Evaluating the impact of amino acid variants has been a critical challenge for studying protein function and interpreting genomic data. High-throughput experimental methods like deep mutational scanning (DMS) can measure the effect of large numbers of variants in a target protein, but because DMS studies have not been performed on all proteins, researchers also model DMS data computationally to estimate variant impacts by predictors. RESULTS: In this study, we extended a linear regression-based predictor to explore whether incorporating data from alanine scanning (AS), a widely used low-throughput mutagenesis method, would improve prediction results. To evaluate our model, we collected 146 AS datasets, mapping to 54 DMS datasets across 22 distinct proteins. CONCLUSIONS: We show that improved model performance depends on the compatibility of the DMS and AS assays, and the scale of improvement is closely related to the correlation between DMS and AS results.
Affiliation(s)
- Yunfan Fu
- The Walter and Eliza Hall Institute of Medical Research, Bioinformatics Division, 1G Royal Pde, Parkville, Victoria 3052, Australia
- The University of Melbourne, Department of Medical Biology, Parkville, Victoria 3010, Australia
| | - Justin Bedő
- The Walter and Eliza Hall Institute of Medical Research, Bioinformatics Division, 1G Royal Pde, Parkville, Victoria 3052, Australia
- The University of Melbourne, Department of Medical Biology, Parkville, Victoria 3010, Australia
| | - Anthony T Papenfuss
- The Walter and Eliza Hall Institute of Medical Research, Bioinformatics Division, 1G Royal Pde, Parkville, Victoria 3052, Australia
- The University of Melbourne, Department of Medical Biology, Parkville, Victoria 3010, Australia
- Peter MacCallum Cancer Centre, Melbourne, Victoria 3000, Australia
| | - Alan F Rubin
- The Walter and Eliza Hall Institute of Medical Research, Bioinformatics Division, 1G Royal Pde, Parkville, Victoria 3052, Australia
- The University of Melbourne, Department of Medical Biology, Parkville, Victoria 3010, Australia
| |
|
22
|
Tabet D, Parikh V, Mali P, Roth FP, Claussnitzer M. Scalable Functional Assays for the Interpretation of Human Genetic Variation. Annu Rev Genet 2022; 56:441-465. [PMID: 36055970 DOI: 10.1146/annurev-genet-072920-032107] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Indexed: 12/24/2022]
Abstract
Scalable sequence-function studies have enabled the systematic analysis and cataloging of hundreds of thousands of coding and noncoding genetic variants in the human genome. This has improved clinical variant interpretation and provided insights into the molecular, biophysical, and cellular effects of genetic variants at an astonishing scale and resolution across the spectrum of allele frequencies. In this review, we explore current applications and prospects for the field and outline the principles underlying scalable functional assay design, with a focus on the study of single-nucleotide coding and noncoding variants.
Affiliation(s)
- Daniel Tabet
- Donnelly Centre, Department of Molecular Genetics, and Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, Ontario, Canada
| | - Victoria Parikh
- Center for Inherited Cardiovascular Disease, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, California, USA
| | - Prashant Mali
- Department of Bioengineering, University of California, San Diego, California, USA
| | - Frederick P Roth
- Donnelly Centre, Department of Molecular Genetics, and Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, Ontario, Canada
| | - Melina Claussnitzer
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Center for Genomic Medicine and Endocrine Division, Massachusetts General Hospital, Boston, Massachusetts, USA
- Harvard Medical School, Harvard University, Boston, Massachusetts, USA
| |
|
23
|
Azbukina N, Zharikova A, Ramensky V. Intragenic compensation through the lens of deep mutational scanning. Biophys Rev 2022; 14:1161-1182. [PMID: 36345285 PMCID: PMC9636336 DOI: 10.1007/s12551-022-01005-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/14/2022] [Accepted: 09/26/2022] [Indexed: 12/20/2022] Open
Abstract
A significant fraction of mutations in proteins are deleterious and result in adverse consequences for protein function, stability, or interaction with other molecules. Intragenic compensation is a specific case of positive epistasis in which a neutral missense mutation cancels the effect of a deleterious mutation in the same protein. Permissive compensatory mutations facilitate protein evolution, since without them all sequences would be extremely conserved. Understanding compensatory mechanisms is an important scientific challenge at the intersection of protein biophysics and evolution. In human genetics, intragenic compensatory interactions are important since they may result in variable penetrance of pathogenic mutations or fixation of pathogenic human alleles in orthologous proteins from related species. The latter phenomenon complicates computational and clinical inference of an allele's pathogenicity. Deep mutational scanning is a relatively new technique that enables experimental studies of functional effects of thousands of mutations in proteins. We review the important aspects of the field and discuss existing limitations of current datasets. We reviewed ten published DMS datasets with quantified functional effects of single and double mutations and described rates and patterns of intragenic compensation in eight of them.
Affiliation(s)
- Nadezhda Azbukina
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
| | - Anastasia Zharikova
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
- National Medical Research Center for Therapy and Preventive Medicine, Petroverigsky per., 10, Bld.3, 101000 Moscow, Russia
| | - Vasily Ramensky
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
- National Medical Research Center for Therapy and Preventive Medicine, Petroverigsky per., 10, Bld.3, 101000 Moscow, Russia
| |
|
24
|
High-throughput approaches to functional characterization of genetic variation in yeast. Curr Opin Genet Dev 2022; 76:101979. [PMID: 36075138 DOI: 10.1016/j.gde.2022.101979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2022] [Revised: 07/29/2022] [Accepted: 08/02/2022] [Indexed: 11/20/2022]
Abstract
Expansion of sequencing efforts to include thousands of genomes is providing a fundamental resource for determining the genetic diversity that exists in a population. Now, high-throughput approaches are necessary to begin to understand the role these genotypic changes play in affecting phenotypic variation. Saccharomyces cerevisiae maintains its position as an excellent model system to determine the function of unknown variants with its exceptional genetic diversity, phenotypic diversity, and reliable genetic manipulation tools. Here, we review strategies and techniques developed in yeast that scale classic approaches of assessing variant function. These approaches improve our ability to better map quantitative trait loci at a higher resolution, even for rare variants, and are already providing greater insight into the role that different types of mutations play in phenotypic variation and evolution not just in yeast but across taxa.
|
25
|
Després PC, Cisneros AF, Alexander EMM, Sonigara R, Gagné-Thivierge C, Dubé AK, Landry CR. Asymmetrical dose responses shape the evolutionary trade-off between antifungal resistance and nutrient use. Nat Ecol Evol 2022; 6:1501-1515. [PMID: 36050399 DOI: 10.1038/s41559-022-01846-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 12/09/2021] [Accepted: 07/07/2022] [Indexed: 12/22/2022]
Abstract
Antimicrobial resistance is an emerging threat for public health. The success of resistance mutations depends on the trade-off between the benefits and costs they incur. This trade-off is largely unknown and uncharacterized for antifungals. Here, we systematically measure the effect of all amino acid substitutions in the yeast cytosine deaminase Fcy1, the target of the antifungal 5-fluorocytosine (5-FC, flucytosine). We identify over 900 missense mutations granting resistance to 5-FC, a large fraction of which appear to act through destabilization of the protein. The relationship between 5-FC resistance and growth sustained by cytosine deamination is characterized by a sharp trade-off, such that small gains in resistance universally lead to large losses in canonical enzyme function. We show that this steep relationship can be explained by differences in the dose-response functions of 5-FC and cytosine. Finally, we observe the same trade-off shape for the orthologue of FCY1 in Cryptococcus neoformans, a human pathogen. Our results provide a powerful resource and platform for interpreting drug target variants in fungal pathogens as well as unprecedented insights into resistance-function trade-offs.
Affiliation(s)
- Philippe C Després
- Département de Biochimie, de Microbiologie et de Bio-informatique, Faculté des Sciences et de Génie, Université Laval, Québec, Canada.
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec, Canada.
- PROTEO, Le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, Canada.
- Centre de Recherche sur les Données Massives, Université Laval, Québec, Canada.
| | - Angel F Cisneros
- Département de Biochimie, de Microbiologie et de Bio-informatique, Faculté des Sciences et de Génie, Université Laval, Québec, Canada
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec, Canada
- PROTEO, Le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, Canada
- Centre de Recherche sur les Données Massives, Université Laval, Québec, Canada
| | - Emilie M M Alexander
- Département de Biochimie, de Microbiologie et de Bio-informatique, Faculté des Sciences et de Génie, Université Laval, Québec, Canada
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec, Canada
- PROTEO, Le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, Canada
- Centre de Recherche sur les Données Massives, Université Laval, Québec, Canada
| | - Ria Sonigara
- Département de Biochimie, de Microbiologie et de Bio-informatique, Faculté des Sciences et de Génie, Université Laval, Québec, Canada
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec, Canada
- Centre de Recherche sur les Données Massives, Université Laval, Québec, Canada
- Département de Biologie, Faculté des Sciences et de Génie, Université Laval, Québec, Canada
| | - Cynthia Gagné-Thivierge
- Département de Biochimie, de Microbiologie et de Bio-informatique, Faculté des Sciences et de Génie, Université Laval, Québec, Canada
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec, Canada
- PROTEO, Le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, Canada
- Centre de Recherche sur les Données Massives, Université Laval, Québec, Canada
- Département de Biologie, Faculté des Sciences et de Génie, Université Laval, Québec, Canada
| | - Alexandre K Dubé
- Département de Biochimie, de Microbiologie et de Bio-informatique, Faculté des Sciences et de Génie, Université Laval, Québec, Canada
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec, Canada
- PROTEO, Le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, Canada
- Centre de Recherche sur les Données Massives, Université Laval, Québec, Canada
- Département de Biologie, Faculté des Sciences et de Génie, Université Laval, Québec, Canada
| | - Christian R Landry
- Département de Biochimie, de Microbiologie et de Bio-informatique, Faculté des Sciences et de Génie, Université Laval, Québec, Canada.
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec, Canada.
- PROTEO, Le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, Canada.
- Centre de Recherche sur les Données Massives, Université Laval, Québec, Canada.
- Département de Biologie, Faculté des Sciences et de Génie, Université Laval, Québec, Canada.
| |
|
26
|
Braberg H, Echeverria I, Kaake RM, Sali A, Krogan NJ. From systems to structure - using genetic data to model protein structures. Nat Rev Genet 2022; 23:342-354. [PMID: 35013567 PMCID: PMC8744059 DOI: 10.1038/s41576-021-00441-w] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Accepted: 11/25/2021] [Indexed: 12/11/2022]
Abstract
Understanding the effects of genetic variation is a fundamental problem in biology that requires methods to analyse both physical and functional consequences of sequence changes at systems-wide and mechanistic scales. To achieve a systems view, protein interaction networks map which proteins physically interact, while genetic interaction networks inform on the phenotypic consequences of perturbing these protein interactions. Until recently, understanding the molecular mechanisms that underlie these interactions often required biophysical methods to determine the structures of the proteins involved. The past decade has seen the emergence of new approaches based on coevolution, deep mutational scanning and genome-scale genetic or chemical-genetic interaction mapping that enable modelling of the structures of individual proteins or protein complexes. Here, we review the emerging use of large-scale genetic datasets and deep learning approaches to model protein structures and their interactions, and discuss the integration of structural data from different sources.
Affiliation(s)
- Hannes Braberg
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA, USA
- Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA, USA
| | - Ignacia Echeverria
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA, USA
- Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
| | - Robyn M Kaake
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA, USA
- Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Gladstone Institutes, San Francisco, CA, USA
| | - Andrej Sali
- Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA, USA
| | - Nevan J Krogan
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA, USA.
- Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA, USA.
- Gladstone Institutes, San Francisco, CA, USA.
- Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
| |
|
27
|
Park Y, Metzger BPH, Thornton JW. Epistatic drift causes gradual decay of predictability in protein evolution. Science 2022; 376:823-830. [PMID: 35587978 DOI: 10.1126/science.abn6895] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Indexed: 12/11/2022]
Abstract
Epistatic interactions can make the outcomes of evolution unpredictable, but no comprehensive data are available on the extent and temporal dynamics of changes in the effects of mutations as protein sequences evolve. Here, we use phylogenetic deep mutational scanning to measure the functional effect of every possible amino acid mutation in a series of ancestral and extant steroid receptor DNA binding domains. Across 700 million years of evolution, epistatic interactions caused the effects of most mutations to become decorrelated from their initial effects and their windows of evolutionary accessibility to open and close transiently. Most effects changed gradually and without bias at rates that were largely constant across time, indicating a neutral process caused by many weak epistatic interactions. Our findings show that protein sequences drift inexorably into contingency and unpredictability, but that the process is statistically predictable, given sufficient phylogenetic and experimental data.
Affiliation(s)
- Yeonwoo Park
- Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL, USA
| | - Brian P H Metzger
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
| | - Joseph W Thornton
- Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL, USA; Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA; Department of Human Genetics, University of Chicago, Chicago, IL, USA
| |
|
28
|
Ding D, Green AG, Wang B, Lite TLV, Weinstein EN, Marks DS, Laub MT. Co-evolution of interacting proteins through non-contacting and non-specific mutations. Nat Ecol Evol 2022; 6:590-603. [PMID: 35361892 PMCID: PMC9090974 DOI: 10.1038/s41559-022-01688-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 09/25/2021] [Accepted: 01/31/2022] [Indexed: 01/08/2023]
Abstract
Proteins often accumulate neutral mutations that do not affect current functions but can profoundly influence future mutational possibilities and functions. Understanding such hidden potential has major implications for protein design and evolutionary forecasting but has been limited by a lack of systematic efforts to identify potentiating mutations. Here, through the comprehensive analysis of a bacterial toxin-antitoxin system, we identified all possible single substitutions in the toxin that enable it to tolerate otherwise interface-disrupting mutations in its antitoxin. Strikingly, the majority of enabling mutations in the toxin do not contact and promote tolerance non-specifically to many different antitoxin mutations, despite covariation in homologues occurring primarily between specific pairs of contacting residues across the interface. In addition, the enabling mutations we identified expand future mutational paths that both maintain old toxin-antitoxin interactions and form new ones. These non-specific mutations are missed by widely used covariation and machine learning methods. Identifying such enabling mutations will be critical for ensuring continued binding of therapeutically relevant proteins, such as antibodies, aimed at evolving targets.
Affiliation(s)
- David Ding
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Anna G Green
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Boyuan Wang
- Department of Pharmacology, UT Southwestern Medical Center, Dallas, TX, USA
| | - Thuy-Lan Vo Lite
- Harvard-MIT Division of Health Sciences and Technology, Harvard Medical School, Boston, MA, USA
| | | | - Debora S Marks
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Michael T Laub
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, MA, USA.
| |
|
29
|
Faure AJ, Domingo J, Schmiedel JM, Hidalgo-Carcedo C, Diss G, Lehner B. Mapping the energetic and allosteric landscapes of protein binding domains. Nature 2022; 604:175-183. [PMID: 35388192 DOI: 10.1038/s41586-022-04586-4] [Citation(s) in RCA: 83] [Impact Index Per Article: 41.5] [Received: 09/14/2021] [Accepted: 02/25/2022] [Indexed: 11/09/2022]
Abstract
Allosteric communication between distant sites in proteins is central to biological regulation but still poorly characterized, limiting understanding, engineering and drug development [1-6]. An important reason for this is the lack of methods to comprehensively quantify allostery in diverse proteins. Here we address this shortcoming and present a method that uses deep mutational scanning to globally map allostery. The approach uses an efficient experimental design to infer en masse the causal biophysical effects of mutations by quantifying multiple molecular phenotypes (here we examine binding and protein abundance) in multiple genetic backgrounds and fitting thermodynamic models using neural networks. We apply the approach to two of the most common protein interaction domains found in humans, an SH3 domain and a PDZ domain, to produce comprehensive atlases of allosteric communication. Allosteric mutations are abundant, with a large mutational target space of network-altering 'edgetic' variants. Mutations are more likely to be allosteric closer to binding interfaces, at glycine residues and at specific residues connecting to an opposite surface within the PDZ domain. This general approach of quantifying mutational effects for multiple molecular phenotypes and in multiple genetic backgrounds should enable the energetic and allosteric landscapes of many proteins to be rapidly and comprehensively mapped.
Affiliation(s)
- Andre J Faure
- Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Júlia Domingo
- Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; New York Genome Center (NYGC), New York, NY, USA
| | - Jörn M Schmiedel
- Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Cristina Hidalgo-Carcedo
- Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Guillaume Diss
- Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
| | - Ben Lehner
- Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain.
| |
|
30
|
Environmental selection and epistasis in an empirical phenotype-environment-fitness landscape. Nat Ecol Evol 2022; 6:427-438. [PMID: 35210579 DOI: 10.1038/s41559-022-01675-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/19/2021] [Accepted: 12/14/2021] [Indexed: 11/08/2022]
Abstract
Fitness landscapes, mappings of genotype/phenotype to their effects on fitness, are invaluable concepts in evolutionary biochemistry. Although widely discussed, measurements of phenotype-fitness landscapes in proteins remain scarce. Here, we quantify all single mutational effects on fitness and phenotype (EC50) of VIM-2 β-lactamase across a 64-fold range of ampicillin concentrations. We then construct a phenotype-fitness landscape that takes variations in environmental selection pressure into account. We found that a simple, empirical landscape accurately models the ~39,000 mutational data points, suggesting that the evolution of VIM-2 can be predicted on the basis of the selection environment. Our landscape provides new quantitative knowledge on the evolution of the β-lactamases and proteins in general, particularly their evolutionary dynamics under subinhibitory antibiotic concentrations, as well as the mechanisms and environmental dependence of non-specific epistasis.
|
31
|
Scheele RA, Lindenburg LH, Petek M, Schober M, Dalby KN, Hollfelder F. Droplet-based screening of phosphate transfer catalysis reveals how epistasis shapes MAP kinase interactions with substrates. Nat Commun 2022; 13:844. [PMID: 35149678 PMCID: PMC8837617 DOI: 10.1038/s41467-022-28396-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/07/2021] [Accepted: 01/10/2022] [Indexed: 11/20/2022] Open
Abstract
The combination of ultrahigh-throughput screening and sequencing informs on function and intragenic epistasis within combinatorial protein mutant libraries. Establishing a droplet-based, in vitro compartmentalised approach for robust expression and screening of protein kinase cascades (>10^7 variants/day) allowed us to dissect the intrinsic molecular features of the MKK-ERK signalling pathway, without interference from endogenous cellular components. In a six-residue combinatorial library of the MKK1 docking domain, we identified 29,563 sequence permutations that allow MKK1 to efficiently phosphorylate and activate its downstream target kinase ERK2. A flexibly placed hydrophobic sequence motif emerges which is defined by higher order epistatic interactions between six residues, suggesting synergy that enables high connectivity in the sequence landscape. Through positive epistasis, MKK1 maintains function during mutagenesis, establishing the importance of co-dependent residues in mammalian protein kinase-substrate interactions, and creating a scenario for the evolution of diverse human signalling networks. Here, the authors use a droplet-based screen for phosphate transfer catalysis, testing variants of the human protein kinase MKK1 for its ability to activate its downstream target ERK2. Data reveal a flexible motif in the MKK1 docking domain that promotes efficient activation of ERK2, and suggest epistasis between the residues within that sequence.
Affiliation(s)
- Remkes A Scheele
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK
| | | | - Maya Petek
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK; Faculty of Medicine, University of Maribor, SI-2000, Maribor, Slovenia
| | - Markus Schober
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK
| | - Kevin N Dalby
- Division of Chemical Biology and Medicinal Chemistry, The University of Texas at Austin, Austin, TX, 78712, USA
| | - Florian Hollfelder
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK.
| |
|
32
|
Kunowska N, Stelzl U. Decoding the cellular effects of genetic variation through interaction proteomics. Curr Opin Chem Biol 2022; 66:102100. [PMID: 34801969 DOI: 10.1016/j.cbpa.2021.102100] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/16/2021] [Revised: 10/07/2021] [Accepted: 10/14/2021] [Indexed: 12/24/2022]
Abstract
It is often unclear how genetic variation translates into cellular phenotypes, including how much of the coding variation can be recovered in the proteome. Proteogenomic analyses of heterogeneous cell lines revealed that the genetic differences impact mostly the abundance and stoichiometry of protein complexes, with the effects propagating post-transcriptionally via protein interactions onto other subunits. Conversely, large-scale binary interaction analyses of missense variants revealed that loss of interaction is widespread and caused by about 50% of disease-associated mutations, while deep scanning mutagenesis of binary interactions identified thousands of interaction-deficient variants per interaction. The idea that phenotypes arise from genetic variation through protein-protein interaction is therefore substantiated by both forward and reverse interaction proteomics. With improved methodologies, these two approaches combined can close the knowledge gap between nucleotide sequence variation and its functional consequences on the cellular proteome.
Affiliation(s)
- Natalia Kunowska
- Institute of Pharmaceutical Sciences, Pharmaceutical Chemistry, University of Graz, Austria
| | - Ulrich Stelzl
- Institute of Pharmaceutical Sciences, Pharmaceutical Chemistry, University of Graz, Austria; BioTechMed-Graz, Austria; Field of Excellence BioHealth - University of Graz, Austria.
| |
|
33
|
Staller MV, Ramirez E, Kotha SR, Holehouse AS, Pappu RV, Cohen BA. Directed mutational scanning reveals a balance between acidic and hydrophobic residues in strong human activation domains. Cell Syst 2022; 13:334-345.e5. [PMID: 35120642 PMCID: PMC9241528 DOI: 10.1016/j.cels.2022.01.002] [Citation(s) in RCA: 37] [Impact Index Per Article: 18.5] [Received: 04/15/2021] [Revised: 10/20/2021] [Accepted: 01/05/2022] [Indexed: 01/01/2023]
Abstract
Acidic activation domains are intrinsically disordered regions of the transcription factors that bind coactivators. The intrinsic disorder and low evolutionary conservation of activation domains have made it difficult to identify the sequence features that control activity. To address this problem, we designed thousands of variants in seven acidic activation domains and measured their activities with a high-throughput assay in human cell culture. We found that strong activation domain activity requires a balance between the number of acidic residues and aromatic and leucine residues. These findings motivated a predictor of acidic activation domains that scans the human proteome for clusters of aromatic and leucine residues embedded in regions of high acidity. This predictor identifies known activation domains and accurately predicts previously unidentified ones. Our results support a flexible acidic exposure model of activation domains in which the acidic residues solubilize hydrophobic motifs so that they can interact with coactivators. A record of this paper’s transparent peer review process is included in the supplemental information. Transcriptional activation domains are poorly conserved, intrinsically disordered regions of the transcription factors that remain difficult to predict from protein sequences. A high-throughput method reveals how strong activation domains require a balance between acidic and hydrophobic residues. This balance powers an accurate predictor of activation domains on human transcription factors.
Affiliation(s)
- Max V Staller
- Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine in St. Louis, Saint Louis, MO 63110, USA; Department of Genetics, Washington University School of Medicine in St. Louis, Saint Louis, MO 63110, USA; Center for Computational Biology, University of California Berkeley, Berkeley, CA 94720, USA.
| | - Eddie Ramirez
- Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine in St. Louis, Saint Louis, MO 63110, USA; Department of Genetics, Washington University School of Medicine in St. Louis, Saint Louis, MO 63110, USA
| | - Sanjana R Kotha
- Center for Computational Biology, University of California Berkeley, Berkeley, CA 94720, USA
| | - Alex S Holehouse
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine in St. Louis, Saint Louis, MO 63110, USA; Center for Science and Engineering of Living Systems, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Rohit V Pappu
- Center for Science and Engineering of Living Systems, Washington University in St. Louis, St. Louis, MO 63130, USA; Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Barak A Cohen
- Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine in St. Louis, Saint Louis, MO 63110, USA; Department of Genetics, Washington University School of Medicine in St. Louis, Saint Louis, MO 63110, USA.
| |
|
34
|
Mutant libraries reveal negative design shielding proteins from supramolecular self-assembly and relocalization in cells. Proc Natl Acad Sci U S A 2022; 119:2101117119. [PMID: 35078932 PMCID: PMC8812688 DOI: 10.1073/pnas.2101117119] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Accepted: 11/16/2021] [Indexed: 01/07/2023] Open
Abstract
Genetic mutations fuel organismal evolution but can also cause disease. As proteins are the cell’s workhorses, the ways in which mutations can disrupt their structure, stability, function, and interactions have been studied extensively. However, proteins evolve and function in a cellular context, and our ability to relate changes in protein sequence to cell-level phenotypes remains limited. In particular, the molecular mechanism underlying most disease-associated mutations is unknown. Here, we show that mutations changing a protein’s surface chemistry can dramatically impact its supramolecular self-assembly and localization in the cell. These results highlight the complex nature of genotype–phenotype relationships with a simple system. Understanding the molecular consequences of mutations in proteins is essential to map genotypes to phenotypes and interpret the increasing wealth of genomic data. While mutations are known to disrupt protein structure and function, their potential to create new structures and localization phenotypes has not yet been mapped to a sequence space. To map this relationship, we employed two homo-oligomeric protein complexes in which the internal symmetry exacerbates the impact of mutations. We mutagenized three surface residues of each complex and monitored the mutations’ effect on localization and assembly phenotypes in yeast cells. While surface mutations are classically viewed as benign, our analysis of several hundred mutants revealed they often trigger three main phenotypes in these proteins: nuclear localization, the formation of puncta, and fibers. Strikingly, more than 50% of random mutants induced one of these phenotypes in both complexes. Analyzing the mutants’ sequences showed that surface stickiness and net charge are two key physicochemical properties associated with these changes. In one complex, more than 60% of mutants self-assembled into fibers. Such a high frequency is explained by negative design: charged residues shield the complex from self-interacting with copies of itself, and the sole removal of the charges induces its supramolecular self-assembly. A subsequent analysis of several other complexes targeted with alanine mutations suggested that such negative design is common. These results highlight that minimal perturbations in protein surfaces’ physicochemical properties can frequently drive assembly and localization changes in a cellular context.
|
35
|
Dubé AK, Dandage R, Dibyachintan S, Dionne U, Després PC, Landry CR. Deep Mutational Scanning of Protein-Protein Interactions Between Partners Expressed from Their Endogenous Loci In Vivo. Methods Mol Biol 2022; 2477:237-259. [PMID: 35524121 DOI: 10.1007/978-1-0716-2257-5_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Deep mutational scanning (DMS) generates mutants of a protein of interest in a comprehensive manner. CRISPR-Cas9 technology enables large-scale genome editing with high efficiency. Using both DMS and CRISPR-Cas9 therefore allows us to investigate the effects of thousands of mutations inserted directly in the genome. Combined with protein-fragment complementation assay (PCA), which enables the quantitative measurement of protein-protein interactions (PPIs) in vivo, these methods allow for the systematic assessment of the effects of mutations on PPIs in living cells. Here, we describe a method leveraging DMS, CRISPR-Cas9, and PCA to study the effect of point mutations on PPIs mediated by protein domains in yeast.
Affiliation(s)
- Alexandre K Dubé
- Département de Biochimie, Microbiologie et Bio-informatique, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada.
- PROTEO, le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, QC, Canada.
- Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada.
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec, QC, Canada.
- Département de Biologie, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada.
| | - Rohan Dandage
- Département de Biochimie, Microbiologie et Bio-informatique, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada
- PROTEO, le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, QC, Canada
- Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec, QC, Canada
- Département de Biologie, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada
| | - Soham Dibyachintan
- Département de Biochimie, Microbiologie et Bio-informatique, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada
- PROTEO, le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, QC, Canada
- Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada
- Département de Biologie, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada
- Department of Chemical Engineering, Indian Institute of Technology Bombay (IIT), Powai, Mumbai, Maharashtra, India
| | - Ugo Dionne
- PROTEO, le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, QC, Canada
- Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec, QC, Canada
- Centre de recherche du Centre Hospitalier Universitaire (CHU) de Québec, Université Laval, Québec, QC, Canada
- Centre de recherche sur le cancer de l'Université Laval, Québec, QC, Canada
| | - Philippe C Després
- Département de Biochimie, Microbiologie et Bio-informatique, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada
- PROTEO, le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, QC, Canada
- Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec, QC, Canada
| | - Christian R Landry
- Département de Biochimie, Microbiologie et Bio-informatique, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada.
- PROTEO, le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, QC, Canada.
- Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada.
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec, QC, Canada.
- Département de Biologie, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada.
| |
|
36
|
Luo Y, Jiang G, Yu T, Liu Y, Vo L, Ding H, Su Y, Qian WW, Zhao H, Peng J. ECNet is an evolutionary context-integrated deep learning framework for protein engineering. Nat Commun 2021; 12:5743. [PMID: 34593817 PMCID: PMC8484459 DOI: 10.1038/s41467-021-25976-8] [Citation(s) in RCA: 48] [Impact Index Per Article: 16.0] [Received: 04/10/2021] [Accepted: 09/09/2021] [Indexed: 11/28/2022] Open
Abstract
Machine learning has been increasingly used for protein engineering. However, because the general sequence contexts they capture are not specific to the protein being engineered, the accuracy of existing machine learning algorithms is rather limited. Here, we report ECNet (evolutionary context-integrated neural network), a deep-learning algorithm that exploits evolutionary contexts to predict functional fitness for protein engineering. This algorithm integrates local evolutionary context from homologous sequences that explicitly model residue-residue epistasis for the protein of interest with the global evolutionary context that encodes rich semantic and structural features from the enormous protein sequence universe. As such, it enables accurate mapping from sequence to function and provides generalization from low-order mutants to higher-order mutants. We show that ECNet predicts the sequence-function relationship more accurately as compared to existing machine learning algorithms by using ~50 deep mutational scanning and random mutagenesis datasets. Moreover, we used ECNet to guide the engineering of TEM-1 β-lactamase and identified variants with improved ampicillin resistance with high success rates.
Affiliation(s)
- Yunan Luo
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA
| | - Guangde Jiang
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA
| | - Tianhao Yu
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA
| | - Yang Liu
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA
| | - Lam Vo
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA
| | - Hantian Ding
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA
| | - Yufeng Su
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA
| | - Wesley Wei Qian
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA
| | - Huimin Zhao
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA.
| | - Jian Peng
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA.
| |
|
37
|
Abstract
Duplication and divergence is a major mechanism by which new proteins and functions emerge in biology. Consequently, most organisms, in all domains of life, have genomes that encode large paralogous families of proteins. For recently duplicated pathways to acquire different, independent functions, the two paralogs must acquire mutations that effectively insulate them from one another. For instance, paralogous signaling proteins must acquire mutations that endow them with different interaction specificities such that they can participate in different signaling pathways without disruptive cross-talk. Although duplicated genes undoubtedly shape each other's evolution as they diverge and attain new functions, it is less clear how other paralogs impact or constrain gene duplication. Does the establishment of a new pathway by duplication and divergence require the system-wide optimization of all paralogs? The answer has profound implications for molecular evolution and our ability to engineer biological systems. Here, we discuss models, experiments, and approaches for tackling this question, and for understanding how new proteins and pathways are born.
Affiliation(s)
- Conor J McClune
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Chemical Engineering and ChEM-H Institute, Stanford University, Stanford, CA 94305, USA
| | - Michael T Laub
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
| |
|
38
|
Moesslacher CS, Kohlmayr JM, Stelzl U. Exploring absent protein function in yeast: assaying post translational modification and human genetic variation. Microbial Cell (Graz, Austria) 2021; 8:164-183. [PMID: 34395585 PMCID: PMC8329848 DOI: 10.15698/mic2021.08.756] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/08/2021] [Revised: 06/13/2021] [Accepted: 06/18/2021] [Indexed: 01/08/2023]
Abstract
Yeast is a valuable eukaryotic model organism that has evolved many processes conserved up to humans, yet many protein functions, including certain DNA and protein modifications, are absent. It is this absence of protein function that is fundamental to approaches using yeast as an in vivo test system to investigate human proteins. Functionality of the heterologous expressed proteins is connected to a quantitative, selectable phenotype, enabling the systematic analyses of mechanisms and specificity of DNA modification, post-translational protein modifications as well as the impact of annotated cancer mutations and coding variation on protein activity and interaction. Through continuous improvements of yeast screening systems, this is increasingly carried out on a global scale using deep mutational scanning approaches. Here we discuss the applicability of yeast systems to investigate absent human protein function with a specific focus on the impact of protein variation on protein-protein interaction modulation.
Affiliation(s)
- Christina S Moesslacher
- Institute of Pharmaceutical Sciences and BioTechMed-Graz, University of Graz, Graz, Austria
- Contributed equally to the writing of this review
| | - Johanna M Kohlmayr
- Institute of Pharmaceutical Sciences and BioTechMed-Graz, University of Graz, Graz, Austria
- Contributed equally to the writing of this review
| | - Ulrich Stelzl
- Institute of Pharmaceutical Sciences and BioTechMed-Graz, University of Graz, Graz, Austria
- Contributed equally to the writing of this review
| |
|
39
|
Radaeva M, Ton AT, Hsing M, Ban F, Cherkasov A. Drugging the 'undruggable'. Therapeutic targeting of protein-DNA interactions with the use of computer-aided drug discovery methods. Drug Discov Today 2021; 26:2660-2679. [PMID: 34332092 DOI: 10.1016/j.drudis.2021.07.018] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 05/13/2021] [Revised: 06/22/2021] [Accepted: 07/17/2021] [Indexed: 02/09/2023]
Abstract
Transcription factors (TFs) act as major oncodrivers in many cancers and are frequently regarded as high-value therapeutic targets. The functionality of TFs relies on direct protein-DNA interactions, which are notoriously difficult to target with small molecules. However, this prior view of the 'undruggability' of protein-DNA interfaces has shifted substantially in recent years, in part because of significant advances in computer-aided drug discovery (CADD). In this review, we highlight recent examples of successful CADD campaigns resulting in drug candidates that directly interfere with protein-DNA interactions of several key cancer TFs, including androgen receptor (AR), ETS-related gene (ERG), MYC, thymocyte selection-associated high mobility group box protein (TOX), topoisomerase II (TOP2), and signal transducer and activator of transcription 3 (STAT3). Importantly, these findings open novel and compelling avenues for therapeutic targeting of over 1600 human TFs implicated in many conditions including and beyond cancer.
Affiliation(s)
- Mariia Radaeva
- Vancouver Prostate Centre and the Department of Urologic Sciences, University of British Columbia, 2660 Oak Street, Vancouver, BC V6H 3Z6, Canada
| | - Anh-Tien Ton
- Vancouver Prostate Centre and the Department of Urologic Sciences, University of British Columbia, 2660 Oak Street, Vancouver, BC V6H 3Z6, Canada
| | - Michael Hsing
- Vancouver Prostate Centre and the Department of Urologic Sciences, University of British Columbia, 2660 Oak Street, Vancouver, BC V6H 3Z6, Canada
| | - Fuqiang Ban
- Vancouver Prostate Centre and the Department of Urologic Sciences, University of British Columbia, 2660 Oak Street, Vancouver, BC V6H 3Z6, Canada
| | - Artem Cherkasov
- Vancouver Prostate Centre and the Department of Urologic Sciences, University of British Columbia, 2660 Oak Street, Vancouver, BC V6H 3Z6, Canada.
| |
|
40
|
|
41
|
Dandage R, Berger CM, Gagnon-Arsenault I, Moon KM, Stacey RG, Foster LJ, Landry CR. Frequent Assembly of Chimeric Complexes in the Protein Interaction Network of an Interspecies Yeast Hybrid. Mol Biol Evol 2021; 38:1384-1401. [PMID: 33252673 PMCID: PMC8042767 DOI: 10.1093/molbev/msaa298] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/17/2022] Open
Abstract
Hybrids between species often show extreme phenotypes, including some that take place at the molecular level. In this study, we investigated the phenotypes of an interspecies diploid hybrid in terms of protein–protein interactions inferred from protein correlation profiling. We used two yeast species, Saccharomyces cerevisiae and Saccharomyces uvarum, which are interfertile, but yet have proteins diverged enough to be differentiated using mass spectrometry. Most of the protein–protein interactions are similar between hybrid and parents, and are consistent with the assembly of chimeric complexes, which we validated using an orthogonal approach for the prefoldin complex. We also identified instances of altered protein–protein interactions in the hybrid, for instance, in complexes related to proteostasis and in mitochondrial protein complexes. Overall, this study uncovers the likely frequent occurrence of chimeric protein complexes with few exceptions, which may result from incompatibilities or imbalances between the parental proteomes.
Affiliation(s)
- Rohan Dandage
- Département de Biochimie, Microbiologie et Bio-informatique, Faculté des Sciences et de Génie, Université Laval, Québec, QC, Canada; PROTEO, Le Réseau Québécois de Recherche sur la Fonction, la Structure et L'ingénierie des Protéines, Université Laval, Québec, QC, Canada; Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada; Département de Biologie, Faculté des Sciences et de Génie, Université Laval, Québec, QC, Canada
| | - Caroline M Berger
- Département de Biochimie, Microbiologie et Bio-informatique, Faculté des Sciences et de Génie, Université Laval, Québec, QC, Canada; PROTEO, Le Réseau Québécois de Recherche sur la Fonction, la Structure et L'ingénierie des Protéines, Université Laval, Québec, QC, Canada; Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada; Département de Biologie, Faculté des Sciences et de Génie, Université Laval, Québec, QC, Canada
| | - Isabelle Gagnon-Arsenault
- Département de Biochimie, Microbiologie et Bio-informatique, Faculté des Sciences et de Génie, Université Laval, Québec, QC, Canada; PROTEO, Le Réseau Québécois de Recherche sur la Fonction, la Structure et L'ingénierie des Protéines, Université Laval, Québec, QC, Canada; Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada; Département de Biologie, Faculté des Sciences et de Génie, Université Laval, Québec, QC, Canada
| | - Kyung-Mee Moon
- Department of Biochemistry & Molecular Biology, and Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
| | - Richard Greg Stacey
- Department of Biochemistry & Molecular Biology, and Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
| | - Leonard J Foster
- Department of Biochemistry & Molecular Biology, and Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
| | - Christian R Landry
- Département de Biochimie, Microbiologie et Bio-informatique, Faculté des Sciences et de Génie, Université Laval, Québec, QC, Canada; PROTEO, Le Réseau Québécois de Recherche sur la Fonction, la Structure et L'ingénierie des Protéines, Université Laval, Québec, QC, Canada; Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada; Département de Biologie, Faculté des Sciences et de Génie, Université Laval, Québec, QC, Canada
| |
|
42
|
Manrubia S, Cuesta JA, Aguirre J, Ahnert SE, Altenberg L, Cano AV, Catalán P, Diaz-Uriarte R, Elena SF, García-Martín JA, Hogeweg P, Khatri BS, Krug J, Louis AA, Martin NS, Payne JL, Tarnowski MJ, Weiß M. From genotypes to organisms: State-of-the-art and perspectives of a cornerstone in evolutionary dynamics. Phys Life Rev 2021; 38:55-106. [PMID: 34088608 DOI: 10.1016/j.plrev.2021.03.004] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 11/24/2020] [Accepted: 03/01/2021] [Indexed: 12/21/2022]
Abstract
Understanding how genotypes map onto phenotypes, fitness, and eventually organisms is arguably the next major missing piece in a fully predictive theory of evolution. We refer to this generally as the problem of the genotype-phenotype map. Though we are still far from achieving a complete picture of these relationships, our current understanding of simpler questions, such as the structure induced in the space of genotypes by sequences mapped to molecular structures, has revealed important facts that deeply affect the dynamical description of evolutionary processes. Empirical evidence supporting the fundamental relevance of features such as phenotypic bias is mounting as well, while the synthesis of conceptual and experimental progress leads to questioning current assumptions on the nature of evolutionary dynamics, with cancer progression models and synthetic biology approaches being notable examples. This work delves with a critical and constructive attitude into our current knowledge of how genotypes map onto molecular phenotypes and organismal functions, and discusses theoretical and empirical avenues to broaden and improve this comprehension. As a final goal, this community should aim at deriving an updated picture of evolutionary processes soundly relying on the structural properties of genotype spaces, as revealed by modern techniques of molecular and functional analysis.
Affiliation(s)
- Susanna Manrubia
- Department of Systems Biology, Centro Nacional de Biotecnología (CSIC), Madrid, Spain; Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain.
| | - José A Cuesta
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Departamento de Matemáticas, Universidad Carlos III de Madrid, Leganés, Spain; Instituto de Biocomputación y Física de Sistemas Complejos (BiFi), Universidad de Zaragoza, Spain; UC3M-Santander Big Data Institute (IBiDat), Getafe, Madrid, Spain
| | - Jacobo Aguirre
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Centro de Astrobiología, CSIC-INTA, ctra. de Ajalvir km 4, 28850 Torrejón de Ardoz, Madrid, Spain
| | - Sebastian E Ahnert
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, UK; The Alan Turing Institute, British Library, 96 Euston Road, London NW1 2DB, UK
| | | | - Alejandro V Cano
- Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Pablo Catalán
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Departamento de Matemáticas, Universidad Carlos III de Madrid, Leganés, Spain
| | - Ramon Diaz-Uriarte
- Department of Biochemistry, Universidad Autónoma de Madrid, Madrid, Spain; Instituto de Investigaciones Biomédicas "Alberto Sols" (UAM-CSIC), Madrid, Spain
| | - Santiago F Elena
- Instituto de Biología Integrativa de Sistemas, I(2)SysBio (CSIC-UV), València, Spain; The Santa Fe Institute, Santa Fe, NM, USA
| | | | - Paulien Hogeweg
- Theoretical Biology and Bioinformatics Group, Utrecht University, the Netherlands
| | - Bhavin S Khatri
- The Francis Crick Institute, London, UK; Department of Life Sciences, Imperial College London, London, UK
| | - Joachim Krug
- Institute for Biological Physics, University of Cologne, Köln, Germany
| | - Ard A Louis
- Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford, UK
| | - Nora S Martin
- Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, UK; Sainsbury Laboratory, University of Cambridge, Cambridge, UK
| | - Joshua L Payne
- Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | | | - Marcel Weiß
- Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, UK; Sainsbury Laboratory, University of Cambridge, Cambridge, UK
| |
|
43
|
Routh S, Acharyya A, Dhar R. A two-step PCR assembly for construction of gene variants across large mutational distances. Biol Methods Protoc 2021; 6:bpab007. [PMID: 33928191 PMCID: PMC8062255 DOI: 10.1093/biomethods/bpab007] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/13/2021] [Revised: 03/09/2021] [Accepted: 04/01/2021] [Indexed: 11/14/2022] Open
Abstract
Construction of empirical fitness landscapes has transformed our understanding of genotype–phenotype relationships across genes. However, most empirical fitness landscapes have been constrained to the local genotype neighbourhood of a gene primarily due to our limited ability to systematically construct genotypes that differ by a large number of mutations. Although a few methods have been proposed in the literature, these techniques are complex owing to several steps of construction or contain a large number of amplification cycles that increase chances of non-specific mutations. A few other described methods require amplification of the whole vector, thereby increasing the chances of vector backbone mutations that can have unintended consequences for study of fitness landscapes. Thus, this has substantially constrained us from traversing large mutational distances in the genotype network, thereby limiting our understanding of the interactions between multiple mutations and the role these interactions play in evolution of novel phenotypes. In the current work, we present a simple but powerful approach that allows us to systematically and accurately construct gene variants at large mutational distances. Our approach relies on building-up small fragments containing targeted mutations in the first step followed by assembly of these fragments into the complete gene fragment by polymerase chain reaction (PCR). We demonstrate the utility of our approach by constructing variants that differ by up to 11 mutations in a model gene. Our work thus provides an accurate method for construction of multi-mutant variants of genes and therefore will transform the studies of empirical fitness landscapes by enabling exploration of genotypes that are far away from a starting genotype.
Affiliation(s)
- Shreya Routh
- Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, West Bengal, India
| | - Anamika Acharyya
- Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, West Bengal, India
| | - Riddhiman Dhar
- Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, West Bengal, India
- Correspondence address. Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, West Bengal, India. Tel. +91-3222-304562; E-mail:
| |
|
44
|
Sruthi CK, Prakash MK. Disentangling the Contribution of Each Descriptive Characteristic of Every Single Mutation to Its Functional Effects. J Chem Inf Model 2021; 61:2090-2098. [PMID: 33754712 DOI: 10.1021/acs.jcim.0c01223] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
Mutational effects predictions continue to improve in accuracy as advanced artificial intelligence (AI) algorithms are trained on exhaustive experimental data. The next natural questions to ask are if it is possible to gain insights into which attribute of the mutation contributes how much to the mutational effects and if one can develop universal rules for mapping the descriptors to mutational effects. In this work, we mainly address the former aspect using a framework of interpretable AI. Relations between the physicochemical descriptors and their contributions to the mutational effects are extracted by analyzing the data on 29,832 variants from eight systematic deep mutational scan studies. An opposite trend in the dependence of fitness and solubility on the distance of the amino acid from the catalytic sites could be extracted and quantified. The dependence of the mutational effect contributions on the position-specific scoring matrix (PSSM) score for the amino acid after mutation or the BLOSUM score of the substitution showed universal trends. Our attempts in the present work to explain the quantitative differences in the dependence on conservation and SASA across proteins were not successful. The work nevertheless brings transparency into the predictions and development of rules, and will hopefully lead to empirically uncovering the universalities among these rules.
Affiliation(s)
- C K Sruthi
- Theoretical Sciences Unit, Jawaharlal Nehru Centre for Advanced Scientific Research, Bangalore 560064, India
| | - Meher K Prakash
- Theoretical Sciences Unit, Jawaharlal Nehru Centre for Advanced Scientific Research, Bangalore 560064, India
| |
|
45
|
Nedrud D, Coyote-Maestas W, Schmidt D. A large-scale survey of pairwise epistasis reveals a mechanism for evolutionary expansion and specialization of PDZ domains. Proteins 2021; 89:899-914. [PMID: 33620761 DOI: 10.1002/prot.26067] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/15/2020] [Revised: 02/02/2021] [Accepted: 02/18/2021] [Indexed: 12/21/2022]
Abstract
Deep mutational scanning (DMS) facilitates data-driven models of protein structure and function. Here, we adapted Saturated Programmable Insertion Engineering (SPINE) as a programmable DMS technique. We validate SPINE with a reference single mutant dataset in the PSD95 PDZ3 domain and then characterize most pairwise double mutants to study epistasis. We observe wide-spread proximal negative epistasis, which we attribute to mutations affecting thermodynamic stability, and strong long-range positive epistasis, which is enriched in an evolutionarily conserved and function-defining network of "sector" and clade-specifying residues. Conditional neutrality of mutations in clade-specifying residues compensates for deleterious mutations in sector positions. This suggests that epistatic interactions between these position pairs facilitated the evolutionary expansion and specialization of PDZ domains. We propose that SPINE provides easy experimental access to reveal epistasis signatures in proteins that will improve our understanding of the structural basis for protein function and adaptation.
Affiliation(s)
- David Nedrud
- Department of Biochemistry, Molecular Biology & Biophysics, University of Minnesota, Minneapolis, Minnesota, USA
| | - Willow Coyote-Maestas
- Department of Biochemistry, Molecular Biology & Biophysics, University of Minnesota, Minneapolis, Minnesota, USA
| | - Daniel Schmidt
- Department of Genetics, Cell Biology & Development, University of Minnesota, Minneapolis, Minnesota, USA
| |
|
46
|
Munro D, Singh M. DeMaSk: a deep mutational scanning substitution matrix and its use for variant impact prediction. Bioinformatics 2020; 36:5322-5329. [PMID: 33325500 PMCID: PMC8016454 DOI: 10.1093/bioinformatics/btaa1030] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 07/21/2020] [Revised: 10/16/2020] [Accepted: 11/30/2020] [Indexed: 01/27/2023] Open
Abstract
Motivation: Accurately predicting the quantitative impact of a substitution on a protein’s molecular function would be a great aid in understanding the effects of observed genetic variants across populations. While this remains a challenging task, new approaches can leverage data from the increasing numbers of comprehensive deep mutational scanning (DMS) studies that systematically mutate proteins and measure fitness. Results: We introduce DeMaSk, an intuitive and interpretable method based only upon DMS datasets and sequence homologs that predicts the impact of missense mutations within any protein. DeMaSk first infers a directional amino acid substitution matrix from DMS datasets and then fits a linear model that combines these substitution scores with measures of per-position evolutionary conservation and variant frequency across homologs. Despite its simplicity, DeMaSk has state-of-the-art performance in predicting the impact of amino acid substitutions, and can easily and rapidly be applied to any protein sequence. Availability and implementation: https://demask.princeton.edu generates fitness impact predictions and visualizations for any user-submitted protein sequence. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Daniel Munro
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, 08544, USA
| | - Mona Singh
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, 08544, USA; Department of Computer Science, Princeton University, Princeton, 08544, USA
| |
|
47
|
Braberg H, Echeverria I, Bohn S, Cimermancic P, Shiver A, Alexander R, Xu J, Shales M, Dronamraju R, Jiang S, Dwivedi G, Bogdanoff D, Chaung KK, Hüttenhain R, Wang S, Mavor D, Pellarin R, Schneidman D, Bader JS, Fraser JS, Morris J, Haber JE, Strahl BD, Gross CA, Dai J, Boeke JD, Sali A, Krogan NJ. Genetic interaction mapping informs integrative structure determination of protein complexes. Science 2020; 370:eaaz4910. [PMID: 33303586 PMCID: PMC7946025 DOI: 10.1126/science.aaz4910] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 10/14/2019] [Revised: 07/23/2020] [Accepted: 10/23/2020] [Indexed: 12/17/2022]
Abstract
Determining structures of protein complexes is crucial for understanding cellular functions. Here, we describe an integrative structure determination approach that relies on in vivo measurements of genetic interactions. We construct phenotypic profiles for point mutations crossed against gene deletions or exposed to environmental perturbations, followed by converting similarities between two profiles into an upper bound on the distance between the mutated residues. We determine the structure of the yeast histone H3-H4 complex based on ~500,000 genetic interactions of 350 mutants. We then apply the method to subunits Rpb1-Rpb2 of yeast RNA polymerase II and subunits RpoB-RpoC of bacterial RNA polymerase. The accuracy is comparable to that based on chemical cross-links; using restraints from both genetic interactions and cross-links further improves model accuracy and precision. The approach provides an efficient means to augment integrative structure determination with in vivo observations.
Affiliation(s)
- Hannes Braberg
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA
- Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Ignacia Echeverria
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA
- Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Stefan Bohn
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA
- Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
| | - Peter Cimermancic
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Anthony Shiver
- Graduate Group in Biophysics, University of California San Francisco, San Francisco, CA 94158, USA
| | - Richard Alexander
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Jiewei Xu
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA
- Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
| | - Michael Shales
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA
- Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Raghuvar Dronamraju
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, NC 27599, USA
| | - Shuangying Jiang
- CAS Key Laboratory of Quantitative Engineering Biology, Guangdong Provincial Key Laboratory of Synthetic Genomics and Shenzhen Key Laboratory of Synthetic Genomics, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Gajendradhar Dwivedi
- Department of Biology and Rosenstiel Basic Medical Sciences Research Center, Brandeis University, Waltham, MA 02454, USA
| | - Derek Bogdanoff
- Center for Advanced Technology, Department of Biophysics and Biochemistry, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Kaitlin K Chaung
- Center for Advanced Technology, Department of Biophysics and Biochemistry, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Ruth Hüttenhain
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA
- Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
| | - Shuyi Wang
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA
| | - David Mavor
- Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Riccardo Pellarin
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Dina Schneidman
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Joel S Bader
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
| | - James S Fraser
- Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - John Morris
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA 94158, USA
| | - James E Haber
- Department of Biology and Rosenstiel Basic Medical Sciences Research Center, Brandeis University, Waltham, MA 02454, USA
| | - Brian D Strahl
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, NC 27599, USA
| | - Carol A Gross
- Department of Microbiology and Immunology and Department of Cell and Tissue Biology, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Junbiao Dai
- CAS Key Laboratory of Quantitative Engineering Biology, Guangdong Provincial Key Laboratory of Synthetic Genomics and Shenzhen Key Laboratory of Synthetic Genomics, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Jef D Boeke
- NYU Langone Health, New York, NY 10016, USA.
- High Throughput Biology Center and Department of Molecular Biology & Genetics, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Institute for Systems Genetics and Department of Biochemistry and Molecular Pharmacology, NYU Langone Health, New York, NY 10016, USA
- Department of Biomedical Engineering, NYU Tandon School of Engineering, Brooklyn, NY 11201, USA
| | - Andrej Sali
- Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA.
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Nevan J Krogan
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA.
- Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
- Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| |
|
48
|
Kinsler G, Geiler-Samerotte K, Petrov DA. Fitness variation across subtle environmental perturbations reveals local modularity and global pleiotropy of adaptation. eLife 2020; 9:e61271. [PMID: 33263280 PMCID: PMC7880691 DOI: 10.7554/elife.61271] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 07/20/2020] [Accepted: 12/02/2020] [Indexed: 02/07/2023] Open
Abstract
Building a genotype-phenotype-fitness map of adaptation is a central goal in evolutionary biology. It is difficult even when adaptive mutations are known because it is hard to enumerate which phenotypes make these mutations adaptive. We address this problem by first quantifying how the fitness of hundreds of adaptive yeast mutants responds to subtle environmental shifts. We then model the number of phenotypes these mutations collectively influence by decomposing these patterns of fitness variation. We find that a small number of inferred phenotypes can predict fitness of the adaptive mutations near their original glucose-limited evolution condition. Importantly, inferred phenotypes that matter little to fitness at or near the evolution condition can matter strongly in distant environments. This suggests that adaptive mutations are locally modular - affecting a small number of phenotypes that matter to fitness in the environment where they evolved - yet globally pleiotropic - affecting additional phenotypes that may reduce or improve fitness in new environments.
Affiliation(s)
- Grant Kinsler
- Department of Biology, Stanford University, Stanford, United States
| | - Kerry Geiler-Samerotte
- Department of Biology, Stanford University, Stanford, United States
- Center for Mechanisms of Evolution, School of Life Sciences, Arizona State University, Tempe, United States
| | - Dmitri A Petrov
- Department of Biology, Stanford University, Stanford, United States
| |
|
49
|
Lyons DM, Zou Z, Xu H, Zhang J. Idiosyncratic epistasis creates universals in mutational effects and evolutionary trajectories. Nat Ecol Evol 2020; 4:1685-1693. [PMID: 32895516 PMCID: PMC7710555 DOI: 10.1038/s41559-020-01286-y] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 02/15/2020] [Accepted: 07/23/2020] [Indexed: 01/06/2023]
Abstract
Patterns of epistasis and shapes of fitness landscapes are of wide interest because of their bearings on a number of evolutionary theories. The common phenomena of slowing fitness increases during adaptations and diminishing returns from beneficial mutations are believed to reflect a concave fitness landscape and a preponderance of negative epistasis. Paradoxically, fitness decreases tend to decelerate and harm from deleterious mutations shrinks during the accumulation of random mutations, patterns thought to indicate a convex fitness landscape and a predominance of positive epistasis. Current theories cannot resolve this apparent contradiction. Here, we show that the phenotypic effect of a mutation varies substantially depending on the specific genetic background and that this idiosyncrasy in epistasis creates all of the above trends without requiring a biased distribution of epistasis. The idiosyncratic epistasis theory explains the universalities in mutational effects and evolutionary trajectories as emerging from randomness due to biological complexity.
Affiliation(s)
| | | | | | - Jianzhi Zhang
- Correspondence to Jianzhi Zhang, Department of Ecology and Evolutionary Biology, University of Michigan, 4018 Biological Sciences Building, 1105 North University Avenue, Ann Arbor, MI 48109, USA, Phone: 734-763-0527
| |
|
50
|
Zurek PJ, Knyphausen P, Neufeld K, Pushpanath A, Hollfelder F. UMI-linked consensus sequencing enables phylogenetic analysis of directed evolution. Nat Commun 2020; 11:6023. [PMID: 33243970 PMCID: PMC7691348 DOI: 10.1038/s41467-020-19687-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 05/22/2020] [Accepted: 10/12/2020] [Indexed: 11/09/2022] Open
Abstract
The success of protein evolution campaigns is strongly dependent on the sequence context in which mutations are introduced, stemming from pervasive non-additive interactions between a protein's amino acids ('intra-gene epistasis'). Our limited understanding of such epistasis hinders the correct prediction of the functional contributions and adaptive potential of mutations. Here we present a straightforward unique molecular identifier (UMI)-linked consensus sequencing workflow (UMIC-seq) that simplifies mapping of evolutionary trajectories based on full-length sequences. Attaching UMIs to gene variants allows accurate consensus generation for closely related genes with nanopore sequencing. We exemplify the utility of this approach by reconstructing the artificial phylogeny emerging in three rounds of directed evolution of an amine dehydrogenase biocatalyst via ultrahigh throughput droplet screening. Uniquely, we are able to identify lineages and their founding variant, as well as non-additive interactions between mutations within a full gene showing sign epistasis. Access to deep and accurate long reads will facilitate prediction of key beneficial mutations and adaptive potential based on in silico analysis of large sequence datasets.
Affiliation(s)
- Paul Jannis Zurek
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK
- Johnson Matthey Plc, Cambridge, CB4 0WE, UK
| | - Philipp Knyphausen
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK
| | - Katharina Neufeld
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK
- Johnson Matthey Plc, Cambridge, CB4 0WE, UK
| | | | - Florian Hollfelder
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK.
| |
|