1
Crandall JG, Zhou X, Rokas A, Hittinger CT. Specialization Restricts the Evolutionary Paths Available to Yeast Sugar Transporters. Mol Biol Evol 2024; 41:msae228. [PMID: 39492761 PMCID: PMC11571961 DOI: 10.1093/molbev/msae228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2024] [Revised: 10/22/2024] [Accepted: 10/25/2024] [Indexed: 11/05/2024]
Abstract
Functional innovation at the protein level is a key source of evolutionary novelties. The constraints on functional innovations are likely to be highly specific in different proteins, which are shaped by their unique histories and the extent of global epistasis that arises from their structures and biochemistries. These contextual nuances in the sequence-function relationship have implications both for a basic understanding of the evolutionary process and for engineering proteins with desirable properties. Here, we have investigated the molecular basis of novel function in a model member of an ancient, conserved, and biotechnologically relevant protein family. These Major Facilitator Superfamily sugar porters are a functionally diverse group of proteins that are thought to be highly plastic and evolvable. By dissecting a recent evolutionary innovation in an α-glucoside transporter from the yeast Saccharomyces eubayanus, we show that the ability to transport a novel substrate requires high-order interactions between many protein regions and numerous specific residues proximal to the transport channel. To reconcile the functional diversity of this family with the constrained evolution of this model protein, we generated new, state-of-the-art genome annotations for 332 Saccharomycotina yeast species spanning ∼400 My of evolution. By integrating phylogenetic and phenotypic analyses across these species, we show that the model yeast α-glucoside transporters likely evolved from a multifunctional ancestor and became subfunctionalized. The accumulation of additive and epistatic substitutions likely entrenched this subfunction, which made the simultaneous acquisition of multiple interacting substitutions the only reasonably accessible path to novelty.
Affiliation(s)
- Johnathan G Crandall
  - Laboratory of Genetics, J. F. Crow Institute for the Study of Evolution, Center for Genomic Science Innovation, DOE Great Lakes Bioenergy Research Center, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI 53726, USA
- Xiaofan Zhou
  - Guangdong Province Key Laboratory of Microbial Signals and Disease Control, Integrative Microbiology Research Center, South China Agricultural University, Guangzhou 510642, China
  - Department of Biological Sciences and Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
- Antonis Rokas
  - Department of Biological Sciences and Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
- Chris Todd Hittinger
  - Laboratory of Genetics, J. F. Crow Institute for the Study of Evolution, Center for Genomic Science Innovation, DOE Great Lakes Bioenergy Research Center, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI 53726, USA
2
Sanders BR, Thomas LS, Lewis NM, Ferguson ZA, Graves JL, Thomas MD. It Takes Two to Make a Thing Go Right: Epistasis, Two-Component Response Systems, and Bacterial Adaptation. Microorganisms 2024; 12:2000. [PMID: 39458309 PMCID: PMC11510482 DOI: 10.3390/microorganisms12102000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2024] [Revised: 09/27/2024] [Accepted: 09/30/2024] [Indexed: 10/28/2024]
Abstract
Understanding the interplay between genotype and fitness is a core question in evolutionary biology. Here, we address this challenge in the context of microbial adaptation to environmental stressors. This study explores the role of epistasis in bacterial adaptation by examining genetic and phenotypic changes in silver-adapted Escherichia coli populations, focusing on the role of beneficial mutations in two-component response systems (TCRS). To do this, we performed 24-hour growth assays and conducted whole-genome DNA and RNA sequencing on E. coli mutants carrying mutations that confer resistance to ionic silver. We showed recently that the R15L cusS mutation is central to silver resistance, primarily through upregulation of the cus efflux system. However, here we show that this mutation's effectiveness is significantly enhanced by epistatic interactions with additional mutations in regulatory genes such as ompR, rho, and fur. These interactions reconfigure global stress response networks, resulting in robust and varied resistance strategies across different populations. This study underscores the critical role of epistasis in bacterial adaptation, illustrating how interactions between multiple mutations and genetic backgrounds shape the resistance phenotypes of E. coli populations. This work also allowed us to refine our model of the role TCRS genes play in bacterial adaptation, which now emphasizes that adaptation to environmental stressors is a complex, context-dependent process driven by the dynamic interplay between genetic and environmental factors. These findings have broader implications for understanding microbial evolution and developing strategies to combat antimicrobial resistance.
Affiliation(s)
- Misty D. Thomas
  - Department of Biology, North Carolina Agricultural and Technical State University, Greensboro, NC 27411, USA; (B.R.S.); (L.S.T.); (N.M.L.); (Z.A.F.)
3
Park Y, Metzger BPH, Thornton JW. The simplicity of protein sequence-function relationships. Nat Commun 2024; 15:7953. [PMID: 39261454 PMCID: PMC11390738 DOI: 10.1038/s41467-024-51895-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Accepted: 08/20/2024] [Indexed: 09/13/2024]
Abstract
How complex are the rules by which a protein's sequence determines its function? High-order epistatic interactions among residues are thought to be pervasive, suggesting an idiosyncratic and unpredictable sequence-function relationship. But many prior studies may have overestimated epistasis, because they analyzed sequence-function relationships relative to a single reference sequence (which causes measurement noise and local idiosyncrasies to snowball into high-order epistasis) or they did not fully account for global nonlinearities. Here we present a reference-free method that jointly infers specific epistatic interactions and global nonlinearity using a bird's-eye view of sequence space. This technique yields the simplest explanation of sequence-function relationships and is more robust than existing methods to measurement noise, missing data, and model misspecification. We reanalyze 20 experimental datasets and find that context-independent amino acid effects and pairwise interactions, along with a simple nonlinearity to account for limited dynamic range, explain a median of 96% of phenotypic variance and over 92% in every case. Only a tiny fraction of genotypes are strongly affected by higher-order epistasis. Sequence-function relationships are also sparse: a minuscule fraction of amino acids and interactions account for 90% of phenotypic variance. Sequence-function causality across these datasets is therefore simple, opening the way for tractable approaches to characterize proteins' genetic architecture.
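To make the contrast between reference-based and reference-free analysis concrete, here is a minimal Python sketch on a hypothetical two-site, two-state toy landscape. It is not the authors' implementation: the function names and phenotype values are invented for illustration, the global nonlinearity described in the abstract is omitted, and the reference-free first-order effect is assumed to take its simplest form (the mean phenotype of all genotypes carrying a state, minus the global mean).

```python
from statistics import mean


def reference_based_effect(phenotype, reference, site, state):
    """Effect of putting `state` at `site`, measured against one chosen reference genotype."""
    mutant = reference[:site] + state + reference[site + 1:]
    return phenotype[mutant] - phenotype[reference]


def reference_free_effect(phenotype, site, state):
    """Assumed simplest reference-free effect: average over every genotype
    carrying `state` at `site`, relative to the mean of the whole landscape."""
    global_mean = mean(phenotype.values())
    with_state = [y for genotype, y in phenotype.items() if genotype[site] == state]
    return mean(with_state) - global_mean


# Hypothetical toy landscape: two sites, states "A" and "B" (values are made up).
toy = {"AA": 0.0, "AB": 1.0, "BA": 1.0, "BB": 4.0}

# The reference-based estimate changes with the reference that is picked ...
effect_from_AA = reference_based_effect(toy, "AA", 0, "B")
effect_from_AB = reference_based_effect(toy, "AB", 0, "B")
# ... while the reference-free estimate is a single landscape-wide value.
effect_free = reference_free_effect(toy, 0, "B")
```

In this toy map the reference-based effect of B at the first site is 1.0 or 3.0 depending on which reference is chosen, whereas the reference-free effect is a single value, 1.0.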
Affiliation(s)
- Yeonwoo Park
  - Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL, USA
  - Center for RNA Research, Institute for Basic Science, Seoul, Republic of Korea
- Brian P H Metzger
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Joseph W Thornton
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
  - Department of Human Genetics, University of Chicago, Chicago, IL, USA
4
Lipsh-Sokolik R, Fleishman SJ. Addressing epistasis in the design of protein function. Proc Natl Acad Sci U S A 2024; 121:e2314999121. [PMID: 39133844 PMCID: PMC11348311 DOI: 10.1073/pnas.2314999121] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/29/2024]
Abstract
Mutations in protein active sites can dramatically improve function. The active site, however, is densely packed and extremely sensitive to mutations. Therefore, some mutations may only be tolerated in combination with others in a phenomenon known as epistasis. Epistasis reduces the likelihood of obtaining improved functional variants and dramatically slows natural and lab evolutionary processes. Research has shed light on the molecular origins of epistasis and its role in shaping evolutionary trajectories and outcomes. In addition, sequence- and AI-based strategies that infer epistatic relationships from mutational patterns in natural or experimental evolution data have been used to design functional protein variants. In recent years, combinations of such approaches and atomistic design calculations have successfully predicted highly functional combinatorial mutations in active sites. These were used to design thousands of functional active-site variants, demonstrating that, while our understanding of epistasis remains incomplete, some of the determinants that are critical for accurate design are now sufficiently understood. We conclude that the space of active-site variants that has been explored by evolution may be expanded dramatically to enhance natural activities or discover new ones. Furthermore, design opens the way to systematically exploring sequence and structure space and mutational impacts on function, deepening our understanding and control over protein activity.
Affiliation(s)
- Rosalie Lipsh-Sokolik
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 7610001, Israel
- Sarel J Fleishman
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 7610001, Israel
5
Johnston KE, Almhjell PJ, Watkins-Dulaney EJ, Liu G, Porter NJ, Yang J, Arnold FH. A combinatorially complete epistatic fitness landscape in an enzyme active site. Proc Natl Acad Sci U S A 2024; 121:e2400439121. [PMID: 39074291 PMCID: PMC11317637 DOI: 10.1073/pnas.2400439121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2024] [Accepted: 06/17/2024] [Indexed: 07/31/2024]
Abstract
Protein engineering often targets binding pockets or active sites, which are enriched in epistasis (nonadditive interactions between amino acid substitutions) and where the combined effects of multiple single substitutions are difficult to predict. Few existing sequence-fitness datasets capture epistasis at large scale, especially for enzyme catalysis, limiting the development and assessment of model-guided enzyme engineering approaches. We present here a combinatorially complete, 160,000-variant fitness landscape across four residues in the active site of an enzyme. Assaying the native reaction of a thermostable β-subunit of tryptophan synthase (TrpB) in a nonnative environment yielded a landscape characterized by significant epistasis and many local optima. These effects prevent simulated directed evolution approaches from efficiently reaching the global optimum. There is nonetheless wide variability in the effectiveness of different directed evolution approaches, which together provide experimental benchmarks for computational and machine learning workflows. The most-fit TrpB variants contain a substitution that is nearly absent in natural TrpB sequences, a result that conservation-based predictions would not capture. Thus, although fitness prediction using evolutionary data can enrich in more-active variants, these approaches struggle to identify and differentiate among the most-active variants, even for this near-native function. Overall, this work presents a large-scale testing ground for model-guided enzyme engineering and suggests that efficient navigation of epistatic fitness landscapes can be improved by advances in both machine learning and physical modeling.
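The landscape size and the notions of epistasis and local optima in this abstract can be illustrated with a short Python sketch. The 160,000 figure is consistent with 20 amino acids at each of four sites (20^4), which the abstract implies but does not state outright. The toy two-site fitness values and all function names below are hypothetical, and nothing here reproduces the TrpB data.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def landscape_size(n_sites, alphabet=AMINO_ACIDS):
    """Number of variants in a combinatorially complete library."""
    return len(alphabet) ** n_sites


def pairwise_epistasis(f_wt, f_a, f_b, f_ab):
    """Deviation of the double mutant from the additive expectation."""
    return f_ab - (f_a + f_b - f_wt)


def local_optima(fitness, alphabet):
    """Variants with no fitter single-substitution neighbor."""
    optima = []
    for variant, fit in fitness.items():
        neighbors = (
            variant[:i] + aa + variant[i + 1:]
            for i in range(len(variant))
            for aa in alphabet
            if aa != variant[i]
        )
        if all(fitness[n] <= fit for n in neighbors):
            optima.append(variant)
    return optima


# Hypothetical two-site, two-state landscape (values are made up).
toy_fitness = {"AA": 1.0, "AB": 0.25, "BA": 0.5, "BB": 1.5}

n_variants = landscape_size(4)                        # 20 ** 4
epistasis = pairwise_epistasis(1.0, 0.25, 0.5, 1.5)   # positive: double mutant beats additivity
peaks = sorted(local_optima(toy_fitness, "AB"))       # both "AA" and "BB" are peaks
```

Because both single mutants are less fit than the starting variant "AA", a greedy single-step walk from "AA" cannot reach the fitter "BB" peak. This is the kind of trap that the abstract describes for simulated directed evolution on an epistatic landscape.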
Affiliation(s)
- Kadina E. Johnston
  - Division of Biology and Bioengineering, California Institute of Technology, Pasadena, CA 91125
- Patrick J. Almhjell
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125
- Ella J. Watkins-Dulaney
  - Division of Biology and Bioengineering, California Institute of Technology, Pasadena, CA 91125
- Grace Liu
  - Division of Biology and Bioengineering, California Institute of Technology, Pasadena, CA 91125
- Nicholas J. Porter
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125
- Jason Yang
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125
- Frances H. Arnold
  - Division of Biology and Bioengineering, California Institute of Technology, Pasadena, CA 91125
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125
6
Parkhill SL, Johnson EO. Integrating bacterial molecular genetics with chemical biology for renewed antibacterial drug discovery. Biochem J 2024; 481:839-864. [PMID: 38958473 PMCID: PMC11346456 DOI: 10.1042/bcj20220062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2024] [Revised: 06/20/2024] [Accepted: 06/24/2024] [Indexed: 07/04/2024]
Abstract
The application of dyes to understanding the aetiology of infection inspired antimicrobial chemotherapy and the first wave of antibacterial drugs. The second wave of antibacterial drug discovery was driven by rapid discovery of natural products, now making up 69% of current antibacterial drugs. But now with the most prevalent natural products already discovered, ∼10^7 new soil-dwelling bacterial species must be screened to discover one new class of natural product. Therefore, instead of a third wave of antibacterial drug discovery, there is now a discovery bottleneck. Unlike natural products, which are curated by billions of years of microbial antagonism, the vast synthetic chemical space still requires artificial curation through the therapeutics science of antibacterial drugs: a systematic understanding of how small molecules interact with bacterial physiology, effect desired phenotypes, and benefit the host. Bacterial molecular genetics can elucidate pathogen biology relevant to therapeutics development, but it can also be applied directly to understanding mechanisms and liabilities of new chemical agents with new mechanisms of action. Therefore, the next phase of antibacterial drug discovery could be enabled by integrating chemical expertise with systematic dissection of bacterial infection biology. Facing the ambitious endeavour to find new molecules from nature or new-to-nature which cure bacterial infections, the capabilities furnished by modern chemical biology and molecular genetics can be applied to prospecting for chemical modulators of new targets which circumvent prevalent resistance mechanisms.
Affiliation(s)
- Susannah L. Parkhill
  - Systems Chemical Biology of Infection and Resistance Laboratory, The Francis Crick Institute, London, U.K.
  - Faculty of Life Sciences, University College London, London, U.K.
- Eachan O. Johnson
  - Systems Chemical Biology of Infection and Resistance Laboratory, The Francis Crick Institute, London, U.K.
  - Faculty of Life Sciences, University College London, London, U.K.
  - Department of Chemistry, Imperial College, London, U.K.
  - Department of Chemistry, King's College London, London, U.K.
7
Kilgore HR, Chinn I, Mikhael PG, Mitnikov I, Van Dongen C, Zylberberg G, Afeyan L, Banani S, Wilson-Hawken S, Lee TI, Barzilay R, Young RA. Protein codes promote selective subcellular compartmentalization. bioRxiv 2024:2024.04.15.589616. [PMID: 38659952 PMCID: PMC11042338 DOI: 10.1101/2024.04.15.589616] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024]
Abstract
Cells have evolved mechanisms to distribute ~10 billion protein molecules to subcellular compartments where diverse proteins involved in shared functions must efficiently assemble. Here, we demonstrate that proteins with shared functions share amino acid sequence codes that guide them to compartment destinations. A protein language model, ProtGPS, was developed that predicts with high performance the compartment localization of human proteins excluded from the training set. ProtGPS successfully guided generation of novel protein sequences that selectively assemble in targeted subcellular compartments. ProtGPS also identified pathological mutations that change this code and lead to altered subcellular localization of proteins. Our results indicate that protein sequences contain not only a folding code, but also a previously unrecognized code governing their distribution in specific cellular compartments.
Affiliation(s)
- Henry R. Kilgore
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
- Itamar Chinn
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Peter G. Mikhael
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Ilan Mitnikov
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Guy Zylberberg
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Lena Afeyan
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Salman Banani
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
  - Department of Pathology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
- Susana Wilson-Hawken
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
  - Program of Computational & Systems Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Tong Ihn Lee
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
- Regina Barzilay
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Richard A. Young
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
8
Metzger BPH, Park Y, Starr TN, Thornton JW. Epistasis facilitates functional evolution in an ancient transcription factor. eLife 2024; 12:RP88737. [PMID: 38767330 PMCID: PMC11105156 DOI: 10.7554/elife.88737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024]
Abstract
A protein's genetic architecture - the set of causal rules by which its sequence produces its functions - also determines its possible evolutionary trajectories. Prior research has proposed that the genetic architecture of proteins is very complex, with pervasive epistatic interactions that constrain evolution and make function difficult to predict from sequence. Most of this work has analyzed only the direct paths between two proteins of interest - excluding the vast majority of possible genotypes and evolutionary trajectories - and has considered only a single protein function, leaving unaddressed the genetic architecture of functional specificity and its impact on the evolution of new functions. Here, we develop a new method based on ordinal logistic regression to directly characterize the global genetic determinants of multiple protein functions from 20-state combinatorial deep mutational scanning (DMS) experiments. We use it to dissect the genetic architecture and evolution of a transcription factor's specificity for DNA, using data from a combinatorial DMS of an ancient steroid hormone receptor's capacity to activate transcription from two biologically relevant DNA elements. We show that the genetic architecture of DNA recognition consists of a dense set of main and pairwise effects that involve virtually every possible amino acid state in the protein-DNA interface, but higher-order epistasis plays only a tiny role. Pairwise interactions enlarge the set of functional sequences and are the primary determinants of specificity for different DNA elements. They also massively expand the number of opportunities for single-residue mutations to switch specificity from one DNA target to another. By bringing variants with different functions close together in sequence space, pairwise epistasis therefore facilitates rather than constrains the evolution of new functions.
Affiliation(s)
- Brian PH Metzger
  - Department of Ecology and Evolution, University of Chicago, Chicago, United States
- Yeonwoo Park
  - Program in Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, United States
- Tyler N Starr
  - Department of Biochemistry and Molecular Biophysics, University of Chicago, Chicago, United States
- Joseph W Thornton
  - Department of Ecology and Evolution, University of Chicago, Chicago, United States
  - Department of Human Genetics, University of Chicago, Chicago, United States
9
Dibyachintan S, Dube AK, Bradley D, Lemieux P, Dionne U, Landry CR. Cryptic genetic variation shapes the fate of gene duplicates in a protein interaction network. bioRxiv 2024:2024.02.23.581840. [PMID: 38464075 PMCID: PMC10925128 DOI: 10.1101/2024.02.23.581840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Paralogous genes are often redundant for long periods of time before they diverge in function. While their functions are preserved, paralogous proteins can accumulate mutations that, through epistasis, could impact their fate in the future. By quantifying the impact of all single-amino acid substitutions on the binding of two myosin proteins to their interaction partners, we find that the future evolution of these proteins is highly contingent on their regulatory divergence and the mutations that have silently accumulated in their protein binding domains. Differences in the promoter strength of the two paralogs amplify the impact of mutations on binding in the lowly expressed one. While some mutations would be sufficient to non-functionalize one paralog, they would have minimal impact on the other. Our results reveal how functionally equivalent protein domains could be destined to specific fates by regulatory and cryptic coding sequence changes that currently have little to no functional impact.
Affiliation(s)
- Soham Dibyachintan
  - PROTEO-Regroupement Québécois de Recherche sur la Fonction, l'Ingénierie et les Applications des Protéines, Québec, QC, Canada
  - Centre de Recherche en Données Massives de l'Université Laval, Université Laval, Québec, QC, Canada
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
  - Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Québec, QC, Canada
- Alexandre K Dube
  - PROTEO-Regroupement Québécois de Recherche sur la Fonction, l'Ingénierie et les Applications des Protéines, Québec, QC, Canada
  - Centre de Recherche en Données Massives de l'Université Laval, Université Laval, Québec, QC, Canada
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
  - Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Québec, QC, Canada
  - Département de Biologie, Université Laval, Québec, QC, Canada
- David Bradley
  - PROTEO-Regroupement Québécois de Recherche sur la Fonction, l'Ingénierie et les Applications des Protéines, Québec, QC, Canada
  - Centre de Recherche en Données Massives de l'Université Laval, Université Laval, Québec, QC, Canada
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
  - Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Québec, QC, Canada
  - Département de Biologie, Université Laval, Québec, QC, Canada
- Pascale Lemieux
  - PROTEO-Regroupement Québécois de Recherche sur la Fonction, l'Ingénierie et les Applications des Protéines, Québec, QC, Canada
  - Centre de Recherche en Données Massives de l'Université Laval, Université Laval, Québec, QC, Canada
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
  - Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Québec, QC, Canada
- Ugo Dionne
  - PROTEO-Regroupement Québécois de Recherche sur la Fonction, l'Ingénierie et les Applications des Protéines, Québec, QC, Canada
  - Centre de Recherche en Données Massives de l'Université Laval, Université Laval, Québec, QC, Canada
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
  - Current affiliation: Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Christian R Landry
  - PROTEO-Regroupement Québécois de Recherche sur la Fonction, l'Ingénierie et les Applications des Protéines, Québec, QC, Canada
  - Centre de Recherche en Données Massives de l'Université Laval, Université Laval, Québec, QC, Canada
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
  - Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Québec, QC, Canada
  - Département de Biologie, Université Laval, Québec, QC, Canada
10
Yang YH, Wei YL, She ZY. Kinesin-7 CENP-E in tumorigenesis: Chromosome instability, spindle assembly checkpoint, and applications. Front Mol Biosci 2024; 11:1366113. [PMID: 38560520 PMCID: PMC10978661 DOI: 10.3389/fmolb.2024.1366113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2024] [Accepted: 03/04/2024] [Indexed: 04/04/2024]
Abstract
Kinesin motors are a large family of molecular motors that walk along microtubules to fulfill many roles in intracellular transport, microtubule organization, and chromosome alignment. Kinesin-7 CENP-E (Centromere protein E) is a chromosome scaffold-associated protein that is located in the corona layer of centromeres, which participates in kinetochore-microtubule attachment, chromosome alignment, and spindle assembly checkpoint. Over the past 3 decades, CENP-E has attracted great interest as a promising new mitotic target for cancer therapy and drug development. In this review, we describe expression patterns of CENP-E in multiple tumors and highlight the functions of CENP-E in cancer cell proliferation. We summarize recent advances in structural domains, roles, and functions of CENP-E in cell division. Notably, we describe the dual functions of CENP-E in inhibiting and promoting tumorigenesis. We summarize the mechanisms by which CENP-E affects tumorigenesis through chromosome instability and spindle assembly checkpoints. Finally, we overview and summarize the CENP-E-specific inhibitors, mechanisms of drug resistances and their applications.
Affiliation(s)
- Yu-Hao Yang
  - Department of Cell Biology and Genetics, The School of Basic Medical Sciences, Fujian Medical University, Fuzhou, China
  - Key Laboratory of Stem Cell Engineering and Regenerative Medicine, Fujian Province University, Fuzhou, China
- Ya-Lan Wei
  - Medical Research Center, Fujian Maternity and Child Health Hospital, Fuzhou, China
  - College of Clinical Medicine for Obstetrics and Gynecology and Pediatrics, Fujian Medical University, Fuzhou, China
- Zhen-Yu She
  - Department of Cell Biology and Genetics, The School of Basic Medical Sciences, Fujian Medical University, Fuzhou, China
  - Key Laboratory of Stem Cell Engineering and Regenerative Medicine, Fujian Province University, Fuzhou, China
11
Chu HY, Fong JHC, Thean DGL, Zhou P, Fung FKC, Huang Y, Wong ASL. Accurate top protein variant discovery via low-N pick-and-validate machine learning. Cell Syst 2024; 15:193-203.e6. [PMID: 38340729 DOI: 10.1016/j.cels.2024.01.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/21/2023] [Revised: 10/11/2023] [Accepted: 01/18/2024] [Indexed: 02/12/2024]
Abstract
A strategy to obtain the greatest number of best-performing variants with the least amount of experimental effort over the vast combinatorial mutational landscape would have enormous utility in boosting resource producibility for protein engineering. Toward this goal, we present a simple and effective machine learning-based strategy that outperforms other state-of-the-art methods. Our strategy integrates zero-shot prediction and multi-round sampling to direct active learning via experimenting with only a few predicted top variants. We find that four rounds of low-N pick-and-validate sampling of 12 variants for machine learning yielded the best accuracy of up to 92.6% in selecting the true top 1% variants in combinatorial mutant libraries, whereas two rounds of 24 variants can also be used. We demonstrate our strategy in successfully discovering high-performance protein variants from diverse families including the CRISPR-based genome editors, supporting its generalizable application for solving protein engineering tasks. A record of this paper's transparent peer review process is included in the supplemental information.
Collapse
Affiliation(s)
- Hoi Yee Chu
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China; Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
| | - John H C Fong
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
| | - Dawn G L Thean
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
| | - Peng Zhou
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China; Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
| | - Frederic K C Fung
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China; Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
| | - Yuanhua Huang
- School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China; Department of Statistics and Actuarial Science, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
| | - Alan S L Wong
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China; Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China.
| |
|
12
|
Park Y, Metzger BP, Thornton JW. The simplicity of protein sequence-function relationships. bioRxiv 2024:2023.09.02.556057. [PMID: 37732229 PMCID: PMC10508729 DOI: 10.1101/2023.09.02.556057] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Indexed: 09/22/2023]
Abstract
How complicated is the genetic architecture of proteins - the set of causal effects by which sequence determines function? High-order epistatic interactions among residues are thought to be pervasive, making a protein's function difficult to predict or understand from its sequence. Most studies, however, used methods that overestimate epistasis, because they analyze genetic architecture relative to a designated reference sequence - causing measurement noise and small local idiosyncrasies to propagate into pervasive high-order interactions - or have not effectively accounted for global nonlinearity in the sequence-function relationship. Here we present a new reference-free method that jointly estimates global nonlinearity and specific epistatic interactions across a protein's entire genotype-phenotype map. This method yields a maximally efficient explanation of a protein's genetic architecture and is more robust than existing methods to measurement noise, partial sampling, and model misspecification. We reanalyze 20 combinatorial mutagenesis experiments from a diverse set of proteins and find that additive and pairwise effects, along with a simple nonlinearity to account for limited dynamic range, explain a median of 96% of total variance in measured phenotypes (and >92% in every case). Only a tiny fraction of genotypes are strongly affected by third- or higher-order epistasis. Genetic architecture is also sparse: the number of terms required to explain the vast majority of variance is smaller than the number of genotypes by many orders of magnitude. The sequence-function relationship in most proteins is therefore far simpler than previously thought, opening the way for new and tractable approaches to characterize it.
Affiliation(s)
- Yeonwoo Park
- Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL 60637
- Current affiliation: Center for RNA Research, Institute for Basic Science, Seoul, Republic of Korea 08826
| | - Brian P.H. Metzger
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637
- Current affiliation: Department of Biological Sciences, Purdue University, West Lafayette, IN 47907
| | - Joseph W. Thornton
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637
- Department of Human Genetics, University of Chicago, Chicago, IL 60637
| |
|
13
|
Yan Z, Wang J. Evolution shapes interaction patterns for epistasis and specific protein binding in a two-component signaling system. Commun Chem 2024; 7:13. [PMID: 38233668 PMCID: PMC10794238 DOI: 10.1038/s42004-024-01098-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 01/05/2024] [Indexed: 01/19/2024] Open
Abstract
The elegant design of protein sequence/structure/function relationships arises from the interaction patterns between amino acid positions. A central question is how evolutionary forces shape the interaction patterns that encode long-range epistasis and binding specificity. Here, we combined family-wide evolutionary analysis of natural homologous sequences and structure-oriented evolution simulation for the two-component signaling (TCS) system. The magnitude-frequency relationship of coupling conservation between positions manifests a power-law-like distribution, and the positions with high coupling conservation are sparse but densely distributed on the binding surfaces and hydrophobic core. The structure-specific interaction pattern involves further optimization of local frustrations at or near the binding surface to adapt to the binding partner. The construction of family-wide conserved interaction patterns and structure-specific ones demonstrates that binding specificity is modulated by both direct intermolecular interactions and long-range epistasis across the binding complex. Evolution sculpts the interaction patterns via sequence variations at both family-wide and structure-specific levels for the TCS system.
Affiliation(s)
- Zhiqiang Yan
- Center for Theoretical Interdisciplinary Sciences, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang, 325001, PR China
| | - Jin Wang
- Department of Chemistry and Physics, State University of New York at Stony Brook, Stony Brook, NY, 11790, USA.
| |
|
14
|
Avizemer Z, Martí-Gómez C, Hoch SY, McCandlish DM, Fleishman SJ. Evolutionary paths that link orthogonal pairs of binding proteins. Research Square 2023:rs.3.rs-2836905. [PMID: 37131620 PMCID: PMC10153392 DOI: 10.21203/rs.3.rs-2836905/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Some protein binding pairs exhibit extreme specificities that functionally insulate them from homologs. Such pairs evolve mostly by accumulating single-point mutations, and mutants are selected if their affinity exceeds the threshold required for function [1-4]. Thus, homologous and high-specificity binding pairs bring to light an evolutionary conundrum: how does a new specificity evolve while maintaining the required affinity in each intermediate [5,6]? Until now, a fully functional single-mutation path that connects two orthogonal pairs has only been described where the pairs were mutationally close, thus enabling experimental enumeration of all intermediates [2]. We present an atomistic and graph-theoretical framework for discovering low-molecular-strain single-mutation paths that connect two extant pairs, enabling enumeration beyond experimental capability. We apply it to two orthogonal bacterial colicin endonuclease-immunity pairs separated by 17 interface mutations [7]. We were not able to find a strain-free and functional path in the sequence space defined by the two extant pairs. But including mutations that bridge amino acids that cannot be exchanged through single-nucleotide mutations led us to a strain-free 19-mutation trajectory that is completely viable in vivo. Our experiments show that the specificity switch is remarkably abrupt, resulting from only one radical mutation on each partner. Furthermore, each of the critical specificity-switch mutations increases fitness, demonstrating that functional divergence could be driven by positive Darwinian selection. These results reveal how even radical functional changes in an epistatic fitness landscape may evolve.
Affiliation(s)
- Ziv Avizemer
- Department of Biomolecular Sciences, Weizmann Institute of Science, 7610001, Rehovot, Israel
| | - Carlos Martí-Gómez
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724
| | - Shlomo Yakir Hoch
- Department of Biomolecular Sciences, Weizmann Institute of Science, 7610001, Rehovot, Israel
| | - David M. McCandlish
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724
| | - Sarel J. Fleishman
- Department of Biomolecular Sciences, Weizmann Institute of Science, 7610001, Rehovot, Israel
| |
|
15
|
Yang L, Liang X, Zhang N, Lu L. STAR: A Web Server for Assisting Directed Protein Evolution with Machine Learning. ACS Omega 2023; 8:44751-44756. [PMID: 38046324 PMCID: PMC10688154 DOI: 10.1021/acsomega.3c04832] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Revised: 10/10/2023] [Accepted: 10/12/2023] [Indexed: 12/05/2023]
Abstract
Protein engineering has made significant contributions to industries such as agriculture, food, and pharmaceuticals. In recent years, directed evolution combined with artificial intelligence has emerged as a cutting-edge R&D approach. However, the application of machine learning techniques can be challenging for those without relevant experience and coding skills. To address this issue, we have developed a web-based protein sequence recommendation system: STAR (Sequence recommendaTion via ARtificial intelligence). Our system utilizes Bayesian optimization as its backbone and includes a filtering step using a regression model to enhance the success rate of recommended sequences. Additionally, we have incorporated an in silico-directed evolution approach to expand the exploration of the protein space. The Web site can be accessed at https://www.FindProteinStar.com/.
Affiliation(s)
- Likun Yang
- Asymchem Life Science (Tianjin) Co., Ltd, Tianjin 300457, P. R. China
| | - Xiaoli Liang
- Asymchem Life Science (Tianjin) Co., Ltd, Tianjin 300457, P. R. China
| | - Na Zhang
- Asymchem Life Science (Tianjin) Co., Ltd, Tianjin 300457, P. R. China
| | - Lu Lu
- Asymchem Life Science (Tianjin) Co., Ltd, Tianjin 300457, P. R. China
| |
|
16
|
Feng H, Li F, Wang T, Xing XH, Zeng AP, Zhang C. Deep-learning-assisted Sort-Seq enables high-throughput profiling of gene expression characteristics with high precision. Science Advances 2023; 9:eadg5296. [PMID: 37939173 PMCID: PMC10631719 DOI: 10.1126/sciadv.adg5296] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/04/2023] [Accepted: 10/06/2023] [Indexed: 11/10/2023]
Abstract
Owing to the nondeterministic and nonlinear nature of gene expression, the steady-state intracellular protein abundance of a clonal population forms a distribution. The characteristics of this distribution, including expression strength and noise, are closely related to cellular behavior. However, quantitative description of these characteristics has so far relied on arrayed methods, which are time-consuming and labor-intensive. To address this issue, we propose a deep-learning-assisted Sort-Seq approach (dSort-Seq) in this work, enabling high-throughput profiling of expression properties with high precision. We demonstrated the validity of dSort-Seq for large-scale assaying of the dose-response relationships of biosensors. In addition, we comprehensively investigated the contribution of transcription and translation to noise production in Escherichia coli, from which we found that the expression noise is strongly coupled with the mean expression level. We also found that the transcriptional interference caused by overlapping RpoD-binding sites contributes to noise production, which suggested the existence of a simple and feasible noise control strategy in E. coli.
Affiliation(s)
- Huibao Feng
- MOE Key Laboratory for Industrial Biocatalysis, Institute of Biochemical Engineering, Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
| | - Fan Li
- MOE Key Laboratory for Industrial Biocatalysis, Institute of Biochemical Engineering, Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
| | - Tianmin Wang
- Tsinghua-Peking Center for Life Sciences, School of Medicine, Tsinghua University, Beijing 100084, China
- School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
| | - Xin-hui Xing
- MOE Key Laboratory for Industrial Biocatalysis, Institute of Biochemical Engineering, Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
- Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
| | - An-ping Zeng
- Institute of Bioprocess and Biosystems Engineering, Hamburg University of Technology, Hamburg 21073, Germany
- Center of Synthetic Biology and Integrated Bioengineering, School of Engineering, Westlake University, Hangzhou 310024, China
| | - Chong Zhang
- MOE Key Laboratory for Industrial Biocatalysis, Institute of Biochemical Engineering, Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
- Center for Synthetic and Systems Biology, Tsinghua University, Beijing 100084, China
| |
|
17
|
Qiu Y, Wei GW. Artificial intelligence-aided protein engineering: from topological data analysis to deep protein language models. Brief Bioinform 2023; 24:bbad289. [PMID: 37580175 PMCID: PMC10516362 DOI: 10.1093/bib/bbad289] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/08/2023] [Revised: 07/14/2023] [Accepted: 07/26/2023] [Indexed: 08/16/2023] Open
Abstract
Protein engineering is an emerging field in biotechnology that has the potential to revolutionize various areas, such as antibody design, drug discovery, food security, ecology, and more. However, the mutational space involved is too vast to be handled through experimental means alone. Leveraging accumulative protein databases, machine learning (ML) models, particularly those based on natural language processing (NLP), have considerably expedited protein engineering. Moreover, advances in topological data analysis (TDA) and artificial intelligence-based protein structure prediction, such as AlphaFold2, have made more powerful structure-based ML-assisted protein engineering strategies possible. This review aims to offer a comprehensive, systematic, and indispensable set of methodological components, including TDA and NLP, for protein engineering and to facilitate their future development.
Affiliation(s)
- Yuchi Qiu
- Department of Mathematics, Michigan State University, East Lansing, 48824 MI, USA
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, 48824 MI, USA
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, 48824 MI, USA
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, 48824 MI, USA
| |
|
18
|
Chen L, Zhang Z, Li Z, Li R, Huo R, Chen L, Wang D, Luo X, Chen K, Liao C, Zheng M. Learning protein fitness landscapes with deep mutational scanning data from multiple sources. Cell Syst 2023; 14:706-721.e5. [PMID: 37591206 DOI: 10.1016/j.cels.2023.07.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2023] [Revised: 05/30/2023] [Accepted: 07/18/2023] [Indexed: 08/19/2023]
Abstract
One of the key points of machine learning-assisted directed evolution (MLDE) is the accurate learning of the fitness landscape, a conceptual mapping from sequence variants to the desired function. Here, we describe a multi-protein training scheme that leverages the existing deep mutational scanning data from diverse proteins to aid in understanding the fitness landscape of a new protein. Proof-of-concept trials are designed to validate this training scheme in three aspects: random and positional extrapolation for single-variant effects, zero-shot fitness predictions for new proteins, and extrapolation for higher-order variant effects from single-variant effects. Moreover, our study identified previously overlooked strong baselines, and their unexpectedly good performance brings our attention to the pitfalls of MLDE. Overall, these results may improve our understanding of the association between different protein fitness profiles and shed light on developing better machine learning-assisted approaches to the directed evolution of proteins. A record of this paper's transparent peer review process is included in the supplemental information.
Affiliation(s)
- Lin Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Zehong Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Zhenghao Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; Shanghai Institute for Advanced Immunochemical Studies, School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
| | - Rui Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; School of Pharmacy, China Pharmaceutical University, Nanjing 211198, China
| | - Ruifeng Huo
- School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China
| | - Lifan Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | | | - Xiaomin Luo
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Kaixian Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; University of Chinese Academy of Sciences, Beijing 100049, China; School of Pharmacy, China Pharmaceutical University, Nanjing 211198, China
| | - Cangsong Liao
- University of Chinese Academy of Sciences, Beijing 100049, China; Chemical Biology Research Center, Shanghai Institute of Materia Medica, Chinese Academy of Science, Shanghai 201203, China.
| | - Mingyue Zheng
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; University of Chinese Academy of Sciences, Beijing 100049, China; School of Pharmacy, China Pharmaceutical University, Nanjing 211198, China; School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China.
| |
|
19
|
Haddox HK, Galloway JG, Dadonaite B, Bloom JD, Matsen IV FA, DeWitt WS. Jointly modeling deep mutational scans identifies shifted mutational effects among SARS-CoV-2 spike homologs. bioRxiv 2023:2023.07.31.551037. [PMID: 37577604 PMCID: PMC10418112 DOI: 10.1101/2023.07.31.551037] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 08/15/2023]
Abstract
Deep mutational scanning (DMS) is a high-throughput experimental technique that measures the effects of thousands of mutations to a protein. These experiments can be performed on multiple homologs of a protein or on the same protein selected under multiple conditions. It is often of biological interest to identify mutations with shifted effects across homologs or conditions. However, it is challenging to determine if observed shifts arise from biological signal or experimental noise. Here, we describe a method for jointly inferring mutational effects across multiple DMS experiments while also identifying mutations that have shifted in their effects among experiments. A key aspect of our method is to regularize the inferred shifts, so that they are nonzero only when strongly supported by the data. We apply this method to DMS experiments that measure how mutations to spike proteins from SARS-CoV-2 variants (Delta, Omicron BA.1, and Omicron BA.2) affect cell entry. Most mutational effects are conserved between these spike homologs, but a fraction have markedly shifted. We experimentally validate a subset of the mutations inferred to have shifted effects, and confirm differences of > 1,000-fold in the impact of the same mutation on spike-mediated viral infection across spikes from different SARS-CoV-2 variants. Overall, our work establishes a general approach for comparing sets of DMS experiments to identify biologically important shifts in mutational effects.
Affiliation(s)
- Hugh K. Haddox
- Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA 98102, USA
| | - Jared G. Galloway
- Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA 98102, USA
| | - Bernadeta Dadonaite
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
| | - Jesse D. Bloom
- Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA 98102, USA
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, Seattle, WA 98109, USA
| | - Frederick A. Matsen IV
- Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA 98102, USA
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Howard Hughes Medical Institute, Seattle, WA 98109, USA
- Department of Statistics, University of Washington, Seattle, WA 98195, USA
| | - William S. DeWitt
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA 94720, USA
| |
|
20
|
Baier F, Gauye F, Perez-Carrasco R, Payne JL, Schaerli Y. Environment-dependent epistasis increases phenotypic diversity in gene regulatory networks. Science Advances 2023; 9:eadf1773. [PMID: 37224262 DOI: 10.1126/sciadv.adf1773] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/07/2022] [Accepted: 04/17/2023] [Indexed: 05/26/2023]
Abstract
Mutations to gene regulatory networks can be maladaptive or a source of evolutionary novelty. Epistasis confounds our understanding of how mutations affect the expression patterns of gene regulatory networks, a challenge exacerbated by the dependence of epistasis on the environment. We used the toolkit of synthetic biology to systematically assay the effects of pairwise and triplet combinations of mutant genotypes on the expression pattern of a gene regulatory network expressed in Escherichia coli that interprets an inducer gradient across a spatial domain. We uncovered a preponderance of epistasis that can switch in magnitude and sign across the inducer gradient to produce a greater diversity of expression pattern phenotypes than would be possible in the absence of such environment-dependent epistasis. We discuss our findings in the context of the evolution of hybrid incompatibilities and evolutionary novelties.
Affiliation(s)
- Florian Baier
- Department of Fundamental Microbiology, University of Lausanne, Biophore Building, 1015 Lausanne, Switzerland
| | - Florence Gauye
- Department of Fundamental Microbiology, University of Lausanne, Biophore Building, 1015 Lausanne, Switzerland
| | | | - Joshua L Payne
- Institute of Integrative Biology, ETH Zurich, 8092 Zurich, Switzerland
| | - Yolanda Schaerli
- Department of Fundamental Microbiology, University of Lausanne, Biophore Building, 1015 Lausanne, Switzerland
| |
|
21
|
Santos-Moreno J, Tasiudi E, Kusumawardhani H, Stelling J, Schaerli Y. Robustness and innovation in synthetic genotype networks. Nat Commun 2023; 14:2454. [PMID: 37117168 PMCID: PMC10147661 DOI: 10.1038/s41467-023-38033-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/11/2022] [Accepted: 04/13/2023] [Indexed: 04/30/2023] Open
Abstract
Genotype networks are sets of genotypes connected by small mutational changes that share the same phenotype. They facilitate evolutionary innovation by enabling the exploration of different neighborhoods in genotype space. Genotype networks, first suggested by theoretical models, have been empirically confirmed for proteins and RNAs. Comparative studies also support their existence for gene regulatory networks (GRNs), but direct experimental evidence is lacking. Here, we report the construction of three interconnected genotype networks of synthetic GRNs producing three distinct phenotypes in Escherichia coli. Our synthetic GRNs contain three nodes regulating each other by CRISPR interference and governing the expression of fluorescent reporters. The genotype networks, composed of over twenty different synthetic GRNs, provide robustness in the face of mutations while enabling transitions to innovative phenotypes. Through realistic mathematical modeling, we quantify robustness and evolvability for the complete genotype-phenotype map and link these features mechanistically to GRN motifs. Our work thereby exemplifies how GRN evolution along genotype networks might be driving evolutionary innovation.
Affiliation(s)
- Javier Santos-Moreno
- Department of Fundamental Microbiology, University of Lausanne, Biophore Building, 1015, Lausanne, Switzerland
- Department of Medicine and Life Sciences, Pompeu Fabra University, 00803, Barcelona, Spain
| | - Eve Tasiudi
- Department of Biosystems Science and Engineering, ETH Zurich and SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Hadiastri Kusumawardhani
- Department of Fundamental Microbiology, University of Lausanne, Biophore Building, 1015, Lausanne, Switzerland
| | - Joerg Stelling
- Department of Biosystems Science and Engineering, ETH Zurich and SIB Swiss Institute of Bioinformatics, Basel, Switzerland.
| | - Yolanda Schaerli
- Department of Fundamental Microbiology, University of Lausanne, Biophore Building, 1015, Lausanne, Switzerland.
| |
|
22
|
Nieves M, Buschiazzo A, Trajtenberg F. Structural features of sensory two component systems: a synthetic biology perspective. Biochem J 2023; 480:127-140. [PMID: 36688908 DOI: 10.1042/bcj20210798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2022] [Revised: 01/05/2023] [Accepted: 01/06/2023] [Indexed: 01/24/2023]
Abstract
All living organisms include a set of signaling devices that confer the ability to dynamically perceive and adapt to the fluctuating environment. Two-component systems are part of this sensory machinery that regulates the execution of different genetic and/or biochemical programs in response to specific physical or chemical signals. In the last two decades, there has been tremendous progress in our molecular understanding of how signals are detected, the allosteric mechanisms that control intramolecular information transmission, and the specificity determinants that guarantee correct wiring. All this information is starting to be exploited in the development of new synthetic networks. Connecting multiple molecular players, analogous to programming lines of code, can provide the resources to build new sophisticated biocomputing systems. The Synthetic Biology field is starting to revolutionize several scientific fields, such as biomedicine and agriculture, propelling the development of new solutions. Expanding the spectrum of available nanodevices in the toolbox is key to unleashing its full potential. This review aims to discuss, from a structural perspective, how to take advantage of the vast array of sensor and effector protein modules involved in two-component systems for the construction of new synthetic circuits.
Affiliation(s)
- Marcos Nieves
- Laboratory of Molecular and Structural Microbiology, Institut Pasteur de Montevideo, Montevideo, Uruguay
| | - Alejandro Buschiazzo
- Laboratory of Molecular and Structural Microbiology, Institut Pasteur de Montevideo, Montevideo, Uruguay
- Département de Microbiologie, Institut Pasteur, Paris, France
| | - Felipe Trajtenberg
- Laboratory of Molecular and Structural Microbiology, Institut Pasteur de Montevideo, Montevideo, Uruguay
| |
|
23
|
Hu R, Fu L, Chen Y, Chen J, Qiao Y, Si T. Protein engineering via Bayesian optimization-guided evolutionary algorithm and robotic experiments. Brief Bioinform 2023; 24:6958505. [PMID: 36562723 DOI: 10.1093/bib/bbac570] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 09/30/2022] [Revised: 11/14/2022] [Accepted: 11/22/2022] [Indexed: 12/24/2022] Open
Abstract
Directed protein evolution applies repeated rounds of genetic mutagenesis and phenotypic screening and is often limited by experimental throughput. Through in silico prioritization of mutant sequences, machine learning has been applied to reduce wet lab burden to a level practical for human researchers. On the other hand, robotics permits large batches and rapid iterations for protein engineering cycles, but such capacities have not been well exploited in existing machine learning-assisted directed evolution approaches. Here, we report a scalable and batched method, Bayesian Optimization-guided EVOlutionary (BO-EVO) algorithm, to guide multiple rounds of robotic experiments to explore protein fitness landscapes of combinatorial mutagenesis libraries. We first examined various design specifications based on an empirical landscape of protein G domain B1. Then, BO-EVO was successfully generalized to another empirical landscape of an Escherichia coli kinase PhoQ, as well as simulated NK landscapes with up to moderate epistasis. This approach was then applied to guide robotic library creation and screening to engineer enzyme specificity of RhlA, a key biosynthetic enzyme for rhamnolipid biosurfactants. A 4.8-fold improvement in producing a target rhamnolipid congener was achieved after examining less than 1% of all possible mutants after four iterations. Overall, BO-EVO proves to be an efficient and general approach to guide combinatorial protein engineering without prior knowledge.
Affiliation(s)
- Ruyun Hu
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Lihao Fu
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; CAS Key Laboratory for Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen 518055, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Yongcan Chen
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China.,CAS Key Laboratory for Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen 518055, China
| | - Junyu Chen
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Yu Qiao
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Tong Si
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China.,CAS Key Laboratory for Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen 518055, China.,University of Chinese Academy of Sciences, Beijing 100049, China
| |
Collapse
|
24
|
Kim JS, Born A, Till JKA, Liu L, Kant S, Henen MA, Vögeli B, Vázquez-Torres A. Promiscuity of response regulators for thioredoxin steers bacterial virulence. Nat Commun 2022; 13:6210. [PMID: 36266276 PMCID: PMC9584953 DOI: 10.1038/s41467-022-33983-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/12/2022] [Accepted: 10/11/2022] [Indexed: 12/24/2022] Open
Abstract
The exquisite specificity between a sensor kinase and its cognate response regulator ensures faithful partner selectivity within two-component pairs concurrently firing in a single bacterium, minimizing crosstalk with other members of this conserved family of paralogous proteins. We show that conserved hydrophobic and charged residues on the surface of thioredoxin serve as a docking station for structurally diverse response regulators. Using the OmpR protein, we identify residues in the flexible linker and the C-terminal β-hairpin that enable associations of this archetypical response regulator with thioredoxin, but are dispensable for interactions of this transcription factor to its cognate sensor kinase EnvZ, DNA or RNA polymerase. Here we show that the promiscuous interactions of response regulators with thioredoxin foster the flow of information through otherwise highly dedicated two-component signaling systems, thereby enabling both the transcription of Salmonella pathogenicity island-2 genes as well as growth of this intracellular bacterium in macrophages and mice.
Affiliation(s)
- Ju-Sim Kim
- University of Colorado School of Medicine, Department of Immunology & Microbiology, Aurora, Colorado, USA
- Alexandra Born
- University of Colorado School of Medicine, Department of Biochemistry & Molecular Genetics, Aurora, Colorado, USA
- James Karl A. Till
- University of Colorado School of Medicine, Department of Immunology & Microbiology, Aurora, Colorado, USA
- Lin Liu
- University of Colorado School of Medicine, Department of Immunology & Microbiology, Aurora, Colorado, USA
- Sashi Kant
- University of Colorado School of Medicine, Department of Immunology & Microbiology, Aurora, Colorado, USA
- Morkos A. Henen
- University of Colorado School of Medicine, Department of Biochemistry & Molecular Genetics, Aurora, Colorado, USA; Faculty of Pharmacy, Mansoura University, Mansoura, 35516 Egypt
- Beat Vögeli
- University of Colorado School of Medicine, Department of Biochemistry & Molecular Genetics, Aurora, Colorado, USA
- Andrés Vázquez-Torres
- University of Colorado School of Medicine, Department of Immunology & Microbiology, Aurora, Colorado, USA; Veterans Affairs Eastern Colorado Health Care System, Denver, Colorado, USA
25
Harman JL, Reardon PN, Costello SM, Warren GD, Phillips SR, Connor PJ, Marqusee S, Harms MJ. Evolution avoids a pathological stabilizing interaction in the immune protein S100A9. Proc Natl Acad Sci U S A 2022; 119:e2208029119. [PMID: 36194634 PMCID: PMC9565474 DOI: 10.1073/pnas.2208029119] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/09/2022] [Accepted: 09/07/2022] [Indexed: 01/03/2023] Open
Abstract
Stability constrains evolution. While much is known about constraints on destabilizing mutations, less is known about the constraints on stabilizing mutations. We recently identified a mutation in the innate immune protein S100A9 that provides insight into such constraints. When introduced into human S100A9, M63F simultaneously increases the stability of the protein and disrupts its natural ability to activate Toll-like receptor 4. Using chemical denaturation, we found that M63F stabilizes a calcium-bound conformation of hS100A9. We then used NMR to solve the structure of the mutant protein, revealing that the mutation distorts the hydrophobic binding surface of hS100A9, explaining its deleterious effect on function. Hydrogen-deuterium exchange (HDX) experiments revealed stabilization of the region around M63F in the structure, notably Phe37. In the structure of the M63F mutant, the Phe37 and Phe63 sidechains are in contact, plausibly forming an edge-face π-stack. Mutating Phe37 to Leu abolished the stabilizing effect of M63F as probed by both chemical denaturation and HDX. It also restored the biological activity of S100A9 disrupted by M63F. These findings reveal that Phe63 creates a molecular staple with Phe37 that stabilizes a nonfunctional conformation of the protein, thus disrupting function. Using a bioinformatic analysis, we found that S100A9 proteins from different organisms rarely have Phe at both positions 37 and 63, suggesting that avoiding a pathological stabilizing interaction indeed constrains S100A9 evolution. This work highlights an important evolutionary constraint on stabilizing mutations, namely, that they must avoid inappropriately stabilizing nonfunctional protein conformations.
Affiliation(s)
- Joseph L. Harman
- Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR 97403
- Institute of Molecular Biology, University of Oregon, Eugene, OR 97403
- Patrick N. Reardon
- College of Science, NMR Facility, Oregon State University, Corvallis, OR 97331
- Shawn M. Costello
- Biophysics Graduate Program, University of California, Berkeley, Berkeley, CA 94720
- Gus D. Warren
- Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR 97403
- Institute of Molecular Biology, University of Oregon, Eugene, OR 97403
- Sophia R. Phillips
- Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR 97403
- Institute of Molecular Biology, University of Oregon, Eugene, OR 97403
- Patrick J. Connor
- Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR 97403
- Institute of Molecular Biology, University of Oregon, Eugene, OR 97403
- Susan Marqusee
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720
- Department of Chemistry, University of California, Berkeley, Berkeley, CA 94720
- California Institute for Quantitative Biosciences, University of California, Berkeley, Berkeley, CA 94720
- Michael J. Harms
- Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR 97403
- Institute of Molecular Biology, University of Oregon, Eugene, OR 97403
26
Azbukina N, Zharikova A, Ramensky V. Intragenic compensation through the lens of deep mutational scanning. Biophys Rev 2022; 14:1161-1182. [PMID: 36345285 PMCID: PMC9636336 DOI: 10.1007/s12551-022-01005-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/14/2022] [Accepted: 09/26/2022] [Indexed: 12/20/2022] Open
Abstract
A significant fraction of mutations in proteins are deleterious and result in adverse consequences for protein function, stability, or interaction with other molecules. Intragenic compensation is a specific case of positive epistasis when a neutral missense mutation cancels effect of a deleterious mutation in the same protein. Permissive compensatory mutations facilitate protein evolution, since without them all sequences would be extremely conserved. Understanding compensatory mechanisms is an important scientific challenge at the intersection of protein biophysics and evolution. In human genetics, intragenic compensatory interactions are important since they may result in variable penetrance of pathogenic mutations or fixation of pathogenic human alleles in orthologous proteins from related species. The latter phenomenon complicates computational and clinical inference of an allele's pathogenicity. Deep mutational scanning is a relatively new technique that enables experimental studies of functional effects of thousands of mutations in proteins. We review the important aspects of the field and discuss existing limitations of current datasets. We reviewed ten published DMS datasets with quantified functional effects of single and double mutations and described rates and patterns of intragenic compensation in eight of them.
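As a rough illustration of the definition in this abstract, the sketch below computes pairwise epistasis for a double mutant and flags the compensation pattern (a deleterious mutation whose effect is cancelled by a mutation that is neutral on its own). It is not taken from the review: the function names, the assumption that fitness scores are on an additive (for example, log) scale, and the tolerance `tol` are all hypothetical choices made for the example.

```python
def pairwise_epistasis(f_wt, f_a, f_b, f_ab):
    """Deviation of the double mutant from the additive expectation.

    Assumes scores are on an additive scale (e.g. log fitness).
    Positive values mean the double mutant is fitter than expected.
    """
    return f_ab - f_a - f_b + f_wt


def is_compensated(f_wt, f_a, f_b, f_ab, tol=0.1):
    """Flag the pattern described above (tol is an assumed threshold):
    A is deleterious alone, B is neutral alone, and the double mutant
    AB is back within tol of the wild type.
    """
    a_deleterious = f_a < f_wt - tol
    b_neutral = abs(f_b - f_wt) <= tol
    ab_restored = abs(f_ab - f_wt) <= tol
    return a_deleterious and b_neutral and ab_restored


if __name__ == "__main__":
    # Hypothetical scores: wild type 0.0, A alone -1.0, B alone 0.0.
    print(is_compensated(0.0, -1.0, 0.0, -0.05))  # B rescues A
    print(is_compensated(0.0, -1.0, 0.0, -1.0))   # purely additive, no rescue
```

Under these assumptions a compensated pair necessarily shows positive epistasis, which matches the abstract's framing of compensation as a special case of positive epistasis.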
Affiliation(s)
- Nadezhda Azbukina
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
- Anastasia Zharikova
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
- National Medical Research Center for Therapy and Preventive Medicine, Petroverigsky per., 10, Bld.3, 101000 Moscow, Russia
- Vasily Ramensky
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
- National Medical Research Center for Therapy and Preventive Medicine, Petroverigsky per., 10, Bld.3, 101000 Moscow, Russia
27
Abstract
One core goal of genetics is to systematically understand the mapping between the DNA sequence of an organism (genotype) and its measurable characteristics (phenotype). Understanding this mapping is often challenging because of interactions between mutations, where the result of combining several different mutations can be very different than the sum of their individual effects. Here we provide a statistical framework for modeling complex genetic interactions of this type. The key idea is to ask how fast the effects of mutations change when introducing the same mutation in increasingly distant genetic backgrounds. We then propose a model for phenotypic prediction that takes into account this tendency for the effects of mutations to be more similar in nearby genetic backgrounds.
Contemporary high-throughput mutagenesis experiments are providing an increasingly detailed view of the complex patterns of genetic interaction that occur between multiple mutations within a single protein or regulatory element. By simultaneously measuring the effects of thousands of combinations of mutations, these experiments have revealed that the genotype–phenotype relationship typically reflects not only genetic interactions between pairs of sites but also higher-order interactions among larger numbers of sites. However, modeling and understanding these higher-order interactions remains challenging. Here we present a method for reconstructing sequence-to-function mappings from partially observed data that can accommodate all orders of genetic interaction. The main idea is to make predictions for unobserved genotypes that match the type and extent of epistasis found in the observed data. This information on the type and extent of epistasis can be extracted by considering how phenotypic correlations change as a function of mutational distance, which is equivalent to estimating the fraction of phenotypic variance due to each order of genetic interaction (additive, pairwise, three-way, etc.). Using these estimated variance components, we then define an empirical Bayes prior that in expectation matches the observed pattern of epistasis and reconstruct the genotype–phenotype mapping by conducting Gaussian process regression under this prior. To demonstrate the power of this approach, we present an application to the antibody-binding domain GB1 and also provide a detailed exploration of a dataset consisting of high-throughput measurements for the splicing efficiency of human pre-mRNA 5′ splice sites, for which we also validate our model predictions via additional low-throughput experiments.
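The statistic this abstract builds on, how phenotypic correlation changes with mutational distance, can be sketched in a few lines. The code below is only an illustration of that one quantity, not the authors' method (which goes on to fit variance components and a Gaussian process); the function names and the toy landscape are assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length genotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def distance_correlation(phenotypes):
    """Correlation of phenotypes between genotype pairs, grouped by Hamming distance.

    phenotypes: dict mapping genotype string -> measured phenotype.
    Returns {distance: correlation}, using the overall mean and variance.
    """
    values = list(phenotypes.values())
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (g1, p1), (g2, p2) in combinations(phenotypes.items(), 2):
        d = hamming(g1, g2)
        sums[d] += (p1 - mean) * (p2 - mean)
        counts[d] += 1
    return {d: sums[d] / counts[d] / var for d in sorted(sums)}


if __name__ == "__main__":
    # Toy, purely additive landscape on three biallelic sites:
    # the phenotype is simply the number of '1' alleles.
    genotypes = [format(i, "03b") for i in range(8)]
    landscape = {g: g.count("1") for g in genotypes}
    for d, r in distance_correlation(landscape).items():
        print(d, round(r, 3))
```

For this additive toy landscape the correlation falls linearly with distance; curvature in the decay is what would signal higher-order interactions.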
28
Qiu Y, Wei GW. CLADE 2.0: Evolution-Driven Cluster Learning-Assisted Directed Evolution. J Chem Inf Model 2022; 62:4629-4641. [PMID: 36154171 DOI: 10.1021/acs.jcim.2c01046] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/30/2022]
Abstract
Directed evolution, a revolutionary biotechnology in protein engineering, optimizes protein fitness by searching an astronomical mutational space via expensive experiments. The cluster learning-assisted directed evolution (CLADE) efficiently explores the mutational space via a combination of unsupervised hierarchical clustering and supervised learning. However, the initial-stage sampling in CLADE treats all clusters equally despite many clusters containing a large portion of non-functional mutations. Recent statistical and deep learning tools enable evolutionary density modeling to access protein fitness in an unsupervised manner. In this work, we construct an ensemble of multiple evolutionary scores to guide the initial sampling in CLADE. The resulting evolutionary score-enhanced CLADE, called CLADE 2.0, efficiently selects a training set within a small informative space using the evolution-driven clustering sampling. CLADE 2.0 is validated by using two benchmark libraries both having 160,000 sequences from four-site mutational combinations. Extensive computational experiments and comparisons with existing cutting-edge methods indicate that CLADE 2.0 is a new state-of-art tool for machine learning-assisted directed evolution.
Affiliation(s)
- Yuchi Qiu
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States; Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States; Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
29
Park Y, Metzger BPH, Thornton JW. Epistatic drift causes gradual decay of predictability in protein evolution. Science 2022; 376:823-830. [PMID: 35587978 DOI: 10.1126/science.abn6895] [Citation(s) in RCA: 42] [Impact Index Per Article: 14.0] [Indexed: 12/11/2022]
Abstract
Epistatic interactions can make the outcomes of evolution unpredictable, but no comprehensive data are available on the extent and temporal dynamics of changes in the effects of mutations as protein sequences evolve. Here, we use phylogenetic deep mutational scanning to measure the functional effect of every possible amino acid mutation in a series of ancestral and extant steroid receptor DNA binding domains. Across 700 million years of evolution, epistatic interactions caused the effects of most mutations to become decorrelated from their initial effects and their windows of evolutionary accessibility to open and close transiently. Most effects changed gradually and without bias at rates that were largely constant across time, indicating a neutral process caused by many weak epistatic interactions. Our findings show that protein sequences drift inexorably into contingency and unpredictability, but that the process is statistically predictable, given sufficient phylogenetic and experimental data.
Affiliation(s)
- Yeonwoo Park
- Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL, USA
- Brian P H Metzger
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
- Joseph W Thornton
- Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL, USA; Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA; Department of Human Genetics, University of Chicago, Chicago, IL, USA
30
Wittmann BJ, Johnston KE, Almhjell PJ, Arnold FH. evSeq: Cost-Effective Amplicon Sequencing of Every Variant in a Protein Library. ACS Synth Biol 2022; 11:1313-1324. [PMID: 35172576 DOI: 10.1021/acssynbio.1c00592] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Indexed: 12/12/2022]
Abstract
Widespread availability of protein sequence-fitness data would revolutionize both our biochemical understanding of proteins and our ability to engineer them. Unfortunately, even though thousands of protein variants are generated and evaluated for fitness during a typical protein engineering campaign, most are never sequenced, leaving a wealth of potential sequence-fitness information untapped. Primarily, this is because sequencing is unnecessary for many protein engineering strategies; the added cost and effort of sequencing are thus unjustified. It also results from the fact that, even though many lower-cost sequencing strategies have been developed, they often require at least some access to and experience with sequencing or computational resources, both of which can be barriers to access. Here, we present every variant sequencing (evSeq), a method and collection of tools/standardized components for sequencing a variable region within every variant gene produced during a protein engineering campaign at a cost of cents per variant. evSeq was designed to democratize low-cost sequencing for protein engineers and, indeed, anyone interested in engineering biological systems. Execution of its wet-lab component is simple, requires no sequencing experience to perform, relies only on resources and services typically available to biology labs, and slots neatly into existing protein engineering workflows. Analysis of evSeq data is likewise made simple by its accompanying software (found at github.com/fhalab/evSeq, documentation at fhalab.github.io/evSeq), which can be run on a personal laptop and was designed to be accessible to users with no computational experience. Low-cost and easy-to-use, evSeq makes the collection of extensive protein variant sequence-fitness data practical.
Affiliation(s)
- Bruce J. Wittmann
- Division of Biology and Biological Engineering, California Institute of Technology, MC 210-41, 1200 E. California Boulevard, Pasadena, California 91125, United States
- Kadina E. Johnston
- Division of Biology and Biological Engineering, California Institute of Technology, MC 210-41, 1200 E. California Boulevard, Pasadena, California 91125, United States
- Patrick J. Almhjell
- Division of Chemistry and Chemical Engineering, California Institute of Technology, MC 210-41, 1200 E. California Boulevard, Pasadena, California 91125, United States
- Frances H. Arnold
- Division of Biology and Biological Engineering, California Institute of Technology, MC 210-41, 1200 E. California Boulevard, Pasadena, California 91125, United States
- Division of Chemistry and Chemical Engineering, California Institute of Technology, MC 210-41, 1200 E. California Boulevard, Pasadena, California 91125, United States
31
Scheele RA, Lindenburg LH, Petek M, Schober M, Dalby KN, Hollfelder F. Droplet-based screening of phosphate transfer catalysis reveals how epistasis shapes MAP kinase interactions with substrates. Nat Commun 2022; 13:844. [PMID: 35149678 PMCID: PMC8837617 DOI: 10.1038/s41467-022-28396-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/07/2021] [Accepted: 01/10/2022] [Indexed: 11/20/2022] Open
Abstract
The combination of ultrahigh-throughput screening and sequencing informs on function and intragenic epistasis within combinatorial protein mutant libraries. Establishing a droplet-based, in vitro compartmentalised approach for robust expression and screening of protein kinase cascades (>10⁷ variants/day) allowed us to dissect the intrinsic molecular features of the MKK-ERK signalling pathway, without interference from endogenous cellular components. In a six-residue combinatorial library of the MKK1 docking domain, we identified 29,563 sequence permutations that allow MKK1 to efficiently phosphorylate and activate its downstream target kinase ERK2. A flexibly placed hydrophobic sequence motif emerges which is defined by higher order epistatic interactions between six residues, suggesting synergy that enables high connectivity in the sequence landscape. Through positive epistasis, MKK1 maintains function during mutagenesis, establishing the importance of co-dependent residues in mammalian protein kinase-substrate interactions, and creating a scenario for the evolution of diverse human signalling networks. Here, the authors use a droplet-based screen for phosphate transfer catalysis, testing variants of the human protein kinase MKK1 for its ability to activate its downstream target ERK2. Data reveal a flexible motif in the MKK1 docking domain that promotes efficient activation of ERK2, and suggest epistasis between the residues within that sequence.
Affiliation(s)
- Remkes A Scheele
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK
- Maya Petek
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK; Faculty of Medicine, University of Maribor, SI-2000, Maribor, Slovenia
- Markus Schober
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK
- Kevin N Dalby
- Division of Chemical Biology and Medicinal Chemistry, The University of Texas at Austin, Austin, TX, 78712, USA
- Florian Hollfelder
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK
32
Dubé AK, Dandage R, Dibyachintan S, Dionne U, Després PC, Landry CR. Deep Mutational Scanning of Protein-Protein Interactions Between Partners Expressed from Their Endogenous Loci In Vivo. Methods Mol Biol 2022; 2477:237-259. [PMID: 35524121 DOI: 10.1007/978-1-0716-2257-5_14] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 06/14/2023]
Abstract
Deep mutational scanning (DMS) generates mutants of a protein of interest in a comprehensive manner. CRISPR-Cas9 technology enables large-scale genome editing with high efficiency. Using both DMS and CRISPR-Cas9 therefore allows us to investigate the effects of thousands of mutations inserted directly in the genome. Combined with protein-fragment complementation assay (PCA), which enables the quantitative measurement of protein-protein interactions (PPIs) in vivo, these methods allow for the systematic assessment of the effects of mutations on PPIs in living cells. Here, we describe a method leveraging DMS, CRISPR-Cas9, and PCA to study the effect of point mutations on PPIs mediated by protein domains in yeast.
Affiliation(s)
- Alexandre K Dubé
- Département de Biochimie, Microbiologie et Bio-informatique, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada
- PROTEO, le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, QC, Canada
- Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec, QC, Canada
- Département de Biologie, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada
- Rohan Dandage
- Département de Biochimie, Microbiologie et Bio-informatique, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada
- PROTEO, le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, QC, Canada
- Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec, QC, Canada
- Département de Biologie, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada
- Soham Dibyachintan
- Département de Biochimie, Microbiologie et Bio-informatique, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada
- PROTEO, le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, QC, Canada
- Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada
- Département de Biologie, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada
- Department of Chemical Engineering, Indian Institute of Technology Bombay (IIT), Powai, Mumbai, Maharashtra, India
- Ugo Dionne
- PROTEO, le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, QC, Canada
- Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec, QC, Canada
- Centre de recherche du Centre Hospitalier Universitaire (CHU) de Québec, Université Laval, Québec, QC, Canada
- Centre de recherche sur le cancer de l'Université Laval, Québec, QC, Canada
- Philippe C Després
- Département de Biochimie, Microbiologie et Bio-informatique, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada
- PROTEO, le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, QC, Canada
- Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec, QC, Canada
- Christian R Landry
- Département de Biochimie, Microbiologie et Bio-informatique, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada
- PROTEO, le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, QC, Canada
- Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec, QC, Canada
- Département de Biologie, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada
33
Baquero F, Martínez JL, Lanza VF, Rodríguez-Beltrán J, Galán JC, San Millán A, Cantón R, Coque TM. Evolutionary Pathways and Trajectories in Antibiotic Resistance. Clin Microbiol Rev 2021; 34:e0005019. [PMID: 34190572 PMCID: PMC8404696 DOI: 10.1128/cmr.00050-19] [Citation(s) in RCA: 95] [Impact Index Per Article: 23.8] [Indexed: 12/15/2022] Open
Abstract
Evolution is the hallmark of life. Descriptions of the evolution of microorganisms have provided a wealth of information, but knowledge regarding "what happened" has precluded a deeper understanding of "how" evolution has proceeded, as in the case of antimicrobial resistance. The difficulty in answering the "how" question lies in the multihierarchical dimensions of evolutionary processes, nested in complex networks, encompassing all units of selection, from genes to communities and ecosystems. At the simplest ontological level (as resistance genes), evolution proceeds by random (mutation and drift) and directional (natural selection) processes; however, sequential pathways of adaptive variation can occasionally be observed, and under fixed circumstances (particular fitness landscapes), evolution is predictable. At the highest level (such as that of plasmids, clones, species, microbiotas), the systems' degrees of freedom increase dramatically, related to the variable dispersal, fragmentation, relatedness, or coalescence of bacterial populations, depending on heterogeneous and changing niches and selective gradients in complex environments. Evolutionary trajectories of antibiotic resistance find their way in these changing landscapes subjected to random variations, becoming highly entropic and therefore unpredictable. However, experimental, phylogenetic, and ecogenetic analyses reveal preferential frequented paths (highways) where antibiotic resistance flows and propagates, allowing some understanding of evolutionary dynamics, modeling and designing interventions. Studies on antibiotic resistance have an applied aspect in improving individual health, One Health, and Global Health, as well as an academic value for understanding evolution. Most importantly, they have a heuristic significance as a model to reduce the negative influence of anthropogenic effects on the environment.
Affiliation(s)
- F. Baquero
- Department of Microbiology, Ramón y Cajal University Hospital, Ramón y Cajal Institute for Health Research (IRYCIS), Network Center for Research in Epidemiology and Public Health (CIBERESP), Madrid, Spain
- J. L. Martínez
- National Center for Biotechnology (CNB-CSIC), Madrid, Spain
- V. F. Lanza
- Department of Microbiology, Ramón y Cajal University Hospital, Ramón y Cajal Institute for Health Research (IRYCIS), Network Center for Research in Epidemiology and Public Health (CIBERESP), Madrid, Spain
- Central Bioinformatics Unit, Ramón y Cajal Institute for Health Research (IRYCIS), Madrid, Spain
- J. Rodríguez-Beltrán
- Department of Microbiology, Ramón y Cajal University Hospital, Ramón y Cajal Institute for Health Research (IRYCIS), Network Center for Research in Epidemiology and Public Health (CIBERESP), Madrid, Spain
- J. C. Galán
- Department of Microbiology, Ramón y Cajal University Hospital, Ramón y Cajal Institute for Health Research (IRYCIS), Network Center for Research in Epidemiology and Public Health (CIBERESP), Madrid, Spain
- A. San Millán
- National Center for Biotechnology (CNB-CSIC), Madrid, Spain
- R. Cantón
- Department of Microbiology, Ramón y Cajal University Hospital, Ramón y Cajal Institute for Health Research (IRYCIS), Network Center for Research in Epidemiology and Public Health (CIBERESP), Madrid, Spain
- T. M. Coque
- Department of Microbiology, Ramón y Cajal University Hospital, Ramón y Cajal Institute for Health Research (IRYCIS), Network Center for Research in Epidemiology and Public Health (CIBERESP), Madrid, Spain
34
Chu WT, Yan Z, Chu X, Zheng X, Liu Z, Xu L, Zhang K, Wang J. Physics of biomolecular recognition and conformational dynamics. Rep Prog Phys 2021; 84:126601. [PMID: 34753115 DOI: 10.1088/1361-6633/ac3800] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/26/2021] [Accepted: 11/09/2021] [Indexed: 06/13/2023]
Abstract
Biomolecular recognition usually leads to the formation of binding complexes, often accompanied by large-scale conformational changes. This process is fundamental to biological functions at the molecular and cellular levels. Uncovering the physical mechanisms of biomolecular recognition and quantifying the key biomolecular interactions are vital to understand these functions. The recently developed energy landscape theory has been successful in quantifying recognition processes and revealing the underlying mechanisms. Recent studies have shown that in addition to affinity, specificity is also crucial for biomolecular recognition. The proposed physical concept of intrinsic specificity based on the underlying energy landscape theory provides a practical way to quantify the specificity. Optimization of affinity and specificity can be adopted as a principle to guide the evolution and design of molecular recognition. This approach can also be used in practice for drug discovery using multidimensional screening to identify lead compounds. The energy landscape topography of molecular recognition is important for revealing the underlying flexible binding or binding-folding mechanisms. In this review, we first introduce the energy landscape theory for molecular recognition and then address four critical issues related to biomolecular recognition and conformational dynamics: (1) specificity quantification of molecular recognition; (2) evolution and design in molecular recognition; (3) flexible molecular recognition; (4) chromosome structural dynamics. The results described here and the discussions of the insights gained from the energy landscape topography can provide valuable guidance for further computational and experimental investigations of biomolecular recognition and conformational dynamics.
Affiliation(s)
- Wen-Ting Chu
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
| | - Zhiqiang Yan
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
| | - Xiakun Chu
- Department of Chemistry & Physics, State University of New York at Stony Brook, Stony Brook, NY 11794, United States of America
| | - Xiliang Zheng
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
| | - Zuojia Liu
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
| | - Li Xu
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
| | - Kun Zhang
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
| | - Jin Wang
- Department of Chemistry & Physics, State University of New York at Stony Brook, Stony Brook, NY 11794, United States of America
| |
|
35
|
Abstract
Directed evolution, a strategy for protein engineering, optimizes protein properties (i.e., fitness) by expensive and time-consuming screening or selection of large mutational sequence space. Machine learning-assisted directed evolution (MLDE), which screens sequence properties in silico, can accelerate the optimization and reduce the experimental burden. This work introduces an MLDE framework, cluster learning-assisted directed evolution (CLADE), that combines hierarchical unsupervised clustering sampling and supervised learning to guide protein engineering. The clustering sampling selectively picks and screens variants in targeted subspaces, which guides the subsequent generation of diverse training sets. In the last stage, accurate predictions via supervised learning models improve final outcomes. By sequentially screening 480 sequences out of 160,000 in a four-site combinatorial library with five equal experimental batches, CLADE achieves a global maximal fitness hit rate of up to 91.0% and 34.0% for the GB1 and PhoQ datasets, respectively, improved from 18.6% and 7.2% obtained by random-sampling-based MLDE.
Affiliation(s)
- Yuchi Qiu
- Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
| | - Jian Hu
- Department of Chemistry, Michigan State University, MI, 48824, USA
- Department of Biochemistry and Molecular Biology, Michigan State University, MI, 48824, USA
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
- Department of Biochemistry and Molecular Biology, Michigan State University, MI, 48824, USA
- Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA
| |
|
36
|
Heyne M, Shirian J, Cohen I, Peleg Y, Radisky ES, Papo N, Shifman JM. Climbing Up and Down Binding Landscapes through Deep Mutational Scanning of Three Homologous Protein-Protein Complexes. J Am Chem Soc 2021; 143:17261-17275. [PMID: 34609866 PMCID: PMC8532158 DOI: 10.1021/jacs.1c08707] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/17/2021] [Indexed: 12/18/2022]
Abstract
Protein-protein interactions (PPIs) have evolved to display binding affinities that can support their function. As such, cognate and noncognate PPIs could be highly similar structurally but exhibit huge differences in binding affinities. To understand this phenomenon, we study three homologous protease-inhibitor PPIs that span 9 orders of magnitude in binding affinity. Using state-of-the-art methodology that combines protein randomization, affinity sorting, deep sequencing, and data normalization, we report quantitative binding landscapes consisting of ΔΔGbind values for the three PPIs, gleaned from tens of thousands of single and double mutations. We show that binding landscapes of the three complexes are strikingly different and depend on the PPI evolutionary optimality. We observe different patterns of couplings between mutations for the three PPIs with negative and positive epistasis appearing most frequently at hot-spot and cold-spot positions, respectively. The evolutionary trends observed here are likely to be universal to other biological complexes in the cell.
Affiliation(s)
- Michael Heyne
- Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, 9190401, Israel
- Avram and Stella Goldstein-Goren Department of Biotechnology Engineering and the National Institute of Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, 8410501, Israel
| | - Jason Shirian
- Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, 9190401, Israel
| | - Itay Cohen
- Avram and Stella Goldstein-Goren Department of Biotechnology Engineering and the National Institute of Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, 8410501, Israel
| | - Yoav Peleg
- Life Sciences Core Facilities (LSCF) Structural Proteomics Unit (SPU), Weizmann Institute of Science, Rehovot, 7610001, Israel
| | - Evette S. Radisky
- Department of Cancer Biology, Mayo Clinic Comprehensive Cancer Center, Jacksonville, Florida 32224, United States
| | - Niv Papo
- Avram and Stella Goldstein-Goren Department of Biotechnology Engineering and the National Institute of Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, 8410501, Israel
| | - Julia M. Shifman
- Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, 9190401, Israel
| |
|
37
|
Epistasis shapes the fitness landscape of an allosteric specificity switch. Nat Commun 2021; 12:5562. [PMID: 34548494 PMCID: PMC8455584 DOI: 10.1038/s41467-021-25826-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 11/29/2020] [Accepted: 09/03/2021] [Indexed: 11/08/2022] Open
Abstract
Epistasis is a major determinant in the emergence of novel protein function. In allosteric proteins, direct interactions between inducer-binding mutations propagate through the allosteric network, manifesting as epistasis at the level of biological function. Elucidating this relationship between local interactions and their global effects is essential to understanding evolution of allosteric proteins. We integrate computational design, structural and biophysical analysis to characterize the emergence of novel inducer specificity in an allosteric transcription factor. Adaptive landscapes of different inducers of the designed mutant show that a few strong epistatic interactions constrain the number of viable sequence pathways, revealing ridges in the fitness landscape leading to new specificity. The structure of the designed mutant shows that a striking change in inducer orientation still retains allosteric function. Comparing biophysical and functional properties suggests a nonlinear relationship between inducer binding affinity and allostery. Our results highlight the functional and evolutionary complexity of allosteric proteins. Epistasis plays an important role in the evolution of novel protein functions because it determines the mutational path a protein takes. Here, the authors combine functional, structural and biophysical analyses to characterize epistasis in a computationally redesigned ligand-inducible allosteric transcription factor and found that epistasis creates distinct biophysical and biological functional landscapes.
|
38
|
The search for universality in evolutionary landscapes: Comment on "From genotypes to organisms: State-of-the-art and perspectives of a cornerstone in evolutionary dynamics" by Susanna Manrubia, José A. Cuesta, et al. Phys Life Rev 2021; 39:76-78. [PMID: 34507904 DOI: 10.1016/j.plrev.2021.08.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/19/2021] [Accepted: 08/19/2021] [Indexed: 11/21/2022]
|
39
|
Phillips AM, Lawrence KR, Moulana A, Dupic T, Chang J, Johnson MS, Cvijovic I, Mora T, Walczak AM, Desai MM. Binding affinity landscapes constrain the evolution of broadly neutralizing anti-influenza antibodies. eLife 2021; 10:e71393. [PMID: 34491198 PMCID: PMC8476123 DOI: 10.7554/elife.71393] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 06/17/2021] [Accepted: 09/05/2021] [Indexed: 12/12/2022] Open
Abstract
Over the past two decades, several broadly neutralizing antibodies (bnAbs) that confer protection against diverse influenza strains have been isolated. Structural and biochemical characterization of these bnAbs has provided molecular insight into how they bind distinct antigens. However, our understanding of the evolutionary pathways leading to bnAbs, and thus how best to elicit them, remains limited. Here, we measure equilibrium dissociation constants of combinatorially complete mutational libraries for two naturally isolated influenza bnAbs (CR9114, 16 heavy-chain mutations; CR6261, 11 heavy-chain mutations), reconstructing all possible evolutionary intermediates back to the unmutated germline sequences. We find that these two libraries exhibit strikingly different patterns of breadth: while many variants of CR6261 display moderate affinity to diverse antigens, those of CR9114 display appreciable affinity only in specific, nested combinations. By examining the extensive pairwise and higher order epistasis between mutations, we find key sites with strong synergistic interactions that are highly similar across antigens for CR6261 and different for CR9114. Together, these features of the binding affinity landscapes strongly favor sequential acquisition of affinity to diverse antigens for CR9114, while the acquisition of breadth to more similar antigens for CR6261 is less constrained. These results, if generalizable to other bnAbs, may explain the molecular basis for the widespread observation that sequential exposure favors greater breadth, and such mechanistic insight will be essential for predicting and eliciting broadly protective immune responses.
Affiliation(s)
- Angela M Phillips
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, United States
| | - Katherine R Lawrence
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, United States
- NSF-Simons Center for Mathematical and Statistical Analysis of Biology, Harvard University, Cambridge, United States
- Quantitative Biology Initiative, Harvard University, Cambridge, United States
- Department of Physics, Massachusetts Institute of Technology, Cambridge, United States
| | - Alief Moulana
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, United States
| | - Thomas Dupic
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, United States
| | - Jeffrey Chang
- Department of Physics, Harvard University, Cambridge, United States
| | - Milo S Johnson
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, United States
| | - Ivana Cvijovic
- Department of Applied Physics, Stanford University, Stanford, United States
| | - Thierry Mora
- Laboratoire de physique de l'École Normale Supérieure, CNRS, PSL University, Sorbonne Université, and Université de Paris, Paris, France
| | - Aleksandra M Walczak
- Laboratoire de physique de l'École Normale Supérieure, CNRS, PSL University, Sorbonne Université, and Université de Paris, Paris, France
| | - Michael M Desai
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, United States
- NSF-Simons Center for Mathematical and Statistical Analysis of Biology, Harvard University, Cambridge, United States
- Quantitative Biology Initiative, Harvard University, Cambridge, United States
- Department of Physics, Harvard University, Cambridge, United States
| |
|
40
|
Groisman EA, Duprey A, Choi J. How the PhoP/PhoQ System Controls Virulence and Mg2+ Homeostasis: Lessons in Signal Transduction, Pathogenesis, Physiology, and Evolution. Microbiol Mol Biol Rev 2021; 85:e0017620. [PMID: 34191587 PMCID: PMC8483708 DOI: 10.1128/mmbr.00176-20] [Citation(s) in RCA: 64] [Impact Index Per Article: 16.0] [Indexed: 01/21/2023] Open
Abstract
The PhoP/PhoQ two-component system governs virulence, Mg2+ homeostasis, and resistance to a variety of antimicrobial agents, including acidic pH and cationic antimicrobial peptides, in several Gram-negative bacterial species. Best understood in Salmonella enterica serovar Typhimurium, the PhoP/PhoQ system consists o[…]-regulated gene products alter PhoP-P amounts, even under constant inducing conditions. PhoP-P controls the abundance of hundreds of proteins both directly, by having transcriptional effects on the corresponding genes, and indirectly, by modifying the abundance, activity, or stability of other transcription factors, regulatory RNAs, protease regulators, and metabolites. The investigation of PhoP/PhoQ has uncovered novel forms of signal transduction and the physiological consequences of regulon evolution.
Affiliation(s)
- Eduardo A. Groisman
- Department of Microbial Pathogenesis, Yale School of Medicine, New Haven, Connecticut, USA
- Yale Microbial Sciences Institute, West Haven, Connecticut, USA
| | - Alexandre Duprey
- Department of Microbial Pathogenesis, Yale School of Medicine, New Haven, Connecticut, USA
| | - Jeongjoon Choi
- Department of Microbial Pathogenesis, Yale School of Medicine, New Haven, Connecticut, USA
| |
|
41
|
Abstract
Duplication and divergence is a major mechanism by which new proteins and functions emerge in biology. Consequently, most organisms, in all domains of life, have genomes that encode large paralogous families of proteins. For recently duplicated pathways to acquire different, independent functions, the two paralogs must acquire mutations that effectively insulate them from one another. For instance, paralogous signaling proteins must acquire mutations that endow them with different interaction specificities such that they can participate in different signaling pathways without disruptive cross-talk. Although duplicated genes undoubtedly shape each other's evolution as they diverge and attain new functions, it is less clear how other paralogs impact or constrain gene duplication. Does the establishment of a new pathway by duplication and divergence require the system-wide optimization of all paralogs? The answer has profound implications for molecular evolution and our ability to engineer biological systems. Here, we discuss models, experiments, and approaches for tackling this question, and for understanding how new proteins and pathways are born.
Affiliation(s)
- Conor J McClune
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Chemical Engineering and ChEM-H Institute, Stanford University, Stanford, CA 94305, USA
| | - Michael T Laub
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
| |
|
42
|
Huene AL, Chen T, Nicotra ML. New binding specificities evolve via point mutation in an invertebrate allorecognition gene. iScience 2021; 24:102811. [PMID: 34296075 PMCID: PMC8282982 DOI: 10.1016/j.isci.2021.102811] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 11/30/2020] [Revised: 06/16/2021] [Accepted: 06/28/2021] [Indexed: 01/04/2023] Open
Abstract
Many organisms use genetic self-recognition systems to distinguish themselves from conspecifics. In the cnidarian, Hydractinia symbiolongicarpus, self-recognition is partially controlled by allorecognition 2 (Alr2). Alr2 encodes a highly polymorphic transmembrane protein that discriminates self from nonself by binding in trans to other Alr2 proteins with identical or similar sequences. Here, we focused on the N-terminal domain of Alr2, which can determine its binding specificity. We pair ancestral sequence reconstruction and experimental assays to show that amino acid substitutions can create sequences with novel binding specificities either directly (via one mutation) or via sequential mutations and intermediates with relaxed specificities. We also show that one side of the domain has experienced positive selection and likely forms the binding interface. Our results provide direct evidence that point mutations can generate Alr2 proteins with novel binding specificities. This provides a plausible mechanism for the generation and maintenance of functional variation in nature.
Affiliation(s)
- Aidan L. Huene
- Thomas E. Starzl Transplantation Institute, Department of Surgery, University of Pittsburgh, Pittsburgh, PA 15260, USA
- Center for Evolutionary Biology and Medicine, University of Pittsburgh, Pittsburgh, PA 15260, USA
| | - Traci Chen
- Thomas E. Starzl Transplantation Institute, Department of Surgery, University of Pittsburgh, Pittsburgh, PA 15260, USA
| | - Matthew L. Nicotra
- Thomas E. Starzl Transplantation Institute, Department of Surgery, University of Pittsburgh, Pittsburgh, PA 15260, USA
- Center for Evolutionary Biology and Medicine, University of Pittsburgh, Pittsburgh, PA 15260, USA
- Department of Immunology, University of Pittsburgh, Pittsburgh, PA 15260, USA
| |
|
43
|
Jalal ASB, Tran NT, Stevenson CE, Chan EW, Lo R, Tan X, Noy A, Lawson DM, Le TBK. Diversification of DNA-Binding Specificity by Permissive and Specificity-Switching Mutations in the ParB/Noc Protein Family. Cell Rep 2021; 32:107928. [PMID: 32698006 PMCID: PMC7383237 DOI: 10.1016/j.celrep.2020.107928] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 08/12/2019] [Revised: 03/25/2020] [Accepted: 06/26/2020] [Indexed: 12/17/2022] Open
Abstract
Specific interactions between proteins and DNA are essential to many biological processes. Yet, it remains unclear how the diversification in DNA-binding specificity was brought about, and the mutational paths that led to changes in specificity are unknown. Using a pair of evolutionarily related DNA-binding proteins, each with a different DNA preference (ParB [Partitioning Protein B] and Noc [Nucleoid Occlusion Factor], which both play roles in bacterial chromosome maintenance), we show that specificity is encoded by a set of four residues at the protein-DNA interface. Combining X-ray crystallography and deep mutational scanning of the interface, we suggest that permissive mutations must be introduced before specificity-switching mutations to reprogram specificity and that mutational paths to new specificity do not necessarily involve dual-specificity intermediates. Overall, our results provide insight into the possible evolutionary history of ParB and Noc and, in a broader context, might be useful for understanding the evolution of other classes of DNA-binding proteins. Highlights: DNA-binding specificity for parS and NBS is conserved within ParB and Noc family. Specificity is encoded by a set of four residues at the protein-DNA interface. Mutations must be introduced in a defined order to reprogram specificity.
Affiliation(s)
- Adam S B Jalal
- Department of Molecular Microbiology, John Innes Centre, Norwich NR4 7UH, UK
| | - Ngat T Tran
- Department of Molecular Microbiology, John Innes Centre, Norwich NR4 7UH, UK
| | - Clare E Stevenson
- Department of Biological Chemistry, John Innes Centre, Norwich NR4 7UH, UK
| | - Elliot W Chan
- Department of Physics, Biological Physical Sciences Institute, University of York, York YO10, UK
| | - Rebecca Lo
- Department of Molecular Microbiology, John Innes Centre, Norwich NR4 7UH, UK
| | - Xiao Tan
- Department of Molecular Microbiology, John Innes Centre, Norwich NR4 7UH, UK
| | - Agnes Noy
- Department of Physics, Biological Physical Sciences Institute, University of York, York YO10, UK
| | - David M Lawson
- Department of Biological Chemistry, John Innes Centre, Norwich NR4 7UH, UK
| | - Tung B K Le
- Department of Molecular Microbiology, John Innes Centre, Norwich NR4 7UH, UK.
| |
|
44
|
Filsinger GT, Wannier TM, Pedersen FB, Lutz ID, Zhang J, Stork DA, Debnath A, Gozzi K, Kuchwara H, Volf V, Wang S, Rios X, Gregg CJ, Lajoie MJ, Shipman SL, Aach J, Laub MT, Church GM. Characterizing the portability of phage-encoded homologous recombination proteins. Nat Chem Biol 2021; 17:394-402. [PMID: 33462496 PMCID: PMC7990699 DOI: 10.1038/s41589-020-00710-5] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 04/14/2020] [Revised: 11/02/2020] [Accepted: 11/13/2020] [Indexed: 01/29/2023]
Abstract
Efficient genome editing methods are essential for biotechnology and fundamental research. Homologous recombination (HR) is the most versatile method of genome editing, but techniques that rely on host RecA-mediated pathways are inefficient and laborious. Phage-encoded single-stranded DNA annealing proteins (SSAPs) improve HR 1,000-fold above endogenous levels. However, they are not broadly functional. Using Escherichia coli, Lactococcus lactis, Mycobacterium smegmatis, Lactobacillus rhamnosus and Caulobacter crescentus, we investigated the limited portability of SSAPs. We find that these proteins specifically recognize the C-terminal tail of the host's single-stranded DNA-binding protein (SSB) and are portable between species only if compatibility with this host domain is maintained. Furthermore, we find that co-expressing SSAPs with SSBs can significantly improve genome editing efficiency, in some species enabling SSAP functionality even without host compatibility. Finally, we find that high-efficiency HR far surpasses the mutational capacity of commonly used random mutagenesis methods, generating exceptional phenotypes that are inaccessible through sequential nucleotide conversions.
Affiliation(s)
- Gabriel T. Filsinger
- Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, USA
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Cambridge, Massachusetts, USA
| | - Timothy M. Wannier
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Cambridge, Massachusetts, USA
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA
| | - Felix B. Pedersen
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense M, Denmark
| | - Isaac D. Lutz
- Institute for Protein Design, University of Washington, Seattle, Washington, USA
- Department of Bioengineering, University of Washington, Seattle, Washington, USA
| | - Julie Zhang
- Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
| | - Devon A. Stork
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Cambridge, Massachusetts, USA
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, USA
| | - Anik Debnath
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA
- Tenza Inc., Cambridge, MA
| | - Kevin Gozzi
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
| | - Helene Kuchwara
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA
| | - Verena Volf
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Cambridge, Massachusetts, USA
- Harvard University John A. Paulson School of Engineering and Applied Sciences, Cambridge, Massachusetts, USA
| | - Stan Wang
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Cambridge, Massachusetts, USA
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA
| | - Xavier Rios
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA
| | | | - Marc J. Lajoie
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA
| | - Seth L. Shipman
- Gladstone Institutes, San Francisco, CA
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA
| | - John Aach
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA
| | - Michael T. Laub
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
| | - George M. Church
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Cambridge, Massachusetts, USA
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA
| |
|
45
|
Smerlak M. Neutral quasispecies evolution and the maximal entropy random walk. Sci Adv 2021; 7:eabb2376. [PMID: 33853768 PMCID: PMC8046360 DOI: 10.1126/sciadv.abb2376] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/10/2020] [Accepted: 02/24/2021] [Indexed: 06/12/2023]
Abstract
Even if they have no impact on phenotype, neutral mutations are not equivalent in the eyes of evolution: A robust neutral variant (one which remains functional after further mutations) is more likely to spread in a large, diverse population than a fragile one. Quasispecies theory shows that the equilibrium frequency of a genotype is proportional to its eigenvector centrality in the neutral network. This paper explores the link between the selection for mutational robustness and the navigability of neutral networks. I show that sequences of neutral mutations follow a "maximal entropy random walk," a canonical Markov chain on graphs with nonlocal, nondiffusive dynamics. I revisit M. Smith's word-game model of evolution in this light, finding that the likelihood of certain sequences of substitutions can decrease with the population size. These counterintuitive results underscore the fertility of the interface between evolutionary dynamics, information theory, and physics.
Affiliation(s)
- M Smerlak
- Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany.
| |
|
46
|
Nedrud D, Coyote-Maestas W, Schmidt D. A large-scale survey of pairwise epistasis reveals a mechanism for evolutionary expansion and specialization of PDZ domains. Proteins 2021; 89:899-914. [PMID: 33620761 DOI: 10.1002/prot.26067] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/15/2020] [Revised: 02/02/2021] [Accepted: 02/18/2021] [Indexed: 12/21/2022]
Abstract
Deep mutational scanning (DMS) facilitates data-driven models of protein structure and function. Here, we adapted Saturated Programmable Insertion Engineering (SPINE) as a programmable DMS technique. We validate SPINE with a reference single mutant dataset in the PSD95 PDZ3 domain and then characterize most pairwise double mutants to study epistasis. We observe widespread proximal negative epistasis, which we attribute to mutations affecting thermodynamic stability, and strong long-range positive epistasis, which is enriched in an evolutionarily conserved and function-defining network of "sector" and clade-specifying residues. Conditional neutrality of mutations in clade-specifying residues compensates for deleterious mutations in sector positions. This suggests that epistatic interactions between these position pairs facilitated the evolutionary expansion and specialization of PDZ domains. We propose that SPINE provides easy experimental access to reveal epistasis signatures in proteins that will improve our understanding of the structural basis for protein function and adaptation.
Affiliation(s)
- David Nedrud
- Department of Biochemistry, Molecular Biology & Biophysics, University of Minnesota, Minneapolis, Minnesota, USA
| | - Willow Coyote-Maestas
- Department of Biochemistry, Molecular Biology & Biophysics, University of Minnesota, Minneapolis, Minnesota, USA
| | - Daniel Schmidt
- Department of Genetics, Cell Biology & Development, University of Minnesota, Minneapolis, Minnesota, USA
| |
|
47
|
Lite TLV, Grant RA, Nocedal I, Littlehale ML, Guo MS, Laub MT. Uncovering the basis of protein-protein interaction specificity with a combinatorially complete library. eLife 2020; 9:e60924. [PMID: 33107822 PMCID: PMC7669267 DOI: 10.7554/elife.60924] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 07/10/2020] [Accepted: 10/26/2020] [Indexed: 12/27/2022] Open
Abstract
Protein-protein interaction specificity is often encoded at the primary sequence level. However, the contributions of individual residues to specificity are usually poorly understood and often obscured by mutational robustness, sequence degeneracy, and epistasis. Using bacterial toxin-antitoxin systems as a model, we screened a combinatorially complete library of antitoxin variants at three key positions against two toxins. This library enabled us to measure the effect of individual substitutions on specificity in hundreds of genetic backgrounds. These distributions allow inferences about the general nature of interface residues in promoting specificity. We find that positive and negative contributions to specificity are neither inherently coupled nor mutually exclusive. Further, a wild-type antitoxin appears optimized for specificity as no substitutions improve discrimination between cognate and non-cognate partners. By comparing crystal structures of paralogous complexes, we provide a rationale for our observations. Collectively, this work provides a generalizable approach to understanding the logic of molecular recognition.
Affiliation(s)
- Thuy-Lan V Lite
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, United States
- Robert A Grant
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, United States
- Isabel Nocedal
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, United States
- Megan L Littlehale
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, United States
- Monica S Guo
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, United States
- Michael T Laub
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, United States
  - Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, United States
|
48
|
Vihinen M. Functional effects of protein variants. Biochimie 2020; 180:104-120. [PMID: 33164889 DOI: 10.1016/j.biochi.2020.10.009] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 01/30/2020] [Revised: 10/15/2020] [Accepted: 10/19/2020] [Indexed: 12/11/2022]
Abstract
Genetic and other variations frequently affect protein functions. Scientific articles can contain confusing descriptions about which function or property is affected, and in many cases the statements are pure speculation without any experimental evidence. To clarify functional effects of protein variations of genetic or non-genetic origin, a systematic conceptualisation and framework are introduced. This framework describes protein functional effects on abundance, activity, specificity and affinity, along with countermeasures, which allow cells, tissues and organisms to tolerate, avoid, repair, attenuate or resist (TARAR) the effects. Effects on abundance discussed include gene dosage, restricted expression, mis-localisation and degradation. Enzymopathies, effects on kinetics, allostery and regulation of protein activity are subtopics for the effects of variants on activity. Variation outcomes on specificity and affinity comprise promiscuity, specificity, affinity and moonlighting. TARAR mechanisms redress variations with active and passive processes including chaperones, redundancy, robustness, canalisation and metabolic and signalling rewiring. A framework for pragmatic protein function analysis and presentation is introduced. All of the mechanisms and effects are described along with representative examples, most often in relation to diseases. In addition, protein function is discussed from an evolutionary point of view. Application of the presented framework facilitates unambiguous, detailed and specific description of functional effects and their systematic study.
Affiliation(s)
- Mauno Vihinen
  - Department of Experimental Medical Science, BMC B13, Lund University, SE-22 184, Lund, Sweden.
|
49
|
Amalgamated cross-species transcriptomes reveal organ-specific propensity in gene expression evolution. Nat Commun 2020; 11:4459. [PMID: 32900997 PMCID: PMC7479108 DOI: 10.1038/s41467-020-18090-8] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 10/16/2019] [Accepted: 07/29/2020] [Indexed: 12/24/2022] Open
Abstract
The origins of multicellular physiology are tied to evolution of gene expression. Genes can shift expression as organisms evolve, but how ancestral expression influences altered descendant expression is not well understood. To examine this, we amalgamate 1,903 RNA-seq datasets from 182 research projects, including 6 organs in 21 vertebrate species. Quality control eliminates project-specific biases, and expression shifts are reconstructed using gene-family-wise phylogenetic Ornstein-Uhlenbeck models. Expression shifts following gene duplication result in more drastic changes in expression properties than shifts without gene duplication. The expression properties are tightly coupled with protein evolutionary rate, depending on whether and how gene duplication occurred. Fluxes in expression patterns among organs are nonrandom, forming modular connections that are reshaped by gene duplication. Thus, if expression shifts, ancestral expression in some organs induces a strong propensity for expression in particular organs in descendants. Regardless of whether the shifts are adaptive or not, this supports a major role for what might be termed preadaptive pathways of gene expression evolution.
|
50
|
Pisa R, Phua DYZ, Kapoor TM. Distinct Mechanisms of Resistance to a CENP-E Inhibitor Emerge in Near-Haploid and Diploid Cancer Cells. Cell Chem Biol 2020; 27:850-857.e6. [PMID: 32442423 DOI: 10.1016/j.chembiol.2020.05.003] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 08/21/2019] [Revised: 04/03/2020] [Accepted: 05/04/2020] [Indexed: 12/20/2022]
Abstract
Aberrant chromosome numbers in cancer cells may impose distinct constraints on the emergence of drug resistance, a major factor limiting the long-term efficacy of molecularly targeted therapeutics. However, for most anticancer drugs we lack analyses of drug-resistance mechanisms in cells with different karyotypes. Here, we focus on GSK923295, a mitotic kinesin CENP-E inhibitor that was evaluated in clinical trials as a cancer therapeutic. We performed unbiased selections to isolate inhibitor-resistant clones in diploid and near-haploid cancer cell lines. In diploid cells we identified single-point mutations that can suppress inhibitor binding. In contrast, transcriptome analyses revealed that the C-terminus of CENP-E was disrupted in GSK923295-resistant near-haploid cells. While chemical inhibition of CENP-E is toxic to near-haploid cells, knockout of the CENPE gene does not suppress haploid cell proliferation, suggesting that deletion of the CENP-E C-terminus can confer resistance to GSK923295. Together, these findings indicate that different chromosome copy numbers in cells can alter epistatic dependencies and lead to distinct modes of chemotype-specific resistance.
Affiliation(s)
- Rudolf Pisa
  - Laboratory of Chemistry and Cell Biology, The Rockefeller University, New York, NY 10065, USA; Tri-Institutional PhD Program in Chemical Biology, The Rockefeller University, New York, NY 10065, USA
- Donovan Y Z Phua
  - Laboratory of Chemistry and Cell Biology, The Rockefeller University, New York, NY 10065, USA; Laboratory of Structural Biophysics and Mechanobiology, The Rockefeller University, New York, NY 10065, USA
- Tarun M Kapoor
  - Laboratory of Chemistry and Cell Biology, The Rockefeller University, New York, NY 10065, USA.
|