1
Barkman TJ. Applications of ancestral sequence reconstruction for understanding the evolution of plant specialized metabolism. Philos Trans R Soc Lond B Biol Sci 2024; 379:20230348. [PMID: 39343033 PMCID: PMC11439504 DOI: 10.1098/rstb.2023.0348] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/06/2024] [Revised: 04/10/2024] [Accepted: 04/15/2024] [Indexed: 10/01/2024] Open
Abstract
Studies of enzymes in modern-day plants have documented the diversity of metabolic activities retained by species today but only provide limited insight into how those properties evolved. Ancestral sequence reconstruction (ASR) is an approach that provides statistical estimates of ancient plant enzyme sequences which can then be resurrected to test hypotheses about the evolution of catalytic activities and pathway assembly. Here, I review the insights that have been obtained using ASR to study plant metabolism and highlight important methodological aspects. Overall, studies of resurrected plant enzymes show that (i) exaptation is widespread such that even low or undetectable levels of ancestral activity with a substrate can later become the apparent primary activity of descendant enzymes, (ii) intramolecular epistasis may or may not limit evolutionary paths towards catalytic or substrate preference switches, and (iii) ancient pathway flux often differs from modern-day metabolic networks. These and other insights gained from ASR would not have been possible using only modern-day sequences. Future ASR studies characterizing entire ancestral metabolic networks as well as those that link ancient structures with enzymatic properties should continue to provide novel insights into how the chemical diversity of plants evolved. This article is part of the theme issue 'The evolution of plant metabolism'.
Affiliation(s)
- Todd J. Barkman
- Department of Biological Sciences, Western Michigan University, Kalamazoo, MI 49008, USA
2
Vila JA. The origin of mutational epistasis. Eur Biophys J 2024:10.1007/s00249-024-01725-9. [PMID: 39443382 DOI: 10.1007/s00249-024-01725-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2024] [Revised: 10/03/2024] [Accepted: 10/06/2024] [Indexed: 10/25/2024]
Abstract
The interconnected processes of protein folding, mutations, epistasis, and evolution have all been the subject of extensive analysis throughout the years due to their significance for structural and evolutionary biology. The origin (molecular basis) of epistasis, the non-additive interactions between mutations, is nonetheless still unknown. The existence of a new perspective on protein folding, a problem that needs to be conceived as an 'analytic whole', will enable us to shed light on the origin of mutational epistasis at the simplest level, within proteins, while also uncovering the reasons why the genetic background in which mutations occur, a key component of molecular evolution, could foster changes in epistatic effects. Additionally, because mutations are the source of epistasis, more research is needed to determine the impact of post-translational modifications, which can potentially increase the proteome's diversity by several orders of magnitude, on mutational epistasis and protein evolvability. Finally, a thermodynamic-based analysis of protein evolution that does not consider specific mutational steps or epistatic effects will be briefly discussed. Our study explores the complex processes behind the evolution of proteins upon mutations, clearing up some previously unresolved issues, and providing direction for further research.
Affiliation(s)
- Jorge A Vila
- IMASL-CONICET, Ejército de Los Andes 950, 5700, San Luis, Argentina.
3
D’Oliviera A, Dai X, Mottaghinia S, Olson S, Geissler EP, Etienne L, Zhang Y, Mugridge JS. Recognition and Cleavage of Human tRNA Methyltransferase TRMT1 by the SARS-CoV-2 Main Protease. bioRxiv 2024:2023.02.20.529306. [PMID: 36865253 PMCID: PMC9980103 DOI: 10.1101/2023.02.20.529306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/24/2023]
Abstract
The SARS-CoV-2 main protease (Mpro, or Nsp5) is critical for the production of functional viral proteins during infection and, like many viral proteases, can also target host proteins to subvert their cellular functions. Here, we show that the human tRNA methyltransferase TRMT1 can be recognized and cleaved by SARS-CoV-2 Mpro. TRMT1 installs the N2,N2-dimethylguanosine (m2,2G) modification on mammalian tRNAs, which promotes global protein synthesis and cellular redox homeostasis. We find that Mpro can cleave endogenous TRMT1 in human cell lysate, resulting in removal of the TRMT1 zinc finger domain. TRMT1 proteolysis results in elimination of TRMT1 tRNA methyltransferase activity and reduced tRNA binding affinity. Evolutionary analysis shows that the TRMT1 cleavage site is highly conserved in mammals, except in Muroidea, where TRMT1 is likely resistant to cleavage. In primates, regions outside the cleavage site with rapid evolution could indicate adaptation to ancient viral pathogens. Furthermore, we determined the structure of a TRMT1 peptide in complex with Mpro, revealing a substrate binding conformation distinct from the majority of available Mpro-peptide complexes. Kinetic parameters for peptide cleavage show that the TRMT1(526-536) sequence is cleaved with comparable efficiency to the Mpro-targeted nsp8/9 viral cleavage site. Mutagenesis studies and molecular dynamics simulations together indicate that kinetic discrimination occurs during a later step of Mpro-mediated proteolysis that follows substrate binding. Our results provide new information about the structural basis for Mpro substrate recognition and cleavage, the functional roles of the TRMT1 zinc finger domain in tRNA binding and modification, and the regulation of TRMT1 activity by SARS-CoV-2 Mpro. These studies could inform future therapeutic design targeting Mpro and raise the possibility that proteolysis of human TRMT1 during SARS-CoV-2 infection suppresses protein translation and oxidative stress response to impact viral pathogenesis.
Significance Statement
Viral proteases can strategically target human proteins to manipulate host biochemistry during infection. Here, we show that the SARS-CoV-2 main protease (Mpro) can specifically recognize and cleave the human tRNA methyltransferase enzyme TRMT1, and that cleavage of TRMT1 cripples its ability to install a key modification on human tRNAs that is critical for protein translation. Our structural and functional analysis of the Mpro-TRMT1 interaction shows how the flexible Mpro active site engages a conserved sequence in TRMT1 in an uncommon binding mode to catalyze its cleavage and inactivation. These studies provide new insights into substrate recognition by SARS-CoV-2 Mpro that could help guide future antiviral therapeutic development and show how proteolysis of TRMT1 during SARS-CoV-2 infection impairs both TRMT1 tRNA binding and tRNA modification activity to disrupt host translation and potentially impact COVID-19 pathogenesis or phenotypes.
Affiliation(s)
- Angel D’Oliviera
- Department of Chemistry & Biochemistry, University of Delaware, Newark, DE 19716
- Xuhang Dai
- Department of Chemistry, New York University, New York, NY 10003
- Saba Mottaghinia
- CIRI (Centre International de Recherche en Infectiologie), Univ Lyon, Inserm, U1111, Université Claude Bernard Lyon 1, CNRS, UMR5308, ENS de Lyon, F-69007 Lyon, France
- Sophie Olson
- Department of Chemistry & Biochemistry, University of Delaware, Newark, DE 19716
- Evan P. Geissler
- Department of Chemistry & Biochemistry, University of Delaware, Newark, DE 19716
- Lucie Etienne
- CIRI (Centre International de Recherche en Infectiologie), Univ Lyon, Inserm, U1111, Université Claude Bernard Lyon 1, CNRS, UMR5308, ENS de Lyon, F-69007 Lyon, France
- Yingkai Zhang
- Department of Chemistry, New York University, New York, NY 10003
- Simons Center for Computational Physical Chemistry at New York University, New York, NY 10003
- Jeffrey S. Mugridge
- Department of Chemistry & Biochemistry, University of Delaware, Newark, DE 19716
4
Alpay BA, Desai MM. Effects of selection stringency on the outcomes of directed evolution. PLoS One 2024; 19:e0311438. [PMID: 39401192 PMCID: PMC11472920 DOI: 10.1371/journal.pone.0311438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2024] [Accepted: 09/18/2024] [Indexed: 10/17/2024] Open
Abstract
Directed evolution makes mutant lineages compete in climbing complicated sequence-function landscapes. Given this underlying complexity it is unclear how selection stringency, a ubiquitous parameter of directed evolution, impacts the outcome. Here we approach this question in terms of the fitnesses of the candidate variants at each round and the heterogeneity of their distributions of fitness effects. We show that even if the fittest mutant is most likely to yield the fittest mutants in the next round of selection, diversification can improve outcomes by sampling a larger variety of fitness effects. We find that heterogeneity in fitness effects between variants, larger population sizes, and evolution over a greater number of rounds all encourage diversification.
Affiliation(s)
- Berk A. Alpay
- Systems, Synthetic, and Quantitative Biology Program, Harvard University, Cambridge, MA, United States of America
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, United States of America
- Michael M. Desai
- Systems, Synthetic, and Quantitative Biology Program, Harvard University, Cambridge, MA, United States of America
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, United States of America
- Department of Physics, Harvard University, Cambridge, MA, United States of America
5
Breimann S, Kamp F, Steiner H, Frishman D. AAontology: An Ontology of Amino Acid Scales for Interpretable Machine Learning. J Mol Biol 2024; 436:168717. [PMID: 39053689 DOI: 10.1016/j.jmb.2024.168717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2024] [Revised: 07/15/2024] [Accepted: 07/19/2024] [Indexed: 07/27/2024]
Abstract
Amino acid scales are crucial for protein prediction tasks, and many of them are curated in the AAindex database. Despite various clustering attempts to organize them and to better understand their relationships, these approaches lack the fine-grained classification necessary for satisfactory interpretability in many protein prediction problems. To address this issue, we developed AAontology, a two-level classification for 586 amino acid scales (mainly from AAindex) together with an in-depth analysis of their relations, using bag-of-word-based classification, clustering, and manual refinement over multiple iterations. AAontology organizes physicochemical scales into 8 categories and 67 subcategories, enhancing the interpretability of scale-based machine learning methods in protein bioinformatics. It thereby enables researchers to gain deeper biological insight. We anticipate that AAontology will be a building block to link amino acid properties with protein function and dysfunction, as well as to aid informed decision-making in mutation analysis or protein drug design.
Affiliation(s)
- Stephan Breimann
- Department of Bioinformatics, School of Life Sciences, Technical University of Munich, Freising, Germany; Ludwig-Maximilians-University Munich, Biomedical Center, Division of Metabolic Biochemistry, Munich, Germany; German Center for Neurodegenerative Diseases (DZNE), Munich, Germany
- Frits Kamp
- Ludwig-Maximilians-University Munich, Biomedical Center, Division of Metabolic Biochemistry, Munich, Germany
- Harald Steiner
- Ludwig-Maximilians-University Munich, Biomedical Center, Division of Metabolic Biochemistry, Munich, Germany; German Center for Neurodegenerative Diseases (DZNE), Munich, Germany
- Dmitrij Frishman
- Department of Bioinformatics, School of Life Sciences, Technical University of Munich, Freising, Germany.
6
Di Bari L, Bisardi M, Cotogno S, Weigt M, Zamponi F. Emergent time scales of epistasis in protein evolution. Proc Natl Acad Sci U S A 2024; 121:e2406807121. [PMID: 39325427 PMCID: PMC11459137 DOI: 10.1073/pnas.2406807121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2024] [Accepted: 08/17/2024] [Indexed: 09/27/2024] Open
Abstract
We introduce a data-driven epistatic model of protein evolution, capable of generating evolutionary trajectories spanning very different time scales reaching from individual mutations to diverged homologs. Our in silico evolution encompasses random nucleotide mutations, insertions and deletions, and models selection using a fitness landscape, which is inferred via a generative probabilistic model for protein families. We show that the proposed framework accurately reproduces the sequence statistics of both short-time (experimental) and long-time (natural) protein evolution, suggesting applicability also to relatively data-poor intermediate evolutionary time scales, which are currently inaccessible to evolution experiments. Our model uncovers a highly collective nature of epistasis, gradually changing the fitness effect of mutations in a diverging sequence context, rather than acting via strong interactions between individual mutations. This collective nature triggers the emergence of a long evolutionary time scale, separating fast mutational processes inside a given sequence context, from the slow evolution of the context itself. The model quantitatively reproduces epistatic phenomena such as contingency and entrenchment, as well as the loss of predictability in protein evolution observed in deep mutational scanning experiments of distant homologs. It thereby deepens our understanding of the interplay between mutation and selection in shaping protein diversity and functions, allows one to statistically forecast evolution, and challenges the prevailing independent-site models of protein evolution, which are unable to capture the fundamental importance of epistasis.
Affiliation(s)
- Leonardo Di Bari
- Dipartimento Scienza Applicata e Tecnologia, Politecnico di Torino, I-10129 Torino, Italy
- Matteo Bisardi
- Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratoire de Biologie Computationnelle et Quantitative, Paris F-75005, France
- Sabrina Cotogno
- Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratoire de Biologie Computationnelle et Quantitative, Paris F-75005, France
- Martin Weigt
- Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratoire de Biologie Computationnelle et Quantitative, Paris F-75005, France
- Francesco Zamponi
- Dipartimento di Fisica, Sapienza Università di Roma, 00185 Rome, Italy
7
Ma C, Luo Y, Zhang C, Cheng C, Hua N, Liu X, Wu J, Qin L, Yu P, Luo J, Yang F, Jiang LH, Zhang G, Yang W. Evolutionary trajectory of TRPM2 channel activation by adenosine diphosphate ribose and calcium. Sci Bull (Beijing) 2024; 69:2892-2905. [PMID: 38734586 DOI: 10.1016/j.scib.2024.04.052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Revised: 02/07/2024] [Accepted: 04/19/2024] [Indexed: 05/13/2024]
Abstract
Ion channel activation upon ligand gating triggers a myriad of biological events and, therefore, the evolution of ligand gating mechanisms is of fundamental importance. TRPM2, a typical ancient ion channel, is activated by adenosine diphosphate ribose (ADPR) and calcium, and its activation has evolved from a simple mode in invertebrates to a more complex one in vertebrates, but the evolutionary process is still unknown. Molecular evolutionary analysis of TRPM2s from more than 280 different animal species has revealed that the C-terminal NUDT9-H domain has evolved from an enzyme to a ligand binding site for activation, while the N-terminal MHR domain maintains a conserved ligand binding site. The calcium gating pattern has also evolved, from one Ca2+-binding site as in sea anemones to three sites as in humans. Importantly, we identified a new group represented by olTRPM2, which has a novel gating mode and fills the missing link of the channel gating evolution. We conclude that the TRPM2 ligand binding or activation mode evolved through at least three identifiable stages in the past billion years from simple to complicated and coordinated. Such findings benefit the evolutionary investigations of other channels and proteins.
Affiliation(s)
- Cheng Ma
- Department of Biophysics and Department of Neurosurgery, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China; Protein Facility, Core Facilities, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Yanping Luo
- Department of Biophysics and Department of Neurosurgery, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Congyi Zhang
- Department of Biophysics and Department of Neurosurgery, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Cheng Cheng
- Department of Biophysics and Department of Neurosurgery, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Ning Hua
- Department of Biophysics and Department of Neurosurgery, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Xiaocao Liu
- Department of Biophysics and Department of Neurosurgery, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Jianan Wu
- Department of Biophysics and Department of Neurosurgery, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Luying Qin
- Department of Biophysics and Department of Neurosurgery, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Peilin Yu
- Department of Toxicology, and Department of Medical Oncology of The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Jianhong Luo
- Department of Neurobiology, Affiliated Mental Health Center, College of Brain Science and Brain Medicine, Zhejiang University School of Medicine, Hangzhou 310058, China
- Fan Yang
- Department of Biophysics, and Kidney Disease Center of The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Lin-Hua Jiang
- Sino-UK Joint Laboratory of Brain Function and Injury of Henan Province, and Department of Physiology and Pathophysiology, Xinxiang Medical University, Xinxiang 453004, China; Henan Collaborative Innovation Center of Prevention and Treatment of Mental Disorder, The Second Affiliated Hospital of Xinxiang Medical University, Xinxiang 453004, China
- Guojie Zhang
- Evolutionary & Organismal Biology Research Center, School of Medicine, Zhejiang University, Hangzhou 310058, China
- Wei Yang
- Department of Biophysics and Department of Neurosurgery, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China; GuiZhou University Medical College, Guiyang 550025, China.
8
Cueno ME, Kamio N, Imai K. Avian influenza A H5N1 hemagglutinin protein models have distinct structural patterns re-occurring across the 1959-2023 strains. Biosystems 2024; 246:105347. [PMID: 39349133 DOI: 10.1016/j.biosystems.2024.105347] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2024] [Revised: 09/26/2024] [Accepted: 09/27/2024] [Indexed: 10/02/2024]
Abstract
Influenza A H5N1 hemagglutinin (HA) plays a crucial role in viral pathogenesis, and changes in the HA receptor binding domain (RBD) have been linked to alterations in viral pathogenesis. Mutations often occur within the HA, which in turn results in HA structural changes that consequently contribute to protein evolution. However, the possible occurrence of mutations that result in reversion of the HA protein (going back to an ancestral protein conformation), which in turn creates distinct HA structural patterns across the 1959-2023 H5N1 viral evolution, has never been investigated. Here, we generated and verified the quality of the HA models, identified similar HA structural patterns, and elucidated the possible variations in HA RBD structural dynamics. Our results show that there are 7 distinct structural patterns occurring among the 1959-2023 H5N1 HA models, which suggests that reversion of the HA protein putatively occurs during viral evolution. Similarly, we found that the HA RBD structural dynamics vary among the 7 distinct structural patterns, possibly affecting viral pathogenesis.
Affiliation(s)
- Marni E Cueno
- Department of Microbiology and Immunology, Nihon University School of Dentistry, Tokyo, 101-8310, Japan.
- Noriaki Kamio
- Department of Microbiology and Immunology, Nihon University School of Dentistry, Tokyo, 101-8310, Japan
- Kenichi Imai
- Department of Microbiology and Immunology, Nihon University School of Dentistry, Tokyo, 101-8310, Japan
9
Duan B, Qiu C, Lockless SW, Sze SH, Kaplan CD. Higher-order epistasis within Pol II trigger loop haplotypes. bioRxiv 2024:2024.01.20.576280. [PMID: 38293233 PMCID: PMC10827151 DOI: 10.1101/2024.01.20.576280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2024]
Abstract
RNA polymerase II (Pol II) has a highly conserved domain, the trigger loop (TL), that controls transcription fidelity and speed. We previously probed pairwise genetic interactions between residues within and surrounding the TL to understand functional interactions between residues and how individual mutants might alter TL function. We identified widespread incompatibility between TLs of different species when placed in the Saccharomyces cerevisiae Pol II context, indicating species-specific interactions between otherwise highly conserved TLs and their surroundings. These interactions represent epistasis between TL residues and the rest of Pol II. We sought to understand why certain TL sequences are incompatible with S. cerevisiae Pol II and to dissect the nature of genetic interactions within multiply substituted TLs as a window on higher-order epistasis in this system. We identified both positive and negative higher-order residue interactions within example TL haplotypes. Intricate higher-order epistasis formed by TL residues was sometimes only apparent from analysis of intermediate genotypes, emphasizing the complexity of epistatic interactions. Furthermore, we distinguished TL substitutions with distinct classes of epistatic patterns, suggesting specific TL residues that potentially influence TL evolution. Our examples of complex residue interactions suggest possible pathways for epistasis to facilitate Pol II evolution.
Affiliation(s)
- Bingbing Duan
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260
- Chenxi Qiu
- Department of Genetics, Harvard Medical School, Boston, MA 02215
- Steve W Lockless
- Department of Biology, Texas A&M University, College Station, TX 77843
- Sing-Hoi Sze
- Department of Computer Science & Engineering, Texas A&M University, College Station, TX 77843
- Department of Biochemistry & Biophysics, Texas A&M University, College Station, TX 77843
- Craig D Kaplan
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260
10
Meiri R, Aharoni Lotati SL, Orenstein Y, Papo N. Deep neural networks for predicting the affinity landscape of protein-protein interactions. iScience 2024; 27:110772. [PMID: 39310756 PMCID: PMC11416218 DOI: 10.1016/j.isci.2024.110772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Revised: 06/27/2024] [Accepted: 08/15/2024] [Indexed: 09/25/2024] Open
Abstract
Studies determining protein-protein interactions (PPIs) by deep mutational scanning have focused mainly on a narrow range of affinities within complexes and thus include only partial coverage of the mutation space of given proteins. By inserting an affinity-reducing N-terminal alanine in the N-terminal domain of the tissue inhibitor of metalloproteinases-2 (N-TIMP2), we overcame the limitation of its narrow affinity range for matrix metalloproteinase 9 (MMP9CAT). We trained deep neural networks (DNNs) to quantitatively predict the binding affinity of unobserved wild-type variants and variants carrying an N-terminal alanine. Good correlation was obtained between predicted and observed log2 enrichment ratio (ER) values, which also correlated with the affinity of N-TIMP2 variants to MMP9CAT. Our ability to predict affinities of unobserved N-TIMP2 variants was confirmed on an independent dataset of experimentally validated N-TIMP2 proteins. This ability is of significant importance in the field of PPI prediction and for developing therapies targeting these interactions.
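The log2 enrichment ratio mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the authors' exact formula, so the version below assumes one common convention (variant frequency after selection divided by frequency before selection); the function name, the `pseudocount` parameter, and the example counts are all hypothetical.

```python
import math


def log2_enrichment(pre_count, post_count, pre_total, post_total, pseudocount=0.0):
    """Return log2(post-selection frequency / pre-selection frequency) for one variant.

    Assumed convention, not taken from the paper: frequencies are read counts
    divided by the total reads in each library. The optional pseudocount is added
    to the variant counts only, to avoid log(0) for variants missing from a library.
    """
    pre_freq = (pre_count + pseudocount) / pre_total
    post_freq = (post_count + pseudocount) / post_total
    return math.log2(post_freq / pre_freq)


# Hypothetical counts: a variant rising from 25% to 50% of reads has log2 ER near 1,
# and one falling from 50% to 25% has log2 ER near -1.
enriched = log2_enrichment(250, 500, 1000, 1000)
depleted = log2_enrichment(500, 250, 1000, 1000)
```

Positive values indicate variants enriched by selection and negative values indicate depleted variants, which is why the abstract can relate log2 ER to binding affinity.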
Affiliation(s)
- Reut Meiri
- School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Shay-Lee Aharoni Lotati
- Avram and Stella Goldstein-Goren Department of Biotechnology Engineering and the National Institute of Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Yaron Orenstein
- Department of Computer Science, Bar-Ilan University, Ramat Gan, Israel
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel
- Niv Papo
- Avram and Stella Goldstein-Goren Department of Biotechnology Engineering and the National Institute of Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, Israel
11
Park Y, Metzger BPH, Thornton JW. The simplicity of protein sequence-function relationships. Nat Commun 2024; 15:7953. [PMID: 39261454 PMCID: PMC11390738 DOI: 10.1038/s41467-024-51895-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Accepted: 08/20/2024] [Indexed: 09/13/2024] Open
Abstract
How complex are the rules by which a protein's sequence determines its function? High-order epistatic interactions among residues are thought to be pervasive, suggesting an idiosyncratic and unpredictable sequence-function relationship. But many prior studies may have overestimated epistasis, because they analyzed sequence-function relationships relative to a single reference sequence (which causes measurement noise and local idiosyncrasies to snowball into high-order epistasis) or they did not fully account for global nonlinearities. Here we present a reference-free method that jointly infers specific epistatic interactions and global nonlinearity using a bird's-eye view of sequence space. This technique yields the simplest explanation of sequence-function relationships and is more robust than existing methods to measurement noise, missing data, and model misspecification. We reanalyze 20 experimental datasets and find that context-independent amino acid effects and pairwise interactions, along with a simple nonlinearity to account for limited dynamic range, explain a median of 96% of phenotypic variance and over 92% in every case. Only a tiny fraction of genotypes are strongly affected by higher-order epistasis. Sequence-function relationships are also sparse: a minuscule fraction of amino acids and interactions account for 90% of phenotypic variance. Sequence-function causality across these datasets is therefore simple, opening the way for tractable approaches to characterize proteins' genetic architecture.
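The contrast between reference-based and reference-free effects can be sketched for the simplest case. The code below is an illustrative assumption, not the authors' implementation: it takes a complete combinatorial dataset, defines the zero-order term as the mean phenotype over all genotypes, and defines the first-order effect of a state as the mean phenotype of genotypes carrying that state minus the global mean. It omits the pairwise terms and the global nonlinearity that the abstract says are inferred jointly; the function name and toy data are hypothetical.

```python
from statistics import mean


def reference_free_first_order(phenotypes):
    """Compute zero- and first-order reference-free terms.

    phenotypes: dict mapping a genotype (tuple of states, one per site) to a
    measured phenotype. Assumes every combination of states is present.
    Returns (global_mean, effects), where effects[(site, state)] is the mean
    phenotype of genotypes carrying that state minus the global mean.
    """
    global_mean = mean(phenotypes.values())
    n_sites = len(next(iter(phenotypes)))
    effects = {}
    for site in range(n_sites):
        for state in {genotype[site] for genotype in phenotypes}:
            with_state = [y for g, y in phenotypes.items() if g[site] == state]
            effects[(site, state)] = mean(with_state) - global_mean
    return global_mean, effects


# Toy two-site, two-state landscape that is purely additive (hypothetical values).
toy = {("A", "A"): 0, ("B", "A"): 2, ("A", "B"): 4, ("B", "B"): 6}
global_mean, effects = reference_free_first_order(toy)
```

Because each effect is averaged over every background rather than measured against one reference genotype, noise in any single measurement has limited influence on the estimates, which is the robustness argument the abstract makes.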
Affiliation(s)
- Yeonwoo Park
- Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL, USA
- Center for RNA Research, Institute for Basic Science, Seoul, Republic of Korea
- Brian P H Metzger
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Joseph W Thornton
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA.
- Department of Human Genetics, University of Chicago, Chicago, IL, USA.
12
Hollmann F, Sanchis J, Reetz MT. Learning from Protein Engineering by Deconvolution of Multi-Mutational Variants. Angew Chem Int Ed Engl 2024; 63:e202404880. [PMID: 38884594 DOI: 10.1002/anie.202404880] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Revised: 06/05/2024] [Accepted: 06/06/2024] [Indexed: 06/18/2024]
Abstract
This review analyzes a development in biochemistry, enzymology and biotechnology that originally came as a surprise. Following the establishment of directed evolution of stereoselective enzymes in organic chemistry, the concept of partial or complete deconvolution of selective multi-mutational variants was introduced. Early deconvolution experiments of stereoselective variants led to the finding that mutations can interact cooperatively or antagonistically with one another, not just additively. During the past decade, this phenomenon was shown to be general. In some studies, molecular dynamics (MD) and quantum mechanics/molecular mechanics (QM/MM) computations were performed in order to shed light on the origin of non-additivity at all stages of an evolutionary upward climb. Data of complete deconvolution can be used to construct unique multi-dimensional rugged fitness pathway landscapes, which provide mechanistic insights different from traditional fitness landscapes. Along a related line, biochemists have long tested the result of introducing two point mutations in an enzyme for mechanistic reasons, followed by a comparison of the respective double mutant in so-called double mutant cycles, which originally showed only additive effects, but more recently also uncovered cooperative and antagonistic non-additive effects. We conclude with suggestions for future work, and call for a unified overall picture of non-additivity and epistasis.
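The double mutant cycle described above reduces to a short calculation, sketched here under stated assumptions. The measurements are taken to be on an additive scale (for example free energies), larger values are taken to mean better function, and the labels "cooperative" and "antagonistic" follow that sign convention. The function names, tolerance, and numbers are hypothetical and do not come from the review.

```python
def pairwise_epistasis(wild_type, mutant_a, mutant_b, double_mutant):
    """Return the deviation of the double mutant from the additive expectation.

    All four inputs are measurements on an additive scale (an assumption).
    A result of zero means the two mutations combine additively.
    """
    effect_a = mutant_a - wild_type
    effect_b = mutant_b - wild_type
    expected_double = wild_type + effect_a + effect_b
    return double_mutant - expected_double


def classify_interaction(epsilon, tolerance=1e-9):
    """Label the interaction, assuming larger measurements mean better function."""
    if abs(epsilon) <= tolerance:
        return "additive"
    return "cooperative" if epsilon > 0 else "antagonistic"


# Hypothetical cycle: single mutants gain 1.0 and 0.5 units over wild type,
# but the double mutant gains 2.5 instead of the additive 1.5.
epsilon = pairwise_epistasis(0.0, 1.0, 0.5, 2.5)
label = classify_interaction(epsilon)
```

Complete deconvolution of a variant with n mutations extends the same comparison to all 2^n combinations, which is what makes the intermediate genotypes informative.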
Affiliation(s)
- Frank Hollmann
- Department of Biotechnology, Delft University of Technology, Van der Maasweg 9, 2629HZ, Delft, Netherlands
| | - Joaquin Sanchis
- Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, Victoria, 3052, Australia
| | - Manfred T Reetz
- Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45481, Mülheim, Germany
- Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China
| |
|
13
|
Taylor AL, Starr TN. Deep mutational scanning of SARS-CoV-2 Omicron BA.2.86 and epistatic emergence of the KP.3 variant. Virus Evol 2024; 10:veae067. [PMID: 39310091 PMCID: PMC11414647 DOI: 10.1093/ve/veae067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2024] [Revised: 08/20/2024] [Accepted: 08/28/2024] [Indexed: 09/25/2024] Open
Abstract
Deep mutational scanning experiments aid in the surveillance and forecasting of viral evolution by providing prospective measurements of mutational effects on viral traits, but epistatic shifts in the impacts of mutations can hinder viral forecasting when measurements were made in outdated strain backgrounds. Here, we report measurements of the impact of all single amino acid mutations on ACE2-binding affinity and protein folding and expression in the SARS-CoV-2 Omicron BA.2.86 spike receptor-binding domain. As with other SARS-CoV-2 variants, we find a plastic and evolvable basis for receptor binding, with many mutations at the ACE2 interface maintaining or even improving ACE2-binding affinity. Despite its large genetic divergence, mutational effects in BA.2.86 have not diverged greatly from those measured in its Omicron BA.2 ancestor. However, we do identify strong positive epistasis among subsequent mutations that have accrued in BA.2.86 descendants. Specifically, the Q493E mutation that decreased ACE2-binding affinity in all previous SARS-CoV-2 backgrounds is reversed in sign to enhance human ACE2-binding affinity when coupled with L455S and F456L in the currently emerging KP.3 variant. Our results point to a modest degree of epistatic drift in mutational effects during recent SARS-CoV-2 evolution but highlight how these small epistatic shifts can have important consequences for the emergence of new SARS-CoV-2 variants.
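The sign reversal described for Q493E can be expressed as a comparison of one mutation's effect in two backgrounds. The sketch below is only an illustration: the binding scores are made-up placeholders (higher meaning tighter binding), not values from the study, and the helper names are hypothetical.

```python
def mutation_effect(affinity, background, mutation):
    """Effect of adding `mutation` to `background` (a frozenset of mutations)."""
    return affinity[background | {mutation}] - affinity[background]


def is_sign_epistasis(effect_1, effect_2):
    """True when the same mutation helps in one background and hurts in the other."""
    return effect_1 * effect_2 < 0


# Placeholder binding scores, NOT measured data.
affinity = {
    frozenset(): 9.0,
    frozenset({"Q493E"}): 8.5,
    frozenset({"L455S", "F456L"}): 8.8,
    frozenset({"L455S", "F456L", "Q493E"}): 9.4,
}

alone = mutation_effect(affinity, frozenset(), "Q493E")
combined = mutation_effect(affinity, frozenset({"L455S", "F456L"}), "Q493E")
print(is_sign_epistasis(alone, combined))
```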
Affiliation(s)
- Ashley L Taylor
- Department of Biochemistry, University of Utah School of Medicine, 15 N Medical Dr E, Salt Lake City, UT 84112, USA
| | - Tyler N Starr
- Department of Biochemistry, University of Utah School of Medicine, 15 N Medical Dr E, Salt Lake City, UT 84112, USA
| |
|
14
|
Dietler N, Abbara A, Choudhury S, Bitbol AF. Impact of phylogeny on the inference of functional sectors from protein sequence data. PLoS Comput Biol 2024; 20:e1012091. [PMID: 39312591 PMCID: PMC11449291 DOI: 10.1371/journal.pcbi.1012091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2024] [Revised: 10/03/2024] [Accepted: 09/10/2024] [Indexed: 09/25/2024] Open
Abstract
Statistical analysis of multiple sequence alignments of homologous proteins has revealed groups of coevolving amino acids called sectors. These groups of amino-acid sites feature collective correlations in their amino-acid usage, and they are associated with functional properties. Modeling showed that nonlinear selection on an additive functional trait of a protein is generically expected to give rise to a functional sector. These modeling results motivated a principled method, called ICOD, which is designed to identify functional sectors, as well as mutational effects, from sequence data. However, a challenge for all methods aiming to identify sectors from multiple sequence alignments is that correlations in amino-acid usage can also arise from the mere fact that homologous sequences share common ancestry, i.e. from phylogeny. Here, we generate controlled synthetic data from a minimal model comprising both phylogeny and functional sectors. We use this data to dissect the impact of phylogeny on sector identification and on mutational effect inference by different methods. We find that ICOD is most robust to phylogeny, but that conservation is also quite robust. Next, we consider natural multiple sequence alignments of protein families for which deep mutational scan experimental data is available. We show that in this natural data, conservation and ICOD best identify sites with strong functional roles, in agreement with our results on synthetic data. Importantly, these two methods have different premises, since they respectively focus on conservation and on correlations. Thus, their joint use can reveal complementary information.
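As a rough illustration of what a per-site conservation score looks like, the sketch below computes a generic entropy-based score for each alignment column. It is not ICOD and not necessarily the conservation measure used by the authors; the toy alignment and the function names are assumptions, and gaps and sequence weighting are ignored.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues observed in one column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conservation_scores(alignment):
    """Score each column from 0 (uniform over 20 residues) to 1 (invariant).

    `alignment` is a list of equal-length aligned sequences.
    """
    max_entropy = math.log2(20)
    return [1.0 - column_entropy(col) / max_entropy for col in zip(*alignment)]


# Toy alignment for illustration only.
toy_alignment = ["ACDE", "ACDF", "ACEG", "ACEH"]
scores = conservation_scores(toy_alignment)
print(scores)
```

A correlation-based method would instead start from pairwise statistics between columns, which is where shared ancestry can introduce spurious signal.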
Affiliation(s)
- Nicola Dietler
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Alia Abbara
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Subham Choudhury
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Anne-Florence Bitbol
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
| |
|
15
|
Šakanović A, Kranjc N, Omersa N, Aden S, Kežar A, Kisovec M, Zavec AB, Caserman S, Gilbert RJC, Podobnik M, Crnković A, Anderluh G. In vitro evolution driven by epistasis reveals alternative cholesterol-specific binding motifs of perfringolysin O. J Biol Chem 2024; 300:107664. [PMID: 39128714 PMCID: PMC11416283 DOI: 10.1016/j.jbc.2024.107664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Revised: 07/22/2024] [Accepted: 08/05/2024] [Indexed: 08/13/2024] Open
Abstract
The crucial molecular factors that shape the interfaces of lipid-binding proteins with their target ligands and surfaces remain unknown due to the complex makeup of biological membranes. Cholesterol, the major modulator of bilayer structure in mammalian cell membranes, is recognized by various proteins, including the well-studied cholesterol-dependent cytolysins. Here, we use in vitro evolution to investigate the molecular adaptations that preserve the cholesterol specificity of perfringolysin O, the prototypical cholesterol-dependent cytolysin from Clostridium perfringens. We identify variants with altered membrane-binding interfaces whose cholesterol-specific activity exceeds that of the wild-type perfringolysin O. These novel variants represent alternative evolutionary outcomes and have mutations at conserved positions that can only accumulate when epistatic constraints are alleviated. Our results improve the current understanding of the biochemical malleability of the surface of a lipid-binding protein.
Affiliation(s)
- Aleksandra Šakanović
- Department of Molecular Biology and Nanobiotechnology, National Institute of Chemistry, Ljubljana, Slovenia
| | - Nace Kranjc
- Department of Molecular Biology and Nanobiotechnology, National Institute of Chemistry, Ljubljana, Slovenia
| | - Neža Omersa
- Department of Molecular Biology and Nanobiotechnology, National Institute of Chemistry, Ljubljana, Slovenia
| | - Saša Aden
- Department of Molecular Biology and Nanobiotechnology, National Institute of Chemistry, Ljubljana, Slovenia
| | - Andreja Kežar
- Department of Molecular Biology and Nanobiotechnology, National Institute of Chemistry, Ljubljana, Slovenia
| | - Matic Kisovec
- Department of Molecular Biology and Nanobiotechnology, National Institute of Chemistry, Ljubljana, Slovenia
| | - Apolonija Bedina Zavec
- Department of Molecular Biology and Nanobiotechnology, National Institute of Chemistry, Ljubljana, Slovenia
| | - Simon Caserman
- Department of Molecular Biology and Nanobiotechnology, National Institute of Chemistry, Ljubljana, Slovenia
| | - Robert J C Gilbert
- Division of Structural Biology, Wellcome Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
| | - Marjetka Podobnik
- Department of Molecular Biology and Nanobiotechnology, National Institute of Chemistry, Ljubljana, Slovenia
| | - Ana Crnković
- Department of Molecular Biology and Nanobiotechnology, National Institute of Chemistry, Ljubljana, Slovenia.
| | - Gregor Anderluh
- Department of Molecular Biology and Nanobiotechnology, National Institute of Chemistry, Ljubljana, Slovenia.
| |
|
16
|
Lipsh-Sokolik R, Fleishman SJ. Addressing epistasis in the design of protein function. Proc Natl Acad Sci U S A 2024; 121:e2314999121. [PMID: 39133844 PMCID: PMC11348311 DOI: 10.1073/pnas.2314999121] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/29/2024] Open
Abstract
Mutations in protein active sites can dramatically improve function. The active site, however, is densely packed and extremely sensitive to mutations. Therefore, some mutations may only be tolerated in combination with others in a phenomenon known as epistasis. Epistasis reduces the likelihood of obtaining improved functional variants and dramatically slows natural and lab evolutionary processes. Research has shed light on the molecular origins of epistasis and its role in shaping evolutionary trajectories and outcomes. In addition, sequence- and AI-based strategies that infer epistatic relationships from mutational patterns in natural or experimental evolution data have been used to design functional protein variants. In recent years, combinations of such approaches and atomistic design calculations have successfully predicted highly functional combinatorial mutations in active sites. These were used to design thousands of functional active-site variants, demonstrating that, while our understanding of epistasis remains incomplete, some of the determinants that are critical for accurate design are now sufficiently understood. We conclude that the space of active-site variants that has been explored by evolution may be expanded dramatically to enhance natural activities or discover new ones. Furthermore, design opens the way to systematically exploring sequence and structure space and mutational impacts on function, deepening our understanding and control over protein activity.
Affiliation(s)
- Rosalie Lipsh-Sokolik
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 7610001, Israel
| | - Sarel J Fleishman
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 7610001, Israel
| |
|
17
|
Wu Y, Yang Y, Lu G, Xiang WL, Sun TY, Chen KW, Lv X, Gui YF, Zeng RQ, Du YK, Fu CH, Huang JW, Chen CC, Guo RT, Yu LJ. Unleashing the Power of Evolution in Xylanase Engineering: Investigating the Role of Distal Mutation Regulation. J Agric Food Chem 2024; 72:18201-18213. [PMID: 39082219 DOI: 10.1021/acs.jafc.4c03245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2024]
Abstract
The drive to enhance enzyme performance in industrial applications frequently clashes with the practical limitations of exhaustive experimental screening, underscoring the urgency for more refined and strategic methodologies in enzyme engineering. In this study, xylanase Xyl-1 was used as the model, coupling evolutionary insights with energy functions to obtain theoretical potential mutants, which were subsequently validated experimentally. We observed that mutations in the nonloop region primarily aimed at enhancing stability and also encountered selective pressure for activity. Notably, mutations in this region simultaneously boosted the Xyl-1 stability and activity, achieving a 65% success rate. Using a greedy strategy, mutant M4 was developed, achieving a 12 °C higher melting temperature and doubled activity. By integration of spectroscopy, crystallography, and quantum mechanics/molecular mechanics molecular dynamics, the mechanism behind the enhanced thermal stability of M4 was elucidated. It was determined that the activity differences between M4 and the wild type were primarily driven by dynamic factors influenced by distal mutations. In conclusion, the study emphasizes the pivotal role of evolution-based approaches in augmenting the stability and activity of the enzymes. It sheds light on the unique adaptive mechanisms employed by various structural regions of proteins and expands our understanding of the intricate relationship between distant mutations and enzyme dynamics.
Affiliation(s)
- Ya Wu
- Institute of Resource Biology and Biotechnology, Department of Biotechnology, College of Life Science and Technology, Huazhong University of Science and Technology, 1037 Luoyu Road, Wuhan 430074, China
- Key Laboratory of Molecular Biophysics, Ministry of Education, 1037 Luoyu Road, Wuhan 430074, China
| | - Yu Yang
- State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Hongshan Laboratory, Hubei Collaborative Innovation Center for Green Transformation of Bio-Resources, Hubei Key Laboratory of Industrial Biotechnology, School of Life Sciences, Hubei University, Wuhan 430062, China
| | - Gen Lu
- Institute of Resource Biology and Biotechnology, Department of Biotechnology, College of Life Science and Technology, Huazhong University of Science and Technology, 1037 Luoyu Road, Wuhan 430074, China
- Key Laboratory of Molecular Biophysics, Ministry of Education, 1037 Luoyu Road, Wuhan 430074, China
| | - Wan-Lu Xiang
- State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Hongshan Laboratory, Hubei Collaborative Innovation Center for Green Transformation of Bio-Resources, Hubei Key Laboratory of Industrial Biotechnology, School of Life Sciences, Hubei University, Wuhan 430062, China
| | - Tian-Yu Sun
- Shenzhen Bay Laboratory, Shenzhen 518132, China
| | - Ke-Wei Chen
- Lab of Computational Chemistry and Drug Design, State Key Laboratory of Chemical Oncogenomics, Peking University Shenzhen Graduate School, Shenzhen 518055, China
| | - Xiang Lv
- Ministry of Education Key Laboratory of Industrial Biotechnology, School of Biotechnology, Jiangnan University, Wuxi 214122, China
| | - Yi-Fan Gui
- Institute of Resource Biology and Biotechnology, Department of Biotechnology, College of Life Science and Technology, Huazhong University of Science and Technology, 1037 Luoyu Road, Wuhan 430074, China
- Key Laboratory of Molecular Biophysics, Ministry of Education, 1037 Luoyu Road, Wuhan 430074, China
| | - Rui-Qi Zeng
- Institute of Resource Biology and Biotechnology, Department of Biotechnology, College of Life Science and Technology, Huazhong University of Science and Technology, 1037 Luoyu Road, Wuhan 430074, China
| | - Yi-Kai Du
- Institute of Resource Biology and Biotechnology, Department of Biotechnology, College of Life Science and Technology, Huazhong University of Science and Technology, 1037 Luoyu Road, Wuhan 430074, China
| | - Chun-Hua Fu
- Institute of Resource Biology and Biotechnology, Department of Biotechnology, College of Life Science and Technology, Huazhong University of Science and Technology, 1037 Luoyu Road, Wuhan 430074, China
- Key Laboratory of Molecular Biophysics, Ministry of Education, 1037 Luoyu Road, Wuhan 430074, China
| | - Jian-Wen Huang
- State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Hongshan Laboratory, Hubei Collaborative Innovation Center for Green Transformation of Bio-Resources, Hubei Key Laboratory of Industrial Biotechnology, School of Life Sciences, Hubei University, Wuhan 430062, China
| | - Chun-Chi Chen
- State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Hongshan Laboratory, Hubei Collaborative Innovation Center for Green Transformation of Bio-Resources, Hubei Key Laboratory of Industrial Biotechnology, School of Life Sciences, Hubei University, Wuhan 430062, China
- Zhejiang Key Laboratory of Medical Epigenetics, Department of Immunology and Pathogen Biology, School of Basic Medical Sciences, Hangzhou Normal University, Hangzhou 311121, China
| | - Rey-Ting Guo
- State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Hongshan Laboratory, Hubei Collaborative Innovation Center for Green Transformation of Bio-Resources, Hubei Key Laboratory of Industrial Biotechnology, School of Life Sciences, Hubei University, Wuhan 430062, China
- Zhejiang Key Laboratory of Medical Epigenetics, Department of Immunology and Pathogen Biology, School of Basic Medical Sciences, Hangzhou Normal University, Hangzhou 311121, China
| | - Long-Jiang Yu
- Institute of Resource Biology and Biotechnology, Department of Biotechnology, College of Life Science and Technology, Huazhong University of Science and Technology, 1037 Luoyu Road, Wuhan 430074, China
- Key Laboratory of Molecular Biophysics, Ministry of Education, 1037 Luoyu Road, Wuhan 430074, China
| |
|
18
|
Johnston KE, Almhjell PJ, Watkins-Dulaney EJ, Liu G, Porter NJ, Yang J, Arnold FH. A combinatorially complete epistatic fitness landscape in an enzyme active site. Proc Natl Acad Sci U S A 2024; 121:e2400439121. [PMID: 39074291 PMCID: PMC11317637 DOI: 10.1073/pnas.2400439121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2024] [Accepted: 06/17/2024] [Indexed: 07/31/2024] Open
Abstract
Protein engineering often targets binding pockets or active sites which are enriched in epistasis (nonadditive interactions between amino acid substitutions) and where the combined effects of multiple single substitutions are difficult to predict. Few existing sequence-fitness datasets capture epistasis at large scale, especially for enzyme catalysis, limiting the development and assessment of model-guided enzyme engineering approaches. We present here a combinatorially complete, 160,000-variant fitness landscape across four residues in the active site of an enzyme. Assaying the native reaction of a thermostable β-subunit of tryptophan synthase (TrpB) in a nonnative environment yielded a landscape characterized by significant epistasis and many local optima. These effects prevent simulated directed evolution approaches from efficiently reaching the global optimum. There is nonetheless wide variability in the effectiveness of different directed evolution approaches, which together provide experimental benchmarks for computational and machine learning workflows. The most-fit TrpB variants contain a substitution that is nearly absent in natural TrpB sequences, a result that conservation-based predictions would not capture. Thus, although fitness prediction using evolutionary data can enrich in more-active variants, these approaches struggle to identify and differentiate among the most-active variants, even for this near-native function. Overall, this work presents a large-scale testing ground for model-guided enzyme engineering and suggests that efficient navigation of epistatic fitness landscapes can be improved by advances in both machine learning and physical modeling.
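The 160,000 figure is the number of ways to fill four sites with 20 amino acids (20^4). The sketch below shows that count and a simple greedy single-substitution walk that can stall on a local optimum. The fitness function is a randomly generated toy model with additive and pairwise terms, not the TrpB data, and the walk is a generic stand-in rather than any of the directed evolution strategies the authors simulated; all names are hypothetical.

```python
import itertools
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 canonical residues
N_SITES = 4


def make_toy_fitness(seed=0):
    """Build a toy fitness function (assumption: additive plus pairwise terms)."""
    rng = random.Random(seed)
    site_pairs = list(itertools.combinations(range(N_SITES), 2))
    additive = {(i, aa): rng.gauss(0, 1)
                for i in range(N_SITES) for aa in AMINO_ACIDS}
    pairwise = {(i, j, a, b): rng.gauss(0, 1)
                for i, j in site_pairs
                for a in AMINO_ACIDS for b in AMINO_ACIDS}

    def fitness(variant):
        total = sum(additive[(i, aa)] for i, aa in enumerate(variant))
        total += sum(pairwise[(i, j, variant[i], variant[j])]
                     for i, j in site_pairs)
        return total

    return fitness


def single_mutants(variant):
    """Yield every variant that differs from `variant` at exactly one site."""
    for i in range(N_SITES):
        for aa in AMINO_ACIDS:
            if aa != variant[i]:
                yield variant[:i] + aa + variant[i + 1:]


def greedy_walk(start, fitness):
    """Take the best single substitution until none improves fitness."""
    current = start
    while True:
        best = max(single_mutants(current), key=fitness)
        if fitness(best) <= fitness(current):
            return current  # local optimum, not necessarily global
        current = best


print(len(AMINO_ACIDS) ** N_SITES)  # size of the complete landscape
toy_fitness = make_toy_fitness()
endpoint = greedy_walk("AAAA", toy_fitness)
print(endpoint, toy_fitness(endpoint))
```

Checking whether the endpoint is the global optimum would require scoring all variants, which is the comparison a combinatorially complete dataset makes possible.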
Affiliation(s)
- Kadina E. Johnston
- Division of Biology and Bioengineering, California Institute of Technology, Pasadena, CA 91125
| | - Patrick J. Almhjell
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125
| | - Ella J. Watkins-Dulaney
- Division of Biology and Bioengineering, California Institute of Technology, Pasadena, CA 91125
| | - Grace Liu
- Division of Biology and Bioengineering, California Institute of Technology, Pasadena, CA 91125
| | - Nicholas J. Porter
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125
| | - Jason Yang
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125
| | - Frances H. Arnold
- Division of Biology and Bioengineering, California Institute of Technology, Pasadena, CA 91125
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125
| |
|
19
|
Marsili G, Pallotto C, Fortuna C, Amendola A, Fiorentini C, Esperti S, Blanc P, Suardi LR, Giulietta V, Argentini C. Fifty years after the first identification of Toscana virus in Italy: Genomic characterization of viral isolates within lineage A and aminoacidic markers of evolution. Infect Genet Evol 2024; 122:105601. [PMID: 38830443 DOI: 10.1016/j.meegid.2024.105601] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2024] [Revised: 04/18/2024] [Accepted: 05/03/2024] [Indexed: 06/05/2024]
Abstract
Toscana virus (TosV) was first isolated from phlebotomine sand flies at our Institute about fifty years ago. Later, in 1984-1985, TosV infection, although asymptomatic in most cases, was shown to cause disease in humans, mainly fever and meningitis. By means of genetic analysis of part of the M segment, we describe 3 new viral isolates obtained directly from cerebrospinal fluid or serum samples of patients diagnosed with TosV infection in July 2020 in the Tuscany region. Phylogenetic analysis was used to propose the clustering of TosV lineage A strains into 3 main groups, whereas deep mutational analysis based on 12 amino acid positions allowed the identification of 9 putative strains. We discuss deep mutational analysis as a method to identify molecular signatures of host adaptation and/or pathogenesis.
Affiliation(s)
- Giulia Marsili
- Dipartimento di Malattie Infettive, Istituto Superiore di Sanità, Roma, Italy
| | - Carlo Pallotto
- SOC Malattie Infettive 1, Azienda USL Toscana Centro, Bagno a Ripoli, Firenze, Italy; Clinica delle Malattie Infettive, Azienda Ospedaliera Santa Maria della Misericordia, Università di Perugia, Perugia, Italy
| | - Claudia Fortuna
- Dipartimento di Malattie Infettive, Istituto Superiore di Sanità, Roma, Italy
| | - Antonello Amendola
- Dipartimento di Malattie Infettive, Istituto Superiore di Sanità, Roma, Italy
| | | | - Sara Esperti
- SOC Malattie Infettive 1, Azienda USL Toscana Centro, Bagno a Ripoli, Firenze, Italy; Dipartimento di Malattie Infettive, Azienda Ospedaliero-Universitaria di Modena, Policlinico di Modena, Università di Modena e Reggio Emilia, Modena, Italy
| | - Pierluigi Blanc
- SOC Malattie Infettive 1, Azienda USL Toscana Centro, Bagno a Ripoli, Firenze, Italy; SOC Malattie Infettive 2, Azienda USL Toscana Centro, Pistoia, Italy
| | - Lorenzo Roberto Suardi
- SOC Malattie Infettive 1, Azienda USL Toscana Centro, Bagno a Ripoli, Firenze, Italy; UO Malattie Infettive, Azienda Ospedaliero-Universitaria Pisana, Pisa, Italy
| | - Venturi Giulietta
- Dipartimento di Malattie Infettive, Istituto Superiore di Sanità, Roma, Italy
| | - Claudio Argentini
- Dipartimento di Malattie Infettive, Istituto Superiore di Sanità, Roma, Italy.
| |
|
20
|
Listov D, Goverde CA, Correia BE, Fleishman SJ. Opportunities and challenges in design and optimization of protein function. Nat Rev Mol Cell Biol 2024; 25:639-653. [PMID: 38565617 PMCID: PMC7616297 DOI: 10.1038/s41580-024-00718-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/27/2024] [Indexed: 04/04/2024]
Abstract
The field of protein design has made remarkable progress over the past decade. Historically, the low reliability of purely structure-based design methods limited their application, but recent strategies that combine structure-based and sequence-based calculations, as well as machine learning tools, have dramatically improved protein engineering and design. In this Review, we discuss how these methods have enabled the design of increasingly complex structures and therapeutically relevant activities. Additionally, protein optimization methods have improved the stability and activity of complex eukaryotic proteins. Thanks to their increased reliability, computational design methods have been applied to improve therapeutics and enzymes for green chemistry and have generated vaccine antigens, antivirals and drug-delivery nano-vehicles. Moreover, the high success of design methods reflects an increased understanding of basic rules that govern the relationships among protein sequence, structure and function. However, de novo design is still limited mostly to α-helix bundles, restricting its potential to generate sophisticated enzymes and diverse protein and small-molecule binders. Designing complex protein structures is a challenging but necessary next step if we are to realize our objective of generating new-to-nature activities.
Affiliation(s)
- Dina Listov
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel
| | - Casper A Goverde
- Institute of Bioengineering, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Bruno E Correia
- Institute of Bioengineering, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.
| | - Sarel Jacob Fleishman
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel.
| |
|
21
|
Vila JA. Analysis of proteins in the light of mutations. Eur Biophys J 2024; 53:255-265. [PMID: 38955858 DOI: 10.1007/s00249-024-01714-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2023] [Revised: 05/23/2024] [Accepted: 06/18/2024] [Indexed: 07/04/2024]
Abstract
Proteins have evolved through mutations (amino acid substitutions) since life appeared on Earth, some 10^9 years ago. The study of these phenomena has been of particular significance because of their impact on protein stability, function, and structure. This study offers a new viewpoint on how the most recent findings in these areas can be used to explore the impact of mutations on protein sequence, stability, and evolvability. Preliminary results indicate that: (1) mutations can be viewed as sensitive probes to identify 'typos' in the amino-acid sequence, and also to assess the resistance of naturally occurring proteins to unwanted sequence alterations; (2) the presence of 'typos' in the amino acid sequence, rather than being an evolutionary obstacle, could promote faster evolvability and, in turn, increase the likelihood of higher protein stability; (3) the mutation site is far more important than the substituted amino acid in terms of the marginal stability changes of the protein, and (4) the unpredictability of protein evolution at the molecular level (by mutations) exists even in the absence of epistasis effects. Finally, the Darwinian concept of evolution "descent with modification" and experimental evidence endorse one of the results of this study, which suggests that some regions of any protein sequence are susceptible to mutations while others are not. This work contributes to our general understanding of protein responses to mutations and may spur significant progress in our efforts to develop methods to accurately forecast changes in protein stability, their propensity for metamorphism, and their ability to evolve.
Affiliation(s)
- Jorge A Vila
- IMASL-CONICET, Universidad Nacional de San Luis, Ejército de los Andes 950, 5700, San Luis, Argentina.
| |
|
22
|
Chamness LM, Kuntz CP, McKee AG, Penn WD, Hemmerich CM, Rusch DB, Woods H, Dyotima, Meiler J, Schlebach JP. Divergent folding-mediated epistasis among unstable membrane protein variants. eLife 2024; 12:RP92406. [PMID: 39078397 PMCID: PMC11288631 DOI: 10.7554/elife.92406] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/31/2024] Open
Abstract
Many membrane proteins are prone to misfolding, which compromises their functional expression at the plasma membrane. This is particularly true for the mammalian gonadotropin-releasing hormone receptor GPCRs (GnRHR). We recently demonstrated that evolutionary GnRHR modifications appear to have coincided with adaptive changes in cotranslational folding efficiency. Though protein stability is known to shape evolution, it is unclear how cotranslational folding constraints modulate the synergistic, epistatic interactions between mutations. We therefore compared the pairwise interactions formed by mutations that disrupt the membrane topology (V276T) or tertiary structure (W107A) of GnRHR. Using deep mutational scanning, we evaluated how the plasma membrane expression of these variants is modified by hundreds of secondary mutations. An analysis of 251 mutants in three genetic backgrounds reveals that V276T and W107A form distinct epistatic interactions that depend on both the severity and the mechanism of destabilization. V276T forms predominantly negative epistatic interactions with destabilizing mutations in soluble loops. In contrast, W107A forms positive interactions with mutations in both loops and transmembrane domains that reflect the diminishing impacts of the destabilizing mutations in variants that are already unstable. These findings reveal how epistasis is remodeled by conformational defects in membrane proteins and in unstable proteins more generally.
Affiliation(s)
- Laura M Chamness
- Department of Chemistry, Indiana University, Bloomington, United States
| | - Charles P Kuntz
- The James Tarpo Jr. and Margaret Tarpo Department of Chemistry, Purdue University, West Lafayette, United States
| | - Andrew G McKee
- Department of Chemistry, Indiana University, Bloomington, United States
| | - Wesley D Penn
- Department of Chemistry, Indiana University, Bloomington, United States
| | | | - Douglas B Rusch
- Center for Genomics and Bioinformatics, Indiana University, Bloomington, United States
| | - Hope Woods
- Department of Chemistry, Vanderbilt University, Nashville, United States
- Chemical and Physical Biology Program, Vanderbilt University, Nashville, United States
| | - Dyotima
- Department of Chemistry, Indiana University, Bloomington, United States
| | - Jens Meiler
- Department of Chemistry, Vanderbilt University, Nashville, United States
- Institute for Drug Discovery, Leipzig University, Leipzig, Germany
| | - Jonathan P Schlebach
- The James Tarpo Jr. and Margaret Tarpo Department of Chemistry, Purdue University, West Lafayette, United States
| |
|
23
|
Taylor AL, Starr TN. Deep mutational scanning of SARS-CoV-2 Omicron BA.2.86 and epistatic emergence of the KP.3 variant. bioRxiv 2024:2024.07.23.604853. [PMID: 39091888 PMCID: PMC11291116 DOI: 10.1101/2024.07.23.604853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/04/2024]
Abstract
Deep mutational scanning experiments aid in the surveillance and forecasting of viral evolution by providing prospective measurements of mutational effects on viral traits, but epistatic shifts in the impacts of mutations can hinder viral forecasting when measurements were made in outdated strain backgrounds. Here, we report measurements of the impact of all single amino acid mutations on ACE2-binding affinity and protein folding and expression in the SARS-CoV-2 Omicron BA.2.86 spike receptor-binding domain (RBD). As with other SARS-CoV-2 variants, we find a plastic and evolvable basis for receptor binding, with many mutations at the ACE2 interface maintaining or even improving ACE2-binding affinity. Despite its large genetic divergence, mutational effects in BA.2.86 have not diverged greatly from those measured in its Omicron BA.2 ancestor. However, we do identify strong positive epistasis among subsequent mutations that have accrued in BA.2.86 descendants. Specifically, the Q493E mutation that decreased ACE2-binding affinity in all previous SARS-CoV-2 backgrounds is reversed in sign to enhance human ACE2-binding affinity when coupled with L455S and F456L in the currently emerging KP.3 variant. Our results point to a modest degree of epistatic drift in mutational effects during recent SARS-CoV-2 evolution but highlight how these small epistatic shifts can have important consequences for the emergence of new SARS-CoV-2 variants.
Affiliation(s)
- Ashley L. Taylor
- Department of Biochemistry, University of Utah School of Medicine, Salt Lake City, UT 84112, USA
| | - Tyler N. Starr
- Department of Biochemistry, University of Utah School of Medicine, Salt Lake City, UT 84112, USA
| |
|
24
|
Crandall JG, Zhou X, Rokas A, Hittinger CT. Specialization restricts the evolutionary paths available to yeast sugar transporters. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.07.22.604696. [PMID: 39091816 PMCID: PMC11291069 DOI: 10.1101/2024.07.22.604696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/04/2024]
Abstract
Functional innovation at the protein level is a key source of evolutionary novelties. The constraints on functional innovations are likely to be highly specific in different proteins, which are shaped by their unique histories and the extent of global epistasis that arises from their structures and biochemistries. These contextual nuances in the sequence-function relationship have implications both for a basic understanding of the evolutionary process and for engineering proteins with desirable properties. Here, we have investigated the molecular basis of novel function in a model member of an ancient, conserved, and biotechnologically relevant protein family. These Major Facilitator Superfamily sugar porters are a functionally diverse group of proteins that are thought to be highly plastic and evolvable. By dissecting a recent evolutionary innovation in an α-glucoside transporter from the yeast Saccharomyces eubayanus, we show that the ability to transport a novel substrate requires high-order interactions between many protein regions and numerous specific residues proximal to the transport channel. To reconcile the functional diversity of this family with the constrained evolution of this model protein, we generated new, state-of-the-art genome annotations for 332 Saccharomycotina yeast species spanning approximately 400 million years of evolution. By integrating phylogenetic and phenotypic analyses across these species, we show that the model yeast α-glucoside transporters likely evolved from a multifunctional ancestor and became subfunctionalized. The accumulation of additive and epistatic substitutions likely entrenched this subfunction, which made the simultaneous acquisition of multiple interacting substitutions the only reasonably accessible path to novelty.
Affiliation(s)
- Johnathan G. Crandall
- Laboratory of Genetics, J. F. Crow Institute for the Study of Evolution, Center for Genomic Science Innovation, DOE Great Lakes Bioenergy Research Center, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI 53726, USA
| | - Xiaofan Zhou
- Guangdong Province Key Laboratory of Microbial Signals and Disease Control, Integrative Microbiology Research Center, South China Agricultural University, Guangzhou 510642, China
- Department of Biological Sciences and Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
| | - Antonis Rokas
- Department of Biological Sciences and Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
| | - Chris Todd Hittinger
- Laboratory of Genetics, J. F. Crow Institute for the Study of Evolution, Center for Genomic Science Innovation, DOE Great Lakes Bioenergy Research Center, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI 53726, USA
| |
|
25
|
Chitra U, Arnold BJ, Raphael BJ. Quantifying higher-order epistasis: beware the chimera. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.07.17.603976. [PMID: 39071303 PMCID: PMC11275791 DOI: 10.1101/2024.07.17.603976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/30/2024]
Abstract
Epistasis, or interactions in which alleles at one locus modify the fitness effects of alleles at other loci, plays a fundamental role in genetics, protein evolution, and many other areas of biology. Epistasis is typically quantified by computing the deviation from the expected fitness under an additive or multiplicative model using one of several formulae. However, these formulae are not all equivalent. Importantly, one widely used formula - which we call the chimeric formula - measures deviations from a multiplicative fitness model on an additive scale, thus mixing two measurement scales. We show that for pairwise interactions, the chimeric formula yields a different magnitude, but the same sign (synergistic vs. antagonistic) of epistasis compared to the multiplicative formula that measures both fitness and deviations on a multiplicative scale. However, for higher-order interactions, we show that the chimeric formula can have both different magnitude and sign compared to the multiplicative formula - thus confusing negative epistatic interactions with positive interactions, and vice versa. We resolve these inconsistencies by deriving fundamental connections between the different epistasis formulae and the parameters of the multivariate Bernoulli distribution. Our results demonstrate that the additive and multiplicative epistasis formulae are more mathematically sound than the chimeric formula. Moreover, we demonstrate that the mathematical issues with the chimeric epistasis formula lead to markedly different biological interpretations of real data. Analyzing multi-gene knockout data in yeast, multi-way drug interactions in E. coli, and deep mutational scanning (DMS) of several proteins, we find that 10-60% of higher-order interactions have a change in sign with the multiplicative or additive epistasis formula. These sign changes result in qualitatively different findings on functional divergence in the yeast genome, synergistic vs. antagonistic drug interactions, and epistasis between protein mutations. In particular, in the yeast data, the more appropriate multiplicative formula identifies nearly 500 additional negative three-way interactions, thus extending the trigenic interaction network by 25%.
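As a rough illustration of the distinction drawn in the abstract above, the pairwise forms of the three formulae can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' code: the function names and the example fitness values are hypothetical, all fitness values are assumed to be positive, and the expressions are the commonly used pairwise definitions (with fitness normalized to the wild type for the chimeric form), which may differ in notation from the preprint.

```python
import math


def additive_epistasis(f00, f10, f01, f11):
    """Deviation from an additive model, measured on an additive scale."""
    return f11 - f10 - f01 + f00


def multiplicative_epistasis(f00, f10, f01, f11):
    """Deviation from a multiplicative model, measured on a log scale."""
    return math.log((f11 * f00) / (f10 * f01))


def chimeric_epistasis(f00, f10, f01, f11):
    """Deviation from a multiplicative model, measured on an additive scale.

    Fitness values are first normalized to the wild type (f00), which is an
    assumption of this sketch.
    """
    w10, w01, w11 = f10 / f00, f01 / f00, f11 / f00
    return w11 - w10 * w01


# Hypothetical fitness values for wild type, two single mutants, and the double mutant.
fitness = dict(f00=1.0, f10=0.8, f01=0.5, f11=0.3)

add = additive_epistasis(**fitness)
mult = multiplicative_epistasis(**fitness)
chim = chimeric_epistasis(**fitness)

print(f"additive: {add:+.3f}  multiplicative: {mult:+.3f}  chimeric: {chim:+.3f}")
```

With positive fitness values, the chimeric and multiplicative results share a sign in the pairwise case, consistent with the abstract. The sign disagreements the authors report arise only for higher-order interactions, which this sketch does not cover.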
|
26
|
Norn C, Oliveira F, André I. Improved prediction of site-rates from structure with averaging across homologs. Protein Sci 2024; 33:e5086. [PMID: 38923241 PMCID: PMC11196898 DOI: 10.1002/pro.5086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Revised: 05/12/2024] [Accepted: 06/04/2024] [Indexed: 06/28/2024]
Abstract
Variation in mutation rates at sites in proteins can largely be understood by the constraint that proteins must fold into stable structures. Models that calculate site-specific rates based on protein structure and a thermodynamic stability model have shown a significant but modest ability to predict empirical site-specific rates calculated from sequence. Models that use detailed atomistic models of protein energetics do not outperform simpler approaches using packing density. We demonstrate that a fundamental reason for this is that empirical site-specific rates are the result of the average effect of many different microenvironments in a phylogeny. By analyzing the results of evolutionary dynamics simulations, we show how averaging site-specific rates across many extant protein structures can lead to correct recovery of site-rate prediction. This result is also demonstrated in natural protein sequences and experimental structures. Using predicted structures, we demonstrate that atomistic models can improve upon contact density metrics in predicting site-specific rates from a structure. The results give fundamental insights into the factors governing the distribution of site-specific rates in protein families.
Affiliation(s)
- Christoffer Norn
- Department of Biochemistry and Structural Biology, Lund University, Lund, Sweden
- Bioinnovation Institute Foundation, København, Denmark
| | - Fábio Oliveira
- Department of Biochemistry and Structural Biology, Lund University, Lund, Sweden
| | - Ingemar André
- Department of Biochemistry and Structural Biology, Lund University, Lund, Sweden
| |
|
27
|
Nguyen A, Zhao H, Myagmarsuren D, Srinivasan S, Wu D, Chen J, Piszczek G, Schuck P. Modulation of biophysical properties of nucleocapsid protein in the mutant spectrum of SARS-CoV-2. eLife 2024; 13:RP94836. [PMID: 38941236 PMCID: PMC11213569 DOI: 10.7554/elife.94836] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/30/2024] Open
Abstract
Genetic diversity is a hallmark of RNA viruses and the basis for their evolutionary success. Taking advantage of the uniquely large genomic database of SARS-CoV-2, we examine the impact of mutations across the spectrum of viable amino acid sequences on the biophysical phenotypes of the highly expressed and multifunctional nucleocapsid protein. We find variation in the physicochemical parameters of its extended intrinsically disordered regions (IDRs) sufficient to allow local plasticity, but also observe functional constraints that similarly occur in related coronaviruses. In biophysical experiments with several N-protein species carrying mutations associated with major variants, we find that point mutations in the IDRs can have nonlocal impact and modulate thermodynamic stability, secondary structure, protein oligomeric state, particle formation, and liquid-liquid phase separation. In the Omicron variant, distant mutations in different IDRs have compensatory effects in shifting a delicate balance of interactions controlling protein assembly properties, and include the creation of a new protein-protein interaction interface in the N-terminal IDR through the defining P13L mutation. A picture emerges where genetic diversity is accompanied by significant variation in biophysical characteristics of functional N-protein species, in particular in the IDRs.
Affiliation(s)
- Ai Nguyen
- Laboratory of Dynamics of Macromolecular Assembly, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Bethesda, United States
| | - Huaying Zhao
- Laboratory of Dynamics of Macromolecular Assembly, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Bethesda, United States
| | - Dulguun Myagmarsuren
- Laboratory of Dynamics of Macromolecular Assembly, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Bethesda, United States
| | - Sanjana Srinivasan
- Laboratory of Dynamics of Macromolecular Assembly, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Bethesda, United States
| | - Di Wu
- Biophysics Core Facility, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, United States
| | - Jiji Chen
- Advanced Imaging and Microscopy Resource, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Bethesda, United States
| | - Grzegorz Piszczek
- Biophysics Core Facility, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, United States
| | - Peter Schuck
- Laboratory of Dynamics of Macromolecular Assembly, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Bethesda, United States
| |
|
28
|
Chen L, Yu K, Ma A, Zhu W, Wang H, Tang X, Tang Y, Li Y, Li J. Enhanced Thermostability of Nattokinase by Computation-Based Rational Redesign of Flexible Regions. JOURNAL OF AGRICULTURAL AND FOOD CHEMISTRY 2024; 72:14241-14254. [PMID: 38864682 DOI: 10.1021/acs.jafc.4c02335] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2024]
Abstract
Nattokinase is a component of the health food natto that can help prevent and treat thrombosis. However, its low thermostability and fibrinolytic activity limit its application in food and pharmaceuticals. In this study, we used bioinformatics analysis to identify two loops (loop10 and loop12) in the flexible region of nattokinase rAprY. On this basis, we used a thermostability prediction algorithm to screen for the G131S-S161T variant, which showed a 2.38-fold increase in half-life at 55 °C, and the M3 variant, which showed a 2.01-fold increase in activity. Bioinformatics analysis revealed that the enhanced thermostability of the G131S-S161T variant was due to the increased rigidity and structural shrinkage of the overall structure. Additionally, the increased rigidity of the local region surrounding the active center and its mutated sites helps maintain its normal conformation in high-temperature environments. The increased catalytic activity of the M3 variant may be due to its more efficient substrate binding mechanism. We investigated strategies to improve the thermostability and fibrinolytic activity of nattokinase, and the resulting variants show promise for industrial production and application.
Affiliation(s)
- Liangqi Chen
- Institute of Materia Medica, College of Pharmacy, Xinjiang University, Urumqi 830017, China
- Xinjiang Key Laboratory of Biological Resources and Genetic Engineering, College of Life Science and Technology, Xinjiang University, Urumqi 830017, China
| | - Kongfang Yu
- Institute of Materia Medica, College of Pharmacy, Xinjiang University, Urumqi 830017, China
| | - Aixia Ma
- Institute of Materia Medica, College of Pharmacy, Xinjiang University, Urumqi 830017, China
- Xinjiang Key Laboratory of Biological Resources and Genetic Engineering, College of Life Science and Technology, Xinjiang University, Urumqi 830017, China
| | - Wenhui Zhu
- Institute of Materia Medica, College of Pharmacy, Xinjiang University, Urumqi 830017, China
| | - Hong Wang
- Institute of Materia Medica, College of Pharmacy, Xinjiang University, Urumqi 830017, China
| | - Xiyu Tang
- Institute of Materia Medica, College of Pharmacy, Xinjiang University, Urumqi 830017, China
- Xinjiang Key Laboratory of Biological Resources and Genetic Engineering, College of Life Science and Technology, Xinjiang University, Urumqi 830017, China
| | - Yaolei Tang
- Xinjiang Key Laboratory of Biological Resources and Genetic Engineering, College of Life Science and Technology, Xinjiang University, Urumqi 830017, China
- The Third People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi 830000, China
| | - Yuan Li
- Institute of Materia Medica, College of Pharmacy, Xinjiang University, Urumqi 830017, China
- Xinjiang Key Laboratory of Biological Resources and Genetic Engineering, College of Life Science and Technology, Xinjiang University, Urumqi 830017, China
| | - Jinyao Li
- Institute of Materia Medica, College of Pharmacy, Xinjiang University, Urumqi 830017, China
- Xinjiang Key Laboratory of Biological Resources and Genetic Engineering, College of Life Science and Technology, Xinjiang University, Urumqi 830017, China
| |
|
29
|
Burata OE, O'Donnell E, Hyun J, Lucero RM, Thomas JE, Gibbs EM, Reacher I, Carney NA, Stockbridge RB. Peripheral positions encode transport specificity in the small multidrug resistance exporters. Proc Natl Acad Sci U S A 2024; 121:e2403273121. [PMID: 38865266 PMCID: PMC11194549 DOI: 10.1073/pnas.2403273121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Accepted: 05/02/2024] [Indexed: 06/14/2024] Open
Abstract
In secondary active transporters, a relatively limited set of protein folds have evolved diverse solute transport functions. Because of the conformational changes inherent to transport, altering substrate specificity typically involves remodeling the entire structural landscape, limiting our understanding of how novel substrate specificities evolve. In the current work, we examine a structurally minimalist family of model transport proteins, the small multidrug resistance (SMR) transporters, to understand the molecular basis for the emergence of a novel substrate specificity. We engineer a selective SMR protein to promiscuously export quaternary ammonium antiseptics, similar to the activity of a clade of multidrug exporters in this family. Using combinatorial mutagenesis and deep sequencing, we identify the necessary and sufficient molecular determinants of this engineered activity. Using X-ray crystallography, solid-supported membrane electrophysiology, binding assays, and a proteoliposome-based quaternary ammonium antiseptic transport assay that we developed, we dissect the mechanistic contributions of these residues to substrate polyspecificity. We find that substrate preference changes not through modification of the residues that directly interact with the substrate but through mutations peripheral to the binding pocket. Our work provides molecular insight into substrate promiscuity among the SMRs and can be applied to understand multidrug export and the evolution of novel transport functions more generally.
Affiliation(s)
- Olive E Burata
- Program in Chemical Biology, University of Michigan, Ann Arbor, MI 48109
| | - Ever O'Donnell
- Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, MI 48109
| | - Jeonghoon Hyun
- Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, MI 48109
| | - Rachael M Lucero
- Program in Chemical Biology, University of Michigan, Ann Arbor, MI 48109
| | - Junius E Thomas
- Program in Chemical Biology, University of Michigan, Ann Arbor, MI 48109
| | - Ethan M Gibbs
- Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, MI 48109
| | - Isabella Reacher
- Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, MI 48109
| | - Nolan A Carney
- Program in Chemical Biology, University of Michigan, Ann Arbor, MI 48109
| | - Randy B Stockbridge
- Program in Chemical Biology, University of Michigan, Ann Arbor, MI 48109
- Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, MI 48109
| |
|
30
|
Xu J, Li T, Huang WE, Zhou NY. Semi-rational design of nitroarene dioxygenase for catalytic ability toward 2,4-dichloronitrobenzene. Appl Environ Microbiol 2024; 90:e0143623. [PMID: 38709097 PMCID: PMC11218619 DOI: 10.1128/aem.01436-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Accepted: 04/05/2024] [Indexed: 05/07/2024] Open
Abstract
Rieske non-heme dioxygenase family enzymes play an important role in the aerobic biodegradation of nitroaromatic pollutants, but no active dioxygenases are available in nature for initial reactions in the degradation of many refractory pollutants like 2,4-dichloronitrobenzene (24DCNB). Here, we report the engineering of hotspots in 2,3-dichloronitrobenzene dioxygenase from Diaphorobacter sp. strain JS3051, achieved through molecular dynamics simulation analysis and site-directed mutagenesis, with the aim of enhancing its catalytic activity toward 24DCNB. The computationally predicted activity scores were largely consistent with the activities detected in wet-lab experiments. Among them, the two most beneficial mutations (E204M and M248I) were obtained, and the combined mutant reached up to a 62-fold increase in activity toward 24DCNB, generating a single product, 3,5-dichlorocatechol, which is a naturally occurring compound. In silico analysis confirmed that residue 204 affected the substrate preference for meta-substituted nitroarenes, while residue 248 may influence substrate preference by interaction with residue 295. Overall, this study provides a framework for manipulating nitroarene dioxygenases using computational methods to address various nitroarene contamination problems.
IMPORTANCE
As a result of human activities, various nitroaromatic pollutants with poor degradability continue to enter the biosphere, and dioxygenation is an important initial step to remove toxic nitro-groups and convert them into degradable products. The biodegradation of many nitroarenes has been reported over the decades; however, many others still lack corresponding enzymes to initiate their degradation. Although Rieske non-heme dioxygenase family enzymes play extraordinarily important roles in the aerobic biodegradation of various nitroaromatic pollutants, prediction of their substrate specificity is difficult. This work greatly improved the catalytic activity of dioxygenase against 2,4-dichloronitrobenzene by computer-aided semi-rational design, paving a new way for the evolution strategy of nitroarene dioxygenase. This study highlights the potential for using enzyme structure-function information with computational pre-screening methods to rapidly tailor the catalytic functions of enzymes toward poorly biodegradable contaminants.
Affiliation(s)
- Jia Xu
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Tao Li
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Wei E. Huang
- Department of Engineering Science, University of Oxford, Oxford, United Kingdom
| | - Ning-Yi Zhou
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| |
|
31
|
Alpay BA, Desai MM. Effects of selection stringency on the outcomes of directed evolution. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.06.09.598029. [PMID: 38895455 PMCID: PMC11185767 DOI: 10.1101/2024.06.09.598029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]
Abstract
Directed evolution makes mutant lineages compete in climbing complicated sequence-function landscapes. Given this underlying complexity it is unclear how selection stringency, a ubiquitous parameter of directed evolution, impacts the outcome. Here we approach this question in terms of the fitnesses of the candidate variants at each round and the heterogeneity of their distributions of fitness effects. We show that even if the fittest mutant is most likely to yield the fittest mutants in the next round of selection, diversification can improve outcomes by sampling a larger variety of fitness effects. We find that heterogeneity in fitness effects between variants, larger population sizes, and evolution over a greater number of rounds all encourage diversification.
Affiliation(s)
- Berk A. Alpay
- Systems, Synthetic, and Quantitative Biology Program, Harvard University, Cambridge, MA, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
| | - Michael M. Desai
- Systems, Synthetic, and Quantitative Biology Program, Harvard University, Cambridge, MA, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Department of Physics, Harvard University, Cambridge, MA, USA
| |
|
32
|
Joseph J. Increased Positive Selection in Highly Recombining Genes Does not Necessarily Reflect an Evolutionary Advantage of Recombination. Mol Biol Evol 2024; 41:msae107. [PMID: 38829800 PMCID: PMC11173204 DOI: 10.1093/molbev/msae107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Revised: 04/08/2024] [Accepted: 05/28/2024] [Indexed: 06/05/2024] Open
Abstract
It is commonly thought that the long-term advantage of meiotic recombination is to dissipate genetic linkage, allowing natural selection to act independently on different loci. It is thus theoretically expected that genes with higher recombination rates evolve under more effective selection. On the other hand, recombination is often associated with GC-biased gene conversion (gBGC), which theoretically interferes with selection by promoting the fixation of deleterious GC alleles. To test these predictions, several studies assessed whether selection was more effective in highly recombining genes (due to dissipation of genetic linkage) or less effective (due to gBGC), assuming a fixed distribution of fitness effects (DFE) for all genes. In this study, I directly derive the DFE from a gene's evolutionary history (shaped by mutation, selection, drift, and gBGC) under empirical fitness landscapes. I show that genes that have experienced high levels of gBGC are less fit and thus have more opportunities for beneficial mutations. Only a small decrease in the genome-wide intensity of gBGC leads to the fixation of these beneficial mutations, particularly in highly recombining genes. This results in increased positive selection in highly recombining genes that is not caused by more effective selection. Additionally, I show that the death of a recombination hotspot can lead to a higher dN/dS than its birth, but with substitution patterns biased towards AT, and only at selected positions. This shows that controlling for a substitution bias towards GC is therefore not sufficient to rule out the contribution of gBGC to signatures of accelerated evolution. Finally, although gBGC does not affect the fixation probability of GC-conservative mutations, I show that by altering the DFE, gBGC can also significantly affect nonsynonymous GC-conservative substitution patterns.
Affiliation(s)
- Julien Joseph
- Laboratoire de Biométrie et Biologie Evolutive, Université Lyon 1, CNRS, UMR 5558, Villeurbanne, France
| |
|
33
|
Xue S, Han Y, Wu F, Wang Q. Mutations in the SARS-CoV-2 spike receptor binding domain and their delicate balance between ACE2 affinity and antibody evasion. Protein Cell 2024; 15:403-418. [PMID: 38442025 PMCID: PMC11131022 DOI: 10.1093/procel/pwae007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Accepted: 02/05/2024] [Indexed: 03/07/2024] Open
Abstract
Intensive selection pressure constrains the evolutionary trajectory of SARS-CoV-2 genomes and results in various novel variants with distinct mutation profiles. Point mutations, particularly those within the receptor binding domain (RBD) of SARS-CoV-2 spike (S) protein, lead to the functional alteration in both receptor engagement and monoclonal antibody (mAb) recognition. Here, we review the data of the RBD point mutations possessed by major SARS-CoV-2 variants and discuss their individual effects on ACE2 affinity and immune evasion. Many single amino acid substitutions within RBD epitopes crucial for the antibody evasion capacity may conversely weaken ACE2 binding affinity. However, this weakened effect could be largely compensated by specific epistatic mutations, such as N501Y, thus maintaining the overall ACE2 affinity for the spike protein of all major variants. The predominant direction of SARS-CoV-2 evolution lies neither in promoting ACE2 affinity nor evading mAb neutralization but in maintaining a delicate balance between these two dimensions. Together, this review interprets how RBD mutations efficiently resist antibody neutralization and meanwhile how the affinity between ACE2 and spike protein is maintained, emphasizing the significance of comprehensive assessment of spike mutations.
Affiliation(s)
- Song Xue
- Key Laboratory of Medical Molecular Virology (MOE/NHC/CAMS), Shanghai Institute of Infectious Disease and Biosecurity, Shanghai Frontiers Science Center of Pathogenic Microorganisms and Infection, School of Basic Medical Sciences, Shanghai Medical College, Fudan University, Shanghai 200032, China
| | - Yuru Han
- Key Laboratory of Medical Molecular Virology (MOE/NHC/CAMS), Shanghai Institute of Infectious Disease and Biosecurity, Shanghai Frontiers Science Center of Pathogenic Microorganisms and Infection, School of Basic Medical Sciences, Shanghai Medical College, Fudan University, Shanghai 200032, China
| | - Fan Wu
- Key Laboratory of Medical Molecular Virology (MOE/NHC/CAMS), Shanghai Institute of Infectious Disease and Biosecurity, Shanghai Frontiers Science Center of Pathogenic Microorganisms and Infection, School of Basic Medical Sciences, Shanghai Medical College, Fudan University, Shanghai 200032, China
| | - Qiao Wang
- Key Laboratory of Medical Molecular Virology (MOE/NHC/CAMS), Shanghai Institute of Infectious Disease and Biosecurity, Shanghai Frontiers Science Center of Pathogenic Microorganisms and Infection, School of Basic Medical Sciences, Shanghai Medical College, Fudan University, Shanghai 200032, China
| |
|
34
|
Gonzales J, Kim I, Hwang W, Cho JH. Evolutionary rewiring of the dynamic network underpinning allosteric epistasis in NS1 of influenza A virus. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.24.595776. [PMID: 38826371 PMCID: PMC11142230 DOI: 10.1101/2024.05.24.595776] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2024]
Abstract
Viral proteins frequently mutate to evade or antagonize host innate immune responses, yet the impact of these mutations on the molecular energy landscape remains unclear. Epistasis, the intramolecular communication between mutations, often renders the combined mutational effects unpredictable. Nonstructural protein 1 (NS1) is a major virulence factor of the influenza A virus (IAV) that activates host PI3K by binding to its p85β subunit. Here, we present a deep analysis of the impact of evolutionary mutations in NS1 that emerged between the 1918 pandemic IAV strain and its descendant PR8 strain. Our analysis reveals how the mutations rewired the inter-residue communications that underlie long-range allosteric and epistatic networks in NS1. Our findings show that PR8 NS1 binds to p85β with approximately 10-fold greater affinity than 1918 NS1 due to allosteric mutational effects. Notably, these mutations also exhibited long-range epistatic effects. NMR chemical shift perturbation and methyl-axis order parameter analyses revealed that the mutations induced long-range structural and dynamic changes in PR8 NS1, enhancing its affinity to p85β. Complementary MD simulations and graph-based network analysis uncover how these mutations rewire dynamic residue interaction networks, which underlie the long-range epistasis and allosteric effects on p85β-binding affinity. Significantly, we find that conformational dynamics of residues with high betweenness centrality play a crucial role in communications between network communities and are highly conserved across influenza A virus evolution. These findings advance our mechanistic understanding of the allosteric and epistatic communications between distant residues and provide insight into their role in the molecular evolution of NS1.
35
Metzger BPH, Park Y, Starr TN, Thornton JW. Epistasis facilitates functional evolution in an ancient transcription factor. eLife 2024; 12:RP88737. [PMID: 38767330 PMCID: PMC11105156 DOI: 10.7554/elife.88737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024] Open
Abstract
A protein's genetic architecture - the set of causal rules by which its sequence produces its functions - also determines its possible evolutionary trajectories. Prior research has proposed that the genetic architecture of proteins is very complex, with pervasive epistatic interactions that constrain evolution and make function difficult to predict from sequence. Most of this work has analyzed only the direct paths between two proteins of interest - excluding the vast majority of possible genotypes and evolutionary trajectories - and has considered only a single protein function, leaving unaddressed the genetic architecture of functional specificity and its impact on the evolution of new functions. Here, we develop a new method based on ordinal logistic regression to directly characterize the global genetic determinants of multiple protein functions from 20-state combinatorial deep mutational scanning (DMS) experiments. We use it to dissect the genetic architecture and evolution of a transcription factor's specificity for DNA, using data from a combinatorial DMS of an ancient steroid hormone receptor's capacity to activate transcription from two biologically relevant DNA elements. We show that the genetic architecture of DNA recognition consists of a dense set of main and pairwise effects that involve virtually every possible amino acid state in the protein-DNA interface, but higher-order epistasis plays only a tiny role. Pairwise interactions enlarge the set of functional sequences and are the primary determinants of specificity for different DNA elements. They also massively expand the number of opportunities for single-residue mutations to switch specificity from one DNA target to another. By bringing variants with different functions close together in sequence space, pairwise epistasis therefore facilitates rather than constrains the evolution of new functions.
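To make the term concrete, pairwise epistasis is commonly quantified as the deviation of a double mutant from the additive expectation built from the two single mutants. The sketch below illustrates only that definition; it is not the ordinal logistic regression method described in the abstract, and the function name and phenotype values are hypothetical.

```python
def pairwise_epistasis(f_wt, f_a, f_b, f_ab):
    """Return the deviation of the double mutant from additivity.

    All phenotypes are assumed to be on a scale where effects add
    in the absence of interaction (for example, a log scale).
    """
    # Additive expectation: wild type plus the two single-mutant effects.
    additive_expectation = f_wt + (f_a - f_wt) + (f_b - f_wt)
    return f_ab - additive_expectation


# Hypothetical values: each single mutant lowers function,
# but the double mutant is better than the wild type.
epsilon = pairwise_epistasis(f_wt=0.0, f_a=-1.0, f_b=-0.5, f_ab=0.5)
print(epsilon)
```

A value of zero means the two substitutions act independently; a non-zero value means the effect of one depends on the presence of the other.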
Affiliation(s)
- Brian PH Metzger
- Department of Ecology and Evolution, University of Chicago, Chicago, United States
- Yeonwoo Park
- Program in Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, United States
- Tyler N Starr
- Department of Biochemistry and Molecular Biophysics, University of Chicago, Chicago, United States
- Joseph W Thornton
- Department of Ecology and Evolution, University of Chicago, Chicago, United States
- Department of Human Genetics, University of Chicago, Chicago, United States
36
Ribeiro TDS, Lollar MJ, Sprengelmeyer QD, Huang Y, Benson DM, Orr MS, Johnson ZC, Corbett-Detig RB, Pool JE. Recombinant inbred line panels inform the genetic architecture and interactions of adaptive traits in Drosophila melanogaster. bioRxiv 2024:2024.05.14.594228. [PMID: 38798433 PMCID: PMC11118405 DOI: 10.1101/2024.05.14.594228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
The distribution of allelic effects on traits, along with their gene-by-gene and gene-by-environment interactions, contributes to the phenotypes available for selection and the trajectories of adaptive variants. Nonetheless, uncertainty persists regarding the effect sizes underlying adaptations and the importance of genetic interactions. Herein, we aimed to investigate the genetic architecture and the epistatic and environmental interactions involving loci that contribute to multiple adaptive traits using two new panels of Drosophila melanogaster recombinant inbred lines (RILs). To better fit our data, we re-implemented functions from R/qtl (Broman et al. 2003) using additive genetic models. We found 14 quantitative trait loci (QTL) underlying melanism, wing size, song pattern, and ethanol resistance. By combining our mapping results with population genetic statistics, we identified potential new genes related to these traits. None of the detected QTLs showed clear evidence of epistasis, and our power analysis indicated that we should have seen at least one significant interaction if sign epistasis or strong positive epistasis played a pervasive role in trait evolution. In contrast, we did find roles for gene-by-environment interactions involving pigmentation traits. Overall, our data suggest that the genetic architecture of adaptive traits often involves alleles of detectable effect, that strong epistasis does not always play a role in adaptation, and that environmental interactions can modulate the effect size of adaptive alleles.
Affiliation(s)
- Tiago da Silva Ribeiro
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Department of Integrative Biology, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Matthew J. Lollar
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Yuheng Huang
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Derek M. Benson
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Megan S. Orr
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Zachary C. Johnson
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Russell B. Corbett-Detig
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- John E. Pool
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Department of Integrative Biology, University of Wisconsin-Madison, Madison, WI, 53706, USA
37
Reddy KD, Rasool B, Akher FB, Kutlešić N, Pant S, Boudker O. Evolutionary analysis reveals the origin of sodium coupling in glutamate transporters. bioRxiv 2024:2023.12.03.569786. [PMID: 38106174 PMCID: PMC10723334 DOI: 10.1101/2023.12.03.569786] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 12/19/2023]
Abstract
Secondary active membrane transporters harness the energy of ion gradients to concentrate their substrates. Homologous transporters evolved to couple transport to different ions in response to changing environments and needs. The bases of such diversification, and thus principles of ion coupling, are unexplored. Employing phylogenetics and ancestral protein reconstruction, we investigated sodium-coupled transport in prokaryotic glutamate transporters, a mechanism ubiquitous across life domains and critical to neurotransmitter recycling in humans. We found that the evolutionary transition from sodium-dependent to sodium-independent substrate binding to the transporter preceded changes in the coupling mechanism. Structural and functional experiments suggest that the transition entailed allosteric mutations, making sodium binding dispensable without affecting ion-binding sites. Allosteric tuning of transporters' energy landscapes might be a widespread route of their functional diversification.
Affiliation(s)
- Krishna D. Reddy
- Dept. of Physiology & Biophysics, Weill Cornell Medical College, 1300 York Ave, New York, NY 10021, USA
- Burha Rasool
- Dept. of Physiology & Biophysics, Weill Cornell Medical College, 1300 York Ave, New York, NY 10021, USA
- Farideh Badichi Akher
- Dept. of Physiology & Biophysics, Weill Cornell Medical College, 1300 York Ave, New York, NY 10021, USA
- Nemanja Kutlešić
- Dept. of Physiology & Biophysics, Weill Cornell Medical College, 1300 York Ave, New York, NY 10021, USA
- Swati Pant
- Dept. of Biochemistry, Weill Cornell Medical College, 1300 York Ave, New York, NY 10021, USA
- Olga Boudker
- Dept. of Physiology & Biophysics, Weill Cornell Medical College, 1300 York Ave, New York, NY 10021, USA
- Howard Hughes Medical Institute, Weill Cornell Medical College, 1300 York Ave, New York, NY 10021, USA
38
Meger AT, Spence MA, Sandhu M, Matthews D, Chen J, Jackson CJ, Raman S. Rugged fitness landscapes minimize promiscuity in the evolution of transcriptional repressors. Cell Syst 2024; 15:374-387.e6. [PMID: 38537640 PMCID: PMC11299162 DOI: 10.1016/j.cels.2024.03.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Revised: 09/08/2023] [Accepted: 03/05/2024] [Indexed: 04/20/2024]
Abstract
How a protein's function influences the shape of its fitness landscape, smooth or rugged, is a fundamental question in evolutionary biochemistry. Smooth landscapes arise when incremental mutational steps lead to a progressive change in function, as commonly seen in enzymes and binding proteins. On the other hand, rugged landscapes are poorly understood because of the inherent unpredictability of how sequence changes affect function. Here, we experimentally characterize the entire sequence phylogeny, comprising 1,158 extant and ancestral sequences, of the DNA-binding domain (DBD) of the LacI/GalR transcriptional repressor family. Our analysis revealed an extremely rugged landscape with rapid switching of specificity, even between adjacent nodes. Further, the ruggedness arises from the need for the repressor to simultaneously evolve specificity for asymmetric operators, and it disfavors potentially adverse regulatory crosstalk. Our study provides fundamental insight into evolutionary, molecular, and biophysical rules of genetic regulation through the lens of fitness landscapes.
Affiliation(s)
- Anthony T Meger
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Matthew A Spence
- Research School of Chemistry, Australian National University, Canberra, ACT 2601, Australia
- Mahakaran Sandhu
- Research School of Chemistry, Australian National University, Canberra, ACT 2601, Australia
- Dana Matthews
- Research School of Biology, Australian National University, Canberra, ACT 2601, Australia
- Jackie Chen
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Colin J Jackson
- Research School of Chemistry, Australian National University, Canberra, ACT 2601, Australia; ARC Centre of Excellence for Innovations in Peptide & Protein Science, Research School of Chemistry, Australian National University, Canberra, ACT 2601, Australia; ARC Centre of Excellence for Innovations in Synthetic Biology, Research School of Chemistry, Australian National University, Canberra, ACT 2601, Australia
- Srivatsan Raman
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA; Department of Bacteriology, University of Wisconsin-Madison, Madison, WI 53706, USA; Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, WI 53706, USA
39
Raisinghani N, Alshahrani M, Gupta G, Verkhivker G. Ensemble-Based Mutational Profiling and Network Analysis of the SARS-CoV-2 Spike Omicron XBB Lineages for Interactions with the ACE2 Receptor and Antibodies: Cooperation of Binding Hotspots in Mediating Epistatic Couplings Underlies Binding Mechanism and Immune Escape. Int J Mol Sci 2024; 25:4281. [PMID: 38673865 PMCID: PMC11049863 DOI: 10.3390/ijms25084281] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2024] [Revised: 04/09/2024] [Accepted: 04/11/2024] [Indexed: 04/28/2024] Open
Abstract
We performed a computational study of binding mechanisms for the SARS-CoV-2 spike Omicron XBB lineages with the host cell receptor ACE2 and a panel of diverse class one antibodies. The central objective of this investigation was to examine the molecular factors underlying epistatic couplings among convergent evolution hotspots that enable optimal balancing of ACE2 binding and antibody evasion for Omicron variants BA.1, BA.2, BA.3, BA.4/BA.5, BQ.1.1, XBB.1, XBB.1.5, and XBB.1.5 + L455F/F456L. By combining evolutionary analysis, molecular dynamics simulations, and ensemble-based mutational scanning of spike protein residues in complexes with ACE2, we identified structural stability and binding affinity hotspots that are consistent with the results of biochemical studies. In agreement with the results of deep mutational scanning experiments, our quantitative analysis correctly reproduced strong and variant-specific epistatic effects in the XBB.1.5 and BA.2 variants. It was shown that Y453W and F456L mutations can enhance ACE2 binding when coupled with Q493 in XBB.1.5, while these mutations become destabilized when coupled with the R493 position in the BA.2 variant. The results provided a molecular rationale of the epistatic mechanism in Omicron variants, showing a central role of the Q493/R493 hotspot in modulating epistatic couplings between convergent mutational sites L455F and F456L in XBB lineages. The results of mutational scanning and binding analysis of the Omicron XBB spike variants with ACE2 receptors and a panel of class one antibodies provide a quantitative rationale for the experimental evidence that epistatic interactions of the physically proximal binding hotspots Y501, R498, Q493, L455F, and F456L can determine strong ACE2 binding, while convergent mutational sites F456L and F486P are instrumental in mediating broad antibody resistance. The study supports a mechanism in which the impact on ACE2 binding affinity is mediated through a small group of universal binding hotspots, while the effect of immune evasion could be more variant-dependent and modulated by convergent mutational sites in the conformationally adaptable spike regions.
Affiliation(s)
- Nishank Raisinghani
- Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Mohammed Alshahrani
- Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Grace Gupta
- Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Gennady Verkhivker
- Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Department of Biomedical and Pharmaceutical Sciences, Chapman University School of Pharmacy, Irvine, CA 92618, USA
40
Dibyachintan S, Dube AK, Bradley D, Lemieux P, Dionne U, Landry CR. Cryptic genetic variation shapes the fate of gene duplicates in a protein interaction network. bioRxiv 2024:2024.02.23.581840. [PMID: 38464075 PMCID: PMC10925128 DOI: 10.1101/2024.02.23.581840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Paralogous genes are often redundant for long periods of time before they diverge in function. While their functions are preserved, paralogous proteins can accumulate mutations that, through epistasis, could impact their fate in the future. By quantifying the impact of all single-amino acid substitutions on the binding of two myosin proteins to their interaction partners, we find that the future evolution of these proteins is highly contingent on their regulatory divergence and the mutations that have silently accumulated in their protein binding domains. Differences in the promoter strength of the two paralogs amplify the impact of mutations on binding in the lowly expressed one. While some mutations would be sufficient to non-functionalize one paralog, they would have minimal impact on the other. Our results reveal how functionally equivalent protein domains could be destined to specific fates by regulatory and cryptic coding sequence changes that currently have little to no functional impact.
Affiliation(s)
- Soham Dibyachintan
- PROTEO-Regroupement Québécois de Recherche sur la Fonction, l'Ingénierie et les Applications des Protéines, Québec, QC, Canada
- Centre de Recherche en Données Massives de l'Université Laval, Université Laval, Québec, QC, Canada
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
- Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Québec, QC, Canada
- Alexandre K Dube
- PROTEO-Regroupement Québécois de Recherche sur la Fonction, l'Ingénierie et les Applications des Protéines, Québec, QC, Canada
- Centre de Recherche en Données Massives de l'Université Laval, Université Laval, Québec, QC, Canada
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
- Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Québec, QC, Canada
- Département de Biologie, Université Laval, Québec, QC, Canada
- David Bradley
- PROTEO-Regroupement Québécois de Recherche sur la Fonction, l'Ingénierie et les Applications des Protéines, Québec, QC, Canada
- Centre de Recherche en Données Massives de l'Université Laval, Université Laval, Québec, QC, Canada
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
- Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Québec, QC, Canada
- Département de Biologie, Université Laval, Québec, QC, Canada
- Pascale Lemieux
- PROTEO-Regroupement Québécois de Recherche sur la Fonction, l'Ingénierie et les Applications des Protéines, Québec, QC, Canada
- Centre de Recherche en Données Massives de l'Université Laval, Université Laval, Québec, QC, Canada
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
- Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Québec, QC, Canada
- Ugo Dionne
- PROTEO-Regroupement Québécois de Recherche sur la Fonction, l'Ingénierie et les Applications des Protéines, Québec, QC, Canada
- Centre de Recherche en Données Massives de l'Université Laval, Université Laval, Québec, QC, Canada
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
- Current affiliation: Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Christian R Landry
- PROTEO-Regroupement Québécois de Recherche sur la Fonction, l'Ingénierie et les Applications des Protéines, Québec, QC, Canada
- Centre de Recherche en Données Massives de l'Université Laval, Université Laval, Québec, QC, Canada
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
- Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Québec, QC, Canada
- Département de Biologie, Université Laval, Québec, QC, Canada
41
Nguyen A, Zhao H, Myagmarsuren D, Srinivasan S, Wu D, Chen J, Piszczek G, Schuck P. Modulation of Biophysical Properties of Nucleocapsid Protein in the Mutant Spectrum of SARS-CoV-2. bioRxiv 2024:2023.11.21.568093. [PMID: 38045241 PMCID: PMC10690151 DOI: 10.1101/2023.11.21.568093] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 12/05/2023]
Abstract
Genetic diversity is a hallmark of RNA viruses and the basis for their evolutionary success. Taking advantage of the uniquely large genomic database of SARS-CoV-2, we examine the impact of mutations across the spectrum of viable amino acid sequences on the biophysical phenotypes of the highly expressed and multifunctional nucleocapsid protein. We find variation in the physicochemical parameters of its extended intrinsically disordered regions (IDRs) sufficient to allow local plasticity, but also exhibiting functional constraints that similarly occur in related coronaviruses. In biophysical experiments with several N-protein species carrying mutations associated with major variants, we find that point mutations in the IDRs can have nonlocal impact and modulate thermodynamic stability, secondary structure, protein oligomeric state, particle formation, and liquid-liquid phase separation. In the Omicron variant, distant mutations in different IDRs have compensatory effects in shifting a delicate balance of interactions controlling protein assembly properties, and include the creation of a new protein-protein interaction interface in the N-terminal IDR through the defining P13L mutation. A picture emerges where genetic diversity is accompanied by significant variation in biophysical characteristics of functional N-protein species, in particular in the IDRs.
Affiliation(s)
- Ai Nguyen
- Laboratory of Dynamics of Macromolecular Assembly, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Bethesda, MD 20892, USA
- Huaying Zhao
- Laboratory of Dynamics of Macromolecular Assembly, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Bethesda, MD 20892, USA
- Dulguun Myagmarsuren
- Laboratory of Dynamics of Macromolecular Assembly, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Bethesda, MD 20892, USA
- Sanjana Srinivasan
- Laboratory of Dynamics of Macromolecular Assembly, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Bethesda, MD 20892, USA
- Di Wu
- Biophysics Core Facility, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Jiji Chen
- Advanced Imaging and Microscopy Resource, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Bethesda, MD 20892, USA
- Grzegorz Piszczek
- Biophysics Core Facility, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Peter Schuck
- Laboratory of Dynamics of Macromolecular Assembly, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Bethesda, MD 20892, USA
42
Wang X, Li A, Li X, Cui H. Empowering Protein Engineering through Recombination of Beneficial Substitutions. Chemistry 2024; 30:e202303889. [PMID: 38288640 DOI: 10.1002/chem.202303889] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Indexed: 02/24/2024]
Abstract
Directed evolution stands as a seminal technology for generating novel protein functionalities, a cornerstone in biocatalysis, metabolic engineering, and synthetic biology. Today, with the development of various mutagenesis methods and advanced analytical instruments, the challenges of diversity generation and high-throughput screening are largely solved. One of the remaining challenges is how to recombine single beneficial substitutions so that their combined, epistatic effect is realized. This review surveys experimental and computer-assisted recombination methods in protein engineering campaigns. In addition, integrated and machine learning-guided strategies are highlighted to discuss how these recombination approaches contribute to generating screening libraries with better diversity, coverage, and size. Finally, a decision tree is provided to guide the selection of suitable recombination strategies in practice and thereby accelerate protein engineering.
Affiliation(s)
- Xinyue Wang
- School of Food Science and Pharmaceutical Engineering, Nanjing Normal University, No. 2 Xuelin Road, Nanjing, 210097, China
- Anni Li
- School of Food Science and Pharmaceutical Engineering, Nanjing Normal University, No. 2 Xuelin Road, Nanjing, 210097, China
- Xiujuan Li
- School of Food Science and Pharmaceutical Engineering, Nanjing Normal University, No. 2 Xuelin Road, Nanjing, 210097, China
- Haiyang Cui
- School of Life Sciences, Nanjing Normal University, No. 2 Xuelin Road, Nanjing, 210097, China
43
Ferreiro D, Branco C, Arenas M. Selection among site-dependent structurally constrained substitution models of protein evolution by approximate Bayesian computation. Bioinformatics 2024; 40:btae096. [PMID: 38374231 PMCID: PMC10914458 DOI: 10.1093/bioinformatics/btae096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Revised: 01/15/2024] [Accepted: 02/16/2024] [Indexed: 02/21/2024] Open
Abstract
MOTIVATION: The selection among substitution models of molecular evolution is fundamental for obtaining accurate phylogenetic inferences. At the protein level, evolutionary analyses are traditionally based on empirical substitution models, but these models make unrealistic assumptions and are being surpassed by structurally constrained substitution (SCS) models. The SCS models often consider site-dependent evolution, a process that provides realism but complicates their implementation into likelihood functions that are commonly used for substitution model selection. RESULTS: We present a method to perform selection among site-dependent SCS models, and between empirical and site-dependent SCS models, based on the approximate Bayesian computation (ABC) approach and its implementation into the computational framework ProteinModelerABC. The framework implements ABC with and without regression adjustments and includes diverse empirical and site-dependent SCS models of protein evolution. Using extensive simulated data, we found that it provides selection among SCS and empirical models with acceptable accuracy. As illustrative examples, we applied the framework to analyze a variety of protein families, observing that SCS models fit them better than the corresponding best-fitting empirical substitution models. AVAILABILITY AND IMPLEMENTATION: ProteinModelerABC is freely available from https://github.com/DavidFerreiro/ProteinModelerABC, can run in parallel and includes a graphical user interface. The framework is distributed with detailed documentation and ready-to-use examples.
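The general idea of ABC model selection without a likelihood can be sketched with a plain rejection sampler: draw a model from the prior, simulate data under it, and keep the draw only if a summary statistic lands close to the observed one. The code below is a toy illustration under stated assumptions. The two simulators are hypothetical stand-ins (exponential distributions with different means), not SCS or empirical substitution models; the summary statistic and tolerance are arbitrary; and none of the names correspond to the ProteinModelerABC interface. It also omits the regression adjustment mentioned in the abstract.

```python
import random
import statistics


def simulate_m0(n, rng):
    # Hypothetical model 0: values drawn from an exponential with mean 1.
    return [rng.expovariate(1.0) for _ in range(n)]


def simulate_m1(n, rng):
    # Hypothetical model 1: values drawn from an exponential with mean 2.
    return [rng.expovariate(0.5) for _ in range(n)]


def summary(data):
    # A single summary statistic; real analyses typically use several.
    return statistics.mean(data)


def abc_model_selection(observed, simulators, n_sims=2000, tolerance=0.1, seed=0):
    """Rejection ABC: approximate posterior model probabilities."""
    rng = random.Random(seed)
    s_obs = summary(observed)
    accepted = [0] * len(simulators)
    for _ in range(n_sims):
        m = rng.randrange(len(simulators))  # uniform prior over models
        s_sim = summary(simulators[m](len(observed), rng))
        if abs(s_sim - s_obs) <= tolerance:  # keep simulations close to the data
            accepted[m] += 1
    total = sum(accepted)
    if total == 0:
        return None  # nothing accepted; the tolerance is too strict
    return [count / total for count in accepted]


# "Observed" data generated under model 1 for demonstration purposes only.
observed = simulate_m1(200, random.Random(1))
posterior = abc_model_selection(observed, [simulate_m0, simulate_m1], tolerance=0.2)
print(posterior)
```

The fraction of accepted simulations that came from each model approximates its posterior probability, so the model that reproduces the observed summary more often is preferred.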
Affiliation(s)
- David Ferreiro
- CINBIO, Universidade de Vigo, 36310 Vigo, Spain
- Department of Biochemistry, Genetics and Immunology, Universidade de Vigo, 36310 Vigo, Spain
- Catarina Branco
- CINBIO, Universidade de Vigo, 36310 Vigo, Spain
- Department of Biochemistry, Genetics and Immunology, Universidade de Vigo, 36310 Vigo, Spain
- Miguel Arenas
- CINBIO, Universidade de Vigo, 36310 Vigo, Spain
- Department of Biochemistry, Genetics and Immunology, Universidade de Vigo, 36310 Vigo, Spain
44
Seitz EE, McCandlish DM, Kinney JB, Koo PK. Interpreting cis-regulatory mechanisms from genomic deep neural networks using surrogate models. bioRxiv 2024:2023.11.14.567120. [PMID: 38013993 PMCID: PMC10680760 DOI: 10.1101/2023.11.14.567120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Deep neural networks (DNNs) have greatly advanced the ability to predict genome function from sequence. Interpreting genomic DNNs in terms of biological mechanisms, however, remains difficult. Here we introduce SQUID, a genomic DNN interpretability framework based on surrogate modeling. SQUID approximates genomic DNNs in user-specified regions of sequence space using surrogate models, i.e., simpler models that are mechanistically interpretable. Importantly, SQUID removes the confounding effects that nonlinearities and heteroscedastic noise in functional genomics data can have on model interpretation. Benchmarking analysis on multiple genomic DNNs shows that SQUID, when compared to established interpretability methods, identifies motifs that are more consistent across genomic loci and yields improved single-nucleotide variant-effect predictions. SQUID also supports surrogate models that quantify epistatic interactions within and between cis-regulatory elements. SQUID thus advances the ability to mechanistically interpret genomic DNNs.
Affiliation(s)
- Evan E Seitz
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- David M McCandlish
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Justin B Kinney
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Peter K Koo
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
45
Yao Z, Zhang L, Duan Y, Tang X, Lu J. Molecular insights into the adaptive evolution of SARS-CoV-2 spike protein. J Infect 2024; 88:106121. [PMID: 38367704 DOI: 10.1016/j.jinf.2024.106121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Revised: 02/02/2024] [Accepted: 02/10/2024] [Indexed: 02/19/2024]
Abstract
The COVID-19 pandemic, caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has substantially damaged the global economy and human health. The spike (S) protein of coronaviruses plays a pivotal role in viral entry by binding to host cell receptors. Additionally, it acts as the primary target for neutralizing antibodies in those infected and is the central focus for currently utilized or researched vaccines. During the virus's adaptation to the human host, the S protein of SARS-CoV-2 has undergone significant evolution. As the COVID-19 pandemic has unfolded, new mutations have arisen and vanished, giving rise to distinctive amino acid profiles within variant of concern strains of SARS-CoV-2. Notably, many of these changes in the S protein have been positively selected, leading to substantial alterations in viral characteristics, such as heightened transmissibility and immune evasion capabilities. This review aims to provide an overview of our current understanding of the structural implications associated with key amino acid changes in the S protein of SARS-CoV-2. These research findings shed light on the intricate and dynamic nature of viral evolution, underscoring the importance of continuous monitoring and analysis of viral genomes. Through these molecular-level investigations, we can attain deeper insights into the virus's adaptive evolution, offering valuable guidance for designing vaccines and developing antiviral drugs to combat the ever-evolving viral threats.
Affiliation(s)
- Zhuocheng Yao
  - College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China
- Lin Zhang
  - College of Fishery, Ocean University of China, Qingdao 266003, China
- Yuange Duan
  - State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing 100871, China
- Xiaolu Tang
  - State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing 100871, China
- Jian Lu
  - State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing 100871, China.
|
46
|
Wait SJ, Expòsit M, Lin S, Rappleye M, Lee JD, Colby SA, Torp L, Asencio A, Smith A, Regnier M, Moussavi-Harami F, Baker D, Kim CK, Berndt A. Machine learning-guided engineering of genetically encoded fluorescent calcium indicators. Nat Comput Sci 2024; 4:224-236. [PMID: 38532137 DOI: 10.1038/s43588-024-00611-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/06/2023] [Accepted: 02/15/2024] [Indexed: 03/28/2024]
Abstract
Here we used machine learning to engineer genetically encoded fluorescent indicators, protein-based sensors critical for real-time monitoring of biological activity. We used machine learning to predict the outcomes of sensor mutagenesis by analyzing established libraries that link sensor sequences to functions. Using the GCaMP calcium indicator as a scaffold, we developed an ensemble of three regression models trained on experimentally derived GCaMP mutation libraries. The trained ensemble performed an in silico functional screen on 1,423 novel, uncharacterized GCaMP variants. As a result, we identified the ensemble-derived GCaMP (eGCaMP) variants, eGCaMP and eGCaMP+, which achieve both faster kinetics and larger ∆F/F0 responses upon stimulation than previously published fast variants. Furthermore, we identified a combinatorial mutation with extraordinary dynamic range, eGCaMP2+, which outperforms the tested sixth-, seventh- and eighth-generation GCaMPs. These findings demonstrate the value of machine learning as a tool to facilitate the efficient engineering of proteins for desired biophysical characteristics.
Affiliation(s)
- Sarah J Wait
  - Molecular Engineering and Sciences Institute, University of Washington, Seattle, WA, USA
  - Institute of Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA, USA
- Marc Expòsit
  - Molecular Engineering and Sciences Institute, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Sophia Lin
  - Center for Neuroscience, University of California, Davis, Davis, CA, USA
  - Department of Neurology, University of California, Davis, Davis, CA, USA
- Michael Rappleye
  - Institute of Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA, USA
  - Department of Bioengineering, University of Washington, Seattle, WA, USA
  - Institute of Pharmacology and Toxicology, University of Zürich, Zurich, Switzerland
- Justin Daho Lee
  - Molecular Engineering and Sciences Institute, University of Washington, Seattle, WA, USA
  - Institute of Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA, USA
- Samuel A Colby
  - Molecular Engineering and Sciences Institute, University of Washington, Seattle, WA, USA
- Lily Torp
  - Institute of Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA, USA
  - Department of Bioengineering, University of Washington, Seattle, WA, USA
- Anthony Asencio
  - Institute of Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA, USA
  - Department of Bioengineering, University of Washington, Seattle, WA, USA
- Annette Smith
  - Institute of Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA, USA
- Michael Regnier
  - Institute of Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA, USA
  - Department of Bioengineering, University of Washington, Seattle, WA, USA
- Farid Moussavi-Harami
  - Institute of Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA, USA
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, USA
  - Division of Cardiology, University of Washington, Seattle, WA, USA
- David Baker
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Christina K Kim
  - Center for Neuroscience, University of California, Davis, Davis, CA, USA
  - Department of Neurology, University of California, Davis, Davis, CA, USA
- Andre Berndt
  - Molecular Engineering and Sciences Institute, University of Washington, Seattle, WA, USA.
  - Institute of Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA, USA.
  - Department of Bioengineering, University of Washington, Seattle, WA, USA.
  - Center for Neurobiology of Addiction, Pain, and Emotion, University of Washington, Seattle, WA, USA.
|
47
|
Yang J, Li FZ, Arnold FH. Opportunities and Challenges for Machine Learning-Assisted Enzyme Engineering. ACS Cent Sci 2024; 10:226-241. [PMID: 38435522 PMCID: PMC10906252 DOI: 10.1021/acscentsci.3c01275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 12/26/2023] [Accepted: 01/16/2024] [Indexed: 03/05/2024]
Abstract
Enzymes can be engineered at the level of their amino acid sequences to optimize key properties such as expression, stability, substrate range, and catalytic efficiency, or even to unlock new catalytic activities not found in nature. Because the search space of possible proteins is vast, enzyme engineering usually involves discovering an enzyme starting point that has some level of the desired activity followed by directed evolution to improve its "fitness" for a desired application. Recently, machine learning (ML) has emerged as a powerful tool to complement this empirical process. ML models can contribute to (1) starting point discovery by functional annotation of known protein sequences or generating novel protein sequences with desired functions and (2) navigating protein fitness landscapes for fitness optimization by learning mappings between protein sequences and their associated fitness values. In this Outlook, we explain how ML complements enzyme engineering and discuss its future potential to unlock improved engineering outcomes.
Affiliation(s)
- Jason Yang
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Francesca-Zhoufan Li
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Frances H. Arnold
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California 91125, United States
|
48
|
Zhang E, Neugebauer ME, Krasnow NA, Liu DR. Phage-assisted evolution of highly active cytosine base editors with enhanced selectivity and minimal sequence context preference. Nat Commun 2024; 15:1697. [PMID: 38402281 PMCID: PMC10894238 DOI: 10.1038/s41467-024-45969-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2023] [Accepted: 02/07/2024] [Indexed: 02/26/2024] Open
Abstract
TadA-derived cytosine base editors (TadCBEs) enable programmable C•G-to-T•A editing while retaining the small size, high on-target activity, and low off-target activity of TadA deaminases. Existing TadCBEs, however, exhibit residual A•T-to-G•C editing at certain positions and lower editing efficiencies at some sequence contexts and with non-SpCas9 targeting domains. To address these limitations, we use phage-assisted evolution to evolve CBE6s from a TadA-mediated dual cytosine and adenine base editor, discovering mutations at N46 and Y73 in TadA that prevent A•T-to-G•C editing and improve C•G-to-T•A editing with expanded sequence-context compatibility, respectively. In E. coli, CBE6 variants offer high C•G-to-T•A editing and no detected A•T-to-G•C editing in any sequence context. In human cells, CBE6 variants exhibit broad Cas domain compatibility and retain low off-target editing despite exceeding BE4max and previous TadCBEs in on-target editing efficiency. Finally, we show that the high selectivity of CBE6 variants is well-suited for therapeutically relevant stop codon installation without creating unwanted missense mutations from residual A•T-to-G•C editing.
Affiliation(s)
- Emily Zhang
  - Merkin Institute of Transformative Technologies in Healthcare, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA
  - Howard Hughes Medical Institute, Harvard University, Cambridge, MA, USA
- Monica E Neugebauer
  - Merkin Institute of Transformative Technologies in Healthcare, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA
  - Howard Hughes Medical Institute, Harvard University, Cambridge, MA, USA
- Nicholas A Krasnow
  - Merkin Institute of Transformative Technologies in Healthcare, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA
  - Howard Hughes Medical Institute, Harvard University, Cambridge, MA, USA
- David R Liu
  - Merkin Institute of Transformative Technologies in Healthcare, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
  - Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA.
  - Howard Hughes Medical Institute, Harvard University, Cambridge, MA, USA.
|
49
|
Clark JD, Mi X, Mitchell DA, Shukla D. Substrate Prediction for RiPP Biosynthetic Enzymes via Masked Language Modeling and Transfer Learning. arXiv 2024:arXiv:2402.15181v1. [PMID: 38463513 PMCID: PMC10925380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Ribosomally synthesized and post-translationally modified peptide (RiPP) biosynthetic enzymes often exhibit promiscuous substrate preferences that cannot be reduced to simple rules. Large language models are promising tools for predicting such peptide fitness landscapes. However, state-of-the-art protein language models are trained on relatively few peptide sequences. A previous study comprehensively profiled the peptide substrate preferences of LazBF (a two-component serine dehydratase) and LazDEF (a three-component azole synthetase) from the lactazole biosynthetic pathway. We demonstrated that masked language modeling of LazBF substrate preferences produced language model embeddings that improved downstream classification models of both LazBF and LazDEF substrates. Similarly, masked language modelling of LazDEF substrate preferences produced embeddings that improved the performance of classification models of both LazBF and LazDEF substrates. Our results suggest that the models learned functional forms that are transferable between distinct enzymatic transformations that act within the same biosynthetic pathway. Our transfer learning method improved performance and data efficiency in data-scarce scenarios. We then fine-tuned models on each data set and showed that the fine-tuned models provided interpretable insight that we anticipate will facilitate the design of substrate libraries that are compatible with desired RiPP biosynthetic pathways.
Affiliation(s)
- Joseph D Clark
  - School of Molecular and Cellular Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Xuenan Mi
  - Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Douglas A Mitchell
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Diwakar Shukla
  - Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
  - Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
  - Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
|
50
|
Radojković M, Ubbink M. Positive epistasis drives clavulanic acid resistance in double mutant libraries of BlaC β-lactamase. Commun Biol 2024; 7:197. [PMID: 38368480 PMCID: PMC10874438 DOI: 10.1038/s42003-024-05868-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Accepted: 01/26/2024] [Indexed: 02/19/2024] Open
Abstract
Phenotypic effects of mutations are highly dependent on the genetic backgrounds in which they occur, due to epistatic effects. To test how easily the loss of enzyme activity can be compensated for, we screen mutant libraries of BlaC, a β-lactamase from Mycobacterium tuberculosis, for fitness in the presence of carbenicillin and the inhibitor clavulanic acid. Using a semi-rational approach and deep sequencing, we prepare four double-site saturation libraries and determine the relative fitness effect for 1534/1540 (99.6%) of the unique library members at two temperatures. Each library comprises variants of a residue known to be relevant for clavulanic acid resistance as well as residue 105, which regulates access to the active site. Variants with greatly improved fitness were identified within each library, demonstrating that compensatory mutations for loss of activity can be readily found. In most cases, the fittest variants are a result of positive epistasis, indicating strong synergistic effects between the chosen residue pairs. Our study sheds light on a role of epistasis in the evolution of functional residues and underlines the highly adaptive potential of BlaC.
Affiliation(s)
- Marko Radojković
  - Leiden Institute of Chemistry, Leiden University, Einsteinweg 55, 2333 CC, Leiden, The Netherlands
- Marcellus Ubbink
  - Leiden Institute of Chemistry, Leiden University, Einsteinweg 55, 2333 CC, Leiden, The Netherlands.
|