1
Landwehr GM, Bogart JW, Magalhaes C, Hammarlund EG, Karim AS, Jewett MC. Accelerated enzyme engineering by machine-learning guided cell-free expression. Nat Commun 2025; 16:865. [PMID: 39833164 PMCID: PMC11747319 DOI: 10.1038/s41467-024-55399-0] [Received: 07/30/2024] [Accepted: 12/09/2024] [Indexed: 01/22/2025]
Abstract
Enzyme engineering is limited by the challenge of rapidly generating and using large datasets of sequence-function relationships for predictive design. To address this challenge, we develop a machine learning (ML)-guided platform that integrates cell-free DNA assembly, cell-free gene expression, and functional assays to rapidly map fitness landscapes across protein sequence space and optimize enzymes for multiple, distinct chemical reactions. We apply this platform to engineer amide synthetases by evaluating substrate preference for 1217 enzyme variants in 10,953 unique reactions. We use these data to build augmented ridge regression ML models for predicting amide synthetase variants capable of making 9 small molecule pharmaceuticals. Over these nine compounds, ML-predicted enzyme variants demonstrate 1.6- to 42-fold improved activity relative to the parent. Our ML-guided, cell-free framework promises to accelerate enzyme engineering by enabling iterative exploration of protein sequence space to build specialized biocatalysts in parallel.
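As a rough illustration of the kind of sequence-to-function regression this abstract refers to, the sketch below fits plain ridge regression to one-hot encoded variants and ranks unseen candidates. It is not the authors' augmented model: the three-residue sequences, the activity values, the `alpha` setting, and every function name are hypothetical and chosen only to keep the example small.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def one_hot(seq):
    """Encode a sequence as a flat vector of length len(seq) * 20."""
    vec = np.zeros((len(seq), len(AMINO_ACIDS)))
    for pos, aa in enumerate(seq):
        vec[pos, AMINO_ACIDS.index(aa)] = 1.0
    return vec.ravel()

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge on mean-centered targets:
    w = (X^T X + alpha * I)^-1 X^T (y - mean(y))."""
    y_mean = y.mean()
    A = X.T @ X + alpha * np.eye(X.shape[1])
    w = np.linalg.solve(A, X.T @ (y - y_mean))
    return w, y_mean

def predict(X, w, y_mean):
    """Predicted activity for each encoded variant."""
    return X @ w + y_mean

# Hypothetical training set: parent "ACD" plus three single mutants,
# each with a made-up relative activity.
train = {"ACD": 1.0, "GCD": 2.0, "ACE": 1.5, "AWD": 0.5}
X_train = np.stack([one_hot(s) for s in train])
y_train = np.array(list(train.values()))
w, y_mean = fit_ridge(X_train, y_train, alpha=0.1)

# Rank unseen double mutants by predicted activity, best first.
candidates = ["GCE", "GWD", "AWE"]
scores = predict(np.stack([one_hot(s) for s in candidates]), w, y_mean)
score_by_variant = dict(zip(candidates, scores))
ranked = sorted(candidates, key=lambda c: score_by_variant[c], reverse=True)
print(ranked)
```

A purely additive model like this one cannot capture interactions between positions, which is one reason to treat its rankings as a starting point for the next round of experiments rather than as a final answer.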
Affiliation(s)
- Grant M Landwehr
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
- Center for Synthetic Biology, Northwestern University, Evanston, IL, USA
- Jonathan W Bogart
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
- Center for Synthetic Biology, Northwestern University, Evanston, IL, USA
- Carol Magalhaes
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
- Center for Synthetic Biology, Northwestern University, Evanston, IL, USA
- Eric G Hammarlund
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
- Center for Synthetic Biology, Northwestern University, Evanston, IL, USA
- Ashty S Karim
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA.
- Center for Synthetic Biology, Northwestern University, Evanston, IL, USA.
- Michael C Jewett
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA.
- Center for Synthetic Biology, Northwestern University, Evanston, IL, USA.
- Department of Bioengineering, Stanford University, Stanford, CA, USA.
2
Barkman TJ. Applications of ancestral sequence reconstruction for understanding the evolution of plant specialized metabolism. Philos Trans R Soc Lond B Biol Sci 2024; 379:20230348. [PMID: 39343033 PMCID: PMC11439504 DOI: 10.1098/rstb.2023.0348] [Received: 03/06/2024] [Revised: 04/10/2024] [Accepted: 04/15/2024] [Indexed: 10/01/2024]
Abstract
Studies of enzymes in modern-day plants have documented the diversity of metabolic activities retained by species today but only provide limited insight into how those properties evolved. Ancestral sequence reconstruction (ASR) is an approach that provides statistical estimates of ancient plant enzyme sequences which can then be resurrected to test hypotheses about the evolution of catalytic activities and pathway assembly. Here, I review the insights that have been obtained using ASR to study plant metabolism and highlight important methodological aspects. Overall, studies of resurrected plant enzymes show that (i) exaptation is widespread such that even low or undetectable levels of ancestral activity with a substrate can later become the apparent primary activity of descendant enzymes, (ii) intramolecular epistasis may or may not limit evolutionary paths towards catalytic or substrate preference switches, and (iii) ancient pathway flux often differs from modern-day metabolic networks. These and other insights gained from ASR would not have been possible using only modern-day sequences. Future ASR studies characterizing entire ancestral metabolic networks as well as those that link ancient structures with enzymatic properties should continue to provide novel insights into how the chemical diversity of plants evolved. This article is part of the theme issue 'The evolution of plant metabolism'.
Affiliation(s)
- Todd J. Barkman
- Department of Biological Sciences, Western Michigan University, Kalamazoo, MI 49008, USA
3
Vila JA. The origin of mutational epistasis. Eur Biophys J 2024; 53:473-480. [PMID: 39443382 DOI: 10.1007/s00249-024-01725-9] [Received: 06/20/2024] [Revised: 10/03/2024] [Accepted: 10/06/2024] [Indexed: 10/25/2024]
Abstract
The interconnected processes of protein folding, mutations, epistasis, and evolution have all been the subject of extensive analysis throughout the years due to their significance for structural and evolutionary biology. The origin (molecular basis) of epistasis, the non-additive interactions between mutations, is nonetheless still unknown. The existence of a new perspective on protein folding, a problem that needs to be conceived as an 'analytic whole', will enable us to shed light on the origin of mutational epistasis at the simplest level (within proteins) while also uncovering the reasons why the genetic background in which they occur, a key component of molecular evolution, could foster changes in epistasis effects. Additionally, because mutations are the source of epistasis, more research is needed to determine the impact of post-translational modifications, which can potentially increase the proteome's diversity by several orders of magnitude, on mutational epistasis and protein evolvability. Finally, a protein evolution thermodynamic-based analysis that does not consider specific mutational steps or epistasis effects will be briefly discussed. Our study explores the complex processes behind the evolution of proteins upon mutations, clearing up some previously unresolved issues, and providing direction for further research.
Affiliation(s)
- Jorge A Vila
- IMASL-CONICET, Ejército de Los Andes 950, 5700, San Luis, Argentina.
4
Tripp A, Braun M, Wieser F, Oberdorfer G, Lechner H. Click, Compute, Create: A Review of Web-based Tools for Enzyme Engineering. Chembiochem 2024; 25:e202400092. [PMID: 38634409 DOI: 10.1002/cbic.202400092] [Received: 01/31/2024] [Revised: 04/14/2024] [Accepted: 04/15/2024] [Indexed: 04/19/2024]
Abstract
Enzyme engineering, though pivotal across various biotechnological domains, is often plagued by its time-consuming and labor-intensive nature. This review aims to offer an overview of supportive in silico methodologies for this demanding endeavor. Starting from methods to predict protein structures, to classify their activity, and even to discover new enzymes, we continue by describing tools used to increase the thermostability and production yields of selected targets. Subsequently, we discuss computational methods to modulate both the activity and the selectivity of enzymes. Last, we present recent approaches based on cutting-edge machine learning methods to redesign enzymes. With the exception of the last chapter, there is a strong focus on methods easily accessible via web interfaces or simple Python scripts, which are therefore readily usable by a diverse and broad community.
Affiliation(s)
- Adrian Tripp
- Institute of Biochemistry, Graz University of Technology, Petersgasse 12/2, 8010, Graz, Austria
- Markus Braun
- Institute of Biochemistry, Graz University of Technology, Petersgasse 12/2, 8010, Graz, Austria
- Florian Wieser
- Institute of Biochemistry, Graz University of Technology, Petersgasse 12/2, 8010, Graz, Austria
- Gustav Oberdorfer
- Institute of Biochemistry, Graz University of Technology, Petersgasse 12/2, 8010, Graz, Austria
- BioTechMed, Graz, Austria
- Horst Lechner
- Institute of Biochemistry, Graz University of Technology, Petersgasse 12/2, 8010, Graz, Austria
- BioTechMed, Graz, Austria
5
Gantz M, Mathis SV, Nintzel FEH, Lio P, Hollfelder F. On synergy between ultrahigh throughput screening and machine learning in biocatalyst engineering. Faraday Discuss 2024; 252:89-114. [PMID: 39133073 PMCID: PMC11318516 DOI: 10.1039/d4fd00065j] [Received: 03/25/2024] [Accepted: 04/23/2024] [Indexed: 08/13/2024]
Abstract
Protein design and directed evolution have separately contributed enormously to protein engineering. Without being mutually exclusive, the former relies on computation from first principles, while the latter is a combinatorial approach based on chance. Advances in ultrahigh throughput (uHT) screening, next generation sequencing and machine learning may create alternative routes to engineered proteins, where functional information linked to specific sequences is interpreted and extrapolated in silico. In particular, the miniaturisation of functional tests in water-in-oil emulsion droplets with picoliter volumes and their rapid generation and analysis (>1 kHz) allows screening of >10^7-membered libraries in a day. Subsequently, decoding the selected clones by short or long-read sequencing methods leads to large sequence-function datasets that may allow extrapolation from experimental directed evolution to further improved mutants beyond the observed hits. In this work, we explore experimental strategies for how to draw up 'fitness landscapes' in sequence space with uHT droplet microfluidics, review the current state of AI/ML in enzyme engineering and discuss how uHT datasets may be combined with AI/ML to make meaningful predictions and accelerate biocatalyst engineering.
Affiliation(s)
- Maximilian Gantz
- Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge, CB2 1GA, UK
- Simon V Mathis
- Department of Computer Science, University of Cambridge, 15 JJ Thomson Avenue, Cambridge CB3 0FD, UK
- Friederike E H Nintzel
- Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge, CB2 1GA, UK
- Pietro Lio
- Department of Computer Science, University of Cambridge, 15 JJ Thomson Avenue, Cambridge CB3 0FD, UK
- Florian Hollfelder
- Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge, CB2 1GA, UK
6
Hollmann F, Sanchis J, Reetz MT. Learning from Protein Engineering by Deconvolution of Multi-Mutational Variants. Angew Chem Int Ed Engl 2024; 63:e202404880. [PMID: 38884594 DOI: 10.1002/anie.202404880] [Received: 03/11/2024] [Revised: 06/05/2024] [Accepted: 06/06/2024] [Indexed: 06/18/2024]
Abstract
This review analyzes a development in biochemistry, enzymology and biotechnology that originally came as a surprise. Following the establishment of directed evolution of stereoselective enzymes in organic chemistry, the concept of partial or complete deconvolution of selective multi-mutational variants was introduced. Early deconvolution experiments of stereoselective variants led to the finding that mutations can interact cooperatively or antagonistically with one another, not just additively. During the past decade, this phenomenon was shown to be general. In some studies, molecular dynamics (MD) and quantum mechanics/molecular mechanics (QM/MM) computations were performed in order to shed light on the origin of non-additivity at all stages of an evolutionary upward climb. Data of complete deconvolution can be used to construct unique multi-dimensional rugged fitness pathway landscapes, which provide mechanistic insights different from traditional fitness landscapes. Along a related line, biochemists have long tested the result of introducing two point mutations in an enzyme for mechanistic reasons, followed by a comparison of the respective double mutant in so-called double mutant cycles, which originally showed only additive effects, but more recently also uncovered cooperative and antagonistic non-additive effects. We conclude with suggestions for future work, and call for a unified overall picture of non-additivity and epistasis.
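The double mutant cycle mentioned in this abstract can be made concrete with a small calculation. The sketch below is an illustration only, under stated assumptions: activities are compared on a log scale, a higher value is better, both single mutations are beneficial, and the tolerance `tol` and all function names are hypothetical. It is not a method taken from the review.

```python
import math

def epistasis(f_wt, f_a, f_b, f_ab):
    """Deviation of the double mutant from additivity on a log scale.

    Under additivity, ln(f_ab / f_wt) = ln(f_a / f_wt) + ln(f_b / f_wt),
    so the returned value is zero.
    """
    return math.log(f_ab) + math.log(f_wt) - math.log(f_a) - math.log(f_b)

def classify(eps, tol=0.05):
    """Label the interaction; `tol` is an arbitrary noise threshold (assumption)."""
    if abs(eps) <= tol:
        return "additive"
    return "cooperative" if eps > 0 else "antagonistic"

# Hypothetical activities relative to wild type (wt = 1.0).
cycles = {
    "additive example": (1.0, 2.0, 3.0, 6.0),
    "cooperative example": (1.0, 2.0, 3.0, 12.0),
    "antagonistic example": (1.0, 2.0, 3.0, 3.0),
}
labels = {name: classify(epistasis(*values)) for name, values in cycles.items()}
for name, label in labels.items():
    print(name, "->", label)
```

With more than two mutations, the same comparison would be repeated for every pair or higher-order combination obtained by deconvolution, which is why complete deconvolution data grow quickly with the number of mutations.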
Affiliation(s)
- Frank Hollmann
- Department of Biotechnology, Delft University of Technology, Van der Maasweg 9, 2629HZ, Delft, Netherlands
- Joaquin Sanchis
- Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, Victoria, 3052, Australia
- Manfred T Reetz
- Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45481, Mülheim, Germany
- Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China
7
Cheng P, Mao C, Tang J, Yang S, Cheng Y, Wang W, Gu Q, Han W, Chen H, Li S, Chen Y, Zhou J, Li W, Pan A, Zhao S, Huang X, Zhu S, Zhang J, Shu W, Wang S. Zero-shot prediction of mutation effects with multimodal deep representation learning guides protein engineering. Cell Res 2024; 34:630-647. [PMID: 38969803 PMCID: PMC11369238 DOI: 10.1038/s41422-024-00989-2] [Received: 03/13/2024] [Accepted: 06/03/2024] [Indexed: 07/07/2024]
Abstract
Mutations in amino acid sequences can provoke changes in protein function. Accurate and unsupervised prediction of mutation effects is critical in biotechnology and biomedicine, but remains a fundamental challenge. To resolve this challenge, here we present Protein Mutational Effect Predictor (ProMEP), a general and multiple sequence alignment-free method that enables zero-shot prediction of mutation effects. A multimodal deep representation learning model embedded in ProMEP was developed to comprehensively learn both sequence and structure contexts from ~160 million proteins. ProMEP achieves state-of-the-art performance in mutational effect prediction and accomplishes a tremendous improvement in speed, enabling efficient and intelligent protein engineering. Specifically, ProMEP accurately forecasts mutational consequences on the gene-editing enzymes TnpB and TadA, and successfully guides the development of high-performance gene-editing tools with their engineered variants. The gene-editing efficiency of a 5-site mutant of TnpB reaches up to 74.04% (vs 24.66% for the wild type); and the base editing tool developed on the basis of a TadA 15-site mutant (in addition to the A106V/D108N double mutation that renders deoxyadenosine deaminase activity to TadA) exhibits an A-to-G conversion frequency of up to 77.27% (vs 69.80% for ABE8e, a previous TadA-based adenine base editor) with significantly reduced bystander and off-target effects compared to ABE8e. ProMEP not only showcases superior performance in predicting mutational effects on proteins but also demonstrates a great capability to guide protein engineering. Therefore, ProMEP enables efficient exploration of the gigantic protein space and facilitates practical design of proteins, thereby advancing studies in biomedicine and synthetic biology.
Affiliation(s)
- Peng Cheng
- Bioinformatics Center of AMMS, Beijing, China
- Cong Mao
- State Key Laboratory of Reproductive Medicine and Offspring Health, Women's Hospital of Nanjing Medical University, Nanjing Maternity and Child Health Care Hospital, Nanjing Medical University, Nanjing, Jiangsu, China
- Jin Tang
- Zhejiang Lab, Hangzhou, Zhejiang, China
- Sen Yang
- Bioinformatics Center of AMMS, Beijing, China
- Yu Cheng
- State Key Laboratory of Reproductive Medicine and Offspring Health, Women's Hospital of Nanjing Medical University, Nanjing Maternity and Child Health Care Hospital, Nanjing Medical University, Nanjing, Jiangsu, China
- Wuke Wang
- Zhejiang Lab, Hangzhou, Zhejiang, China
- Qiuxi Gu
- State Key Laboratory of Reproductive Medicine and Offspring Health, Women's Hospital of Nanjing Medical University, Nanjing Maternity and Child Health Care Hospital, Nanjing Medical University, Nanjing, Jiangsu, China
- Wei Han
- Zhejiang Lab, Hangzhou, Zhejiang, China
- Hao Chen
- State Key Laboratory of Reproductive Medicine and Offspring Health, Women's Hospital of Nanjing Medical University, Nanjing Maternity and Child Health Care Hospital, Nanjing Medical University, Nanjing, Jiangsu, China
- Sihan Li
- State Key Laboratory of Reproductive Medicine and Offspring Health, Women's Hospital of Nanjing Medical University, Nanjing Maternity and Child Health Care Hospital, Nanjing Medical University, Nanjing, Jiangsu, China
- Wuju Li
- Bioinformatics Center of AMMS, Beijing, China
- Aimin Pan
- Zhejiang Lab, Hangzhou, Zhejiang, China
- Suwen Zhao
- iHuman Institute, ShanghaiTech University, Shanghai, China
- School of Life Science and Technology, ShanghaiTech University, Shanghai, China
- Xingxu Huang
- Zhejiang Lab, Hangzhou, Zhejiang, China
- School of Life Science and Technology, ShanghaiTech University, Shanghai, China
- Jun Zhang
- State Key Laboratory of Reproductive Medicine and Offspring Health, Women's Hospital of Nanjing Medical University, Nanjing Maternity and Child Health Care Hospital, Nanjing Medical University, Nanjing, Jiangsu, China.
- Wenjie Shu
- Bioinformatics Center of AMMS, Beijing, China.
8
Gonçalves C, Harrison MC, Steenwyk JL, Opulente DA, LaBella AL, Wolters JF, Zhou X, Shen XX, Groenewald M, Hittinger CT, Rokas A. Diverse signatures of convergent evolution in cactus-associated yeasts. PLoS Biol 2024; 22:e3002832. [PMID: 39312572 PMCID: PMC11449361 DOI: 10.1371/journal.pbio.3002832] [Received: 06/22/2024] [Revised: 10/03/2024] [Accepted: 09/05/2024] [Indexed: 09/25/2024]
Abstract
Many distantly related organisms have convergently evolved traits and lifestyles that enable them to live in similar ecological environments. However, the extent of phenotypic convergence evolving through the same or distinct genetic trajectories remains an open question. Here, we leverage a comprehensive dataset of genomic and phenotypic data from 1,049 yeast species in the subphylum Saccharomycotina (Kingdom Fungi, Phylum Ascomycota) to explore signatures of convergent evolution in cactophilic yeasts, ecological specialists associated with cacti. We inferred that the ecological association of yeasts with cacti arose independently approximately 17 times. Using a machine learning-based approach, we further found that cactophily can be predicted with 76% accuracy from both functional genomic and phenotypic data. The most informative feature for predicting cactophily was thermotolerance, which we found to be likely associated with altered evolutionary rates of genes impacting the cell envelope in several cactophilic lineages. We also identified horizontal gene transfer and duplication events of plant cell wall-degrading enzymes in distantly related cactophilic clades, suggesting that putatively adaptive traits evolved independently through disparate molecular mechanisms. Notably, we found that multiple cactophilic species and their close relatives have been reported as emerging human opportunistic pathogens, suggesting that the cactophilic lifestyle (and perhaps more generally lifestyles favoring thermotolerance) might preadapt yeasts to cause human disease. This work underscores the potential of a multifaceted approach involving high-throughput genomic and phenotypic data to shed light onto ecological adaptation and highlights how convergent evolution to wild environments could facilitate the transition to human pathogenicity.
Affiliation(s)
- Carla Gonçalves
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, Tennessee, United States of America
- Associate Laboratory i4HB—Institute for Health and Bioeconomy and UCIBIO—Applied Molecular Biosciences Unit, Department of Life Sciences, NOVA School of Science and Technology, Universidade NOVA de Lisboa, Caparica, Portugal
- UCIBIO-i4HB, Departamento de Ciências da Vida, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, Caparica, Portugal
- Marie-Claire Harrison
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, Tennessee, United States of America
- Jacob L. Steenwyk
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, Tennessee, United States of America
- Howard Hughes Medical Institute and the Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, California, United States of America
- Dana A. Opulente
- Laboratory of Genetics, DOE Great Lakes Bioenergy Research Center, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Biology Department, Villanova University, Villanova, Pennsylvania, United States of America
- Abigail L. LaBella
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, Tennessee, United States of America
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, North Carolina, United States of America
- John F. Wolters
- Laboratory of Genetics, DOE Great Lakes Bioenergy Research Center, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Xiaofan Zhou
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, Tennessee, United States of America
- Guangdong Province Key Laboratory of Microbial Signals and Disease Control, Integrative Microbiology Research Center, South China Agricultural University, Guangzhou, China
- Xing-Xing Shen
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, Tennessee, United States of America
- College of Agriculture and Biotechnology and Centre for Evolutionary & Organismal Biology, Zhejiang University, Hangzhou, China
- Chris Todd Hittinger
- Laboratory of Genetics, DOE Great Lakes Bioenergy Research Center, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Antonis Rokas
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, Tennessee, United States of America
9
Lipsh-Sokolik R, Fleishman SJ. Addressing epistasis in the design of protein function. Proc Natl Acad Sci U S A 2024; 121:e2314999121. [PMID: 39133844 PMCID: PMC11348311 DOI: 10.1073/pnas.2314999121] [Indexed: 08/29/2024]
Abstract
Mutations in protein active sites can dramatically improve function. The active site, however, is densely packed and extremely sensitive to mutations. Therefore, some mutations may only be tolerated in combination with others in a phenomenon known as epistasis. Epistasis reduces the likelihood of obtaining improved functional variants and dramatically slows natural and lab evolutionary processes. Research has shed light on the molecular origins of epistasis and its role in shaping evolutionary trajectories and outcomes. In addition, sequence- and AI-based strategies that infer epistatic relationships from mutational patterns in natural or experimental evolution data have been used to design functional protein variants. In recent years, combinations of such approaches and atomistic design calculations have successfully predicted highly functional combinatorial mutations in active sites. These were used to design thousands of functional active-site variants, demonstrating that, while our understanding of epistasis remains incomplete, some of the determinants that are critical for accurate design are now sufficiently understood. We conclude that the space of active-site variants that has been explored by evolution may be expanded dramatically to enhance natural activities or discover new ones. Furthermore, design opens the way to systematically exploring sequence and structure space and mutational impacts on function, deepening our understanding and control over protein activity.
Affiliation(s)
- Rosalie Lipsh-Sokolik
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 7610001, Israel
- Sarel J Fleishman
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 7610001, Israel
10
Vila JA. Analysis of proteins in the light of mutations. Eur Biophys J 2024; 53:255-265. [PMID: 38955858 DOI: 10.1007/s00249-024-01714-y] [Received: 11/09/2023] [Revised: 05/23/2024] [Accepted: 06/18/2024] [Indexed: 07/04/2024]
Abstract
Proteins have evolved through mutations (amino acid substitutions) since life appeared on Earth, some 10^9 years ago. The study of these phenomena has been of particular significance because of their impact on protein stability, function, and structure. This study offers a new viewpoint on how the most recent findings in these areas can be used to explore the impact of mutations on protein sequence, stability, and evolvability. Preliminary results indicate that: (1) mutations can be viewed as sensitive probes to identify 'typos' in the amino-acid sequence, and also to assess the resistance of naturally occurring proteins to unwanted sequence alterations; (2) the presence of 'typos' in the amino acid sequence, rather than being an evolutionary obstacle, could promote faster evolvability and, in turn, increase the likelihood of higher protein stability; (3) the mutation site is far more important than the substituted amino acid in terms of the marginal stability changes of the protein, and (4) the unpredictability of protein evolution at the molecular level (by mutations) exists even in the absence of epistasis effects. Finally, the Darwinian concept of evolution "descent with modification" and experimental evidence endorse one of the results of this study, which suggests that some regions of any protein sequence are susceptible to mutations while others are not. This work contributes to our general understanding of protein responses to mutations and may spur significant progress in our efforts to develop methods to accurately forecast changes in protein stability, their propensity for metamorphism, and their ability to evolve.
Affiliation(s)
- Jorge A Vila
- IMASL-CONICET, Universidad Nacional de San Luis, Ejército de los Andes 950, 5700, San Luis, Argentina.
11
Chen L, Yu K, Ma A, Zhu W, Wang H, Tang X, Tang Y, Li Y, Li J. Enhanced Thermostability of Nattokinase by Computation-Based Rational Redesign of Flexible Regions. J Agric Food Chem 2024; 72:14241-14254. [PMID: 38864682 DOI: 10.1021/acs.jafc.4c02335] [Indexed: 06/13/2024]
Abstract
Nattokinase is a nutrient in the health food natto that helps prevent and treat blood clots. However, its low thermostability and fibrinolytic activity limit its application in food and pharmaceuticals. In this study, we used bioinformatics analysis to identify two loops (loop10 and loop12) in the flexible region of nattokinase rAprY. On this basis, we used a thermostability prediction algorithm to screen for the G131S-S161T variant, which showed a 2.38-fold increase in half-life at 55 °C, and the M3 variant, which showed a 2.01-fold increase in activity. Bioinformatics analysis revealed that the enhanced thermostability of the G131S-S161T variant was due to the increased rigidity and structural shrinkage of the overall structure. Additionally, the increased rigidity of the local region surrounding the active center and its mutated sites helps maintain its normal conformation in high-temperature environments. The increased catalytic activity of the M3 variant may be due to its more efficient substrate binding mechanism. We investigated strategies to improve the thermostability and fibrinolytic activity of nattokinase, and the resulting variants show promise for industrial production and application.
Affiliation(s)
- Liangqi Chen
- Institute of Materia Medica, College of Pharmacy, Xinjiang University, Urumqi 830017, China
- Xinjiang Key Laboratory of Biological Resources and Genetic Engineering, College of Life Science and Technology, Xinjiang University, Urumqi 830017, China
- Kongfang Yu
- Institute of Materia Medica, College of Pharmacy, Xinjiang University, Urumqi 830017, China
- Aixia Ma
- Institute of Materia Medica, College of Pharmacy, Xinjiang University, Urumqi 830017, China
- Xinjiang Key Laboratory of Biological Resources and Genetic Engineering, College of Life Science and Technology, Xinjiang University, Urumqi 830017, China
- Wenhui Zhu
- Institute of Materia Medica, College of Pharmacy, Xinjiang University, Urumqi 830017, China
- Hong Wang
- Institute of Materia Medica, College of Pharmacy, Xinjiang University, Urumqi 830017, China
- Xiyu Tang
- Institute of Materia Medica, College of Pharmacy, Xinjiang University, Urumqi 830017, China
- Xinjiang Key Laboratory of Biological Resources and Genetic Engineering, College of Life Science and Technology, Xinjiang University, Urumqi 830017, China
| | - Yaolei Tang
- Xinjiang Key Laboratory of Biological Resources and Genetic Engineering, College of Life Science and Technology, Xinjiang University, Urumqi 830017, China
- The Third People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi 830000, China
| | - Yuan Li
- Institute of Materia Medica, College of Pharmacy, Xinjiang University, Urumqi 830017, China
- Xinjiang Key Laboratory of Biological Resources and Genetic Engineering, College of Life Science and Technology, Xinjiang University, Urumqi 830017, China
| | - Jinyao Li
- Institute of Materia Medica, College of Pharmacy, Xinjiang University, Urumqi 830017, China
- Xinjiang Key Laboratory of Biological Resources and Genetic Engineering, College of Life Science and Technology, Xinjiang University, Urumqi 830017, China
| |
|
12
|
Joseph J. Increased Positive Selection in Highly Recombining Genes Does not Necessarily Reflect an Evolutionary Advantage of Recombination. Mol Biol Evol 2024; 41:msae107. [PMID: 38829800 PMCID: PMC11173204 DOI: 10.1093/molbev/msae107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Revised: 04/08/2024] [Accepted: 05/28/2024] [Indexed: 06/05/2024] Open
Abstract
It is commonly thought that the long-term advantage of meiotic recombination is to dissipate genetic linkage, allowing natural selection to act independently on different loci. It is thus theoretically expected that genes with higher recombination rates evolve under more effective selection. On the other hand, recombination is often associated with GC-biased gene conversion (gBGC), which theoretically interferes with selection by promoting the fixation of deleterious GC alleles. To test these predictions, several studies assessed whether selection was more effective in highly recombining genes (due to dissipation of genetic linkage) or less effective (due to gBGC), assuming a fixed distribution of fitness effects (DFE) for all genes. In this study, I directly derive the DFE from a gene's evolutionary history (shaped by mutation, selection, drift, and gBGC) under empirical fitness landscapes. I show that genes that have experienced high levels of gBGC are less fit and thus have more opportunities for beneficial mutations. Even a small decrease in the genome-wide intensity of gBGC leads to the fixation of these beneficial mutations, particularly in highly recombining genes. This results in increased positive selection in highly recombining genes that is not caused by more effective selection. Additionally, I show that the death of a recombination hotspot can lead to a higher dN/dS than its birth, but with substitution patterns biased towards AT, and only at selected positions. This shows that controlling for a substitution bias towards GC is not sufficient to rule out the contribution of gBGC to signatures of accelerated evolution. Finally, although gBGC does not affect the fixation probability of GC-conservative mutations, I show that by altering the DFE, gBGC can also significantly affect nonsynonymous GC-conservative substitution patterns.
Affiliation(s)
- Julien Joseph
- Laboratoire de Biométrie et Biologie Evolutive, Université Lyon 1, CNRS, UMR 5558, Villeurbanne, France
| |
|
13
|
Metzger BPH, Park Y, Starr TN, Thornton JW. Epistasis facilitates functional evolution in an ancient transcription factor. eLife 2024; 12:RP88737. [PMID: 38767330 PMCID: PMC11105156 DOI: 10.7554/elife.88737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024] Open
Abstract
A protein's genetic architecture - the set of causal rules by which its sequence produces its functions - also determines its possible evolutionary trajectories. Prior research has proposed that the genetic architecture of proteins is very complex, with pervasive epistatic interactions that constrain evolution and make function difficult to predict from sequence. Most of this work has analyzed only the direct paths between two proteins of interest - excluding the vast majority of possible genotypes and evolutionary trajectories - and has considered only a single protein function, leaving unaddressed the genetic architecture of functional specificity and its impact on the evolution of new functions. Here, we develop a new method based on ordinal logistic regression to directly characterize the global genetic determinants of multiple protein functions from 20-state combinatorial deep mutational scanning (DMS) experiments. We use it to dissect the genetic architecture and evolution of a transcription factor's specificity for DNA, using data from a combinatorial DMS of an ancient steroid hormone receptor's capacity to activate transcription from two biologically relevant DNA elements. We show that the genetic architecture of DNA recognition consists of a dense set of main and pairwise effects that involve virtually every possible amino acid state in the protein-DNA interface, but higher-order epistasis plays only a tiny role. Pairwise interactions enlarge the set of functional sequences and are the primary determinants of specificity for different DNA elements. They also massively expand the number of opportunities for single-residue mutations to switch specificity from one DNA target to another. By bringing variants with different functions close together in sequence space, pairwise epistasis therefore facilitates rather than constrains the evolution of new functions.
Affiliation(s)
- Brian PH Metzger
- Department of Ecology and Evolution, University of Chicago, Chicago, United States
| | - Yeonwoo Park
- Program in Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, United States
| | - Tyler N Starr
- Department of Biochemistry and Molecular Biophysics, University of Chicago, Chicago, United States
| | - Joseph W Thornton
- Department of Ecology and Evolution, University of Chicago, Chicago, United States
- Department of Human Genetics, University of Chicago, Chicago, United States
| |
|
14
|
Fröhlich C, Bunzel HA, Buda K, Mulholland AJ, van der Kamp MW, Johnsen PJ, Leiros HKS, Tokuriki N. Epistasis arises from shifting the rate-limiting step during enzyme evolution of a β-lactamase. Nat Catal 2024; 7:499-509. [PMID: 38828429 PMCID: PMC11136654 DOI: 10.1038/s41929-024-01117-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Accepted: 01/25/2024] [Indexed: 06/05/2024]
Abstract
Epistasis, the non-additive effect of mutations, can provide combinatorial improvements to enzyme activity that substantially exceed the gains from individual mutations. Yet the molecular mechanisms of epistasis remain elusive, undermining our ability to predict pathogen evolution and engineer biocatalysts. Here we reveal how directed evolution of a β-lactamase yielded highly epistatic activity enhancements. Evolution selected four mutations that increase antibiotic resistance 40-fold, despite their marginal individual effects (≤2-fold). Synergistic improvements coincided with the introduction of super-stoichiometric burst kinetics, indicating that epistasis is rooted in the enzyme's conformational dynamics. Our analysis reveals that epistasis stemmed from distinct effects of each mutation on the catalytic cycle. The initial mutation increased protein flexibility and accelerated substrate binding, which is rate-limiting in the wild-type enzyme. Subsequent mutations predominantly boosted the chemical steps by fine-tuning substrate interactions. Our work identifies an overlooked cause for epistasis: changing the rate-limiting step can result in substantial synergy that boosts enzyme activity.
Affiliation(s)
| | - H. Adrian Bunzel
- Department of Biosystem Science and Engineering, ETH Zurich, Basel, Switzerland
- Centre for Computational Chemistry, School of Chemistry, University of Bristol, Bristol, UK
- School of Biochemistry, University of Bristol, Bristol, UK
| | - Karol Buda
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
| | - Adrian J. Mulholland
- Centre for Computational Chemistry, School of Chemistry, University of Bristol, Bristol, UK
| | - Marc W. van der Kamp
- Centre for Computational Chemistry, School of Chemistry, University of Bristol, Bristol, UK
- School of Biochemistry, University of Bristol, Bristol, UK
| | - Pål J. Johnsen
- Department of Pharmacy, UiT The Arctic University of Norway, Tromsø, Norway
| | | | - Nobuhiko Tokuriki
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
| |
|
15
|
Sesta L, Pagnani A, Fernandez-de-Cossio-Diaz J, Uguzzoni G. Inference of annealed protein fitness landscapes with AnnealDCA. PLoS Comput Biol 2024; 20:e1011812. [PMID: 38377054 PMCID: PMC10878520 DOI: 10.1371/journal.pcbi.1011812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Accepted: 01/08/2024] [Indexed: 02/22/2024] Open
Abstract
The design of proteins with specific tasks is a major challenge in molecular biology with important diagnostic and therapeutic applications. High-throughput screening methods have been developed to systematically evaluate protein activity, but only a small fraction of possible protein variants can be tested using these techniques. Computational models that explore the sequence space in-silico to identify the fittest molecules for a given function are needed to overcome this limitation. In this article, we propose AnnealDCA, a machine-learning framework to learn the protein fitness landscape from sequencing data derived from a broad range of experiments that use selection and sequencing to quantify protein activity. We demonstrate the effectiveness of our method by applying it to antibody Rep-Seq data of immunized mice and screening experiments, assessing the quality of the fitness landscape reconstructions. Our method can be applied to several experimental cases where a population of protein variants undergoes various rounds of selection and sequencing, without relying on the computation of variants enrichment ratios, and thus can be used even in cases of disjoint sequence samples.
Affiliation(s)
- Luca Sesta
- Department of Applied Science and Technology, Politecnico di Torino, Torino, Italy
| | - Andrea Pagnani
- Department of Applied Science and Technology, Politecnico di Torino, Torino, Italy
- Italian Institute for Genomic Medicine, Torino, Italy
- INFN, Sezione di Torino, Torino, Italy
| | | | | |
|
16
|
Buda K, Miton CM, Tokuriki N. Pervasive epistasis exposes intramolecular networks in adaptive enzyme evolution. Nat Commun 2023; 14:8508. [PMID: 38129396 PMCID: PMC10739712 DOI: 10.1038/s41467-023-44333-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/03/2023] [Accepted: 12/08/2023] [Indexed: 12/23/2023] Open
Abstract
Enzyme evolution is characterized by constant alterations of the intramolecular residue networks supporting their functions. The rewiring of these network interactions can give rise to epistasis. As mutations accumulate, the epistasis observed across diverse genotypes may appear idiosyncratic, that is, exhibit unique effects in different genetic backgrounds. Here, we unveil a quantitative picture of the prevalence and patterns of epistasis in enzyme evolution by analyzing 41 fitness landscapes generated from seven enzymes. We show that >94% of all mutational and epistatic effects appear highly idiosyncratic, which greatly distorted the functional prediction of the evolved enzymes. By examining seemingly idiosyncratic changes in epistasis along adaptive trajectories, we expose several instances of higher-order, intramolecular rewiring. Using complementary structural data, we outline putative molecular mechanisms explaining higher-order epistasis along two enzyme trajectories. Our work emphasizes the prevalence of epistasis and provides an approach to exploring this phenomenon through a molecular lens.
Affiliation(s)
- Karol Buda
- Michael Smith Laboratories, University of British Columbia, Vancouver, Canada
| | - Charlotte M Miton
- Michael Smith Laboratories, University of British Columbia, Vancouver, Canada
| | - Nobuhiko Tokuriki
- Michael Smith Laboratories, University of British Columbia, Vancouver, Canada
| |
|
17
|
Kouba P, Kohout P, Haddadi F, Bushuiev A, Samusevich R, Sedlar J, Damborsky J, Pluskal T, Sivic J, Mazurenko S. Machine Learning-Guided Protein Engineering. ACS Catal 2023; 13:13863-13895. [PMID: 37942269 PMCID: PMC10629210 DOI: 10.1021/acscatal.3c02743] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 06/15/2023] [Revised: 09/20/2023] [Indexed: 11/10/2023]
Abstract
Recent progress in engineering highly promising biocatalysts has increasingly involved machine learning methods. These methods leverage existing experimental and simulation data to aid in the discovery and annotation of promising enzymes, as well as in suggesting beneficial mutations for improving known targets. The field of machine learning for protein engineering is gathering steam, driven by recent success stories and notable progress in other areas. It already encompasses ambitious tasks such as understanding and predicting protein structure and function, catalytic efficiency, enantioselectivity, protein dynamics, stability, solubility, aggregation, and more. Nonetheless, the field is still evolving, with many challenges to overcome and questions to address. In this Perspective, we provide an overview of ongoing trends in this domain, highlight recent case studies, and examine the current limitations of machine learning-based methods. We emphasize the crucial importance of thorough experimental validation of emerging models before their use for rational protein design. We present our opinions on the fundamental problems and outline the potential directions for future research.
Affiliation(s)
- Petr Kouba
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
- Faculty of Electrical Engineering, Czech Technical University in Prague, Technicka 2, 166 27 Prague 6, Czech Republic
| | - Pavel Kohout
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
| | - Faraneh Haddadi
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
| | - Anton Bushuiev
- Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
| | - Raman Samusevich
- Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo nám. 2, 160 00 Prague 6, Czech Republic
| | - Jiri Sedlar
- Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
| | - Jiri Damborsky
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
| | - Tomas Pluskal
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo nám. 2, 160 00 Prague 6, Czech Republic
| | - Josef Sivic
- Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
| | - Stanislav Mazurenko
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
| |
|
18
|
Yu H, Zhang X, Acevedo-Rocha CG, Li A, Reetz MT. Protein engineering using mutability landscapes: Controlling site-selectivity of P450-catalyzed steroid hydroxylation. Methods Enzymol 2023; 693:191-229. [PMID: 37977731 DOI: 10.1016/bs.mie.2023.09.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2023]
Abstract
Directed evolution and rational design have been used widely in engineering enzymes for their application in synthetic organic chemistry and biotechnology. With stereoselectivity playing a crucial role in catalysis for the synthesis of valuable chemical and pharmaceutical compounds, rational design has not achieved such wide success in this specific area compared to directed evolution. Nevertheless, one bottleneck of directed evolution is the laborious screening efforts and the observed trade-offs in catalytic profiles. This has motivated researchers to develop more efficient protein engineering methods. As a prime approach, mutability landscaping avoids such trade-offs by providing more information of sequence-function relationships. Here, we describe an application of this efficient protein engineering method to improve the regio-/stereoselectivity and activity of P450BM3 for steroid hydroxylation, while keeping the mutagenesis libraries small so that they will require only minimal screening.
Affiliation(s)
- Huili Yu
- State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Key Laboratory of Industrial Biotechnology, School of Life Science, Hubei University, Wuhan, P.R. China
| | - Xiaodong Zhang
- State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Key Laboratory of Industrial Biotechnology, School of Life Science, Hubei University, Wuhan, P.R. China
| | - Carlos G Acevedo-Rocha
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, Denmark
| | - Aitao Li
- State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Key Laboratory of Industrial Biotechnology, School of Life Science, Hubei University, Wuhan, P.R. China
| | - Manfred T Reetz
- Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, Muelheim, Germany; Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, 32 West 7th Avenue, Tianjin, P.R. China
| |
|
19
|
Gonçalves C, Harrison MC, Steenwyk JL, Opulente DA, LaBella AL, Wolters JF, Zhou X, Shen XX, Groenewald M, Hittinger CT, Rokas A. Diverse signatures of convergent evolution in cacti-associated yeasts. bioRxiv 2023:2023.09.14.557833. [PMID: 37745407 PMCID: PMC10515907 DOI: 10.1101/2023.09.14.557833] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 09/26/2023]
Abstract
Many distantly related organisms have convergently evolved traits and lifestyles that enable them to live in similar ecological environments. However, the extent of phenotypic convergence evolving through the same or distinct genetic trajectories remains an open question. Here, we leverage a comprehensive dataset of genomic and phenotypic data from 1,049 yeast species in the subphylum Saccharomycotina (Kingdom Fungi, Phylum Ascomycota) to explore signatures of convergent evolution in cactophilic yeasts, ecological specialists associated with cacti. We inferred that the ecological association of yeasts with cacti arose independently ~17 times. Using machine learning, we further found that cactophily can be predicted with 76% accuracy from functional genomic and phenotypic data. The most informative feature for predicting cactophily was thermotolerance, which is likely associated with duplication and altered evolutionary rates of genes impacting the cell envelope in several cactophilic lineages. We also identified horizontal gene transfer and duplication events of plant cell wall-degrading enzymes in distantly related cactophilic clades, suggesting that putatively adaptive traits evolved through disparate molecular mechanisms. Remarkably, multiple cactophilic lineages and their close relatives are emerging human opportunistic pathogens, suggesting that the cactophilic lifestyle (and perhaps more generally lifestyles favoring thermotolerance) may preadapt yeasts to cause human disease. This work underscores the potential of a multifaceted approach involving high-throughput genomic and phenotypic data to shed light on ecological adaptation and highlights how convergent evolution to wild environments could facilitate the transition to human pathogenicity.
Affiliation(s)
- Carla Gonçalves
- Vanderbilt University, Department of Biological Sciences, VU Station B #35-1634, Nashville, TN 37235, United States of America
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
- Present address: Associate Laboratory i4HB—Institute for Health and Bioeconomy and UCIBIO—Applied Molecular Biosciences Unit, Department of Life Sciences, NOVA School of Science and Technology, Universidade NOVA de Lisboa, Caparica, Portugal
- Present address: UCIBIO-i4HB, Departamento de Ciências da Vida, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, Caparica, Portugal
| | - Marie-Claire Harrison
- Vanderbilt University, Department of Biological Sciences, VU Station B #35-1634, Nashville, TN 37235, United States of America
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
| | - Jacob L. Steenwyk
- Vanderbilt University, Department of Biological Sciences, VU Station B #35-1634, Nashville, TN 37235, United States of America
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
- Howard Hughes Medical Institute and the Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
| | - Dana A. Opulente
- Laboratory of Genetics, DOE Great Lakes Bioenergy Research Center, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI 53726, USA
- Biology Department, Villanova University, Villanova, PA 19085, USA
| | - Abigail L. LaBella
- Vanderbilt University, Department of Biological Sciences, VU Station B #35-1634, Nashville, TN 37235, United States of America
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223
| | - John F. Wolters
- Laboratory of Genetics, DOE Great Lakes Bioenergy Research Center, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI 53726, USA
| | - Xiaofan Zhou
- Vanderbilt University, Department of Biological Sciences, VU Station B #35-1634, Nashville, TN 37235, United States of America
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
- Guangdong Province Key Laboratory of Microbial Signals and Disease Control, Integrative Microbiology Research Center, South China Agricultural University, Guangzhou 510642, China
| | - Xing-Xing Shen
- Vanderbilt University, Department of Biological Sciences, VU Station B #35-1634, Nashville, TN 37235, United States of America
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
- College of Agriculture and Biotechnology and Centre for Evolutionary & Organismal Biology, Zhejiang University, Hangzhou 310058, China
| | | | - Chris Todd Hittinger
- Laboratory of Genetics, DOE Great Lakes Bioenergy Research Center, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI 53726, USA
| | - Antonis Rokas
- Vanderbilt University, Department of Biological Sciences, VU Station B #35-1634, Nashville, TN 37235, United States of America
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
| |
|
20
|
Charlebois DA. Quantitative systems-based prediction of antimicrobial resistance evolution. NPJ Syst Biol Appl 2023; 9:40. [PMID: 37679446 PMCID: PMC10485028 DOI: 10.1038/s41540-023-00304-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/13/2023] [Accepted: 08/24/2023] [Indexed: 09/09/2023] Open
Abstract
Predicting evolution is a fundamental problem in biology with practical implications for treating antimicrobial resistance, which is a complex system-level phenomenon. In this perspective article, we explore the limits of predicting antimicrobial resistance evolution, quantitatively define the predictability and repeatability of microevolutionary processes, and speculate on how these quantities vary across temporal, biological, and complexity scales. The opportunities and challenges for predicting antimicrobial resistance in the context of systems biology are also discussed. Based on recent research, we conclude that the evolution of antimicrobial resistance can be predicted using a systems biology approach integrating quantitative models with multiscale data from microbial evolution experiments.
Affiliation(s)
- Daniel A Charlebois
- Department of Physics, University of Alberta, Edmonton, AB, T6G-2E1, Canada
- Department of Biological Sciences, University of Alberta, Edmonton, AB, T6G-2E9, Canada
| |
|
21
|
Bauer J, Rajagopal N, Gupta P, Gupta P, Nixon AE, Kumar S. How can we discover developable antibody-based biotherapeutics? Front Mol Biosci 2023; 10:1221626. [PMID: 37609373 PMCID: PMC10441133 DOI: 10.3389/fmolb.2023.1221626] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/12/2023] [Accepted: 07/10/2023] [Indexed: 08/24/2023] Open
Abstract
Antibody-based biotherapeutics have emerged as a successful class of pharmaceuticals despite significant challenges and risks to their discovery and development. This review discusses the most frequently encountered hurdles in the research and development (R&D) of antibody-based biotherapeutics and proposes a conceptual framework called biopharmaceutical informatics. Our vision advocates for the syncretic use of computation and experimentation at every stage of biologic drug discovery, considering developability (manufacturability, safety, efficacy, and pharmacology) of potential drug candidates from the earliest stages of the drug discovery phase. The computational advances in recent years allow for more precise formulation of disease concepts, rapid identification, and validation of targets suitable for therapeutic intervention and discovery of potential biotherapeutics that can agonize or antagonize them. Furthermore, computational methods for de novo and epitope-specific antibody design are increasingly being developed, opening novel computationally driven opportunities for biologic drug discovery. Here, we review the opportunities and limitations of emerging computational approaches for optimizing antigens to generate robust immune responses, in silico generation of antibody sequences, discovery of potential antibody binders through virtual screening, assessment of hits, identification of lead drug candidates and their affinity maturation, and optimization for developability. The adoption of biopharmaceutical informatics across all aspects of drug discovery and development cycles should help bring affordable and effective biotherapeutics to patients more quickly.
Affiliation(s)
- Joschka Bauer
- Early Stage Pharmaceutical Development Biologicals, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach/Riss, Germany
- In Silico Team, Boehringer Ingelheim, Hannover, Germany
| | - Nandhini Rajagopal
- In Silico Team, Boehringer Ingelheim, Hannover, Germany
- Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT, United States
| | - Priyanka Gupta
- In Silico Team, Boehringer Ingelheim, Hannover, Germany
- Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT, United States
| | - Pankaj Gupta
- In Silico Team, Boehringer Ingelheim, Hannover, Germany
- Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT, United States
| | - Andrew E. Nixon
- Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT, United States
| | - Sandeep Kumar
- In Silico Team, Boehringer Ingelheim, Hannover, Germany
- Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT, United States
| |
|
22
|
Vila JA. Protein folding rate evolution upon mutations. Biophys Rev 2023; 15:661-669. [PMID: 37681091 PMCID: PMC10480377 DOI: 10.1007/s12551-023-01088-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2023] [Accepted: 06/24/2023] [Indexed: 09/09/2023] Open
Abstract
Despite the spectacular success of cutting-edge protein fold prediction methods, many critical questions remain unanswered, including why proteins can reach their native state in a biologically reasonable time. A satisfactory answer to this simple question could shed light on the slowest folding rate of proteins as well as how mutations (amino-acid substitutions and/or post-translational modifications) might affect it. Preliminary results indicate that (i) the validity of Anfinsen's dogma ensures that proteins reach their native state on a reasonable timescale regardless of their sequence or length, and (ii) it is feasible to determine the evolution of protein folding rates without accounting for epistasis effects or the mutational trajectories between the starting and target sequences. These results have direct implications for evolutionary biology because they lay the groundwork for a better understanding of why, and to what extent, mutations, a crucial element of evolution and a factor influencing it, affect protein evolvability. Furthermore, they may spur significant progress in our efforts to solve crucial structural biology problems, such as how a sequence encodes its folding.
Affiliation(s)
- Jorge A. Vila
- IMASL-CONICET, Universidad Nacional de San Luis, Ejército de Los Andes 950, 5700 San Luis, Argentina
| |
|
23
|
Daalman WKG, Sweep E, Laan L. A tractable physical model for the yeast polarity predicts epistasis and fitness. Philos Trans R Soc Lond B Biol Sci 2023; 378:20220044. [PMID: 37004720 PMCID: PMC10067261 DOI: 10.1098/rstb.2022.0044] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 04/04/2023] Open
Abstract
Accurate phenotype prediction based on genetic information has numerous societal applications, such as crop design or cellular factories. Epistasis, when biological components interact, complicates modelling phenotypes from genotypes. Here we show an approach to mitigate this complication for polarity establishment in budding yeast, where mechanistic information is abundant. We coarse-grain molecular interactions into a so-called mesotype, which we combine with gene expression noise into a physical cell cycle model. First, we show with computer simulations that the mesotype allows validation of the most current biochemical polarity models by quantitatively matching doubling times. Second, the mesotype elucidates epistasis emergence as exemplified by evaluating the predicted mutational effect of key polarity protein Bem1p when combined with known interactors or under different growth conditions. This example also illustrates how unlikely evolutionary trajectories can become more accessible. The tractability of our biophysically justifiable approach inspires a road-map towards bottom-up modelling complementary to statistical inferences. This article is part of the theme issue ‘Interdisciplinary approaches to predicting evolutionary biology’.
Affiliation(s)
| | - Els Sweep
- Department of Bionanoscience, TU Delft, 2629 HZ Delft, The Netherlands
| | - Liedewij Laan
- Department of Bionanoscience, TU Delft, 2629 HZ Delft, The Netherlands
| |
|
24
|
Weinstein JY, Martí-Gómez C, Lipsh-Sokolik R, Hoch SY, Liebermann D, Nevo R, Weissman H, Petrovich-Kopitman E, Margulies D, Ivankov D, McCandlish DM, Fleishman SJ. Designed active-site library reveals thousands of functional GFP variants. Nat Commun 2023; 14:2890. [PMID: 37210560 PMCID: PMC10199939 DOI: 10.1038/s41467-023-38099-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 11/02/2022] [Accepted: 04/13/2023] [Indexed: 05/22/2023] Open
Abstract
Mutations in a protein active site can lead to dramatic and useful changes in protein activity. The active site, however, is sensitive to mutations due to a high density of molecular interactions, substantially reducing the likelihood of obtaining functional multipoint mutants. We introduce an atomistic and machine-learning-based approach, called high-throughput Functional Libraries (htFuncLib), that designs a sequence space in which mutations form low-energy combinations that mitigate the risk of incompatible interactions. We apply htFuncLib to the GFP chromophore-binding pocket, and, using fluorescence readout, recover >16,000 unique designs encoding as many as eight active-site mutations. Many designs exhibit substantial and useful diversity in functional thermostability (up to 96 °C), fluorescence lifetime, and quantum yield. By eliminating incompatible active-site mutations, htFuncLib generates a large diversity of functional sequences. We envision that htFuncLib will be used in one-shot optimization of activity in enzymes, binders, and other proteins.
Affiliation(s)
| | - Carlos Martí-Gómez
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Rosalie Lipsh-Sokolik
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, 7610001, Israel
| | - Shlomo Yakir Hoch
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, 7610001, Israel
| | - Demian Liebermann
- Department of Chemical and Biological Physics, Weizmann Institute of Science, Rehovot, 7610001, Israel
| | - Reinat Nevo
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, 7610001, Israel
| | - Haim Weissman
- Department of Molecular Chemistry and Materials Science, Weizmann Institute of Science, Rehovot, 7610001, Israel
| | | | - David Margulies
- Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot, 7610001, Israel
| | - Dmitry Ivankov
- Center of Life Sciences, Skolkovo Institute of Science and Technology, Moscow, Russia
| | - David M McCandlish
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Sarel J Fleishman
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, 7610001, Israel.
| |
|
25
|
Chen Y, Hu R, Li K, Zhang Y, Fu L, Zhang J, Si T. Deep Mutational Scanning of an Oxygen-Independent Fluorescent Protein CreiLOV for Comprehensive Profiling of Mutational and Epistatic Effects. ACS Synth Biol 2023; 12:1461-1473. [PMID: 37066862 PMCID: PMC10204710 DOI: 10.1021/acssynbio.2c00662] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 12/08/2022] [Indexed: 04/18/2023]
Abstract
Oxygen-independent, flavin mononucleotide-based fluorescent proteins (FbFPs) are promising alternatives to green fluorescent protein in anaerobic contexts. Deep mutational scanning performs systematic profiling of protein sequence-function relationships but has not been applied to FbFPs. Focusing on CreiLOV from Chlamydomonas reinhardtii, we created and analyzed two comprehensive mutant collections: (1) single-residue, site-saturation mutagenesis libraries covering all 118 residues; and (2) a full combinatorial mutagenesis library among 20 mutations at 15 residues, where mutation and residue selection was based on single-site mutagenesis results. Notably, the second type of library is indispensable to study higher-order epistasis but underrepresented in the literature. Using optimized FACS-seq assays, 2,185 (>92.5%) out of 2,360 possible single-site mutants and 165,428 (>89.7%) out of 184,320 possible combinatorial mutants were reliably assigned fitness values. We constructed statistical and machine-learning models to analyze the CreiLOV data set, enabling accurate fitness prediction of higher-order mutants using lower-order mutagenesis data. In addition, we successfully isolated CreiLOV variants with improved fluorescence quantum yield and thermostability. This work provides new empirical data and design rules to engineer combinatorial protein variants.
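The library sizes and coverage figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only. Reading 2,360 as 118 residues times 20 variants per residue is an assumption, and so is the split of the 20 mutations over 15 residues (12 sites with one mutation, 2 sites with two, 1 site with four); both are inferred from the totals and are not stated in the abstract.

```python
from math import prod

# Single-site library: assumed to be 118 residues x 20 variants per residue.
single_possible = 118 * 20
single_coverage = 2185 / single_possible

# Combinatorial library: a full combinatorial design over sites that each carry
# the wild-type residue plus k mutations has prod(k + 1) members.
# The split below is an inferred assumption, chosen to match the stated total.
mutations_per_site = [1] * 12 + [2] * 2 + [4]
n_sites = len(mutations_per_site)          # 15 residues
n_mutations = sum(mutations_per_site)      # 20 mutations
combinatorial_possible = prod(k + 1 for k in mutations_per_site)
combinatorial_coverage = 165428 / combinatorial_possible

print(single_possible, f"{single_coverage:.3%}")
print(combinatorial_possible, f"{combinatorial_coverage:.3%}")
```

If every site carries the wild-type residue plus its listed mutations, this appears to be the only split of 15 sites that multiplies out to 184,320, because that number factors as 2^12 × 3^2 × 5.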
Affiliation(s)
- Yongcan Chen
- CAS Key Laboratory for Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Ruyun Hu
- CAS Key Laboratory for Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Keyi Li
- CAS Key Laboratory for Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Yating Zhang
- CAS Key Laboratory for Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Lihao Fu
- CAS Key Laboratory for Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Jianzhi Zhang
- CAS Key Laboratory for Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Tong Si
- CAS Key Laboratory for Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- BGI-Shenzhen, Shenzhen 518083, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| |
|
26
|
Gantz M, Neun S, Medcalf EJ, van Vliet LD, Hollfelder F. Ultrahigh-Throughput Enzyme Engineering and Discovery in In Vitro Compartments. Chem Rev 2023; 123:5571-5611. [PMID: 37126602 PMCID: PMC10176489 DOI: 10.1021/acs.chemrev.2c00910] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 12/30/2022] [Indexed: 05/03/2023]
Abstract
Novel and improved biocatalysts are increasingly sourced from libraries via experimental screening. The success of such campaigns is crucially dependent on the number of candidates tested. Water-in-oil emulsion droplets can replace the classical test tube, to provide in vitro compartments as an alternative screening format, containing genotype and phenotype and enabling a readout of function. The scale-down to micrometer droplet diameters and picoliter volumes brings about a >10^7-fold volume reduction compared to 96-well-plate screening. Droplets made in automated microfluidic devices can be integrated into modular workflows to set up multistep screening protocols involving various detection modes to sort >10^7 variants a day with kHz frequencies. The repertoire of assays available for droplet screening covers all seven enzyme commission (EC) number classes, setting the stage for widespread use of droplet microfluidics in everyday biochemical experiments. We review the practicalities of adapting droplet screening for enzyme discovery and for detailed kinetic characterization. These new ways of working will not just accelerate discovery experiments currently limited by screening capacity but profoundly change the paradigms we can probe. By interfacing the results of ultrahigh-throughput droplet screening with next-generation sequencing and deep learning, strategies for directed evolution can be implemented, examined, and evaluated.
Affiliation(s)
| | | | | | | | - Florian Hollfelder
- Department of Biochemistry, University of Cambridge, 80 Tennis Court Rd, Cambridge CB2 1GA, U.K.
| |
|
27
|
Wonderlick DR, Widom JR, Harms MJ. Disentangling contact and ensemble epistasis in a riboswitch. Biophys J 2023; 122:1600-1612. [PMID: 36710492 PMCID: PMC10183321 DOI: 10.1016/j.bpj.2023.01.033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Revised: 01/09/2023] [Accepted: 01/24/2023] [Indexed: 01/29/2023] Open
Abstract
Mutations introduced into macromolecules often exhibit epistasis, where the effect of one mutation alters the effect of another. Knowing the mechanisms that lead to epistasis is important for understanding how macromolecules work and evolve, as well as for effective macromolecular engineering. Here, we investigate the interplay between "contact epistasis" (epistasis arising from physical interactions between mutated residues) and "ensemble epistasis" (epistasis that occurs when a mutation redistributes the conformational ensemble of a macromolecule, thus changing the effect of the second mutation). We argue that the two mechanisms can be distinguished in allosteric macromolecules by measuring epistasis at differing allosteric effector concentrations. Contact epistasis manifests as nonadditivity in the microscopic equilibrium constants describing the conformational ensemble. This epistatic effect is independent of allosteric effector concentration. Ensemble epistasis manifests as nonadditivity in thermodynamic observables (such as ligand binding) that are determined by the distribution of ensemble conformations. This epistatic effect strongly depends on allosteric effector concentration. Using this framework, we experimentally investigated the origins of epistasis in three pairwise mutant cycles introduced into the adenine riboswitch aptamer domain by measuring ligand binding as a function of allosteric effector concentration. We found evidence for both contact and ensemble epistasis in all cycles. Furthermore, we found that the two mechanisms of epistasis could interact with each other. For example, in one mutant cycle we observed 6 kcal/mol of contact epistasis in a microscopic equilibrium constant. In that same cycle, the maximum epistasis in ligand binding was only 1.5 kcal/mol: shifts in the ensemble masked the contribution of contact epistasis. Finally, our work yields simple heuristics for identifying contact and ensemble epistasis based on measurements of a biochemical observable as a function of allosteric effector concentration.
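The distinction drawn in this abstract can be illustrated with a toy calculation. The sketch below uses the standard double-mutant-cycle definition of pairwise epistasis and a hypothetical two-conformation ensemble in which each mutation is perfectly additive within every conformation. All energies are made-up values in kcal/mol, and the function names are illustrative; this is not the authors' model of the adenine riboswitch.

```python
import math

R_KCAL = 0.0019872   # gas constant, kcal/(mol*K)
RT = R_KCAL * 298.15  # thermal energy at about 25 degrees C

def pairwise_epistasis(wt, a, b, ab):
    """Double-mutant cycle: deviation of the double mutant from additivity."""
    return (ab - wt) - ((a - wt) + (b - wt))

def ensemble_free_energy(conformer_energies, rt=RT):
    """Free energy of an ensemble from the Boltzmann-weighted sum over conformations."""
    z = sum(math.exp(-g / rt) for g in conformer_energies)
    return -rt * math.log(z)

def mutate(base, *effects):
    """Add per-conformation mutational effects to the base energies."""
    return [g + sum(e[i] for e in effects) for i, g in enumerate(base)]

# Hypothetical ensemble: two conformations, two mutations that each
# destabilise conformation 0 by 2 kcal/mol and leave conformation 1 alone.
wt = [0.0, 2.0]
effect_a = [2.0, 0.0]
effect_b = [2.0, 0.0]

mut_a = mutate(wt, effect_a)
mut_b = mutate(wt, effect_b)
mut_ab = mutate(wt, effect_a, effect_b)

# Within a single conformation the effects add up exactly (no contact epistasis).
eps_conformer0 = pairwise_epistasis(wt[0], mut_a[0], mut_b[0], mut_ab[0])

# The ensemble-level observable is nonadditive because the populations shift.
eps_observable = pairwise_epistasis(
    ensemble_free_energy(wt),
    ensemble_free_energy(mut_a),
    ensemble_free_energy(mut_b),
    ensemble_free_energy(mut_ab),
)

print(eps_conformer0, round(eps_observable, 3))
```

In this toy case the nonadditivity comes entirely from the redistribution of the two conformations, which is the sense in which the abstract uses the term ensemble epistasis.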
Affiliation(s)
- Daria R Wonderlick
- Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon
| | - Julia R Widom
- Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon; Institute for Molecular Biology, University of Oregon, Eugene, Oregon; Oregon Center for Optical, Molecular, & Quantum Science, University of Oregon, Eugene, Oregon
| | - Michael J Harms
- Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon; Institute for Molecular Biology, University of Oregon, Eugene, Oregon.
| |
|
28
|
Chiang CH, Wymore T, Rodríguez Benítez A, Hussain A, Smith JL, Brooks CL, Narayan ARH. Deciphering the evolution of flavin-dependent monooxygenase stereoselectivity using ancestral sequence reconstruction. Proc Natl Acad Sci U S A 2023; 120:e2218248120. [PMID: 37014851 PMCID: PMC10104550 DOI: 10.1073/pnas.2218248120] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/25/2022] [Accepted: 03/06/2023] [Indexed: 04/05/2023] Open
Abstract
Controlling the selectivity of a reaction is critical for target-oriented synthesis. Accessing complementary selectivity profiles enables divergent synthetic strategies, but is challenging to achieve in biocatalytic reactions given enzymes' innate preference for a single selectivity. Thus, it is critical to understand the structural features that control selectivity in biocatalytic reactions to achieve tunable selectivity. Here, we investigate the structural features that control the stereoselectivity in an oxidative dearomatization reaction that is key to making azaphilone natural products. Crystal structures of enantiocomplementary biocatalysts guided the development of multiple hypotheses centered on the structural features that control the stereochemical outcome of the reaction; however, in many cases, direct substitutions of active site residues in natural proteins led to inactive enzymes. Ancestral sequence reconstruction (ASR) and resurrection were employed as an alternative strategy to probe the impact of each residue on the stereochemical outcome of the dearomatization reaction. These studies suggest that two mechanisms are active in controlling the stereochemical outcome of the oxidative dearomatization reaction: one involving multiple active site residues in AzaH and the other dominated by a single Phe to Tyr switch in TropB and AfoD. Moreover, this study suggests that the flavin-dependent monooxygenases (FDMOs) adopt simple and flexible strategies to control stereoselectivity, which has led to stereocomplementary azaphilone natural products produced by fungi. This paradigm of combining ASR and resurrection with mutational and computational studies showcases sets of tools for understanding enzyme mechanisms and provides a solid foundation for future protein engineering efforts.
Affiliation(s)
- Chang-Hwa Chiang
- Department of Chemistry, University of Michigan, Ann Arbor, MI 48109
- Life Sciences Institute, University of Michigan, Ann Arbor, MI 48109
| | - Troy Wymore
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794
- Department of Chemistry, Stony Brook University, Stony Brook, NY 11794
| | - Attabey Rodríguez Benítez
- Life Sciences Institute, University of Michigan, Ann Arbor, MI 48109
- Program in Chemical Biology, University of Michigan, Ann Arbor, MI 48109
| | - Azam Hussain
- Macromolecular Science and Engineering Program, University of Michigan, Ann Arbor, MI 48109
| | - Janet L. Smith
- Life Sciences Institute, University of Michigan, Ann Arbor, MI 48109
- Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109
| | - Charles L. Brooks
- Department of Chemistry, University of Michigan, Ann Arbor, MI 48109
- Program in Chemical Biology, University of Michigan, Ann Arbor, MI 48109
- Department of Biophysics, University of Michigan, Ann Arbor, MI 48109
| | - Alison R. H. Narayan
- Department of Chemistry, University of Michigan, Ann Arbor, MI 48109
- Life Sciences Institute, University of Michigan, Ann Arbor, MI 48109
- Program in Chemical Biology, University of Michigan, Ann Arbor, MI 48109
| |
|
29
|
Alejaldre L, Lemay-St-Denis C, Pelletier JN, Quaglia D. Tuning Selectivity in CalA Lipase: Beyond Tunnel Engineering. Biochemistry 2023; 62:396-409. [PMID: 36580299 PMCID: PMC9851156 DOI: 10.1021/acs.biochem.2c00513] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/01/2022] [Revised: 12/15/2022] [Indexed: 12/30/2022]
Abstract
Engineering studies of Candida (Pseudozyma) antarctica lipase A (CalA) have demonstrated the potential of this enzyme in the selective hydrolysis of fatty acid esters of different chain lengths. CalA has been shown to bind substrates preferentially through an acyl-chain binding tunnel accessed via the hydrolytic active site; it has also been shown that selectivity for substrates of longer or shorter chain length can be tuned, for instance by modulating steric hindrance within the tunnel. Here we demonstrate that, whereas the tunnel region is certainly of paramount importance for substrate recognition, residues in distal regions of the enzyme can also modulate substrate selectivity. To this end, we investigate variants that carry one or more substitutions within the substrate tunnel as well as in distal regions. Combining experimental determination of the substrate selectivity using natural and synthetic substrates with computational characterization of protein dynamics and of tunnels, we deconvolute the effect of key substitutions and demonstrate that epistatic interactions contribute to procuring selectivity toward either long-chain or short/medium-chain fatty acid esters. We demonstrate that various mechanisms contribute to the diverse selectivity profiles, ranging from reshaping tunnel morphology and tunnel stabilization to obstructing the main substrate-binding tunnel, highlighting the dynamic nature of the substrate-binding region. This work provides important insights into the versatility of this robust lipase toward diverse applications.
Affiliation(s)
- Lorea Alejaldre
- PROTEO, The Québec Network for Research on Protein, Function, Engineering and Applications, https://proteo.ca/en/
- CGCC, Center in Green Chemistry and Catalysis, Montréal, QC, Canada G1V 0A6
- Department of Biochemistry and Molecular Medicine, Université de Montréal, Montréal, QC, Canada H3T 1J4
| | - Claudèle Lemay-St-Denis
- PROTEO, The Québec Network for Research on Protein, Function, Engineering and Applications, https://proteo.ca/en/
- CGCC, Center in Green Chemistry and Catalysis, Montréal, QC, Canada G1V 0A6
- Department of Biochemistry and Molecular Medicine, Université de Montréal, Montréal, QC, Canada H3T 1J4
| | - Joelle N. Pelletier
- PROTEO, The Québec Network for Research on Protein, Function, Engineering and Applications, https://proteo.ca/en/
- CGCC, Center in Green Chemistry and Catalysis, Montréal, QC, Canada G1V 0A6
- Department of Biochemistry and Molecular Medicine, Université de Montréal, Montréal, QC, Canada H3T 1J4
- Department of Chemistry, Université de Montréal, Montréal, QC, Canada H2V 0B3
| | - Daniela Quaglia
- PROTEO, The Québec Network for Research on Protein, Function, Engineering and Applications, https://proteo.ca/en/
- CGCC, Center in Green Chemistry and Catalysis, Montréal, QC, Canada G1V 0A6
- Department of Chemistry, Université de Montréal, Montréal, QC, Canada H2V 0B3
- Department of Chemistry, Carleton University, Ottawa, ON, Canada K1S 5B6
| |
|
30
|
Phillips AM, Maurer DP, Brooks C, Dupic T, Schmidt AG, Desai MM. Hierarchical sequence-affinity landscapes shape the evolution of breadth in an anti-influenza receptor binding site antibody. eLife 2023; 12:83628. [PMID: 36625542 PMCID: PMC9995116 DOI: 10.7554/elife.83628] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 09/21/2022] [Accepted: 01/09/2023] [Indexed: 01/11/2023] Open
Abstract
Broadly neutralizing antibodies (bnAbs) that neutralize diverse variants of a particular virus are of considerable therapeutic interest. Recent advances have enabled us to isolate and engineer these antibodies as therapeutics, but eliciting them through vaccination remains challenging, in part due to our limited understanding of how antibodies evolve breadth. Here, we analyze the landscape by which an anti-influenza receptor binding site (RBS) bnAb, CH65, evolved broad affinity to diverse H1 influenza strains. We do this by generating an antibody library of all possible evolutionary intermediates between the unmutated common ancestor (UCA) and the affinity-matured CH65 antibody and measuring the affinity of each intermediate to three distinct H1 antigens. We find that affinity to each antigen requires a specific set of mutations, distributed across the variable light and heavy chains, that interact non-additively (i.e., epistatically). These sets of mutations form a hierarchical pattern across the antigens, with increasingly divergent antigens requiring additional epistatic mutations beyond those required to bind less divergent antigens. We investigate the underlying biochemical and structural basis for these hierarchical sets of epistatic mutations and find that epistasis between heavy chain mutations and a mutation in the light chain at the VH-VL interface is essential for binding a divergent H1. Collectively, this is the first work to comprehensively characterize epistasis between heavy and light chain mutations and shows that such interactions are both strong and widespread. Together with our previous study analyzing a different class of anti-influenza antibodies, our results implicate epistasis as a general feature of antibody sequence-affinity landscapes that can potentiate and constrain the evolution of breadth.
Affiliation(s)
- Angela M Phillips
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, United States
- Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, United States
| | - Daniel P Maurer
- Ragon Institute of MGH, MIT, and Harvard, Cambridge, United States
- Department of Microbiology, Harvard Medical School, Boston, United States
| | - Caelan Brooks
- Department of Physics, Harvard University, Cambridge, United States
| | - Thomas Dupic
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, United States
| | - Aaron G Schmidt
- Ragon Institute of MGH, MIT, and Harvard, Cambridge, United States
- Department of Microbiology, Harvard Medical School, Boston, United States
| | - Michael M Desai
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, United States
- Department of Physics, Harvard University, Cambridge, United States
- NSF-Simons Center for Mathematical and Statistical Analysis of Biology, Harvard University, Cambridge, United States
- Quantitative Biology Initiative, Harvard University, Cambridge, United States
| |
|
31
|
Pang C, Liu S, Zhang G, Zhou J, Du G, Li J. Improving the catalytic efficiency of Pseudomonas aeruginosa lipoxygenase by semi-rational design. Enzyme Microb Technol 2023; 162:110120. [DOI: 10.1016/j.enzmictec.2022.110120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Revised: 08/30/2022] [Accepted: 09/01/2022] [Indexed: 10/14/2022]
|
32
|
Wortel MT, Agashe D, Bailey SF, Bank C, Bisschop K, Blankers T, Cairns J, Colizzi ES, Cusseddu D, Desai MM, van Dijk B, Egas M, Ellers J, Groot AT, Heckel DG, Johnson ML, Kraaijeveld K, Krug J, Laan L, Lässig M, Lind PA, Meijer J, Noble LM, Okasha S, Rainey PB, Rozen DE, Shitut S, Tans SJ, Tenaillon O, Teotónio H, de Visser JAGM, Visser ME, Vroomans RMA, Werner GDA, Wertheim B, Pennings PS. Towards evolutionary predictions: Current promises and challenges. Evol Appl 2023; 16:3-21. [PMID: 36699126 PMCID: PMC9850016 DOI: 10.1111/eva.13513] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 02/09/2022] [Revised: 11/11/2022] [Accepted: 11/14/2022] [Indexed: 12/14/2022] Open
Abstract
Evolution has traditionally been a historical and descriptive science, and predicting future evolutionary processes has long been considered impossible. However, evolutionary predictions are increasingly being developed and used in medicine, agriculture, biotechnology and conservation biology. Evolutionary predictions may be used for different purposes, such as to prepare for the future, to try and change the course of evolution or to determine how well we understand evolutionary processes. Similarly, the exact aspect of the evolved population that we want to predict may also differ. For example, we could try to predict which genotype will dominate, the fitness of the population or the extinction probability of a population. In addition, there are many uses of evolutionary predictions that may not always be recognized as such. The main goal of this review is to increase awareness of methods and data in different research fields by showing the breadth of situations in which evolutionary predictions are made. We describe how diverse evolutionary predictions share a common structure described by the predictive scope, time scale and precision. Then, by using examples ranging from SARS-CoV-2 and influenza to CRISPR-based gene drives and sustainable product formation in biotechnology, we discuss the methods for predicting evolution, the factors that affect predictability and how predictions can be used to prevent evolution in undesirable directions or to promote beneficial evolution (i.e. evolutionary control). We hope that this review will stimulate collaboration between fields by establishing a common language for evolutionary predictions.
Affiliation(s)
- Meike T. Wortel
- Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, The Netherlands
| | - Deepa Agashe
- National Centre for Biological Sciences, Bangalore, India
| | | | - Claudia Bank
- Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Gulbenkian Science Institute, Oeiras, Portugal
| | - Karen Bisschop
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, The Netherlands
- Origins Center, Groningen, The Netherlands
- Laboratory of Aquatic Biology, KU Leuven Kulak, Kortrijk, Belgium
| | - Thomas Blankers
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, The Netherlands
- Origins Center, Groningen, The Netherlands
| | | | - Enrico Sandro Colizzi
- Origins Center, Groningen, The Netherlands
- Mathematical Institute, Leiden University, Leiden, The Netherlands
| | | | | | - Bram van Dijk
- Max Planck Institute for Evolutionary Biology, Plön, Germany
| | - Martijn Egas
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, The Netherlands
| | - Jacintha Ellers
- Department of Ecological Science, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
| | - Astrid T. Groot
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, The Netherlands
| | | | | | - Ken Kraaijeveld
- Leiden Centre for Applied Bioscience, University of Applied Sciences Leiden, Leiden, The Netherlands
| | - Joachim Krug
- Institute for Biological Physics, University of Cologne, Cologne, Germany
| | - Liedewij Laan
- Department of Bionanoscience, Kavli Institute of Nanoscience, TU Delft, Delft, The Netherlands
| | - Michael Lässig
- Institute for Biological Physics, University of Cologne, Cologne, Germany
| | - Peter A. Lind
- Department Molecular Biology, Umeå University, Umeå, Sweden
| | - Jeroen Meijer
- Theoretical Biology and Bioinformatics, Department of Biology, Utrecht University, Utrecht, The Netherlands
| | - Luke M. Noble
- Institute de Biologie, École Normale Supérieure, CNRS, Inserm, Paris, France
| | | | - Paul B. Rainey
- Department of Microbial Population Biology, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Laboratoire Biophysique et Évolution, CBI, ESPCI Paris, Université PSL, CNRS, Paris, France
| | - Daniel E. Rozen
- Institute of Biology, Leiden University, Leiden, The Netherlands
| | - Shraddha Shitut
- Origins Center, Groningen, The Netherlands
- Institute of Biology, Leiden University, Leiden, The Netherlands
| | | | | | | | | | - Marcel E. Visser
- Department of Animal Ecology, Netherlands Institute of Ecology (NIOO‐KNAW), Wageningen, The Netherlands
| | - Renske M. A. Vroomans
- Origins Center, Groningen, The Netherlands
- Informatics Institute, University of Amsterdam, Amsterdam, The Netherlands
| | | | - Bregje Wertheim
- Groningen Institute for Evolutionary Life Sciences, University of Groningen, Groningen, The Netherlands
| | | |
|
33
|
Rodríguez-Escribano D, Pliego-Magán R, de Salas F, Aza P, Gentili P, Ihalainen P, Levée T, Meyer V, Petit-Conil M, Tapin-Lingua S, Lecourt M, Camarero S. Tailor-made alkaliphilic and thermostable fungal laccases for industrial wood processing. Biotechnology for Biofuels and Bioproducts 2022; 15:149. [PMID: 36581887 PMCID: PMC9798632 DOI: 10.1186/s13068-022-02247-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/14/2022] [Accepted: 12/14/2022] [Indexed: 06/17/2023]
Abstract
BACKGROUND: During the kraft process to obtain cellulosic pulp from wood, most of the lignin is removed by high-temperature alkaline cooking, released in the black liquors and usually incinerated for energy. However, kraft lignins are a valuable source of phenolic compounds that can be valorized in new bio-based products. The aim of this work is to develop laccases capable of working under the extreme conditions of high temperature and pH, typical of the industrial conversion of wood into kraft pulp and fibreboard, in order to provide extremophilic biocatalysts for depolymerising kraft lignin, and enzyme-assisted technologies for kraft pulp and fibreboard production. RESULTS: Through systematic enzyme engineering, combining enzyme-directed evolution and rational design, we changed the optimal pH of the laccase for oxidation of lignin phenols from acidic to basic, enhanced the catalytic activity at alkaline pH and increased the thermal tolerance of the enzyme by accumulating up to eight mutations in the protein sequence. The extremophilic laccase variants show maximum activity at 70 °C and oxidize kraft lignin at pH 10. Their integration into industrial-type processes saves energy and chemicals. As a pre-bleaching stage, the enzymes promote kraft pulp bleachability and significantly reduce the need for chlorine dioxide compared to the industrial sequence. Their application to wood chips during fibreboard production facilitates the defibering stage, with less energy required. CONCLUSIONS: A set of new alkaliphilic and thermophilic fungal laccases has been developed to operate under the extreme conditions of high temperature and pH typical of industrial wood conversion processes. For the first time, basidiomycete laccases of high redox potential show activity on lignin-derived phenols and polymeric lignin at pH 10. Considering the extreme conditions of current industrial processes for kraft pulp and fibreboard production, the new tailor-made laccases constitute a step forward towards turning kraft pulp mills into biorefineries. Their use as biocatalysts in the wood conversion sector is expected to support the development of more environmentally sound and efficient processes, and more sustainable products.
Affiliation(s)
- Rocío Pliego-Magán
- Centro de Investigaciones Biológicas Margarita Salas, CSIC. Ramiro de Maeztu 9, 28040 Madrid, Spain
- Felipe de Salas
- Centro de Investigaciones Biológicas Margarita Salas, CSIC. Ramiro de Maeztu 9, 28040 Madrid, Spain
- Pablo Aza
- Centro de Investigaciones Biológicas Margarita Salas, CSIC. Ramiro de Maeztu 9, 28040 Madrid, Spain
- Patrizia Gentili
- Sapienza Università Di Roma, Piazzale Aldo Moro, 5, 00185 Rome, RM Italy
- Thomas Levée
- MetGen Oy, Rakentajantie 26, 20780 Kaarina, Finland
- Valérie Meyer
- Centre Technique du Papier (CTP), Domaine Universitaire, 38044 Grenoble Cedex 9, France
- Michel Petit-Conil
- Centre Technique du Papier (CTP), Domaine Universitaire, 38044 Grenoble Cedex 9, France
- Michael Lecourt
- FCBA Institut Technologique, 341 Rue de La Papeterie, 38610 Gières, France
- Susana Camarero
- Centro de Investigaciones Biológicas Margarita Salas, CSIC. Ramiro de Maeztu 9, 28040 Madrid, Spain
34
Sugiki S, Niide T, Toya Y, Shimizu H. Logistic Regression-Guided Identification of Cofactor Specificity-Contributing Residues in Enzyme with Sequence Datasets Partitioned by Catalytic Properties. ACS Synth Biol 2022; 11:3973-3985. [PMID: 36321539 PMCID: PMC9764414 DOI: 10.1021/acssynbio.2c00315] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/23/2022]
Abstract
Changing the substrate/cofactor specificity of an enzyme requires multiple mutations at spatially adjacent positions around the substrate pocket. However, this is challenging when solely based on crystal structure information because enzymes undergo dynamic conformational changes during the reaction process. Herein, we proposed a method for estimating the contribution of each amino acid residue to substrate specificity by deploying a phylogenetic analysis with logistic regression. Since this method can estimate the candidate amino acids for mutation by ranking, it is readable and can be used in protein engineering. We demonstrated our concept using redox cofactor conversion of the Escherichia coli malic enzyme as a model, which still lacks crystal structure elucidation. The use of logistic regression with amino acid sequences classified by cofactor specificity showed that the NADP+-dependent malic enzyme completely switched cofactor specificity to NAD+ dependence without the need for a practical screening step. The model showed that surrounding residues made a greater contribution to cofactor specificity than those in the interior of the substrate pocket. These residues might be difficult to identify from crystal structure observations. We show that a highly accurate and inferential machine learning model was obtained using amino acid sequences of structurally homologous and functionally distinct enzymes as input data.
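The residue-ranking idea in this abstract can be illustrated with a minimal sketch. It assumes a toy alignment, one-hot encoding of each (position, residue) pair, and a plain L2-regularized logistic regression fitted by gradient descent; the sequences, labels, hyperparameters, and function names are all hypothetical and are not the authors' data or code. Positions are ranked by the summed absolute weight of their residues.

```python
import math

# Hypothetical toy alignment (4 aligned positions).
# Label 1 = NADP+-dependent, 0 = NAD+-dependent (illustrative only).
SEQS = ["AKRS", "AKET", "GKRS", "AKES", "ADRS", "GDET", "ADRT", "GDES"]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]

def one_hot(seqs):
    """Turn each (position, residue) pair seen in the alignment into a 0/1 column."""
    features = sorted({(i, aa) for s in seqs for i, aa in enumerate(s)})
    index = {f: j for j, f in enumerate(features)}
    X = [[0.0] * len(features) for _ in seqs]
    for row, s in zip(X, seqs):
        for i, aa in enumerate(s):
            row[index[(i, aa)]] = 1.0
    return X, features

def fit_logistic(X, y, lr=0.5, l2=0.01, epochs=2000):
    """Batch gradient descent on the L2-regularized logistic loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            err = 1.0 / (1.0 + math.exp(-z)) - label
            grad_b += err
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def rank_positions(seqs, labels):
    """Rank alignment positions by the summed |weight| of their residues."""
    X, features = one_hot(seqs)
    w, _ = fit_logistic(X, labels)
    score = {}
    for (pos, _aa), wj in zip(features, w):
        score[pos] = score.get(pos, 0.0) + abs(wj)
    return sorted(score.items(), key=lambda kv: kv[1], reverse=True)

ranking = rank_positions(SEQS, LABELS)
for pos, s in ranking:
    print(f"position {pos}: contribution score {s:.3f}")
```

In this toy set only position 1 (K versus D) separates the two classes perfectly, so it comes out on top; on real alignments the ranking would depend on the encoding, regularization, and phylogenetic structure of the data.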
35
Wittmund M, Cadet F, Davari MD. Learning Epistasis and Residue Coevolution Patterns: Current Trends and Future Perspectives for Advancing Enzyme Engineering. ACS Catal 2022. [DOI: 10.1021/acscatal.2c01426] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Marcel Wittmund
- Department of Bioorganic Chemistry, Leibniz Institute of Plant Biochemistry, Weinberg 3, 06120 Halle, Germany
- Frederic Cadet
- Laboratory of Excellence LABEX GR, DSIMB, Inserm UMR S1134, University of Paris city & University of Reunion, Paris 75014, France
- Mehdi D. Davari
- Department of Bioorganic Chemistry, Leibniz Institute of Plant Biochemistry, Weinberg 3, 06120 Halle, Germany
36
Prevalence and mechanisms of evolutionary contingency in human influenza H3N2 neuraminidase. Nat Commun 2022; 13:6443. [PMID: 36307418 PMCID: PMC9616408 DOI: 10.1038/s41467-022-34060-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2022] [Accepted: 10/12/2022] [Indexed: 12/25/2022]
Abstract
Neuraminidase (NA) of human influenza H3N2 virus has evolved rapidly and been accumulating mutations for more than half a century. However, biophysical constraints that govern the evolutionary trajectories of NA remain largely elusive. Here, we show that among 70 natural mutations that are present in the NA of a recent human H3N2 strain, >10% are deleterious for an ancestral strain. By mapping the permissive mutations using combinatorial mutagenesis and next-generation sequencing, an extensive epistatic network is revealed. Biophysical and structural analyses further demonstrate that certain epistatic interactions can be explained by non-additive stability effects, which in turn modulate membrane trafficking and enzymatic activity of NA. Additionally, our results suggest that other biophysical mechanisms also contribute to epistasis in NA evolution. Overall, these findings not only provide mechanistic insights into the evolution of human influenza NA and elucidate its sequence-structure-function relationship, but also have important implications for the development of next-generation influenza vaccines.
37
Azbukina N, Zharikova A, Ramensky V. Intragenic compensation through the lens of deep mutational scanning. Biophys Rev 2022; 14:1161-1182. [PMID: 36345285 PMCID: PMC9636336 DOI: 10.1007/s12551-022-01005-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/14/2022] [Accepted: 09/26/2022] [Indexed: 12/20/2022]
Abstract
A significant fraction of mutations in proteins are deleterious and result in adverse consequences for protein function, stability, or interaction with other molecules. Intragenic compensation is a specific case of positive epistasis when a neutral missense mutation cancels the effect of a deleterious mutation in the same protein. Permissive compensatory mutations facilitate protein evolution, since without them all sequences would be extremely conserved. Understanding compensatory mechanisms is an important scientific challenge at the intersection of protein biophysics and evolution. In human genetics, intragenic compensatory interactions are important since they may result in variable penetrance of pathogenic mutations or fixation of pathogenic human alleles in orthologous proteins from related species. The latter phenomenon complicates computational and clinical inference of an allele's pathogenicity. Deep mutational scanning is a relatively new technique that enables experimental studies of functional effects of thousands of mutations in proteins. We review the important aspects of the field and discuss existing limitations of current datasets. We reviewed ten published DMS datasets with quantified functional effects of single and double mutations and described rates and patterns of intragenic compensation in eight of them.
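To make the definition of intragenic compensation concrete, here is a minimal sketch of how it could be read off paired single- and double-mutant scores. It assumes log-scale functional scores with wild type at 0 and an additive null model; the thresholds, function names, and example numbers are hypothetical and are not taken from the review or from any DMS dataset.

```python
def epistasis(f_a, f_b, f_ab, f_wt=0.0):
    """Deviation of the double mutant from the additive expectation (log scale)."""
    return (f_ab - f_wt) - ((f_a - f_wt) + (f_b - f_wt))

def is_compensatory(f_a, f_b, f_ab, f_wt=0.0, deleterious=-0.5, neutral=0.2):
    """True when A is deleterious alone, B is near-neutral alone,
    and the double mutant is back near wild type (positive epistasis).
    The cut-offs are illustrative assumptions, not established standards."""
    a_is_deleterious = (f_a - f_wt) <= deleterious
    b_is_neutral = abs(f_b - f_wt) <= neutral
    double_is_rescued = abs(f_ab - f_wt) <= neutral
    return (a_is_deleterious and b_is_neutral and double_is_rescued
            and epistasis(f_a, f_b, f_ab, f_wt) > 0)

# Hypothetical scores: A alone is damaging, B alone is neutral, A+B is nearly wild type.
rescued = {"f_a": -1.2, "f_b": 0.05, "f_ab": -0.1}
# Hypothetical scores: B does nothing for A, so the double mutant stays damaged.
not_rescued = {"f_a": -1.2, "f_b": 0.0, "f_ab": -1.2}

eps = epistasis(**rescued)
print("epistasis (rescued pair):", round(eps, 3))
print("compensatory?", is_compensatory(**rescued), is_compensatory(**not_rescued))
```

Requiring the double mutant to return to near wild type is a strict reading of "cancels the effect"; a looser criterion (any positive epistasis above measurement noise) would classify partial rescue as compensation too.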
Affiliation(s)
- Nadezhda Azbukina
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
- Anastasia Zharikova
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
- National Medical Research Center for Therapy and Preventive Medicine, Petroverigsky per., 10, Bld.3, 101000 Moscow, Russia
- Vasily Ramensky
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
- National Medical Research Center for Therapy and Preventive Medicine, Petroverigsky per., 10, Bld.3, 101000 Moscow, Russia
38
|
Zheng Y, Zhang B, Xie Y, Lin J, Wei D. Using a novel data-driven combinatorial mutagenesis strategy to engineer an alcohol dehydrogenase for efficient geraniol synthesis. Biochem Eng J 2022. [DOI: 10.1016/j.bej.2022.108568] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
39
Jayaraman V, Toledo-Patiño S, Noda-García L, Laurino P. Mechanisms of protein evolution. Protein Sci 2022; 31:e4362. [PMID: 35762715 PMCID: PMC9214755 DOI: 10.1002/pro.4362] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 02/28/2022] [Revised: 05/11/2022] [Accepted: 05/14/2022] [Indexed: 11/06/2022]
Abstract
How do proteins evolve? How do changes in sequence mediate changes in protein structure, and in turn in function? This question has multiple angles, ranging from biochemistry and biophysics to evolutionary biology. This review provides a brief integrated view of some key mechanistic aspects of protein evolution. First, we explain how protein evolution is primarily driven by randomly acquired genetic mutations and selection for function, and how these mutations can even give rise to completely new folds. Then, we also comment on how phenotypic protein variability, including promiscuity, transcriptional and translational errors, may also accelerate this process, possibly via "plasticity-first" mechanisms. Finally, we highlight open questions in the field of protein evolution, with respect to the emergence of more sophisticated protein systems such as protein complexes, pathways, and the emergence of pre-LUCA enzymes.
Affiliation(s)
- Vijay Jayaraman
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Saacnicteh Toledo-Patiño
- Protein Engineering and Evolution Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Lianet Noda-García
- Department of Plant Pathology and Microbiology, Institute of Environmental Sciences, Robert H. Smith Faculty of Agriculture, Food and Environment, Hebrew University of Jerusalem, Rehovot, Israel
- Paola Laurino
- Protein Engineering and Evolution Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
40
Barnes JE, Miller CR, Ytreberg FM. Searching for a mechanistic description of pairwise epistasis in protein systems. Proteins 2022; 90:1474-1485. [DOI: 10.1002/prot.26328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2021] [Revised: 11/05/2021] [Accepted: 02/22/2022] [Indexed: 11/09/2022]
Affiliation(s)
- Jonathan E. Barnes
- Department of Physics, University of Idaho, Moscow, Idaho, USA
- Institute for Modeling Collaboration and Innovation, University of Idaho, Moscow, Idaho, USA
- Craig R. Miller
- Institute for Modeling Collaboration and Innovation, University of Idaho, Moscow, Idaho, USA
- Department of Biological Sciences, University of Idaho, Moscow, Idaho, USA
- Institute for Interdisciplinary Data Sciences, University of Idaho, Moscow, Idaho, USA
- Frederick Marty Ytreberg
- Department of Physics, University of Idaho, Moscow, Idaho, USA
- Institute for Modeling Collaboration and Innovation, University of Idaho, Moscow, Idaho, USA
- Institute for Interdisciplinary Data Sciences, University of Idaho, Moscow, Idaho, USA
41
Brissos V, Borges P, Núñez-Franco R, Lucas MF, Frazão C, Monza E, Masgrau L, Cordeiro TN, Martins LO. Distal Mutations Shape Substrate-Binding Sites during Evolution of a Metallo-Oxidase into a Laccase. ACS Catal 2022; 12:5022-5035. [PMID: 36567772 PMCID: PMC9775220 DOI: 10.1021/acscatal.2c00336] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/27/2022]
Abstract
Laccases are in increasing demand as innovative solutions in the biorefinery fields. Here, we combine mutagenesis with structural, kinetic, and in silico analyses to characterize the molecular features that cause the evolution of a hyperthermostable metallo-oxidase from the multicopper oxidase family into a laccase (kcat = 273 s⁻¹ for a bulky aromatic substrate). We show that six mutations scattered across the enzyme collectively modulate dynamics to improve the binding and catalysis of a bulky aromatic substrate. The replacement of residues during the early stages of evolution is a stepping stone for altering the shape and size of substrate-binding sites. Binding sites are then fine-tuned through high-order epistasis interactions by inserting distal mutations during later stages of evolution. Allosterically coupled, long-range dynamic networks favor catalytically competent conformational states that are more suitable for recognizing and stabilizing the aromatic substrate. This work provides mechanistic insight into enzymatic and evolutionary molecular mechanisms and spots the importance of iterative experimental and computational analyses to understand local-to-global changes.
Affiliation(s)
- Vânia Brissos
- Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av da República, 2780-157 Oeiras, Portugal
- Patrícia T. Borges
- Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av da República, 2780-157 Oeiras, Portugal
- Carlos Frazão
- Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av da República, 2780-157 Oeiras, Portugal
- Emanuele Monza
- Zymvol Biomodeling, Carrer Roc Boronat, 117, 08018 Barcelona, Spain
- Laura Masgrau
- Zymvol Biomodeling, Carrer Roc Boronat, 117, 08018 Barcelona, Spain
- Department of Chemistry, Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain
- Tiago N. Cordeiro
- Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av da República, 2780-157 Oeiras, Portugal
- Lígia O. Martins
- Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av da República, 2780-157 Oeiras, Portugal
42
Vila JA. Proteins' Evolution upon Point Mutations. ACS Omega 2022; 7:14371-14376. [PMID: 35573218 PMCID: PMC9089682 DOI: 10.1021/acsomega.2c01407] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/09/2022] [Accepted: 04/05/2022] [Indexed: 05/03/2023]
Abstract
As the reader must be already aware, state-of-the-art protein folding prediction methods have reached a smashing success in their goal of accurately determining the three-dimensional structures of proteins. Yet, a solution to simple problems such as the effects of protein point mutations on their (i) native conformation; (ii) marginal stability; (iii) ensemble of high-energy nativelike conformations; and (iv) metamorphism propensity and, hence, their evolvability, remains as an unsolved problem. As a plausible solution to the latter, some properties of the amide hydrogen-deuterium exchange, a highly sensitive probe of the structure, stability, and folding of proteins, are assessed from a new perspective. The preliminary results indicate that the protein marginal stability change upon point mutations provides the necessary and sufficient information to estimate, through a Boltzmann factor, the evolution of the amide hydrogen exchange protection factors and, consequently, that of the ensemble of folded conformations coexisting with the native state. This work contributes to our general understanding of the effects of point mutations on proteins and may spur significant progress in our efforts to develop methods to determine the appearance of new folds and functions accurately.
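A rough numerical illustration of the Boltzmann-factor idea in this abstract: the sketch below assumes the textbook relation ΔG_HX = RT ln(PF) between an amide's protection factor (PF) and its opening free energy, and further assumes that a point mutation shifts that free energy by exactly the change in marginal stability ΔΔG (positive = stabilizing). Under those assumptions PF_mut = PF_wt · exp(ΔΔG / RT). This is one plausible reading, not necessarily the expression derived in the paper; the function name and the example values are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def scaled_protection_factor(pf_wt, ddg_kcal, temperature_k=298.15):
    """Scale a wild-type protection factor by the Boltzmann factor exp(ddG / RT).

    Assumes ddg_kcal > 0 means the mutation stabilizes the protein and that the
    amide exchanges only through the opening event whose free energy shifts by ddG.
    """
    return pf_wt * math.exp(ddg_kcal / (R_KCAL * temperature_k))

pf_wt = 1.0e5                                        # hypothetical wild-type protection factor
pf_destab = scaled_protection_factor(pf_wt, -1.0)    # 1 kcal/mol destabilizing mutation
pf_stab = scaled_protection_factor(pf_wt, +1.0)      # 1 kcal/mol stabilizing mutation

print(f"destabilized: {pf_destab:.3g}, stabilized: {pf_stab:.3g}")
```

Because the dependence is exponential, a change of about 1.4 kcal/mol at room temperature corresponds to roughly a tenfold change in the protection factor under these assumptions.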
43
Environmental selection and epistasis in an empirical phenotype-environment-fitness landscape. Nat Ecol Evol 2022; 6:427-438. [PMID: 35210579 DOI: 10.1038/s41559-022-01675-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/19/2021] [Accepted: 12/14/2021] [Indexed: 11/08/2022]
Abstract
Fitness landscapes, mappings of genotype/phenotype to their effects on fitness, are invaluable concepts in evolutionary biochemistry. Although widely discussed, measurements of phenotype-fitness landscapes in proteins remain scarce. Here, we quantify all single mutational effects on fitness and phenotype (EC50) of VIM-2 β-lactamase across a 64-fold range of ampicillin concentrations. We then construct a phenotype-fitness landscape that takes variations in environmental selection pressure into account. We found that a simple, empirical landscape accurately models the ~39,000 mutational data points, suggesting that the evolution of VIM-2 can be predicted on the basis of the selection environment. Our landscape provides new quantitative knowledge on the evolution of the β-lactamases and proteins in general, particularly their evolutionary dynamics under subinhibitory antibiotic concentrations, as well as the mechanisms and environmental dependence of non-specific epistasis.
44
Qu G, Bi Y, Liu B, Li J, Han X, Liu W, Jiang Y, Qin Z, Sun Z. Unlocking the Stereoselectivity and Substrate Acceptance of Enzymes: Proline-Induced Loop Engineering Test. Angew Chem Int Ed Engl 2022. [DOI: 10.1002/ange.202110793] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/26/2022]
Affiliation(s)
- Ge Qu
- Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
- National Technology Innovation Center of Synthetic Biology, Tianjin 300308, China
- Yuexin Bi
- Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
- University of Science and Technology of China, Hefei 230027, China
- Beibei Liu
- Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
- Junkuan Li
- Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
- Department of Chemistry, School of Science, Tianjin University, Tianjin 300072, China
- Xu Han
- Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
- National Technology Innovation Center of Synthetic Biology, Tianjin 300308, China
- Weidong Liu
- Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
- National Technology Innovation Center of Synthetic Biology, Tianjin 300308, China
- Yingying Jiang
- Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Zongmin Qin
- Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Zhoutong Sun
- Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
- National Technology Innovation Center of Synthetic Biology, Tianjin 300308, China
45
Jiang Y, Qu G, Sheng X, Tong F, Sun Z. Unraveling the mechanism of enantio-controlling switches of an alcohol dehydrogenase toward sterically small ketone. Catal Sci Technol 2022. [DOI: 10.1039/d2cy00031h] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/21/2022]
Abstract
Efficient synthesis of chiral compounds under mild conditions is highly desirable in the chemical and pharmaceutical communities, but it often faces difficulties. Although various enzymes have been harnessed as biocatalysts...
46
Cadet XF, Gelly JC, van Noord A, Cadet F, Acevedo-Rocha CG. Learning Strategies in Protein Directed Evolution. Methods Mol Biol 2022; 2461:225-275. [PMID: 35727454 DOI: 10.1007/978-1-0716-2152-3_15] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/13/2022]
Abstract
Synthetic biology is a fast-evolving research field that combines biology and engineering principles to develop new biological systems for medical, pharmacological, and industrial applications. Synthetic biologists use iterative "design, build, test, and learn" cycles to efficiently engineer genetic systems that are reliable, reproducible, and predictable. Protein engineering by directed evolution can benefit from such a systematic engineering approach for various reasons. Learning can be carried out before starting, throughout or after finalizing a directed evolution project. Computational tools, bioinformatics, and scanning mutagenesis methods can be excellent starting points, while molecular dynamics simulations and other strategies can guide engineering efforts. Similarly, studying protein intermediates along evolutionary pathways offers fascinating insights into the molecular mechanisms shaped by evolution. The learning step of the cycle is not only crucial for proteins or enzymes that are not suitable for high-throughput screening or selection systems, but it is also valuable for any platform that can generate a large amount of data that can be aided by machine learning algorithms. The main challenge in protein engineering is to predict the effect of a single mutation on one functional parameter, to say nothing of several mutations on multiple parameters. This is largely due to nonadditive mutational interactions, known as epistatic effects: beneficial mutations present in one genetic background may not be beneficial in another genetic background. In this work, we provide an overview of experimental and computational strategies that can guide the user to learn protein function at different stages in a directed evolution project. We also discuss how epistatic effects can influence the success of directed evolution projects.
Since machine learning is gaining momentum in protein engineering and the field is becoming more interdisciplinary thanks to collaboration between mathematicians, computational scientists, engineers, molecular biologists, and chemists, we provide a general workflow that familiarizes nonexperts with the basic concepts, dataset requirements, learning approaches, model capabilities and performance metrics of this intriguing area. Finally, we also provide some practical recommendations on how machine learning can harness epistatic effects for engineering proteins in an "outside-the-box" way.
Affiliation(s)
- Xavier F Cadet
- PEACCEL, Artificial Intelligence Department, Paris, France
- Jean Christophe Gelly
- Laboratoire d'Excellence GR-Ex, Paris, France
- BIGR, DSIMB, UMR_S1134, INSERM, University of Paris & University of Reunion, Paris, France
- Frédéric Cadet
- Laboratoire d'Excellence GR-Ex, Paris, France
- BIGR, DSIMB, UMR_S1134, INSERM, University of Paris & University of Reunion, Paris, France
47
SpeedyGenesXL: an Automated, High-Throughput Platform for the Preparation of Bespoke Ultralarge Variant Libraries for Directed Evolution. Methods Mol Biol 2022; 2461:67-83. [PMID: 35727444 DOI: 10.1007/978-1-0716-2152-3_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
Directed evolution of proteins is a highly effective strategy for tailoring biocatalysts to a particular application, and is capable of engineering improvements such as kcat, thermostability and organic solvent tolerance. It is recognized that large and systematic libraries are required to navigate a protein's vast and rugged sequence landscape effectively, yet their preparation is nontrivial and commercial libraries are extremely costly. To address this, we have developed SpeedyGenesXL, an automated, high-throughput platform for the production of wild-type genes, Boolean OR, combinatorial, or combinatorial-OR-type libraries based on the SpeedyGenes methodology. Together this offers a flexible platform for library synthesis, capable of generating many different bespoke, diverse libraries simultaneously.
48
Baquero F, Martínez JL, Lanza VF, Rodríguez-Beltrán J, Galán JC, San Millán A, Cantón R, Coque TM. Evolutionary Pathways and Trajectories in Antibiotic Resistance. Clin Microbiol Rev 2021; 34:e0005019. [PMID: 34190572 PMCID: PMC8404696 DOI: 10.1128/cmr.00050-19] [Citation(s) in RCA: 95] [Impact Index Per Article: 23.8] [Indexed: 12/15/2022]
Abstract
Evolution is the hallmark of life. Descriptions of the evolution of microorganisms have provided a wealth of information, but knowledge regarding "what happened" has precluded a deeper understanding of "how" evolution has proceeded, as in the case of antimicrobial resistance. The difficulty in answering the "how" question lies in the multihierarchical dimensions of evolutionary processes, nested in complex networks, encompassing all units of selection, from genes to communities and ecosystems. At the simplest ontological level (as resistance genes), evolution proceeds by random (mutation and drift) and directional (natural selection) processes; however, sequential pathways of adaptive variation can occasionally be observed, and under fixed circumstances (particular fitness landscapes), evolution is predictable. At the highest level (such as that of plasmids, clones, species, microbiotas), the systems' degrees of freedom increase dramatically, related to the variable dispersal, fragmentation, relatedness, or coalescence of bacterial populations, depending on heterogeneous and changing niches and selective gradients in complex environments. Evolutionary trajectories of antibiotic resistance find their way in these changing landscapes subjected to random variations, becoming highly entropic and therefore unpredictable. However, experimental, phylogenetic, and ecogenetic analyses reveal preferential frequented paths (highways) where antibiotic resistance flows and propagates, allowing some understanding of evolutionary dynamics, modeling and designing interventions. Studies on antibiotic resistance have an applied aspect in improving individual health, One Health, and Global Health, as well as an academic value for understanding evolution. Most importantly, they have a heuristic significance as a model to reduce the negative influence of anthropogenic effects on the environment.
Affiliation(s)
- F. Baquero
- Department of Microbiology, Ramón y Cajal University Hospital, Ramón y Cajal Institute for Health Research (IRYCIS), Network Center for Research in Epidemiology and Public Health (CIBERESP), Madrid, Spain
- J. L. Martínez
- National Center for Biotechnology (CNB-CSIC), Madrid, Spain
- V. F. Lanza
- Department of Microbiology, Ramón y Cajal University Hospital, Ramón y Cajal Institute for Health Research (IRYCIS), Network Center for Research in Epidemiology and Public Health (CIBERESP), Madrid, Spain
- Central Bioinformatics Unit, Ramón y Cajal Institute for Health Research (IRYCIS), Madrid, Spain
- J. Rodríguez-Beltrán
- Department of Microbiology, Ramón y Cajal University Hospital, Ramón y Cajal Institute for Health Research (IRYCIS), Network Center for Research in Epidemiology and Public Health (CIBERESP), Madrid, Spain
- J. C. Galán
- Department of Microbiology, Ramón y Cajal University Hospital, Ramón y Cajal Institute for Health Research (IRYCIS), Network Center for Research in Epidemiology and Public Health (CIBERESP), Madrid, Spain
- A. San Millán
- National Center for Biotechnology (CNB-CSIC), Madrid, Spain
- R. Cantón
- Department of Microbiology, Ramón y Cajal University Hospital, Ramón y Cajal Institute for Health Research (IRYCIS), Network Center for Research in Epidemiology and Public Health (CIBERESP), Madrid, Spain
- T. M. Coque
- Department of Microbiology, Ramón y Cajal University Hospital, Ramón y Cajal Institute for Health Research (IRYCIS), Network Center for Research in Epidemiology and Public Health (CIBERESP), Madrid, Spain
49
Wang Y, Lei R, Nourmohammad A, Wu NC. Antigenic evolution of human influenza H3N2 neuraminidase is constrained by charge balancing. eLife 2021; 10:e72516. [PMID: 34878407 PMCID: PMC8683081 DOI: 10.7554/elife.72516] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 07/27/2021] [Accepted: 12/07/2021] [Indexed: 11/13/2022]
Abstract
As one of the main influenza antigens, neuraminidase (NA) in H3N2 virus has evolved extensively for more than 50 years due to continuous immune pressure. While NA has recently emerged as an effective vaccine target, biophysical constraints on the antigenic evolution of NA remain largely elusive. Here, we apply combinatorial mutagenesis and next-generation sequencing to characterize the local fitness landscape in an antigenic region of NA in six different human H3N2 strains that were isolated around 10 years apart. The local fitness landscape correlates well among strains and the pairwise epistasis is highly conserved. Our analysis further demonstrates that local net charge governs the pairwise epistasis in this antigenic region. In addition, we show that residue coevolution in this antigenic region is correlated with the pairwise epistasis between charge states. Overall, this study demonstrates the importance of quantifying epistasis and the underlying biophysical constraint for building a model of influenza evolution.
Affiliation(s)
- Yiquan Wang
- Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, United States
- Ruipeng Lei
- Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, United States
- Armita Nourmohammad
- Department of Physics, University of Washington, Seattle, United States
- Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
- Fred Hutchinson Cancer Research Center, Seattle, United States
- Nicholas C Wu
- Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, United States
- Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, United States
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, United States
- Carle Illinois College of Medicine, University of Illinois at Urbana-Champaign, Urbana, United States
50
Wittmann BJ, Yue Y, Arnold FH. Informed training set design enables efficient machine learning-assisted directed protein evolution. Cell Syst 2021; 12:1026-1045.e7. [PMID: 34416172 DOI: 10.1016/j.cels.2021.07.008] [Citation(s) in RCA: 77] [Impact Index Per Article: 19.3] [Received: 12/15/2020] [Revised: 05/06/2021] [Accepted: 07/26/2021] [Indexed: 11/17/2022]
Abstract
Directed evolution of proteins often involves a greedy optimization in which the mutation in the highest-fitness variant identified in each round of single-site mutagenesis is fixed. The efficiency of such a single-step greedy walk depends on the order in which beneficial mutations are identified: the process is path dependent. Here, we investigate and optimize a path-independent machine learning-assisted directed evolution (MLDE) protocol that allows in silico screening of full combinatorial libraries. In particular, we evaluate the importance of different protein encoding strategies, training procedures, models, and training set design strategies on MLDE outcome, finding the most important consideration to be the implementation of strategies that reduce inclusion of minimally informative "holes" (protein variants with zero or extremely low fitness) in training data. When applied to an epistatic, hole-filled, four-site combinatorial fitness landscape, our optimized protocol achieved the global fitness maximum up to 81-fold more frequently than single-step greedy optimization.
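The path dependence described in this abstract can be seen on a toy landscape. The sketch below uses a hypothetical two-site, three-letter landscape with made-up fitness values (not data from the paper): a single-step greedy walk fixes the best single mutation each round and stalls on a local optimum, while scoring every member of the combinatorial library finds the global maximum. In an MLDE setting a trained model would supply predicted scores for the full library; here the true values stand in for those predictions.

```python
from itertools import product

ALPHABET = "ABC"  # hypothetical residues allowed at each site

# Hypothetical epistatic landscape: "CC" is the global maximum, but neither
# single mutation toward it is the best first step, and "AC"/"BC" are near-dead "holes".
FITNESS = {
    "AA": 1.0, "AB": 1.2, "AC": 0.1,
    "BA": 2.0, "BB": 2.5, "BC": 0.2,
    "CA": 1.5, "CB": 1.0, "CC": 5.0,
}

def greedy_walk(parent, fitness):
    """Each round, test all single substitutions at unfixed sites and fix the best one."""
    current = parent
    free_sites = set(range(len(parent)))
    while free_sites:
        candidates = []
        for site in free_sites:
            for aa in ALPHABET:
                if aa != current[site]:
                    variant = current[:site] + aa + current[site + 1:]
                    candidates.append((fitness[variant], variant, site))
        best_fitness, best_variant, best_site = max(candidates)
        if best_fitness <= fitness[current]:
            break  # no single mutation improves on the current variant
        current = best_variant
        free_sites.remove(best_site)
    return current

def exhaustive_search(n_sites, score):
    """Score every variant in the full combinatorial library and return the best."""
    library = ("".join(v) for v in product(ALPHABET, repeat=n_sites))
    return max(library, key=score)

greedy_best = greedy_walk("AA", FITNESS)
global_best = exhaustive_search(2, FITNESS.get)
print("greedy walk ends at:", greedy_best, FITNESS[greedy_best])
print("full-library optimum:", global_best, FITNESS[global_best])
```

Exhaustive scoring is only feasible here because the library has nine members; for real multi-site libraries the point of a model-based screen is to rank variants that were never measured.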
Affiliation(s)
- Bruce J Wittmann
- Division of Biology and Biological Engineering, California Institute of Technology, MC 210-41, 1200 E. California Blvd., Pasadena, CA 91125, USA
- Yisong Yue
- Department of Computing and Mathematical Sciences, California Institute of Technology, MC 305-16, 1200 E. California Blvd., Pasadena, CA 91125, USA
- Frances H Arnold
- Division of Biology and Biological Engineering, California Institute of Technology, MC 210-41, 1200 E. California Blvd., Pasadena, CA 91125, USA
- Division of Chemistry and Chemical Engineering, California Institute of Technology, MC 210-41, 1200 E. California Blvd., Pasadena, CA 91125, USA