1. Valero AM, Prins RC, de Vroet T, Billerbeck S. Combining Oligo Pools and Golden Gate Cloning to Create Protein Variant Libraries or Guide RNA Libraries for CRISPR Applications. Methods Mol Biol 2025; 2850:265-295. [PMID: 39363077 DOI: 10.1007/978-1-0716-4220-7_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/05/2024]
Abstract
Oligo pools are array-synthesized, user-defined mixtures of single-stranded oligonucleotides that can be used as a source of synthetic DNA for library cloning. While currently offering the most affordable source of synthetic DNA, oligo pools also come with limitations such as a maximum synthesis length (approximately 350 bases), a higher error rate compared to alternative synthesis methods, and the presence of truncated molecules in the pool due to incomplete synthesis. Here, we provide users with a comprehensive protocol that details how oligo pools can be used in combination with Golden Gate cloning to create user-defined protein mutant libraries, as well as single-guide RNA libraries for CRISPR applications. Our methods are optimized to work within the Yeast Toolkit Golden Gate scheme, but are in principle compatible with any other Golden Gate-based modular cloning toolkit and extendable to other restriction enzyme-based cloning methods beyond Golden Gate. Our methods yield high-quality, affordable, in-house variant libraries.
Affiliation(s)
- Alicia Maciá Valero
  - Molecular Microbiology, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, The Netherlands
- Rianne C Prins
  - Molecular Microbiology, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, The Netherlands
- Thijs de Vroet
  - Molecular Microbiology, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, The Netherlands
- Sonja Billerbeck
  - Molecular Microbiology, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, The Netherlands
2. Boyle GE, Sitko KA, Galloway JG, Haddox HK, Bianchi AH, Dixon A, Wheelock MK, Vandi AJ, Wang ZR, Thomson RES, Garge RK, Rettie AE, Rubin AF, Geck RC, Gillam EMJ, DeWitt WS, Matsen FA, Fowler DM. Deep mutational scanning of CYP2C19 in human cells reveals a substrate specificity-abundance tradeoff. Genetics 2024; 228:iyae156. [PMID: 39319420 PMCID: PMC11538415 DOI: 10.1093/genetics/iyae156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2024] [Accepted: 08/31/2024] [Indexed: 09/26/2024] Open
Abstract
The cytochrome P450s enzyme family metabolizes ∼80% of small molecule drugs. Variants in cytochrome P450s can substantially alter drug metabolism, leading to improper dosing and severe adverse drug reactions. Due to low sequence conservation, predicting variant effects across cytochrome P450s is challenging. Even closely related cytochrome P450s like CYP2C9 and CYP2C19, which share 92% amino acid sequence identity, display distinct phenotypic properties. Using variant abundance by massively parallel sequencing, we measured the steady-state protein abundance of 7,660 single amino acid variants in CYP2C19 expressed in cultured human cells. Our findings confirmed critical positions and structural features essential for cytochrome P450 function, and revealed how variants at conserved positions influence abundance. We jointly analyzed 4,670 variants whose abundance was measured in both CYP2C19 and CYP2C9, finding that the homologs have different variant abundances in substrate recognition sites within the hydrophobic core. We also measured the abundance of all single and some multiple wild type amino acid exchanges between CYP2C19 and CYP2C9. While most exchanges had no effect, substitutions in substrate recognition site 4 reduced abundance in CYP2C19. Double and triple mutants showed distinct interactions, highlighting a region that points to differing thermodynamic properties between the 2 homologs. These positions are known contributors to substrate specificity, suggesting an evolutionary tradeoff between stability and enzymatic function. Finally, we analyzed 368 previously unannotated human variants, finding that 43% had decreased abundance. By comparing variant effects between these homologs, we uncovered regions underlying their functional differences, advancing our understanding of this versatile family of enzymes.
Affiliation(s)
- Gabriel E Boyle
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Katherine A Sitko
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Jared G Galloway
  - Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Hugh K Haddox
  - Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Aisha Haley Bianchi
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Ajeya Dixon
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Melinda K Wheelock
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Allyssa J Vandi
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Ziyu R Wang
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Raine E S Thomson
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, QLD 4067, Australia
- Riddhiman K Garge
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
  - Brotman Baty Institute for Precision Medicine, University of Washington, Seattle, WA 98195, USA
- Allan E Rettie
  - Department of Medicinal Chemistry, University of Washington, Seattle, WA 98195, USA
- Alan F Rubin
  - Bioinformatics Division, Walter and Eliza Hall Institute, Parkville, VIC 3052, Australia
  - Department of Medical Biology, University of Melbourne, Melbourne, VIC 3052, Australia
- Renee C Geck
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Elizabeth M J Gillam
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, QLD 4067, Australia
- William S DeWitt
  - Department of Electrical Engineering and Computer Science, University of California at Berkeley, Berkeley, CA 94720, USA
- Frederick A Matsen
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
  - Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
  - Howard Hughes Medical Institute, Seattle, WA 98109, USA
  - Department of Statistics, University of Washington, Seattle, WA 98195, USA
- Douglas M Fowler
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
  - Department of Bioengineering, University of Washington, Seattle, WA 98195, USA
3. Billerbeck S, Walker RSK, Pretorius IS. Killer yeasts: expanding frontiers in the age of synthetic biology. Trends Biotechnol 2024; 42:1081-1096. [PMID: 38575438 DOI: 10.1016/j.tibtech.2024.03.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Revised: 03/07/2024] [Accepted: 03/07/2024] [Indexed: 04/06/2024]
Abstract
Killer yeasts secrete protein toxins that are selectively lethal to other yeast and filamentous fungi. These exhibit exceptional genetic and functional diversity, and have several biotechnological applications. However, despite decades of research, several limitations hinder their widespread adoption. In this perspective we contend that technical advances in synthetic biology present an unprecedented opportunity to unlock the full potential of yeast killer systems across a spectrum of applications. By leveraging these new technologies, engineered killer toxins may emerge as a pivotal new tool to address antifungal resistance and food security. Finally, we speculate on the biotechnological potential of re-engineering host double-stranded (ds) RNA mycoviruses, from which many toxins derive, as a safe and noninfectious system to produce designer RNA.
Affiliation(s)
- Sonja Billerbeck
  - Molecular Microbiology, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen 9747 AG, The Netherlands
- Roy S K Walker
  - Department of Molecular Sciences, Macquarie University, Sydney, New South Wales 2109, Australia
  - ARC Centre of Excellence in Synthetic Biology, Macquarie University, Sydney, New South Wales 2109, Australia
- Isak S Pretorius
  - ARC Centre of Excellence in Synthetic Biology, Macquarie University, Sydney, New South Wales 2109, Australia
4. Xu R, Pan Q, Zhu G, Ye Y, Xin M, Wang Z, Wang S, Li W, Wei Y, Guo J, Zheng L. ThermoLink: Bridging disulfide bonds and enzyme thermostability through database construction and machine learning prediction. Protein Sci 2024; 33:e5097. [PMID: 39145402 PMCID: PMC11325166 DOI: 10.1002/pro.5097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2024] [Revised: 05/27/2024] [Accepted: 06/15/2024] [Indexed: 08/16/2024]
Abstract
Disulfide bonds, covalently formed by sulfur atoms in cysteine residues, play a crucial role in protein folding and structure stability. Considering their significance, artificial disulfide bonds are often introduced to enhance protein thermostability. Although an increasing number of tools can assist with this task, significant amounts of time and resources are often wasted owing to inadequate consideration. To enhance the accuracy and efficiency of designing disulfide bonds for protein thermostability improvement, we initially collected disulfide bond and protein thermostability data from extensive literature sources. Thereafter, we extracted various sequence- and structure-based features and constructed machine-learning models to predict whether disulfide bonds can improve protein thermostability. Among all models, the neighborhood context model based on the Adaboost-DT algorithm performed the best, yielding "area under the receiver operating characteristic curve" and accuracy scores of 0.773 and 0.714, respectively. Furthermore, we also found AlphaFold2 to exhibit high superiority in predicting disulfide bonds, and to some extent, the coevolutionary relationship between residue pairs potentially guided artificial disulfide bond design. Moreover, several mutants of imine reductase 89 (IR89) with artificially designed thermostable disulfide bonds were experimentally proven to be considerably efficient for substrate catalysis. The SS-bond data have been integrated into an online server, namely, ThermoLink, available at guolab.mpu.edu.mo/thermoLink.
Affiliation(s)
- Ran Xu
  - Centre in Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macao, China
- Qican Pan
  - Zelixir Biotech Company Ltd, Shanghai, China
- Yilin Ye
  - Centre in Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macao, China
- Minghui Xin
  - School of Physics, Shandong University, Jinan, China
- Zechen Wang
  - School of Physics, Shandong University, Jinan, China
- Sheng Wang
  - Zelixir Biotech Company Ltd, Shanghai, China
- Weifeng Li
  - School of Physics, Shandong University, Jinan, China
- Yanjie Wei
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Jingjing Guo
  - Centre in Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macao, China
- Liangzhen Zheng
  - Zelixir Biotech Company Ltd, Shanghai, China
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
5. Lipsh-Sokolik R, Fleishman SJ. Addressing epistasis in the design of protein function. Proc Natl Acad Sci U S A 2024; 121:e2314999121. [PMID: 39133844 PMCID: PMC11348311 DOI: 10.1073/pnas.2314999121] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/29/2024] Open
Abstract
Mutations in protein active sites can dramatically improve function. The active site, however, is densely packed and extremely sensitive to mutations. Therefore, some mutations may only be tolerated in combination with others in a phenomenon known as epistasis. Epistasis reduces the likelihood of obtaining improved functional variants and dramatically slows natural and lab evolutionary processes. Research has shed light on the molecular origins of epistasis and its role in shaping evolutionary trajectories and outcomes. In addition, sequence- and AI-based strategies that infer epistatic relationships from mutational patterns in natural or experimental evolution data have been used to design functional protein variants. In recent years, combinations of such approaches and atomistic design calculations have successfully predicted highly functional combinatorial mutations in active sites. These were used to design thousands of functional active-site variants, demonstrating that, while our understanding of epistasis remains incomplete, some of the determinants that are critical for accurate design are now sufficiently understood. We conclude that the space of active-site variants that has been explored by evolution may be expanded dramatically to enhance natural activities or discover new ones. Furthermore, design opens the way to systematically exploring sequence and structure space and mutational impacts on function, deepening our understanding and control over protein activity.
Affiliation(s)
- Rosalie Lipsh-Sokolik
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 7610001, Israel
- Sarel J Fleishman
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 7610001, Israel
6. Planas-Iglesias J, Borko S, Swiatkowski J, Elias M, Havlasek M, Salamon O, Grakova E, Kunka A, Martinovic T, Damborsky J, Martinovic J, Bednar D. AggreProt: a web server for predicting and engineering aggregation prone regions in proteins. Nucleic Acids Res 2024; 52:W159-W169. [PMID: 38801076 PMCID: PMC11223854 DOI: 10.1093/nar/gkae420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2024] [Revised: 04/23/2024] [Accepted: 05/13/2024] [Indexed: 05/29/2024] Open
Abstract
Recombinant proteins play pivotal roles in numerous applications including industrial biocatalysts or therapeutics. Despite the recent progress in computational protein structure prediction, protein solubility and reduced aggregation propensity remain challenging attributes to design. Identification of aggregation-prone regions is essential for understanding misfolding diseases or designing efficient protein-based technologies, and as such has a great socio-economic impact. Here, we introduce AggreProt, a user-friendly webserver that automatically exploits an ensemble of deep neural networks to predict aggregation-prone regions (APRs) in protein sequences. Trained on experimentally evaluated hexapeptides, AggreProt compares to or outperforms state-of-the-art algorithms on two independent benchmark datasets. The server provides per-residue aggregation profiles along with information on solvent accessibility and transmembrane propensity within an intuitive interface with interactive sequence and structure viewers for comprehensive analysis. We demonstrate AggreProt efficacy in predicting differential aggregation behaviours in proteins on several use cases, which emphasize its potential for guiding protein engineering strategies towards decreased aggregation propensity and improved solubility. The webserver is freely available and accessible at https://loschmidt.chemi.muni.cz/aggreprot/.
Affiliation(s)
- Joan Planas-Iglesias
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Brno, Czech Republic
  - International Clinical Research Center, St. Anne's University Hospital Brno, Brno, Czech Republic
- Simeon Borko
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Brno, Czech Republic
  - International Clinical Research Center, St. Anne's University Hospital Brno, Brno, Czech Republic
- Jan Swiatkowski
  - IT4Innovations, VSB – Technical University of Ostrava, 17. listopadu 2172/15, 708 00 Ostrava-Poruba, Czech Republic
- Matej Elias
  - IT4Innovations, VSB – Technical University of Ostrava, 17. listopadu 2172/15, 708 00 Ostrava-Poruba, Czech Republic
- Martin Havlasek
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Brno, Czech Republic
  - International Clinical Research Center, St. Anne's University Hospital Brno, Brno, Czech Republic
- Ondrej Salamon
  - IT4Innovations, VSB – Technical University of Ostrava, 17. listopadu 2172/15, 708 00 Ostrava-Poruba, Czech Republic
- Ekaterina Grakova
  - IT4Innovations, VSB – Technical University of Ostrava, 17. listopadu 2172/15, 708 00 Ostrava-Poruba, Czech Republic
- Antonín Kunka
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Brno, Czech Republic
  - International Clinical Research Center, St. Anne's University Hospital Brno, Brno, Czech Republic
- Tomas Martinovic
  - IT4Innovations, VSB – Technical University of Ostrava, 17. listopadu 2172/15, 708 00 Ostrava-Poruba, Czech Republic
- Jiri Damborsky
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Brno, Czech Republic
  - International Clinical Research Center, St. Anne's University Hospital Brno, Brno, Czech Republic
- Jan Martinovic
  - IT4Innovations, VSB – Technical University of Ostrava, 17. listopadu 2172/15, 708 00 Ostrava-Poruba, Czech Republic
- David Bednar
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Brno, Czech Republic
  - International Clinical Research Center, St. Anne's University Hospital Brno, Brno, Czech Republic
7. O'Neil PT, Swint‐Kruse L, Fenton AW. Rheostatic contributions to protein stability can obscure a position's functional role. Protein Sci 2024; 33:e5075. [PMID: 38895978 PMCID: PMC11187868 DOI: 10.1002/pro.5075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2024] [Revised: 05/24/2024] [Accepted: 05/27/2024] [Indexed: 06/21/2024]
Abstract
Rheostat positions, which can be substituted with various amino acids to tune protein function across a range of outcomes, are a developing area for advancing personalized medicine and bioengineering. Current methods cannot accurately predict which proteins contain rheostat positions or their substitution outcomes. To compare the prevalence of rheostat positions in homologs, we previously investigated their occurrence in two pyruvate kinase (PYK) isozymes. Human liver PYK contained numerous rheostat positions that tuned the apparent affinity for the substrate phosphoenolpyruvate (Kapp-PEP) across a wide range. In contrast, no functional rheostat positions were identified in Zymomonas mobilis PYK (ZmPYK). Further, the set of ZmPYK substitutions included an unusually large number that lacked measurable activity. We hypothesized that the inactive substitution variants had reduced protein stability, precluding detection of Kapp-PEP tuning. Using modified buffers, robust enzymatic activity was obtained for 19 previously-inactive ZmPYK substitution variants at three positions. Surprisingly, both previously-inactive and previously-active substitution variants all had Kapp-PEP values close to wild-type. Thus, none of the three positions were functional rheostat positions, and, unlike human liver PYK, ZmPYK's Kapp-PEP remained poorly tunable by single substitutions. To directly assess effects on stability, we performed thermal denaturation experiments for all ZmPYK substitution variants. Many diminished stability, two enhanced stability, and the three positions showed different thermal sensitivity to substitution, with one position acting as a "stability rheostat." The differences between the two PYK homologs raises interesting questions about the underlying mechanism(s) that permit functional tuning by single substitutions in some proteins but not in others.
Affiliation(s)
- Pierce T. O'Neil
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas, USA
- Liskin Swint‐Kruse
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas, USA
- Aron W. Fenton
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas, USA
8. Fram B, Su Y, Truebridge I, Riesselman AJ, Ingraham JB, Passera A, Napier E, Thadani NN, Lim S, Roberts K, Kaur G, Stiffler MA, Marks DS, Bahl CD, Khan AR, Sander C, Gauthier NP. Simultaneous enhancement of multiple functional properties using evolution-informed protein design. Nat Commun 2024; 15:5141. [PMID: 38902262 PMCID: PMC11190266 DOI: 10.1038/s41467-024-49119-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Accepted: 05/24/2024] [Indexed: 06/22/2024] Open
Abstract
A major challenge in protein design is to augment existing functional proteins with multiple property enhancements. Altering several properties likely necessitates numerous primary sequence changes, and novel methods are needed to accurately predict combinations of mutations that maintain or enhance function. Models of sequence co-variation (e.g., EVcouplings), which leverage extensive information about various protein properties and activities from homologous protein sequences, have proven effective for many applications including structure determination and mutation effect prediction. We apply EVcouplings to computationally design variants of the model protein TEM-1 β-lactamase. Nearly all the 14 experimentally characterized designs were functional, including one with 84 mutations from the nearest natural homolog. The designs also had large increases in thermostability, increased activity on multiple substrates, and nearly identical structure to the wild type enzyme. This study highlights the efficacy of evolutionary models in guiding large sequence alterations to generate functional diversity for protein design applications.
Affiliation(s)
- Benjamin Fram
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Yang Su
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Ian Truebridge
  - Institute for Protein Innovation, Boston, MA, USA
  - Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
  - AI Proteins, Boston, MA, USA
- Adam J Riesselman
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Program in Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- John B Ingraham
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Alessandro Passera
  - Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
  - Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Campus-Vienna-Biocenter 1, 1030, Vienna, Austria
- Eve Napier
  - School of Biochemistry and Immunology, Trinity College Dublin, Dublin 2, Ireland
- Nicole N Thadani
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Apriori Bio, Cambridge, MA, USA
- Samuel Lim
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Kristen Roberts
  - Selux Diagnostics Inc., 56 Roland Street, Charlestown, MA, USA
- Gurleen Kaur
  - Selux Diagnostics Inc., 56 Roland Street, Charlestown, MA, USA
- Michael A Stiffler
  - Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
  - Dyno Therapeutics, 343 Arsenal Street, Watertown, MA, USA
- Debora S Marks
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Christopher D Bahl
  - Institute for Protein Innovation, Boston, MA, USA
  - Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
  - AI Proteins, Boston, MA, USA
- Amir R Khan
  - School of Biochemistry and Immunology, Trinity College Dublin, Dublin 2, Ireland
  - Division of Newborn Medicine, Boston Children's Hospital, Boston, MA, USA
- Chris Sander
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Nicholas P Gauthier
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
9. Petersen BM, Kirby MB, Chrispens KM, Irvin OM, Strawn IK, Haas CM, Walker AM, Baumer ZT, Ulmer SA, Ayala E, Rhodes ER, Guthmiller JJ, Steiner PJ, Whitehead TA. An integrated technology for quantitative wide mutational scanning of human antibody Fab libraries. Nat Commun 2024; 15:3974. [PMID: 38730230 PMCID: PMC11087541 DOI: 10.1038/s41467-024-48072-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Accepted: 04/19/2024] [Indexed: 05/12/2024] Open
Abstract
Antibodies are engineerable quantities in medicine. Learning antibody molecular recognition would enable the in silico design of high affinity binders against nearly any proteinaceous surface. Yet, publicly available experiment antibody sequence-binding datasets may not contain the mutagenic, antigenic, or antibody sequence diversity necessary for deep learning approaches to capture molecular recognition. In part, this is because limited experimental platforms exist for assessing quantitative and simultaneous sequence-function relationships for multiple antibodies. Here we present MAGMA-seq, an integrated technology that combines multiple antigens and multiple antibodies and determines quantitative biophysical parameters using deep sequencing. We demonstrate MAGMA-seq on two pooled libraries comprising mutants of nine different human antibodies spanning light chain gene usage, CDR H3 length, and antigenic targets. We demonstrate the comprehensive mapping of potential antibody development pathways, sequence-binding relationships for multiple antibodies simultaneously, and identification of paratope sequence determinants for binding recognition for broadly neutralizing antibodies (bnAbs). MAGMA-seq enables rapid and scalable antibody engineering of multiple lead candidates because it can measure binding for mutants of many given parental antibodies in a single experiment.
Affiliation(s)
- Brian M Petersen
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
- Monica B Kirby
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
- Karson M Chrispens
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
- Olivia M Irvin
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
- Isabell K Strawn
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
- Cyrus M Haas
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
- Alexis M Walker
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
- Zachary T Baumer
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
- Sophia A Ulmer
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
- Edgardo Ayala
  - Department of Immunology and Microbiology, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Emily R Rhodes
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
- Jenna J Guthmiller
  - Department of Immunology and Microbiology, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Paul J Steiner
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
- Timothy A Whitehead
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
10. Ali M, Greenig M, Oeller M, Atkinson M, Xu X, Sormanni P. Automated optimization of the solubility of a hyper-stable α-amylase. Open Biol 2024; 14:240014. [PMID: 38745462 PMCID: PMC11293438 DOI: 10.1098/rsob.240014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2024] [Revised: 03/26/2024] [Accepted: 03/27/2024] [Indexed: 05/16/2024] Open
Abstract
Most successes in computational protein engineering to date have focused on enhancing one biophysical trait, while multi-trait optimization remains a challenge. Different biophysical properties are often conflicting, as mutations that improve one tend to worsen the others. In this study, we explored the potential of an automated computational design strategy, called CamSol Combination, to optimize solubility and stability of enzymes without affecting their activity. Specifically, we focus on Bacillus licheniformis α-amylase (BLA), a hyper-stable enzyme that finds diverse application in industry and biotechnology. We validate the computational predictions by producing 10 BLA variants, including the wild-type (WT) and three designed models harbouring between 6 and 8 mutations each. Our results show that all three models have substantially improved relative solubility over the WT, unaffected catalytic rate and retained hyper-stability, supporting the algorithm's capacity to optimize enzymes. High stability and solubility embody enzymes with superior resilience to chemical and physical stresses, enhance manufacturability and allow for high-concentration formulations characterized by extended shelf lives. This ability to readily optimize solubility and stability of enzymes will enable the rapid and reliable generation of highly robust and versatile reagents, poised to contribute to advancements in diverse scientific and industrial domains.
Affiliation(s)
- Montader Ali
- Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, UK
| | - Matthew Greenig
- Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, UK
| | - Marc Oeller
- Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, UK
- Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried 82152, Germany
| | - Misha Atkinson
- Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, UK
| | - Xing Xu
- Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, UK
| | - Pietro Sormanni
- Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, UK
| |
|
11
|
Judge A, Sankaran B, Hu L, Palaniappan M, Birgy A, Prasad BVV, Palzkill T. Network of epistatic interactions in an enzyme active site revealed by large-scale deep mutational scanning. Proc Natl Acad Sci U S A 2024; 121:e2313513121. [PMID: 38483989 PMCID: PMC10962969 DOI: 10.1073/pnas.2313513121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Accepted: 02/14/2024] [Indexed: 03/19/2024] Open
Abstract
Cooperative interactions between amino acids are critical for protein function. A genetic reflection of cooperativity is epistasis, which is when a change in the amino acid at one position changes the sequence requirements at another position. To assess epistasis within an enzyme active site, we utilized CTX-M β-lactamase as a model system. CTX-M hydrolyzes β-lactam antibiotics to provide antibiotic resistance, allowing a simple functional selection for rapid sorting of modified enzymes. We created all pairwise mutations across 17 active site positions in the β-lactamase enzyme and quantitated the function of variants against two β-lactam antibiotics using next-generation sequencing. Context-dependent sequence requirements were determined by comparing the antibiotic resistance function of double mutations across the CTX-M active site to their predicted function based on the constituent single mutations, revealing both positive epistasis (synergistic interactions) and negative epistasis (antagonistic interactions) between amino acid substitutions. The resulting trends demonstrate that positive epistasis is present throughout the active site, that epistasis between residues is mediated through substrate interactions, and that residues more tolerant to substitutions serve as generic compensators which are responsible for many cases of positive epistasis. Additionally, we show that a key catalytic residue (Glu166) is amenable to compensatory mutations, and we characterize one such double mutant (E166Y/N170G) that acts by an altered catalytic mechanism. These findings shed light on the unique biochemical factors that drive epistasis within an enzyme active site and will inform enzyme engineering efforts by bridging the gap between amino acid sequence and catalytic function.
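The comparison described in this abstract, observed double-mutant function against the function predicted from the constituent single mutants, can be sketched as follows. This is a minimal illustration and not the authors' analysis: it assumes fitness scores are log-scale values relative to wild type (so single-mutant effects add under the no-interaction model), and the function names and the tolerance `tol` are hypothetical.

```python
def epistasis(double_score, single_a, single_b):
    """Epistasis as observed minus expected double-mutant score.

    Assumption: all scores are log-scale and relative to wild type,
    so the no-interaction expectation is the sum of the single effects.
    """
    expected = single_a + single_b
    return double_score - expected


def classify_epistasis(eps, tol=0.5):
    """Label an epistasis value; `tol` is an arbitrary illustrative cutoff."""
    if eps > tol:
        return "positive"   # double mutant works better than predicted
    if eps < -tol:
        return "negative"   # double mutant works worse than predicted
    return "none"


# Two deleterious singles that partially compensate each other in combination.
eps = epistasis(double_score=-1.0, single_a=-2.0, single_b=-1.5)
print(eps, classify_epistasis(eps))
```

In practice the cutoff would be chosen from the measurement noise of replicate experiments rather than fixed in advance.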
Affiliation(s)
- Allison Judge
- Verna and Marrs McLean Department of Biochemistry and Molecular Pharmacology, Baylor College of Medicine, Houston, TX 77030
| | - Banumathi Sankaran
- Department of Molecular Biophysics and Integrated Bioimaging, Berkeley Center for Structural Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720
| | - Liya Hu
- Verna and Marrs McLean Department of Biochemistry and Molecular Pharmacology, Baylor College of Medicine, Houston, TX 77030
| | - Murugesan Palaniappan
- Department of Pathology and Immunology, Center for Drug Discovery, Baylor College of Medicine, Houston, TX 77030
| | - André Birgy
- Verna and Marrs McLean Department of Biochemistry and Molecular Pharmacology, Baylor College of Medicine, Houston, TX 77030
- Infections, Antimicrobials, Modelling, Evolution, UMR 1137, French Institute for Medical Research (INSERM), Faculty of Health, Université Paris Cité, Paris 75006, France
| | - B. V. Venkataram Prasad
- Verna and Marrs McLean Department of Biochemistry and Molecular Pharmacology, Baylor College of Medicine, Houston, TX 77030
| | - Timothy Palzkill
- Verna and Marrs McLean Department of Biochemistry and Molecular Pharmacology, Baylor College of Medicine, Houston, TX 77030
| |
|
12
|
Rosenberg AM, Ayres CM, Medina-Cucurella AV, Whitehead TA, Baker BM. Enhanced T cell receptor specificity through framework engineering. Front Immunol 2024; 15:1345368. [PMID: 38545094 PMCID: PMC10967027 DOI: 10.3389/fimmu.2024.1345368] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Accepted: 02/15/2024] [Indexed: 04/12/2024] Open
Abstract
Development of T cell receptors (TCRs) as immunotherapeutics is hindered by inherent TCR cross-reactivity. Engineering more specific TCRs has proven challenging, as unlike antibodies, improving TCR affinity does not usually improve specificity. Although various protein design approaches have been explored to surmount this, mutations in TCR binding interfaces risk broadening specificity or introducing new reactivities. Here we explored if TCR specificity could alternatively be tuned through framework mutations distant from the interface. Studying the 868 TCR specific for the HIV SL9 epitope presented by HLA-A2, we used deep mutational scanning to identify a framework mutation above the mobile CDR3β loop. This glycine to proline mutation had no discernable impact on binding affinity or functional avidity towards the SL9 epitope but weakened recognition of SL9 escape variants and led to fewer responses in a SL9-derived positional scanning library. In contrast, an interfacial mutation near the tip of CDR3α that also did not impact affinity or functional avidity towards SL9 weakened specificity. Simulations indicated that the specificity-enhancing mutation functions by reducing the range of loop motions, limiting the ability of the TCR to adjust to different ligands. Although our results are likely to be TCR dependent, using framework engineering to control TCR loop motions may be a viable strategy for improving the specificity of TCR-based immunotherapies.
Affiliation(s)
- Aaron M. Rosenberg
- Department of Chemistry and Biochemistry and the Harper Cancer Research Institute, University of Notre Dame, Notre Dame, IN, United States
| | - Cory M. Ayres
- Department of Chemistry and Biochemistry and the Harper Cancer Research Institute, University of Notre Dame, Notre Dame, IN, United States
| | | | - Timothy A. Whitehead
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, United States
| | - Brian M. Baker
- Department of Chemistry and Biochemistry and the Harper Cancer Research Institute, University of Notre Dame, Notre Dame, IN, United States
| |
|
13
|
Swint-Kruse L, Fenton AW. Rheostats, toggles, and neutrals, Oh my! A new framework for understanding how amino acid changes modulate protein function. J Biol Chem 2024; 300:105736. [PMID: 38336297 PMCID: PMC10914490 DOI: 10.1016/j.jbc.2024.105736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Revised: 01/09/2024] [Accepted: 01/25/2024] [Indexed: 02/12/2024] Open
Abstract
Advances in personalized medicine and protein engineering require accurately predicting outcomes of amino acid substitutions. Many algorithms correctly predict that evolutionarily-conserved positions show "toggle" substitution phenotypes, which is defined when a few substitutions at that position retain function. In contrast, predictions often fail for substitutions at the less-studied "rheostat" positions, which are defined when different amino acid substitutions at a position sample at least half of the possible functional range. This review describes efforts to understand the impact and significance of rheostat positions: (1) They have been observed in globular soluble, integral membrane, and intrinsically disordered proteins; within single proteins, their prevalence can be up to 40%. (2) Substitutions at rheostat positions can have biological consequences and ∼10% of substitutions gain function. (3) Although both rheostat and "neutral" (defined when all substitutions exhibit wild-type function) positions are nonconserved, the two classes have different evolutionary signatures. (4) Some rheostat positions have pleiotropic effects on function, simultaneously modulating multiple parameters (e.g., altering both affinity and allosteric coupling). (5) In structural studies, substitutions at rheostat positions appear to cause only local perturbations; the overall conformations appear unchanged. (6) Measured functional changes show promising correlations with predicted changes in protein dynamics; the emergent properties of predicted, dynamically coupled amino acid networks might explain some of the complex functional outcomes observed when substituting rheostat positions. Overall, rheostat positions provide unique opportunities for using single substitutions to tune protein function. Future studies of these positions will yield important insights into the protein sequence/function relationship.
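The three position classes defined in this abstract can be illustrated with a small classifier. This is only a sketch under stated assumptions, not the scoring used in the reviewed studies: function values are taken to be normalized so that 1.0 is wild type and 0.0 is non-functional, the functional range is split into `n_bins` equal bins, and `tol`, `n_bins`, `toggle_fraction`, and the function name are all hypothetical choices.

```python
def classify_position(values, wt=1.0, dead=0.0, n_bins=10, tol=0.1,
                      toggle_fraction=0.7):
    """Classify one protein position from the functions of its substitutions.

    Assumptions (illustrative only): `values` are normalized functional
    readouts, `wt` is wild-type function and `dead` is loss of function.
    """
    lo, hi = min(wt, dead), max(wt, dead)
    span = hi - lo

    # Neutral: every substitution behaves like wild type.
    if all(abs(v - wt) <= tol * span for v in values):
        return "neutral"

    # Rheostat: substitutions sample at least half of the functional range,
    # measured here as the fraction of occupied histogram bins.
    occupied = set()
    for v in values:
        clipped = min(max(v, lo), hi)
        occupied.add(min(int((clipped - lo) / span * n_bins), n_bins - 1))
    if len(occupied) / n_bins >= 0.5:
        return "rheostat"

    # Toggle: most substitutions abolish function, only a few retain it.
    dead_like = sum(abs(v - dead) <= tol * span for v in values)
    if dead_like / len(values) >= toggle_fraction:
        return "toggle"

    return "unclassified"


print(classify_position([0.05, 0.25, 0.45, 0.65, 0.85, 1.0]))
```

The bin-occupancy test is one simple way to express "samples at least half of the possible functional range"; other measures of range coverage would serve equally well.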
Affiliation(s)
- Liskin Swint-Kruse
- Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA.
| | - Aron W Fenton
- Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
| |
|
14
|
Vanella R, Küng C, Schoepfer AA, Doffini V, Ren J, Nash MA. Understanding activity-stability tradeoffs in biocatalysts by enzyme proximity sequencing. Nat Commun 2024; 15:1807. [PMID: 38418512 PMCID: PMC10902396 DOI: 10.1038/s41467-024-45630-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Accepted: 01/26/2024] [Indexed: 03/01/2024] Open
Abstract
Understanding the complex relationships between enzyme sequence, folding stability and catalytic activity is crucial for applications in industry and biomedicine. However, current enzyme assay technologies are limited by an inability to simultaneously resolve both stability and activity phenotypes and to couple these to gene sequences at large scale. Here we present the development of enzyme proximity sequencing, a deep mutational scanning method that leverages peroxidase-mediated radical labeling with single cell fidelity to dissect the effects of thousands of mutations on stability and catalytic activity of oxidoreductase enzymes in a single experiment. We use enzyme proximity sequencing to analyze how 6399 missense mutations influence folding stability and catalytic activity in a D-amino acid oxidase from Rhodotorula gracilis. The resulting datasets demonstrate activity-based constraints that limit folding stability during natural evolution, and identify hotspots distant from the active site as candidates for mutations that improve catalytic activity without sacrificing stability. Enzyme proximity sequencing can be extended to other enzyme classes and provides valuable insights into biophysical principles governing enzyme structure and function.
Affiliation(s)
- Rosario Vanella
- Institute of Physical Chemistry, Department of Chemistry, University of Basel, 4058, Basel, Switzerland.
- Department of Biosystems Science and Engineering, ETH Zurich, 4058, Basel, Switzerland.
| | - Christoph Küng
- Institute of Physical Chemistry, Department of Chemistry, University of Basel, 4058, Basel, Switzerland
- Department of Biosystems Science and Engineering, ETH Zurich, 4058, Basel, Switzerland
| | - Alexandre A Schoepfer
- Institute of Physical Chemistry, Department of Chemistry, University of Basel, 4058, Basel, Switzerland
- Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne (EPFL), 1015, Lausanne, Switzerland
- National Center for Competence in Research (NCCR), Catalysis, École Polytechnique Fédérale de Lausanne (EPFL), 1015, Lausanne, Switzerland
| | - Vanni Doffini
- Institute of Physical Chemistry, Department of Chemistry, University of Basel, 4058, Basel, Switzerland
- Department of Biosystems Science and Engineering, ETH Zurich, 4058, Basel, Switzerland
| | - Jin Ren
- Institute of Physical Chemistry, Department of Chemistry, University of Basel, 4058, Basel, Switzerland
- Department of Biosystems Science and Engineering, ETH Zurich, 4058, Basel, Switzerland
| | - Michael A Nash
- Institute of Physical Chemistry, Department of Chemistry, University of Basel, 4058, Basel, Switzerland.
- Department of Biosystems Science and Engineering, ETH Zurich, 4058, Basel, Switzerland.
- National Center for Competence in Research (NCCR), Molecular Systems Engineering, 4058, Basel, Switzerland.
- Swiss Nanoscience Institute, 4056, Basel, Switzerland.
| |
|
15
|
Skene KR. Systems theory, thermodynamics and life: Integrated thinking across ecology, organization and biological evolution. Biosystems 2024; 236:105123. [PMID: 38244715 DOI: 10.1016/j.biosystems.2024.105123] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/02/2023] [Revised: 01/15/2024] [Accepted: 01/15/2024] [Indexed: 01/22/2024]
Abstract
In this paper we explore the relevance and integration of system theory and thermodynamics in terms of the Earth system. It is proposed that together, these fields explain the evolution, organization, functionality and directionality of life on Earth. We begin by summarizing historical and current thinking on the definition of life itself. We then investigate the evidence for a single unit of life. Given that any definition of life and its levels of organization are intertwined, we explore how the Earth system is structured and functions from an energetic perspective, by outlining relevant thermodynamic theory relating to molecular, metabolic, cellular, individual, population, species, ecosystem and biome organization. We next investigate the fundamental relationships between systems theory and thermodynamics in terms of the Earth system, examining the key characteristics of self-assembly, self-organization (including autonomy), emergence, non-linearity, feedback and sub-optimality. Finally, we examine the relevance of systems theory and thermodynamics with reference to two specific aspects: the tempo and directionality of evolution and the directional and predictable process of ecological succession. We discuss the importance of the entropic drive in understanding altruism, multicellularity, mutualistic and antagonistic relationships and how maximum entropy production theory may explain patterns thought to evidence the intermediate disturbance hypothesis.
Affiliation(s)
- Keith R Skene
- Biosphere Research Institute, Angus, United Kingdom.
| |
|
16
|
Petersen BM, Kirby MB, Chrispens KM, Irvin OM, Strawn IK, Haas CM, Walker AM, Baumer ZT, Ulmer SA, Ayala E, Rhodes ER, Guthmiller JJ, Steiner PJ, Whitehead TA. An integrated technology for quantitative wide mutational scanning of human antibody Fab libraries. bioRxiv 2024:2024.01.16.575852. [PMID: 38293170 PMCID: PMC10827193 DOI: 10.1101/2024.01.16.575852] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2024]
Abstract
Antibodies are engineerable quantities in medicine. Learning antibody molecular recognition would enable the in silico design of high affinity binders against nearly any proteinaceous surface. Yet, publicly available experimental antibody sequence-binding datasets may not contain the mutagenic, antigenic, or antibody sequence diversity necessary for deep learning approaches to capture molecular recognition. In part, this is because limited experimental platforms exist for assessing quantitative and simultaneous sequence-function relationships for multiple antibodies. Here we present MAGMA-seq, an integrated technology that combines multiple antigens and multiple antibodies and determines quantitative biophysical parameters using deep sequencing. We demonstrate MAGMA-seq on two pooled libraries comprising mutants of ten different human antibodies spanning light chain gene usage, CDR H3 length, and antigenic targets. We demonstrate the comprehensive mapping of potential antibody development pathways, sequence-binding relationships for multiple antibodies simultaneously, and identification of paratope sequence determinants for binding recognition for broadly neutralizing antibodies (bnAbs). MAGMA-seq enables rapid and scalable antibody engineering of multiple lead candidates because it can measure binding for mutants of many given parental antibodies in a single experiment.
Affiliation(s)
- Brian M. Petersen
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, 80305, USA
| | - Monica B. Kirby
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, 80305, USA
| | - Karson M. Chrispens
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, 80305, USA
| | - Olivia M. Irvin
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, 80305, USA
| | - Isabell K. Strawn
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, 80305, USA
| | - Cyrus M. Haas
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, 80305, USA
| | - Alexis M. Walker
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, 80305, USA
| | - Zachary T. Baumer
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, 80305, USA
| | - Sophia A. Ulmer
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, 80305, USA
| | - Edgardo Ayala
- Department of Immunology and Microbiology, University of Colorado Anschutz Medical Campus, Aurora, CO 80045
| | - Emily R. Rhodes
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, 80305, USA
| | - Jenna J. Guthmiller
- Department of Immunology and Microbiology, University of Colorado Anschutz Medical Campus, Aurora, CO 80045
| | - Paul J. Steiner
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, 80305, USA
| | - Timothy A. Whitehead
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, 80305, USA
| |
|
17
|
Nemoto T, Ocari T, Planul A, Tekinsoy M, Zin EA, Dalkara D, Ferrari U. ACIDES: on-line monitoring of forward genetic screens for protein engineering. Nat Commun 2023; 14:8504. [PMID: 38148337 PMCID: PMC10751290 DOI: 10.1038/s41467-023-43967-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2023] [Accepted: 11/24/2023] [Indexed: 12/28/2023] Open
Abstract
Forward genetic screens of mutated variants are a versatile strategy for protein engineering and investigation, which has been successfully applied to various studies like directed evolution (DE) and deep mutational scanning (DMS). While next-generation sequencing can track millions of variants during the screening rounds, the vast and noisy nature of the sequencing data impedes the estimation of the performance of individual variants. Here, we propose ACIDES that combines statistical inference and in-silico simulations to improve performance estimation in the library selection process by attributing accurate statistical scores to individual variants. We tested ACIDES first on a random-peptide-insertion experiment and then on multiple public datasets from DE and DMS studies. ACIDES allows experimentalists to reliably estimate variant performance on the fly and can aid protein engineering and research pipelines in a range of applications, including gene therapy.
Affiliation(s)
- Takahiro Nemoto
- Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France.
- Graduate School of Informatics, Kyoto University, Yoshida Hon-machi, Sakyo-ku, Kyoto, 606-8501, Japan.
- Premium Research Institute for Human Metaverse Medicine (WPI-PRIMe), Osaka University, Suita, Osaka, 565-0871, Japan.
| | - Tommaso Ocari
- Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France
| | - Arthur Planul
- Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France
| | - Muge Tekinsoy
- Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France
| | - Emilia A Zin
- Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France
| | - Deniz Dalkara
- Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France.
| | - Ulisse Ferrari
- Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France.
| |
|
18
|
Kouba P, Kohout P, Haddadi F, Bushuiev A, Samusevich R, Sedlar J, Damborsky J, Pluskal T, Sivic J, Mazurenko S. Machine Learning-Guided Protein Engineering. ACS Catal 2023; 13:13863-13895. [PMID: 37942269 PMCID: PMC10629210 DOI: 10.1021/acscatal.3c02743] [Citation(s) in RCA: 28] [Impact Index Per Article: 28.0] [Received: 06/15/2023] [Revised: 09/20/2023] [Indexed: 11/10/2023]
Abstract
Recent progress in engineering highly promising biocatalysts has increasingly involved machine learning methods. These methods leverage existing experimental and simulation data to aid in the discovery and annotation of promising enzymes, as well as in suggesting beneficial mutations for improving known targets. The field of machine learning for protein engineering is gathering steam, driven by recent success stories and notable progress in other areas. It already encompasses ambitious tasks such as understanding and predicting protein structure and function, catalytic efficiency, enantioselectivity, protein dynamics, stability, solubility, aggregation, and more. Nonetheless, the field is still evolving, with many challenges to overcome and questions to address. In this Perspective, we provide an overview of ongoing trends in this domain, highlight recent case studies, and examine the current limitations of machine learning-based methods. We emphasize the crucial importance of thorough experimental validation of emerging models before their use for rational protein design. We present our opinions on the fundamental problems and outline the potential directions for future research.
Affiliation(s)
- Petr Kouba
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
- Faculty of Electrical Engineering, Czech Technical University in Prague, Technicka 2, 166 27 Prague 6, Czech Republic
| | - Pavel Kohout
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
| | - Faraneh Haddadi
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
| | - Anton Bushuiev
- Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
| | - Raman Samusevich
- Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo nám. 2, 160 00 Prague 6, Czech Republic
| | - Jiri Sedlar
- Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
| | - Jiri Damborsky
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
| | - Tomas Pluskal
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo nám. 2, 160 00 Prague 6, Czech Republic
| | - Josef Sivic
- Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
| | - Stanislav Mazurenko
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
| |
|
19
|
Smith MD, Case MA, Makowski EK, Tessier PM. Position-Specific Enrichment Ratio Matrix scores predict antibody variant properties from deep sequencing data. Bioinformatics 2023; 39:btad446. [PMID: 37478351 PMCID: PMC10477941 DOI: 10.1093/bioinformatics/btad446] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/05/2023] [Revised: 06/21/2023] [Accepted: 07/20/2023] [Indexed: 07/23/2023] Open
Abstract
MOTIVATION: Deep sequencing of antibody and related protein libraries after phage or yeast-surface display sorting is widely used to identify variants with increased affinity, specificity, and/or improvements in key biophysical properties. Conventional approaches for identifying optimal variants typically use the frequencies of observation in enriched libraries or the corresponding enrichment ratios. However, these approaches disregard the vast majority of deep sequencing data and often fail to identify the best variants in the libraries. RESULTS: Here, we present a method, Position-Specific Enrichment Ratio Matrix (PSERM) scoring, that uses entire deep sequencing datasets from pre- and post-selections to score each observed protein variant. The PSERM scores are the sum of the site-specific enrichment ratios observed at each mutated position. We find that PSERM scores are much more reproducible and correlate more strongly with experimentally measured properties than frequencies or enrichment ratios, including for multiple antibody properties (affinity and non-specific binding) for a clinical-stage antibody (emibetuzumab). We expect that this method will be broadly applicable to diverse protein engineering campaigns. AVAILABILITY AND IMPLEMENTATION: All deep sequencing datasets and code to perform the analyses presented within are available via https://github.com/Tessier-Lab-UMich/PSERM_paper.
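The scoring idea stated in this abstract, summing site-specific enrichment ratios over the mutated positions of a variant, can be sketched as below. This is not the code from the linked repository: the log2 transform, the pseudocount, the 20-letter alphabet, the requirement of equal-length sequences, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def position_frequencies(seqs, pseudocount=1.0):
    """Per-position amino acid frequencies for equal-length sequences."""
    total = len(seqs) + pseudocount * len(AMINO_ACIDS)
    freqs = []
    for i in range(len(seqs[0])):
        counts = Counter(s[i] for s in seqs)
        freqs.append({aa: (counts.get(aa, 0) + pseudocount) / total
                      for aa in AMINO_ACIDS})
    return freqs


def enrichment_matrix(pre, post, pseudocount=1.0):
    """Assumed form: log2(post-selection / pre-selection frequency) per site."""
    f_pre = position_frequencies(pre, pseudocount)
    f_post = position_frequencies(post, pseudocount)
    return [{aa: math.log2(f_post[i][aa] / f_pre[i][aa]) for aa in AMINO_ACIDS}
            for i in range(len(f_pre))]


def pserm_score(variant, wild_type, matrix):
    """Sum the site-specific enrichment values over mutated positions only."""
    return sum(matrix[i][aa]
               for i, (aa, wt) in enumerate(zip(variant, wild_type))
               if aa != wt)


# Toy two-residue library: A and C become enriched after selection.
pre = ["AC", "AD", "GC", "GD"]
post = ["AC", "AC", "AD", "GC"]
matrix = enrichment_matrix(pre, post)
print(pserm_score("AC", "GD", matrix))
```

Because every read contributes to the per-site frequencies, a variant seen only a few times can still be scored from the enrichment of its individual residues, which is the motivation the abstract gives for using the entire dataset.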
Affiliation(s)
- Matthew D Smith
- Department of Chemical Engineering, University of Michigan, Ann Arbor, MI 48109-2200, United States
- Biointerfaces Institute, University of Michigan, Ann Arbor, MI 48109-2200, United States
| | - Marshall A Case
- Department of Chemical Engineering, University of Michigan, Ann Arbor, MI 48109-2200, United States
| | - Emily K Makowski
- Biointerfaces Institute, University of Michigan, Ann Arbor, MI 48109-2200, United States
- Department of Pharmaceutical Sciences, University of Michigan, Ann Arbor, MI 48109-2200, United States
| | - Peter M Tessier
- Department of Chemical Engineering, University of Michigan, Ann Arbor, MI 48109-2200, United States
- Biointerfaces Institute, University of Michigan, Ann Arbor, MI 48109-2200, United States
- Department of Pharmaceutical Sciences, University of Michigan, Ann Arbor, MI 48109-2200, United States
- Department of Biomedical Engineering, University of Michigan, Ann Arbor, MI 48109-2200, United States
- Protein Folding Disease Initiative, University of Michigan, Ann Arbor, MI 48109-2200, United States
- Michigan Alzheimer’s Disease Center, University of Michigan, Ann Arbor, MI 48109-2200, United States
| |
|
20
|
McConnell A, Hackel BJ. Protein engineering via sequence-performance mapping. Cell Syst 2023; 14:656-666. [PMID: 37494931 PMCID: PMC10527434 DOI: 10.1016/j.cels.2023.06.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Revised: 05/10/2023] [Accepted: 06/21/2023] [Indexed: 07/28/2023]
Abstract
Discovery and evolution of new and improved proteins has empowered molecular therapeutics, diagnostics, and industrial biotechnology. Discovery and evolution both require efficient screens and effective libraries, although they differ in their challenges because of the absence or presence, respectively, of an initial protein variant with the desired function. A host of high-throughput technologies, experimental and computational, enable efficient screens to identify performant protein variants. In partnership, an informed search of sequence space is needed to overcome the immensity, sparsity, and complexity of the sequence-performance landscape. Early in the historical trajectory of protein engineering, these elements aligned with distinct approaches to identify the most performant sequence: selection from large, randomized combinatorial libraries versus rational computational design. Substantial advances have now emerged from the synergy of these perspectives. Rational design of combinatorial libraries aids the experimental search of sequence space, and high-throughput, high-integrity experimental data inform computational design. At the core of the collaborative interface, efficient protein characterization (rather than mere selection of optimal variants) maps sequence-performance landscapes. Such quantitative maps elucidate the complex relationships between protein sequence and performance (e.g., binding, catalytic efficiency, biological activity, and developability), thereby advancing fundamental protein science and facilitating protein discovery and evolution.
Affiliation(s)
- Adam McConnell
- Department of Biomedical Engineering, University of Minnesota - Twin Cities, 421 Washington Avenue SE, Minneapolis, MN 55455, USA
| | - Benjamin J Hackel
- Department of Biomedical Engineering, University of Minnesota - Twin Cities, 421 Washington Avenue SE, Minneapolis, MN 55455, USA; Department of Chemical Engineering and Materials Science, University of Minnesota - Twin Cities, 421 Washington Avenue SE, Minneapolis, MN 55455, USA.
|
21
|
Küng C, Vanella R, Nash MA. Directed evolution of Rhodotorula gracilis d-amino acid oxidase using single-cell hydrogel encapsulation and ultrahigh-throughput screening. React Chem Eng 2023; 8:1960-1968. [PMID: 37496730 PMCID: PMC10366730 DOI: 10.1039/d3re00002h] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/03/2023] [Accepted: 04/15/2023] [Indexed: 07/28/2023]
Abstract
Engineering catalytic and biophysical properties of enzymes is an essential step en route to advanced biomedical and industrial applications. Here, we developed a high-throughput screening and directed evolution strategy relying on single-cell hydrogel encapsulation to enhance the performance of d-amino acid oxidase from Rhodotorula gracilis (RgDAAOx), a candidate enzyme for cancer therapy. We used a cascade reaction between RgDAAOx variants surface displayed on yeast and horseradish peroxidase (HRP) in the bulk media to trigger enzyme-mediated crosslinking of phenol-bearing fluorescent alginate macromonomers, resulting in hydrogel formation around single yeast cells. The fluorescent hydrogel capsules served as an artificial phenotype and basis for pooled library screening by fluorescence activated cell sorting (FACS). We screened a RgDAAOx variant library containing ∼10⁶ clones while lowering the d-Ala substrate concentration over three sorting rounds in order to isolate variants with low Km. After three rounds of FACS sorting and regrowth, we isolated and fully characterized four variants displayed on the yeast surface. We identified variants with a more than 5-fold lower Km than the parent sequence, with an apparent increase in substrate binding affinity. The mutations we identified were scattered across the RgDAAOx structure, demonstrating the difficulty in rationally predicting allosteric sites and highlighting the advantages of scalable library screening technologies for evolving catalytic enzymes.
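As a rough illustration of why sorting at decreasing substrate concentration favors low-Km variants, the sketch below evaluates the standard Michaelis-Menten rate law for a parent enzyme and a variant with a 5-fold lower Km. The Km, Vmax, and substrate values are hypothetical placeholders chosen for illustration, not measurements from the study, and equal Vmax for both enzymes is an assumption.

```python
def mm_rate(substrate, vmax, km):
    """Michaelis-Menten rate: v = Vmax * [S] / (Km + [S])."""
    return vmax * substrate / (km + substrate)


# Hypothetical parameters (arbitrary units); not taken from the paper.
VMAX = 1.0
KM_PARENT = 1.0
KM_VARIANT = KM_PARENT / 5.0  # a 5-fold lower Km, as in the abstract


def rate_advantage(substrate):
    """Ratio of variant rate to parent rate at a given substrate level."""
    return mm_rate(substrate, VMAX, KM_VARIANT) / mm_rate(substrate, VMAX, KM_PARENT)


# Well below Km the ratio approaches the Km fold-change; near saturation it vanishes.
low_s_ratio = rate_advantage(0.01)
high_s_ratio = rate_advantage(10.0)
print(f"advantage at low [S]:  {low_s_ratio:.2f}x")
print(f"advantage at high [S]: {high_s_ratio:.2f}x")
```

Under these assumptions the variant is only marginally faster near saturation but close to 5-fold faster at low substrate, which is consistent with the rationale for lowering the d-Ala concentration across sorting rounds.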
Affiliation(s)
- Christoph Küng
- Institute of Physical Chemistry, Department of Chemistry, University of Basel 4058 Basel Switzerland
- Department of Biosystems Science and Engineering, ETH Zurich 4058 Basel Switzerland
| | - Rosario Vanella
- Institute of Physical Chemistry, Department of Chemistry, University of Basel 4058 Basel Switzerland
- Department of Biosystems Science and Engineering, ETH Zurich 4058 Basel Switzerland
| | - Michael A Nash
- Institute of Physical Chemistry, Department of Chemistry, University of Basel 4058 Basel Switzerland
- Department of Biosystems Science and Engineering, ETH Zurich 4058 Basel Switzerland
|
22
|
Smith MD, Case MA, Makowski EK, Tessier PM. Position-Specific Enrichment Ratio Matrix scores predict antibody variant properties from deep sequencing data. bioRxiv 2023:2023.07.10.548448. [PMID: 37503142 PMCID: PMC10369870 DOI: 10.1101/2023.07.10.548448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
Motivation: Deep sequencing of antibody and related protein libraries after phage or yeast-surface display sorting is widely used to identify variants with increased affinity, specificity and/or improvements in key biophysical properties. Conventional approaches for identifying optimal variants typically use the frequencies of observation in enriched libraries or the corresponding enrichment ratios. However, these approaches disregard the vast majority of deep sequencing data and often fail to identify the best variants in the libraries. Results: Here, we present a method, Position-Specific Enrichment Ratio Matrix (PSERM) scoring, that uses entire deep sequencing datasets from pre- and post-selections to score each observed protein variant. The PSERM scores are the sum of the site-specific enrichment ratios observed at each mutated position. We find that PSERM scores are much more reproducible and correlate more strongly with experimentally measured properties than frequencies or enrichment ratios, including for multiple antibody properties (affinity and non-specific binding) for a clinical-stage antibody (emibetuzumab). We expect that this method will be broadly applicable to diverse protein engineering campaigns. Availability: All deep sequencing datasets and code to do the analyses presented within are available via GitHub. Contact: Peter Tessier, ptessier@umich.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
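The scoring idea described in this abstract (summing site-specific enrichment ratios over the mutated positions of a variant) can be sketched as follows. This is a minimal illustration, not the authors' released code: the function names, the pseudocount, the log2 transform of the ratios, and the toy sequences are all assumptions made for the example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def site_frequencies(sequences, pseudocount=1.0):
    """Per-position amino acid frequencies for equal-length sequences.

    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    total = len(sequences) + pseudocount * len(AMINO_ACIDS)
    freqs = []
    for pos in range(len(sequences[0])):
        counts = Counter(seq[pos] for seq in sequences)
        freqs.append({aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS})
    return freqs


def enrichment_matrix(pre, post, pseudocount=1.0):
    """Position-by-residue matrix of log2(post frequency / pre frequency)."""
    f_pre = site_frequencies(pre, pseudocount)
    f_post = site_frequencies(post, pseudocount)
    return [
        {aa: math.log2(f_post[i][aa] / f_pre[i][aa]) for aa in AMINO_ACIDS}
        for i in range(len(f_pre))
    ]


def pserm_score(variant, parent, matrix):
    """Sum the matrix entries at positions where the variant differs from the parent."""
    return sum(
        matrix[i][aa]
        for i, (aa, wt) in enumerate(zip(variant, parent))
        if aa != wt
    )


# Toy two-residue library (hypothetical data).
pre_selection = ["AC", "AD", "GC", "GD"]
post_selection = ["AC", "AC", "AD", "AC"]
matrix = enrichment_matrix(pre_selection, post_selection)
print(pserm_score("AC", "GC", matrix))  # positive: A at position 0 was enriched
print(pserm_score("GD", "GC", matrix))  # negative: D at position 1 was depleted
```

Because every read contributes to the per-site frequencies, a variant can be scored even when its full sequence is rare in the data, which is the advantage over whole-sequence enrichment ratios that the abstract points to.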
|
23
|
Fram B, Truebridge I, Su Y, Riesselman AJ, Ingraham JB, Passera A, Napier E, Thadani NN, Lim S, Roberts K, Kaur G, Stiffler M, Marks DS, Bahl CD, Khan AR, Sander C, Gauthier NP. Simultaneous enhancement of multiple functional properties using evolution-informed protein design. bioRxiv 2023:2023.05.09.539914. [PMID: 37214973 PMCID: PMC10197589 DOI: 10.1101/2023.05.09.539914] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 05/24/2023]
Abstract
Designing optimized proteins is important for a range of practical applications. Protein design is a rapidly developing field that would benefit from approaches that enable many changes in the amino acid primary sequence, rather than a small number of mutations, while maintaining structure and enhancing function. Homologous protein sequences contain extensive information about various protein properties and activities that have emerged over billions of years of evolution. Evolutionary models of sequence co-variation, derived from a set of homologous sequences, have proven effective in a range of applications including structure determination and mutation effect prediction. In this work we apply one of these models (EVcouplings) to computationally design highly divergent variants of the model protein TEM-1 β-lactamase, and characterize these designs experimentally using multiple biochemical and biophysical assays. Nearly all designed variants were functional, including one with 84 mutations from the nearest natural homolog. Surprisingly, all functional designs had large increases in thermostability and most had a broadening of available substrates. These property enhancements occurred while maintaining a nearly identical structure to the wild type enzyme. Collectively, this work demonstrates that evolutionary models of sequence co-variation (1) are able to capture complex epistatic interactions that successfully guide large sequence departures from natural contexts, and (2) can be applied to generate functional diversity useful for many applications in protein design.
Affiliation(s)
- Benjamin Fram
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Ian Truebridge
- Institute for Protein Innovation, Boston, Massachusetts, Boston, MA, USA
- Division of Hematology/Oncology, Boston Children’s Hospital, Harvard Medical School; Boston, MA, USA
- current address: AI Proteins; Boston, MA, USA
| | - Yang Su
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Adam J. Riesselman
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Program in Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - John B. Ingraham
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Alessandro Passera
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- current address: Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Campus-Vienna-Biocenter 1, 1030 Vienna, Austria
| | - Eve Napier
- School of Biochemistry and Immunology, Trinity College Dublin, Dublin 2, Ireland
| | - Nicole N. Thadani
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Samuel Lim
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Kristen Roberts
- Selux Diagnostics, Inc., 56 Roland Street, Charlestown, MA, USA
| | - Gurleen Kaur
- Selux Diagnostics, Inc., 56 Roland Street, Charlestown, MA, USA
| | - Michael Stiffler
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Debora S. Marks
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Christopher D. Bahl
- Institute for Protein Innovation, Boston, Massachusetts, Boston, MA, USA
- Division of Hematology/Oncology, Boston Children’s Hospital, Harvard Medical School; Boston, MA, USA
- current address: AI Proteins; Boston, MA, USA
| | - Amir R. Khan
- School of Biochemistry and Immunology, Trinity College Dublin, Dublin 2, Ireland
- Division of Newborn Medicine, Boston Children’s Hospital, Boston, MA, USA
| | - Chris Sander
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Nicholas P. Gauthier
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
|
24
|
Chen Z, Wang X, Chen X, Huang J, Wang C, Wang J, Wang Z. Accelerating therapeutic protein design with computational approaches toward the clinical stage. Comput Struct Biotechnol J 2023; 21:2909-2926. [PMID: 38213894 PMCID: PMC10781723 DOI: 10.1016/j.csbj.2023.04.027] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 02/15/2023] [Revised: 04/11/2023] [Accepted: 04/27/2023] [Indexed: 01/13/2024] Open
Abstract
Therapeutic proteins, represented by antibodies, are of increasing interest in human medicine. However, clinical translation of therapeutic proteins is still largely hindered by different aspects of developability, including affinity and selectivity, stability and aggregation prevention, solubility and viscosity reduction, and deimmunization. Conventional optimization of developability with widely used methods, like display technologies and library screening approaches, is a time- and cost-intensive endeavor, and the efficiency in finding suitable solutions is still not enough to meet clinical needs. In recent years, the accelerated advancement of computational methodologies has ushered in a transformative era in the field of therapeutic protein design. Owing to their remarkable capabilities in feature extraction and modeling, the integration of cutting-edge computational strategies with conventional techniques presents a promising avenue to accelerate the progression of therapeutic protein design and optimization toward clinical implementation. Here, we compared the differences between therapeutic proteins and small molecules in developability and provided an overview of the computational approaches applicable to the design or optimization of therapeutic proteins in several developability issues.
Affiliation(s)
- Zhidong Chen
- Department of Pathology, The Eighth Affiliated Hospital, Sun Yat-sen University, Shenzhen 518033, China
- School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
| | - Xinpei Wang
- School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
| | - Xu Chen
- School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
| | - Juyang Huang
- School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
| | - Chenglin Wang
- Shenzhen Qiyu Biotechnology Co., Ltd, Shenzhen 518107, China
| | - Junqing Wang
- School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
| | - Zhe Wang
- Department of Pathology, The Eighth Affiliated Hospital, Sun Yat-sen University, Shenzhen 518033, China
|
25
|
Wei H, Li X. Deep mutational scanning: A versatile tool in systematically mapping genotypes to phenotypes. Front Genet 2023; 14:1087267. [PMID: 36713072 PMCID: PMC9878224 DOI: 10.3389/fgene.2023.1087267] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/02/2022] [Accepted: 01/02/2023] [Indexed: 01/13/2023] Open
Abstract
Unveiling how genetic variations lead to phenotypic variations is one of the key questions in evolutionary biology, genetics, and biomedical research. Deep mutational scanning (DMS) technology has allowed the mapping of tens of thousands of genetic variations to phenotypic variations efficiently and economically. Since its first systematic introduction about a decade ago, we have witnessed the use of deep mutational scanning in many research areas leading to scientific breakthroughs. Also, the methods in each step of deep mutational scanning have become much more versatile thanks to the oligo-synthesizing technology, high-throughput phenotyping methods and deep sequencing technology. However, each specific possible step of deep mutational scanning has its pros and cons, and some limitations still await further technological development. Here, we discuss recent scientific accomplishments achieved through the deep mutational scanning and describe widely used methods in each step of deep mutational scanning. We also compare these different methods and analyze their advantages and disadvantages, providing insight into how to design a deep mutational scanning study that best suits the aims of the readers' projects.
Affiliation(s)
- Huijin Wei
- Zhejiang University—University of Edinburgh Institute, Zhejiang University, Haining, Zhejiang, China
| | - Xianghua Li
- Zhejiang University—University of Edinburgh Institute, Zhejiang University, Haining, Zhejiang, China
- Deanery of Biomedical Sciences, University of Edinburgh, Edinburgh, United Kingdom
- The Second Affiliated Hospital of Zhejiang University, Hangzhou, Zhejiang, China
- Biomedical and Health Translational Centre of Zhejiang Province, Haining, Zhejiang, China
|
26
|
Wagner A. Adaptive evolvability through direct selection instead of indirect, second-order selection. J Exp Zool B Mol Dev Evol 2022; 338:395-404. [PMID: 34254439 PMCID: PMC9786751 DOI: 10.1002/jez.b.23071] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/30/2020] [Revised: 05/11/2021] [Accepted: 06/04/2021] [Indexed: 12/30/2022]
Abstract
Can evolvability itself be the product of adaptive evolution? To answer this question is challenging, because any DNA mutation that alters only evolvability is subject to indirect, "second order" selection on the future effects of this mutation. Such indirect selection is weaker than "first-order" selection on mutations that alter fitness, in the sense that it can operate only under restrictive conditions. Here I discuss a route to adaptive evolvability that overcomes this challenge. Specifically, a recent evolution experiment showed that some mutations can enhance both fitness and evolvability through a combination of direct and indirect selection. Unrelated evidence from gene duplication and the evolution of gene regulation suggests that mutations with such dual effects may not be rare. Through such mutations, evolvability may increase at least in part because it provides an adaptive advantage. These observations suggest a research program on the adaptive evolution of evolvability, which aims to identify such mutations and to disentangle their direct fitness effects from their indirect effects on evolvability. If evolvability is itself adaptive, Darwinian evolution may have created more than life's diversity. It may also have helped create the very conditions that made the success of Darwinian evolution possible.
Affiliation(s)
- Andreas Wagner
- Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, Quartier Sorge‐Batiment Genopode, Lausanne, Switzerland
- The Santa Fe Institute, Santa Fe, New Mexico, USA
- Stellenbosch Institute for Advanced Study, Wallenberg Research Centre at Stellenbosch University, Stellenbosch, South Africa
|
27
|
Velecký J, Hamsikova M, Stourac J, Musil M, Damborsky J, Bednar D, Mazurenko S. SoluProtMutDB: a manually curated database of protein solubility changes upon mutations. Comput Struct Biotechnol J 2022; 20:6339-6347. [DOI: 10.1016/j.csbj.2022.11.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Revised: 11/04/2022] [Accepted: 11/04/2022] [Indexed: 11/11/2022] Open
|
28
|
Harman JL, Reardon PN, Costello SM, Warren GD, Phillips SR, Connor PJ, Marqusee S, Harms MJ. Evolution avoids a pathological stabilizing interaction in the immune protein S100A9. Proc Natl Acad Sci U S A 2022; 119:e2208029119. [PMID: 36194634 PMCID: PMC9565474 DOI: 10.1073/pnas.2208029119] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/09/2022] [Accepted: 09/07/2022] [Indexed: 01/03/2023] Open
Abstract
Stability constrains evolution. While much is known about constraints on destabilizing mutations, less is known about the constraints on stabilizing mutations. We recently identified a mutation in the innate immune protein S100A9 that provides insight into such constraints. When introduced into human S100A9, M63F simultaneously increases the stability of the protein and disrupts its natural ability to activate Toll-like receptor 4. Using chemical denaturation, we found that M63F stabilizes a calcium-bound conformation of hS100A9. We then used NMR to solve the structure of the mutant protein, revealing that the mutation distorts the hydrophobic binding surface of hS100A9, explaining its deleterious effect on function. Hydrogen-deuterium exchange (HDX) experiments revealed stabilization of the region around M63F in the structure, notably Phe37. In the structure of the M63F mutant, the Phe37 and Phe63 sidechains are in contact, plausibly forming an edge-face π-stack. Mutating Phe37 to Leu abolished the stabilizing effect of M63F as probed by both chemical denaturation and HDX. It also restored the biological activity of S100A9 disrupted by M63F. These findings reveal that Phe63 creates a molecular staple with Phe37 that stabilizes a nonfunctional conformation of the protein, thus disrupting function. Using a bioinformatic analysis, we found that S100A9 proteins from different organisms rarely have Phe at both positions 37 and 63, suggesting that avoiding a pathological stabilizing interaction indeed constrains S100A9 evolution. This work highlights an important evolutionary constraint on stabilizing mutations, namely, that they must avoid inappropriately stabilizing nonfunctional protein conformations.
Affiliation(s)
- Joseph L. Harman
- Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR 97403
- Institute of Molecular Biology, University of Oregon, Eugene, OR 97403
| | - Patrick N. Reardon
- College of Science, NMR Facility, Oregon State University, Corvallis, OR 97331
| | - Shawn M. Costello
- Biophysics Graduate Program, University of California, Berkeley, Berkeley, CA 94720
| | - Gus D. Warren
- Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR 97403
- Institute of Molecular Biology, University of Oregon, Eugene, OR 97403
| | - Sophia R. Phillips
- Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR 97403
- Institute of Molecular Biology, University of Oregon, Eugene, OR 97403
| | - Patrick J. Connor
- Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR 97403
- Institute of Molecular Biology, University of Oregon, Eugene, OR 97403
| | - Susan Marqusee
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720
- Department of Chemistry, University of California, Berkeley, Berkeley, CA 94720
- California Institute for Quantitative Biosciences, University of California, Berkeley, Berkeley, CA 94720
| | - Michael J. Harms
- Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR 97403
- Institute of Molecular Biology, University of Oregon, Eugene, OR 97403
|
29
|
Nájera-Martínez EF, Melchor-Martínez EM, Sosa-Hernández JE, Levin LN, Parra-Saldívar R, Iqbal HMN. Lignocellulosic residues as supports for enzyme immobilization, and biocatalysts with potential applications. Int J Biol Macromol 2022; 208:748-759. [PMID: 35364201 DOI: 10.1016/j.ijbiomac.2022.03.180] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 02/11/2022] [Revised: 03/21/2022] [Accepted: 03/26/2022] [Indexed: 02/08/2023]
Abstract
Growing demand for agricultural production means a higher quantity of residues produced. The reuse and recycling of agro-industrial wastes reduce worldwide greenhouse emissions. New opportunities are derived from this kind of residuals in the biotechnological field generating valuable products in growing sectors such as transportation, bioenergy, food, and feedstock. The use of natural macromolecules towards biocatalysts offers numerous advantages over free enzymes and friendliness with the environment. Enzyme immobilization improves enzyme properties (stability and reusability), and three types of supports are discussed: inorganic, organic, and hybrid. Several examples of agro-industrial wastes such as coconut wastes, rice husks, corn residues and brewers spent grains (BSG), their properties and potential as supports for enzyme immobilization are described in this work. Before the immobilization, biological and non-biological pretreatments could be performed to enhance the waste potential as a carrier. Additionally, immobilization methods such as covalent binding, adsorption, cross-linking and entrapment are compared to provide high efficiency. Enzymes and biocatalysts for industrial applications offer advantages over traditional chemical processes with respect to sustainability and process efficiency in food, energy, and bioremediation fields. The wastes reviewed in this work demonstrated a high affinity for lipases and laccases and might be used in biodiesel production and textile wastewater treatment, among other applications.
Affiliation(s)
- Laura Noemí Levin
- Universidad de Buenos Aires, Facultad de Ciencias Exactas y Naturales, Dpto. de Biodiversidad y Biología Experimental, Laboratorio de Micología Experimental: INMIBO-CONICET, 1428, Ciudad Autónoma de Buenos Aires, Argentina.
| | - Roberto Parra-Saldívar
- Tecnológico de Monterrey, School of Engineering and Sciences, 64849, Monterrey, NL, Mexico.
| | - Hafiz M N Iqbal
- Tecnológico de Monterrey, School of Engineering and Sciences, 64849, Monterrey, NL, Mexico.
|
30
|
Sunderhaus A, Imran R, Enoh E, Adedeji A, Obafemi T, Abdel Aziz MH. Comparative expression of soluble, active human kinases in specialized bacterial strains. PLoS One 2022; 17:e0267226. [PMID: 35439268 PMCID: PMC9017934 DOI: 10.1371/journal.pone.0267226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2021] [Accepted: 04/05/2022] [Indexed: 11/19/2022] Open
Abstract
Kinases act as molecular switches for cellular functions and are involved in multiple human pathogeneses, most notably cancer. There is a continuous need for soluble and active kinases for in vitro drug discovery and structural biology purposes. Kinases remain challenging to express using Escherichia coli, the most widely utilized host for heterologous expression. In this work, four bacterial strains, BL21 (DE3), BL21 (DE3) pLysS, Rosetta, and Arctic Express, were chosen for parallel expression trials along with BL21 (DE3) complemented with folding chaperones DnaJ/K and GroEL/ES to compare their performance in producing soluble and active human kinases. Three representative diverse kinases were studied: Epidermal Growth Factor Receptor kinase domain, Aurora Kinase A kinase domain, and Mitogen-activated protein Kinase Kinase. The genes encoding the kinases were subcloned into the pET15b bacterial plasmid and transformed into the bacterial strains. Soluble kinase expression was tested using different IPTG concentrations (1–0.05 mM) at varying temperatures (37°C–10°C) and induction times (3–24 hours). The optimum conditions for each kinase in all strains were then used for 1 L large-scale cultures from which each kinase was purified to compare yield, purity, oligomerization status, and activity. Although using specialized strains achieved improvements in yield and/or activity for the three kinases, none of the tested strains was universally superior, highlighting the individuality in kinase expression.
Affiliation(s)
- Allison Sunderhaus
- Department of Pharmaceutical Sciences and Health Outcomes, Fisch College of Pharmacy, The University of Texas at Tyler, Tyler, Texas, United States of America
| | - Ramsha Imran
- Department of Pharmaceutical Sciences and Health Outcomes, Fisch College of Pharmacy, The University of Texas at Tyler, Tyler, Texas, United States of America
| | - Elanzou Enoh
- Department of Pharmaceutical Sciences and Health Outcomes, Fisch College of Pharmacy, The University of Texas at Tyler, Tyler, Texas, United States of America
| | - Adesola Adedeji
- Department of Pharmaceutical Sciences and Health Outcomes, Fisch College of Pharmacy, The University of Texas at Tyler, Tyler, Texas, United States of America
| | - Taiye Obafemi
- Department of Pharmaceutical Sciences and Health Outcomes, Fisch College of Pharmacy, The University of Texas at Tyler, Tyler, Texas, United States of America
| | - May H. Abdel Aziz
- Department of Pharmaceutical Sciences and Health Outcomes, Fisch College of Pharmacy, The University of Texas at Tyler, Tyler, Texas, United States of America
|
31
|
Vasina M, Velecký J, Planas-Iglesias J, Marques SM, Skarupova J, Damborsky J, Bednar D, Mazurenko S, Prokop Z. Tools for computational design and high-throughput screening of therapeutic enzymes. Adv Drug Deliv Rev 2022; 183:114143. [PMID: 35167900 DOI: 10.1016/j.addr.2022.114143] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 12/26/2021] [Revised: 02/04/2022] [Accepted: 02/09/2022] [Indexed: 12/16/2022]
Abstract
Therapeutic enzymes are valuable biopharmaceuticals in various biomedical applications. They have been successfully applied for fibrinolysis, cancer treatment, enzyme replacement therapies, and the treatment of rare diseases. Still, there is a permanent demand to find new or better therapeutic enzymes that are sufficiently soluble, stable, and active to meet specific medical needs. Here, we highlight the benefits of coupling computational approaches with high-throughput experimental technologies, which significantly accelerate the identification and engineering of catalytic therapeutic agents. New enzymes can be identified in genomic and metagenomic databases, which grow exponentially thanks to next-generation sequencing technologies. Computational design and machine learning methods are being developed to improve catalytically potent enzymes and predict their properties to guide the selection of target enzymes. High-throughput experimental pipelines, increasingly relying on microfluidics, ensure functional screening and biochemical characterization of target enzymes to reach efficient therapeutic enzymes.
Affiliation(s)
- Michal Vasina
- Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; International Clinical Research Centre, St. Anne's University Hospital, Pekarska 53, Brno, Czech Republic
| | - Jan Velecký
- Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic
| | - Joan Planas-Iglesias
- Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; International Clinical Research Centre, St. Anne's University Hospital, Pekarska 53, Brno, Czech Republic
| | - Sergio M Marques
- Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; International Clinical Research Centre, St. Anne's University Hospital, Pekarska 53, Brno, Czech Republic
| | - Jana Skarupova
- Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic
| | - Jiri Damborsky
- Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; International Clinical Research Centre, St. Anne's University Hospital, Pekarska 53, Brno, Czech Republic; Enantis, INBIT, Kamenice 34, Brno, Czech Republic
| | - David Bednar
- Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; International Clinical Research Centre, St. Anne's University Hospital, Pekarska 53, Brno, Czech Republic.
| | - Stanislav Mazurenko
- Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; International Clinical Research Centre, St. Anne's University Hospital, Pekarska 53, Brno, Czech Republic.
| | - Zbynek Prokop
- Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; International Clinical Research Centre, St. Anne's University Hospital, Pekarska 53, Brno, Czech Republic.
|
32
|
Environmental selection and epistasis in an empirical phenotype-environment-fitness landscape. Nat Ecol Evol 2022; 6:427-438. [PMID: 35210579 DOI: 10.1038/s41559-022-01675-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/19/2021] [Accepted: 12/14/2021] [Indexed: 11/08/2022]
Abstract
Fitness landscapes, mappings of genotype/phenotype to their effects on fitness, are invaluable concepts in evolutionary biochemistry. Although widely discussed, measurements of phenotype-fitness landscapes in proteins remain scarce. Here, we quantify all single mutational effects on fitness and phenotype (EC50) of VIM-2 β-lactamase across a 64-fold range of ampicillin concentrations. We then construct a phenotype-fitness landscape that takes variations in environmental selection pressure into account. We found that a simple, empirical landscape accurately models the ~39,000 mutational data points, suggesting that the evolution of VIM-2 can be predicted on the basis of the selection environment. Our landscape provides new quantitative knowledge on the evolution of the β-lactamases and proteins in general, particularly their evolutionary dynamics under subinhibitory antibiotic concentrations, as well as the mechanisms and environmental dependence of non-specific epistasis.
|
33
|
Schulz L, Sendker FL, Hochberg GKA. Non-adaptive complexity and biochemical function. Curr Opin Struct Biol 2022; 73:102339. [PMID: 35247750 DOI: 10.1016/j.sbi.2022.102339] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/11/2021] [Revised: 12/06/2021] [Accepted: 01/24/2022] [Indexed: 11/25/2022]
Abstract
Intricate biochemical structures are usually thought to be useful, because natural selection preserves them from degradation by a constant hail of destructive mutations. Biochemists therefore often deliberately disrupt them to understand how complexity improves protein function or fitness. However, evolutionary theory suggests that even useless complexity that never improved fitness can become completely essential if a simple set of evolutionary conditions is fulfilled. We review evidence that stable protein complexes, protein-chaperone interactions, and complexes consisting of several paralogs all fulfill these conditions. This makes reverse genetics or destructive mutagenesis unsuitable for assigning functions to these kinds of complexity. Instead, we advocate that incorporating evolutionary approaches into biochemistry overcomes this difficulty and allows us to distinguish useless from useful biochemical complexity.
Affiliation(s)
- Luca Schulz
- Max Planck Institute for Terrestrial Microbiology, Karl-von-Frisch Straße 10, 35043 Marburg, Germany. https://twitter.com/schulluc
| | - Franziska L Sendker
- Max Planck Institute for Terrestrial Microbiology, Karl-von-Frisch Straße 10, 35043 Marburg, Germany. https://twitter.com/SendkerFL
| | - Georg K A Hochberg
- Max Planck Institute for Terrestrial Microbiology, Karl-von-Frisch Straße 10, 35043 Marburg, Germany; Department of Chemistry, University of Marburg, Hans-Meerwein-Straße 4, 35032 Marburg, Germany; Center for Synthetic Microbiology (SYNMIKRO), Hans-Meerwein-Straße 6, 35032 Marburg, Germany.
|
34
|
Vanella R, Kovacevic G, Doffini V, Fernández de Santaella J, Nash MA. High-throughput screening, next generation sequencing and machine learning: advanced methods in enzyme engineering. Chem Commun (Camb) 2022; 58:2455-2467. [PMID: 35107442 PMCID: PMC8851469 DOI: 10.1039/d1cc04635g] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 08/20/2021] [Accepted: 01/23/2022] [Indexed: 12/29/2022]
Abstract
Enzyme engineering is an important biotechnological process capable of generating tailored biocatalysts for applications in industrial chemical conversion and biopharma. Typical enhancements sought in enzyme engineering and in vitro evolution campaigns include improved folding stability, catalytic activity, and/or substrate specificity. Despite significant progress in recent years in the areas of high-throughput screening and DNA sequencing, our ability to explore the vast space of functional enzyme sequences remains severely limited. Here, we review the currently available suite of modern methods for enzyme engineering, with a focus on novel readout systems based on enzyme cascades, and new approaches to reaction compartmentalization including single-cell hydrogel encapsulation techniques to achieve a genotype-phenotype link. We further summarize systematic scanning mutagenesis approaches and their merger with deep mutational scanning and massively parallel next-generation DNA sequencing technologies to generate mutability landscapes. Finally, we discuss the implementation of machine learning models for computational prediction of enzyme phenotypic fitness from sequence. This broad overview of current state-of-the-art approaches for enzyme engineering and evolution will aid newcomers and experienced researchers alike in identifying the important challenges that should be addressed to move the field forward.
Affiliation(s)
- Rosario Vanella
- Department of Chemistry, University of Basel, 4058 Basel, Switzerland
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland.
| | - Gordana Kovacevic
- Department of Chemistry, University of Basel, 4058 Basel, Switzerland
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland.
| | - Vanni Doffini
- Department of Chemistry, University of Basel, 4058 Basel, Switzerland
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland.
| | - Jaime Fernández de Santaella
- Department of Chemistry, University of Basel, 4058 Basel, Switzerland
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland.
| | - Michael A Nash
- Department of Chemistry, University of Basel, 4058 Basel, Switzerland
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland.
|
35
|
Leonard AC, Weinstein JJ, Steiner PJ, Erbse AH, Fleishman SJ, Whitehead TA. Stabilization of the SARS-CoV-2 receptor binding domain by protein core redesign and deep mutational scanning. Protein Eng Des Sel 2022; 35:gzac002. [PMID: 35325236 PMCID: PMC9077414 DOI: 10.1093/protein/gzac002] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 11/22/2021] [Revised: 01/21/2022] [Accepted: 02/16/2022] [Indexed: 11/12/2022] Open
Abstract
Stabilizing antigenic proteins as vaccine immunogens or diagnostic reagents is a stringent case of protein engineering and design as the exterior surface must maintain recognition by receptor(s) and antigen-specific antibodies at multiple distinct epitopes. This is a challenge, as stability enhancing mutations must be focused on the protein core, whereas successful computational stabilization algorithms typically select mutations at solvent-facing positions. In this study, we report the stabilization of SARS-CoV-2 Wuhan Hu-1 Spike receptor binding domain using a combination of deep mutational scanning and computational design, including the FuncLib algorithm. Our most successful design encodes I358F, Y365W, T430I, and I513L receptor binding domain mutations, maintains recognition by the receptor ACE2 and a panel of different anti-receptor binding domain monoclonal antibodies, is between 1 and 2°C more thermally stable than the original receptor binding domain using a thermal shift assay, and is less proteolytically sensitive to chymotrypsin and thermolysin than the original receptor binding domain. Our approach could be applied to the computational stabilization of a wide range of proteins without requiring detailed knowledge of active sites or binding epitopes. We envision that this strategy may be particularly powerful for cases when there are multiple or unknown binding sites.
Affiliation(s)
- Alison C Leonard
- Department of Chemical and Biological Engineering, University of Colorado, Boulder, CO 80303, USA
| | - Jonathan J Weinstein
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 7610001, Israel
| | - Paul J Steiner
- Department of Chemical and Biological Engineering, University of Colorado, Boulder, CO 80303, USA
| | - Annette H Erbse
- Department of Biochemistry, University of Colorado, Boulder, CO 80303, USA
| | - Sarel J Fleishman
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 7610001, Israel
| | - Timothy A Whitehead
- Department of Chemical and Biological Engineering, University of Colorado, Boulder, CO 80303, USA
|
36
|
Sorokina I, Mushegian AR, Koonin EV. Is Protein Folding a Thermodynamically Unfavorable, Active, Energy-Dependent Process? Int J Mol Sci 2022; 23:521. [PMID: 35008947 PMCID: PMC8745595 DOI: 10.3390/ijms23010521] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 12/06/2021] [Revised: 12/30/2021] [Accepted: 12/31/2021] [Indexed: 02/04/2023] Open
Abstract
The prevailing current view of protein folding is the thermodynamic hypothesis, under which the native folded conformation of a protein corresponds to the global minimum of Gibbs free energy G. We question this concept and show that the empirical evidence behind the thermodynamic hypothesis of folding is far from strong. Furthermore, physical theory-based approaches to the prediction of protein folds and their folding pathways so far have invariably failed except for some very small proteins, despite decades of intensive theory development and the enormous increase of computer power. The recent spectacular successes in protein structure prediction owe to evolutionary modeling of amino acid sequence substitutions enhanced by deep learning methods, but even these breakthroughs provide no information on the protein folding mechanisms and pathways. We discuss an alternative view of protein folding, under which the native state of most proteins does not occupy the global free energy minimum, but rather, a local minimum on a fluctuating free energy landscape. We further argue that ΔG of folding is likely to be positive for the majority of proteins, which therefore fold into their native conformations only through interactions with the energy-dependent molecular machinery of living cells, in particular, the translation system and chaperones. Accordingly, protein folding should be modeled as it occurs in vivo, that is, as a non-equilibrium, active, energy-dependent process.
Affiliation(s)
| | - Arcady R. Mushegian
- Division of Molecular and Cellular Biosciences, National Science Foundation, Alexandria, VA 22314, USA;
- Clare Hall College, University of Cambridge, Cambridge CB3 9AL, UK
| | - Eugene V. Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
37
|
Boock JT, Taw M, King BC, Conrado RJ, Gibson DM, DeLisa MP. Two-Tiered Selection and Screening Strategy to Increase Functional Enzyme Production in E. coli. Methods Mol Biol 2022; 2406:169-187. [PMID: 35089557 DOI: 10.1007/978-1-0716-1859-2_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Development of recombinant enzymes as industrial biocatalysts or metabolic pathway elements requires soluble expression of active protein. Here we present a two-step strategy, combining a directed evolution selection with an enzyme activity screen, to increase the soluble production of enzymes in the cytoplasm of E. coli. The directed evolution component relies on the innate quality control of the twin-arginine translocation pathway coupled with antibiotic selection to isolate point mutations that promote intracellular solubility. A secondary screen is applied to ensure the solubility enhancement has not compromised enzyme activity. This strategy has been successfully applied to increase the soluble production of a fungal endocellulase by 30-fold in E. coli without change in enzyme specific activity through two rounds of directed evolution.
Affiliation(s)
- Jason T Boock
- Robert F. Smith School of Chemical and Biomolecular Engineering, Cornell University, Ithaca, NY, USA.
- Department of Chemical, Paper and Biomedical Engineering, Miami University (OH), Oxford, OH, USA.
| | - May Taw
- Department of Microbiology, Cornell University, Ithaca, NY, USA
| | - Brian C King
- Department of Plant Pathology and Plant-Microbe Biology, Cornell University, Ithaca, NY, USA
| | - Robert J Conrado
- Robert F. Smith School of Chemical and Biomolecular Engineering, Cornell University, Ithaca, NY, USA
| | - Donna M Gibson
- Department of Plant Pathology and Plant-Microbe Biology, Cornell University, Ithaca, NY, USA
- USDA Agricultural Research Service, Robert W. Holley Center for Agriculture and Health, Ithaca, NY, USA
| | - Matthew P DeLisa
- Robert F. Smith School of Chemical and Biomolecular Engineering, Cornell University, Ithaca, NY, USA
|
38
|
Leonard AC, Weinstein JJ, Steiner PJ, Erbse AH, Fleishman SJ, Whitehead TA. Stabilization of the SARS-CoV-2 Receptor Binding Domain by Protein Core Redesign and Deep Mutational Scanning. [PMID: 34845448 PMCID: PMC8629191 DOI: 10.1101/2021.11.22.469552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
Abstract
Stabilizing antigenic proteins as vaccine immunogens or diagnostic reagents is a stringent case of protein engineering and design as the exterior surface must maintain recognition by receptor(s) and antigen-specific antibodies at multiple distinct epitopes. This is a challenge, as stability-enhancing mutations must be focused on the protein core, whereas successful computational stabilization algorithms typically select mutations at solvent-facing positions. In this study we report the stabilization of SARS-CoV-2 Wuhan Hu-1 Spike receptor binding domain (S RBD) using a combination of deep mutational scanning and computational design, including the FuncLib algorithm. Our most successful design encodes I358F, Y365W, T430I, and I513L RBD mutations, maintains recognition by the receptor ACE2 and a panel of different anti-RBD monoclonal antibodies, is between 1–2°C more thermally stable than the original RBD using a thermal shift assay, and is less proteolytically sensitive to chymotrypsin and thermolysin than the original RBD. Our approach could be applied to the computational stabilization of a wide range of proteins without requiring detailed knowledge of active sites or binding epitopes, and may be particularly powerful for cases when there are multiple or unknown binding sites.
|
39
|
Kuiper BP, Prins RC, Billerbeck S. Oligo Pools as an Affordable Source of Synthetic DNA for Cost-Effective Library Construction in Protein- and Metabolic Pathway Engineering. Chembiochem 2021; 23:e202100507. [PMID: 34817110 PMCID: PMC9300125 DOI: 10.1002/cbic.202100507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2021] [Revised: 11/23/2021] [Indexed: 11/11/2022]
Abstract
The construction of custom libraries is critical for rational protein engineering and directed evolution. Array‐synthesized oligo pools of thousands of user‐defined sequences (up to ∼350 bases in length) have emerged as a low‐cost commercially available source of DNA. These pools cost ≤10 % (depending on error rate and length) of other commercial sources of custom DNA, and this significant cost difference can determine whether an enzyme engineering project can be realized on a given research budget. However, while being cheap, oligo pools do suffer from a low concentration of individual oligos and relatively high error rates. Several powerful techniques that specifically make use of oligo pools have been developed and proven valuable or even essential for next‐generation protein and pathway engineering strategies, such as sequence‐function mapping, enzyme minimization, or de‐novo design. Here we consolidate the knowledge on these techniques and their applications to facilitate the use of oligo pools within the protein engineering community.
Affiliation(s)
- Bastiaan P Kuiper
- Molecular Microbiology, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, The Netherlands
| | - Rianne C Prins
- Molecular Microbiology, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, The Netherlands
| | - Sonja Billerbeck
- Molecular Microbiology, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, The Netherlands
|
40
|
Munro LJ, Kell DB. Intelligent host engineering for metabolic flux optimisation in biotechnology. Biochem J 2021; 478:3685-3721. [PMID: 34673920 PMCID: PMC8589332 DOI: 10.1042/bcj20210535] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/25/2021] [Revised: 09/22/2021] [Accepted: 09/24/2021] [Indexed: 12/13/2022]
Abstract
Optimising the function of a protein of length N amino acids by directed evolution involves navigating a 'search space' of possible sequences of some 20^N. Optimising the expression levels of P proteins that materially affect host performance, each of which might also take 20 (logarithmically spaced) values, implies a similar search space of 20^P. In this combinatorial sense, then, the problems of directed protein evolution and of host engineering are broadly equivalent. In practice, however, they have different means for avoiding the inevitable difficulties of implementation. The spare capacity exhibited in metabolic networks implies that host engineering may admit substantial increases in flux to targets of interest. Thus, we rehearse the relevant issues for those wishing to understand and exploit those modern genome-wide host engineering tools and thinking that have been designed and developed to optimise fluxes towards desirable products in biotechnological processes, with a focus on microbial systems. The aim throughout is 'making such biology predictable'. Strategies have been aimed at both transcription and translation, especially for regulatory processes that can affect multiple targets. However, because there is a limit on how much protein a cell can produce, increasing kcat in selected targets may be a better strategy than increasing protein expression levels for optimal host engineering.
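To give a sense of scale for the combinatorial argument in this abstract, the following is a minimal sketch that evaluates a search space of 20 to the power N (or P). The function names are hypothetical and chosen only for illustration; they do not come from the paper.

```python
import math


def search_space_size(n_positions: int, n_choices: int = 20) -> int:
    """Exact number of combinations when each of n_positions takes one of n_choices values."""
    # Python integers have arbitrary precision, so this stays exact for large N.
    return n_choices ** n_positions


def log10_search_space(n_positions: int, n_choices: int = 20) -> float:
    """Order of magnitude (base-10 logarithm) of the same search space."""
    return n_positions * math.log10(n_choices)


if __name__ == "__main__":
    # Illustrative lengths only; they are not taken from the abstract.
    for n in (2, 10, 100):
        print(f"N = {n}: about 10^{log10_search_space(n):.1f} combinations")
```

The same function covers both cases in the abstract: N amino acid positions with 20 residues each, or P proteins with 20 expression levels each.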
Affiliation(s)
- Lachlan J. Munro
- Novo Nordisk Foundation Centre for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kgs. Lyngby, Denmark
| | - Douglas B. Kell
- Novo Nordisk Foundation Centre for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kgs. Lyngby, Denmark
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Crown St, Liverpool L69 7ZB, U.K
- Mellizyme Biotechnology Ltd, IC1, Liverpool Science Park, 131 Mount Pleasant, Liverpool L3 5TF, U.K
|
41
|
Heyne M, Shirian J, Cohen I, Peleg Y, Radisky ES, Papo N, Shifman JM. Climbing Up and Down Binding Landscapes through Deep Mutational Scanning of Three Homologous Protein-Protein Complexes. J Am Chem Soc 2021; 143:17261-17275. [PMID: 34609866 PMCID: PMC8532158 DOI: 10.1021/jacs.1c08707] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/17/2021] [Indexed: 12/18/2022]
Abstract
Protein-protein interactions (PPIs) have evolved to display binding affinities that can support their function. As such, cognate and noncognate PPIs could be highly similar structurally but exhibit huge differences in binding affinities. To understand this phenomenon, we study three homologous protease-inhibitor PPIs that span 9 orders of magnitude in binding affinity. Using state-of-the-art methodology that combines protein randomization, affinity sorting, deep sequencing, and data normalization, we report quantitative binding landscapes consisting of ΔΔGbind values for the three PPIs, gleaned from tens of thousands of single and double mutations. We show that binding landscapes of the three complexes are strikingly different and depend on the PPI evolutionary optimality. We observe different patterns of couplings between mutations for the three PPIs with negative and positive epistasis appearing most frequently at hot-spot and cold-spot positions, respectively. The evolutionary trends observed here are likely to be universal to other biological complexes in the cell.
Affiliation(s)
- Michael Heyne
- Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, 9190401, Israel
- Avram and Stella Goldstein-Goren Department of Biotechnology Engineering and the National Institute of Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, 8410501, Israel
| | - Jason Shirian
- Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, 9190401, Israel
| | - Itay Cohen
- Avram and Stella Goldstein-Goren Department of Biotechnology Engineering and the National Institute of Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, 8410501, Israel
| | - Yoav Peleg
- Life Sciences Core Facilities (LSCF) Structural Proteomics Unit (SPU), Weizmann Institute of Science, Rehovot, 7610001, Israel
| | - Evette S. Radisky
- Department of Cancer Biology, Mayo Clinic Comprehensive Cancer Center, Jacksonville, Florida 32224, United States
| | - Niv Papo
- Avram and Stella Goldstein-Goren Department of Biotechnology Engineering and the National Institute of Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, 8410501, Israel
| | - Julia M. Shifman
- Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, 9190401, Israel
|
42
|
Zutz A, Hamborg L, Pedersen LE, Kassem MM, Papaleo E, Koza A, Herrgård MJ, Jensen SI, Teilum K, Lindorff-Larsen K, Nielsen AT. A dual-reporter system for investigating and optimizing protein translation and folding in E. coli. Nat Commun 2021; 12:6093. [PMID: 34667164 PMCID: PMC8526717 DOI: 10.1038/s41467-021-26337-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/13/2020] [Accepted: 10/01/2021] [Indexed: 01/29/2023] Open
Abstract
Strategies for investigating and optimizing the expression and folding of proteins for biotechnological and pharmaceutical purposes are in high demand. Here, we describe a dual-reporter biosensor system that simultaneously assesses in vivo protein translation and protein folding, thereby enabling rapid screening of mutant libraries. We have validated the dual-reporter system on five different proteins and find an excellent correlation between reporter signals and the levels of protein expression and solubility of the proteins. We further demonstrate the applicability of the dual-reporter system as a screening assay for deep mutational scanning experiments. The system enables high-throughput selection of protein variants with high expression levels and altered protein stability. Next-generation sequencing analysis of the resulting libraries of protein variants shows a good correlation between computationally predicted and experimentally determined protein stabilities. We furthermore show that the mutational experimental data obtained using this system may be useful for protein structure calculations.
Affiliation(s)
- Ariane Zutz
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet 220, 2800 Kgs, Lyngby, Denmark
| | - Louise Hamborg
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet 220, 2800 Kgs, Lyngby, Denmark
- Structural Biology and NMR Laboratory, Department of Biology, University of Copenhagen, Ole Maaloes Vej 5, 2200, Copenhagen N, Denmark
| | - Lasse Ebdrup Pedersen
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet 220, 2800 Kgs, Lyngby, Denmark
| | - Maher M Kassem
- Structural Biology and NMR Laboratory, Department of Biology, University of Copenhagen, Ole Maaloes Vej 5, 2200, Copenhagen N, Denmark
| | - Elena Papaleo
- Structural Biology and NMR Laboratory, Department of Biology, University of Copenhagen, Ole Maaloes Vej 5, 2200, Copenhagen N, Denmark
| | - Anna Koza
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet 220, 2800 Kgs, Lyngby, Denmark
| | - Markus J Herrgård
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet 220, 2800 Kgs, Lyngby, Denmark
| | - Sheila Ingemann Jensen
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet 220, 2800 Kgs, Lyngby, Denmark
| | - Kaare Teilum
- Structural Biology and NMR Laboratory, Department of Biology, University of Copenhagen, Ole Maaloes Vej 5, 2200, Copenhagen N, Denmark
| | - Kresten Lindorff-Larsen
- Structural Biology and NMR Laboratory, Department of Biology, University of Copenhagen, Ole Maaloes Vej 5, 2200, Copenhagen N, Denmark
| | - Alex Toftgaard Nielsen
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet 220, 2800 Kgs, Lyngby, Denmark.
|
43
|
Galanie S, Entwistle D, Lalonde J. Engineering biosynthetic enzymes for industrial natural product synthesis. Nat Prod Rep 2021; 37:1122-1143. [PMID: 32364202 DOI: 10.1039/c9np00071b] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Indexed: 12/12/2022]
Abstract
Covering: 2000 to 2020. Natural products and their derivatives are commercially important medicines, agrochemicals, flavors, fragrances, and food ingredients. Industrial strategies to produce these structurally complex molecules encompass varied combinations of chemical synthesis, biocatalysis, and extraction from natural sources. Interest in engineering natural product biosynthesis began with the advent of genetic tools for pathway discovery. Genes and strains can now readily be synthesized, mutated, recombined, and sequenced. Enzyme engineering has succeeded commercially due to the development of genetic methods, analytical technologies, and machine learning algorithms. Today, engineered biosynthetic enzymes from organisms spanning the tree of life are used industrially to produce diverse molecules. These biocatalytic processes include single enzymatic steps, multienzyme cascades, and engineered native and heterologous microbial strains. This review will describe how biosynthetic enzymes have been engineered to enable commercial and near-commercial syntheses of natural products and their analogs.
Affiliation(s)
- Stephanie Galanie
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA.
| | - David Entwistle
- Process Chemistry, Codexis, Inc., Redwood City, California, USA
| | - James Lalonde
- Microbial Digital Genome Engineering, Inscripta, Inc., Pleasanton, California, USA
|
44
|
Optimization of protein trans-splicing in an inducible plasmid display system for high-throughput screening and selection of soluble proteins. Enzyme Microb Technol 2021; 153:109914. [PMID: 34670187 DOI: 10.1016/j.enzmictec.2021.109914] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/28/2021] [Revised: 09/01/2021] [Accepted: 09/03/2021] [Indexed: 11/24/2022]
Abstract
Directed evolution is widely used to optimize protein folding and solubility in cells. Although the screening and selection of desired mutants is an essential step in directed evolution, it generally requires laborious optimization and/or specialized equipment. With a view toward designing a more practical procedure, we previously developed an inducible plasmid display system, in which the intein (auto-processing) and Oct-1 DNA-binding (DBD) domains were used as the protein trans-splicing domain and DNA-binding module, respectively. Specifically, the N-terminal (CfaN) and C-terminal (CfaC) domains of intein were fused to the C-terminal end of the His-tag and the N-terminal end of Oct-1 DBD to generate His6-CfaN and CfaC-Oct-1, respectively. For such a system to be viable, the efficiency of protein trans-splicing without the protein of interest (POI) should be maximized, such that the probability of occurrence is solely dependent on the solubility of the POI. To this end, we initially prevented the degradation of l-arabinose (the inducer of the PBAD promoter) by employing an Escherichia coli host strain deficient in the metabolism of l-arabinose. Given that a low expression of His6-CfaN, compared with that of CfaC-Oct-1, was found to be conducive to the generation of a soluble product of the protein trans-splicing event, we designed the expression of His6-CfaN and CfaC-Oct-1 to be inducible from the PBAD and PT7 promoters, respectively. The optimized system thus obtained enabled in vitro selection of the plasmid-protein complex with high yield. We believe that the inducible plasmid display system developed in this study would be applicable to high-throughput screening and/or selection of protein variants with enhanced solubility.
|
45
|
Yang Y, Zeng L, Vihinen M. PON-Sol2: Prediction of Effects of Variants on Protein Solubility. Int J Mol Sci 2021; 22:8027. [PMID: 34360790 PMCID: PMC8348231 DOI: 10.3390/ijms22158027] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 06/11/2021] [Revised: 07/19/2021] [Accepted: 07/22/2021] [Indexed: 01/13/2023] Open
Abstract
Genetic variations have a multitude of effects on proteins. A substantial number of variations affect protein-solvent interactions, either aggregation or solubility. Aggregation is often related to structural alterations, whereas solubilizable proteins in the solid phase can be made again soluble by dilution. Solubility is a central protein property and when reduced can lead to diseases. We developed a prediction method, PON-Sol2, to identify amino acid substitutions that increase, decrease, or have no effect on the protein solubility. The method is a machine learning tool utilizing gradient boosting algorithm and was trained on a large dataset of variants with different outcomes after the selection of features among a large number of tested properties. The method is fast and has high performance. The normalized correct prediction rate for three states is 0.656, and the normalized GC2 score is 0.312 in 10-fold cross-validation. The corresponding numbers in the blind test were 0.545 and 0.157. The performance was superior in comparison to previous methods. The PON-Sol2 predictor is freely available. It can be used to predict the solubility effects of variants for any organism, even in large-scale projects.
Affiliation(s)
- Yang Yang
- School of Computer Science and Technology, Soochow University, Suzhou 215006, China
| | - Lianjie Zeng
- School of Computer Science and Technology, Soochow University, Suzhou 215006, China
- Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210000, China
| | - Mauno Vihinen
- Department of Experimental Medical Science, Lund University, BMC B13, SE-221 84 Lund, Sweden
|
46
|
Markin CJ, Mokhtari DA, Sunden F, Appel MJ, Akiva E, Longwell SA, Sabatti C, Herschlag D, Fordyce PM. Revealing enzyme functional architecture via high-throughput microfluidic enzyme kinetics. Science 2021; 373:eabf8761. [PMID: 34437092 DOI: 10.1126/science.abf8761] [Citation(s) in RCA: 91] [Impact Index Per Article: 30.3] [Received: 11/25/2020] [Accepted: 05/24/2021] [Indexed: 12/21/2022]
Abstract
Systematic and extensive investigation of enzymes is needed to understand their extraordinary efficiency and meet current challenges in medicine and engineering. We present HT-MEK (High-Throughput Microfluidic Enzyme Kinetics), a microfluidic platform for high-throughput expression, purification, and characterization of more than 1500 enzyme variants per experiment. For 1036 mutants of the alkaline phosphatase PafA (phosphate-irrepressible alkaline phosphatase of Flavobacterium), we performed more than 670,000 reactions and determined more than 5000 kinetic and physical constants for multiple substrates and inhibitors. We uncovered extensive kinetic partitioning to a misfolded state and isolated catalytic effects, revealing spatially contiguous regions of residues linked to particular aspects of function. Regions included active-site proximal residues but extended to the enzyme surface, providing a map of underlying architecture not possible to derive from existing approaches. HT-MEK has applications that range from understanding molecular mechanisms to medicine, engineering, and design.
Affiliation(s)
- C J Markin
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
| | - D A Mokhtari
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
| | - F Sunden
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
| | - M J Appel
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
| | - E Akiva
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA 94158, USA
| | - S A Longwell
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - C Sabatti
- Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA; Department of Statistics, Stanford University, Stanford, CA 94305, USA
| | - D Herschlag
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA; Department of Chemical Engineering, Stanford University, Stanford, CA 94305, USA; ChEM-H Institute, Stanford University, Stanford, CA 94305, USA
| | - P M Fordyce
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA; ChEM-H Institute, Stanford University, Stanford, CA 94305, USA; Department of Genetics, Stanford University, Stanford, CA 94305, USA; Chan Zuckerberg Biohub, San Francisco, CA 94110, USA
|
47
|
The Carbapenemase BKC-1 from Klebsiella pneumoniae Is Adapted for Translocation by Both the Tat and Sec Translocons. mBio 2021; 12:e0130221. [PMID: 34154411 PMCID: PMC8262980 DOI: 10.1128/mbio.01302-21] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/11/2022]
Abstract
The cell envelope of Gram-negative bacteria consists of two membranes surrounding the periplasm and peptidoglycan layer. β-Lactam antibiotics target the periplasmic penicillin-binding proteins that synthesize peptidoglycan, resulting in cell death. The primary means by which bacterial species resist the effects of β-lactam drugs is to populate the periplasmic space with β-lactamases. Resistance to β-lactam drugs is spread by lateral transfer of genes encoding β-lactamases from one species of bacteria to another. However, the resistance phenotype depends in turn on these “alien” protein sequences being recognized and exported across the cytoplasmic membrane by either the Sec or Tat protein translocation machinery of the new bacterial host. Here, we examine BKC-1, a carbapenemase from an unknown bacterial source that has been identified in a single clinical isolate of Klebsiella pneumoniae. BKC-1 was shown to be located in the periplasm, and functional in both K. pneumoniae and Escherichia coli. Sequence analysis revealed the presence of an unusual signal peptide with a twin arginine motif and a duplicated hydrophobic region. Biochemical assays showed this signal peptide directs BKC-1 for translocation by both Sec and Tat translocons. This is one of the few descriptions of a periplasmic protein that is functionally translocated by both export pathways in the same organism, and we suggest it represents a snapshot of evolution for a β-lactamase adapting to functionality in a new host.
|
48
|
Bepler T, Berger B. Learning the protein language: Evolution, structure, and function. Cell Syst 2021; 12:654-669.e3. [PMID: 34139171 PMCID: PMC8238390 DOI: 10.1016/j.cels.2021.05.017] [Citation(s) in RCA: 172] [Impact Index Per Article: 57.3] [Received: 01/16/2021] [Revised: 05/20/2021] [Accepted: 05/20/2021] [Indexed: 02/06/2023]
Abstract
Language models have recently emerged as a powerful machine-learning approach for distilling information from massive protein sequence databases. From readily available sequence data alone, these models discover evolutionary, structural, and functional organization across protein space. Using language models, we can encode amino-acid sequences into distributed vector representations that capture their structural and functional properties, as well as evaluate the evolutionary fitness of sequence variants. We discuss recent advances in protein language modeling and their applications to downstream protein property prediction problems. We then consider how these models can be enriched with prior biological knowledge and introduce an approach for encoding protein structural knowledge into the learned representations. The knowledge distilled by these models allows us to improve downstream function prediction through transfer learning. Deep protein language models are revolutionizing protein biology. They suggest new ways to approach protein and therapeutic design. However, further developments are needed to encode strong biological priors into protein language models and to increase their accessibility to the broader community.
Affiliation(s)
- Tristan Bepler
- Simons Machine Learning Center, New York Structural Biology Center, New York, NY, USA; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA; Computational and Systems Biology Program, Massachusetts Institute of Technology, Cambridge, MA, USA.
| | - Bonnie Berger
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, USA.
|
49
|
Burton TD, Eyre NS. Applications of Deep Mutational Scanning in Virology. Viruses 2021; 13:1020. [PMID: 34071591 PMCID: PMC8227372 DOI: 10.3390/v13061020] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/29/2021] [Revised: 05/26/2021] [Accepted: 05/26/2021] [Indexed: 12/20/2022]
Abstract
Several recently developed high-throughput techniques have changed the field of molecular virology. For example, proteomics studies reveal complete interactomes of a viral protein, genome-wide CRISPR knockout and activation screens probe the importance of every single human gene in aiding or fighting a virus, and ChIP-seq experiments reveal genome-wide epigenetic changes in response to infection. Deep mutational scanning is a relatively novel form of protein science which allows the in-depth functional analysis of every nucleotide within a viral gene or genome, revealing regions of importance, flexibility, and mutational potential. In this review, we discuss the application of this technique to RNA viruses including members of the Flaviviridae family, Influenza A Virus and Severe Acute Respiratory Syndrome Coronavirus 2. We also briefly discuss the reverse genetics systems which allow for analysis of viral replication cycles, next-generation sequencing technologies and the bioinformatics tools that facilitate this research.
Affiliation(s)
| | - Nicholas S. Eyre
- College of Medicine and Public Health, Flinders University, Bedford Park, SA 5042, Australia
|
50
|
An integrative approach to improving the biocatalytic reactions of whole cells expressing recombinant enzymes. World J Microbiol Biotechnol 2021; 37:105. [PMID: 34037845 DOI: 10.1007/s11274-021-03075-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2021] [Accepted: 05/17/2021] [Indexed: 10/21/2022]
Abstract
Biotransformation is a selective, stereospecific, efficient, and environmentally friendly method compared to chemical synthesis, and a feasible tool for industrial and pharmaceutical applications. The design of biocatalysts using enzyme engineering and metabolic engineering tools has been widely reviewed. However, less importance has been given to the biocatalytic reaction of whole cells expressing recombinant enzymes. Along with the remarkable development of biotechnology tools, a variety of techniques have been applied to improve the biocatalytic reaction of whole cell biotransformation. In this review, techniques related to the biocatalytic reaction are examined, reorganized, and summarized via an integrative approach. Moreover, equilibrium-shifted biotransformation is reviewed for the first time.
|