1
Cadet XF, Gelly JC, van Noord A, Cadet F, Acevedo-Rocha CG. Learning Strategies in Protein Directed Evolution. Methods Mol Biol 2022; 2461:225-275. [PMID: 35727454 DOI: 10.1007/978-1-0716-2152-3_15] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/13/2022]
Abstract
Synthetic biology is a fast-evolving research field that combines biology and engineering principles to develop new biological systems for medical, pharmacological, and industrial applications. Synthetic biologists use iterative "design, build, test, and learn" cycles to efficiently engineer genetic systems that are reliable, reproducible, and predictable. Protein engineering by directed evolution can benefit from such a systematic engineering approach for various reasons. Learning can be carried out before starting, throughout, or after finalizing a directed evolution project. Computational tools, bioinformatics, and scanning mutagenesis methods can be excellent starting points, while molecular dynamics simulations and other strategies can guide engineering efforts. Similarly, studying protein intermediates along evolutionary pathways offers fascinating insights into the molecular mechanisms shaped by evolution. The learning step of the cycle is not only crucial for proteins or enzymes that are not suitable for high-throughput screening or selection systems, but it is also valuable for any platform that generates a large amount of data whose analysis can be aided by machine learning algorithms. The main challenge in protein engineering is to predict the effect of a single mutation on one functional parameter, to say nothing of several mutations on multiple parameters. This is largely due to nonadditive mutational interactions, known as epistatic effects: mutations that are beneficial in one genetic background may not be beneficial in another. In this work, we provide an overview of experimental and computational strategies that can guide the user to learn protein function at different stages in a directed evolution project. We also discuss how epistatic effects can influence the success of directed evolution projects.
Since machine learning is gaining momentum in protein engineering and the field is becoming more interdisciplinary thanks to collaboration between mathematicians, computational scientists, engineers, molecular biologists, and chemists, we provide a general workflow that familiarizes nonexperts with the basic concepts, dataset requirements, learning approaches, model capabilities and performance metrics of this intriguing area. Finally, we also provide some practical recommendations on how machine learning can harness epistatic effects for engineering proteins in an "outside-the-box" way.
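The nonadditive interactions mentioned in this abstract can be illustrated with a minimal sketch. It assumes a trait measured on an additive scale (for example a log-transformed activity), and it uses one common definition of pairwise epistasis: the double-mutant effect minus the sum of the two single-mutant effects. The function name and all numeric values are hypothetical and are not taken from the cited chapter.

```python
def pairwise_epistasis(f_wt, f_a, f_b, f_ab):
    """Deviation of the double mutant from the additive expectation.

    All inputs are trait values on an additive scale (an assumption).
    Returns 0.0 when the two mutations combine additively.
    """
    effect_a = f_a - f_wt    # effect of mutation A alone
    effect_b = f_b - f_wt    # effect of mutation B alone
    effect_ab = f_ab - f_wt  # effect of A and B together
    return effect_ab - (effect_a + effect_b)


# Hypothetical values: A and B are each beneficial on their own.
additive = pairwise_epistasis(f_wt=0.0, f_a=1.0, f_b=0.5, f_ab=1.5)
negative = pairwise_epistasis(f_wt=0.0, f_a=1.0, f_b=0.5, f_ab=0.25)

print(additive)  # no epistasis: the effects simply add up
print(negative)  # negative epistasis: the pair is worse than expected
```

In the second case the double mutant is worse than either single mutant, which is the situation described above where a mutation that helps in one genetic background does not help in another.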
Affiliation(s)
- Xavier F Cadet
  - PEACCEL, Artificial Intelligence Department, Paris, France
- Jean Christophe Gelly
  - Laboratoire d'Excellence GR-Ex, Paris, France
  - BIGR, DSIMB, UMR_S1134, INSERM, University of Paris & University of Reunion, Paris, France
- Frédéric Cadet
  - Laboratoire d'Excellence GR-Ex, Paris, France
  - BIGR, DSIMB, UMR_S1134, INSERM, University of Paris & University of Reunion, Paris, France
2
Li A, Acevedo-Rocha CG, Reetz MT. Boosting the efficiency of site-saturation mutagenesis for a difficult-to-randomize gene by a two-step PCR strategy. Appl Microbiol Biotechnol 2018; 102:6095-6103. [PMID: 29785500 PMCID: PMC6013526 DOI: 10.1007/s00253-018-9041-2] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 12/04/2017] [Revised: 04/13/2018] [Accepted: 04/19/2018] [Indexed: 12/31/2022]
Abstract
Site-saturation mutagenesis (SSM) has been used in directed evolution of proteins for a long time. As a special form of saturation mutagenesis, it involves individual randomization at a given residue with formation of all 19 amino acids. To date, the most efficient embodiment of SSM is a one-step PCR-based approach using NNK codon degeneracy. However, in the case of difficult-to-randomize genes, SSM may not deliver all of the expected 19 mutants, which compels the user to invest further efforts by applying site-directed mutagenesis for the construction of the missing mutants. To solve this problem, we developed a two-step PCR-based technique in which a mutagenic primer and a non-mutagenic (silent) primer are used to generate a short DNA fragment, which is recovered and then employed as a megaprimer to amplify the whole plasmid. The present two-step and older one-step (partially overlapped primer approach) procedures were compared by utilizing cytochrome P450-BM3, which is a "difficult-to-randomize" gene. The results document the distinct superiority of the new method by checking the library quality on DNA level based on massive sequence data, but also at amino acid level. Various future applications in biotechnology can be expected, including the utilization when constructing mutability landscapes, which provide semi-rational information for identifying hot spots for protein engineering and directed evolution.
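The NNK degeneracy mentioned in this abstract can be checked with a short sketch. It assumes the standard genetic code and the IUPAC convention that N is any base and K is G or T. The helper names are illustrative and do not come from the cited paper.

```python
from itertools import product

# Standard genetic code, laid out in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def nnk_codons():
    """Enumerate every codon matching NNK (N = A/C/G/T, K = G/T)."""
    return ["".join(c) for c in product("ACGT", "ACGT", "GT")]


codons = nnk_codons()
amino_acids = sorted({CODON_TABLE[c] for c in codons} - {"*"})
stop_codons = [c for c in codons if CODON_TABLE[c] == "*"]

print(len(codons))       # number of distinct NNK codons
print(len(amino_acids))  # number of distinct amino acids encoded
print(stop_codons)       # stop codons that remain in the NNK set
```

Under these assumptions NNK spans 32 codons that cover all 20 amino acids (the wild type plus the 19 substitutions) with a single stop codon, which is why it is a popular choice for saturation mutagenesis.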
Affiliation(s)
- Aitao Li
  - Hubei Collaborative Innovation Center for Green Transformation of Bio-resources, Hubei Key Laboratory of Industrial Biotechnology, College of Life Sciences, Hubei University, Wuhan, 430062, China
  - Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470, Muelheim, Germany
  - Department of Chemistry, Philipps-Universität, Hans-Meerwein-Strasse 4, 35032, Marburg, Germany
- Manfred T Reetz
  - Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470, Muelheim, Germany
  - Department of Chemistry, Philipps-Universität, Hans-Meerwein-Strasse 4, 35032, Marburg, Germany
3
Fulton A, Hayes MR, Schwaneberg U, Pietruszka J, Jaeger KE. High-Throughput Screening Assays for Lipolytic Enzymes. Methods Mol Biol 2018; 1685:209-231. [PMID: 29086311 DOI: 10.1007/978-1-4939-7366-8_12] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/07/2023]
Abstract
Screening is defined as the identification of hits within a large library of variants of an enzyme or protein with a predefined property. In theory, each variant present in the respective library needs to be assayed; however, to save time and consumables, many screening regimes involve a primary round to identify clones producing active enzymes. Such primary screenings or prescreenings for lipolytic enzyme activity are often carried out on agar plates containing pH indicators or substrates such as triolein or tributyrin. Subsequently, high-throughput screening assays are usually performed in microtiter plate (MTP) format using chromogenic or fluorogenic substrates and, if available, automated liquid handling robotics. Here, we describe different assay systems to determine the activity and enantioselectivity of lipases and esterases as well as the synthesis of several substrates. We also report on the construction of a complete site saturation library derived from lipase A of Bacillus subtilis and its testing for detergent tolerance. This approach allows for the identification of amino acids affecting sensitivity or resistance against different detergents.
Affiliation(s)
- Alexander Fulton
  - Institute of Molecular Enzyme Technology, Heinrich-Heine-Universität Düsseldorf, Forschungszentrum Jülich, 52426, Jülich, Germany
  - Novozymes A/S, Krogshoejvej 36, 2880, Bagsvaerd, Denmark
- Marc R Hayes
  - Institute of Bioorganic Chemistry, Heinrich-Heine-Universität Düsseldorf, Forschungszentrum Jülich, 52426, Jülich, Germany
- Ulrich Schwaneberg
  - Lehrstuhl für Biotechnologie, RWTH Aachen University, 52074, Aachen, Germany
  - DWI Leibniz-Institute for Interactive Materials at RWTH Aachen University, 52056, Aachen, Germany
- Jörg Pietruszka
  - Institute of Bioorganic Chemistry, Heinrich-Heine-Universität Düsseldorf, Forschungszentrum Jülich, 52426, Jülich, Germany
  - Institute of Bio- and Geosciences IBG-1: Biotechnology, Forschungszentrum Jülich GmbH, 52428, Jülich, Germany
- Karl-Erich Jaeger
  - Institute of Molecular Enzyme Technology, Heinrich-Heine-Universität Düsseldorf, Forschungszentrum Jülich, 52426, Jülich, Germany
  - Institute of Bio- and Geosciences IBG-1: Biotechnology, Forschungszentrum Jülich GmbH, 52428, Jülich, Germany
4
Abstract
Directed evolution has emerged as one of the most effective protein engineering methods in basic research as well as in applications in synthetic organic chemistry and biotechnology. The successful engineering of protein activity, allostery, binding affinity, expression, folding, fluorescence, solubility, substrate scope, selectivity (enantio-, stereo-, and regioselectivity), and/or stability (temperature, organic solvents, pH) is limited only by the throughput of the genetic selection, display, or screening system that is available for a given protein. Sometimes it is possible to analyze millions of protein variants from combinatorial libraries per day. In other cases, however, only a few hundred variants can be screened in a single day, and thus the creation of smaller yet smarter libraries is needed. Different strategies have been developed to create these libraries. One approach is to perform mutational scanning or to construct "mutability landscapes" in order to understand sequence-function relationships that can guide the actual directed evolution process. Herein we provide a protocol for economically constructing scanning mutagenesis libraries using a cytochrome P450 enzyme in a high-throughput manner. The goal is to engineer activity, regioselectivity, and stereoselectivity in the oxidative hydroxylation of a steroid, a challenging reaction in synthetic organic chemistry. Libraries based on mutability landscapes can be used to engineer any fitness trait of interest. The protocol is also useful for constructing gene libraries for deep mutational scanning experiments.
Affiliation(s)
- Carlos G Acevedo-Rocha
  - Department of Biocatalysis, Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470, Mülheim an der Ruhr, Germany
  - Department of Chemistry, Philipps-Universität Marburg, Marburg, 35032, Germany
  - Biosyntia ApS, 2100, Copenhagen, Denmark
- Matteo Ferla
  - Department of Biochemistry, Oxford University, Oxford, OX1 3QU, UK
- Manfred T Reetz
  - Department of Biocatalysis, Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470, Mülheim an der Ruhr, Germany
  - Department of Chemistry, Philipps-Universität Marburg, Marburg, 35032, Germany
5
Mingo J, Erramuzpe A, Luna S, Aurtenetxe O, Amo L, Diez I, Schepens JTG, Hendriks WJAJ, Cortés JM, Pulido R. One-Tube-Only Standardized Site-Directed Mutagenesis: An Alternative Approach to Generate Amino Acid Substitution Collections. PLoS One 2016; 11:e0160972. [PMID: 27548698 PMCID: PMC4993582 DOI: 10.1371/journal.pone.0160972] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 05/23/2016] [Accepted: 07/27/2016] [Indexed: 12/30/2022] [Open Access]
Abstract
Site-directed mutagenesis (SDM) is a powerful tool to create defined collections of protein variants for experimental and clinical purposes, but effectiveness is compromised when a large number of mutations is required. We present here a one-tube-only standardized SDM approach that generates comprehensive collections of amino acid substitution variants, including scanning- and single site-multiple mutations. The approach combines unified mutagenic primer design with the mixing of multiple distinct primer pairs and/or plasmid templates to increase the yield of a single inverse-PCR mutagenesis reaction. Also, a user-friendly program for automatic design of standardized primers for Ala-scanning mutagenesis is made available. Experimental results were compared with a modeling approach together with stochastic simulation data. For single site-multiple mutagenesis purposes and for simultaneous mutagenesis in different plasmid backgrounds, combination of primer sets and/or plasmid templates in a single reaction tube yielded the distinct mutations in a stochastic fashion. For scanning mutagenesis, we found that a combination of overlapping primer sets in a single PCR reaction allowed the yield of different individual mutations, although this yield did not necessarily follow a stochastic trend. Double mutants were generated when the overlap of primer pairs was below 60%. Our results illustrate that one-tube-only SDM effectively reduces the number of reactions required in large-scale mutagenesis strategies, facilitating the generation of comprehensive collections of protein variants suitable for functional analysis.
Affiliation(s)
- Janire Mingo
  - Biomarkers in Cancer Unit, Biocruces Health Research Institute, Barakaldo, Spain
- Asier Erramuzpe
  - Quantitative Biomedicine Unit, Biocruces Health Research Institute, Barakaldo, Spain
- Sandra Luna
  - Biomarkers in Cancer Unit, Biocruces Health Research Institute, Barakaldo, Spain
- Olaia Aurtenetxe
  - Biomarkers in Cancer Unit, Biocruces Health Research Institute, Barakaldo, Spain
- Laura Amo
  - Biomarkers in Cancer Unit, Biocruces Health Research Institute, Barakaldo, Spain
- Ibai Diez
  - Quantitative Biomedicine Unit, Biocruces Health Research Institute, Barakaldo, Spain
- Jan T. G. Schepens
  - Department of Cell Biology, Nijmegen Centre for Molecular Life Sciences, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands
- Wiljan J. A. J. Hendriks
  - Department of Cell Biology, Nijmegen Centre for Molecular Life Sciences, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands
- Jesús M. Cortés
  - Quantitative Biomedicine Unit, Biocruces Health Research Institute, Barakaldo, Spain
  - IKERBASQUE, Basque Foundation for Science, Bilbao, Spain
- Rafael Pulido
  - Biomarkers in Cancer Unit, Biocruces Health Research Institute, Barakaldo, Spain
  - IKERBASQUE, Basque Foundation for Science, Bilbao, Spain
6
Fulton A, Frauenkron-Machedjou VJ, Skoczinski P, Wilhelm S, Zhu L, Schwaneberg U, Jaeger KE. Exploring the Protein Stability Landscape: Bacillus subtilis Lipase A as a Model for Detergent Tolerance. Chembiochem 2015; 16:930-6. [DOI: 10.1002/cbic.201402664] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.0] [Received: 11/17/2014] [Indexed: 11/08/2022]
7
Production of the sesquiterpene (+)-valencene by metabolically engineered Corynebacterium glutamicum. J Biotechnol 2014; 191:205-13. [DOI: 10.1016/j.jbiotec.2014.05.032] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.2] [Received: 03/09/2014] [Revised: 05/02/2014] [Accepted: 05/14/2014] [Indexed: 11/18/2022]