1
Rollins ZA, Widatalla T, Cheng AC, Metwally E. AbMelt: Learning antibody thermostability from molecular dynamics. Biophys J 2024; 123:2921-2933. [PMID: 38851888; PMCID: PMC11393704; DOI: 10.1016/j.bpj.2024.06.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Revised: 03/16/2024] [Accepted: 06/04/2024] [Indexed: 06/10/2024] Open
Abstract
Antibody thermostability is challenging to predict from sequence and/or structure. This difficulty is likely due to the absence of direct entropic information. Herein, we present AbMelt where we model the inherent flexibility of homologous antibody structures using molecular dynamics simulations at three temperatures and learn the relevant descriptors to predict the temperatures of aggregation (T_agg), melt onset (T_m,on), and melt (T_m). We observed that the radius of gyration deviation of the complementarity determining regions at 400 K is the highest Pearson correlated descriptor with aggregation temperature (r_p = -0.68 ± 0.23) and the deviation of internal molecular contacts at 350 K is the highest correlated descriptor with both T_m,on (r_p = -0.74 ± 0.04) as well as T_m (r_p = -0.69 ± 0.03). Moreover, after descriptor selection and machine learning regression, we predict on a held-out test set containing both internal and public data and achieve robust performance for all endpoints compared with baseline models (T_agg R² = 0.57 ± 0.11, T_m,on R² = 0.56 ± 0.01, and T_m R² = 0.60 ± 0.06). In addition, the robustness of the AbMelt molecular dynamics methodology is demonstrated by only training on <5% of the data and outperforming more traditional machine learning models trained on the entire data set of more than 500 internal antibodies. Users can predict thermostability measurements for antibody variable fragments by collecting descriptors and using AbMelt, which has been made available.
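As a rough illustration of the descriptor ranking described in this abstract, the sketch below ranks candidate descriptors by their Pearson correlation with a thermostability endpoint. The function name `rank_descriptors_by_pearson`, the descriptor names, and the synthetic data are assumptions made for this example; none of them come from the AbMelt code.

```python
import numpy as np


def rank_descriptors_by_pearson(X, y, names):
    """Return (name, r) pairs sorted by decreasing |Pearson r| with y.

    X: (n_samples, n_descriptors) array; y: (n_samples,) endpoint values.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)  # center each descriptor column
    yc = y - y.mean()        # center the endpoint
    r = (Xc * yc[:, None]).sum(axis=0) / np.sqrt(
        (Xc ** 2).sum(axis=0) * (yc ** 2).sum()
    )
    order = np.argsort(-np.abs(r))
    return [(names[i], float(r[i])) for i in order]


# Synthetic stand-in data (hypothetical): one descriptor drives the endpoint.
rng = np.random.default_rng(0)
n = 200
rg_dev = rng.normal(size=n)        # e.g. a radius-of-gyration deviation
contacts_dev = rng.normal(size=n)  # e.g. a deviation of internal contacts
unrelated = rng.normal(size=n)     # a descriptor with no signal
tm = 70.0 - 3.0 * contacts_dev + rng.normal(scale=1.0, size=n)

ranking = rank_descriptors_by_pearson(
    np.column_stack([rg_dev, contacts_dev, unrelated]),
    tm,
    ["rg_dev", "contacts_dev", "unrelated"],
)
```

Sorting by absolute value keeps strongly negative correlations, like those reported in the abstract, at the top of the ranking.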
Affiliation(s)
- Zachary A Rollins
  - Modeling and Informatics, Merck & Co., Inc., South San Francisco, California
- Talal Widatalla
  - Modeling and Informatics, Merck & Co., Inc., South San Francisco, California
- Alan C Cheng
  - Modeling and Informatics, Merck & Co., Inc., South San Francisco, California
- Essam Metwally
  - Modeling and Informatics, Merck & Co., Inc., South San Francisco, California
2
Capponi S, Wang S. AI in cellular engineering and reprogramming. Biophys J 2024; 123:2658-2670. [PMID: 38576162; PMCID: PMC11393708; DOI: 10.1016/j.bpj.2024.04.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 03/19/2024] [Accepted: 04/01/2024] [Indexed: 04/06/2024] Open
Abstract
During the last decade, artificial intelligence (AI) has increasingly been applied in biophysics and related fields, including cellular engineering and reprogramming, offering novel approaches to understand, manipulate, and control cellular function. The potential of AI lies in its ability to analyze complex datasets and generate predictive models. AI algorithms can process large amounts of data from single-cell genomics and multiomic technologies, allowing researchers to gain mechanistic insights into the control of cell identity and function. By integrating and interpreting these complex datasets, AI can help identify key molecular events and regulatory pathways involved in cellular reprogramming. This knowledge can inform the design of precision engineering strategies, such as the development of new transcription factor and signaling molecule cocktails, to manipulate cell identity and drive authentic cell fate across lineage boundaries. Furthermore, when used in combination with computational methods, AI can accelerate and improve the analysis and understanding of the intricate relationships between genes, proteins, and cellular processes. In this review article, we explore the current state of AI applications in biophysics with a specific focus on cellular engineering and reprogramming. Then, we showcase a couple of recent applications where we combined machine learning with experimental and computational techniques. Finally, we briefly discuss the challenges and prospects of AI in cellular engineering and reprogramming, emphasizing the potential of these technologies to revolutionize our ability to engineer cells for a variety of applications, from disease modeling and drug discovery to regenerative medicine and biomanufacturing.
Affiliation(s)
- Sara Capponi
  - IBM Almaden Research Center, San Jose, California; Center for Cellular Construction, San Francisco, California
- Shangying Wang
  - Bay Area Institute of Science, Altos Labs, Redwood City, California
3
Ding K, Chin M, Zhao Y, Huang W, Mai BK, Wang H, Liu P, Yang Y, Luo Y. Machine learning-guided co-optimization of fitness and diversity facilitates combinatorial library design in enzyme engineering. Nat Commun 2024; 15:6392. [PMID: 39080249; PMCID: PMC11289365; DOI: 10.1038/s41467-024-50698-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2024] [Accepted: 07/19/2024] [Indexed: 08/02/2024] Open
Abstract
The effective design of combinatorial libraries to balance fitness and diversity facilitates the engineering of useful enzyme functions, particularly those that are poorly characterized or unknown in biology. We introduce MODIFY, a machine learning (ML) algorithm that learns from natural protein sequences to infer evolutionarily plausible mutations and predict enzyme fitness. MODIFY co-optimizes predicted fitness and sequence diversity of starting libraries, prioritizing high-fitness variants while ensuring broad sequence coverage. In silico evaluation shows that MODIFY outperforms state-of-the-art unsupervised methods in zero-shot fitness prediction and enables ML-guided directed evolution with enhanced efficiency. Using MODIFY, we engineer generalist biocatalysts derived from a thermostable cytochrome c to achieve enantioselective C-B and C-Si bond formation via a new-to-nature carbene transfer mechanism, leading to biocatalysts six mutations away from previously developed enzymes while exhibiting superior or comparable activities. These results demonstrate MODIFY's potential in solving challenging enzyme engineering problems beyond the reach of classic directed evolution.
Affiliation(s)
- Kerr Ding
  - School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, 30332, USA
- Michael Chin
  - Department of Chemistry and Biochemistry, University of California, Santa Barbara, CA, 93106, USA
- Yunlong Zhao
  - Department of Chemistry and Biochemistry, University of California, Santa Barbara, CA, 93106, USA
- Wei Huang
  - Department of Chemistry and Biochemistry, University of California, Santa Barbara, CA, 93106, USA
- Binh Khanh Mai
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, PA, 15260, USA
- Huanan Wang
  - Department of Chemistry and Biochemistry, University of California, Santa Barbara, CA, 93106, USA
- Peng Liu
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, PA, 15260, USA
- Yang Yang
  - Department of Chemistry and Biochemistry, University of California, Santa Barbara, CA, 93106, USA
  - Biomolecular Science and Engineering (BMSE) Program, University of California, Santa Barbara, CA, 93106, USA
- Yunan Luo
  - School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, 30332, USA
4
Vornholt T, Mutný M, Schmidt GW, Schellhaas C, Tachibana R, Panke S, Ward TR, Krause A, Jeschek M. Enhanced Sequence-Activity Mapping and Evolution of Artificial Metalloenzymes by Active Learning. ACS Cent Sci 2024; 10:1357-1370. [PMID: 39071060; PMCID: PMC11273458; DOI: 10.1021/acscentsci.4c00258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Revised: 04/22/2024] [Accepted: 05/02/2024] [Indexed: 07/30/2024]
Abstract
Tailored enzymes are crucial for the transition to a sustainable bioeconomy. However, enzyme engineering is laborious and failure-prone due to its reliance on serendipity. The efficiency and success rates of engineering campaigns may be improved by applying machine learning to map the sequence-activity landscape based on small experimental data sets. Yet, it often proves challenging to reliably model large sequence spaces while keeping the experimental effort tractable. To address this challenge, we present an integrated pipeline combining large-scale screening with active machine learning, which we applied to engineer an artificial metalloenzyme (ArM) catalyzing a new-to-nature hydroamination reaction. Combining lab automation and next-generation sequencing, we acquired sequence-activity data for several thousand ArM variants. We then used Gaussian process regression to model the activity landscape and guide further screening rounds. Critical characteristics of our pipeline include the cost-effective generation of information-rich data sets, the integration of an explorative round to improve the model's performance, and the inclusion of experimental noise. Our approach led to an order-of-magnitude boost in the hit rate while making efficient use of experimental resources. Search strategies like this should find broad utility in enzyme engineering and accelerate the development of novel biocatalysts.
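The loop this abstract describes (fit a Gaussian process to measured activities, then pick the next variants to screen) can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions: the RBF kernel, the upper-confidence-bound selection rule, the parameter values, and all function names are choices made for this example, not details taken from the paper. The `noise_var` term is where measurement noise enters the model.

```python
import numpy as np


def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel with unit prior variance."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)


def gp_posterior(X_train, y_train, X_query, noise_var=0.1, length_scale=1.0):
    """Posterior mean and variance of a zero-mean GP at X_query."""
    K = rbf_kernel(X_train, X_train, length_scale)
    K = K + noise_var * np.eye(len(X_train))  # observation noise on the diagonal
    K_s = rbf_kernel(X_train, X_query, length_scale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = 1.0 - (v ** 2).sum(axis=0)  # prior variance k(x, x) is 1 for this kernel
    return mean, np.maximum(var, 0.0)


def select_batch_ucb(mean, var, batch_size, beta=2.0):
    """Indices of the candidates with the highest mean + beta * std."""
    score = mean + beta * np.sqrt(var)
    return np.argsort(-score)[:batch_size]


# Toy example: two measured variants and a pool of three candidates,
# each encoded as a single hypothetical feature.
X_train = np.array([[0.0], [1.0]])
y_train = np.array([0.0, 1.0])
X_pool = np.array([[0.0], [1.0], [10.0]])

mean, var = gp_posterior(X_train, y_train, X_pool, noise_var=1e-6)
next_idx = select_batch_ucb(mean, var, batch_size=1)
```

With a large `beta` the rule favors unexplored regions of the pool; with `beta=0` it reduces to picking the highest predicted activity.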
Affiliation(s)
- Tobias Vornholt
  - Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058 Basel, Switzerland
  - National Centre of Competence in Research (NCCR) Molecular Systems Engineering, 4056 Basel, Switzerland
- Mojmír Mutný
  - Department of Computer Science, ETH Zurich, Andreasstrasse 5, 8092 Zurich, Switzerland
- Gregor W. Schmidt
  - Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058 Basel, Switzerland
- Christian Schellhaas
  - Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058 Basel, Switzerland
- Ryo Tachibana
  - Department of Chemistry, University of Basel, Mattenstrasse 24a, 4058 Basel, Switzerland
- Sven Panke
  - Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058 Basel, Switzerland
  - National Centre of Competence in Research (NCCR) Molecular Systems Engineering, 4056 Basel, Switzerland
- Thomas R. Ward
  - National Centre of Competence in Research (NCCR) Molecular Systems Engineering, 4056 Basel, Switzerland
  - Department of Chemistry, University of Basel, Mattenstrasse 24a, 4058 Basel, Switzerland
- Andreas Krause
  - Department of Computer Science, ETH Zurich, Andreasstrasse 5, 8092 Zurich, Switzerland
- Markus Jeschek
  - Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058 Basel, Switzerland
  - Institute of Microbiology, University of Regensburg, Universitätsstraße 31, 93053 Regensburg, Germany
5
Hunter Wilson R, Damodaran AR, Bhagi-Damodaran A. Machine learning guided rational design of a non-heme iron-based lysine dioxygenase improves its total turnover number. bioRxiv 2024:2024.06.04.597480. [PMID: 38895203; PMCID: PMC11185610; DOI: 10.1101/2024.06.04.597480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]
Abstract
Highly selective C-H functionalization remains an ongoing challenge in organic synthetic methodologies. Biocatalysts are robust tools for achieving these difficult chemical transformations. Biocatalyst engineering has often required directed evolution or structure-based rational design campaigns to improve their activities. In recent years, machine learning has been integrated into these workflows to improve the discovery of beneficial enzyme variants. In this work, we combine a structure-based machine-learning algorithm with classical molecular dynamics simulations to down-select mutations for rational design of a non-heme iron-dependent lysine dioxygenase, LDO. This approach consistently resulted in functional LDO mutants and circumvents the need for extensive study of mutational activity beforehand. Our rationally designed single mutants purified with up to 2-fold higher yields than WT and displayed higher total turnover numbers (TTN). Combining five such single mutations into a pentamutant variant, LPNYI LDO, leads to a 40% improvement in the TTN (218 ± 3) as compared to WT LDO (TTN = 160 ± 2). Overall, this work offers a low-barrier approach for those seeking to synergize machine learning algorithms with pre-existing protein engineering strategies.
Affiliation(s)
- R Hunter Wilson
  - Department of Chemistry, University of Minnesota, Twin Cities, Minneapolis, MN, 55455
- Anoop R Damodaran
  - Department of Chemistry, University of Minnesota, Twin Cities, Minneapolis, MN, 55455
6
Liu Y, Chen Z, Wang Z, Lv Y. Boosted Enzyme Activity via Encapsulation within Metal-Organic Frameworks with Pores Matching Enzyme Size and Shape. Adv Sci (Weinh) 2024; 11:e2309243. [PMID: 38576185; DOI: 10.1002/advs.202309243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 02/21/2024] [Indexed: 04/06/2024]
Abstract
A novel and versatile approach called "physical imprinting" is introduced to modulate enzyme conformation using mesoporous materials, addressing challenges in achieving improved enzyme activity and stability. Metal-organic frameworks with tailored mesopores, precisely matching enzyme size and shape, are synthesized. Remarkably, enzymes encapsulated within these customized mesopores exhibit over 1670% relative activity compared to free enzymes, maintaining outstanding efficiency even under harsh conditions such as heat, exposure to organic solvents, wide-ranging pH extremes from acidic to alkaline, and exposure to a digestion cocktail. After 18 consecutive cycles of use, the immobilized enzymes retain 80% of their initial activity. Additionally, the encapsulated enzymes exhibit a substantial increase in catalytic efficiency, with a 14.1-fold enhancement in kcat/KM compared to native enzymes. This enhancement is among the highest reported for immobilized enzymes. The improved enzyme activity and stability are corroborated by solid-state UV-vis, electron paramagnetic resonance, Fourier-transform infrared spectroscopy, and solid-state NMR spectroscopy. The findings not only offer valuable insights into the crucial role of size and shape complementarity within confined microenvironments but also establish a new pathway for developing solid carriers capable of enhancing enzyme activity and stability.
Affiliation(s)
- Ying Liu
  - State Key Laboratory of Organic-Inorganic Composites, National Energy Research and Development Center for Biorefinery, International Joint Bioenergy Laboratory of Ministry of Education, Beijing Key Laboratory of Bioprocess, College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Ziman Chen
  - State Key Laboratory of Organic-Inorganic Composites, National Energy Research and Development Center for Biorefinery, International Joint Bioenergy Laboratory of Ministry of Education, Beijing Key Laboratory of Bioprocess, College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Zheng Wang
  - State Key Laboratory of Organic-Inorganic Composites, National Energy Research and Development Center for Biorefinery, International Joint Bioenergy Laboratory of Ministry of Education, Beijing Key Laboratory of Bioprocess, College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Yongqin Lv
  - State Key Laboratory of Organic-Inorganic Composites, National Energy Research and Development Center for Biorefinery, International Joint Bioenergy Laboratory of Ministry of Education, Beijing Key Laboratory of Bioprocess, College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
7
Winnifrith A, Outeiral C, Hie BL. Generative artificial intelligence for de novo protein design. Curr Opin Struct Biol 2024; 86:102794. [PMID: 38663170; DOI: 10.1016/j.sbi.2024.102794] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 01/31/2024] [Accepted: 02/19/2024] [Indexed: 05/19/2024]
Abstract
Engineering new molecules with desirable functions and properties has the potential to extend our ability to engineer proteins beyond what nature has so far evolved. Advances in the so-called 'de novo' design problem have recently been brought forward by developments in artificial intelligence. Generative architectures, such as language models and diffusion processes, seem adept at generating novel, yet realistic proteins that display desirable properties and perform specified functions. State-of-the-art design protocols now achieve experimental success rates nearing 20%, thus widening the access to de novo designed proteins. Despite extensive progress, there are clear field-wide challenges, for example, in determining the best in silico metrics to prioritise designs for experimental testing, and in designing proteins that can undergo large conformational changes or be regulated by post-translational modifications. With an increase in the number of models being developed, this review provides a framework to understand how these tools fit into the overall process of de novo protein design. Throughout, we highlight the power of incorporating biochemical knowledge to improve performance and interpretability.
Affiliation(s)
- Adam Winnifrith
  - Department of Biochemistry, University of Oxford, South Parks Rd, Oxford, OX1 3QU, United Kingdom; Evolvere Biosciences, Innovation Building, Old Road Campus, Oxford, OX3 7FZ, United Kingdom
- Carlos Outeiral
  - Department of Statistics, University of Oxford, 24-29 St Giles', Oxford OX1 3LB, United Kingdom
- Brian L Hie
  - Department of Chemical Engineering, Stanford University, 443 Via Ortega, Stanford, CA 94305, USA; Stanford Data Science, 475 Via Ortega, Stanford CA 94305, USA; Arc Institute, 3181 Porter Dr, Palo Alto, CA, USA
8
Ding K, Luo J, Luo Y. Leveraging conformal prediction to annotate enzyme function space with limited false positives. PLoS Comput Biol 2024; 20:e1012135. [PMID: 38809942; PMCID: PMC11164347; DOI: 10.1371/journal.pcbi.1012135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2023] [Revised: 06/10/2024] [Accepted: 05/03/2024] [Indexed: 05/31/2024] Open
Abstract
Machine learning (ML) is increasingly being used to guide biological discovery in biomedicine such as prioritizing promising small molecules in drug discovery. In those applications, ML models are used to predict the properties of biological systems, and researchers use these predictions to prioritize candidates as new biological hypotheses for downstream experimental validations. However, when applied to unseen situations, these models can be overconfident and produce a large number of false positives. One solution to address this issue is to quantify the model's prediction uncertainty and provide a set of hypotheses with a controlled false discovery rate (FDR) pre-specified by researchers. We propose CPEC, an ML framework for FDR-controlled biological discovery. We demonstrate its effectiveness using enzyme function annotation as a case study, simulating the discovery process of identifying the functions of less-characterized enzymes. CPEC integrates a deep learning model with a statistical tool known as conformal prediction, providing accurate and FDR-controlled function predictions for a given protein enzyme. Conformal prediction provides rigorous statistical guarantees to the predictive model and ensures that the expected FDR will not exceed a user-specified level with high probability. Evaluation experiments show that CPEC achieves reliable FDR control, better or comparable prediction performance at a lower FDR than existing methods, and accurate predictions for enzymes under-represented in the training data. We expect CPEC to be a useful tool for biological discovery applications where a high yield rate in validation experiments is desired but the experimental budget is limited.
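For readers unfamiliar with conformal prediction, the sketch below shows the textbook split-conformal construction for classification, which calibrates a score threshold on held-out data and returns label sets with marginal coverage of at least 1 - alpha under exchangeability. It is not CPEC's FDR-controlling procedure, which this abstract does not spell out; the function names and the toy probabilities are assumptions made for this example.

```python
import numpy as np


def conformal_threshold(cal_probs, cal_labels, alpha):
    """Calibrate the nonconformity threshold from held-out predictions.

    The score of a calibration example is 1 - (probability of its true label).
    Returns the ceil((n + 1) * (1 - alpha))-th smallest score, or infinity
    when the calibration set is too small for the requested alpha.
    """
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels, dtype=int)
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    k = int(np.ceil((n + 1) * (1.0 - alpha)))
    if k > n:
        return np.inf  # every label is kept in every prediction set
    return float(np.sort(scores)[k - 1])


def prediction_sets(test_probs, qhat):
    """Keep each label whose score 1 - p does not exceed the threshold."""
    test_probs = np.asarray(test_probs, dtype=float)
    return [np.flatnonzero(1.0 - p <= qhat).tolist() for p in test_probs]


# Toy calibration set with two classes (hypothetical model outputs).
cal_probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
cal_labels = [0, 1, 0, 1]
qhat = conformal_threshold(cal_probs, cal_labels, alpha=0.25)

sets = prediction_sets([[0.7, 0.3], [0.5, 0.5]], qhat)
```

An empty set signals that no label is plausible at the chosen level, which is one way such a method can abstain instead of issuing an overconfident prediction.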
Affiliation(s)
- Kerr Ding
  - School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Jiaqi Luo
  - School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Yunan Luo
  - School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia, United States of America
9
d'Oelsnitz S, Diaz DJ, Kim W, Acosta DJ, Dangerfield TL, Schechter MW, Minus MB, Howard JR, Do H, Loy JM, Alper HS, Zhang YJ, Ellington AD. Biosensor and machine learning-aided engineering of an amaryllidaceae enzyme. Nat Commun 2024; 15:2084. [PMID: 38453941; PMCID: PMC10920890; DOI: 10.1038/s41467-024-46356-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2023] [Accepted: 02/22/2024] [Indexed: 03/09/2024] Open
Abstract
A major challenge to achieving industry-scale biomanufacturing of therapeutic alkaloids is the slow process of biocatalyst engineering. Amaryllidaceae alkaloids, such as the Alzheimer's medication galantamine, are complex plant secondary metabolites with recognized therapeutic value. Due to their difficult synthesis they are regularly sourced by extraction and purification from the low-yielding daffodil Narcissus pseudonarcissus. Here, we propose an efficient biosensor-machine learning technology stack for biocatalyst development, which we apply to engineer an Amaryllidaceae enzyme in Escherichia coli. Directed evolution is used to develop a highly sensitive (EC50 = 20 μM) and specific biosensor for the key Amaryllidaceae alkaloid branchpoint 4'-O-methylnorbelladine. A structure-based residual neural network (MutComputeX) is subsequently developed and used to generate activity-enriched variants of a plant methyltransferase, which are rapidly screened with the biosensor. Functional enzyme variants are identified that yield a 60% improvement in product titer, 2-fold higher catalytic activity, and 3-fold lower off-product regioisomer formation. A solved crystal structure elucidates the mechanism behind key beneficial mutations.
Affiliation(s)
- Simon d'Oelsnitz
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA
  - Synthetic Biology HIVE, Department of Systems Biology, Harvard Medical School, Boston, MA, 02115, USA
- Daniel J Diaz
  - Department of Chemistry, University of Texas at Austin, Austin, TX, 78712, USA
  - Institute for Foundations of Machine Learning, University of Texas at Austin, Austin, TX, 78712, USA
- Wantae Kim
  - McKetta Department of Chemical Engineering, University of Texas at Austin, Austin, TX, 78712, USA
- Daniel J Acosta
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA
- Tyler L Dangerfield
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA
- Mason W Schechter
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA
- Matthew B Minus
  - Department of Chemistry, Prairie View A&M University, 100 University Dr, Prairie View, TX, 77446, USA
- James R Howard
  - Department of Chemistry, University of Texas at Austin, Austin, TX, 78712, USA
- Hannah Do
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA
- James M Loy
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA
- Hal S Alper
  - McKetta Department of Chemical Engineering, University of Texas at Austin, Austin, TX, 78712, USA
- Y Jessie Zhang
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA
- Andrew D Ellington
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA
10
Goshisht MK. Machine Learning and Deep Learning in Synthetic Biology: Key Architectures, Applications, and Challenges. ACS Omega 2024; 9:9921-9945. [PMID: 38463314; PMCID: PMC10918679; DOI: 10.1021/acsomega.3c05913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 01/19/2024] [Accepted: 01/30/2024] [Indexed: 03/12/2024]
Abstract
Machine learning (ML), particularly deep learning (DL), has made rapid and substantial progress in synthetic biology in recent years. Biotechnological applications of biosystems, including pathways, enzymes, and whole cells, are being probed frequently with time. The intricacy and interconnectedness of biosystems make it challenging to design them with the desired properties. ML and DL have a synergy with synthetic biology. Synthetic biology can be employed to produce large data sets for training models (for instance, by utilizing DNA synthesis), and ML/DL models can be employed to inform design (for example, by generating new parts or advising unrivaled experiments to perform). This potential has recently been brought to light by research at the intersection of engineering biology and ML/DL through achievements like the design of novel biological components, best experimental design, automated analysis of microscopy data, protein structure prediction, and biomolecular implementations of ANNs (Artificial Neural Networks). I have divided this review into three sections. In the first section, I describe predictive potential and basics of ML along with myriad applications in synthetic biology, especially in engineering cells, activity of proteins, and metabolic pathways. In the second section, I describe fundamental DL architectures and their applications in synthetic biology. Finally, I describe different challenges causing hurdles in the progress of ML/DL and synthetic biology along with their solutions.
Affiliation(s)
- Manoj Kumar Goshisht
  - Department of Chemistry, Natural and Applied Sciences, University of Wisconsin-Green Bay, Green Bay, Wisconsin 54311-7001, United States
11
Honda Malca S, Duss N, Meierhofer J, Patsch D, Niklaus M, Reiter S, Hanlon SP, Wetzl D, Kuhn B, Iding H, Buller R. Effective engineering of a ketoreductase for the biocatalytic synthesis of an ipatasertib precursor. Commun Chem 2024; 7:46. [PMID: 38418529; PMCID: PMC10902378; DOI: 10.1038/s42004-024-01130-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Accepted: 02/15/2024] [Indexed: 03/01/2024] Open
Abstract
Semi-rational enzyme engineering is a powerful method to develop industrial biocatalysts. Profiting from advances in molecular biology and bioinformatics, semi-rational approaches can effectively accelerate enzyme engineering campaigns. Here, we present the optimization of a ketoreductase from Sporidiobolus salmonicolor for the chemo-enzymatic synthesis of ipatasertib, a potent protein kinase B inhibitor. Harnessing the power of mutational scanning and structure-guided rational design, we created a 10-amino acid substituted variant exhibiting a 64-fold higher apparent kcat and improved robustness under process conditions compared to the wild-type enzyme. In addition, the benefit of algorithm-aided enzyme engineering was studied to derive correlations in protein sequence-function data, and it was found that the applied Gaussian processes allowed us to reduce enzyme library size. The final scalable and high performing biocatalytic process yielded the alcohol intermediate with ≥98% conversion and a diastereomeric excess of 99.7% (R,R-trans) from 100 g L⁻¹ ketone after 30 h. Modelling and kinetic studies shed light on the mechanistic factors governing the improved reaction outcome, with mutations T134V, A238K, M242W and Q245S exerting the most beneficial effect on reduction activity towards the target ketone.
Affiliation(s)
- Sumire Honda Malca
  - Institute of Chemistry and Biotechnology, Zurich University of Applied Sciences, Einsiedlerstrasse 31, 8820 Wädenswil, Switzerland
- Nadine Duss
  - Institute of Chemistry and Biotechnology, Zurich University of Applied Sciences, Einsiedlerstrasse 31, 8820 Wädenswil, Switzerland
- Jasmin Meierhofer
  - Institute of Chemistry and Biotechnology, Zurich University of Applied Sciences, Einsiedlerstrasse 31, 8820 Wädenswil, Switzerland
  - Analytical Research and Development, MSD Werthenstein BioPharma GmbH, Industrie Nord 1, 6105 Schachen, Switzerland
- David Patsch
  - Institute of Chemistry and Biotechnology, Zurich University of Applied Sciences, Einsiedlerstrasse 31, 8820 Wädenswil, Switzerland
- Michael Niklaus
  - Institute of Chemistry and Biotechnology, Zurich University of Applied Sciences, Einsiedlerstrasse 31, 8820 Wädenswil, Switzerland
- Stefanie Reiter
  - Institute of Chemistry and Biotechnology, Zurich University of Applied Sciences, Einsiedlerstrasse 31, 8820 Wädenswil, Switzerland
  - Manufacturing Science and Technology, Fisher Clinical Services GmbH, Biotech Innovation Park, 2543 Lengnau, Switzerland
- Steven Paul Hanlon
  - Process Chemistry and Catalysis, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070 Basel, Switzerland
- Dennis Wetzl
  - Process Chemistry and Catalysis, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070 Basel, Switzerland
  - Nonclinical Drug Development, Boehringer Ingelheim International GmbH, Birkendorfer Strasse 65, 88397 Biberach an der Riss, Germany
- Bernd Kuhn
  - Pharmaceutical Research and Early Development, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070 Basel, Switzerland
- Hans Iding
  - Process Chemistry and Catalysis, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070 Basel, Switzerland
- Rebecca Buller
  - Institute of Chemistry and Biotechnology, Zurich University of Applied Sciences, Einsiedlerstrasse 31, 8820 Wädenswil, Switzerland
12
Yang J, Li FZ, Arnold FH. Opportunities and Challenges for Machine Learning-Assisted Enzyme Engineering. ACS Cent Sci 2024; 10:226-241. [PMID: 38435522; PMCID: PMC10906252; DOI: 10.1021/acscentsci.3c01275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 12/26/2023] [Accepted: 01/16/2024] [Indexed: 03/05/2024]
Abstract
Enzymes can be engineered at the level of their amino acid sequences to optimize key properties such as expression, stability, substrate range, and catalytic efficiency, or even to unlock new catalytic activities not found in nature. Because the search space of possible proteins is vast, enzyme engineering usually involves discovering an enzyme starting point that has some level of the desired activity followed by directed evolution to improve its "fitness" for a desired application. Recently, machine learning (ML) has emerged as a powerful tool to complement this empirical process. ML models can contribute to (1) starting point discovery by functional annotation of known protein sequences or generating novel protein sequences with desired functions and (2) navigating protein fitness landscapes for fitness optimization by learning mappings between protein sequences and their associated fitness values. In this Outlook, we explain how ML complements enzyme engineering and discuss its future potential to unlock improved engineering outcomes.
Affiliation(s)
- Jason Yang
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
| | - Francesca-Zhoufan Li
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California 91125, United States
| | - Frances H. Arnold
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California 91125, United States
|
13
|
Chu HY, Fong JHC, Thean DGL, Zhou P, Fung FKC, Huang Y, Wong ASL. Accurate top protein variant discovery via low-N pick-and-validate machine learning. Cell Syst 2024; 15:193-203.e6. [PMID: 38340729 DOI: 10.1016/j.cels.2024.01.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/21/2023] [Revised: 10/11/2023] [Accepted: 01/18/2024] [Indexed: 02/12/2024]
Abstract
A strategy to obtain the greatest number of best-performing variants with the least amount of experimental effort over the vast combinatorial mutational landscape would have enormous utility in boosting resource producibility for protein engineering. Toward this goal, we present a simple and effective machine learning-based strategy that outperforms other state-of-the-art methods. Our strategy integrates zero-shot prediction and multi-round sampling to direct active learning via experimenting with only a few predicted top variants. We find that four rounds of low-N pick-and-validate sampling of 12 variants for machine learning yielded the best accuracy of up to 92.6% in selecting the true top 1% variants in combinatorial mutant libraries, whereas two rounds of 24 variants can also be used. We demonstrate our strategy in successfully discovering high-performance protein variants from diverse families including the CRISPR-based genome editors, supporting its generalizable application for solving protein engineering tasks. A record of this paper's transparent peer review process is included in the supplemental information.
Affiliation(s)
- Hoi Yee Chu
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China; Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
| | - John H C Fong
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
| | - Dawn G L Thean
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
| | - Peng Zhou
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China; Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
| | - Frederic K C Fung
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China; Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
| | - Yuanhua Huang
- School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China; Department of Statistics and Actuarial Science, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
| | - Alan S L Wong
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China; Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China.
|
14
|
Hie BL, Shanker VR, Xu D, Bruun TUJ, Weidenbacher PA, Tang S, Wu W, Pak JE, Kim PS. Efficient evolution of human antibodies from general protein language models. Nat Biotechnol 2024; 42:275-283. [PMID: 37095349 PMCID: PMC10869273 DOI: 10.1038/s41587-023-01763-2] [Citation(s) in RCA: 71] [Impact Index Per Article: 71.0] [Received: 11/23/2022] [Accepted: 03/28/2023] [Indexed: 04/26/2023]
Abstract
Natural evolution must explore a vast landscape of possible sequences for desirable yet rare mutations, suggesting that learning from natural evolutionary strategies could guide artificial evolution. Here we report that general protein language models can efficiently evolve human antibodies by suggesting mutations that are evolutionarily plausible, despite providing the model with no information about the target antigen, binding specificity or protein structure. We performed language-model-guided affinity maturation of seven antibodies, screening 20 or fewer variants of each antibody across only two rounds of laboratory evolution, and improved the binding affinities of four clinically relevant, highly mature antibodies up to sevenfold and three unmatured antibodies up to 160-fold, with many designs also demonstrating favorable thermostability and viral neutralization activity against Ebola and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pseudoviruses. The same models that improve antibody binding also guide efficient evolution across diverse protein families and selection pressures, including antibiotic resistance and enzyme activity, suggesting that these results generalize to many settings.
Affiliation(s)
- Brian L Hie
- Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA.
- Sarafan ChEM-H, Stanford University, Stanford, CA, USA.
| | - Varun R Shanker
- Sarafan ChEM-H, Stanford University, Stanford, CA, USA
- Stanford Medical Scientist Training Program, Stanford University School of Medicine, Stanford, CA, USA
| | - Duo Xu
- Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
- Sarafan ChEM-H, Stanford University, Stanford, CA, USA
| | - Theodora U J Bruun
- Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
- Sarafan ChEM-H, Stanford University, Stanford, CA, USA
- Stanford Medical Scientist Training Program, Stanford University School of Medicine, Stanford, CA, USA
| | - Payton A Weidenbacher
- Sarafan ChEM-H, Stanford University, Stanford, CA, USA
- Department of Chemistry, Stanford University, Stanford, CA, USA
| | - Shaogeng Tang
- Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
- Sarafan ChEM-H, Stanford University, Stanford, CA, USA
| | - Wesley Wu
- Chan Zuckerberg Biohub, San Francisco, CA, USA
| | - John E Pak
- Chan Zuckerberg Biohub, San Francisco, CA, USA
| | - Peter S Kim
- Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA.
- Sarafan ChEM-H, Stanford University, Stanford, CA, USA.
- Chan Zuckerberg Biohub, San Francisco, CA, USA.
|
15
|
Ao YF, Dörr M, Menke MJ, Born S, Heuson E, Bornscheuer UT. Data-Driven Protein Engineering for Improving Catalytic Activity and Selectivity. Chembiochem 2024; 25:e202300754. [PMID: 38029350 DOI: 10.1002/cbic.202300754] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Revised: 11/28/2023] [Accepted: 11/29/2023] [Indexed: 12/01/2023]
Abstract
Protein engineering is essential for altering the substrate scope, catalytic activity and selectivity of enzymes for applications in biocatalysis. However, traditional approaches, such as directed evolution and rational design, are challenged by the experimental screening of a large protein mutation space. Machine learning methods allow the approximation of protein fitness landscapes and the identification of catalytic patterns using limited experimental data, thus providing a new avenue to guide protein engineering campaigns. In this concept article, we review machine learning models that have been developed to assess enzyme-substrate-catalysis performance relationships aiming to improve enzymes through data-driven protein engineering. Furthermore, we consider the future development of this field to provide additional strategies and tools for achieving desired activities and selectivities.
Affiliation(s)
- Yu-Fei Ao
- Department of Biotechnology and Enzyme Catalysis, Institute of Biochemistry, University of Greifswald, Felix-Hausdorff-Str. 4, 17487, Greifswald, Germany
- Beijing National Laboratory for Molecular Sciences, CAS Key Laboratory of Molecular Recognition and Function, Institute of Chemistry, Chinese Academy of Sciences, Zhongguancun North First Street 2, Beijing, 100190, China
- University of Chinese Academy of Sciences, Yuquan Road 19(A), Beijing, 100049, China
| | - Mark Dörr
- Department of Biotechnology and Enzyme Catalysis, Institute of Biochemistry, University of Greifswald, Felix-Hausdorff-Str. 4, 17487, Greifswald, Germany
| | - Marian J Menke
- Department of Biotechnology and Enzyme Catalysis, Institute of Biochemistry, University of Greifswald, Felix-Hausdorff-Str. 4, 17487, Greifswald, Germany
| | - Stefan Born
- Technische Universität Berlin, Chair of Bioprocess Engineering, Ackerstraße 76, 13355, Berlin, Germany
| | - Egon Heuson
- Univ. Lille, CNRS, Centrale Lille, Univ. Artois, UMR 8181 UCCS, Unité de Catalyse et Chimie du Solide, 59000, Lille, France
| | - Uwe T Bornscheuer
- Department of Biotechnology and Enzyme Catalysis, Institute of Biochemistry, University of Greifswald, Felix-Hausdorff-Str. 4, 17487, Greifswald, Germany
|
16
|
Dohadwala S, Geib MT, Politch JA, Anderson DJ. Innovations in monoclonal antibody-based multipurpose prevention technology (MPT) for the prevention of sexually transmitted infections and unintended pregnancy. Frontiers in Reproductive Health 2024; 5:1337479. [PMID: 38264184 PMCID: PMC10803587 DOI: 10.3389/frph.2023.1337479] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Accepted: 12/14/2023] [Indexed: 01/25/2024] Open
Abstract
Monoclonal antibodies (mAbs) are currently being produced for a number of clinical applications including contraception and the prevention of sexually transmitted infections (STIs). Combinations of contraceptive and anti-STI mAbs, including antibodies against HIV-1 and HSV-2, provide a powerful and flexible approach for highly potent and specific multipurpose prevention technology (MPT) products with desirable efficacy, safety and pharmacokinetic profiles. MAbs can be administered systemically by injection, or mucosally via topical products (e.g., films, gels, rings) which can be tailored for vaginal, penile or rectal administration to address the needs of different populations. The MPT field has faced challenges with safety, efficacy, production and cost. Here, we review the state-of-the-art of mAb MPTs that tackle these challenges with innovative strategies in mAb engineering, manufacturing, and delivery that could usher in a new generation of safe, efficacious, cost-effective, and scalable mAb MPTs.
Affiliation(s)
- Sarah Dohadwala
- Department of Virology, Immunology and Microbiology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, United States
| | - Matthew T. Geib
- Department of Material Science and Engineering, Boston University, Boston, MA, United States
| | - Joseph A. Politch
- Department of Medicine, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, United States
| | - Deborah J. Anderson
- Department of Virology, Immunology and Microbiology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, United States
- Department of Medicine, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, United States
|
17
|
Teng F, Cui T, Zhou L, Gao Q, Zhou Q, Li W. Programmable synthetic receptors: the next-generation of cell and gene therapies. Signal Transduct Target Ther 2024; 9:7. [PMID: 38167329 PMCID: PMC10761793 DOI: 10.1038/s41392-023-01680-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/30/2023] [Revised: 09/22/2023] [Accepted: 10/11/2023] [Indexed: 01/05/2024] Open
Abstract
Cell and gene therapies hold tremendous promise for treating a range of difficult-to-treat diseases. However, concerns over safety and efficacy need to be further addressed in order to realize their full potential. Synthetic receptors, a synthetic biology tool that can precisely control the function of therapeutic cells and genetic modules, have been rapidly developed and applied as a powerful solution. Delicately designed and engineered, they can be applied to fine-tune the therapeutic activities, i.e., to regulate production of dosed, bioactive payloads by sensing and processing user-defined signals or biomarkers. This review provides an overview of diverse synthetic receptor systems being used to reprogram therapeutic cells and their wide applications in biomedical research. With a special focus on four synthetic receptor systems at the forefront, including chimeric antigen receptors (CARs) and synthetic Notch (synNotch) receptors, we address the generalized strategies to design, construct and improve synthetic receptors. Meanwhile, we also highlight the expanding landscape of therapeutic applications of the synthetic receptor systems as well as current challenges in their clinical translation.
Affiliation(s)
- Fei Teng
- University of Chinese Academy of Sciences, Beijing, 101408, China.
| | - Tongtong Cui
- State Key Laboratory of Stem Cell and Regenerative Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
- Institute for Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing, 100101, China
| | - Li Zhou
- University of Chinese Academy of Sciences, Beijing, 101408, China
- State Key Laboratory of Stem Cell and Regenerative Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
- Institute for Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing, 100101, China
| | - Qingqin Gao
- University of Chinese Academy of Sciences, Beijing, 101408, China
- State Key Laboratory of Stem Cell and Regenerative Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
- Institute for Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing, 100101, China
| | - Qi Zhou
- University of Chinese Academy of Sciences, Beijing, 101408, China.
- State Key Laboratory of Stem Cell and Regenerative Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China.
- Institute for Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing, 100101, China.
- Beijing Institute for Stem Cell and Regenerative Medicine, Beijing, 100101, China.
| | - Wei Li
- University of Chinese Academy of Sciences, Beijing, 101408, China.
- State Key Laboratory of Stem Cell and Regenerative Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China.
- Institute for Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing, 100101, China.
- Beijing Institute for Stem Cell and Regenerative Medicine, Beijing, 100101, China.
|
18
|
Rapp JT, Bremer BJ, Romero PA. Self-driving laboratories to autonomously navigate the protein fitness landscape. Nature Chemical Engineering 2024; 1:97-107. [PMID: 38468718 PMCID: PMC10926838 DOI: 10.1038/s44286-023-00002-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 07/20/2023] [Accepted: 11/20/2023] [Indexed: 03/13/2024]
Abstract
Protein engineering has nearly limitless applications across chemistry, energy and medicine, but creating new proteins with improved or novel functions remains slow, labor-intensive and inefficient. Here we present the Self-driving Autonomous Machines for Protein Landscape Exploration (SAMPLE) platform for fully autonomous protein engineering. SAMPLE is driven by an intelligent agent that learns protein sequence-function relationships, designs new proteins and sends designs to a fully automated robotic system that experimentally tests the designed proteins and provides feedback to improve the agent's understanding of the system. We deploy four SAMPLE agents with the goal of engineering glycoside hydrolase enzymes with enhanced thermal tolerance. Despite showing individual differences in their search behavior, all four agents quickly converge on thermostable enzymes. Self-driving laboratories automate and accelerate the scientific discovery process and hold great potential for the fields of protein engineering and synthetic biology.
Affiliation(s)
- Jacob T. Rapp
- Department of Biochemistry, University of Wisconsin–Madison, Madison, WI, USA
| | - Bennett J. Bremer
- Department of Biochemistry, University of Wisconsin–Madison, Madison, WI, USA
| | - Philip A. Romero
- Department of Biochemistry, University of Wisconsin–Madison, Madison, WI, USA
- Department of Chemical & Biological Engineering, University of Wisconsin–Madison, Madison, WI, USA
|
19
|
Qu G, Liu Y, Ma Q, Li J, Du G, Liu L, Lv X. Progress and Prospects of Natural Glycoside Sweetener Biosynthesis: A Review. Journal of Agricultural and Food Chemistry 2023; 71:15926-15941. [PMID: 37856872 DOI: 10.1021/acs.jafc.3c05074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2023]
Abstract
To achieve an adequate sense of sweetness with a healthy low-sugar diet, it is necessary to explore and produce sugar alternatives. Recently, glycoside sweeteners and their biosynthetic approaches have attracted the attention of researchers. In this review, we first outlined the synthetic pathways of glycoside sweeteners, including the key enzymes and rate-limiting steps. Next, we reviewed the progress in engineered microorganisms producing glycoside sweeteners, including de novo synthesis, whole-cell catalysis synthesis, and in vitro synthesis. The applications of metabolic engineering strategies, such as cofactor engineering and enzyme modification, in the optimization of glycoside sweetener biosynthesis were summarized. Finally, the prospects of combining enzyme engineering and machine learning strategies to enhance the production of glycoside sweeteners were discussed. This review provides a perspective on synthesizing glycoside sweeteners in microbial cells, theoretically guiding the bioproduction of glycoside sweeteners.
Affiliation(s)
- Guanyi Qu
- Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, P. R. China
- Science Center for Future Foods, Jiangnan University, Wuxi 214122, P. R. China
- Shandong Jincheng Biological Pharmaceutical Company, Limited, Zibo 255000, P. R. China
| | - Yanfeng Liu
- Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, P. R. China
- Science Center for Future Foods, Jiangnan University, Wuxi 214122, P. R. China
| | - Qinyuan Ma
- Shandong Jincheng Biological Pharmaceutical Company, Limited, Zibo 255000, P. R. China
| | - Jianghua Li
- Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, P. R. China
- Science Center for Future Foods, Jiangnan University, Wuxi 214122, P. R. China
| | - Guocheng Du
- Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, P. R. China
- Science Center for Future Foods, Jiangnan University, Wuxi 214122, P. R. China
| | - Long Liu
- Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, P. R. China
- Science Center for Future Foods, Jiangnan University, Wuxi 214122, P. R. China
- Yixing Institute of Food Biotechnology Company, Limited, Yixing 214200, P. R. China
- Food Laboratory of Zhongyuan, Jiangnan University, Wuxi 214122, P. R. China
| | - Xueqin Lv
- Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, P. R. China
- Science Center for Future Foods, Jiangnan University, Wuxi 214122, P. R. China
- Yixing Institute of Food Biotechnology Company, Limited, Yixing 214200, P. R. China
|
20
|
Capponi S, Daniels KG. Harnessing the power of artificial intelligence to advance cell therapy. Immunol Rev 2023; 320:147-165. [PMID: 37415280 DOI: 10.1111/imr.13236] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 05/02/2023] [Accepted: 06/17/2023] [Indexed: 07/08/2023]
Abstract
Cell therapies are powerful technologies in which human cells are reprogrammed for therapeutic applications such as killing cancer cells or replacing defective cells. The technologies underlying cell therapies are increasing in effectiveness and complexity, making rational engineering of cell therapies more difficult. Creating the next generation of cell therapies will require improved experimental approaches and predictive models. Artificial intelligence (AI) and machine learning (ML) methods have revolutionized several fields in biology including genome annotation, protein structure prediction, and enzyme design. In this review, we discuss the potential of combining experimental library screens and AI to build predictive models for the development of modular cell therapy technologies. Advances in DNA synthesis and high-throughput screening techniques enable the construction and screening of libraries of modular cell therapy constructs. AI and ML models trained on this screening data can accelerate the development of cell therapies by generating predictive models, design rules, and improved designs.
Affiliation(s)
- Sara Capponi
- Department of Functional Genomics and Cellular Engineering, IBM Almaden Research Center, San Jose, California, USA
- Center for Cellular Construction, San Francisco, California, USA
| | - Kyle G Daniels
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, California, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, California, USA
|
21
|
Marshall LR, Bhattacharya S, Korendovych IV. Fishing for Catalysis: Experimental Approaches to Narrowing Search Space in Directed Evolution of Enzymes. JACS Au 2023; 3:2402-2412. [PMID: 37772192 PMCID: PMC10523367 DOI: 10.1021/jacsau.3c00315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2023] [Revised: 08/07/2023] [Accepted: 08/08/2023] [Indexed: 09/30/2023]
Abstract
Directed evolution has transformed protein engineering, offering a path to rapid improvement of protein properties. Yet, in practice it is limited by the hyper-astronomic protein sequence search space, and approaches to identify mutagenic hot spots, i.e., locations where mutations are most likely to have a productive impact, are needed. In this perspective, we categorize and discuss recent progress in the experimental approaches (broadly defined as structural, bioinformatic, and dynamic) to hot spot identification. Recent successes in harnessing protein dynamics and machine learning approaches provide new opportunities for the field and will undoubtedly help directed evolution reach its full potential.
Affiliation(s)
- Liam R. Marshall
- Department of Chemistry, Syracuse University, 111 College Place, Syracuse, New York 13224, United States
| | - Sagar Bhattacharya
- Department of Chemistry, Syracuse University, 111 College Place, Syracuse, New York 13224, United States
| | - Ivan V. Korendovych
- Department of Chemistry, Syracuse University, 111 College Place, Syracuse, New York 13224, United States
|
22
|
Qiu Y, Wei GW. Artificial intelligence-aided protein engineering: from topological data analysis to deep protein language models. Brief Bioinform 2023; 24:bbad289. [PMID: 37580175 PMCID: PMC10516362 DOI: 10.1093/bib/bbad289] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 05/08/2023] [Revised: 07/14/2023] [Accepted: 07/26/2023] [Indexed: 08/16/2023] Open
Abstract
Protein engineering is an emerging field in biotechnology that has the potential to revolutionize various areas, such as antibody design, drug discovery, food security, ecology, and more. However, the mutational space involved is too vast to be handled through experimental means alone. Leveraging accumulative protein databases, machine learning (ML) models, particularly those based on natural language processing (NLP), have considerably expedited protein engineering. Moreover, advances in topological data analysis (TDA) and artificial intelligence-based protein structure prediction, such as AlphaFold2, have made more powerful structure-based ML-assisted protein engineering strategies possible. This review aims to offer a comprehensive, systematic, and indispensable set of methodological components, including TDA and NLP, for protein engineering and to facilitate their future development.
Affiliation(s)
- Yuchi Qiu
- Department of Mathematics, Michigan State University, East Lansing, 48824 MI, USA
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, 48824 MI, USA
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, 48824 MI, USA
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, 48824 MI, USA
|
23
|
Nordquist E, Zhang G, Barethiya S, Ji N, White KM, Han L, Jia Z, Shi J, Cui J, Chen J. Incorporating physics to overcome data scarcity in predictive modeling of protein function: A case study of BK channels. PLoS Comput Biol 2023; 19:e1011460. [PMID: 37713443 PMCID: PMC10529646 DOI: 10.1371/journal.pcbi.1011460] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2023] [Revised: 09/27/2023] [Accepted: 08/24/2023] [Indexed: 09/17/2023] Open
Abstract
Machine learning has played transformative roles in numerous chemical and biophysical problems such as protein folding where large amounts of data exist. Nonetheless, many important problems remain challenging for data-driven machine learning approaches due to the limitation of data scarcity. One approach to overcome data scarcity is to incorporate physical principles such as through molecular modeling and simulation. Here, we focus on the big potassium (BK) channels that play important roles in cardiovascular and neural systems. Many mutants of the BK channel are associated with various neurological and cardiovascular diseases, but the molecular effects are unknown. The voltage gating properties of BK channels have been characterized for 473 site-specific mutations experimentally over the last three decades; yet, these functional data by themselves remain far too sparse to derive a predictive model of BK channel voltage gating. Using physics-based modeling, we quantify the energetic effects of all single mutations on both open and closed states of the channel. Together with dynamic properties derived from atomistic simulations, these physical descriptors allow the training of random forest models that could reproduce unseen experimentally measured shifts in gating voltage, ∆V1/2, with an RMSE of ~32 mV and a correlation coefficient of R ~ 0.7. Importantly, the model appears capable of uncovering nontrivial physical principles underlying the gating of the channel, including a central role of hydrophobic gating. The model was further evaluated using four novel mutations of L235 and V236 on the S5 helix, mutations of which are predicted to have opposing effects on V1/2 and suggest a key role of S5 in mediating voltage sensor-pore coupling. The measured ∆V1/2 agree quantitatively with prediction for all four mutations, with a high correlation of R = 0.92 and RMSE = 18 mV. Therefore, the model can capture nontrivial voltage gating properties in regions where few mutations are known. The success of predictive modeling of BK voltage gating demonstrates the potential of combining physics and statistical learning for overcoming data scarcity in nontrivial protein function prediction.
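The two figures of merit quoted in this abstract, RMSE and the correlation coefficient R, can be computed as in the following minimal sketch. The function names and the numbers in the usage lines are hypothetical and chosen only for illustration; they are not data from the study, and R is assumed here to mean the Pearson correlation.

```python
import math


def rmse(predicted, measured):
    """Root-mean-square error between paired predictions and measurements."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical predicted and measured gating-voltage shifts in mV
# (illustrative values only, not taken from the paper).
predicted_shift = [-20.0, 5.0, 40.0, 60.0]
measured_shift = [-30.0, 10.0, 35.0, 80.0]
print("RMSE (mV):", rmse(predicted_shift, measured_shift))
print("Pearson R:", pearson_r(predicted_shift, measured_shift))
```

RMSE is reported in the units of the measurement (mV here), whereas R is dimensionless, which is why the abstract quotes both: a model can rank mutations well (high R) while still being off in absolute terms (high RMSE).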
Affiliation(s)
- Erik Nordquist
- Department of Chemistry, University of Massachusetts Amherst, Amherst, Massachusetts, United States of America
| | - Guohui Zhang
- Department of Biomedical Engineering, Center for the Investigation of Membrane Excitability Disorders, Cardiac Bioelectricity and Arrhythmia Center, Washington University in St. Louis, St. Louis, Missouri, United States of America
| | - Shrishti Barethiya
- Department of Chemistry, University of Massachusetts Amherst, Amherst, Massachusetts, United States of America
| | - Nathan Ji
- Department of Biology, Boston College, Chestnut Hill, Massachusetts, United States of America
| | - Kelli M. White
- Department of Biomedical Engineering, Center for the Investigation of Membrane Excitability Disorders, Cardiac Bioelectricity and Arrhythmia Center, Washington University in St. Louis, St. Louis, Missouri, United States of America
| | - Lu Han
- Department of Biomedical Engineering, Center for the Investigation of Membrane Excitability Disorders, Cardiac Bioelectricity and Arrhythmia Center, Washington University in St. Louis, St. Louis, Missouri, United States of America
| | - Zhiguang Jia
- Department of Chemistry, University of Massachusetts Amherst, Amherst, Massachusetts, United States of America
| | - Jingyi Shi
- Department of Biomedical Engineering, Center for the Investigation of Membrane Excitability Disorders, Cardiac Bioelectricity and Arrhythmia Center, Washington University in St. Louis, St. Louis, Missouri, United States of America
| | - Jianmin Cui
- Department of Biomedical Engineering, Center for the Investigation of Membrane Excitability Disorders, Cardiac Bioelectricity and Arrhythmia Center, Washington University in St. Louis, St. Louis, Missouri, United States of America
| | - Jianhan Chen
- Department of Chemistry, University of Massachusetts Amherst, Amherst, Massachusetts, United States of America
|
24
|
Yang J, Ducharme J, Johnston KE, Li FZ, Yue Y, Arnold FH. DeCOIL: Optimization of Degenerate Codon Libraries for Machine Learning-Assisted Protein Engineering. ACS Synth Biol 2023; 12:2444-2454. [PMID: 37524064 DOI: 10.1021/acssynbio.3c00301] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Indexed: 08/02/2023]
Abstract
With advances in machine learning (ML)-assisted protein engineering, models based on data, biophysics, and natural evolution are being used to propose informed libraries of protein variants to explore. Synthesizing these libraries for experimental screens is a major bottleneck, as the cost of obtaining large numbers of exact gene sequences is often prohibitive. Degenerate codon (DC) libraries are a cost-effective alternative for generating combinatorial mutagenesis libraries where mutations are targeted to a handful of amino acid sites. However, existing computational methods to optimize DC libraries to include desired protein variants are not well suited to design libraries for ML-assisted protein engineering. To address these drawbacks, we present DEgenerate Codon Optimization for Informed Libraries (DeCOIL), a generalized method that directly optimizes DC libraries to be useful for protein engineering: to sample protein variants that are likely to have both high fitness and high diversity in the sequence search space. Using computational simulations and wet-lab experiments, we demonstrate that DeCOIL is effective across two specific case studies, with the potential to be applied to many other use cases. DeCOIL offers several advantages over existing methods, as it is direct, easy to use, generalizable, and scalable. With accompanying software (https://github.com/jsunn-y/DeCOIL), DeCOIL can be readily implemented to generate desired informed libraries.
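As a rough illustration of what a degenerate codon encodes, the sketch below expands a single degenerate codon into the amino acids it can produce, using the standard genetic code and the IUPAC nucleotide ambiguity codes. It is not the DeCOIL optimization itself; the function name `expand` and the NNK example are illustrative assumptions and are not taken from the DeCOIL software.

```python
from collections import Counter
from itertools import product

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code with bases ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def expand(degenerate_codon):
    """Count the amino acids ('*' = stop) encoded by one degenerate codon."""
    choices = [IUPAC[base] for base in degenerate_codon.upper()]
    return Counter(CODON_TABLE["".join(codon)] for codon in product(*choices))

nnk = expand("NNK")
print(sum(nnk.values()), "codons,", nnk["*"], "stop codon")  # 32 codons, 1 stop codon
print(sorted(nnk.items()))
```

The uneven codon counts per amino acid (for example three codons for leucine but one for methionine under NNK) are one reason the choice of degenerate codons shapes which variants a library samples.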
Affiliation(s)
- Jason Yang
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
| | - Julie Ducharme
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
| | - Kadina E Johnston
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California 91125, United States
| | - Francesca-Zhoufan Li
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California 91125, United States
| | - Yisong Yue
- Division of Engineering and Applied Sciences, California Institute of Technology, Pasadena, California 91125, United States
| | - Frances H Arnold
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California 91125, United States
|
25
|
Yu T, Boob AG, Singh N, Su Y, Zhao H. In vitro continuous protein evolution empowered by machine learning and automation. Cell Syst 2023; 14:633-644. [PMID: 37224814 DOI: 10.1016/j.cels.2023.04.006] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 09/04/2022] [Revised: 11/19/2022] [Accepted: 04/20/2023] [Indexed: 05/26/2023]
Abstract
Directed evolution has become one of the most successful and powerful tools for protein engineering. However, the efforts required for designing, constructing, and screening a large library of variants can be laborious, time-consuming, and costly. With the recent advent of machine learning (ML) in the directed evolution of proteins, researchers can now evaluate variants in silico and guide a more efficient directed evolution campaign. Furthermore, recent advancements in laboratory automation have enabled the rapid execution of long, complex experiments for high-throughput data acquisition in both industrial and academic settings, thus providing the means to collect a large quantity of data required to develop ML models for protein engineering. In this perspective, we propose a closed-loop in vitro continuous protein evolution framework that leverages the best of both worlds, ML and automation, and provide a brief overview of the recent developments in the field.
Affiliation(s)
- Tianhao Yu
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Carl R. Woese Institute for Genomic Biology, Urbana, IL, USA; NSF Molecule Maker Lab Institute, Urbana, IL, USA
| | - Aashutosh Girish Boob
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Carl R. Woese Institute for Genomic Biology, Urbana, IL, USA; DOE Center for Advanced Bioenergy and Bioproducts Innovation, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Nilmani Singh
- DOE Center for Advanced Bioenergy and Bioproducts Innovation, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Yufeng Su
- NSF Molecule Maker Lab Institute, Urbana, IL, USA; Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Huimin Zhao
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Carl R. Woese Institute for Genomic Biology, Urbana, IL, USA; NSF Molecule Maker Lab Institute, Urbana, IL, USA; DOE Center for Advanced Bioenergy and Bioproducts Innovation, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.
|
26
|
Parkinson J, Wang W. Linear-Scaling Kernels for Protein Sequences and Small Molecules Outperform Deep Learning While Providing Uncertainty Quantitation and Improved Interpretability. J Chem Inf Model 2023; 63:4589-4601. [PMID: 37498239 DOI: 10.1021/acs.jcim.3c00601] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/28/2023]
Abstract
Gaussian processes (GPs) are Bayesian models that provide several advantages for regression tasks in machine learning, such as reliable quantitation of uncertainty and improved interpretability. Their adoption has been precluded by their excessive computational cost and by the difficulty in adapting them for analyzing sequences (e.g., amino acid sequences) and graphs (e.g., small molecules). In this study, we introduce a group of random feature-approximated kernels for sequences and graphs that exhibit linear scaling with both the size of the training set and the size of the sequences or graphs. We incorporate these new kernels into our new Python library for GP regression, xGPR, and develop an efficient and scalable algorithm for fitting GPs equipped with these kernels to large datasets. We compare the performance of xGPR on 17 different benchmarks with both standard and state-of-the-art deep learning models and find that GP regression achieves highly competitive accuracy for these tasks while providing well-calibrated uncertainty quantitation and improved interpretability. Finally, in a simple experiment, we illustrate how xGPR may be used as part of an active learning strategy to engineer a protein with a desired property in an automated way without human intervention.
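The general idea of a random feature-approximated kernel can be shown with a minimal sketch. The example below uses classic random Fourier features for an RBF kernel on plain numeric vectors; this is a stand-in chosen for illustration and is not one of the sequence or graph kernels of xGPR. The function names, the length scale, and the number of features are all assumptions.

```python
import math
import random

def rbf_kernel(x, y, lengthscale=1.0):
    """Exact RBF kernel value between two vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2 * lengthscale ** 2))

def make_feature_map(dim, n_features, lengthscale=1.0, seed=0):
    """Build a random Fourier feature map whose inner products approximate the RBF kernel."""
    rng = random.Random(seed)
    # Frequencies are drawn from the kernel's spectral density (a Gaussian).
    weights = [[rng.gauss(0.0, 1.0 / lengthscale) for _ in range(dim)]
               for _ in range(n_features)]
    offsets = [rng.uniform(0.0, 2 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def phi(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return phi

phi = make_feature_map(dim=3, n_features=2000)
x, y = [0.2, -0.5, 1.0], [0.0, 0.3, 0.7]

exact = rbf_kernel(x, y)
approx = sum(a * b for a, b in zip(phi(x), phi(y)))
print(f"exact = {exact:.3f}, random-feature approximation = {approx:.3f}")
```

Because each data point is mapped once to a fixed-length feature vector, fitting a linear model on those features costs time linear in the number of training points, instead of requiring the full pairwise kernel matrix.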
|
27
|
Nordquist E, Zhang G, Barethiya S, Ji N, White KM, Han L, Jia Z, Shi J, Cui J, Chen J. Incorporating physics to overcome data scarcity in predictive modeling of protein function: a case study of BK channels. bioRxiv 2023:2023.06.24.546384. [PMID: 37425916 PMCID: PMC10327070 DOI: 10.1101/2023.06.24.546384] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2023]
Abstract
Machine learning has played transformative roles in numerous chemical and biophysical problems such as protein folding where large amounts of data exist. Nonetheless, many important problems remain challenging for data-driven machine learning approaches due to data scarcity. One approach to overcome data scarcity is to incorporate physical principles such as through molecular modeling and simulation. Here, we focus on the big potassium (BK) channels that play important roles in cardiovascular and neural systems. Many mutants of the BK channel are associated with various neurological and cardiovascular diseases, but the molecular effects are unknown. The voltage gating properties of BK channels have been characterized for 473 site-specific mutations experimentally over the last three decades; yet, these functional data by themselves remain far too sparse to derive a predictive model of BK channel voltage gating. Using physics-based modeling, we quantify the energetic effects of all single mutations on both open and closed states of the channel. Together with dynamic properties derived from atomistic simulations, these physical descriptors allow the training of random forest models that could reproduce unseen experimentally measured shifts in gating voltage, ΔV1/2, with an RMSE ∼ 32 mV and correlation coefficient of R ∼ 0.7. Importantly, the model appears capable of uncovering nontrivial physical principles underlying the gating of the channel, including a central role of hydrophobic gating. The model was further evaluated using four novel mutations of L235 and V236 on the S5 helix, mutations of which are predicted to have opposing effects on V1/2 and suggest a key role of S5 in mediating voltage sensor-pore coupling. The measured ΔV1/2 agree quantitatively with prediction for all four mutations, with a high correlation of R = 0.92 and RMSE = 18 mV. Therefore, the model can capture nontrivial voltage gating properties in regions where few mutations are known. The success of predictive modeling of BK voltage gating demonstrates the potential of combining physics and statistical learning for overcoming data scarcity in nontrivial protein function prediction.
Author Summary
Deep machine learning has brought many exciting breakthroughs in chemistry, physics and biology. These models require large amounts of training data and struggle when the data are scarce. The latter is true for predictive modeling of the function of complex proteins such as ion channels, where only hundreds of mutational data may be available. Using the big potassium (BK) channel as a biologically important model system, we demonstrate that a reliable predictive model of its voltage gating property could be derived from only 473 mutational data by incorporating physics-derived features, which include dynamic properties from molecular dynamics simulations and energetic quantities from Rosetta mutation calculations. We show that the final random forest model captures key trends and hotspots in mutational effects of BK voltage gating, such as the important role of pore hydrophobicity. A particularly curious prediction is that mutations of two adjacent residues on the S5 helix would always have opposite effects on the gating voltage, which was confirmed by experimental characterization of four novel mutations. The current work demonstrates the importance and effectiveness of incorporating physics in predictive modeling of protein function with scarce data.
Affiliation(s)
- Erik Nordquist
- Department of Chemistry, University of Massachusetts Amherst, Amherst, Massachusetts, USA
| | - Guohui Zhang
- Department of Biomedical Engineering, Center for the Investigation of Membrane Excitability Disorders, Cardiac Bioelectricity and Arrhythmia Center, Washington University in St. Louis, St. Louis, Missouri, USA
| | - Shrishti Barethiya
- Department of Chemistry, University of Massachusetts Amherst, Amherst, Massachusetts, USA
| | - Nathan Ji
- Department of Biology, Boston College, Chestnut Hill, Massachusetts, USA
| | - Kelli M White
- Department of Biomedical Engineering, Center for the Investigation of Membrane Excitability Disorders, Cardiac Bioelectricity and Arrhythmia Center, Washington University in St. Louis, St. Louis, Missouri, USA
| | - Lu Han
- Department of Biomedical Engineering, Center for the Investigation of Membrane Excitability Disorders, Cardiac Bioelectricity and Arrhythmia Center, Washington University in St. Louis, St. Louis, Missouri, USA
| | - Zhiguang Jia
- Department of Chemistry, University of Massachusetts Amherst, Amherst, Massachusetts, USA
| | - Jingyi Shi
- Department of Biomedical Engineering, Center for the Investigation of Membrane Excitability Disorders, Cardiac Bioelectricity and Arrhythmia Center, Washington University in St. Louis, St. Louis, Missouri, USA
| | - Jianmin Cui
- Department of Biomedical Engineering, Center for the Investigation of Membrane Excitability Disorders, Cardiac Bioelectricity and Arrhythmia Center, Washington University in St. Louis, St. Louis, Missouri, USA
| | - Jianhan Chen
- Department of Chemistry, University of Massachusetts Amherst, Amherst, Massachusetts, USA
|
28
|
Yu T, Boob AG, Volk MJ, Liu X, Cui H, Zhao H. Machine learning-enabled retrobiosynthesis of molecules. Nat Catal 2023. [DOI: 10.1038/s41929-022-00909-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2023]
|
29
|
Orlando M, Molla G, Castellani P, Pirillo V, Torretta V, Ferronato N. Microbial Enzyme Biotechnology to Reach Plastic Waste Circularity: Current Status, Problems and Perspectives. Int J Mol Sci 2023; 24:3877. [PMID: 36835289 PMCID: PMC9967032 DOI: 10.3390/ijms24043877] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2023] [Revised: 02/08/2023] [Accepted: 02/10/2023] [Indexed: 02/17/2023] Open
Abstract
The accumulation of synthetic plastic waste in the environment has become a global concern. Microbial enzymes (purified or as whole-cell biocatalysts) represent emerging biotechnological tools for waste circularity; they can depolymerize materials into reusable building blocks, but their contribution must be considered within the context of present waste management practices. This review reports on the prospects of biotechnological tools for plastic bio-recycling within the framework of plastic waste management in Europe. Available biotechnology tools can support polyethylene terephthalate (PET) recycling. However, PET represents only ≈7% of unrecycled plastic waste. Polyurethanes, the principal unrecycled waste fraction, together with other thermosets and more recalcitrant thermoplastics (e.g., polyolefins) are the next plausible target for enzyme-based depolymerization, even if this process is currently effective only on ideal polyester-based polymers. To extend the contribution of biotechnology to plastic circularity, optimization of collection and sorting systems should be considered to feed chemoenzymatic technologies for the treatment of more recalcitrant and mixed polymers. In addition, new bio-based technologies with a lower environmental impact in comparison with the present approaches should be developed to depolymerize (available or new) plastic materials, which should be designed for the required durability and for being susceptible to the action of enzymes.
Affiliation(s)
- Marco Orlando
- Department of Biotechnology and Life Sciences, University of Insubria, Via Dunant, 21100 Varese, Italy
| | - Gianluca Molla
- Department of Biotechnology and Life Sciences, University of Insubria, Via Dunant, 21100 Varese, Italy
| | - Pietro Castellani
- Department of Theoretical and Applied Sciences (DiSTA), University of Insubria, Via G.B. Vico 46, 21100 Varese, Italy
| | - Valentina Pirillo
- Department of Biotechnology and Life Sciences, University of Insubria, Via Dunant, 21100 Varese, Italy
| | - Vincenzo Torretta
- Department of Theoretical and Applied Sciences (DiSTA), University of Insubria, Via G.B. Vico 46, 21100 Varese, Italy
| | - Navarro Ferronato
- Department of Theoretical and Applied Sciences (DiSTA), University of Insubria, Via G.B. Vico 46, 21100 Varese, Italy
|
30
|
Luo Y. Sensing the shape of functional proteins with topology. Nat Comput Sci 2023; 3:124-125. [PMID: 38177630 DOI: 10.1038/s43588-023-00404-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2024]
Affiliation(s)
- Yunan Luo
- School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, USA.
|
31
|
Liu H, Chen Q. Computational protein design with data-driven approaches: Recent developments and perspectives. WIREs Comput Mol Sci 2022. [DOI: 10.1002/wcms.1646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Affiliation(s)
- Haiyan Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui, China
- School of Data Science, University of Science and Technology of China, Hefei, Anhui, China
| | - Quan Chen
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui, China
|
32
|
Creixell M, Kim H, Mohammadi F, Peyton SR, Meyer AS. Systems approaches to uncovering the contribution of environment-mediated drug resistance. Curr Opin Solid State Mater Sci 2022; 26:101005. [PMID: 36321161 PMCID: PMC9620953 DOI: 10.1016/j.cossms.2022.101005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Cancer drug response is heavily influenced by the extracellular matrix (ECM) environment. Despite a clear appreciation that the ECM influences cancer drug response and progression, a unified view of how, where, and when environment-mediated drug resistance contributes to cancer progression has not coalesced. Here, we survey some specific ways in which the ECM contributes to cancer resistance with a focus on how materials development can coincide with systems biology approaches to better understand and perturb this contribution. We argue that part of the reason that environment-mediated resistance remains a perplexing problem is our lack of a holistic view of the entire range of environments and their impacts on cell behavior. We cover a series of recent experimental and computational tools that will aid exploration of ECM reaction space, and how they might be synergistically integrated.
Affiliation(s)
- Marc Creixell
- Department of Bioengineering, University of California Los Angeles
| | - Hyuna Kim
- Molecular and Cellular Biology Graduate Program, University of Massachusetts Amherst
| | - Farnaz Mohammadi
- Department of Bioengineering, University of California Los Angeles
| | - Shelly R Peyton
- Molecular and Cellular Biology Graduate Program, University of Massachusetts Amherst
- Department of Chemical Engineering, University of Massachusetts Amherst
| | - Aaron S Meyer
- Department of Bioengineering, University of California Los Angeles
|
33
|
Hirschi S, Ward TR, Meier WP, Müller DJ, Fotiadis D. Synthetic Biology: Bottom-Up Assembly of Molecular Systems. Chem Rev 2022; 122:16294-16328. [PMID: 36179355 DOI: 10.1021/acs.chemrev.2c00339] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Indexed: 11/28/2022]
Abstract
The bottom-up assembly of biological and chemical components opens exciting opportunities to engineer artificial vesicular systems for applications with previously unmet requirements. The modular combination of scaffolds and functional building blocks enables the engineering of complex systems with biomimetic or new-to-nature functionalities. Inspired by the compartmentalized organization of cells and organelles, lipid or polymer vesicles are widely used as model membrane systems to investigate the translocation of solutes and the transduction of signals by membrane proteins. The bottom-up assembly and functionalization of such artificial compartments enables full control over their composition and can thus provide specifically optimized environments for synthetic biological processes. This review aims to inspire future endeavors by providing a diverse toolbox of molecular modules, engineering methodologies, and different approaches to assemble artificial vesicular systems. Important technical and practical aspects are addressed and selected applications are presented, highlighting particular achievements and limitations of the bottom-up approach. Complementing the cutting-edge technological achievements, fundamental aspects are also discussed to cater to the inherently diverse background of the target audience, which results from the interdisciplinary nature of synthetic biology. The engineering of proteins as functional modules and the use of lipids and block copolymers as scaffold modules for the assembly of functionalized vesicular systems are explored in detail. Particular emphasis is placed on ensuring the controlled assembly of these components into increasingly complex vesicular systems. Finally, all descriptions are presented in the greater context of engineering valuable synthetic biological systems for applications in biocatalysis, biosensing, bioremediation, or targeted drug delivery.
Affiliation(s)
- Stephan Hirschi
- Institute of Biochemistry and Molecular Medicine, University of Bern, Bühlstrasse 28, 3012 Bern, Switzerland; Molecular Systems Engineering, National Centre of Competence in Research (NCCR), 4002 Basel, Switzerland
| | - Thomas R Ward
- Department of Chemistry, University of Basel, St. Johanns-Ring 19, 4056 Basel, Switzerland; Molecular Systems Engineering, National Centre of Competence in Research (NCCR), 4002 Basel, Switzerland
| | - Wolfgang P Meier
- Department of Chemistry, University of Basel, St. Johanns-Ring 19, 4056 Basel, Switzerland; Molecular Systems Engineering, National Centre of Competence in Research (NCCR), 4002 Basel, Switzerland
| | - Daniel J Müller
- Department of Biosystems Science and Engineering, ETH Zürich, Mattenstrasse 26, 4058 Basel, Switzerland; Molecular Systems Engineering, National Centre of Competence in Research (NCCR), 4002 Basel, Switzerland
| | - Dimitrios Fotiadis
- Institute of Biochemistry and Molecular Medicine, University of Bern, Bühlstrasse 28, 3012 Bern, Switzerland; Molecular Systems Engineering, National Centre of Competence in Research (NCCR), 4002 Basel, Switzerland
|
34
|
Abstract
One core goal of genetics is to systematically understand the mapping between the DNA sequence of an organism (genotype) and its measurable characteristics (phenotype). Understanding this mapping is often challenging because of interactions between mutations, where the result of combining several different mutations can be very different than the sum of their individual effects. Here we provide a statistical framework for modeling complex genetic interactions of this type. The key idea is to ask how fast the effects of mutations change when introducing the same mutation in increasingly distant genetic backgrounds. We then propose a model for phenotypic prediction that takes into account this tendency for the effects of mutations to be more similar in nearby genetic backgrounds. Contemporary high-throughput mutagenesis experiments are providing an increasingly detailed view of the complex patterns of genetic interaction that occur between multiple mutations within a single protein or regulatory element. By simultaneously measuring the effects of thousands of combinations of mutations, these experiments have revealed that the genotype–phenotype relationship typically reflects not only genetic interactions between pairs of sites but also higher-order interactions among larger numbers of sites. However, modeling and understanding these higher-order interactions remains challenging. Here we present a method for reconstructing sequence-to-function mappings from partially observed data that can accommodate all orders of genetic interaction. The main idea is to make predictions for unobserved genotypes that match the type and extent of epistasis found in the observed data. 
This information on the type and extent of epistasis can be extracted by considering how phenotypic correlations change as a function of mutational distance, which is equivalent to estimating the fraction of phenotypic variance due to each order of genetic interaction (additive, pairwise, three-way, etc.). Using these estimated variance components, we then define an empirical Bayes prior that in expectation matches the observed pattern of epistasis and reconstruct the genotype–phenotype mapping by conducting Gaussian process regression under this prior. To demonstrate the power of this approach, we present an application to the antibody-binding domain GB1 and also provide a detailed exploration of a dataset consisting of high-throughput measurements for the splicing efficiency of human pre-mRNA 5′ splice sites, for which we also validate our model predictions via additional low-throughput experiments.
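The quantity this abstract builds on, how phenotypic correlation changes with mutational distance, can be sketched for a toy case. The example below assumes a complete, purely additive landscape over binary genotypes of length 4, which is a deliberately simple setting and not the partially observed data the method targets; the function names and the landscape are hypothetical.

```python
from itertools import product

def hamming(a, b):
    """Number of sites at which two genotypes differ."""
    return sum(x != y for x, y in zip(a, b))

def distance_correlation(phenotypes):
    """Map each mutational distance d to the phenotypic correlation
    between all ordered pairs of genotypes that are d mutations apart."""
    genotypes = list(phenotypes)
    mean = sum(phenotypes.values()) / len(genotypes)
    variance = sum((v - mean) ** 2 for v in phenotypes.values()) / len(genotypes)
    sums, counts = {}, {}
    for g in genotypes:
        for h in genotypes:
            d = hamming(g, h)
            sums[d] = sums.get(d, 0.0) + (phenotypes[g] - mean) * (phenotypes[h] - mean)
            counts[d] = counts.get(d, 0) + 1
    return {d: sums[d] / counts[d] / variance for d in sorted(sums)}

# Toy additive landscape: the phenotype is the number of mutated sites.
landscape = {g: float(sum(g)) for g in product((0, 1), repeat=4)}
corr = distance_correlation(landscape)
for d, value in corr.items():
    print(d, value)
```

For this additive toy landscape the correlation falls off linearly with distance (1 - 2d/L for L binary sites); departures from such a linear decay are the kind of signal that indicates pairwise or higher-order interactions.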
|
35
|
Qiu Y, Wei GW. CLADE 2.0: Evolution-Driven Cluster Learning-Assisted Directed Evolution. J Chem Inf Model 2022; 62:4629-4641. [PMID: 36154171 DOI: 10.1021/acs.jcim.2c01046] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/30/2022]
Abstract
Directed evolution, a revolutionary biotechnology in protein engineering, optimizes protein fitness by searching an astronomical mutational space via expensive experiments. The cluster learning-assisted directed evolution (CLADE) efficiently explores the mutational space via a combination of unsupervised hierarchical clustering and supervised learning. However, the initial-stage sampling in CLADE treats all clusters equally despite many clusters containing a large portion of non-functional mutations. Recent statistical and deep learning tools enable evolutionary density modeling to assess protein fitness in an unsupervised manner. In this work, we construct an ensemble of multiple evolutionary scores to guide the initial sampling in CLADE. The resulting evolutionary score-enhanced CLADE, called CLADE 2.0, efficiently selects a training set within a small informative space using the evolution-driven clustering sampling. CLADE 2.0 is validated by using two benchmark libraries, both having 160,000 sequences from four-site mutational combinations. Extensive computational experiments and comparisons with existing cutting-edge methods indicate that CLADE 2.0 is a new state-of-the-art tool for machine learning-assisted directed evolution.
Affiliation(s)
- Yuchi Qiu
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States; Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States; Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
|
36
|
Villalobos-Alva J, Ochoa-Toledo L, Villalobos-Alva MJ, Aliseda A, Pérez-Escamirosa F, Altamirano-Bustamante NF, Ochoa-Fernández F, Zamora-Solís R, Villalobos-Alva S, Revilla-Monsalve C, Kemper-Valverde N, Altamirano-Bustamante MM. Protein Science Meets Artificial Intelligence: A Systematic Review and a Biochemical Meta-Analysis of an Inter-Field. Front Bioeng Biotechnol 2022; 10:788300. [PMID: 35875501 PMCID: PMC9301016 DOI: 10.3389/fbioe.2022.788300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2021] [Accepted: 05/25/2022] [Indexed: 11/23/2022] Open
Abstract
Proteins are some of the most fascinating and challenging molecules in the universe, and they pose a big challenge for artificial intelligence. The implementation of machine learning/AI in protein science gives rise to a world of knowledge adventures in the workhorse of the cell and proteome homeostasis, which are essential for making life possible. This opens up epistemic horizons thanks to a coupling of human tacit-explicit knowledge with machine learning power, the benefits of which are already tangible, such as important advances in protein structure prediction. Moreover, the driving force behind the protein processes of self-organization, adjustment, and fitness requires a space corresponding to gigabytes of life data in its order of magnitude. There are many tasks such as novel protein design, protein folding pathways, and synthetic metabolic routes, as well as protein-aggregation mechanisms, pathogenesis of protein misfolding and disease, and proteostasis networks that are currently unexplored or unrevealed. In this systematic review and biochemical meta-analysis, we aim to contribute to bridging the gap between what we call binomial artificial intelligence (AI) and protein science (PS), a growing research enterprise with exciting and promising biotechnological and biomedical applications. We undertake our task by exploring "the state of the art" in AI and machine learning (ML) applications to protein science in the scientific literature to address some critical research questions in this domain, including What kind of tasks are already explored by ML approaches to protein sciences? What are the most common ML algorithms and databases used? What is the situational diagnostic of the AI-PS inter-field? What do ML processing steps have in common? We also formulate novel questions such as Is it possible to discover what the rules of protein evolution are with the binomial AI-PS? How do protein folding pathways evolve? What are the rules that dictate the folds? 
What are the minimal nuclear protein structures? How do protein aggregates form and why do they exhibit different toxicities? What are the structural properties of amyloid proteins? How can we design an effective proteostasis network to deal with misfolded proteins? We are a cross-functional group of scientists from several academic disciplines, and we have conducted the systematic review using a variant of the PICO and PRISMA approaches. The search was carried out in four databases (PubMed, Bireme, OVID, and EBSCO Web of Science), resulting in 144 research articles. After three rounds of quality screening, 93 articles were finally selected for further analysis. A summary of our findings is as follows: regarding AI applications, there are mainly four types: 1) genomics, 2) protein structure and function, 3) protein design and evolution, and 4) drug design. In terms of the ML algorithms and databases used, supervised learning was the most common approach (85%). As for the databases used for the ML models, PDB and UniprotKB/Swissprot were the most common ones (21 and 8%, respectively). Moreover, we identified that approximately 63% of the articles organized their results into three steps, which we labeled pre-process, process, and post-process. A few studies combined data from several databases or created their own databases after the pre-process. Our main finding is that, as of today, there are no research road maps serving as guides to address gaps in our knowledge of the AI-PS binomial. All research efforts to collect, integrate multidimensional data features, and then analyze and validate them are, so far, uncoordinated and scattered throughout the scientific literature without a clear epistemic goal or connection between the studies. 
Therefore, our main contribution to the scientific literature is to offer a road map to help solve problems in drug design, protein structures, design, and function prediction while also presenting the "state of the art" on research in the AI-PS binomial until February 2021. Thus, we pave the way toward future advances in the synthetic redesign of novel proteins and protein networks and artificial metabolic pathways, learning lessons from nature for the welfare of humankind. Many of the novel proteins and metabolic pathways are currently non-existent in nature, nor are they used in the chemical industry or biomedical field.
Affiliation(s)
- Jalil Villalobos-Alva
- Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
- Luis Ochoa-Toledo
- Instituto de Ciencias Aplicadas y Tecnología (ICAT), Universidad Nacional Autónoma de México (UNAM), Mexico City, Mexico
- Mario Javier Villalobos-Alva
- Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
- Atocha Aliseda
- Instituto de Investigaciones Filosóficas, Universidad Nacional Autónoma de México (UNAM), Mexico City, Mexico
- Fernando Pérez-Escamirosa
- Instituto de Ciencias Aplicadas y Tecnología (ICAT), Universidad Nacional Autónoma de México (UNAM), Mexico City, Mexico
- Francine Ochoa-Fernández
- Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
- Ricardo Zamora-Solís
- Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
- Sebastián Villalobos-Alva
- Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
- Cristina Revilla-Monsalve
- Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
- Nicolás Kemper-Valverde
- Instituto de Ciencias Aplicadas y Tecnología (ICAT), Universidad Nacional Autónoma de México (UNAM), Mexico City, Mexico
- Myriam M. Altamirano-Bustamante
- Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
37
Freschlin CR, Fahlberg SA, Romero PA. Machine learning to navigate fitness landscapes for protein engineering. Curr Opin Biotechnol 2022; 75:102713. [PMID: 35413604 PMCID: PMC9177649 DOI: 10.1016/j.copbio.2022.102713] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Received: 10/19/2021] [Revised: 01/05/2022] [Accepted: 02/28/2022] [Indexed: 11/19/2022]
Abstract
Machine learning (ML) is revolutionizing our ability to understand and predict the complex relationships between protein sequence, structure, and function. Predictive sequence-function models are enabling protein engineers to efficiently search the sequence space for useful proteins with broad applications in biotechnology. In this review, we highlight the recent advances in applying ML to protein engineering. We discuss supervised learning methods that infer the sequence-function mapping from experimental data and new sequence representation strategies for data-efficient modeling. We then describe the various ways in which ML can be incorporated into protein engineering workflows, including purely in silico searches, ML-assisted directed evolution, and generative models that can learn the underlying distribution of the protein function in a sequence space. ML-driven protein engineering will become increasingly powerful with continued advances in high-throughput data generation, data science, and deep learning.
Affiliation(s)
- Chase R Freschlin
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI, USA
- Sarah A Fahlberg
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI, USA
- Philip A Romero
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI, USA; Department of Chemical & Biological Engineering, University of Wisconsin-Madison, Madison, WI, USA.
38
Cheng F, Tuncbag N. Editorial overview: Artificial intelligence (AI) methodologies in structural biology. Curr Opin Struct Biol 2022; 74:102387. [PMID: 35589509 DOI: 10.1016/j.sbi.2022.102387] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/30/2022]
Affiliation(s)
- Feixiong Cheng
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA; Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44195, USA; Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA.
- Nurcan Tuncbag
- Department of Chemical and Biological Engineering, College of Engineering, Koc University, Istanbul, 34450, Turkey; School of Medicine, Koc University, Istanbul, 34450, Turkey.
39
Huang Y, Zhang M, Wang J, Xu D, Zhong C. Engineering microbial systems for the production and functionalization of biomaterials. Curr Opin Microbiol 2022; 68:102154. [PMID: 35568018 DOI: 10.1016/j.mib.2022.102154] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/04/2022] [Revised: 03/16/2022] [Accepted: 04/06/2022] [Indexed: 11/03/2022]
Abstract
A new trend in biomaterials synthesis is harnessing the production of microorganisms, owing to the low cost and sustainability. Because microorganisms use DNA as a production code, it is possible for humans to reprogram these cells and thus build living factories for the production of biomaterials. Over the past decade, advances in genetic engineering have enabled the development of various intriguing biomaterials with useful properties, with commercially available biomaterials representing only a few of these. In this review, we discuss the common strategies for the production of bulk and commodity biogenic polymers, and highlight several notable approaches such as modular protein engineering and pathway optimization in achieving these goals. We finally investigate the available synthetic biology tools that allow engineering of living materials, and discuss how this emerging class of materials has expanded the application scope of biomaterials.
Affiliation(s)
- Yuanyuan Huang
- Center for Materials Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Mingyi Zhang
- Center for Materials Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Jie Wang
- Center for Materials Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; Shenyang National Laboratory for Materials Science, Northeastern University, Shenyang, China; Electrobiomaterials Institute, Key Laboratory for Anisotropy and Texture of Materials (Ministry of Education), Northeastern University, Shenyang, China
- Dake Xu
- Shenyang National Laboratory for Materials Science, Northeastern University, Shenyang, China; Electrobiomaterials Institute, Key Laboratory for Anisotropy and Texture of Materials (Ministry of Education), Northeastern University, Shenyang, China
- Chao Zhong
- Center for Materials Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China.
40
Zhuang XY, Zhang YH, Xiao AF, Zhang AH, Fang BS. Key Enzymes in Fatty Acid Synthesis Pathway for Bioactive Lipids Biosynthesis. Front Nutr 2022; 9:851402. [PMID: 35284441 PMCID: PMC8905437 DOI: 10.3389/fnut.2022.851402] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 01/09/2022] [Accepted: 01/25/2022] [Indexed: 11/13/2022] Open
Abstract
Dietary bioactive lipids, one of the three primary nutrients, are not only essential for growth and provide nutrients and energy for life's activities but can also help guard against diseases such as Alzheimer's and cardiovascular diseases, further strengthening the immune system and maintaining many body functions. Many microorganisms, such as yeast, algae, and marine fungi, have been widely developed for dietary bioactive lipid production. These biosynthetic processes are not limited by climate or land and offer the advantages of shorter production periods and high conversion rates. However, the production process also faces the challenges of low stability, concentration, and productivity, which derive from limited knowledge of the critical enzymes in the metabolic pathway. Fortunately, the development of enzymatic research methods, including site-specific mutagenesis, protein dynamic simulation, and metabolic engineering technology, provides powerful tools to understand the catalytic process. Thus, we review the characteristics of the critical desaturases and elongases involved in the fatty acid synthesis pathway, aiming not only to provide extensive data for rational enzyme design and modification but also to offer a more profound and comprehensive understanding of the synthetic process of dietary bioactive lipids.
Affiliation(s)
- Xiao-Yan Zhuang
- College of Food and Biological Engineering, Jimei University, Xiamen, China
- Yong-Hui Zhang
- College of Food and Biological Engineering, Jimei University, Xiamen, China
- An-Feng Xiao
- College of Food and Biological Engineering, Jimei University, Xiamen, China
- Ai-Hui Zhang
- Department of Chemical and Biochemical Engineering, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, China
- *Correspondence: Ai-Hui Zhang
- Bai-Shan Fang
- College of Food and Biological Engineering, Jimei University, Xiamen, China
- Department of Chemical and Biochemical Engineering, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, China