1
Chu AE, Kim J, Cheng L, El Nesr G, Xu M, Shuai RW, Huang PS. An all-atom protein generative model. Proc Natl Acad Sci U S A 2024; 121:e2311500121. [PMID: 38916999 DOI: 10.1073/pnas.2311500121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2023] [Accepted: 05/13/2024] [Indexed: 06/27/2024] Open
Abstract
Proteins mediate their functions through chemical interactions; modeling these interactions, which are typically through sidechains, is an important need in protein design. However, constructing an all-atom generative model requires an appropriate scheme for managing the jointly continuous and discrete nature of proteins encoded in the structure and sequence. We describe an all-atom diffusion model of protein structure, Protpardelle, which represents all sidechain states at once as a "superposition" state; superpositions defining a protein are collapsed into individual residue types and conformations during sample generation. When combined with sequence design methods, our model is able to codesign all-atom protein structure and sequence. Generated proteins are of good quality under the typical quality, diversity, and novelty metrics, and sidechains reproduce the chemical features and behavior of natural proteins. Finally, we explore the potential of our model to conduct all-atom protein design and scaffold functional motifs in a backbone- and rotamer-free way.
Affiliation(s)
- Alexander E Chu
- Biophysics Program, Stanford University, Stanford, CA 94305
- Department of Bioengineering, Stanford University, Stanford, CA 94305
- Jinho Kim
- Department of Bioengineering, Stanford University, Stanford, CA 94305
- Department of Physics, Stanford University, Stanford, CA 94305
- Lucy Cheng
- Aquarium Learning, San Francisco, CA 94117
- Gina El Nesr
- Biophysics Program, Stanford University, Stanford, CA 94305
- Department of Bioengineering, Stanford University, Stanford, CA 94305
- Minkai Xu
- Department of Computer Science, Stanford University, Stanford, CA 94305
- Richard W Shuai
- Biophysics Program, Stanford University, Stanford, CA 94305
- Department of Bioengineering, Stanford University, Stanford, CA 94305
- Po-Ssu Huang
- Biophysics Program, Stanford University, Stanford, CA 94305
- Department of Bioengineering, Stanford University, Stanford, CA 94305
2
Guan A, He Z, Wang X, Jia ZJ, Qin J. Engineering the next-generation synthetic cell factory driven by protein engineering. Biotechnol Adv 2024; 73:108366. [PMID: 38663492 DOI: 10.1016/j.biotechadv.2024.108366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Revised: 03/21/2024] [Accepted: 04/22/2024] [Indexed: 05/09/2024]
Abstract
Synthetic cell factories offer substantial advantages in the economically efficient production of biofuels, chemicals, and pharmaceutical compounds. However, to create a high-performance synthetic cell factory, precise regulation of cellular material and energy flux is essential. In this context, protein components including enzymes, transcription factor-based biosensors and transporters play pivotal roles. Protein engineering aims to create novel protein variants with desired properties by modifying or designing protein sequences. This review summarizes the latest advances of protein engineering in optimizing various aspects of synthetic cell factories, including: enhancing enzyme activity to eliminate production bottlenecks, altering enzyme selectivity to steer metabolic pathways towards desired products, modifying enzyme promiscuity to explore innovative routes, and improving the efficiency of transporters. Furthermore, the use of protein engineering to modify protein-based biosensors accelerates the evolutionary process and optimizes the regulation of metabolic pathways. The remaining challenges and future opportunities in this field are also discussed.
Affiliation(s)
- Ailin Guan
- College of Biomass Science and Engineering, Sichuan University, Chengdu 610065, China
- Zixi He
- College of Biomass Science and Engineering, Sichuan University, Chengdu 610065, China
- Xin Wang
- West China School of Pharmacy, Sichuan University, Chengdu 610041, China
- Zhi-Jun Jia
- West China School of Pharmacy, Sichuan University, Chengdu 610041, China
- Jiufu Qin
- College of Biomass Science and Engineering, Sichuan University, Chengdu 610065, China.
3
Zhou L, Tao C, Shen X, Sun X, Wang J, Yuan Q. Unlocking the potential of enzyme engineering via rational computational design strategies. Biotechnol Adv 2024; 73:108376. [PMID: 38740355 DOI: 10.1016/j.biotechadv.2024.108376] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2023] [Revised: 04/27/2024] [Accepted: 05/08/2024] [Indexed: 05/16/2024]
Abstract
Enzymes play a pivotal role in various industries by enabling efficient, eco-friendly, and sustainable chemical processes. However, the low turnover rates and poor substrate selectivity of enzymes limit their large-scale applications. Rational computational enzyme design, facilitated by computational algorithms, offers a more targeted and less labor-intensive approach. Although there has been notable advancement in employing rational computational protein engineering strategies to overcome these issues, this progress has not been comprehensively reviewed so far. This article reviews recent developments in rational computational enzyme design, categorizing them into three types: structure-based, sequence-based, and data-driven machine learning computational design. Case studies are presented to demonstrate successful enhancements in catalytic activity, stability, and substrate selectivity. Lastly, the article provides a thorough analysis of these approaches, highlights existing challenges and potential solutions, and offers insights into future development directions.
Affiliation(s)
- Lei Zhou
- State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China
- Chunmeng Tao
- State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China
- Xiaolin Shen
- State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China
- Xinxiao Sun
- State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China
- Jia Wang
- State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China.
- Qipeng Yuan
- State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China.
4
Goverde CA, Pacesa M, Goldbach N, Dornfeld LJ, Balbi PEM, Georgeon S, Rosset S, Kapoor S, Choudhury J, Dauparas J, Schellhaas C, Kozlov S, Baker D, Ovchinnikov S, Vecchio AJ, Correia BE. Computational design of soluble and functional membrane protein analogues. Nature 2024:10.1038/s41586-024-07601-y. [PMID: 38898281 DOI: 10.1038/s41586-024-07601-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Accepted: 05/23/2024] [Indexed: 06/21/2024]
Abstract
De novo design of complex protein folds using solely computational means remains a substantial challenge. Here we use a robust deep learning pipeline to design complex folds and soluble analogues of integral membrane proteins. Unique membrane topologies, such as those from G-protein-coupled receptors, are not found in the soluble proteome, and we demonstrate that their structural features can be recapitulated in solution. Biophysical analyses demonstrate the high thermal stability of the designs, and experimental structures show remarkable design accuracy. The soluble analogues were functionalized with native structural motifs, as a proof of concept for bringing membrane protein functions to the soluble proteome, potentially enabling new approaches in drug discovery. In summary, we have designed complex protein topologies and enriched them with functionalities from membrane proteins, with high experimental success rates, leading to a de facto expansion of the functional soluble fold space.
Affiliation(s)
- Casper A Goverde
- Laboratory of Protein Design and Immunoengineering, École Polytechnique Fédérale de Lausanne and Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Martin Pacesa
- Laboratory of Protein Design and Immunoengineering, École Polytechnique Fédérale de Lausanne and Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Nicolas Goldbach
- Laboratory of Protein Design and Immunoengineering, École Polytechnique Fédérale de Lausanne and Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Lars J Dornfeld
- Laboratory of Protein Design and Immunoengineering, École Polytechnique Fédérale de Lausanne and Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Petra E M Balbi
- Laboratory of Protein Design and Immunoengineering, École Polytechnique Fédérale de Lausanne and Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Sandrine Georgeon
- Laboratory of Protein Design and Immunoengineering, École Polytechnique Fédérale de Lausanne and Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Stéphane Rosset
- Laboratory of Protein Design and Immunoengineering, École Polytechnique Fédérale de Lausanne and Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Srajan Kapoor
- Department of Structural Biology, University at Buffalo, Buffalo, NY, USA
- Jagrity Choudhury
- Department of Structural Biology, University at Buffalo, Buffalo, NY, USA
- Justas Dauparas
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Christian Schellhaas
- Laboratory of Protein Design and Immunoengineering, École Polytechnique Fédérale de Lausanne and Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Simon Kozlov
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
- David Baker
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Sergey Ovchinnikov
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Alex J Vecchio
- Department of Structural Biology, University at Buffalo, Buffalo, NY, USA
- Bruno E Correia
- Laboratory of Protein Design and Immunoengineering, École Polytechnique Fédérale de Lausanne and Swiss Institute of Bioinformatics, Lausanne, Switzerland.
5
Zhang F, Naeem M, Yu B, Liu F, Ju J. Improving the enzymatic activity and stability of N-carbamoyl hydrolase using deep learning approach. Microb Cell Fact 2024; 23:164. [PMID: 38834993 PMCID: PMC11151596 DOI: 10.1186/s12934-024-02439-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2024] [Accepted: 05/24/2024] [Indexed: 06/06/2024] Open
Abstract
BACKGROUND: Optically active D-amino acids are widely used as intermediates in the synthesis of antibiotics, insecticides, and peptide hormones. Currently, the two-enzyme cascade reaction is the most efficient way to produce D-amino acids using enzymes DHdt and DCase, but DCase is susceptible to heat inactivation. Here, to enhance the enzymatic activity and thermal stability of DCase, a rational design software "Feitian" was developed based on kcat prediction using the deep learning approach. RESULTS: According to empirical design and prediction of "Feitian" software, six single-point mutants with high kcat value were selected and successfully constructed by site-directed mutagenesis. Out of six, three mutants (Q4C, T212S, and A302C) showed higher enzymatic activity than the wild-type. Furthermore, the combined triple-point mutant DCase-M3 (Q4C/T212S/A302C) exhibited a 4.25-fold increase in activity (29.77 ± 4.52 U) and a 2.25-fold increase in thermal stability as compared to the wild-type, respectively. Through the whole-cell reaction, the high titer of D-HPG (2.57 ± 0.43 mM) was produced by the mutant Q4C/T212S/A302C, which was about 2.04-fold of the wild-type. Molecular dynamics simulation results showed that DCase-M3 significantly enhances the rigidity of the catalytic site and thus increases the activity of DCase-M3. CONCLUSIONS: In this study, an efficient rational design software "Feitian" was successfully developed with a prediction accuracy of about 50% in enzymatic activity. A triple-point mutant DCase-M3 (Q4C/T212S/A302C) with enhanced enzymatic activity and thermostability was successfully obtained, which could be applied to the development of a fully enzymatic process for the industrial production of D-HPG.
Affiliation(s)
- Fa Zhang
- College of Life Science, Hebei Normal University, Shijiazhuang, 050024, China
- Institute of Microbiology, Chinese Academy of Sciences, Beijing, 100101, China
- Muhammad Naeem
- College of Life Science, Hebei Normal University, Shijiazhuang, 050024, China
- Bo Yu
- Institute of Microbiology, Chinese Academy of Sciences, Beijing, 100101, China
- Feixia Liu
- Institute of Microbiology, Chinese Academy of Sciences, Beijing, 100101, China.
- Jiansong Ju
- College of Life Science, Hebei Normal University, Shijiazhuang, 050024, China.
- Hebei Collaborative Innovation Center for Eco-Environment, Shijiazhuang, 050024, China.
6
Zhang R, Chai N, Liu T, Zheng Z, Lin Q, Xie X, Wen J, Yang Z, Liu YG, Zhu Q. The type V effectors for CRISPR/Cas-mediated genome engineering in plants. Biotechnol Adv 2024; 74:108382. [PMID: 38801866 DOI: 10.1016/j.biotechadv.2024.108382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Revised: 05/07/2024] [Accepted: 05/24/2024] [Indexed: 05/29/2024]
Abstract
A plethora of CRISPR effectors, such as Cas3, Cas9, and Cas12a, are commonly employed as gene editing tools. Among these, Cas12 effectors developed based on Class II type V proteins exhibit distinct characteristics compared to Class II type VI and type II effectors, such as their ability to generate non-allelic DNA double-strand breaks, their compact structures, and the presence of a single RuvC-like nuclease domain. Capitalizing on these advantages, Cas12 family proteins have been increasingly explored and utilized in recent years. However, the characteristics and applications of different subfamilies within the type V protein family have not been systematically summarized. In this review, we focus on the characteristics of type V effector (CRISPR/Cas12) proteins and the current methods used to discover new effector proteins. We also summarize recent modifications based on engineering of type V effectors. In addition, we introduce the applications of type V effectors for gene editing in animals and plants, including the development of base editors, tools for regulating gene expression, methods for gene targeting, and biosensors. We emphasize the prospects for development and application of CRISPR/Cas12 effectors with the goal of better utilizing toolkits based on this protein family for crop improvement and enhanced agricultural production.
Affiliation(s)
- Ruixiang Zhang
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangdong Laboratory for Lingnan Modern Agriculture, College of Life Sciences, South China Agricultural University, Guangzhou 510642, China
- Nan Chai
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangdong Laboratory for Lingnan Modern Agriculture, College of Life Sciences, South China Agricultural University, Guangzhou 510642, China
- Taoli Liu
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangdong Laboratory for Lingnan Modern Agriculture, College of Life Sciences, South China Agricultural University, Guangzhou 510642, China
- Zhiye Zheng
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangdong Laboratory for Lingnan Modern Agriculture, College of Life Sciences, South China Agricultural University, Guangzhou 510642, China
- Qiupeng Lin
- College of Agriculture, South China Agricultural University, Guangzhou 510642, China
- Xianrong Xie
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangdong Laboratory for Lingnan Modern Agriculture, College of Life Sciences, South China Agricultural University, Guangzhou 510642, China; College of Agriculture, South China Agricultural University, Guangzhou 510642, China
- Jun Wen
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangdong Laboratory for Lingnan Modern Agriculture, College of Life Sciences, South China Agricultural University, Guangzhou 510642, China
- Zi Yang
- College of Natural & Agricultural Sciences, University of California, Riverside, 900 University Ave, Riverside, CA 92507, USA
- Yao-Guang Liu
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangdong Laboratory for Lingnan Modern Agriculture, College of Life Sciences, South China Agricultural University, Guangzhou 510642, China; College of Agriculture, South China Agricultural University, Guangzhou 510642, China.
- Qinlong Zhu
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangdong Laboratory for Lingnan Modern Agriculture, College of Life Sciences, South China Agricultural University, Guangzhou 510642, China; College of Agriculture, South China Agricultural University, Guangzhou 510642, China.
7
Hu X, Xu Y, Yi J, Wang C, Zhu Z, Yue T, Zhang H, Wang X, Wu F, Xue L, Bai L, Liu H, Chen Q. Using Protein Design and Directed Evolution to Monomerize a Bright Near-Infrared Fluorescent Protein. ACS Synth Biol 2024; 13:1177-1190. [PMID: 38552148 DOI: 10.1021/acssynbio.3c00643] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/20/2024]
Abstract
The small ultrared fluorescent protein (smURFP) is a bright near-infrared (NIR) fluorescent protein (FP) that forms a dimer and binds its fluorescence chromophore, biliverdin, at its dimer interface. To engineer a monomeric NIR FP based on smURFP potentially more suitable for bioimaging, we employed protein design to extend the protein backbone with a new segment of two helices that shield the original dimer interface while covering the biliverdin binding pocket in place of the second chain in the original dimer. We experimentally characterized 13 designs and obtained a monomeric protein with a weak fluorescence. We enhanced the fluorescence of this designed protein through two rounds of directed evolution and obtained designed monomeric smURFP (DMsmURFP), a bright, stable, and monomeric NIR FP with a molecular weight of 19.6 kDa. We determined the crystal structures of DMsmURFP both in the apo state and in complex with biliverdin, which confirmed the designed structure. The use of DMsmURFP in in vivo imaging of mammalian systems was demonstrated. The backbone design-based strategy used here can also be applied to monomerize other naturally multimeric proteins with intersubunit functional sites.
Affiliation(s)
- Xiuhong Hu
- Department of Rheumatology and Immunology, The First Affiliated Hospital of USTC, Center for Advanced Interdisciplinary Science and Biomedicine of IHM, Hefei National Center for Interdisciplinary Sciences at the Microscale, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Yang Xu
- Department of Rheumatology and Immunology, The First Affiliated Hospital of USTC, Center for Advanced Interdisciplinary Science and Biomedicine of IHM, Hefei National Center for Interdisciplinary Sciences at the Microscale, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Junxi Yi
- Department of Rheumatology and Immunology, The First Affiliated Hospital of USTC, Center for Advanced Interdisciplinary Science and Biomedicine of IHM, Hefei National Center for Interdisciplinary Sciences at the Microscale, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
- Chenchen Wang
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Zhongliang Zhu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Ting Yue
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Haiyan Zhang
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Xinyu Wang
- Department of Rheumatology and Immunology, The First Affiliated Hospital of USTC, Center for Advanced Interdisciplinary Science and Biomedicine of IHM, Hefei National Center for Interdisciplinary Sciences at the Microscale, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Fan Wu
- Department of Rheumatology and Immunology, The First Affiliated Hospital of USTC, Center for Advanced Interdisciplinary Science and Biomedicine of IHM, Hefei National Center for Interdisciplinary Sciences at the Microscale, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Lin Xue
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui 230027, China
- Li Bai
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui 230027, China
- Haiyan Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui 230027, China
- School of Data Science, University of Science and Technology of China, Hefei, Anhui 230027, China
- Quan Chen
- Department of Rheumatology and Immunology, The First Affiliated Hospital of USTC, Center for Advanced Interdisciplinary Science and Biomedicine of IHM, Hefei National Center for Interdisciplinary Sciences at the Microscale, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui 230027, China
8
Xu Y, Hu X, Wang C, Liu Y, Chen Q, Liu H. De novo design of cavity-containing proteins with a backbone-centered neural network energy function. Structure 2024; 32:424-432.e4. [PMID: 38325370 DOI: 10.1016/j.str.2024.01.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2023] [Revised: 10/04/2023] [Accepted: 01/11/2024] [Indexed: 02/09/2024]
Abstract
The design of small-molecule-binding proteins requires protein backbones that contain cavities. Previous design efforts were based on naturally occurring cavity-containing backbone architectures. Here, we designed diverse cavity-containing backbones without predefined architectures by introducing tailored restraints into the backbone sampling driven by SCUBA (Side Chain-Unknown Backbone Arrangement), a neural network statistical energy function. For 521 out of 5816 designs, the root-mean-square deviations (RMSDs) of the Cα atoms for the AlphaFold2-predicted structures and our designed structures are within 2.0 Å. We experimentally tested 10 designed proteins and determined the crystal structures of two of them. One closely agrees with the designed model, while the other forms a domain-swapped dimer, where the partial structures are in agreement with the designed structures. Our results indicate that data-driven methods such as SCUBA hold great potential for designing de novo proteins with tailored small-molecule-binding function.
Affiliation(s)
- Yang Xu
- Department of Rheumatology and Immunology, The First Affiliated Hospital of USTC, Centre for Advanced Interdisciplinary Science and Biomedicine of IHM, Hefei National Center for Interdisciplinary Sciences at the Microscale, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230001, China; MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Xiuhong Hu
- Department of Rheumatology and Immunology, The First Affiliated Hospital of USTC, Centre for Advanced Interdisciplinary Science and Biomedicine of IHM, Hefei National Center for Interdisciplinary Sciences at the Microscale, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230001, China; MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Chenchen Wang
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Yongrui Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Quan Chen
- Department of Rheumatology and Immunology, The First Affiliated Hospital of USTC, Centre for Advanced Interdisciplinary Science and Biomedicine of IHM, Hefei National Center for Interdisciplinary Sciences at the Microscale, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230001, China; MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China; Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui 230027, China.
- Haiyan Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China; Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui 230027, China; School of Data Science, University of Science and Technology of China, Hefei, Anhui 230027, China.
9
Listov D, Goverde CA, Correia BE, Fleishman SJ. Opportunities and challenges in design and optimization of protein function. Nat Rev Mol Cell Biol 2024:10.1038/s41580-024-00718-y. [PMID: 38565617 DOI: 10.1038/s41580-024-00718-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/27/2024] [Indexed: 04/04/2024]
Abstract
The field of protein design has made remarkable progress over the past decade. Historically, the low reliability of purely structure-based design methods limited their application, but recent strategies that combine structure-based and sequence-based calculations, as well as machine learning tools, have dramatically improved protein engineering and design. In this Review, we discuss how these methods have enabled the design of increasingly complex structures and therapeutically relevant activities. Additionally, protein optimization methods have improved the stability and activity of complex eukaryotic proteins. Thanks to their increased reliability, computational design methods have been applied to improve therapeutics and enzymes for green chemistry and have generated vaccine antigens, antivirals and drug-delivery nano-vehicles. Moreover, the high success of design methods reflects an increased understanding of basic rules that govern the relationships among protein sequence, structure and function. However, de novo design is still limited mostly to α-helix bundles, restricting its potential to generate sophisticated enzymes and diverse protein and small-molecule binders. Designing complex protein structures is a challenging but necessary next step if we are to realize our objective of generating new-to-nature activities.
Affiliation(s)
- Dina Listov
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel
| | - Casper A Goverde
- Institute of Bioengineering, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Bruno E Correia
- Institute of Bioengineering, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.
| | - Sarel Jacob Fleishman
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel.
| |
|
10
|
Goverde CA, Pacesa M, Goldbach N, Dornfeld LJ, Balbi PEM, Georgeon S, Rosset S, Kapoor S, Choudhury J, Dauparas J, Schellhaas C, Kozlov S, Baker D, Ovchinnikov S, Vecchio AJ, Correia BE. Computational design of soluble functional analogues of integral membrane proteins. bioRxiv 2024:2023.05.09.540044. [PMID: 38496615 PMCID: PMC10942269 DOI: 10.1101/2023.05.09.540044] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 03/19/2024]
Abstract
De novo design of complex protein folds using solely computational means remains a significant challenge. Here, we use a robust deep learning pipeline to design complex folds and soluble analogues of integral membrane proteins. Unique membrane topologies, such as those from GPCRs, are not found in the soluble proteome and we demonstrate that their structural features can be recapitulated in solution. Biophysical analyses reveal high thermal stability of the designs and experimental structures show remarkable design accuracy. The soluble analogues were functionalized with native structural motifs, standing as a proof-of-concept for bringing membrane protein functions to the soluble proteome, potentially enabling new approaches in drug discovery. In summary, we designed complex protein topologies and enriched them with functionalities from membrane proteins, with high experimental success rates, leading to a de facto expansion of the functional soluble fold space.
|
11
|
Jänes J, Beltrao P. Deep learning for protein structure prediction and design-progress and applications. Mol Syst Biol 2024; 20:162-169. [PMID: 38291232 PMCID: PMC10912668 DOI: 10.1038/s44320-024-00016-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Revised: 12/21/2023] [Accepted: 01/11/2024] [Indexed: 02/01/2024] Open
Abstract
Proteins are the key molecular machines that orchestrate all biological processes of the cell. Most proteins fold into three-dimensional shapes that are critical for their function. Studying the 3D shape of proteins can inform us of the mechanisms that underlie biological processes in living cells and can have practical applications in the study of disease mutations or the discovery of novel drug treatments. Here, we review the progress made in sequence-based prediction of protein structures with a focus on applications that go beyond the prediction of single monomer structures. This includes the application of deep learning methods for the prediction of structures of protein complexes, different conformations, the evolution of protein structures and the application of these methods to protein design. These developments create new opportunities for research that will have impact across many areas of biomedical research.
Affiliation(s)
- Jürgen Jänes
- Institute of Molecular Systems Biology, ETH Zürich, 8093, Zürich, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Pedro Beltrao
- Institute of Molecular Systems Biology, ETH Zürich, 8093, Zürich, Switzerland.
- Swiss Institute of Bioinformatics, Lausanne, Switzerland.
| |
|
12
|
Chu AE, Lu T, Huang PS. Sparks of function by de novo protein design. Nat Biotechnol 2024; 42:203-215. [PMID: 38361073 DOI: 10.1038/s41587-024-02133-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/05/2023] [Accepted: 01/09/2024] [Indexed: 02/17/2024]
Abstract
Information in proteins flows from sequence to structure to function, with each step causally driven by the preceding one. Protein design is founded on inverting this process: specify a desired function, design a structure executing this function, and find a sequence that folds into this structure. This 'central dogma' underlies nearly all de novo protein-design efforts. Our ability to accomplish these tasks depends on our understanding of protein folding and function and our ability to capture this understanding in computational methods. In recent years, deep learning-derived approaches for efficient and accurate structure modeling and enrichment of successful designs have enabled progression beyond the design of protein structures and towards the design of functional proteins. We examine these advances in the broader context of classical de novo protein design and consider implications for future challenges to come, including fundamental capabilities such as sequence and structure co-design and conformational control considering flexibility, and functional objectives such as antibody and enzyme design.
Affiliation(s)
- Alexander E Chu
- Biophysics Program, Stanford University, Palo Alto, CA, USA
- Department of Bioengineering, Stanford University, Palo Alto, CA, USA
- Google DeepMind, London, UK
| | - Tianyu Lu
- Department of Bioengineering, Stanford University, Palo Alto, CA, USA
| | - Po-Ssu Huang
- Biophysics Program, Stanford University, Palo Alto, CA, USA.
- Department of Bioengineering, Stanford University, Palo Alto, CA, USA.
| |
|
13
|
Liu Y, Liu H. Protein sequence design on given backbones with deep learning. Protein Eng Des Sel 2024; 37:gzad024. [PMID: 38157313 DOI: 10.1093/protein/gzad024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Revised: 12/08/2023] [Accepted: 12/18/2023] [Indexed: 01/03/2024] Open
Abstract
Deep learning methods for protein sequence design focus on modeling and sampling the many-dimensional distribution of amino acid sequences conditioned on the backbone structure. To produce physically foldable sequences, inter-residue couplings need to be considered properly. These couplings are treated explicitly in iterative methods or autoregressive methods. Non-autoregressive models treating these couplings implicitly are computationally more efficient, but still await tests by wet experiment. Currently, sequence design methods are evaluated mainly using native sequence recovery rate and native sequence perplexity. These metrics can be complemented by sequence-structure compatibility metrics obtained from energy calculation or structure prediction. However, existing computational metrics have important limitations that may render the generalization of computational test results to performance in real applications unwarranted. Validation of design methods by wet experiments should be encouraged.
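The two evaluation metrics named in this abstract, native sequence recovery rate and native sequence perplexity, can be illustrated with a minimal Python sketch. It is not taken from the cited review: the function names, the per-position dictionary representation of the model's predicted distribution, and the toy inputs are assumptions made for illustration only.

```python
import math


def sequence_recovery(designed: str, native: str) -> float:
    """Fraction of positions where the designed residue equals the native one."""
    if len(designed) != len(native) or not native:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(d == n for d, n in zip(designed, native))
    return matches / len(native)


def native_perplexity(native: str, probs: list[dict[str, float]]) -> float:
    """exp of the mean negative log-probability assigned to the native residues.

    probs[i] is a hypothetical per-position distribution over amino acids
    predicted by a design model for position i (an assumed input format).
    """
    if len(probs) != len(native) or not native:
        raise ValueError("need one distribution per native residue")
    total = 0.0
    for aa, dist in zip(native, probs):
        p = dist.get(aa, 0.0)
        if p <= 0.0:
            return math.inf  # the model rules out the native residue
        total -= math.log(p)
    return math.exp(total / len(native))


# Toy usage: a uniform model over the 20 amino acids has perplexity 20.
alphabet = "ACDEFGHIKLMNPQRSTVWY"
uniform = [{aa: 1.0 / len(alphabet) for aa in alphabet} for _ in range(4)]
recovery = sequence_recovery("ACDE", "ACDF")
ppl = native_perplexity("ACDF", uniform)
```

Higher recovery and lower perplexity both indicate closer agreement with the native sequence, which is why the abstract argues they should be complemented by sequence-structure compatibility metrics rather than used alone.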
Affiliation(s)
- Yufeng Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
| | - Haiyan Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui 230027, China
- School of Biomedical Engineering, Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, Jiangsu 215004, China
| |
|
14
|
Min J, Rong X, Zhang J, Su R, Wang Y, Qi W. Computational Design of Peptide Assemblies. J Chem Theory Comput 2024; 20:532-550. [PMID: 38206800 DOI: 10.1021/acs.jctc.3c01054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/13/2024]
Abstract
With the ongoing development of peptide self-assembling materials, there is growing interest in exploring novel functional peptide sequences. From short peptides to long polypeptides, as the functionality increases, the sequence space is also expanding exponentially. Consequently, attempting to explore all functional sequences comprehensively through experience and experiments alone has become impractical. By utilizing computational methods, especially artificial intelligence enhanced molecular dynamics (MD) simulation and de novo peptide design, there has been a significant expansion in the exploration of sequence space. Through these methods, a variety of supramolecular functional materials, including fibers, two-dimensional arrays, nanocages, etc., have been designed by meticulously controlling the inter- and intramolecular interactions. In this review, we first provide a brief overview of the current main computational methods and then focus on the computational design methods for various self-assembled peptide materials. Additionally, we introduce some representative protein self-assemblies to offer guidance for the design of self-assembling peptides.
Affiliation(s)
- Jiwei Min
- State Key Laboratory of Chemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, P. R. China
| | - Xi Rong
- State Key Laboratory of Chemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, P. R. China
| | - Jiaxing Zhang
- State Key Laboratory of Chemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, P. R. China
| | - Rongxin Su
- State Key Laboratory of Chemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, P. R. China
- Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), Tianjin 300072, P. R. China
- Tianjin Key Laboratory of Membrane Science and Desalination Technology, Tianjin 300072, P. R. China
| | - Yuefei Wang
- State Key Laboratory of Chemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, P. R. China
- Tianjin Key Laboratory of Membrane Science and Desalination Technology, Tianjin 300072, P. R. China
| | - Wei Qi
- State Key Laboratory of Chemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, P. R. China
- Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), Tianjin 300072, P. R. China
- Tianjin Key Laboratory of Membrane Science and Desalination Technology, Tianjin 300072, P. R. China
| |
|
15
|
Min X, Yang C, Xie J, Huang Y, Liu N, Jin X, Wang T, Kong Z, Lu X, Ge S, Zhang J, Xia N. Tpgen: a language model for stable protein design with a specific topology structure. BMC Bioinformatics 2024; 25:35. [PMID: 38254030 PMCID: PMC10804651 DOI: 10.1186/s12859-024-05637-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Accepted: 01/03/2024] [Indexed: 01/24/2024] Open
Abstract
BACKGROUND: Natural proteins occupy a small portion of the protein sequence space, whereas artificial proteins can explore a wider range of possibilities within the sequence space. However, specific requirements may not be met when generating sequences blindly. Research indicates that small proteins have notable advantages, including high stability, accurate resolution prediction, and facile specificity modification. RESULTS: This study involves the construction of a neural network model named TopoProGenerator (TPGen) using a transformer decoder. The model is trained with sequences consisting of a maximum of 65 amino acids. The training process of TopoProGenerator incorporates reinforcement learning and adversarial learning for fine-tuning. Additionally, it encompasses a stability predictive model trained with a dataset comprising over 200,000 sequences. The results demonstrate that TopoProGenerator is capable of designing stable small protein sequences with specified topology structures. CONCLUSION: TPGen has the ability to generate protein sequences that fold into the specified topology, and the pretraining and fine-tuning methods proposed in this study can serve as a framework for designing various types of proteins.
Affiliation(s)
- Xiaoping Min
- School of Informatics, Institute of Artificial Intelligence, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, Collaborative Innovation Centers of Biologic Products, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
- State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory, No. 422 Siming South Rd, Xiamen, 361005, China
| | - Chongzhou Yang
- School of Informatics, Institute of Artificial Intelligence, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, Collaborative Innovation Centers of Biologic Products, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
| | - Jun Xie
- School of Informatics, Institute of Artificial Intelligence, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, Collaborative Innovation Centers of Biologic Products, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
| | - Yang Huang
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, Collaborative Innovation Centers of Biologic Products, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
- School of Life Sciences, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
| | - Nan Liu
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, Collaborative Innovation Centers of Biologic Products, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
- School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
| | - Xiaocheng Jin
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, Collaborative Innovation Centers of Biologic Products, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
- School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
| | - Tianshu Wang
- School of Informatics, Institute of Artificial Intelligence, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, Collaborative Innovation Centers of Biologic Products, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
| | - Zhibo Kong
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, Collaborative Innovation Centers of Biologic Products, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
- School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
- State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory, No. 422 Siming South Rd, Xiamen, 361005, China
| | - Xiaoli Lu
- Information and Networking Center, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
| | - Shengxiang Ge
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, Collaborative Innovation Centers of Biologic Products, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China.
- School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China.
- State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory, No. 422 Siming South Rd, Xiamen, 361005, China.
| | - Jun Zhang
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, Collaborative Innovation Centers of Biologic Products, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
- School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
- State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory, No. 422 Siming South Rd, Xiamen, 361005, China
| | - Ningshao Xia
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, Collaborative Innovation Centers of Biologic Products, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
- School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
- State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory, No. 422 Siming South Rd, Xiamen, 361005, China
| |
|
16
|
Teng F, Cui T, Zhou L, Gao Q, Zhou Q, Li W. Programmable synthetic receptors: the next-generation of cell and gene therapies. Signal Transduct Target Ther 2024; 9:7. [PMID: 38167329 PMCID: PMC10761793 DOI: 10.1038/s41392-023-01680-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Revised: 09/22/2023] [Accepted: 10/11/2023] [Indexed: 01/05/2024] Open
Abstract
Cell and gene therapies hold tremendous promise for treating a range of difficult-to-treat diseases. However, concerns over safety and efficacy need to be further addressed in order to realize their full potential. Synthetic receptors, a synthetic biology tool that can precisely control the function of therapeutic cells and genetic modules, have been rapidly developed and applied as a powerful solution. Delicately designed and engineered, they can be applied to fine-tune the therapeutic activities, i.e., to regulate production of dosed, bioactive payloads by sensing and processing user-defined signals or biomarkers. This review provides an overview of diverse synthetic receptor systems being used to reprogram therapeutic cells and their wide applications in biomedical research. With a special focus on four synthetic receptor systems at the forefront, including chimeric antigen receptors (CARs) and synthetic Notch (synNotch) receptors, we address the generalized strategies to design, construct and improve synthetic receptors. Meanwhile, we also highlight the expanding landscape of therapeutic applications of the synthetic receptor systems as well as current challenges in their clinical translation.
Affiliation(s)
- Fei Teng
- University of Chinese Academy of Sciences, Beijing, 101408, China.
| | - Tongtong Cui
- State Key Laboratory of Stem Cell and Regenerative Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
- Institute for Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing, 100101, China
| | - Li Zhou
- University of Chinese Academy of Sciences, Beijing, 101408, China
- State Key Laboratory of Stem Cell and Regenerative Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
- Institute for Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing, 100101, China
| | - Qingqin Gao
- University of Chinese Academy of Sciences, Beijing, 101408, China
- State Key Laboratory of Stem Cell and Regenerative Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
- Institute for Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing, 100101, China
| | - Qi Zhou
- University of Chinese Academy of Sciences, Beijing, 101408, China.
- State Key Laboratory of Stem Cell and Regenerative Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China.
- Institute for Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing, 100101, China.
- Beijing Institute for Stem Cell and Regenerative Medicine, Beijing, 100101, China.
| | - Wei Li
- University of Chinese Academy of Sciences, Beijing, 101408, China.
- State Key Laboratory of Stem Cell and Regenerative Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China.
- Institute for Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing, 100101, China.
- Beijing Institute for Stem Cell and Regenerative Medicine, Beijing, 100101, China.
| |
|
17
|
Zhang X, Yin H, Ling F, Zhan J, Zhou Y. SPIN-CGNN: Improved fixed backbone protein design with contact map-based graph construction and contact graph neural network. PLoS Comput Biol 2023; 19:e1011330. [PMID: 38060617 PMCID: PMC10729952 DOI: 10.1371/journal.pcbi.1011330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Revised: 12/19/2023] [Accepted: 11/27/2023] [Indexed: 12/20/2023] Open
Abstract
Recent advances in deep learning have significantly improved the ability to infer protein sequences directly from protein structures for fixed-backbone design. The methods have evolved from the early use of multi-layer perceptrons to convolutional neural networks, transformers, and graph neural networks (GNN). However, the conventional approach of constructing a K-nearest-neighbors (KNN) graph for GNN has limited the utilization of edge information, which plays a critical role in network performance. Here we introduced SPIN-CGNN based on protein contact maps for nearest neighbors. Together with auxiliary edge updates and selective kernels, we found that SPIN-CGNN provided a comparable performance in refolding ability by AlphaFold2 to the current state-of-the-art techniques but a significant improvement over them in terms of sequence recovery, perplexity, deviation from amino-acid compositions of native sequences, conservation of hydrophobic positions, and low complexity regions, according to the test by unseen structures, "hallucinated" structures and diffusion models. Results suggest that low complexity regions in the sequences designed by deep learning, for generated structures in particular, remain to be improved, when compared to the native sequences.
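The contrast this abstract draws between a K-nearest-neighbors graph and a contact-map-based graph can be sketched in a few lines of Python with NumPy. This is an illustrative sketch under stated assumptions, not the SPIN-CGNN implementation: the function names, the use of one coordinate per residue, and the 8.0 Å distance cutoff are all hypothetical choices, since the abstract does not state how contacts are defined.

```python
import numpy as np


def pairwise_distances(coords: np.ndarray) -> np.ndarray:
    """Euclidean distance matrix for an (n_residues, 3) coordinate array."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def knn_edges(coords: np.ndarray, k: int) -> set[tuple[int, int]]:
    """Directed edges from each residue to its k nearest residues."""
    dist = pairwise_distances(coords)
    np.fill_diagonal(dist, np.inf)  # exclude self-edges
    neighbors = np.argsort(dist, axis=1)[:, :k]
    return {(i, int(j)) for i in range(len(coords)) for j in neighbors[i]}


def contact_edges(coords: np.ndarray, cutoff: float = 8.0) -> set[tuple[int, int]]:
    """Directed edges between residues closer than `cutoff` (assumed value, in angstroms)."""
    dist = pairwise_distances(coords)
    np.fill_diagonal(dist, np.inf)  # exclude self-edges
    rows, cols = np.nonzero(dist < cutoff)
    return set(zip(rows.tolist(), cols.tolist()))


# Toy usage: three residues close together and one far away.
coords = np.array([[0.0, 0.0, 0.0],
                   [3.8, 0.0, 0.0],
                   [7.6, 0.0, 0.0],
                   [30.0, 0.0, 0.0]])
knn = knn_edges(coords, k=2)
contacts = contact_edges(coords)
```

In this toy case the KNN graph always gives every residue exactly k neighbors, so the distant residue is connected to residues it does not touch, whereas the contact graph leaves it isolated. The number of edges per residue in a contact graph therefore varies with local packing.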
Affiliation(s)
- Xing Zhang
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou, People’s Republic of China
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, People’s Republic of China
| | - Hongmei Yin
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, People’s Republic of China
| | - Fei Ling
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou, People’s Republic of China
| | - Jian Zhan
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, People’s Republic of China
| | - Yaoqi Zhou
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, People’s Republic of China
| |
|
18
|
Wang J, Chen C, Yao G, Ding J, Wang L, Jiang H. Intelligent Protein Design and Molecular Characterization Techniques: A Comprehensive Review. Molecules 2023; 28:7865. [PMID: 38067593 PMCID: PMC10707872 DOI: 10.3390/molecules28237865] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2023] [Revised: 11/13/2023] [Accepted: 11/23/2023] [Indexed: 12/18/2023] Open
Abstract
In recent years, the widespread application of artificial intelligence algorithms in protein structure, function prediction, and de novo protein design has significantly accelerated the process of intelligent protein design and led to many noteworthy achievements. This advancement in protein intelligent design holds great potential to accelerate the development of new drugs, enhance the efficiency of biocatalysts, and even create entirely new biomaterials. Protein characterization is the key to the performance of intelligent protein design. However, there is no consensus on the most suitable characterization method for intelligent protein design tasks. This review describes the methods, characteristics, and representative applications of traditional descriptors, sequence-based and structure-based protein characterization. It discusses their advantages, disadvantages, and scope of application. It is hoped that this could help researchers to better understand the limitations and application scenarios of these methods, and provide valuable references for choosing appropriate protein characterization techniques for related research in the field, so as to better carry out protein research.
Affiliation(s)
| | | | | | - Junjie Ding
- State Key Laboratory of NBC Protection for Civilian, Beijing 102205, China; (J.W.); (C.C.); (G.Y.)
| | - Liangliang Wang
- State Key Laboratory of NBC Protection for Civilian, Beijing 102205, China; (J.W.); (C.C.); (G.Y.)
| | - Hui Jiang
- State Key Laboratory of NBC Protection for Civilian, Beijing 102205, China; (J.W.); (C.C.); (G.Y.)
| |
|
19
|
Bai G, Sun C, Guo Z, Wang Y, Zeng X, Su Y, Zhao Q, Ma B. Accelerating antibody discovery and design with artificial intelligence: Recent advances and prospects. Semin Cancer Biol 2023; 95:13-24. [PMID: 37355214 DOI: 10.1016/j.semcancer.2023.06.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/23/2023] [Revised: 06/09/2023] [Accepted: 06/18/2023] [Indexed: 06/26/2023]
Abstract
Therapeutic antibodies are the largest class of biotherapeutics and have been successful in treating human diseases. However, the design and discovery of antibody drugs remains challenging and time-consuming. Recently, artificial intelligence technology has had an incredible impact on antibody design and discovery, resulting in significant advances in antibody discovery, optimization, and developability. This review summarizes major machine learning (ML) methods and their applications for computational predictors of antibody structure and antigen interface/interaction, as well as the evaluation of antibody developability. Additionally, this review addresses the current status of ML-based therapeutic antibodies under preclinical and clinical phases. While many challenges remain, ML may offer a new therapeutic option for the future direction of fully computational antibody design.
Affiliation(s)
- Ganggang Bai
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Chuance Sun
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Ziang Guo
- Cancer Center, Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Taipa, Macao Special Administrative Region of China
| | - Yangjing Wang
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Xincheng Zeng
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Yuhong Su
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Qi Zhao
- Cancer Center, Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Taipa, Macao Special Administrative Region of China; MoE Frontiers Science Center for Precision Oncology, University of Macau, Taipa, Macao Special Administrative Region of China.
| | - Buyong Ma
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai Digiwiser BioTechnolgy, Limited, Shanghai 201203, China.
| |
|
20
|
Zhang L, Liu H. Exploring binding positions and backbone conformations of peptide ligands of proteins with a backbone-centred statistical energy function. J Comput Aided Mol Des 2023; 37:463-478. [PMID: 37498491 DOI: 10.1007/s10822-023-00518-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Accepted: 07/05/2023] [Indexed: 07/28/2023]
Abstract
When designing peptide ligands based on the structure of a protein receptor, it can be very useful to narrow down the possible binding positions and bound conformations of the ligand without the need to choose its amino acid sequence in advance. Here, we construct and benchmark a tool for this purpose based on a recently reported statistical energy model named SCUBA (Sidechain-Unknown Backbone Arrangement) for designing protein backbones without considering specific amino acid sequences. With this tool, backbone fragments of different local conformation types are generated and optimized with SCUBA-driven stochastic simulations and simulated annealing, and then ranked and clustered to obtain representative backbone fragment poses of strong SCUBA interaction energies with the receptor. We computationally benchmarked the tool on 111 known protein-peptide complex structures. When the bound ligands are in the strand conformation, the method is able to generate backbone fragments of both low SCUBA energies and low root mean square deviations from experimental structures of peptide ligands. When the bound ligands are helices or coils, low-energy backbone fragments with binding poses similar to experimental structures have been generated for approximately 50% of benchmark cases. We have examined a number of predicted ligand-receptor complexes by atomistic molecular dynamics simulations, in which the peptide ligands have been found to stay at the predicted binding sites and to maintain their local conformations. These results suggest that promising backbone structures of peptides bound to protein receptors can be designed by identifying outstanding minima on the SCUBA-modeled backbone energy landscape.
Affiliation(s)
- Lu Zhang
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230027, Anhui, China
- Haiyan Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230027, Anhui, China.
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, 230027, Anhui, China.
- School of Data Science, University of Science and Technology of China, Hefei, 230027, Anhui, China.
21. Lin P, Yan Y, Tao H, Huang SY. Deep transfer learning for inter-chain contact predictions of transmembrane protein complexes. Nat Commun 2023; 14:4935. [PMID: 37582780 PMCID: PMC10427616 DOI: 10.1038/s41467-023-40426-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/27/2023] [Accepted: 07/21/2023] [Indexed: 08/17/2023] Open
Abstract
Membrane proteins are encoded by approximately a quarter of human genes. Inter-chain residue-residue contact information is important for structure prediction of membrane protein complexes and valuable for understanding their molecular mechanism. Although many deep learning methods have been proposed to predict the intra-protein contacts or helix-helix interactions in membrane proteins, it is still challenging to accurately predict their inter-chain contacts due to the limited number of transmembrane proteins. Addressing the challenge, here we develop a deep transfer learning method for predicting inter-chain contacts of transmembrane protein complexes, named DeepTMP, by taking advantage of the knowledge pre-trained from a large data set of non-transmembrane proteins. DeepTMP utilizes a geometric triangle-aware module to capture the correct inter-chain interaction from the coevolution information generated by protein language models. DeepTMP is extensively evaluated on a test set of 52 self-associated transmembrane protein complexes, and compared with state-of-the-art methods including DeepHomo2.0, CDPred, GLINTER, DeepHomo, and DNCON2_Inter. It is shown that DeepTMP considerably improves the precision of inter-chain contact prediction and outperforms the existing approaches in both accuracy and robustness.
Affiliation(s)
- Peicong Lin
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
- Yumeng Yan
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
- Huanyu Tao
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
- Sheng-You Huang
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China.
22. Zhang XE, Liu C, Dai J, Yuan Y, Gao C, Feng Y, Wu B, Wei P, You C, Wang X, Si T. Enabling technology and core theory of synthetic biology. Sci China Life Sci 2023; 66:1742-1785. [PMID: 36753021 PMCID: PMC9907219 DOI: 10.1007/s11427-022-2214-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 08/08/2022] [Accepted: 10/04/2022] [Indexed: 02/09/2023]
Abstract
Synthetic biology provides a new paradigm for life science research ("build to learn") and opens the future journey of biotechnology ("build to use"). Here, we discuss advances of various principles and technologies in the mainstream of the enabling technology of synthetic biology, including synthesis and assembly of a genome, DNA storage, gene editing, molecular evolution and de novo design of function proteins, cell and gene circuit engineering, cell-free synthetic biology, artificial intelligence (AI)-aided synthetic biology, as well as biofoundries. We also introduce the concept of quantitative synthetic biology, which is guiding synthetic biology towards increased accuracy and predictability or the real rational design. We conclude that synthetic biology will establish its disciplinary system with the iterative development of enabling technologies and the maturity of the core theory.
Affiliation(s)
- Xian-En Zhang
- Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Shenzhen, 518055, China.
- National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China.
- Chenli Liu
- Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Shenzhen, 518055, China.
- Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.
- Junbiao Dai
- Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Shenzhen, 518055, China.
- Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.
- Yingjin Yuan
- Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, China.
- Caixia Gao
- State Key Laboratory of Plant Cell and Chromosome Engineering, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China.
- Yan Feng
- State Key Laboratory of Microbial Metabolism, Shanghai Jiao Tong University, Shanghai, 200240, China.
- Bian Wu
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing, 100101, China.
- Ping Wei
- Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Shenzhen, 518055, China.
- Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.
- Chun You
- Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China.
- Xiaowo Wang
- Ministry of Education Key Laboratory of Bioinformatics; Center for Synthetic and Systems Biology; Bioinformatics Division, Beijing National Research Center for Information Science and Technology; Department of Automation, Tsinghua University, Beijing, 100084, China.
- Tong Si
- Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Shenzhen, 518055, China.
- Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.
23. Liu X, Duan Y, Hong X, Xie J, Liu S. Challenges in structural modeling of RNA-protein interactions. Curr Opin Struct Biol 2023; 81:102623. [PMID: 37301066 DOI: 10.1016/j.sbi.2023.102623] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Revised: 05/14/2023] [Accepted: 05/16/2023] [Indexed: 06/12/2023]
Abstract
In the past few years, the number of RNA-binding proteins (RBPs) and RNA-RBP interactions has increased significantly. Here, we review recent developments in the methodology for protein-RNA and protein-protein complex structure modeling with deep learning and co-evolution, and discuss the challenges and opportunities for building a reliable approach for protein-RNA complex structure modeling. Protein Data Bank (PDB) and cross-linking immunoprecipitation (CLIP) data could be combined and used to infer the 2D geometry of protein-RNA interactions by deep learning.
Affiliation(s)
- Xudong Liu
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
- Yingtian Duan
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
- Xu Hong
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
- Juan Xie
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
- Shiyong Liu
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China.
24. Madani A, Krause B, Greene ER, Subramanian S, Mohr BP, Holton JM, Olmos JL, Xiong C, Sun ZZ, Socher R, Fraser JS, Naik N. Large language models generate functional protein sequences across diverse families. Nat Biotechnol 2023; 41:1099-1106. [PMID: 36702895 PMCID: PMC10400306 DOI: 10.1038/s41587-022-01618-2] [Citation(s) in RCA: 167] [Impact Index Per Article: 167.0] [Received: 07/12/2022] [Accepted: 11/17/2022] [Indexed: 01/27/2023]
Abstract
Deep-learning language models have shown promise in various biotechnological applications, including protein design and engineering. Here we describe ProGen, a language model that can generate protein sequences with a predictable function across large protein families, akin to generating grammatically and semantically correct natural language sentences on diverse topics. The model was trained on 280 million protein sequences from >19,000 families and is augmented with control tags specifying protein properties. ProGen can be further fine-tuned to curated sequences and tags to improve controllable generation performance of proteins from families with sufficient homologous samples. Artificial proteins fine-tuned to five distinct lysozyme families showed similar catalytic efficiencies as natural lysozymes, with sequence identity to natural proteins as low as 31.4%. ProGen is readily adapted to diverse protein families, as we demonstrate with chorismate mutase and malate dehydrogenase.
Affiliation(s)
- Ali Madani
- Salesforce Research, Palo Alto, CA, USA.
- Profluent Bio, San Francisco, CA, USA.
- Eric R Greene
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
- Subu Subramanian
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Howard Hughes Medical Institute, University of California, Berkeley, Berkeley, CA, USA
- James M Holton
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, Menlo Park, CA, USA
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
- Jose Luis Olmos
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
- James S Fraser
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
25. Yan J, Li S, Zhang Y, Hao A, Zhao Q. ZetaDesign: an end-to-end deep learning method for protein sequence design and side-chain packing. Brief Bioinform 2023; 24:bbad257. [PMID: 37429578 DOI: 10.1093/bib/bbad257] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/28/2023] [Revised: 06/05/2023] [Accepted: 06/21/2023] [Indexed: 07/12/2023] Open
Abstract
Computational protein design has been demonstrated to be the most powerful tool in the last few years for protein design and repacking tasks. In practice, these two tasks are strongly related but often treated separately. Besides, state-of-the-art deep-learning-based methods cannot provide interpretability from an energy perspective, affecting the accuracy of the design. Here we propose a new systematic approach, including both a posterior probability part and a joint probability part, to solve the two essential questions once and for all. This approach takes the physicochemical properties of amino acids into consideration and uses the joint probability model to ensure the convergence between structure and amino acid type. Our results demonstrated that this method could generate feasible, high-confidence sequences with low-energy side-chain conformations. The designed sequences can fold into target structures with high confidence and maintain relatively stable biochemical properties. The side-chain conformations have significantly lower energy without resorting to a rotamer library or performing expensive conformational searches. Overall, we propose an end-to-end method that combines the advantages of both deep learning and energy-based methods. The design results of this model demonstrate high efficiency and precision, as well as a low energy state and good interpretability.
Affiliation(s)
- Junyu Yan
- State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
- Shuai Li
- State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
- Ying Zhang
- The Key Laboratory of Cell Proliferation and Regulation Biology, Ministry of Education, College of Life Sciences, Beijing Normal University, Beijing, China
- Aimin Hao
- State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
- Qinping Zhao
- State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
26. Huang T, Li Y. Current progress, challenges, and future perspectives of language models for protein representation and protein design. Innovation (N Y) 2023; 4:100446. [PMID: 37485078 PMCID: PMC10362512 DOI: 10.1016/j.xinn.2023.100446] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2023] [Accepted: 05/18/2023] [Indexed: 07/25/2023] Open
Abstract
The sequence-structure-function paradigm of proteins is the basis of molecular biology. What is the underlying mechanism of this correspondence between sequence and structure/function? We reviewed the methods for protein representation and protein design. With these protein representation models, we can accurately predict many properties of proteins, such as stability and binding affinity. ProGen, Chroma, RF Diffusion, SCUBA, and other protein design models have demonstrated how human-designed artificial proteins can have desired biological functions. Protein design will revolutionize drug development, and more efficient artificial enzymes that break down industrial waste or plastics will contribute to carbon neutrality. We also discussed the three greatest challenges of protein design in the future and possible solutions.
Affiliation(s)
- Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Yixue Li
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China
- Guangzhou Laboratory, Guangzhou 510005, China
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Collaborative Innovation Center for Genetics and Development, Fudan University, Shanghai 200433, China
27. Chu AE, Cheng L, Nesr GE, Xu M, Huang PS. An all-atom protein generative model. bioRxiv 2023:2023.05.24.542194. [PMID: 37292974 PMCID: PMC10245864 DOI: 10.1101/2023.05.24.542194] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Proteins mediate their functions through chemical interactions; modeling these interactions, which are typically through sidechains, is an important need in protein design. However, constructing an all-atom generative model requires an appropriate scheme for managing the jointly continuous and discrete nature of proteins encoded in the structure and sequence. We describe an all-atom diffusion model of protein structure, Protpardelle, which instantiates a "superposition" over the possible sidechain states, and collapses it to conduct reverse diffusion for sample generation. When combined with sequence design methods, our model is able to co-design all-atom protein structure and sequence. Generated proteins are of good quality under the typical quality, diversity, and novelty metrics, and sidechains reproduce the chemical features and behavior of natural proteins. Finally, we explore the potential of our model to conduct all-atom protein design and scaffold functional motifs in a backbone- and rotamer-free way.
Affiliation(s)
- Alexander E. Chu
- Biophysics Program, Stanford University
- Department of Bioengineering, Stanford University
- Gina El Nesr
- Biophysics Program, Stanford University
- Department of Bioengineering, Stanford University
- Minkai Xu
- Department of Computer Science, Stanford University
- Po-Ssu Huang
- Department of Bioengineering, Stanford University
28. Huang J, Xie X, Zheng Z, Ye L, Wang P, Xu L, Wu Y, Yan J, Yang M, Yan Y. De Novo Computational Design of a Lipase with Hydrolysis Activity towards Middle-Chained Fatty Acid Esters. Int J Mol Sci 2023; 24:ijms24108581. [PMID: 37239928 DOI: 10.3390/ijms24108581] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/20/2023] [Revised: 05/08/2023] [Accepted: 05/09/2023] [Indexed: 05/28/2023] Open
Abstract
Innovations in biocatalysts provide great prospects for intolerant environments or novel reactions. Due to the limited catalytic capacity and the long-term and labor-intensive characteristics of mining enzymes with the desired functions, de novo enzyme design was developed to obtain industrial application candidates in a rapid and convenient way. Here, based on the catalytic mechanisms and the known structures of proteins, we proposed a computational protein design strategy combining de novo enzyme design and laboratory-directed evolution. Starting with the theozyme constructed using a quantum-mechanical approach, the theoretical enzyme-skeleton combinations were assembled and optimized via the Rosetta "inside-out" protocol. A small number of designed sequences were experimentally screened using SDS-PAGE, mass spectrometry and a qualitative activity assay in which the designed enzyme 1a8uD1 exhibited a measurable hydrolysis activity of 24.25 ± 0.57 U/g towards p-nitrophenyl octanoate. To improve the activity of the designed enzyme, molecular dynamics simulations and the RosettaDesign application were utilized to further optimize the substrate binding mode and amino acid sequence while keeping the residues of the theozyme intact. The redesigned lipase 1a8uD1-M8 displayed enhanced hydrolysis activity towards p-nitrophenyl octanoate, 3.34 times higher than that of 1a8uD1. Meanwhile, the natural skeleton protein (PDB entry 1a8u) did not display any hydrolysis activity, confirming that the hydrolysis abilities of the designed 1a8uD1 and the redesigned 1a8uD1-M8 were devised from scratch. More importantly, the designed 1a8uD1-M8 was also able to hydrolyze the natural middle-chained substrate (glycerol trioctanoate), for which the activity was 27.67 ± 0.69 U/g. This study indicates that the strategy employed here has great potential to generate novel enzymes exhibiting the desired reactions.
Affiliation(s)
- Jinsha Huang
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Xiaoman Xie
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Zhen Zheng
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Luona Ye
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Pengbo Wang
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Li Xu
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Ying Wu
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Jinyong Yan
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Min Yang
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Yunjun Yan
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
29. Malbranke C, Bikard D, Cocco S, Monasson R, Tubiana J. Machine learning for evolutionary-based and physics-inspired protein design: Current and future synergies. Curr Opin Struct Biol 2023; 80:102571. [PMID: 36947951 DOI: 10.1016/j.sbi.2023.102571] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 08/01/2022] [Revised: 01/29/2023] [Accepted: 02/07/2023] [Indexed: 03/24/2023]
Abstract
Computational protein design facilitates the discovery of novel proteins with prescribed structure and functionality. Exciting designs were recently reported using novel data-driven methodologies that can be roughly divided into two categories: evolutionary-based and physics-inspired approaches. The former infer characteristic sequence features shared by sets of evolutionary-related proteins, such as conserved or coevolving positions, and recombine them to generate candidates with similar structure and function. The latter approaches estimate key biochemical properties, such as structure free energy, conformational entropy, or binding affinities using machine learning surrogates, and optimize them to yield improved designs. Here, we review recent progress along both tracks, discuss their strengths and weaknesses, and highlight opportunities for synergistic approaches.
Affiliation(s)
- Cyril Malbranke
- Laboratory of Physics of the Ecole Normale Supérieure, PSL Research, CNRS UMR 8023, Sorbonne Université, Université de Paris, Paris, France; Institut Pasteur, Université Paris Cité, CNRS UMR 6047, Synthetic Biology, 75015 Paris, France.
- David Bikard
- Institut Pasteur, Université Paris Cité, CNRS UMR 6047, Synthetic Biology, 75015 Paris, France
- Simona Cocco
- Laboratory of Physics of the Ecole Normale Supérieure, PSL Research, CNRS UMR 8023, Sorbonne Université, Université de Paris, Paris, France
- Rémi Monasson
- Laboratory of Physics of the Ecole Normale Supérieure, PSL Research, CNRS UMR 8023, Sorbonne Université, Université de Paris, Paris, France
- Jérôme Tubiana
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel.
30. Pan W, Hu G, Li S, Li G, Feng X, Wu Z, Zhang D, Qin L, Wang X, Hu L, Xu J, Hu L, Jia Y, Wen X, Wang J, Zhang C, Zhou J, Li W, Wang X, Wang Y, Wang S. Nanonitrator: novel enhancer of inorganic nitrate’s protective effects, predicated on swarm learning approach. Sci Bull (Beijing) 2023; 68:838-850. [PMID: 37029030 DOI: 10.1016/j.scib.2023.03.043] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/28/2022] [Revised: 03/14/2023] [Accepted: 03/22/2023] [Indexed: 03/31/2023]
Abstract
Inorganic nitrate is an indispensable nutrient that has been used in experimental studies for the prevention and treatment of several diseases. However, the short half-life of nitrate limits its clinical application. To increase the usability of nitrate and overcome the challenges of traditional combination drug discovery through large-scale high-throughput biological experiments, we developed a swarm learning-based combination drug prediction system that identified vitamin C as the drug of choice to be combined with nitrate. Employing microencapsulation technology, we used vitamin C, sodium nitrate, and chitosan 3000 as the core materials to prepare a nitrate nanoparticle, which we named Nanonitrator. The long-circulating delivery ability of nitrate by Nanonitrator significantly increased the efficacy and effect duration of nitrate in irradiation-induced salivary gland injury, without compromising safety. Nanonitrator at the same dose could better maintain intracellular homeostasis than nitrate (with or without vitamin C), emphasizing its potential for clinical use. More importantly, our work provides a method for incorporating inorganic compounds into sustained-release nanoparticles.
31. Chen S, Xu Z, Ding B, Zhang Y, Liu S, Cai C, Li M, Dale BE, Jin M. Big data mining, rational modification, and ancestral sequence reconstruction inferred multiple xylose isomerases for biorefinery. Sci Adv 2023; 9:eadd8835. [PMID: 36724227 PMCID: PMC9891696 DOI: 10.1126/sciadv.add8835] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/11/2022] [Accepted: 12/30/2022] [Indexed: 05/28/2023]
Abstract
The isomerization of xylose to xylulose is considered the most promising approach to initiate xylose bioconversion. Here, phylogeny-guided big data mining, rational modification, and ancestral sequence reconstruction strategies were implemented to explore new active xylose isomerases (XIs) for Saccharomyces cerevisiae. Significantly, 13 new active XIs for S. cerevisiae were mined or artificially created. Moreover, the importance of the amino-terminal fragment for maintaining basic XI activity was demonstrated. With the mined XIs, four efficient xylose-utilizing S. cerevisiae were constructed and evolved, among which the strain S. cerevisiae CRD5HS contributed to ethanol titers as high as 85.95 and 94.76 g/liter from pretreated corn stover and corn cob, respectively, without detoxifying or washing pretreated biomass. Potential genetic targets obtained from adaptive laboratory evolution were further analyzed by sequencing the high-performance strains. The combined XI mining methods described here provide practical references for mining other scarce and valuable enzymes.
Affiliation(s)
- Sitong Chen
- School of Environmental and Biological Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
- Biorefinery Research Institution, Nanjing University of Science and Technology, Nanjing 210094, China
- Zhaoxian Xu
- School of Environmental and Biological Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
- Biorefinery Research Institution, Nanjing University of Science and Technology, Nanjing 210094, China
- Boning Ding
- School of Environmental and Biological Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
- Biorefinery Research Institution, Nanjing University of Science and Technology, Nanjing 210094, China
- Yuwei Zhang
- School of Environmental and Biological Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
- Biorefinery Research Institution, Nanjing University of Science and Technology, Nanjing 210094, China
- Shuangmei Liu
- School of Environmental and Biological Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
- Biorefinery Research Institution, Nanjing University of Science and Technology, Nanjing 210094, China
- Chenggu Cai
- School of Environmental and Biological Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
- Biorefinery Research Institution, Nanjing University of Science and Technology, Nanjing 210094, China
- Muzi Li
- School of Environmental and Biological Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
- Biorefinery Research Institution, Nanjing University of Science and Technology, Nanjing 210094, China
- Bruce E. Dale
- Biomass Conversion Research Laboratory, Department of Chemical Engineering and Materials Science, Michigan State University, East Lansing, MI 48824, USA
- Great Lakes Bioenergy Research Centre (GLBRC), Michigan State University, East Lansing, MI, 48824 USA
- Mingjie Jin
- School of Environmental and Biological Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
- Biorefinery Research Institution, Nanjing University of Science and Technology, Nanjing 210094, China
32. Lu H, Cheng Z, Hu Y, Tang LV. What Can De Novo Protein Design Bring to the Treatment of Hematological Disorders? Biology (Basel) 2023; 12:biology12020166. [PMID: 36829445 PMCID: PMC9952452 DOI: 10.3390/biology12020166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2022] [Revised: 01/17/2023] [Accepted: 01/18/2023] [Indexed: 01/22/2023]
Abstract
Protein therapeutics have been widely used to treat hematological disorders. With the advent of de novo protein design, protein therapeutics are not limited to ameliorating natural proteins but also produce novel protein sequences, folds, and functions with shapes and functions customized to bind to the therapeutic targets. De novo protein techniques have been widely used biomedically to design novel diagnostic and therapeutic drugs, novel vaccines, and novel biological materials. In addition, de novo protein design has provided new options for treating hematological disorders. Scientists have designed protein switches called Colocalization-dependent Latching Orthogonal Cage-Key pRoteins (Co-LOCKR) that perform computations on the surface of cells. De novo designed molecules exhibit a better capacity than the currently available tyrosine kinase inhibitors in chronic myeloid leukemia therapy. De novo designed protein neoleukin-2/15 enhances chimeric antigen receptor T-cell activity. This new technique has great biomedical potential, especially in exploring new treatment methods for hematological disorders. This review discusses the development of de novo protein design and its biological applications, with emphasis on the treatment of hematological disorders.
33. Ferruz N, Heinzinger M, Akdel M, Goncearenco A, Naef L, Dallago C. From sequence to function through structure: Deep learning for protein design. Comput Struct Biotechnol J 2022; 21:238-250. [PMID: 36544476 PMCID: PMC9755234 DOI: 10.1016/j.csbj.2022.11.014] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 08/31/2022] [Revised: 11/05/2022] [Accepted: 11/05/2022] [Indexed: 11/20/2022] Open
Abstract
The process of designing biomolecules, in particular proteins, is witnessing a rapid change in available tooling and approaches, moving from design through physicochemical force fields to producing plausible, complex sequences fast via end-to-end differentiable statistical models. To achieve conditional and controllable protein design, researchers at the interface of artificial intelligence and biology leverage advances in natural language processing (NLP) and computer vision techniques, coupled with advances in computing hardware, to learn patterns from growing biological databases, curated annotations thereof, or both. Once learned, these patterns can be used to provide novel insights into mechanistic biology and the design of biomolecules. However, navigating and understanding the practical applications for the many recent protein design tools is complex. To facilitate this, we 1) document recent advances in deep learning (DL) assisted protein design from the last three years, 2) present a practical pipeline that allows one to go from de novo-generated sequences to their predicted properties and web-powered visualization within minutes, and 3) leverage it to suggest a generated protein sequence which might be used to engineer a biosynthetic gene cluster to produce a molecular glue-like compound. Lastly, we discuss challenges and highlight opportunities for the protein design field.
Key Words
- ADMM, Alternating Direction Method of Multipliers
- CNN, Convolutional Neural Network
- DL, Deep learning
- Deep learning
- Drug discovery
- FNN, fully-connected neural network
- GAN, Generative Adversarial Network
- GCN, Graph Convolutional Network
- GNN, Graph Neural Network
- GO, Gene Ontology
- GVP, Geometric Vector Perceptron
- LSTM, Long-Short Term Memory
- MLP, Multilayer Perceptron
- MSA, Multiple Sequence Alignment
- NLP, Natural Language Processing
- NSR, Natural Sequence Recovery
- Protein design
- Protein language models
- Protein prediction
- VAE, Variational Autoencoder
- pLM, protein Language Model
Affiliation(s)
- Noelia Ferruz
- Institute of Informatics and Applications, University of Girona, Girona, Spain
- Department of Biochemistry, University of Bayreuth, Bayreuth, Germany
| | - Michael Heinzinger
- Department of Informatics, Bioinformatics & Computational Biology, Technische Universität München, 85748 Garching, Germany
| | - Mehmet Akdel
- VantAI, 151 W 42nd Street, New York, NY 10036, United States
| | | | - Luca Naef
- VantAI, 151 W 42nd Street, New York, NY 10036, United States
| | - Christian Dallago
- Department of Informatics, Bioinformatics & Computational Biology, Technische Universität München, 85748 Garching, Germany
- VantAI, 151 W 42nd Street, New York, NY 10036, United States
- NVIDIA DE GmbH, Einsteinstraße 172, 81677 München, Germany
| |
|
34
|
Liu H, Chen Q. Computational protein design with data‐driven approaches: Recent developments and perspectives. WIRES COMPUTATIONAL MOLECULAR SCIENCE 2022. [DOI: 10.1002/wcms.1646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Affiliation(s)
- Haiyan Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui, China
- School of Data Science, University of Science and Technology of China, Hefei, Anhui, China
| | - Quan Chen
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui, China
| |
|
35
|
Gao B, Huang Y, Peng C, Lin B, Liao Y, Bian C, Yang J, Shi Q. High-Throughput Prediction and Design of Novel Conopeptides for Biomedical Research and Development. BIODESIGN RESEARCH 2022; 2022:9895270. [PMID: 37850131 PMCID: PMC10521759 DOI: 10.34133/2022/9895270] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2022] [Accepted: 07/23/2022] [Indexed: 10/19/2023] Open
Abstract
Cone snail venoms have been considered a valuable treasure for international scientists and businessmen, mainly due to their pharmacological applications in the development of marine drugs for the treatment of various human diseases. To date, around 800 Conus species have been recorded, and each of them produces over 1,000 venom peptides (termed conopeptides or conotoxins). This reflects the high diversity and complexity of cone snails, although most of their venoms are still uncharacterized. Advanced multiomics approaches (such as genomics, transcriptomics, and proteomics) have recently been developed to mine diverse Conus venom samples, with the main aim of predicting and identifying potentially interesting conopeptides efficiently. Some bioinformatics techniques have been applied to predict and design novel conopeptide sequences, related targets, and their binding modes. This review provides an overview of current knowledge on the high diversity of conopeptides and multiomics advances in high-throughput prediction of novel conopeptide sequences, as well as molecular modeling and design of potential drugs based on the predicted or validated interactions between these toxins and their molecular targets.
Affiliation(s)
- Bingmiao Gao
- Key Laboratory of Tropical Translational Medicine of Ministry of Education, School of Pharmacy, Hainan Medical University, Haikou, Hainan 570102, China
| | - Yu Huang
- Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, Shenzhen, Guangdong 518081, China
| | - Chao Peng
- Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, Shenzhen, Guangdong 518081, China
- BGI-Marine Research Institute for Biomedical Technology, Shenzhen Huahong Marine Biomedicine Co. Ltd., Shenzhen, Guangdong 518119, China
| | - Bo Lin
- Hainan Provincial Key Laboratory of Carcinogenesis and Intervention, Hainan Medical University, Haikou, Hainan 570102, China
| | - Yanling Liao
- Key Laboratory of Tropical Translational Medicine of Ministry of Education, School of Pharmacy, Hainan Medical University, Haikou, Hainan 570102, China
| | - Chao Bian
- Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, Shenzhen, Guangdong 518081, China
| | - Jiaan Yang
- Research and Development Department, Micro Pharmtech Ltd., Wuhan, Hubei 430075, China
| | - Qiong Shi
- Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, Shenzhen, Guangdong 518081, China
- BGI-Marine Research Institute for Biomedical Technology, Shenzhen Huahong Marine Biomedicine Co. Ltd., Shenzhen, Guangdong 518119, China
| |
|
36
|
Qiao D, Chen Y, Tan H, Zhou R, Feng J. De novo design of transmembrane nanopores. Sci China Chem 2022. [DOI: 10.1007/s11426-022-1354-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
|
37
|
Li H, Lyv Y, Zhou S, Yu S, Zhou J. Microbial cell factories for the production of flavonoids-barriers and opportunities. BIORESOURCE TECHNOLOGY 2022; 360:127538. [PMID: 35777639 DOI: 10.1016/j.biortech.2022.127538] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 05/19/2022] [Revised: 06/24/2022] [Accepted: 06/26/2022] [Indexed: 06/15/2023]
Abstract
Flavonoids are natural plant products with important nutritional value, health-promoting benefits, and therapeutic potential. The use of microbial cell factories to generate flavonoids is an appealing option. This review compares the microbial biosynthesis of flavonoids with the classic plant extraction approach and presents their pharmaceutical applications. It summarizes approaches for effective flavonoid biosynthesis in microorganisms and discusses the challenges and prospects of microbial flavonoid biosynthesis. Finally, the barriers to and strategies for industrial bio-production of flavonoids are highlighted. This review offers guidance on how to create robust microbial cell factories for producing flavonoids and other relevant chemicals.
Affiliation(s)
- Hongbiao Li
- Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
| | - Yunbin Lyv
- Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
| | - Shenghu Zhou
- School of Biotechnology and Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
| | - Shiqin Yu
- Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; School of Biotechnology and Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
| | - Jingwen Zhou
- Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; School of Biotechnology and Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; The Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; National Engineering Laboratory for Cereal Fermentation Technology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China.
| |
|
38
|
Liu Y, Zhang L, Wang W, Zhu M, Wang C, Li F, Zhang J, Li H, Chen Q, Liu H. Rotamer-free protein sequence design based on deep learning and self-consistency. NATURE COMPUTATIONAL SCIENCE 2022; 2:451-462. [PMID: 38177863 DOI: 10.1038/s43588-022-00273-6] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 12/27/2021] [Accepted: 06/07/2022] [Indexed: 01/06/2024]
Abstract
Several previously proposed deep learning methods to design amino acid sequences that autonomously fold into a given protein backbone yielded promising results in computational tests but did not outperform conventional energy function-based methods in wet experiments. Here we present the ABACUS-R method, which uses an encoder-decoder network trained using a multitask learning strategy to predict the sidechain type of a central residue from its three-dimensional local environment, which includes, besides other features, the types but not the conformations of the surrounding sidechains. This eliminates the need to reconstruct and optimize sidechain structures, and drastically simplifies the sequence design process. Thus, iteratively applying the encoder-decoder to different central residues produces self-consistent overall sequences for a target backbone. Results of wet experiments, including five structures solved by X-ray crystallography, show that ABACUS-R outperforms state-of-the-art energy function-based methods in success rate and design precision.
Affiliation(s)
- Yufeng Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
| | - Lu Zhang
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
| | - Weilun Wang
- CAS Key Laboratory of GIPAS, School of Information Science and Technology, Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei, Anhui, China
| | - Min Zhu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
| | - Chenchen Wang
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
| | - Fudong Li
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui, China
| | - Jiahai Zhang
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui, China
| | - Houqiang Li
- CAS Key Laboratory of GIPAS, School of Information Science and Technology, Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei, Anhui, China
| | - Quan Chen
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui, China
| | - Haiyan Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui, China
- School of Data Science, University of Science and Technology of China, Hefei, Anhui, China
| |
|
39
|
|
40
|
Zheng S, Zeng T, Li C, Chen B, Coley CW, Yang Y, Wu R. Deep learning driven biosynthetic pathways navigation for natural products with BioNavi-NP. Nat Commun 2022; 13:3342. [PMID: 35688826 PMCID: PMC9187661 DOI: 10.1038/s41467-022-30970-9] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 10/07/2021] [Accepted: 05/27/2022] [Indexed: 12/30/2022] Open
Abstract
The complete biosynthetic pathways are unknown for most natural products (NPs); it is thus valuable to make computer-aided bio-retrosynthesis predictions. Here, a navigable and user-friendly toolkit, BioNavi-NP, is developed to predict the biosynthetic pathways for both NPs and NP-like compounds. First, a single-step bio-retrosynthesis prediction model is trained using both general organic and biosynthetic reactions through end-to-end transformer neural networks. Based on this model, plausible biosynthetic pathways can be efficiently sampled through an AND-OR tree-based planning algorithm from iterative multi-step bio-retrosynthetic routes. Extensive evaluations reveal that BioNavi-NP can identify biosynthetic pathways for 90.2% of 368 test compounds and recover the reported building blocks as in the test set for 72.8%, 1.7 times more accurate than existing conventional rule-based approaches. The model is further shown to identify biologically plausible pathways for complex NPs collected from the recent literature. The toolkit as well as the curated datasets and learned models are freely available to facilitate the elucidation and reconstruction of the biosynthetic pathways for NPs. The complete biosynthetic pathways of most natural products (NPs) are unknown. Here, the authors report BioNavi-NP, a computational toolkit for bio-retrosynthetic pathway elucidation or reconstruction for both NPs and NP-like compounds.
Affiliation(s)
- Shuangjia Zheng
- School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, 510006, China
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, 510006, China
- Galixir, Beijing, China
| | - Tao Zeng
- School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, 510006, China
| | | | - Binghong Chen
- College of Computing, Georgia Institute of Technology, Atlanta, GA, USA
| | - Connor W Coley
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Yuedong Yang
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, 510006, China.
| | - Ruibo Wu
- School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, 510006, China.
| |
|
41
|
Precision materials: Computational design methods of accurate protein materials. Curr Opin Struct Biol 2022; 74:102367. [DOI: 10.1016/j.sbi.2022.102367] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2021] [Revised: 02/22/2022] [Accepted: 02/28/2022] [Indexed: 11/23/2022]
|
42
|
Sun J, Wu B. Protein design with a machine-learned potential about backbone designability. Trends Biochem Sci 2022; 47:638-640. [DOI: 10.1016/j.tibs.2022.04.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2022] [Revised: 04/07/2022] [Accepted: 04/07/2022] [Indexed: 10/18/2022]
|