1
Soleymani F, Paquet E, Viktor HL, Michalowski W. Structure-based protein and small molecule generation using EGNN and diffusion models: A comprehensive review. Comput Struct Biotechnol J 2024; 23:2779-2797. [PMID: 39050782 PMCID: PMC11268121 DOI: 10.1016/j.csbj.2024.06.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2024] [Revised: 06/13/2024] [Accepted: 06/18/2024] [Indexed: 07/27/2024] Open
Abstract
Recent breakthroughs in deep learning have revolutionized protein sequence and structure prediction. These advancements are built on decades of protein design efforts, and are overcoming traditional time and cost limitations. Diffusion models, at the forefront of these innovations, significantly enhance design efficiency by automating knowledge acquisition. In the field of de novo protein design, the goal is to create entirely novel proteins with predetermined structures. Given the arbitrary positions of proteins in 3-D space, graph representations and their properties are widely used in protein generation studies. A critical requirement in protein modelling is maintaining spatial relationships under transformations (rotations, translations, and reflections). This property, known as equivariance, ensures that predicted protein characteristics adapt seamlessly to changes in orientation or position. Equivariant graph neural networks offer a solution to this challenge. By incorporating equivariant graph neural networks to learn the score of the probability density function in diffusion models, one can generate proteins with robust 3-D structural representations. This review examines the latest deep learning advancements, specifically focusing on frameworks that combine diffusion models with equivariant graph neural networks for protein generation.
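The equivariance property described in this abstract can be illustrated with a small numerical check. The sketch below is not taken from the reviewed paper: the toy coordinates and the helper names (`transform`, `centroid`, `pairwise_distances`) are assumptions made for illustration only. It shows that a centroid moves along with the structure (equivariance), while pairwise distances do not change at all (invariance) under a reflection, rotation, and translation.

```python
import math

def transform(points, theta, shift, reflect=False):
    """Optionally mirror x, rotate about the z-axis by theta, then translate."""
    c, s = math.cos(theta), math.sin(theta)
    out = []
    for x, y, z in points:
        if reflect:
            x = -x  # reflection through the yz-plane
        out.append((c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2]))
    return out

def centroid(points):
    """Mean position of a set of 3-D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def pairwise_distances(points):
    """All inter-point distances, in a fixed order."""
    return [math.dist(a, b) for i, a in enumerate(points) for b in points[i + 1:]]

# Hypothetical toy coordinates standing in for atom positions.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.3), (0.2, 2.0, -0.7)]
theta, shift = 0.8, (3.0, -1.0, 2.5)
moved = transform(coords, theta, shift, reflect=True)

# Equivariance: centroid(T(x)) equals T(centroid(x)).
lhs = centroid(moved)
rhs = transform([centroid(coords)], theta, shift, reflect=True)[0]
equivariant = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(lhs, rhs))

# Invariance: pairwise distances are unchanged by T.
invariant = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(pairwise_distances(coords), pairwise_distances(moved))
)

print(equivariant, invariant)
```

An equivariant graph neural network is built so that its coordinate outputs behave like `centroid` here and its scalar features behave like `pairwise_distances`, whatever the pose of the input.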
Affiliation(s)
- Farzan Soleymani
  - Telfer School of Management, University of Ottawa, ON, K1N 6N5, Canada
- Eric Paquet
  - National Research Council, 1200 Montreal Road, Ottawa, ON, K1A 0R6, Canada
  - School of Electrical Engineering and Computer Science, University of Ottawa, ON, K1N 6N5, Canada
- Herna Lydia Viktor
  - School of Electrical Engineering and Computer Science, University of Ottawa, ON, K1N 6N5, Canada
2
Nguyen PT, Harris BJ, Mateos DL, González AH, Murray AM, Yarov-Yarovoy V. Structural modeling of ion channels using AlphaFold2, RoseTTAFold2, and ESMFold. Channels (Austin) 2024; 18:2325032. [PMID: 38445990 PMCID: PMC10936637 DOI: 10.1080/19336950.2024.2325032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Accepted: 01/14/2024] [Indexed: 03/07/2024] Open
Abstract
Ion channels play key roles in human physiology and are important targets in drug discovery. The atomic-scale structures of ion channels provide invaluable insights into a fundamental understanding of the molecular mechanisms of channel gating and modulation. Recent breakthroughs in deep learning-based computational methods, such as AlphaFold, RoseTTAFold, and ESMFold, have transformed research in protein structure prediction and design. We review the application of AlphaFold, RoseTTAFold, and ESMFold to structural modeling of ion channels using representative voltage-gated ion channels, including the human voltage-gated sodium (NaV) channel NaV1.8, the human voltage-gated calcium (CaV) channel CaV1.1, and the human voltage-gated potassium (KV) channel KV1.3. We compared AlphaFold, RoseTTAFold, and ESMFold structural models of NaV1.8, CaV1.1, and KV1.3 with corresponding cryo-EM structures to assess details of their similarities and differences. Our findings shed light on the strengths and limitations of the current state-of-the-art deep learning-based computational methods for modeling ion channel structures, offering valuable insights to guide their future applications for ion channel research.
Affiliation(s)
- Phuong Tran Nguyen
  - Department of Physiology and Membrane Biology, University of California School of Medicine, Davis, CA, USA
- Brandon John Harris
  - Department of Physiology and Membrane Biology, University of California School of Medicine, Davis, CA, USA
  - Biophysics Graduate Group, University of California School of Medicine, Davis, CA, USA
- Diego Lopez Mateos
  - Department of Physiology and Membrane Biology, University of California School of Medicine, Davis, CA, USA
  - Biophysics Graduate Group, University of California School of Medicine, Davis, CA, USA
- Adriana Hernández González
  - Department of Physiology and Membrane Biology, University of California School of Medicine, Davis, CA, USA
  - Biophysics Graduate Group, University of California School of Medicine, Davis, CA, USA
- Vladimir Yarov-Yarovoy
  - Department of Physiology and Membrane Biology, University of California School of Medicine, Davis, CA, USA
  - Department of Anesthesiology and Pain Medicine, University of California School of Medicine, Davis, CA, USA
3
Meng F, Zhou N, Hu G, Liu R, Zhang Y, Jing M, Hou Q. A comprehensive overview of recent advances in generative models for antibodies. Comput Struct Biotechnol J 2024; 23:2648-2660. [PMID: 39027650 PMCID: PMC11254834 DOI: 10.1016/j.csbj.2024.06.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2024] [Revised: 06/15/2024] [Accepted: 06/18/2024] [Indexed: 07/20/2024] Open
Abstract
Therapeutic antibodies are an important class of biopharmaceuticals. With the rapid development of deep learning methods and the increasing amount of antibody data, antibody generative models have made great progress recently. They aim to solve antibody space search problems and are widely incorporated into the antibody development process. Therefore, a comprehensive introduction to the development methods in this field is imperative. Here, we collected 34 representative, recently published antibody generative models, which can be divided into three categories based on their principles and algorithms: sequence-generating models, structure-generating models, and hybrid models. We further studied their performance and contributions to antibody sequence prediction, structure optimization, and affinity enhancement. Our manuscript provides a comprehensive overview of the status of antibody generative models and also offers guidance for selecting different approaches.
Affiliation(s)
- Fanxu Meng
  - College of Chemical Engineering, Qingdao University of Science and Technology, Qingdao 266042, China
- Na Zhou
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan 250100, China
  - National Institute of Health Data Science of China, Shandong University, Jinan 250100, China
- Guangchun Hu
  - School of Information Science and Engineering, University of Jinan, Jinan 250022, China
- Ruotong Liu
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan 250100, China
  - National Institute of Health Data Science of China, Shandong University, Jinan 250100, China
- Yuanyuan Zhang
  - College of Chemical Engineering, Qingdao University of Science and Technology, Qingdao 266042, China
- Ming Jing
  - Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
  - Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan 250000, China
- Qingzhen Hou
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan 250100, China
  - National Institute of Health Data Science of China, Shandong University, Jinan 250100, China
4
Lin Y, Dong Y, Li X, Cai J, Cai L, Zhang G. Enzymatic production of xylooligosaccharide from lignocellulosic and marine biomass: A review of current progress, challenges, and its applications in food sectors. Int J Biol Macromol 2024; 277:134014. [PMID: 39047995 DOI: 10.1016/j.ijbiomac.2024.134014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 04/03/2024] [Accepted: 07/17/2024] [Indexed: 07/27/2024]
Abstract
Over the last decade, xylooligosaccharides (XOS) have attracted great attention because of their unique chemical properties and excellent prebiotic effects. Among the current strategies for XOS production, enzymatic hydrolysis is preferred due to its green and safe process, simplicity in equipment, and high control of the degrees of polymerization. This paper comprehensively summarizes various lignocellulosic biomass and marine biomass employed in enzymatic production of XOS. The importance and advantages of enzyme immobilization in XOS production are also discussed. Many novel immobilization techniques for xylanase are presented. In addition, bioinformatics techniques for the mining and designing of new xylanases are also described. Moreover, XOS have exhibited great potential in the food industry in diverse roles, such as a sugar replacer, a fat replacer, and a cryoprotectant. This review systematically summarizes the current research progress on the applications of XOS in food sectors, including beverages, bakery products, dairy products, meat products, aquatic products, food packaging film, wall materials, and others. It is anticipated that this paper will act as a reference for the further development and application of XOS in food sectors and other fields.
Affiliation(s)
- Yuanqing Lin
  - College of Environment and Public Health, Xiamen Huaxia University, Xiamen 361024, Fujian, China
- Yuting Dong
  - College of Environment and Public Health, Xiamen Huaxia University, Xiamen 361024, Fujian, China; Department of Bioengineering and Biotechnology, Huaqiao University, Xiamen 361021, Fujian, China
- Xiangling Li
  - Thayer School of Engineering, Dartmouth College, Hanover, NH 03755, United States
- Jinzhong Cai
  - College of Environment and Public Health, Xiamen Huaxia University, Xiamen 361024, Fujian, China
- Lixi Cai
  - Department of Bioengineering and Biotechnology, Huaqiao University, Xiamen 361021, Fujian, China; College of Basic Medicine, Putian University, Putian 351100, Fujian, China
- Guangya Zhang
  - Department of Bioengineering and Biotechnology, Huaqiao University, Xiamen 361021, Fujian, China
5
Wu K, Jiang H, Hicks DR, Liu C, Muratspahic E, Ramelot TA, Liu Y, McNally K, Gaur A, Coventry B, Chen W, Bera AK, Kang A, Gerben S, Lamb MYL, Murray A, Li X, Kennedy MA, Yang W, Schober G, Brierley SM, Gelb MH, Montelione GT, Derivery E, Baker D. Sequence-specific targeting of intrinsically disordered protein regions. bioRxiv 2024:2024.07.15.603480. [PMID: 39071356 PMCID: PMC11275711 DOI: 10.1101/2024.07.15.603480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/30/2024]
Abstract
A general approach to design proteins that bind tightly and specifically to intrinsically disordered regions (IDRs) of proteins and flexible peptides would have wide application in biological research, therapeutics, and diagnosis. However, the lack of defined structures and the high variability in sequence and conformational preferences have complicated such efforts. We sought to develop a method combining biophysical principles with deep learning to readily generate binders for any disordered sequence. Instead of assuming a fixed regular structure for the target, general recognition is achieved by threading the query sequence through diverse extended binding modes in hundreds of templates with varying pocket depths and spacings, followed by RFdiffusion refinement to optimize the binder-target fit. We tested the method by designing binders to 39 highly diverse unstructured targets. Experimental testing of ~36 designs per target yielded binders with affinities better than 100 nM in 34 cases, and in the pM range in four cases. The co-crystal structure of a designed binder in complex with dynorphin A is closely consistent with the design model. All-by-all binding experiments for 20 designs binding diverse targets show they are highly specific for the intended targets, with no crosstalk even for the closely related dynorphin A and dynorphin B. Our approach thus could provide a general solution to the intrinsically disordered protein and peptide recognition problem.
6
Chen X, Wang X. Computational investigation in inhibitory effects of amantadine on classical swine fever virus p7 ion channel activity. Sci Rep 2024; 14:20387. [PMID: 39223222 PMCID: PMC11369150 DOI: 10.1038/s41598-024-71477-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Accepted: 08/28/2024] [Indexed: 09/04/2024] Open
Abstract
Classical swine fever virus (CSFV) p7 viroporin plays crucial roles in cellular ion balance and permeabilization. The antiviral drug amantadine effectively inhibits viral replication by blocking the activity of CSFV p7 viroporin. However, little information is available on the binding mode of amantadine with CSFV p7 viroporin, due to the lack of a known polymer structure for CSFV p7. In this study, we employed AlphaFold2 to predict CSFV p7 structures. Subsequently, we conducted a docking study to investigate the binding sites of amantadine to CSFV p7. Computational analysis showed that CSFV p7 forms a pore channel in a hexameric structure. Molecular dynamics (MD) simulations and mutant analyses further suggest that CSFV p7 likely exists as a hexamer. Docking studies and MD simulations showed that amantadine interacts with the hydrophobic regions of the tetramer and pentamer, as well as with the hydrophobic pore channel of the hexamer. Considering the potential hexameric assembly of CSFV p7, along with docking results, MD simulations, and the characteristics of the gated ion channels, we propose a model of the CSFV p7 ion channel based on its hexameric configuration. In this model, residues E21, Y25, and R34 are suggested to selectively recruit and dehydrate ions, while residues L28 and L31 likely act as hydrophobic constrictors, thereby restricting the free movement of water. The binding of amantadine to residues I20, E21, V24 and Y25 effectively blocks ion transport. However, this proposed molecular model requires experimental validation. Our findings give structural insight into the models of CSFV p7 as an ion channel and provide a molecular explanation for the inhibitory effects of amantadine on CSFV p7-mediated ion channel conductance.
Affiliation(s)
- Xiaowei Chen
  - School of Basic Medical Sciences, Binzhou Medical University, Yantai, 264003, China
  - Medicine and Pharmacy Research Center, Binzhou Medical University, Yantai, 264003, China
- Xiao Wang
  - School of Basic Medical Sciences, Binzhou Medical University, Yantai, 264003, China
7
Giraldo-Castaño MC, Littlejohn KA, Avecilla ARC, Barrera-Villamizar N, Quiroz FG. Programmability and biomedical utility of intrinsically-disordered protein polymers. Adv Drug Deliv Rev 2024; 212:115418. [PMID: 39094909 DOI: 10.1016/j.addr.2024.115418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2024] [Revised: 07/03/2024] [Accepted: 07/29/2024] [Indexed: 08/04/2024]
Abstract
Intrinsically disordered proteins (IDPs) exhibit molecular-level conformational dynamics that are functionally harnessed across a wide range of fascinating biological phenomena. The low sequence complexity of IDPs has led to the design and development of intrinsically-disordered protein polymers (IDPPs), a class of engineered repeat IDPs with stimuli-responsive properties. The perfect repetitive architecture of IDPPs allows for repeat-level encoding of tunable protein functionality. Designer IDPPs can be modeled on endogenous IDPs or engineered de novo as protein polymers with dual biophysical and biological functionality. Their properties can be rationally tailored to access enigmatic IDP biology and to create programmable smart biomaterials. With the goal of inspiring the bioengineering of multifunctional IDP-based materials, here we synthesize recent multidisciplinary progress in programming and exploiting the bio-functionality of IDPPs and IDPP-containing proteins. Collectively, expanding beyond the traditional sequence space of extracellular IDPs, emergent sequence-level control of IDPP functionality is fueling the bioengineering of self-assembling biomaterials, advanced drug delivery systems, tissue scaffolds, and biomolecular condensates (genetically encoded organelle-like structures). Looking forward, we emphasize open challenges and emerging opportunities, arguing that the intracellular behaviors of IDPPs represent a rich space for biomedical discovery and innovation. Combined with the intense focus on IDP biology, the growing landscape of IDPPs and their biomedical applications set the stage for the accelerated engineering of high-value biotechnologies and biomaterials.
Affiliation(s)
- Maria Camila Giraldo-Castaño
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
- Kai A Littlejohn
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
- Alexa Regina Chua Avecilla
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
- Natalia Barrera-Villamizar
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
- Felipe Garcia Quiroz
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
8
Thornton EL, Paterson SM, Stam MJ, Wood CW, Laohakunakorn N, Regan L. Applications of cell free protein synthesis in protein design. Protein Sci 2024; 33:e5148. [PMID: 39180484 PMCID: PMC11344276 DOI: 10.1002/pro.5148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2024] [Revised: 07/26/2024] [Accepted: 08/02/2024] [Indexed: 08/26/2024]
Abstract
In protein design, the ultimate test of success is that the designs function as desired. Here, we discuss the utility of cell free protein synthesis (CFPS) as a rapid, convenient and versatile method to screen for activity. We champion the use of CFPS in screening potential designs. Compared to in vivo protein screening, a wider range of different activities can be evaluated using CFPS, and the scale on which it can easily be used (screening tens to hundreds of designed proteins) is ideally suited to current needs. Protein design using physics-based strategies tended to have a relatively low success rate compared with current machine-learning-based methods. Screening steps (such as yeast display) were often used to identify proteins that displayed the desired activity from many designs that were highly ranked computationally. We also describe how CFPS is well-suited to identify the reasons designs fail, which may include problems with transcription, translation, and solubility, in addition to not achieving the desired structure and function.
Affiliation(s)
- Ella Lucille Thornton
  - Centre for Engineering Biology, Institute of Quantitative Biology, Biochemistry and Biotechnology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Sarah Maria Paterson
  - Centre for Engineering Biology, Institute of Quantitative Biology, Biochemistry and Biotechnology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Michael J. Stam
  - Centre for Engineering Biology, Institute of Quantitative Biology, Biochemistry and Biotechnology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Christopher W. Wood
  - Centre for Engineering Biology, Institute of Quantitative Biology, Biochemistry and Biotechnology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Nadanai Laohakunakorn
  - Centre for Engineering Biology, Institute of Quantitative Biology, Biochemistry and Biotechnology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Lynne Regan
  - Centre for Engineering Biology, Institute of Quantitative Biology, Biochemistry and Biotechnology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
9
Huang H, Sun L, Du B, Lv W. Learning Joint 2-D and 3-D Graph Diffusion Models for Complete Molecule Generation. IEEE Trans Neural Netw Learn Syst 2024; 35:11857-11871. [PMID: 38976472 DOI: 10.1109/tnnls.2024.3416328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Designing new molecules is essential for drug discovery and material science. Recently, deep generative models that aim to model molecule distribution have made promising progress in narrowing down the chemical research space and generating high-fidelity molecules. However, current generative models only focus on modeling 2-D bonding graphs or 3-D geometries, which are two complementary descriptors for molecules. The lack of ability to jointly model them limits the improvement of generation quality and further downstream applications. In this article, we propose a joint 2-D and 3-D graph diffusion model (JODO) that generates geometric graphs representing complete molecules with atom types, formal charges, bond information, and 3-D coordinates. To capture the correlation between 2-D molecular graphs and 3-D geometries in the diffusion process, we develop a diffusion graph transformer (DGT) to parameterize the data prediction model that recovers the original data from noisy data. The DGT uses a relational attention mechanism that enhances the interaction between node and edge representations. This mechanism operates concurrently with the propagation and update of scalar attributes and geometric vectors. Our model can also be extended for inverse molecular design targeting single or multiple quantum properties. In our comprehensive evaluation pipeline for unconditional joint generation, the experimental results show that JODO remarkably outperforms the baselines on the QM9 and GEOM-Drugs datasets. Furthermore, our model excels in few-step fast sampling, as well as in inverse molecule design and molecular graph generation. Our code is provided in https://github.com/GRAPH-0/JODO.
10
Peter AS, Hoffmann DS, Klier J, Lange CM, Moeller J, Most V, Wüst CK, Beining M, Gülesen S, Junker H, Brumme B, Schiffner T, Meiler J, Schoeder CT. Strategies of rational and structure-driven vaccine design for Arenaviruses. Infect Genet Evol 2024; 123:105626. [PMID: 38908736 DOI: 10.1016/j.meegid.2024.105626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Revised: 04/16/2024] [Accepted: 06/18/2024] [Indexed: 06/24/2024]
Abstract
The COVID-19 outbreak has highlighted the importance of pandemic preparedness for the prevention of future health crises. One virus family with high pandemic potential is the Arenaviruses, which have been detected almost worldwide, particularly in Africa and the Americas. These viruses are highly understudied and many questions regarding their structure, replication and tropism remain unanswered, making the design of an efficacious and molecularly-defined vaccine challenging. We propose that structure-driven computational vaccine design will contribute to overcoming these challenges. Computational methods for stabilization of viral glycoproteins or epitope focusing have made progress during the last decades and particularly during the COVID-19 pandemic, and have proven useful for rational vaccine design and the establishment of novel diagnostic tools. In this review, we summarize gaps in our understanding of Arenavirus molecular biology, highlight challenges in vaccine design and discuss how structure-driven and computationally informed strategies will aid in overcoming these obstacles.
Affiliation(s)
- Antonia Sophia Peter
  - Institute for Drug Discovery, Leipzig University, Faculty of Medicine, Leipzig, Germany
- Dieter S Hoffmann
  - Institute for Drug Discovery, Leipzig University, Faculty of Medicine, Leipzig, Germany
- Johannes Klier
  - Institute for Drug Discovery, Leipzig University, Faculty of Medicine, Leipzig, Germany
- Christina M Lange
  - Institute for Drug Discovery, Leipzig University, Faculty of Medicine, Leipzig, Germany
- Johanna Moeller
  - Institute for Drug Discovery, Leipzig University, Faculty of Medicine, Leipzig, Germany; Center for Scalable Data Analytics and Artificial Intelligence ScaDS.AI, Dresden/Leipzig, Germany
- Victoria Most
  - Institute for Drug Discovery, Leipzig University, Faculty of Medicine, Leipzig, Germany
- Christina K Wüst
  - Institute for Drug Discovery, Leipzig University, Faculty of Medicine, Leipzig, Germany; Molecular Medicine Studies, Faculty for Biology and Preclinical Medicine, University of Regensburg, Regensburg, Germany
- Max Beining
  - Institute for Drug Discovery, Leipzig University, Faculty of Medicine, Leipzig, Germany; SECAI, School of Embedded Composite Artificial Intelligence, Dresden/Leipzig, Germany
- Sevilay Gülesen
  - Institute for Drug Discovery, Leipzig University, Faculty of Medicine, Leipzig, Germany
- Hannes Junker
  - Institute for Drug Discovery, Leipzig University, Faculty of Medicine, Leipzig, Germany
- Birke Brumme
  - Institute for Drug Discovery, Leipzig University, Faculty of Medicine, Leipzig, Germany
- Torben Schiffner
  - Institute for Drug Discovery, Leipzig University, Faculty of Medicine, Leipzig, Germany; The Scripps Research Institute, Department for Immunology and Microbiology, La Jolla, CA, United States
- Jens Meiler
  - Institute for Drug Discovery, Leipzig University, Faculty of Medicine, Leipzig, Germany; Center for Scalable Data Analytics and Artificial Intelligence ScaDS.AI, Dresden/Leipzig, Germany; Department of Chemistry, Vanderbilt University, Nashville, TN, United States; Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
- Clara T Schoeder
  - Institute for Drug Discovery, Leipzig University, Faculty of Medicine, Leipzig, Germany; Center for Scalable Data Analytics and Artificial Intelligence ScaDS.AI, Dresden/Leipzig, Germany
11
Pala D, Clark DE. Caught between a ROCK and a hard place: current challenges in structure-based drug design. Drug Discov Today 2024; 29:104106. [PMID: 39029868 DOI: 10.1016/j.drudis.2024.104106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2024] [Revised: 06/27/2024] [Accepted: 07/13/2024] [Indexed: 07/21/2024]
Abstract
The discipline of structure-based drug design (SBDD) is several decades old and it is tempting to think that the proliferation of experimental structures for many drug targets might make computer-aided drug design (CADD) straightforward. However, this is far from true. In this review, we illustrate some of the challenges that CADD scientists face every day in their work, even now. We use Rho-associated protein kinase (ROCK), and public domain structures and data, as an example to illustrate some of the challenges we have experienced during our project targeting this protein. We hope that this will help to prevent unrealistic expectations of what CADD can accomplish and to educate non-CADD scientists regarding the challenges still facing their CADD colleagues.
Affiliation(s)
- Daniele Pala
  - Medicinal Chemistry and Drug Design Technologies Department, Chiesi Farmaceutici S.p.A, Research Center, Largo Belloli 11/a, 43122 Parma, Italy
- David E Clark
  - Charles River, 6-9 Spire Green Centre, Flex Meadow, Harlow CM19 5TR, UK
12
Li B, Sun C, Li J, Gao C. Targeted genome-modification tools and their advanced applications in crop breeding. Nat Rev Genet 2024; 25:603-622. [PMID: 38658741 DOI: 10.1038/s41576-024-00720-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/01/2024] [Indexed: 04/26/2024]
Abstract
Crop improvement by genome editing involves the targeted alteration of genes to improve plant traits, such as stress tolerance, disease resistance or nutritional content. Techniques for the targeted modification of genomes have evolved from generating random mutations to precise base substitutions, followed by insertions, substitutions and deletions of small DNA fragments, and are finally starting to achieve precision manipulation of large DNA segments. Recent developments in base editing, prime editing and other CRISPR-associated systems have laid a solid technological foundation to enable plant basic research and precise molecular breeding. In this Review, we systematically outline the technological principles underlying precise and targeted genome-modification methods. We also review methods for the delivery of genome-editing reagents in plants and outline emerging crop-breeding strategies based on targeted genome modification. Finally, we consider potential future developments in precise genome-editing technologies, delivery methods and crop-breeding approaches, as well as regulatory policies for genome-editing products.
Affiliation(s)
- Boshu Li
  - New Cornerstone Science Laboratory, Center for Genome Editing, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China
  - College of Advanced Agricultural Sciences, University of Chinese Academy of Sciences, Beijing, China
- Chao Sun
  - New Cornerstone Science Laboratory, Center for Genome Editing, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China
  - College of Advanced Agricultural Sciences, University of Chinese Academy of Sciences, Beijing, China
- Jiayang Li
  - Hainan Yazhou Bay Seed Laboratory, Sanya, China
  - State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China
- Caixia Gao
  - New Cornerstone Science Laboratory, Center for Genome Editing, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China
  - College of Advanced Agricultural Sciences, University of Chinese Academy of Sciences, Beijing, China
13
|
Robustelli P. Extending computational protein design to intrinsically disordered proteins. Sci Adv 2024; 10:eadr3239. [PMID: 39196938 PMCID: PMC11352910 DOI: 10.1126/sciadv.adr3239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2024] [Accepted: 08/13/2024] [Indexed: 08/30/2024]
Abstract
Advances in the accuracy and throughput of molecular simulations usher in a new era in the structural biology of disordered proteins.
Affiliation(s)
- Paul Robustelli
  - Department of Chemistry, Dartmouth College, Hanover, NH 03755, USA
|
14
|
R VS, Choudhuri S, Ghosh B. Hybrid Diffusion Model for Stable, Affinity-Driven, Receptor-Aware Peptide Generation. J Chem Inf Model 2024. [PMID: 39193724 DOI: 10.1021/acs.jcim.4c01020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/29/2024]
Abstract
The convergence of biotechnology and artificial intelligence has the potential to transform drug development, especially in the field of therapeutic peptide design. Peptides are short chains of amino acids with diverse therapeutic applications that offer several advantages over small-molecule drugs, such as targeted therapy and minimal side effects. However, poor oral bioavailability and enzymatic degradation have limited their effectiveness. With advances in deep learning techniques, innovative approaches to peptide design have become possible. In this work, we demonstrate HYDRA, a hybrid deep learning approach that leverages the distribution modeling capabilities of a diffusion model and combines it with a binding affinity maximization algorithm that can be used for de novo design of peptide binders for various target receptors. As an application, we have used our approach to design therapeutic peptides targeting proteins expressed by Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1) genes. The ability of HYDRA to generate peptides conditioned on the target receptor's binding sites makes it a promising approach for developing effective therapies for malaria and other diseases.
Affiliation(s)
- Vishva Saravanan R
  - Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500032, India
- Soham Choudhuri
  - Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500032, India
- Bhaswar Ghosh
  - Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500032, India
|
15
|
Hardy BJ, Curnow P. Computational design of de novo bioenergetic membrane proteins. Biochem Soc Trans 2024; 52:1737-1745. [PMID: 38958574 DOI: 10.1042/bst20231347] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2024] [Revised: 06/11/2024] [Accepted: 06/17/2024] [Indexed: 07/04/2024]
Abstract
The major energy-producing reactions of biochemistry occur at biological membranes. Computational protein design now provides the opportunity to elucidate the underlying principles of these processes and to construct bioenergetic pathways on our own terms. Here, we review recent achievements in this endeavour of 'synthetic bioenergetics', with a particular focus on new enabling tools that facilitate the computational design of biocompatible de novo integral membrane proteins. We use recent examples to showcase some of the key computational approaches in current use and highlight that the overall philosophy of 'surface-swapping' - the replacement of solvent-facing residues with amino acids bearing lipid-soluble hydrophobic sidechains - is a promising avenue in membrane protein design. We conclude by highlighting outstanding design challenges and the emerging role of AI in sequence design and structure ideation.
Affiliation(s)
- Paul Curnow
  - School of Biochemistry, University of Bristol, Bristol, U.K.
|
16
|
Tom G, Schmid SP, Baird SG, Cao Y, Darvish K, Hao H, Lo S, Pablo-García S, Rajaonson EM, Skreta M, Yoshikawa N, Corapi S, Akkoc GD, Strieth-Kalthoff F, Seifrid M, Aspuru-Guzik A. Self-Driving Laboratories for Chemistry and Materials Science. Chem Rev 2024; 124:9633-9732. [PMID: 39137296 PMCID: PMC11363023 DOI: 10.1021/acs.chemrev.4c00055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2024]
Abstract
Self-driving laboratories (SDLs) promise an accelerated application of the scientific method. Through the automation of experimental workflows, along with autonomous experimental planning, SDLs hold the potential to greatly accelerate research in chemistry and materials discovery. This review provides an in-depth analysis of the state-of-the-art in SDL technology, its applications across various scientific disciplines, and the potential implications for research and industry. This review additionally provides an overview of the enabling technologies for SDLs, including their hardware, software, and integration with laboratory infrastructure. Most importantly, this review explores the diverse range of scientific domains where SDLs have made significant contributions, from drug discovery and materials science to genomics and chemistry. We provide a comprehensive review of existing real-world examples of SDLs, their different levels of automation, and the challenges and limitations associated with each domain.
Affiliation(s)
- Gary Tom
  - Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, 661 University Ave Suite 710, Toronto, Ontario M5G 1M1, Canada
- Stefan P. Schmid
  - Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 1, CH-8093 Zurich, Switzerland
- Sterling G. Baird
  - Acceleration Consortium, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
- Yang Cao
  - Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada
  - Acceleration Consortium, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
- Kourosh Darvish
  - Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, 661 University Ave Suite 710, Toronto, Ontario M5G 1M1, Canada
  - Acceleration Consortium, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
- Han Hao
  - Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada
  - Acceleration Consortium, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
- Stanley Lo
  - Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
- Sergio Pablo-García
  - Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada
- Ella M. Rajaonson
  - Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
  - Vector Institute for Artificial Intelligence, 661 University Ave Suite 710, Toronto, Ontario M5G 1M1, Canada
- Marta Skreta
  - Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, 661 University Ave Suite 710, Toronto, Ontario M5G 1M1, Canada
- Naruki Yoshikawa
  - Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, 661 University Ave Suite 710, Toronto, Ontario M5G 1M1, Canada
- Samantha Corapi
  - Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
- Gun Deniz Akkoc
  - Forschungszentrum Jülich GmbH, Helmholtz Institute for Renewable Energy Erlangen-Nürnberg, Cauerstr. 1, 91058 Erlangen, Germany
  - Department of Chemical and Biological Engineering, Friedrich-Alexander Universität Erlangen-Nürnberg, Egerlandstr. 3, 91058 Erlangen, Germany
- Felix Strieth-Kalthoff
  - Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada
  - School of Mathematics and Natural Sciences, University of Wuppertal, Gaußstraße 20, 42119 Wuppertal, Germany
- Martin Seifrid
  - Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada
  - Department of Materials Science and Engineering, North Carolina State University, Raleigh, North Carolina 27695, United States of America
- Alán Aspuru-Guzik
  - Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, 661 University Ave Suite 710, Toronto, Ontario M5G 1M1, Canada
  - Acceleration Consortium, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
  - Department of Chemical Engineering & Applied Chemistry, University of Toronto, Toronto, Ontario M5S 3E5, Canada
  - Department of Materials Science & Engineering, University of Toronto, Toronto, Ontario M5S 3E4, Canada
  - Lebovic Fellow, Canadian Institute for Advanced Research (CIFAR), 661 University Ave, Toronto, Ontario M5G 1M1, Canada
|
17
|
Wang X, Yin X, Jiang D, Zhao H, Wu Z, Zhang O, Wang J, Li Y, Deng Y, Liu H, Luo P, Han Y, Hou T, Yao X, Hsieh CY. Multi-modal deep learning enables efficient and accurate annotation of enzymatic active sites. Nat Commun 2024; 15:7348. [PMID: 39187482 PMCID: PMC11347633 DOI: 10.1038/s41467-024-51511-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2024] [Accepted: 08/09/2024] [Indexed: 08/28/2024] Open
Abstract
Annotating active sites in enzymes is crucial for advancing multiple fields including drug discovery, disease research, enzyme engineering, and synthetic biology. Despite the development of numerous automated annotation algorithms, a significant trade-off between speed and accuracy limits their large-scale practical applications. We introduce EasIFA, an enzyme active site annotation algorithm that fuses latent enzyme representations from a protein language model and a 3D structural encoder, and then aligns protein-level information with the knowledge of enzymatic reactions using a multi-modal cross-attention framework. EasIFA outperforms BLASTp with a 10-fold speed increase and improves recall, precision, F1 score, and MCC by 7.57%, 13.08%, 9.68%, and 0.1012, respectively. It also surpasses an empirical-rule-based algorithm and another state-of-the-art deep learning annotation method based on PSSM features, achieving a speed increase ranging from 650 to 1400 times while enhancing annotation quality. This makes EasIFA a suitable replacement for conventional tools in both industrial and academic settings. EasIFA can also effectively transfer knowledge gained from coarsely annotated enzyme databases to smaller, high-precision datasets, highlighting its ability to model sparse and high-quality databases. Additionally, EasIFA shows potential as a catalytic site monitoring tool for designing enzymes with desired functions beyond their natural distribution.
Affiliation(s)
- Xiaorui Wang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
  - Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao, 999078, China
- Xiaodan Yin
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
  - Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao, 999078, China
- Dejun Jiang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Huifeng Zhao
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Zhenxing Wu
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Odin Zhang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Jike Wang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Yuquan Li
  - College of Chemistry and Chemical Engineering, Lanzhou University, Lanzhou, 730000, Gansu, China
- Yafeng Deng
  - CarbonSilicon AI Technology Co., Ltd, Hangzhou, 310018, Zhejiang, China
- Huanxiang Liu
  - Faculty of Applied Sciences, Macao Polytechnic University, Macao, 999078, China
- Pei Luo
  - Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao, 999078, China
- Yuqiang Han
  - Department of Computer Science and Engineering, Chinese University of Hong Kong, Hong Kong, 999077, China
- Tingjun Hou
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Xiaojun Yao
  - Faculty of Applied Sciences, Macao Polytechnic University, Macao, 999078, China
- Chang-Yu Hsieh
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
|
18
|
Xie X, Valiente PA, Lee JS, Kim J, Kim PM. Antibody-SGM, a Score-Based Generative Model for Antibody Heavy-Chain Design. J Chem Inf Model 2024. [PMID: 39189360 DOI: 10.1021/acs.jcim.4c00711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/28/2024]
Abstract
Traditional computational methods for antibody design involved random mutagenesis followed by energy function assessment for candidate selection. Recently, diffusion models have garnered considerable attention as cutting-edge generative models, lauded for their remarkable performance. However, these methods often focus solely on the backbone or sequence, resulting in the incomplete depiction of the overall structure and necessitating additional techniques to predict the missing component. This study presents Antibody-SGM, an innovative joint structure-sequence diffusion model that addresses the limitations of existing protein backbone generation models. Unlike previous models, Antibody-SGM successfully integrates sequence-specific attributes and functional properties into the generation process. Our methodology generates full-atom native-like antibody heavy chains by refining the generation to create valid pairs of sequences and structures, starting with random sequences and structural properties. The versatility of our method is demonstrated through various applications, including the design of full-atom antibodies, antigen-specific CDR design, antibody heavy-chain optimization, validation with AlphaFold3, and the identification of crucial antibody sequences and structural features. Antibody-SGM also optimizes protein function through active inpainting learning, allowing simultaneous sequence and structure optimization. These improvements demonstrate the promise of our strategy for protein engineering and significantly increase the power of protein design models.
Affiliation(s)
- Xuezhi Xie
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
  - Department of Computer Science, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Pedro A Valiente
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Jin Sub Lee
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Jisun Kim
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Philip M Kim
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
  - Department of Computer Science, University of Toronto, Toronto, Ontario M5S 3E1, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 3E1, Canada
|
19
|
Weller J, Rohs R. Structure-Based Drug Design with a Deep Hierarchical Generative Model. J Chem Inf Model 2024; 64:6450-6463. [PMID: 39058534 PMCID: PMC11350878 DOI: 10.1021/acs.jcim.4c01193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2024] [Revised: 07/16/2024] [Accepted: 07/17/2024] [Indexed: 07/28/2024]
Abstract
Recently, the remarkable growth of available crystal structure data and libraries of commercially available or readily synthesizable molecules have unlocked previously inaccessible regions of chemical space for drug development. Paired with improvements in virtual ligand screening methods, these expanded libraries are having a notable impact on early drug design efforts. Yet screening-based methods still face scalability limits, due to computational constraints and the sheer scale of drug-like space. Machine learning approaches are overcoming these limitations by learning the fundamental intra- and intermolecular relationships in drug-target systems from existing data. Here, we introduce DrugHIVE, a deep hierarchical variational autoencoder that outperforms state-of-the-art autoregressive and diffusion-based methods in both speed and performance on common generative benchmarks. DrugHIVE's hierarchical design enables improved control over molecular generation. Its capabilities include dramatically increasing virtual screening efficiency and accelerating a wide range of common drug design tasks, including de novo generation, molecular optimization, scaffold hopping, linker design, and high-throughput pattern replacement. Our highly scalable method can even be applied to receptors with high-confidence AlphaFold-predicted structures, extending the ability to generate high-quality drug-like molecules to a majority of the unsolved human proteome.
Affiliation(s)
- Jesse A. Weller
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, California 90089, United States
  - Department of Physics and Astronomy, University of Southern California, Los Angeles, California 90089, United States
- Remo Rohs
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, California 90089, United States
  - Department of Physics and Astronomy, University of Southern California, Los Angeles, California 90089, United States
  - Department of Chemistry, University of Southern California, Los Angeles, California 90089, United States
  - Thomas Lord Department of Computer Science, University of Southern California, Los Angeles, California 90089, United States
|
20
|
Talucci I, Maric HM. Epitope landscape in autoimmune neurological disease and beyond. Trends Pharmacol Sci 2024:S0165-6147(24)00149-4. [PMID: 39181736 DOI: 10.1016/j.tips.2024.07.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2024] [Revised: 07/08/2024] [Accepted: 07/19/2024] [Indexed: 08/27/2024]
Abstract
Autoantibody binding has a central role in autoimmune diseases and has also been linked to cancer, infections, and behavioral disorders. Autoimmune neurological diseases remain misclassified, in part because of an incomplete understanding of the underlying disease-specific epitopes. Such epitopes are crucial for both pathology and diagnosis, but have historically been overlooked. Recent technological advancements have enabled the exploration of these epitopes, potentially opening novel clinical avenues. The precise identification of novel B and T cell epitopes and their autoreactivity has led to the discovery of autoantigen-specific biomarkers for patients at high risk of autoimmune neurological diseases. In this review, we propose using newly available synthetic and cellular-surface display technologies to guide epitope-focused studies and unlock the potential of disease-specific epitopes for improving diagnosis and treatments. Additionally, we offer recommendations to guide emerging epitope-focused studies to broaden the current landscape.
Affiliation(s)
- Ivan Talucci
  - Rudolf Virchow Center for Integrative and Translational Bioimaging, University of Würzburg, Germany
  - Department of Neurology, University Hospital Würzburg, Germany
- Hans M Maric
  - Rudolf Virchow Center for Integrative and Translational Bioimaging, University of Würzburg, Germany
|
21
|
Zhang Y, Mastouri M, Zhang Y. Accelerating drug discovery, development, and clinical trials by artificial intelligence. Med 2024:S2666-6340(24)00308-8. [PMID: 39173629 DOI: 10.1016/j.medj.2024.07.026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2024] [Revised: 05/21/2024] [Accepted: 07/25/2024] [Indexed: 08/24/2024]
Abstract
Artificial intelligence (AI) has profoundly advanced the field of biomedical research and has also demonstrated transformative capacity for innovation in drug development. This paper aims to deliver a comprehensive analysis of the progress in AI-assisted drug development, particularly focusing on small molecules, RNA, and antibodies. Moreover, this paper elucidates the current integration of AI methodologies within the industrial drug development framework. This encompasses a detailed examination of the industry-standard drug development process, supplemented by a review of medications presently undergoing clinical trials. Finally, the paper addresses a predominant obstacle within the AI pharmaceutical sector: the absence of AI-conceived drugs receiving approval. This paper also advocates for the adoption of large language models and diffusion models as a viable strategy to surmount this challenge. This review not only underscores the significant potential of AI in drug discovery but also deliberates on the challenges and prospects within this dynamically progressing field.
Affiliation(s)
- Yilun Zhang
  - College of Science, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong, China
  - School of Medicine, The Chinese University of Hong Kong (Shenzhen), Shenzhen, Guangdong, China
- Mohamed Mastouri
  - College of Science, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong, China
- Yang Zhang
  - College of Science, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong, China
|
22
|
Lv Y, Qi J, Babon JJ, Cao L, Fan G, Lang J, Zhang J, Mi P, Kobe B, Wang F. The JAK-STAT pathway: from structural biology to cytokine engineering. Signal Transduct Target Ther 2024; 9:221. [PMID: 39169031 PMCID: PMC11339341 DOI: 10.1038/s41392-024-01934-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2024] [Revised: 06/12/2024] [Accepted: 07/16/2024] [Indexed: 08/23/2024] Open
Abstract
The Janus kinase-signal transducer and activator of transcription (JAK-STAT) pathway serves as a paradigm for signal transduction from the extracellular environment to the nucleus. It plays a pivotal role in physiological functions, such as hematopoiesis, immune balance, tissue homeostasis, and surveillance against tumors. Dysregulation of this pathway may lead to various disease conditions such as immune deficiencies, autoimmune diseases, hematologic disorders, and cancer. Due to its critical role in maintaining human health and involvement in disease, extensive studies have been conducted on this pathway, ranging from basic research to medical applications. Advances in the structural biology of this pathway have enabled us to gain insights into how the signaling cascade operates at the molecular level, laying the groundwork for therapeutic development targeting this pathway. Various strategies have been developed to restore its normal function, with promising therapeutic potential. Enhanced comprehension of these molecular mechanisms, combined with advances in protein engineering methodologies, has allowed us to engineer cytokines with tailored properties for targeted therapeutic applications, thereby enhancing their efficiency and safety. In this review, we outline the structural basis that governs key nodes in this pathway, offering a comprehensive overview of the signal transduction process. Furthermore, we explore recent advances in cytokine engineering for therapeutic development in this pathway.
Affiliation(s)
- You Lv
  - Center for Molecular Biosciences and Non-communicable Diseases Research, Xi'an University of Science and Technology, Xi'an, Shaanxi, 710054, China
  - Xi'an Amazinggene Co., Ltd, Xi'an, Shaanxi, 710026, China
- Jianxun Qi
  - CAS Key Laboratory of Pathogen Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, 100080, China
- Jeffrey J Babon
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia
- Longxing Cao
  - School of Life Sciences, Westlake University, Hangzhou, Zhejiang, 310024, China
- Guohuang Fan
  - Immunophage Biotech Co., Ltd, No. 10 Lv Zhou Huan Road, Shanghai, 201112, China
- Jiajia Lang
  - School of Pharmaceutical Science, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- Jin Zhang
  - Xi'an Amazinggene Co., Ltd, Xi'an, Shaanxi, 710026, China
- Pengbing Mi
  - School of Pharmaceutical Science, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- Bostjan Kobe
  - School of Chemistry and Molecular Biosciences, Institute for Molecular Bioscience and Australian Infectious Diseases Research Centre, University of Queensland, Brisbane, Queensland, 4072, Australia
- Faming Wang
  - Center for Molecular Biosciences and Non-communicable Diseases Research, Xi'an University of Science and Technology, Xi'an, Shaanxi, 710054, China
|
23
|
Lipsh-Sokolik R, Fleishman SJ. Addressing epistasis in the design of protein function. Proc Natl Acad Sci U S A 2024; 121:e2314999121. [PMID: 39133844 PMCID: PMC11348311 DOI: 10.1073/pnas.2314999121] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/29/2024] Open
Abstract
Mutations in protein active sites can dramatically improve function. The active site, however, is densely packed and extremely sensitive to mutations. Therefore, some mutations may only be tolerated in combination with others in a phenomenon known as epistasis. Epistasis reduces the likelihood of obtaining improved functional variants and dramatically slows natural and lab evolutionary processes. Research has shed light on the molecular origins of epistasis and its role in shaping evolutionary trajectories and outcomes. In addition, sequence- and AI-based strategies that infer epistatic relationships from mutational patterns in natural or experimental evolution data have been used to design functional protein variants. In recent years, combinations of such approaches and atomistic design calculations have successfully predicted highly functional combinatorial mutations in active sites. These were used to design thousands of functional active-site variants, demonstrating that, while our understanding of epistasis remains incomplete, some of the determinants that are critical for accurate design are now sufficiently understood. We conclude that the space of active-site variants that has been explored by evolution may be expanded dramatically to enhance natural activities or discover new ones. Furthermore, design opens the way to systematically exploring sequence and structure space and mutational impacts on function, deepening our understanding and control over protein activity.
Affiliation(s)
- Rosalie Lipsh-Sokolik
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 7610001, Israel
- Sarel J Fleishman
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 7610001, Israel
|
24
|
Bienau A, Jäkel AC, Simmel FC. Cell-Free Gene Expression in Bioprinted Fluidic Networks. ACS Synth Biol 2024; 13:2447-2456. [PMID: 39042670 PMCID: PMC11334185 DOI: 10.1021/acssynbio.4c00187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Revised: 07/01/2024] [Accepted: 07/09/2024] [Indexed: 07/25/2024]
Abstract
The realization of soft robotic devices with life-like properties requires the engineering of smart, active materials that can respond to environmental cues in similar ways as living cells or organisms. Cell-free expression systems provide an approach for embedding dynamic molecular control into such materials that avoids many of the complexities associated with genuinely living systems. Here, we present a strategy to integrate cell-free protein synthesis within agarose-based hydrogels that can be spatially organized and supplied by a synthetic vasculature. We first utilize an indirect printing approach with a commercial bioprinter and Pluronic F-127 as a fugitive ink to define fluidic channel structures within the hydrogels. We then investigate the impact of the gel matrix on the expression of proteins in E. coli cell-extract, which is found to depend on the gel density and the dilution of the expression system. When supplying the vascularized hydrogels with reactants, larger components such as DNA plasmids are confined to the channels or immobilized in the gels while nanoscale reaction components can diffusively spread within the gel. Using a single supply channel, we demonstrate different spatial protein concentration profiles emerging from different cell-free gene circuits comprising production, gene activation, and negative feedback. Variation of the channel design allows the creation of specific concentration profiles such as a long-term stable gradient or the homogeneous supply of a hydrogel with proteins.
Affiliation(s)
- Alexandra Bienau
- TU Munich, School of Natural Sciences, Department of Bioscience, 85748 Garching b. München, Germany
| | - Anna C. Jäkel
- TU Munich, School of Natural Sciences, Department of Bioscience, 85748 Garching b. München, Germany
| | - Friedrich C. Simmel
- TU Munich, School of Natural Sciences, Department of Bioscience, 85748 Garching b. München, Germany
| |
|
25
|
Kim M, Bhargava HK, Shavey GE, Lim WA, El-Samad H, Ng AH. Degron-Based bioPROTACs for Controlling Signaling in CAR T Cells. ACS Synth Biol 2024; 13:2313-2327. [PMID: 38991546 PMCID: PMC11334183 DOI: 10.1021/acssynbio.4c00109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Revised: 06/02/2024] [Accepted: 06/03/2024] [Indexed: 07/13/2024]
Abstract
Chimeric antigen receptor (CAR) T cells have made a tremendous impact in the clinic, but potent signaling through the CAR can be detrimental to treatment safety and efficacy. The use of protein degradation to control CAR signaling can address these issues in preclinical models. Existing strategies for regulating CAR stability rely on small molecules to induce systemic degradation. In contrast to small molecule regulation, genetic circuits offer a more precise method to control CAR signaling in an autonomous cell-by-cell fashion. Here, we describe a programmable protein degradation tool that adopts the framework of bioPROTACs, heterobifunctional proteins that are composed of a target recognition domain fused to a domain that recruits the endogenous ubiquitin proteasome system. We develop novel bioPROTACs that utilize a compact four-residue degron and demonstrate degradation of cytosolic and membrane protein targets using either a nanobody or synthetic leucine zipper as a protein binder. Our bioPROTACs exhibit potent degradation of CARs and can inhibit CAR signaling in primary human T cells. We demonstrate the utility of our bioPROTACs by constructing a genetic circuit to degrade the tyrosine kinase ZAP70 in response to recognition of a specific membrane-bound antigen. This circuit can disrupt CAR T cell signaling only in the presence of a specific cell population. These results suggest that bioPROTACs are powerful tools for expanding the CAR T cell engineering toolbox.
Affiliation(s)
- Matthew S. Kim
- Tetrad Graduate Program, University of California San Francisco, San Francisco, California 94158, United States
- Cell Design Institute, University of California San Francisco, San Francisco, California 94158, United States
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, California 94158, United States
| | - Hersh K. Bhargava
- Cell Design Institute, University of California San Francisco, San Francisco, California 94158, United States
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, California 94158, United States
- Biophysics Graduate Program, University of California San Francisco, San Francisco, California 94158, United States
| | - Gavin E. Shavey
- Cell Design Institute, University of California San Francisco, San Francisco, California 94158, United States
| | - Wendell A. Lim
- Cell Design Institute, University of California San Francisco, San Francisco, California 94158, United States
- Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, California 94158, United States
| | - Hana El-Samad
- Cell Design Institute, University of California San Francisco, San Francisco, California 94158, United States
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, California 94158, United States
- Chan-Zuckerberg Biohub, San Francisco, California 94158, United States
- Altos Labs Inc., Redwood City, California, 94065, United States
| | - Andrew H. Ng
- Cell Design Institute, University of California San Francisco, San Francisco, California 94158, United States
- Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, California 94158, United States
- Department of Molecular Biology, Genentech Inc., South San Francisco, California 94080, United States
| |
|
26
|
Stukenbroeker T. From De Novo to Xeno: Advancing Macromolecule Design beyond Proteins. ACS Synth Biol 2024; 13:2271-2275. [PMID: 39148431 DOI: 10.1021/acssynbio.4c00179] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2024]
Abstract
Protein synthesis methods have been adapted to incorporate an ever-growing level of non-natural components. Meanwhile, design of de novo protein structure and function has rapidly emerged as a viable capability. Yet, these two exciting trends have yet to intersect in a meaningful way. The ability to perform de novo design with non-proteinogenic components requires that synthesis and computation align on common targets and applications. This perspective examines the state of the art in these areas and identifies specific, consequential applications to advance the field toward generalized macromolecule design.
|
27
|
Plaper T, Rihtar E, Železnik Ramuta T, Forstnerič V, Jazbec V, Ivanovski F, Benčina M, Jerala R. The art of designed coiled-coils for the regulation of mammalian cells. Cell Chem Biol 2024; 31:1460-1472. [PMID: 38971158 PMCID: PMC11335187 DOI: 10.1016/j.chembiol.2024.06.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2024] [Revised: 05/04/2024] [Accepted: 06/11/2024] [Indexed: 07/08/2024]
Abstract
Synthetic biology aims to engineer complex biological systems using modular elements, with coiled-coil (CC) dimer-forming modules emerging as highly useful building blocks in the regulation of protein assemblies and biological processes. Those small modules facilitate highly specific and orthogonal protein-protein interactions, offering versatility for the regulation of diverse biological functions. Additionally, their design rules enable precise control and tunability over these interactions, which are crucial for specific applications. Recent advancements showcase their potential for use in innovative therapeutic interventions and biomedical applications. In this review, we discuss the potential of CCs, exploring their diverse applications in mammalian cells, such as synthetic biological circuit design, transcriptional and allosteric regulation, cellular assemblies, chimeric antigen receptor (CAR) T cell regulation, and genome editing, and their role in advancing the understanding and regulation of cellular processes.
Affiliation(s)
- Tjaša Plaper
- Department of Synthetic Biology and Immunology, National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia
| | - Erik Rihtar
- Department of Synthetic Biology and Immunology, National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia
| | - Taja Železnik Ramuta
- Department of Synthetic Biology and Immunology, National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia
| | - Vida Forstnerič
- Department of Synthetic Biology and Immunology, National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia
| | - Vid Jazbec
- Department of Synthetic Biology and Immunology, National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia
| | - Filip Ivanovski
- Department of Synthetic Biology and Immunology, National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia
| | - Mojca Benčina
- Department of Synthetic Biology and Immunology, National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia; Centre for Technologies of Gene and Cell Therapy, Hajdrihova 19, 1000 Ljubljana, Slovenia
| | - Roman Jerala
- Department of Synthetic Biology and Immunology, National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia; Centre for Technologies of Gene and Cell Therapy, Hajdrihova 19, 1000 Ljubljana, Slovenia.
| |
|
28
|
Zitnik M, Li MM, Wells A, Glass K, Morselli Gysi D, Krishnan A, Murali TM, Radivojac P, Roy S, Baudot A, Bozdag S, Chen DZ, Cowen L, Devkota K, Gitter A, Gosline SJC, Gu P, Guzzi PH, Huang H, Jiang M, Kesimoglu ZN, Koyuturk M, Ma J, Pico AR, Pržulj N, Przytycka TM, Raphael BJ, Ritz A, Sharan R, Shen Y, Singh M, Slonim DK, Tong H, Yang XH, Yoon BJ, Yu H, Milenković T. Current and future directions in network biology. BIOINFORMATICS ADVANCES 2024; 4:vbae099. [PMID: 39143982 PMCID: PMC11321866 DOI: 10.1093/bioadv/vbae099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Revised: 05/31/2024] [Accepted: 07/08/2024] [Indexed: 08/16/2024]
Abstract
Summary: Network biology is an interdisciplinary field bridging computational and biological sciences that has proved pivotal in advancing the understanding of cellular functions and diseases across biological systems and scales. Although the field has been around for two decades, it remains nascent. It has witnessed rapid evolution, accompanied by emerging challenges. These stem from various factors, notably the growing complexity and volume of data together with the increased diversity of data types describing different tiers of biological organization. We discuss prevailing research directions in network biology, focusing on molecular/cellular networks but also on other biological network types such as biomedical knowledge graphs, patient similarity networks, brain networks, and social/contact networks relevant to disease spread. In more detail, we highlight areas of inference and comparison of biological networks, multimodal data integration and heterogeneous networks, higher-order network analysis, machine learning on networks, and network-based personalized medicine. Following the overview of recent breakthroughs across these five areas, we offer a perspective on future directions of network biology. Additionally, we discuss scientific communities, educational initiatives, and the importance of fostering diversity within the field. This article establishes a roadmap for an immediate and long-term vision for network biology. Availability and implementation: Not applicable.
Affiliation(s)
- Marinka Zitnik
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, United States
| | - Michelle M Li
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, United States
| | - Aydin Wells
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
- Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN 46556, United States
- Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, United States
| | - Kimberly Glass
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, United States
| | - Deisy Morselli Gysi
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, United States
- Department of Statistics, Federal University of Paraná, Curitiba, Paraná 81530-015, Brazil
- Department of Physics, Northeastern University, Boston, MA 02115, United States
| | - Arjun Krishnan
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, United States
| | - T M Murali
- Department of Computer Science, Virginia Tech, Blacksburg, VA 24061, United States
| | - Predrag Radivojac
- Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, United States
| | - Sushmita Roy
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53715, United States
- Wisconsin Institute for Discovery, Madison, WI 53715, United States
| | - Anaïs Baudot
- Aix Marseille Université, INSERM, MMG, Marseille, France
| | - Serdar Bozdag
- Department of Computer Science and Engineering, University of North Texas, Denton, TX 76203, United States
- Department of Mathematics, University of North Texas, Denton, TX 76203, United States
| | - Danny Z Chen
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
| | - Lenore Cowen
- Department of Computer Science, Tufts University, Medford, MA 02155, United States
| | - Kapil Devkota
- Department of Computer Science, Tufts University, Medford, MA 02155, United States
| | - Anthony Gitter
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53715, United States
- Morgridge Institute for Research, Madison, WI 53715, United States
| | - Sara J C Gosline
- Biological Sciences Division, Pacific Northwest National Laboratory, Seattle, WA 98109, United States
| | - Pengfei Gu
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
| | - Pietro H Guzzi
- Department of Medical and Surgical Sciences, University Magna Graecia of Catanzaro, Catanzaro, 88100, Italy
| | - Heng Huang
- Department of Computer Science, University of Maryland College Park, College Park, MD 20742, United States
| | - Meng Jiang
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
| | - Ziynet Nesibe Kesimoglu
- Department of Computer Science and Engineering, University of North Texas, Denton, TX 76203, United States
- National Center of Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20814, United States
| | - Mehmet Koyuturk
- Department of Computer and Data Sciences, Case Western Reserve University, Cleveland, OH 44106, United States
| | - Jian Ma
- Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, United States
| | - Alexander R Pico
- Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA 94158, United States
| | - Nataša Pržulj
- Department of Computer Science, University College London, London, WC1E 6BT, England
- ICREA, Catalan Institution for Research and Advanced Studies, Barcelona, 08010, Spain
- Barcelona Supercomputing Center (BSC), Barcelona, 08034, Spain
| | - Teresa M Przytycka
- National Center of Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20814, United States
| | - Benjamin J Raphael
- Department of Computer Science, Princeton University, Princeton, NJ 08544, United States
| | - Anna Ritz
- Department of Biology, Reed College, Portland, OR 97202, United States
| | - Roded Sharan
- School of Computer Science, Tel Aviv University, Tel Aviv, 69978, Israel
| | - Yang Shen
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, United States
| | - Mona Singh
- Department of Computer Science, Princeton University, Princeton, NJ 08544, United States
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, United States
| | - Donna K Slonim
- Department of Computer Science, Tufts University, Medford, MA 02155, United States
| | - Hanghang Tong
- Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, United States
| | - Xinan Holly Yang
- Department of Pediatrics, University of Chicago, Chicago, IL 60637, United States
| | - Byung-Jun Yoon
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, United States
- Computational Science Initiative, Brookhaven National Laboratory, Upton, NY 11973, United States
| | - Haiyuan Yu
- Department of Computational Biology, Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, United States
| | - Tijana Milenković
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
- Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN 46556, United States
- Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, United States
| |
|
29
|
Min X, Liao Y, Chen X, Yang Q, Ying J, Zou J, Yang C, Zhang J, Ge S, Xia N. PB-GPT: An innovative GPT-based model for protein backbone generation. Structure 2024:S0969-2126(24)00279-X. [PMID: 39173620 DOI: 10.1016/j.str.2024.07.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Revised: 06/02/2024] [Accepted: 07/28/2024] [Indexed: 08/24/2024]
Abstract
With advanced computational methods, it is now feasible to modify or design proteins for specific functions, a process with significant implications for disease treatment and other medical applications. Protein structures and functions are intrinsically linked to their backbones, making the design of these backbones a pivotal aspect of protein engineering. In this study, we focus on the task of unconditionally generating protein backbones. By means of codebook quantization and compression dictionaries, we convert protein backbone structures into a distinctive coded language and propose a GPT-based protein backbone generation model, PB-GPT. To validate the generalization performance of the model, we trained and evaluated the model on both public datasets and small protein datasets. The results demonstrate that our model has the capability to unconditionally generate elaborate, highly realistic protein backbones with structural patterns resembling those of natural proteins, thus showcasing the significant potential of large language models in protein structure design.
Affiliation(s)
- Xiaoping Min
- School of Informatics, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China; National Institute of Diagnostics and Vaccine Development in Infectious Diseases, Xiamen University, State Key, No. 422 Siming South Rd, Xiamen 361005, China; State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China
| | - Yiyang Liao
- School of Informatics, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China; National Institute of Diagnostics and Vaccine Development in Infectious Diseases, Xiamen University, State Key, No. 422 Siming South Rd, Xiamen 361005, China; State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China
| | - Xiao Chen
- School of Informatics, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China
| | - Qianli Yang
- Institute of Artificial Intelligence, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China
| | - Junjie Ying
- Institute of Artificial Intelligence, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China
| | - Jiajun Zou
- School of Informatics, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China
| | - Chongzhou Yang
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, Xiamen University, State Key, No. 422 Siming South Rd, Xiamen 361005, China; Institute of Artificial Intelligence, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China
| | - Jun Zhang
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, Xiamen University, State Key, No. 422 Siming South Rd, Xiamen 361005, China; School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China; State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China
| | - Shengxiang Ge
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, Xiamen University, State Key, No. 422 Siming South Rd, Xiamen 361005, China; School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China; State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China.
| | - Ningshao Xia
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, Xiamen University, State Key, No. 422 Siming South Rd, Xiamen 361005, China; School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China; State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China.
| |
|
30
|
Ishitani R, Takemoto M, Tomii K. Protein ligand binding site prediction using graph transformer neural network. PLoS One 2024; 19:e0308425. [PMID: 39106255 DOI: 10.1371/journal.pone.0308425] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2024] [Accepted: 07/23/2024] [Indexed: 08/09/2024] Open
Abstract
Ligand binding site prediction is a crucial initial step in structure-based drug discovery. Although several methods have been proposed previously, including those using geometry-based and machine learning techniques, their accuracy is still considered insufficient. In this study, we introduce an approach that leverages a graph transformer neural network to rank the results of a geometry-based pocket detection method. We also created a larger training dataset compared to the conventionally used sc-PDB and investigated the correlation between the dataset size and prediction performance. Our findings indicate that utilizing a graph transformer-based method alongside a larger training dataset could enhance the performance of ligand binding site prediction.
Affiliation(s)
- Ryuichiro Ishitani
- Division of Computational Drug Discovery and Design, Medical Research Institute, Tokyo Medical and Dental University, Bunkyo-ku, Tokyo, Japan
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
- Preferred Networks, Inc., Chiyoda-ku, Tokyo, Japan
| | - Mizuki Takemoto
- Division of Computational Drug Discovery and Design, Medical Research Institute, Tokyo Medical and Dental University, Bunkyo-ku, Tokyo, Japan
| | - Kentaro Tomii
- Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), Koto-ku, Tokyo, Japan
| |
|
31
|
Yin Y, Ren H, Wu H, Lu Z. Triclosan Dioxygenase: A Novel Two-component Rieske Nonheme Iron Ring-hydroxylating Dioxygenase Initiates Triclosan Degradation. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2024; 58:13833-13844. [PMID: 39012163 DOI: 10.1021/acs.est.4c02845] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/17/2024]
Abstract
The emerging contaminant triclosan (TCS) is widely distributed both in surface water and in wastewater and poses a threat to aquatic organisms and human health due to its resistance to degradation. The dioxygenase enzyme TcsAB has been speculated to perform the initial degradation of TCS, but its precise catalytic mechanism remains unclear. In this study, the function of TcsAB was elucidated using multiple biochemical and molecular biology methods. Escherichia coli BL21(DE3) heterologously expressing tcsAB from Sphingomonas sp. RD1 converted TCS to 2,4-dichlorophenol. TcsAB belongs to the group IA family of two-component Rieske nonheme iron ring-hydroxylating dioxygenases. The highest amino acid identity of TcsA and the large subunits of other dioxygenases in the same family was only 35.50%, indicating that TcsAB is a novel dioxygenase. Mutagenesis of residues near the substrate binding pocket decreased the TCS-degrading activity and narrowed the substrate spectrum, except for the TcsAF343A mutant. A meta-analysis of 1492 samples from wastewater treatment systems worldwide revealed that tcsA genes are widely distributed. This study is the first to report that the TCS-specific dioxygenase TcsAB is responsible for the initial degradation of TCS. Studying the microbial degradation mechanism of TCS is crucial for removing this pollutant from the environment.
Affiliation(s)
- Yiran Yin
- MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou 310058, China
- Cancer Center, Zhejiang University, Hangzhou 310058, China
| | - Hao Ren
- MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou 310058, China
- Cancer Center, Zhejiang University, Hangzhou 310058, China
| | - Hao Wu
- MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou 310058, China
- Cancer Center, Zhejiang University, Hangzhou 310058, China
| | - Zhenmei Lu
- MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou 310058, China
- Cancer Center, Zhejiang University, Hangzhou 310058, China
| |
|
32
|
Peng S, Rajjou L. Advancing plant biology through deep learning-powered natural language processing. PLANT CELL REPORTS 2024; 43:208. [PMID: 39102077 DOI: 10.1007/s00299-024-03294-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2024] [Accepted: 07/19/2024] [Indexed: 08/06/2024]
Abstract
The application of deep learning methods, specifically the utilization of Large Language Models (LLMs), in the field of plant biology holds significant promise for generating novel knowledge on plant cell systems. The LLM framework exhibits exceptional potential, particularly with the development of Protein Language Models (PLMs), allowing for in-depth analyses of nucleic acid and protein sequences. This analytical capacity facilitates the discernment of intricate patterns and relationships within biological data, encompassing multi-scale information within DNA or protein sequences. The contribution of PLMs extends beyond mere sequence patterns and structure-function recognition; it also supports advancements in genetic improvements for agriculture. The integration of deep learning approaches into the domain of plant sciences offers opportunities for major breakthroughs in basic research across multi-scale plant traits. Consequently, the strategic application of deep learning methodologies, particularly leveraging the potential of LLMs, will undoubtedly play a pivotal role in advancing plant sciences, plant production, and plant uses, and in propelling the trajectory toward sustainable agroecological and agro-food transitions.
Affiliation(s)
- Shuang Peng
- Université Paris-Saclay, INRAE, AgroParisTech, Institut Jean-Pierre Bourgin for Plant Sciences (IJPB), 78000, Versailles, France
| | - Loïc Rajjou
- Université Paris-Saclay, INRAE, AgroParisTech, Institut Jean-Pierre Bourgin for Plant Sciences (IJPB), 78000, Versailles, France.
| |
|
33
|
Rodriguez DCP, Weber KC, Sundberg B, Glasgow A. MAGPIE: An interactive tool for visualizing and analyzing protein-ligand interactions. Protein Sci 2024; 33:e5027. [PMID: 38989559 PMCID: PMC11237554 DOI: 10.1002/pro.5027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2024] [Revised: 04/22/2024] [Accepted: 05/05/2024] [Indexed: 07/12/2024]
Abstract
Quantitative tools to compile and analyze biomolecular interactions among chemically diverse binding partners would improve therapeutic design and aid in studying molecular evolution. Here we present Mapping Areas of Genetic Parsimony In Epitopes (MAGPIE), a publicly available software package for simultaneously visualizing and analyzing thousands of interactions between a single protein or small molecule ligand (the "target") and all of its protein binding partners ("binders"). MAGPIE generates an interactive three-dimensional visualization from a set of protein complex structures that share the target ligand, as well as sequence logo-style amino acid frequency graphs that show all the amino acids from the set of protein binders that interact with user-defined target ligand positions or chemical groups. MAGPIE highlights all the salt bridge and hydrogen bond interactions made by the target in the visualization and as separate amino acid frequency graphs. Finally, MAGPIE collates the most common target-binder interactions as a list of "hotspots," which can be used to analyze trends or guide the de novo design of protein binders. As an example of the utility of the program, we used MAGPIE to probe how different antibody fragments bind a viral antigen; how a common metabolite binds diverse protein partners; and how two ligands bind orthologs of a well-conserved glycolytic enzyme for a detailed understanding of evolutionarily conserved interactions involved in its activation and inhibition. MAGPIE is implemented in Python 3 and freely available at https://github.com/glasgowlab/MAGPIE, along with sample datasets, usage examples, and helper scripts to prepare input structures.
Affiliation(s)
- Daniel C. Pineda Rodriguez
- Department of Biochemistry and Molecular Biophysics, Columbia University Irving Medical Center, New York, New York, USA
| | - Kyle C. Weber
- Department of Biochemistry and Molecular Biophysics, Columbia University Irving Medical Center, New York, New York, USA
| | - Belen Sundberg
- Department of Biochemistry and Molecular Biophysics, Columbia University Irving Medical Center, New York, New York, USA
| | - Anum Glasgow
- Department of Biochemistry and Molecular Biophysics, Columbia University Irving Medical Center, New York, New York, USA
| |
|
34
|
Albanese KI, Petrenas R, Pirro F, Naudin EA, Borucu U, Dawson WM, Scott DA, Leggett GJ, Weiner OD, Oliver TAA, Woolfson DN. Rationally seeded computational protein design of ɑ-helical barrels. Nat Chem Biol 2024; 20:991-999. [PMID: 38902458 PMCID: PMC11288890 DOI: 10.1038/s41589-024-01642-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Accepted: 05/09/2024] [Indexed: 06/22/2024]
Abstract
Computational protein design is advancing rapidly. Here we describe efficient routes starting from validated parallel and antiparallel peptide assemblies to design two families of α-helical barrel proteins with central channels that bind small molecules. Computational designs are seeded by the sequences and structures of defined de novo oligomeric barrel-forming peptides, and adjacent helices are connected by loop building. For targets with antiparallel helices, short loops are sufficient. However, targets with parallel helices require longer connectors; namely, an outer layer of helix-turn-helix-turn-helix motifs that are packed onto the barrels. Throughout these computational pipelines, residues that define open states of the barrels are maintained. This minimizes sequence sampling, accelerating the design process. For each of six targets, just two to six synthetic genes are made for expression in Escherichia coli. On average, 70% of these genes express to give soluble monomeric proteins that are fully characterized, including high-resolution structures for most targets that match the design models with high accuracy.
Affiliation(s)
- Katherine I Albanese
- School of Chemistry, University of Bristol, Bristol, UK
- Max Planck-Bristol Centre for Minimal Biology, University of Bristol, Bristol, UK
| | | | - Fabio Pirro
- School of Chemistry, University of Bristol, Bristol, UK
| | | | - Ufuk Borucu
- School of Biochemistry, University of Bristol, Medical Sciences Building, Bristol, UK
| | | | - D Arne Scott
- Rosa Biotech, Science Creates St Philips, Bristol, UK
| | | | - Orion D Weiner
- Cardiovascular Research Institute, Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA, USA
| | | | - Derek N Woolfson
- School of Chemistry, University of Bristol, Bristol, UK.
- Max Planck-Bristol Centre for Minimal Biology, University of Bristol, Bristol, UK.
- School of Biochemistry, University of Bristol, Medical Sciences Building, Bristol, UK.
- Bristol BioDesign Institute, University of Bristol, Bristol, UK.
| |
|
35
|
Sun J, Xiao Y, Xing W, Jiang W, Hu X, Li H, Liu Z, Jin Q, Ren P, Zhang H, Lobie PE. Pharmacodynamic and pharmacokinetic profiles of a novel GLP-1 receptor biased agonist-SAL0112. Biomed Pharmacother 2024; 177:116965. [PMID: 38925019 DOI: 10.1016/j.biopha.2024.116965] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2024] [Revised: 06/11/2024] [Accepted: 06/15/2024] [Indexed: 06/28/2024] Open
Abstract
BACKGROUND AND PURPOSE: GLP-1 receptor agonists are clinically utilized for type 2 diabetes and obesity. In vitro and in vivo preclinical studies were performed to assess the druggability of a novel small molecule GLP-1 receptor biased agonist, SAL0112. EXPERIMENTAL APPROACH: The HTRF assay, FLIPR assay, TR-FRET assay, and PathHunter assay were utilized for in vitro studies. Liver transporter tests were conducted using the HEK293-OATP1B1 and HEK293-OATP1B3 cell lines. In vitro stability assessments of various species and in vivo PK studies in rodents were performed. A model of type 2 diabetes and obesity induced by a high-energy diet in transgenic C57BL/6 mice expressing the human GLP-1 receptor gene was used. PRINCIPAL RESULTS: SAL0112 demonstrated high potency and selectivity in activating the Gαs pathway of the GLP-1 receptor, with no observed desensitization. SAL0112 demonstrated greater stability in human and rat liver microsomes compared to Danuglipron. In vivo PK studies revealed higher absorption of SAL0112 in rats. SAL0112 displayed a significantly lower potential for DDI on liver transporters compared to Danuglipron. SAL0112 led to significant reductions in body weight (P<0.001), blood glucose levels in OGTT (P<0.001), and HbA1c (P<0.05), and improved insulin resistance (P<0.01). Notably, it increased peripheral adipocyte density and resolved hepatic steatosis. The efficacy of SAL0112 was found to be comparable to that of Danuglipron and Liraglutide. CONCLUSION: SAL0112 demonstrated potent and selective GLP-1 receptor biased agonism, effectively alleviating signs of type 2 diabetes in a mouse model. These promising findings pave the way for the advancement of SAL0112 into clinical trials.
Affiliation(s)
- Jingchao Sun
- iBHE, Tsinghua Shenzhen International Graduate School, Shenzhen, Guangdong, China; R&D Center, Shenzhen Salubris Pharmaceutical Co., Ltd., Shenzhen, Guangdong, China.
| | - Ying Xiao
- R&D Center, Shenzhen Salubris Pharmaceutical Co., Ltd., Shenzhen, Guangdong, China
| | - Wei Xing
- R&D Center, Shenzhen Salubris Pharmaceutical Co., Ltd., Shenzhen, Guangdong, China
| | - Wenjuan Jiang
- R&D Center, Shenzhen Salubris Pharmaceutical Co., Ltd., Shenzhen, Guangdong, China
| | - Xuefeng Hu
- R&D Center, Shenzhen Salubris Pharmaceutical Co., Ltd., Shenzhen, Guangdong, China
| | - Hongchao Li
- R&D Center, Shenzhen Salubris Pharmaceutical Co., Ltd., Shenzhen, Guangdong, China
| | - Zhaojun Liu
- Pharmacology Department, Innoland Biosciences (SuZhou) co., LTD. Suzhou, Jiangsu, China
| | - Qian Jin
- Pharmacology Department, Innoland Biosciences (SuZhou) co., LTD. Suzhou, Jiangsu, China
| | - Peng Ren
- Biology Department, Pharmaron Inc. Beijing, China
| | - Hongmei Zhang
- Biology Department, WuXi AppTec (Shanghai) Co., Ltd. Shanghai, China
| | - Peter E Lobie
- iBHE, Tsinghua Shenzhen International Graduate School, Shenzhen, Guangdong, China.
| |
|
36
|
Pillai A, Idris A, Philomin A, Weidle C, Skotheim R, Leung PJY, Broerman A, Demakis C, Borst AJ, Praetorius F, Baker D. De novo design of allosterically switchable protein assemblies. Nature 2024; 632:911-920. [PMID: 39143214 PMCID: PMC11338832 DOI: 10.1038/s41586-024-07813-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 07/11/2024] [Indexed: 08/16/2024]
Abstract
Allosteric modulation of protein function, wherein the binding of an effector to a protein triggers conformational changes at distant functional sites, plays a central part in the control of metabolism and cell signalling [1-3]. There has been considerable interest in designing allosteric systems, both to gain insight into the mechanisms underlying such 'action at a distance' modulation and to create synthetic proteins whose functions can be regulated by effectors [4-7]. However, emulating the subtle conformational changes distributed across many residues, characteristic of natural allosteric proteins, is a significant challenge [8,9]. Here, inspired by the classic Monod-Wyman-Changeux model of cooperativity [10], we investigate the de novo design of allostery through rigid-body coupling of peptide-switchable hinge modules [11] to protein interfaces [12] that direct the formation of alternative oligomeric states. We find that this approach can be used to generate a wide variety of allosterically switchable systems, including cyclic rings that incorporate or eject subunits in response to peptide binding and dihedral cages that undergo effector-induced disassembly. Size-exclusion chromatography, mass photometry [13] and electron microscopy reveal that these designed allosteric protein assemblies closely resemble the design models in both the presence and absence of peptide effectors and can have ligand-binding cooperativity comparable to classic natural systems such as haemoglobin [14]. Our results indicate that allostery can arise from global coupling of the energetics of protein substructures without optimized side-chain-side-chain allosteric communication pathways and provide a roadmap for generating allosterically triggerable delivery systems, protein nanomachines and cellular feedback control circuitry.
Affiliation(s)
- Arvind Pillai
- Department of Biochemistry, University of Washington, Seattle, WA, USA.
- Institute for Protein Design, University of Washington, Seattle, WA, USA.
| | - Abbas Idris
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Department of Bioengineering, University of Washington, Seattle, WA, USA
| | - Annika Philomin
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Connor Weidle
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Rebecca Skotheim
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Philip J Y Leung
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Program in Molecular Engineering, University of Washington, Seattle, WA, USA
| | - Adam Broerman
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Department of Chemical Engineering, University of Washington, Seattle, WA, USA
| | - Cullen Demakis
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Graduate Program in Biological Physics, Structure, and Design, University of Washington, Seattle, WA, USA
| | - Andrew J Borst
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Florian Praetorius
- Department of Biochemistry, University of Washington, Seattle, WA, USA.
- Institute for Protein Design, University of Washington, Seattle, WA, USA.
- Institute of Science and Technology Austria (ISTA), Klosterneuburg, Austria.
| | - David Baker
- Department of Biochemistry, University of Washington, Seattle, WA, USA.
- Institute for Protein Design, University of Washington, Seattle, WA, USA.
| |
|
37
|
Listov D, Goverde CA, Correia BE, Fleishman SJ. Opportunities and challenges in design and optimization of protein function. Nat Rev Mol Cell Biol 2024; 25:639-653. [PMID: 38565617 PMCID: PMC7616297 DOI: 10.1038/s41580-024-00718-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/27/2024] [Indexed: 04/04/2024]
Abstract
The field of protein design has made remarkable progress over the past decade. Historically, the low reliability of purely structure-based design methods limited their application, but recent strategies that combine structure-based and sequence-based calculations, as well as machine learning tools, have dramatically improved protein engineering and design. In this Review, we discuss how these methods have enabled the design of increasingly complex structures and therapeutically relevant activities. Additionally, protein optimization methods have improved the stability and activity of complex eukaryotic proteins. Thanks to their increased reliability, computational design methods have been applied to improve therapeutics and enzymes for green chemistry and have generated vaccine antigens, antivirals and drug-delivery nano-vehicles. Moreover, the high success of design methods reflects an increased understanding of basic rules that govern the relationships among protein sequence, structure and function. However, de novo design is still limited mostly to α-helix bundles, restricting its potential to generate sophisticated enzymes and diverse protein and small-molecule binders. Designing complex protein structures is a challenging but necessary next step if we are to realize our objective of generating new-to-nature activities.
Affiliation(s)
- Dina Listov
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel
| | - Casper A Goverde
- Institute of Bioengineering, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Bruno E Correia
- Institute of Bioengineering, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.
| | - Sarel Jacob Fleishman
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel.
| |
|
38
|
Yang Y, Pan Z, Sun J, Welch J, Klionsky DJ. Autophagy and machine learning: Unanswered questions. Biochim Biophys Acta Mol Basis Dis 2024; 1870:167263. [PMID: 38801963 DOI: 10.1016/j.bbadis.2024.167263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Revised: 04/27/2024] [Accepted: 05/21/2024] [Indexed: 05/29/2024]
Abstract
Autophagy is a critical conserved cellular process in maintaining cellular homeostasis by clearing and recycling damaged organelles and intracellular components in lysosomes and vacuoles. Autophagy plays a vital role in cell survival, bioenergetic homeostasis, organism development, and cell death regulation. Malfunctions in autophagy are associated with various human diseases and health disorders, such as cancers and neurodegenerative diseases. Significant effort has been devoted to autophagy-related research in the context of genes, proteins, diagnosis, etc. In recent years, there has been a surge of studies utilizing state of the art machine learning (ML) tools to analyze and understand the roles of autophagy in various biological processes. We taxonomize ML techniques that are applicable in an autophagy context, comprehensively review existing efforts being taken in this direction, and outline principles to consider in a biomedical context. In recognition of recent groundbreaking advances in the deep-learning community, we discuss new opportunities in interdisciplinary collaborations and seek to engage autophagy and computer science researchers to promote autophagy research with joint efforts.
Affiliation(s)
- Ying Yang
- Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, MI 48109, USA; Life Sciences Institute, University of Michigan, Ann Arbor, MI 48109, USA; Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA; Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Zhaoying Pan
- Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA
| | - Jianhui Sun
- Department of Computer Science, University of Virginia, Charlottesville, VA 22903, USA
| | - Joshua Welch
- Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA; Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Daniel J Klionsky
- Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, MI 48109, USA; Life Sciences Institute, University of Michigan, Ann Arbor, MI 48109, USA.
| |
|
39
|
Frasnetti E, Magni A, Castelli M, Serapian SA, Moroni E, Colombo G. Structures, dynamics, complexes, and functions: From classic computation to artificial intelligence. Curr Opin Struct Biol 2024; 87:102835. [PMID: 38744148 DOI: 10.1016/j.sbi.2024.102835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 04/14/2024] [Accepted: 04/22/2024] [Indexed: 05/16/2024]
Abstract
Computational approaches can provide highly detailed insight into the molecular recognition processes that underlie drug binding, the assembly of protein complexes, and the regulation of biological functional processes. Classical simulation methods can bridge a wide range of length- and time-scales typically involved in such processes. Lately, automated learning and artificial intelligence methods have shown the potential to expand the reach of physics-based approaches, ushering in the possibility to model and even design complex protein architectures. The synergy between atomistic simulations and AI methods is an emerging frontier with a huge potential for advances in structural biology. Herein, we explore various examples and frameworks for these approaches, providing select instances and applications that illustrate their impact on fundamental biomolecular problems.
Affiliation(s)
- Elena Frasnetti
- Department of Chemistry, University of Pavia, via Taramelli 12, 27100 Pavia, Italy
| | - Andrea Magni
- Department of Chemistry, University of Pavia, via Taramelli 12, 27100 Pavia, Italy
| | - Matteo Castelli
- Department of Chemistry, University of Pavia, via Taramelli 12, 27100 Pavia, Italy
| | - Stefano A Serapian
- Department of Chemistry, University of Pavia, via Taramelli 12, 27100 Pavia, Italy
| | | | - Giorgio Colombo
- Department of Chemistry, University of Pavia, via Taramelli 12, 27100 Pavia, Italy.
| |
|
40
|
Wei Z, Li B, Wen X, Jakobsson V, Liu P, Chen X, Zhang J. Engineered Antibodies as Cancer Radiotheranostics. Adv Sci (Weinh) 2024; 11:e2402361. [PMID: 38874523 PMCID: PMC11321656 DOI: 10.1002/advs.202402361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Revised: 05/06/2024] [Indexed: 06/15/2024]
Abstract
Radiotheranostics is a rapidly growing approach in personalized medicine, merging diagnostic imaging and targeted radiotherapy to allow for the precise detection and treatment of diseases, notably cancer. Radiolabeled antibodies have become indispensable tools in the field of cancer theranostics due to their high specificity and affinity for cancer-associated antigens, which allows for accurate targeting with minimal impact on surrounding healthy tissues and enhances therapeutic efficacy while reducing side effects, as well as their immune-modulating ability and their versatility and flexibility in engineering and conjugation. However, there are inherent limitations in using antibodies as a platform for radiopharmaceuticals due to their natural activities within the immune system, large size preventing effective tumor penetration, and relatively long half-life with concerns for prolonged radioactivity exposure. Antibody engineering can solve these challenges while preserving the many advantages of the immunoglobulin framework. In this review, the goal is to give a general overview of antibody engineering and design for tumor radiotheranostics. In particular, four ways in which antibody engineering is applied to enhance radioimmunoconjugates are discussed: pharmacokinetics optimization, site-specific bioconjugation, modulation of Fc interactions, and bispecific construct creation. Radionuclide choices for designed antibody radionuclide conjugates, conjugation techniques, and future directions for antibody radionuclide conjugate innovation and advancement are also discussed.
Affiliation(s)
- Zhenni Wei
- Department of Diagnostic Radiology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 119074, Singapore
- Nanomedicine Translational Research Program, NUS Center for Nanomedicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117597, Singapore
- Clinical Imaging Research Centre, Centre for Translational Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117599, Singapore
- Theranostics Center of Excellence (TCE), Yong Loo Lin School of Medicine, National University of Singapore, 11 Biopolis Way, Helios, Singapore 138667, Singapore
| | - Bingyu Li
- Department of Diagnostic Radiology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 119074, Singapore
- Nanomedicine Translational Research Program, NUS Center for Nanomedicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117597, Singapore
- Clinical Imaging Research Centre, Centre for Translational Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117599, Singapore
- Theranostics Center of Excellence (TCE), Yong Loo Lin School of Medicine, National University of Singapore, 11 Biopolis Way, Helios, Singapore 138667, Singapore
| | - Xuejun Wen
- Department of Diagnostic Radiology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 119074, Singapore
- Nanomedicine Translational Research Program, NUS Center for Nanomedicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117597, Singapore
- Clinical Imaging Research Centre, Centre for Translational Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117599, Singapore
- Theranostics Center of Excellence (TCE), Yong Loo Lin School of Medicine, National University of Singapore, 11 Biopolis Way, Helios, Singapore 138667, Singapore
| | - Vivianne Jakobsson
- Department of Diagnostic Radiology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 119074, Singapore
- Nanomedicine Translational Research Program, NUS Center for Nanomedicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117597, Singapore
- Clinical Imaging Research Centre, Centre for Translational Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117599, Singapore
| | - Peifei Liu
- Department of Diagnostic Radiology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 119074, Singapore
- Nanomedicine Translational Research Program, NUS Center for Nanomedicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117597, Singapore
- Clinical Imaging Research Centre, Centre for Translational Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117599, Singapore
- Theranostics Center of Excellence (TCE), Yong Loo Lin School of Medicine, National University of Singapore, 11 Biopolis Way, Helios, Singapore 138667, Singapore
| | - Xiaoyuan Chen
- Department of Diagnostic Radiology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 119074, Singapore
- Nanomedicine Translational Research Program, NUS Center for Nanomedicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117597, Singapore
- Clinical Imaging Research Centre, Centre for Translational Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117599, Singapore
- Theranostics Center of Excellence (TCE), Yong Loo Lin School of Medicine, National University of Singapore, 11 Biopolis Way, Helios, Singapore 138667, Singapore
- Departments of Surgery, Chemical and Biomolecular Engineering, and Biomedical Engineering, Yong Loo Lin School of Medicine and College of Design and Engineering, National University of Singapore, Singapore 119074, Singapore
- Institute of Molecular and Cell Biology, Agency for Science, Technology and Research (A*STAR), 61 Biopolis Drive, Proteos, Singapore 138673, Singapore
| | - Jingjing Zhang
- Department of Diagnostic Radiology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 119074, Singapore
- Nanomedicine Translational Research Program, NUS Center for Nanomedicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117597, Singapore
- Clinical Imaging Research Centre, Centre for Translational Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117599, Singapore
- Theranostics Center of Excellence (TCE), Yong Loo Lin School of Medicine, National University of Singapore, 11 Biopolis Way, Helios, Singapore 138667, Singapore
| |
|
41
|
Xue Z, Zhou T, Xu Z, Yu S, Dai Q, Fang L. Fully forward mode training for optical neural networks. Nature 2024; 632:280-286. [PMID: 39112621 PMCID: PMC11306102 DOI: 10.1038/s41586-024-07687-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Accepted: 06/06/2024] [Indexed: 08/10/2024]
Abstract
Optical computing promises to improve the speed and energy efficiency of machine learning applications [1-6]. However, current approaches to efficiently train these models are limited by in silico emulation on digital computers. Here we develop a method called fully forward mode (FFM) learning, which implements the compute-intensive training process on the physical system. The majority of the machine learning operations are thus efficiently conducted in parallel on site, alleviating numerical modelling constraints. In free-space and integrated photonics, we experimentally demonstrate optical systems with state-of-the-art performances for a given network size. FFM learning shows that training the deepest optical neural networks with millions of parameters achieves accuracy equivalent to the ideal model. It supports all-optical focusing through scattering media with a resolution of the diffraction limit; it can also image in parallel the objects hidden outside the direct line of sight at over a kilohertz frame rate and can conduct all-optical processing with light intensity as weak as subphoton per pixel (5.40 × 10^18 operations-per-second-per-watt energy efficiency) at room temperature. Furthermore, we prove that FFM learning can automatically search non-Hermitian exceptional points without an analytical model. FFM learning not only facilitates orders-of-magnitude-faster learning processes, but can also advance applied and theoretical fields such as deep neural networks, ultrasensitive perception and topological photonics.
Affiliation(s)
- Zhiwei Xue
- Department of Electronic Engineering, Tsinghua University, Beijing, China
- Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing, China
- Institute for Brain and Cognitive Sciences, Tsinghua University, Beijing, China
- Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
| | - Tiankuang Zhou
- Department of Electronic Engineering, Tsinghua University, Beijing, China
- Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing, China
- Institute for Brain and Cognitive Sciences, Tsinghua University, Beijing, China
| | - Zhihao Xu
- Department of Electronic Engineering, Tsinghua University, Beijing, China
- Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing, China
- Institute for Brain and Cognitive Sciences, Tsinghua University, Beijing, China
- Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
| | - Shaoliang Yu
- Research Center for Intelligent Optoelectronic Computing, Zhejiang Laboratory, Hangzhou, China
| | - Qionghai Dai
- Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing, China.
- Institute for Brain and Cognitive Sciences, Tsinghua University, Beijing, China.
- Department of Automation, Tsinghua University, Beijing, China.
| | - Lu Fang
- Department of Electronic Engineering, Tsinghua University, Beijing, China.
- Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing, China.
- Institute for Brain and Cognitive Sciences, Tsinghua University, Beijing, China.
| |
|
42
|
Chen T, Zhang Y, Chatterjee P. moPPIt: De Novo Generation of Motif-Specific Binders with Protein Language Models. bioRxiv 2024:2024.07.31.606098. [PMID: 39131360 PMCID: PMC11312608 DOI: 10.1101/2024.07.31.606098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/13/2024]
Abstract
The ability to precisely target specific motifs on disease-related proteins, whether conserved epitopes on viral proteins, intrinsically disordered regions within transcription factors, or breakpoint junctions in fusion oncoproteins, is essential for modulating their function while minimizing off-target effects. Current methods struggle to achieve this specificity without reliable structural information. In this work, we introduce a motif-specific PPI targeting algorithm, moPPIt, for de novo generation of motif-specific peptide binders from the target protein sequence alone. At the core of moPPIt is BindEvaluator, a transformer-based model that interpolates protein language model embeddings of two proteins via a series of multi-headed self-attention blocks, with a key focus on local motif features. Trained on over 510,000 annotated PPIs, BindEvaluator accurately predicts target binding sites given protein-protein sequence pairs with a test AUC > 0.94, improving to AUC > 0.96 when fine-tuned on peptide-protein pairs. By combining BindEvaluator with our PepMLM peptide generator and genetic algorithm-based optimization, moPPIt generates peptides that bind specifically to user-defined residues on target proteins. We demonstrate moPPIt's efficacy in computationally designing binders to specific motifs, first on targets with known binding peptides and then extending to structured and disordered targets with no known binders. In total, moPPIt serves as a powerful tool for developing highly specific peptide therapeutics without relying on target structure or structure-dependent latent spaces.
Affiliation(s)
- Tong Chen
- Department of Biomedical Engineering, Duke University
| | - Yinuo Zhang
- Department of Biostatistics and Bioinformatics, Duke University
| | - Pranam Chatterjee
- Department of Biomedical Engineering, Duke University
- Department of Biostatistics and Bioinformatics, Duke University
- Department of Computer Science, Duke University
| |
|
43
|
Jiang H, Jude KM, Wu K, Fallas J, Ueda G, Brunette TJ, Hicks DR, Pyles H, Yang A, Carter L, Lamb M, Li X, Levine PM, Stewart L, Garcia KC, Baker D. De novo design of buttressed loops for sculpting protein functions. Nat Chem Biol 2024; 20:974-980. [PMID: 38816644 PMCID: PMC11288887 DOI: 10.1038/s41589-024-01632-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2023] [Accepted: 04/29/2024] [Indexed: 06/01/2024]
Abstract
In natural proteins, structured loops have central roles in molecular recognition, signal transduction and enzyme catalysis. However, because of the intrinsic flexibility and irregularity of loop regions, organizing multiple structured loops at protein functional sites has been very difficult to achieve by de novo protein design. Here we describe a solution to this problem that designs tandem repeat proteins with structured loops (9-14 residues) buttressed by extensive hydrogen bonding interactions. Experimental characterization shows that the designs are monodisperse, highly soluble, folded and thermally stable. Crystal structures are in close agreement with the design models, with the loops structured and buttressed as designed. We demonstrate the functionality afforded by loop buttressing by designing and characterizing binders for extended peptides in which the loops form one side of an extended binding pocket. The ability to design multiple structured loops should contribute generally to efforts to design new protein functions.
Affiliation(s)
- Hanlun Jiang
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Kevin M Jude
- Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, CA, USA
- Department of Molecular and Cellular Physiology, Stanford University School of Medicine, Stanford, CA, USA
| | - Kejia Wu
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Biological Physics, Structure and Design Graduate Program, University of Washington, Seattle, WA, USA
| | - Jorge Fallas
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - George Ueda
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - T J Brunette
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Derrick R Hicks
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Harley Pyles
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Aerin Yang
- Department of Molecular and Cellular Physiology, Stanford University School of Medicine, Stanford, CA, USA
| | - Lauren Carter
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Mila Lamb
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Xinting Li
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Paul M Levine
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Lance Stewart
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - K Christopher Garcia
- Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, CA, USA.
- Department of Molecular and Cellular Physiology, Stanford University School of Medicine, Stanford, CA, USA.
- Department of Structural Biology, Stanford University School of Medicine, Stanford, CA, USA.
| | - David Baker
- Department of Biochemistry, University of Washington, Seattle, WA, USA.
- Institute for Protein Design, University of Washington, Seattle, WA, USA.
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA.
44
Freschlin CR, Fahlberg SA, Heinzelman P, Romero PA. Neural network extrapolation to distant regions of the protein fitness landscape. Nat Commun 2024; 15:6405. [PMID: 39080282 PMCID: PMC11289474 DOI: 10.1038/s41467-024-50712-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 07/13/2024] [Indexed: 08/02/2024] Open
Abstract
Machine learning (ML) has transformed protein engineering by constructing models of the underlying sequence-function landscape to accelerate the discovery of new biomolecules. ML-guided protein design requires models, trained on local sequence-function information, to accurately predict distant fitness peaks. In this work, we evaluate neural networks' capacity to extrapolate beyond their training data. We perform model-guided design using a panel of neural network architectures trained on protein G (GB1)-Immunoglobulin G (IgG) binding data and experimentally test thousands of GB1 designs to systematically evaluate the models' extrapolation. We find each model architecture infers markedly different landscapes from the same data, which give rise to unique design preferences. We find simpler models excel in local extrapolation to design high fitness proteins, while more sophisticated convolutional models can venture deep into sequence space to design proteins that fold but are no longer functional. We also find that implementing a simple ensemble of convolutional neural networks enables robust design of high-performing variants in the local landscape. Our findings highlight how each architecture's inductive biases prime them to learn different aspects of the protein fitness landscape and how a simple ensembling approach makes protein engineering more robust.
Affiliation(s)
- Chase R Freschlin
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, WI, USA
- Sarah A Fahlberg
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, WI, USA
- Pete Heinzelman
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, WI, USA
- Philip A Romero
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, WI, USA
  - Department of Chemical & Biological Engineering, University of Wisconsin-Madison, Madison, WI, USA
45
Zhang Z, Shen W, Liu Q, Zitnik M. Efficient Generation of Protein Pockets with PocketGen. bioRxiv 2024:2024.02.25.581968. [PMID: 38464121 PMCID: PMC10925136 DOI: 10.1101/2024.02.25.581968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Designing protein-binding proteins plays an important role in drug discovery. However, AI-based design of such proteins is challenging due to complex ligand-protein interactions, flexibility of ligand molecules and amino acid side chains, and sequence-structure dependencies. We introduce PocketGen, a deep generative model that produces both the residue sequence and atom structure of the protein regions where interactions with ligand molecules occur. PocketGen ensures sequence-structure consistency by using a graph transformer for structural encoding and a sequence refinement module based on a protein language model. The bilevel graph transformer captures interactions at multiple granularities across atom, residue, and ligand levels. To enhance sequence refinement, PocketGen integrates a structural adapter with the protein language model, ensuring consistency between structure-based and sequence-based predictions. Results show that PocketGen can generate high-fidelity protein pockets with superior binding affinity and structural validity. It is ten times faster than physics-based methods and achieves a 95% success rate, defined as the percentage of generated pockets with higher binding affinity than reference pockets, along with achieving an amino acid recovery rate exceeding 64%.
Affiliation(s)
- Zaixi Zhang
  - State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China, Hefei, Anhui, China
  - Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, Anhui, China
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Wanxiang Shen
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Qi Liu
  - State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China, Hefei, Anhui, China
  - Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, Anhui, China
- Marinka Zitnik
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Harvard Data Science Initiative, Cambridge, MA, USA
46
Sun X, Lian Y, Tian T, Cui Z. Advancements in Functional Nanomaterials Inspired by Viral Particles. Small 2024:e2402980. [PMID: 39058214 DOI: 10.1002/smll.202402980] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2024] [Revised: 06/27/2024] [Indexed: 07/28/2024]
Abstract
Virus-like particles (VLPs) are nanostructures composed of one or more structural proteins, exhibiting stable and symmetrical structures. Their precise compositions and dimensions provide versatile opportunities for modifications, enhancing their functionality. Consequently, VLP-based nanomaterials have gained widespread adoption across diverse domains. This review focuses on three key aspects: the mechanisms of viral capsid protein self-assembly into VLPs, design methods for constructing multifunctional VLPs, and strategies for synthesizing multidimensional nanomaterials using VLPs. It provides a comprehensive overview of the advancements in virus-inspired functional nanomaterials, encompassing VLP assembly, functionalization, and the synthesis of multidimensional nanomaterials. Additionally, this review explores future directions, opportunities, and challenges in the field of VLP-based nanomaterials, aiming to shed light on potential advancements and prospects in this exciting area of research.
Affiliation(s)
- Xianxun Sun
  - College of Life Science, Jiang Han University, Wuhan, 430056, China
- Yindong Lian
  - College of Life Science, Jiang Han University, Wuhan, 430056, China
  - State Key Laboratory of Virology, Wuhan Institute of Virology, Center for Biosafety Mega-Science, Chinese Academy of Sciences, Wuhan, 430071, China
- Tao Tian
  - College of Life Science, Jiang Han University, Wuhan, 430056, China
  - State Key Laboratory of Virology, Wuhan Institute of Virology, Center for Biosafety Mega-Science, Chinese Academy of Sciences, Wuhan, 430071, China
- Zongqiang Cui
  - State Key Laboratory of Virology, Wuhan Institute of Virology, Center for Biosafety Mega-Science, Chinese Academy of Sciences, Wuhan, 430071, China
47
Ding N, Yuan Z, Ma Z, Wu Y, Yin L. AI-Assisted Rational Design and Activity Prediction of Biological Elements for Optimizing Transcription-Factor-Based Biosensors. Molecules 2024; 29:3512. [PMID: 39124917 PMCID: PMC11313831 DOI: 10.3390/molecules29153512] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2024] [Revised: 07/22/2024] [Accepted: 07/24/2024] [Indexed: 08/12/2024] Open
Abstract
The rational design, activity prediction, and adaptive application of biological elements (bio-elements) are crucial research fields in synthetic biology. Currently, a major challenge in the field is efficiently designing desired bio-elements and accurately predicting their activity using vast datasets. The advancement of artificial intelligence (AI) technology has enabled machine learning and deep learning algorithms to excel in uncovering patterns in bio-element data and predicting their performance. This review explores the application of AI algorithms in the rational design of bio-elements, activity prediction, and the regulation of transcription-factor-based biosensor response performance using AI-designed elements. We discuss the advantages, adaptability, and biological challenges addressed by the AI algorithms in various applications, highlighting their powerful potential in analyzing biological data. Furthermore, we propose innovative solutions to the challenges faced by AI algorithms in the field and suggest future research directions. By consolidating current research and demonstrating the practical applications and future potential of AI in synthetic biology, this review provides valuable insights for advancing both academic research and practical applications in biotechnology.
Affiliation(s)
- Nana Ding
  - State Key Laboratory of Subtropical Silviculture, Zhejiang A&F University, Hangzhou 311300, China
  - Zhejiang Provincial Key Laboratory of Resources Protection and Innovation of Traditional Chinese Medicine, Zhejiang A&F University, Hangzhou 311300, China
- Zenan Yuan
  - State Key Laboratory of Subtropical Silviculture, Zhejiang A&F University, Hangzhou 311300, China
  - Zhejiang Provincial Key Laboratory of Resources Protection and Innovation of Traditional Chinese Medicine, Zhejiang A&F University, Hangzhou 311300, China
- Zheng Ma
  - Zhejiang Provincial Key Laboratory of Biometrology and Inspection & Quarantine, College of Life Sciences, China Jiliang University, Hangzhou 310018, China
- Yefei Wu
  - Zhejiang Qianjiang Biochemical Co., Ltd., Haining 314400, China
- Lianghong Yin
  - State Key Laboratory of Subtropical Silviculture, Zhejiang A&F University, Hangzhou 311300, China
  - Zhejiang Provincial Key Laboratory of Resources Protection and Innovation of Traditional Chinese Medicine, Zhejiang A&F University, Hangzhou 311300, China
48
Krapp LF, Meireles FA, Abriata LA, Devillard J, Vacle S, Marcaida MJ, Dal Peraro M. Context-aware geometric deep learning for protein sequence design. Nat Commun 2024; 15:6273. [PMID: 39054322 PMCID: PMC11272779 DOI: 10.1038/s41467-024-50571-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2023] [Accepted: 07/15/2024] [Indexed: 07/27/2024] Open
Abstract
Protein design and engineering are evolving at an unprecedented pace leveraging the advances in deep learning. Current models nonetheless cannot natively consider non-protein entities within the design process. Here, we introduce a deep learning approach based solely on a geometric transformer of atomic coordinates and element names that predicts protein sequences from backbone scaffolds aware of the restraints imposed by diverse molecular environments. To validate the method, we show that it can produce highly thermostable, catalytically active enzymes with high success rates. This concept is anticipated to improve the versatility of protein design pipelines for crafting desired functions.
Affiliation(s)
- Lucien F Krapp
  - Laboratory for Biomolecular Modeling, Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
  - Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Fernando A Meireles
  - Laboratory for Biomolecular Modeling, Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
  - Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Luciano A Abriata
  - Laboratory for Biomolecular Modeling, Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
  - Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Jean Devillard
  - Laboratory for Biomolecular Modeling, Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Sarah Vacle
  - Laboratory for Biomolecular Modeling, Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
  - Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Maria J Marcaida
  - Laboratory for Biomolecular Modeling, Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
  - Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Matteo Dal Peraro
  - Laboratory for Biomolecular Modeling, Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
  - Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
49
Birch-Price Z, Hardy FJ, Lister TM, Kohn AR, Green AP. Noncanonical Amino Acids in Biocatalysis. Chem Rev 2024; 124:8740-8786. [PMID: 38959423 PMCID: PMC11273360 DOI: 10.1021/acs.chemrev.4c00120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Revised: 06/11/2024] [Accepted: 06/12/2024] [Indexed: 07/05/2024]
Abstract
In recent years, powerful genetic code reprogramming methods have emerged that allow new functional components to be embedded into proteins as noncanonical amino acid (ncAA) side chains. In this review, we will illustrate how the availability of an expanded set of amino acid building blocks has opened a wealth of new opportunities in enzymology and biocatalysis research. Genetic code reprogramming has provided new insights into enzyme mechanisms by allowing introduction of new spectroscopic probes and the targeted replacement of individual atoms or functional groups. NcAAs have also been used to develop engineered biocatalysts with improved activity, selectivity, and stability, as well as enzymes with artificial regulatory elements that are responsive to external stimuli. Perhaps most ambitiously, the combination of genetic code reprogramming and laboratory evolution has given rise to new classes of enzymes that use ncAAs as key catalytic elements. With the framework for developing ncAA-containing biocatalysts now firmly established, we are optimistic that genetic code reprogramming will become a progressively more powerful tool in the armory of enzyme designers and engineers in the coming years.
Affiliation(s)
- Anthony P. Green
  - Manchester Institute of Biotechnology, School of Chemistry, University of Manchester, Manchester M1 7DN, U.K.
50
Liu H, Yin H, Luo Z, Wang X. Integrating chemistry knowledge in large language models via prompt engineering. Synth Syst Biotechnol 2024; 10:23-38. [PMID: 39206087 PMCID: PMC11350497 DOI: 10.1016/j.synbio.2024.07.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2024] [Revised: 07/08/2024] [Accepted: 07/20/2024] [Indexed: 09/04/2024] Open
Abstract
This paper presents a study on the integration of domain-specific knowledge in prompt engineering to enhance the performance of large language models (LLMs) in scientific domains. The proposed domain-knowledge embedded prompt engineering method outperforms traditional prompt engineering strategies on various metrics, including capability, accuracy, F1 score, and hallucination drop. The effectiveness of the method is demonstrated through case studies on complex materials including the MacMillan catalyst, paclitaxel, and lithium cobalt oxide. The results suggest that domain-knowledge prompts can guide LLMs to generate more accurate and relevant responses, highlighting the potential of LLMs as powerful tools for scientific discovery and innovation when equipped with domain-specific prompts. The study also discusses limitations and future directions for domain-specific prompt engineering development.
Affiliation(s)
- Hongxuan Liu
  - Department of Chemical Engineering, Tsinghua University, Beijing, 100084, China
- Haoyu Yin
  - Department of Chemical Engineering, Tsinghua University, Beijing, 100084, China
- Zhiyao Luo
  - Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Old Road Campus Research Building, Headington, Oxford, OX3 7DQ, United Kingdom
- Xiaonan Wang
  - Department of Chemical Engineering, Tsinghua University, Beijing, 100084, China
  - Key Laboratory for Industrial Biocatalysis, Ministry of Education, Tsinghua University, Beijing, 100084, China