1
Weckbecker M, Anžel A, Yang Z, Hattab G. Interpretable molecular encodings and representations for machine learning tasks. Comput Struct Biotechnol J 2024; 23:2326-2336. [PMID: 38867722 PMCID: PMC11167246 DOI: 10.1016/j.csbj.2024.05.035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Revised: 05/13/2024] [Accepted: 05/19/2024] [Indexed: 06/14/2024] Open
Abstract
Molecular encodings and their usage in machine learning models have demonstrated significant breakthroughs in biomedical applications, particularly in the classification of peptides and proteins. To this end, we propose a new encoding method: Interpretable Carbon-based Array of Neighborhoods (iCAN). Designed to address machine learning models' need for more structured and less flexible input, it captures the neighborhoods of carbon atoms in a counting array and improves the utility of the resulting encodings for machine learning models. The iCAN method provides interpretable molecular encodings and representations, enabling the comparison of molecular neighborhoods, identification of repeating patterns, and visualization of relevance heat maps for a given data set. When reproducing a large biomedical peptide classification study, it outperforms its predecessor encoding. When extended to proteins, it outperforms a lead structure-based encoding on 71% of the data sets. Our method offers interpretable encodings that can be applied to all organic molecules, including exotic amino acids, cyclic peptides, and larger proteins, making it highly versatile across various domains and data sets. This work establishes a promising new direction for machine learning in peptide and protein classification in biomedicine and healthcare, potentially accelerating advances in drug discovery and disease diagnosis.
Affiliation(s)
- Moritz Weckbecker
  - Center for Artificial Intelligence in Public Health Research (ZKI-PH), Robert Koch Institute, Nordufer 20, Berlin, 13353, Berlin, Germany
- Aleksandar Anžel
  - Center for Artificial Intelligence in Public Health Research (ZKI-PH), Robert Koch Institute, Nordufer 20, Berlin, 13353, Berlin, Germany
- Zewen Yang
  - Center for Artificial Intelligence in Public Health Research (ZKI-PH), Robert Koch Institute, Nordufer 20, Berlin, 13353, Berlin, Germany
- Georges Hattab
  - Center for Artificial Intelligence in Public Health Research (ZKI-PH), Robert Koch Institute, Nordufer 20, Berlin, 13353, Berlin, Germany
  - Department of Mathematics and Computer Science, Freie Universität, Arnimallee 14, Berlin, 14195, Berlin, Germany
2
Grambow CA, Weir H, Cunningham CN, Biancalani T, Chuang KV. CREMP: Conformer-rotamer ensembles of macrocyclic peptides for machine learning. Sci Data 2024; 11:859. [PMID: 39122750 PMCID: PMC11316032 DOI: 10.1038/s41597-024-03698-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Accepted: 07/29/2024] [Indexed: 08/12/2024] Open
Abstract
Computational and machine learning approaches to model the conformational landscape of macrocyclic peptides have the potential to enable rational design and optimization. However, accurate, fast, and scalable methods for modeling macrocycle geometries remain elusive. Recent deep learning approaches have significantly accelerated protein structure prediction and the generation of small-molecule conformational ensembles, yet similar progress has not been made for macrocyclic peptides due to their unique properties. Here, we introduce CREMP, a resource generated for the rapid development and evaluation of machine learning models for macrocyclic peptides. CREMP contains 36,198 unique macrocyclic peptides and their high-quality structural ensembles generated using the Conformer-Rotamer Ensemble Sampling Tool (CREST). Altogether, this new dataset contains nearly 31.3 million unique macrocycle geometries, each annotated with energies derived from semi-empirical extended tight-binding (xTB) DFT calculations. Additionally, we include 3,258 macrocycles with reported passive permeability data to couple conformational ensembles to experiment. We anticipate that this dataset will enable the development of machine learning models that can improve peptide design and optimization for novel therapeutics.
Affiliation(s)
- Colin A Grambow
  - Prescient Design, Genentech, 1 DNA Way, South San Francisco, CA, 94080, USA
- Hayley Weir
  - Prescient Design, Genentech, 1 DNA Way, South San Francisco, CA, 94080, USA
- Christian N Cunningham
  - Department of Peptide Therapeutics, Genentech, 1 DNA Way, South San Francisco, CA, 94080, USA
- Tommaso Biancalani
  - Biology Research | Development, Genentech, 1 DNA Way, South San Francisco, CA, 94080, USA
- Kangway V Chuang
  - Prescient Design, Genentech, 1 DNA Way, South San Francisco, CA, 94080, USA
3
Feller AL, Wilke CO. Peptide-specific chemical language model successfully predicts membrane diffusion of cyclic peptides. bioRxiv: The Preprint Server for Biology 2024:2024.08.09.607221. [PMID: 39149303 PMCID: PMC11326283 DOI: 10.1101/2024.08.09.607221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2024]
Abstract
Biological language modeling has significantly advanced the prediction of membrane penetration for small molecule drugs and natural peptides. However, accurately predicting membrane diffusion for peptides with pharmacologically relevant modifications remains a substantial challenge. Here, we introduce PeptideCLM, a peptide-focused chemical language model capable of encoding peptides with chemical modifications, unnatural or non-canonical amino acids, and cyclizations. We assess this model by predicting membrane diffusion of cyclic peptides, demonstrating greater predictive power than existing chemical language models. Our model is versatile, able to be extended beyond membrane diffusion predictions to other target values. Its advantages include the ability to model macromolecules using chemical string notation, a largely unexplored domain, and a simple, flexible architecture that allows for adaptation to any peptide or other macromolecule dataset.
Affiliation(s)
- Aaron L Feller
  - Interdisciplinary Life Sciences, The University of Texas, Austin
- Claus O Wilke
  - Department of Integrative Biology, The University of Texas, Austin
  - Interdisciplinary Life Sciences, The University of Texas, Austin
4
Tan X, Liu Q, Fang Y, Zhu Y, Chen F, Zeng W, Ouyang D, Dong J. Predicting Peptide Permeability Across Diverse Barriers: A Systematic Investigation. Mol Pharm 2024; 21:4116-4127. [PMID: 39031123 DOI: 10.1021/acs.molpharmaceut.4c00478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/22/2024]
Abstract
Peptide-based therapeutics hold immense promise for the treatment of various diseases. However, their effectiveness is often hampered by poor cell membrane permeability, hindering targeted intracellular delivery and oral drug development. This study addressed this challenge by introducing a novel graph neural network (GNN) framework and advanced machine learning algorithms to build predictive models for peptide permeability. Our models offer systematic evaluation across diverse peptides (natural, modified, linear and cyclic) and cell lines [Caco-2, Ralph Russ canine kidney (RRCK) and parallel artificial membrane permeability assay (PAMPA)]. The predictive models for linear and cyclic peptides in Caco-2 and RRCK cell lines were constructed for the first time, with an impressive coefficient of determination (R²) of 0.708, 0.484, 0.553, and 0.528 in the test set, respectively. Notably, the GNN framework behaved better in permeability prediction with larger data sets and improved the accuracy of cyclic peptide prediction in the PAMPA cell line. The R² increased by about 0.32 compared with the reported models. Furthermore, the important molecular structural features that contribute to good permeability were interpreted; the influence of cell lines, peptide modification, and cyclization on permeability were successfully revealed. To facilitate broader use, we deployed these models on the user-friendly KNIME platform (https://github.com/ifyoungnet/PharmPapp). This work provides a rapid and reliable strategy for systematically assessing peptide permeability, aiding researchers in drug delivery optimization, peptide preselection during drug discovery, and potentially the design of targeted peptide-based materials.
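As a reading aid for the coefficient of determination values quoted in this abstract, the metric can be computed as in the minimal Python sketch below. It assumes the standard definition (one minus the ratio of residual to total sum of squares); the function name `r_squared` and the sample values are illustrative only and are not taken from the paper or its KNIME workflow.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (standard definition)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical log-permeability values, for illustration only.
    measured = [-6.2, -5.1, -7.0, -5.8]
    predicted = [-6.0, -5.4, -6.7, -5.9]
    print(round(r_squared(measured, predicted), 3))
```

Under this definition a value of 0 means the model does no better than predicting the mean of the measured values, which helps put test-set figures between roughly 0.5 and 0.7 in context.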
Affiliation(s)
- Xiaorong Tan
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410083, China
- Qianhui Liu
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410083, China
- Yanpeng Fang
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410083, China
- Yingli Zhu
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410083, China
- Fei Chen
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410083, China
- Wenbin Zeng
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410083, China
- Defang Ouyang
  - Institute of Chinese Medical Sciences (ICMS), State Key Laboratory of Quality Research in Chinese Medicine, University of Macau, Macau 999078, China
- Jie Dong
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410083, China
5
Yu Y, Gu M, Guo H, Deng Y, Chen D, Wang J, Wang C, Liu X, Yan W, Huang J. MuCoCP: a priori chemical knowledge-based multimodal contrastive learning pre-trained neural network for the prediction of cyclic peptide membrane penetration ability. Bioinformatics 2024; 40:btae473. [PMID: 39067027 PMCID: PMC11315609 DOI: 10.1093/bioinformatics/btae473] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2024] [Revised: 07/04/2024] [Accepted: 07/25/2024] [Indexed: 07/30/2024] Open
Abstract
MOTIVATION: There has been a burgeoning interest in cyclic peptide therapeutics due to their various outstanding advantages and strong potential for drug formation. However, it is undoubtedly costly and inefficient to use traditional wet lab methods to clarify their biological activities. Using artificial intelligence instead is a more energy-efficient and faster approach. MuCoCP aims to build a complete pre-trained model for extracting potential features of cyclic peptides, which can be fine-tuned to accurately predict cyclic peptide bioactivity on various downstream tasks. To maximize its effectiveness, we use a novel data augmentation method based on a priori chemical knowledge and multiple unsupervised training objective functions to greatly improve the information-grabbing ability of the model. RESULTS: To assay the efficacy of the model, we conducted validation on the membrane permeability of cyclic peptides, which achieved an accuracy of 0.87 and R-squared of 0.503 on CycPeptMPDB using semi-supervised training, and obtained an accuracy of 0.84 and R-squared of 0.384 using a model with frozen parameters on an external dataset. This result is state of the art and substantiates the stability and generalization capability of MuCoCP. It means that MuCoCP can fully explore the high-dimensional information of cyclic peptides and make accurate predictions on downstream bioactivity tasks, which will serve as a guide for the future de novo design of cyclic peptide drugs and promote the development of cyclic peptide drugs. AVAILABILITY AND IMPLEMENTATION: All code used in our proposed method can be found at https://github.com/lennonyu11234/MuCoCP.
Affiliation(s)
- Yunxiang Yu
  - School of Basic Medical Sciences, Lanzhou University, Lanzhou, 730000, China
- Mengyun Gu
  - School of Basic Medical Sciences, Lanzhou University, Lanzhou, 730000, China
- Hai Guo
  - The Second Hospital Clinical Medical School, Lanzhou University, Lanzhou, 730000, China
- Yabo Deng
  - School of Basic Medical Sciences, Lanzhou University, Lanzhou, 730000, China
- Danna Chen
  - School of Basic Medical Sciences, Lanzhou University, Lanzhou, 730000, China
  - The Affiliated Hospital of Guangdong Medical University, Zhanjiang, 524000, China
  - Guangzhou First People’s Hospital, South China University of Technology, Guangzhou, 510180, China
- Jianwei Wang
  - Guangzhou First People’s Hospital, South China University of Technology, Guangzhou, 510180, China
- Caixia Wang
  - Guangzhou First People’s Hospital, South China University of Technology, Guangzhou, 510180, China
- Xia Liu
  - School of Basic Medical Sciences, Lanzhou University, Lanzhou, 730000, China
- Wenjin Yan
  - School of Basic Medical Sciences, Lanzhou University, Lanzhou, 730000, China
- Jinqi Huang
  - The Affiliated Hospital of Guangdong Medical University, Zhanjiang, 524000, China
  - Guangzhou First People’s Hospital, South China University of Technology, Guangzhou, 510180, China
6
Li J, Yanagisawa K, Akiyama Y. CycPeptMP: enhancing membrane permeability prediction of cyclic peptides with multi-level molecular features and data augmentation. Brief Bioinform 2024; 25:bbae417. [PMID: 39210505 PMCID: PMC11361855 DOI: 10.1093/bib/bbae417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Revised: 07/23/2024] [Accepted: 08/22/2024] [Indexed: 09/04/2024] Open
Abstract
Cyclic peptides are versatile therapeutic agents that boast high binding affinity, minimal toxicity, and the potential to engage challenging protein targets. However, the pharmaceutical utility of cyclic peptides is limited by their low membrane permeability, an essential indicator of oral bioavailability and intracellular targeting. Current machine learning-based models of cyclic peptide permeability show variable performance owing to the limitations of experimental data. Furthermore, these methods use features derived from the whole molecule that have traditionally been used to predict small molecules and ignore the unique structural properties of cyclic peptides. This study presents CycPeptMP: an accurate and efficient method to predict cyclic peptide membrane permeability. We designed features for cyclic peptides at the atom-, monomer-, and peptide-levels and seamlessly integrated these into a fusion model using deep learning technology. Additionally, we applied various data augmentation techniques to enhance model training efficiency using the latest data. The fusion model exhibited excellent prediction performance for the logarithm of permeability, with a mean absolute error of 0.355 and correlation coefficient of 0.883. Ablation studies demonstrated that all feature levels contributed and were relatively essential to predicting membrane permeability, confirming the effectiveness of augmentation to improve prediction accuracy. A comparison with a molecular dynamics-based method showed that CycPeptMP accurately predicted peptide permeability, which is otherwise difficult to predict using simulations.
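The two headline metrics in this abstract, mean absolute error and correlation coefficient, can be illustrated with the short Python sketch below. It assumes the correlation is Pearson's r, which the abstract does not state; the function names and sample values are hypothetical and are not drawn from CycPeptMP or its data.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(x, y):
    """Pearson correlation coefficient (assumed here; the abstract only says 'correlation coefficient')."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (norm_x * norm_y)


if __name__ == "__main__":
    # Hypothetical log-permeability values, for illustration only.
    measured = [-6.2, -5.1, -7.0, -5.8]
    predicted = [-6.0, -5.4, -6.7, -5.9]
    print(mean_absolute_error(measured, predicted))
    print(pearson_r(measured, predicted))
```

Because the targets are logarithms of permeability, a mean absolute error is expressed in log units, so an error of about 0.36 corresponds to predictions that are off by a bit more than a factor of two on the original scale if the logarithm is base 10.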
Affiliation(s)
- Jianan Li
  - Department of Computer Science, School of Computing, Tokyo Institute of Technology, Tokyo 1528550, Japan
- Keisuke Yanagisawa
  - Department of Computer Science, School of Computing, Tokyo Institute of Technology, Tokyo 1528550, Japan
  - Middle-Molecule IT-based Drug Discovery Laboratory (MIDL), Tokyo Institute of Technology, Tokyo 1528550, Japan
- Yutaka Akiyama
  - Department of Computer Science, School of Computing, Tokyo Institute of Technology, Tokyo 1528550, Japan
  - Middle-Molecule IT-based Drug Discovery Laboratory (MIDL), Tokyo Institute of Technology, Tokyo 1528550, Japan
7
Xu X, Xu C, He W, Wei L, Li H, Zhou J, Zhang R, Wang Y, Xiong Y, Gao X. HELM-GPT: de novo macrocyclic peptide design using generative pre-trained transformer. Bioinformatics 2024; 40:btae364. [PMID: 38867692 PMCID: PMC11256930 DOI: 10.1093/bioinformatics/btae364] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2024] [Revised: 05/08/2024] [Accepted: 06/10/2024] [Indexed: 06/14/2024] Open
Abstract
MOTIVATION: Macrocyclic peptides hold great promise as therapeutics targeting intracellular proteins. This stems from their remarkable ability to bind flat protein surfaces with high affinity and specificity while potentially traversing the cell membrane. Research has already explored their use in developing inhibitors for intracellular proteins, such as KRAS, a well-known driver in various cancers. However, computational approaches for de novo macrocyclic peptide design remain largely unexplored. RESULTS: Here, we introduce HELM-GPT, a novel method that combines the strength of the hierarchical editing language for macromolecules (HELM) representation and generative pre-trained transformer (GPT) for de novo macrocyclic peptide design. Through reinforcement learning (RL), our experiments demonstrate that HELM-GPT has the ability to generate valid macrocyclic peptides and optimize their properties. Furthermore, we introduce a contrastive preference loss during the RL process, which further enhances the optimization performance. Finally, to co-optimize peptide permeability and KRAS binding affinity, we propose a step-by-step optimization strategy, demonstrating its effectiveness in generating molecules fulfilling both criteria. In conclusion, the HELM-GPT method can be used to identify novel macrocyclic peptides to target intracellular proteins. AVAILABILITY AND IMPLEMENTATION: The code and data of HELM-GPT are freely available on GitHub (https://github.com/charlesxu90/helm-gpt).
Affiliation(s)
- Xiaopeng Xu
  - Computer Science Program, Computer, Electrical and Mathematical Science and Engineering (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Makkah, Kingdom of Saudi Arabia
  - Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Makkah, Kingdom of Saudi Arabia
- Chencheng Xu
  - Computer Science Program, Computer, Electrical and Mathematical Science and Engineering (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Makkah, Kingdom of Saudi Arabia
  - Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Makkah, Kingdom of Saudi Arabia
- Wenjia He
  - Computer Science Program, Computer, Electrical and Mathematical Science and Engineering (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Makkah, Kingdom of Saudi Arabia
  - Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Makkah, Kingdom of Saudi Arabia
- Lesong Wei
  - Computer Science Program, Computer, Electrical and Mathematical Science and Engineering (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Makkah, Kingdom of Saudi Arabia
  - Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Makkah, Kingdom of Saudi Arabia
- Haoyang Li
  - Computer Science Program, Computer, Electrical and Mathematical Science and Engineering (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Makkah, Kingdom of Saudi Arabia
  - Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Makkah, Kingdom of Saudi Arabia
- Juexiao Zhou
  - Computer Science Program, Computer, Electrical and Mathematical Science and Engineering (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Makkah, Kingdom of Saudi Arabia
  - Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Makkah, Kingdom of Saudi Arabia
- Yu Wang
  - Syneron Technology, Guangzhou 510000, China
- Xin Gao
  - Computer Science Program, Computer, Electrical and Mathematical Science and Engineering (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Makkah, Kingdom of Saudi Arabia
  - Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Makkah, Kingdom of Saudi Arabia
8
Frazee N, Billlings KR, Mertz B. Gaussian accelerated molecular dynamics simulations facilitate prediction of the permeability of cyclic peptides. PLoS One 2024; 19:e0300688. [PMID: 38652734 PMCID: PMC11037548 DOI: 10.1371/journal.pone.0300688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Accepted: 03/02/2024] [Indexed: 04/25/2024] Open
Abstract
Despite their widespread use as therapeutics, clinical development of small molecule drugs remains challenging. Among the many parameters that undergo optimization during the drug development process, increasing passive cell permeability (i.e., log(P)) can have some of the largest impacts on potency. Cyclic peptides (CPs) have emerged as a viable alternative to small molecules, as they retain many of the advantages of small molecules (oral availability, target specificity) while being highly effective at traversing the plasma membrane. However, the relationship between the dominant conformations that typify CPs in an aqueous versus a membrane environment and cell permeability remains poorly characterized. In this study, we have used Gaussian accelerated molecular dynamics (GaMD) simulations to characterize the effect of solvent on the free energy landscape of lariat peptides, a subset of CPs that have recently shown potential for drug development (Kelly et al., JACS 2021). Differences in the free energy of lariat peptides as a function of solvent can be used to predict permeability of these molecules, and our results show that permeability is most greatly influenced by N-methylation and exposure to solvent. Our approach lays the groundwork for using GaMD as a way to virtually screen large libraries of CPs and drive forward development of CP-based therapeutics.
Affiliation(s)
- Nicolas Frazee
  - C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, WV, United States of America
- Kyle R. Billlings
  - C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, WV, United States of America
- Blake Mertz
  - C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, WV, United States of America
9
Wu X, Lin H, Bai R, Duan H. Deep learning for advancing peptide drug development: Tools and methods in structure prediction and design. Eur J Med Chem 2024; 268:116262. [PMID: 38387334 DOI: 10.1016/j.ejmech.2024.116262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Revised: 02/06/2024] [Accepted: 02/17/2024] [Indexed: 02/24/2024]
Abstract
Peptides can bind challenging disease targets with high affinity and specificity, offering enormous opportunities for addressing unmet medical needs. However, peptides' unique features, including smaller size, increased structural flexibility, and limited data availability, pose additional challenges to the design process compared to proteins. This review explores the dynamic field of peptide therapeutics, leveraging deep learning to enhance structure prediction and design. Our exploration encompasses various facets of peptide research, ranging from dataset curation handling to model development. As deep learning technologies become more refined, we channel our efforts into peptide structure prediction and design, aligning with the fundamental principles of structure-activity relationships in drug development. To guide researchers in harnessing the potential of deep learning to advance peptide drug development, our insights comprehensively explore current challenges and future directions of peptide therapeutics.
Affiliation(s)
- Xinyi Wu
  - College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, 310014, PR China
- Huitian Lin
  - College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, 310014, PR China
- Renren Bai
  - School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, PR China
- Hongliang Duan
  - Faculty of Applied Sciences, Macao Polytechnic University, Macao, 999078, PR China
10
Cao L, Xu Z, Shang T, Zhang C, Wu X, Wu Y, Zhai S, Zhan Z, Duan H. Multi_CycGT: A Deep Learning-Based Multimodal Model for Predicting the Membrane Permeability of Cyclic Peptides. J Med Chem 2024; 67:1888-1899. [PMID: 38270541 DOI: 10.1021/acs.jmedchem.3c01611] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 01/26/2024]
Abstract
Cyclic peptides are gaining attention for their strong binding affinity, low toxicity, and ability to target "undruggable" proteins; however, their therapeutic potential against intracellular targets is constrained by their limited membrane permeability, and researchers need much time and money to test this property in the laboratory. Herein, we propose an innovative multimodal model called Multi_CycGT, which combines a graph convolutional network (GCN) and a transformer to extract one- and two-dimensional features for predicting cyclic peptide permeability. The extensive benchmarking experiments show that our Multi_CycGT model can attain state-of-the-art performance, with an average accuracy of 0.8206 and an area under the curve of 0.8650, and demonstrates satisfactory generalization ability on several external data sets. To the best of our knowledge, it is the first deep learning-based attempt to predict the membrane permeability of cyclic peptides, which is beneficial in accelerating the design of cyclic peptide active drugs in medicinal chemistry and chemical biology applications.
Affiliation(s)
- Lujing Cao
  - College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou 310014, P. R. China
- Zhenyu Xu
  - AI Department, Shanghai Highslab Therapeutics, Inc., Shanghai 201203, China
- Tianfeng Shang
  - AI Department, Shanghai Highslab Therapeutics, Inc., Shanghai 201203, China
- Chengyun Zhang
  - College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou 310014, P. R. China
  - AI Department, Shanghai Highslab Therapeutics, Inc., Shanghai 201203, China
- Xinyi Wu
  - College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou 310014, P. R. China
- Yejian Wu
  - College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou 310014, P. R. China
- Silong Zhai
  - College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou 310014, P. R. China
- Zhajun Zhan
  - College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou 310014, P. R. China
- Hongliang Duan
  - Faculty of Applied Sciences, Macao Polytechnic University, Macao 999078, China
11
de Raffele D, Ilie IM. Unlocking novel therapies: cyclic peptide design for amyloidogenic targets through synergies of experiments, simulations, and machine learning. Chem Commun (Camb) 2024; 60:632-645. [PMID: 38131333 DOI: 10.1039/d3cc04630c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
Existing therapies for neurodegenerative diseases like Parkinson's and Alzheimer's address only their symptoms and do not prevent disease onset. Common therapeutic agents, such as small molecules and antibodies, struggle with insufficient selectivity, stability and bioavailability, leading to poor performance in clinical trials. Peptide-based therapeutics are emerging as promising candidates, with successful applications for cardiovascular diseases and cancers due to their high bioavailability, good efficacy and specificity. In particular, cyclic peptides have a long in vivo stability, while maintaining a robust antibody-like binding affinity. However, the de novo design of cyclic peptides is challenging due to the lack of long-lived druggable pockets of the target polypeptide, absence of exhaustive conformational distributions of the target and/or the binder, unknown binding site, methodological limitations, associated constraints (failed trials, time, money) and the vast combinatorial sequence space. Hence, efficient alignment and cooperation between disciplines, and synergies between experiments and simulations complemented by popular techniques like machine learning, can significantly speed up the therapeutic cyclic-peptide development for neurodegenerative diseases. We review the latest advancements in cyclic peptide design against amyloidogenic targets from a computational perspective in light of recent advancements and potential of machine learning to optimize the design process. We discuss the difficulties encountered when designing novel peptide-based inhibitors and we propose new strategies incorporating experiments, simulations and machine learning to design cyclic peptides to inhibit the toxic propagation of amyloidogenic polypeptides. Importantly, these strategies extend beyond the mere design of cyclic peptides and serve as a template for the de novo generation of (bio)materials with programmable properties.
Affiliation(s)
- Daria de Raffele
  - University of Amsterdam, van 't Hoff Institute for Molecular Sciences, Science Park 904, P.O. Box 94157, 1090 GD Amsterdam, The Netherlands
  - Amsterdam Center for Multiscale Modeling (ACMM), University of Amsterdam, P.O. Box 94157, 1090 GD Amsterdam, The Netherlands
- Ioana M Ilie
  - University of Amsterdam, van 't Hoff Institute for Molecular Sciences, Science Park 904, P.O. Box 94157, 1090 GD Amsterdam, The Netherlands
  - Amsterdam Center for Multiscale Modeling (ACMM), University of Amsterdam, P.O. Box 94157, 1090 GD Amsterdam, The Netherlands
12
Chang L, Mondal A, Singh B, Martínez-Noa Y, Perez A. Revolutionizing Peptide-Based Drug Discovery: Advances in the Post-AlphaFold Era. Wiley Interdisciplinary Reviews. Computational Molecular Science 2024; 14:e1693. [PMID: 38680429 PMCID: PMC11052547 DOI: 10.1002/wcms.1693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Accepted: 09/18/2023] [Indexed: 05/01/2024]
Abstract
Peptide-based drugs offer high specificity, potency, and selectivity. However, their inherent flexibility and differences in conformational preferences between their free and bound states create unique challenges that have hindered progress in effective drug discovery pipelines. The emergence of AlphaFold (AF) and Artificial Intelligence (AI) presents new opportunities for enhancing peptide-based drug discovery. We explore recent advancements that facilitate a successful peptide drug discovery pipeline, considering peptides' attractive therapeutic properties and strategies to enhance their stability and bioavailability. AF enables efficient and accurate prediction of peptide-protein structures, addressing a critical requirement in computational drug discovery pipelines. In the post-AF era, we are witnessing rapid progress with the potential to revolutionize peptide-based drug discovery such as the ability to rank peptide binders or classify them as binders/non-binders and the ability to design novel peptide sequences. However, AI-based methods are struggling due to the lack of well-curated datasets, for example to accommodate modified amino acids or unconventional cyclization. Thus, physics-based methods, such as docking or molecular dynamics simulations, continue to hold a complementary role in peptide drug discovery pipelines. Moreover, MD-based tools offer valuable insights into binding mechanisms, as well as the thermodynamic and kinetic properties of complexes. As we navigate this evolving landscape, a synergistic integration of AI and physics-based methods holds the promise of reshaping the landscape of peptide-based drug discovery.
Affiliation(s)
- Liwei Chang
  - Department of Chemistry, University of Florida, Gainesville, FL 32611
- Arup Mondal
  - Department of Chemistry, University of Florida, Gainesville, FL 32611
- Bhumika Singh
  - Department of Chemistry, University of Florida, Gainesville, FL 32611
- Alberto Perez
  - Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL 32611