1
Madushanka A, Laird E, Clark C, Kraka E. SmartCADD: AI-QM Empowered Drug Discovery Platform with Explainability. J Chem Inf Model 2024; 64:6799-6813. [PMID: 39177478 DOI: 10.1021/acs.jcim.4c00720] [Indexed: 08/24/2024]
Abstract
Artificial intelligence (AI) has emerged as a pivotal force in enhancing productivity across various sectors, with its impact being profoundly felt within the pharmaceutical and biotechnology domains. Despite AI's rapid adoption, its integration into scientific research faces resistance due to myriad challenges: the opaqueness of AI models, the intricate nature of their implementation, and the issue of data scarcity. In response to these impediments, we introduce SmartCADD, an innovative, open-source virtual screening platform that combines deep learning, computer-aided drug design (CADD), and quantum mechanics methodologies within a user-friendly Python framework. SmartCADD is engineered to streamline the construction of comprehensive virtual screening workflows that incorporate a variety of formerly independent techniques, ranging from ADMET property predictions, de novo 2D and 3D pharmacophore modeling, and molecular docking to the integration of explainable AI mechanisms. This manuscript highlights the foundational principles, key functionalities, and the unique integrative approach of SmartCADD. Furthermore, we demonstrate its efficacy through a case study focused on the identification of promising lead compounds for HIV inhibition. By democratizing access to advanced AI and quantum mechanics tools, SmartCADD stands as a catalyst for progress in pharmaceutical research and development, heralding a new era of innovation and efficiency.
Affiliation(s)
- Ayesh Madushanka
- Department of Chemistry, Southern Methodist University, Dallas, Texas 75205, United States
- Eli Laird
- Department of Computer Science, Southern Methodist University, Dallas, Texas 75205, United States
- Corey Clark
- Department of Computer Science, Southern Methodist University, Dallas, Texas 75205, United States
- Elfi Kraka
- Department of Chemistry, Southern Methodist University, Dallas, Texas 75205, United States
2
Miljković F, Bajorath J. Kinase Drug Discovery: Impact of Open Science and Artificial Intelligence. Mol Pharm 2024. [PMID: 39240193 DOI: 10.1021/acs.molpharmaceut.4c00659] [Indexed: 09/07/2024]
Abstract
Given their central role in signal transduction, protein kinases (PKs) were first implicated in cancer development, caused by aberrant intracellular signaling events. Since then, PKs have become major targets in different therapeutic areas. The preferred approach to therapeutic intervention of PK-dependent diseases is the use of small molecules to inhibit their catalytic phosphate group transfer activity. PK inhibitors (PKIs) are among the most intensely pursued drug candidates, with currently 80 approved compounds and several hundred in clinical trials. Following the elucidation of the human kinome and development of robust PK expression systems and high-throughput assays, large volumes of PK/PKI data have been produced in industrial and academic environments, more so than for many other pharmaceutical targets. In addition, hundreds of X-ray structures of PKs and their complexes with PKIs have been reported. Substantial amounts of PK/PKI data have been made publicly available in part as a result of open science initiatives. PK drug discovery is further supported through the incorporation of data science approaches, including the development of various specialized databases and online resources. Compound and activity data wealth compared to other targets has also made PKs a focal point for the application of artificial intelligence (AI) in pharmaceutical research. Herein, we discuss the interplay of open and data science in PK drug discovery and review exemplary studies that have substantially contributed to its development, including kinome profiling or the analysis of PKI promiscuity versus selectivity. We also take a close look at how AI approaches are beginning to impact PK drug discovery in light of their increasing data orientation.
Affiliation(s)
- Filip Miljković
- Medicinal Chemistry, Research and Early Development, Cardiovascular, Renal and Metabolism (CVRM), BioPharmaceuticals R&D, AstraZeneca, Pepparedsleden 1, SE-43183 Gothenburg, Sweden
- Jürgen Bajorath
- Department of Life Science Informatics and Data Science, B-IT, Lamarr Institute for Machine Learning and Artificial Intelligence, LIMES Program Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, 53115 Bonn, Germany
3
Jang H, Seo S, Park S, Kim BJ, Choi GW, Choi J, Park C. De novo drug design through gradient-based regularized search in information-theoretically controlled latent space. J Comput Aided Mol Des 2024; 38:32. [PMID: 39190191 DOI: 10.1007/s10822-024-00571-3] [Received: 02/28/2024] [Accepted: 07/31/2024] [Indexed: 08/28/2024]
Abstract
Over the last decade, automatic chemical design frameworks for discovering molecules with drug-like properties have significantly progressed. Among them, the variational autoencoder (VAE) is a cutting-edge approach that models the tractable latent space of the molecular space. In particular, the usage of a VAE along with a property estimator has attracted considerable interest because it enables gradient-based optimization of a given molecule. However, although successful results have been achieved experimentally, the theoretical background and prerequisites for the correct operation of this method have not yet been clarified. In view of the above, we theoretically analyze and rigorously reconstruct the entire framework. From the perspective of parameterized distribution and the information theory, we first describe how the previous model overcomes the limitations of the beta VAE in discovering molecules with the desired properties. Furthermore, we describe the prerequisites for training the above model. Next, from the log-likelihood perspective of each term, we reformulate the objectives for exploring latent space to generate drug-like molecules. The distributional constraints are defined in this study, which will break away from the invalid molecular search. We demonstrated that our model could discover a novel chemical compound for targeting BCL-2 family proteins in de novo approach. Through the theoretical analysis and practical implementation, the importance of the aforementioned prerequisites and constraints to operate the model was verified.
Affiliation(s)
- Hyosoon Jang
- Graduate School of AI, POSTECH, 77 Cheongam-Ro, Pohang, 37673, Gyeongbuk, Republic of Korea
- Sangmin Seo
- Department of Computer Science, Yonsei University, Yonsei-ro 50, Seodaemun-gu, Seoul, 03722, Republic of Korea
- Sanghyun Park
- Department of Computer Science, Yonsei University, Yonsei-ro 50, Seodaemun-gu, Seoul, 03722, Republic of Korea
- Byung Ju Kim
- UBLBio Corporation, Yeongtong-ro 237, Suwon, 16679, Gyeonggi-do, Republic of Korea
- Geon-Woo Choi
- Department of Medical Bigdata Convergence, Kangwon National University, 1 Kangwondaehak-gil, Chuncheon, 24341, Gangwon-do, Republic of Korea
- Jonghwan Choi
- College of Information Science, Hallym University, 1 Hallymdaehak-gil, Chuncheon, 24252, Gangwon-do, Republic of Korea.
- Chihyun Park
- Department of Medical Bigdata Convergence, Kangwon National University, 1 Kangwondaehak-gil, Chuncheon, 24341, Gangwon-do, Republic of Korea.
- Department of Computer Science and Engineering, Kangwon National University, 1 Kangwondaehak-gil, Chuncheon, 24341, Gangwon-do, Republic of Korea.
4
Suriyaamporn P, Pamornpathomkul B, Patrojanasophon P, Ngawhirunpat T, Rojanarata T, Opanasopit P. The Artificial Intelligence-Powered New Era in Pharmaceutical Research and Development: A Review. AAPS PharmSciTech 2024; 25:188. [PMID: 39147952 DOI: 10.1208/s12249-024-02901-y] [Received: 04/28/2024] [Accepted: 07/22/2024] [Indexed: 08/17/2024] Open
Abstract
Currently, artificial intelligence (AI), machine learning (ML), and deep learning (DL) are gaining increased interest in many fields, particularly in pharmaceutical research and development, where they assist in decision-making in complex situations. Numerous research studies and advancements have demonstrated how these computational technologies are used in various pharmaceutical research and development aspects, including drug discovery, personalized medicine, drug formulation, optimization, predictions, drug interactions, pharmacokinetics/pharmacodynamics, quality control/quality assurance, and manufacturing processes. Using advanced modeling techniques, these computational technologies can enhance efficiency and accuracy, handle complex data, and facilitate novel discoveries within minutes. Furthermore, these technologies offer several advantages over conventional statistics. They allow for pattern recognition from complex datasets, and the models, typically developed from data-driven algorithms, can predict a given outcome (model output) from a set of features (model inputs). Additionally, this review discusses emerging trends and provides perspectives on the application of AI with quality by design (QbD) and the future role of AI in this field. Ethical and regulatory considerations associated with integrating AI into pharmaceutical technology were also examined. This review aims to offer insights to researchers, professionals, and others on the current state of AI applications in pharmaceutical research and development and their potential role in the future of research and the era of pharmaceutical Industry 4.0 and 5.0.
Affiliation(s)
- Phuvamin Suriyaamporn
- Pharmaceutical Development of Green Innovations Group (PDGIG), Department of Industrial Pharmacy, Faculty of Pharmacy, Silpakorn University, Nakhon Pathom, Thailand
- Boonnada Pamornpathomkul
- Pharmaceutical Development of Green Innovations Group (PDGIG), Department of Industrial Pharmacy, Faculty of Pharmacy, Silpakorn University, Nakhon Pathom, Thailand
- Prasopchai Patrojanasophon
- Pharmaceutical Development of Green Innovations Group (PDGIG), Department of Industrial Pharmacy, Faculty of Pharmacy, Silpakorn University, Nakhon Pathom, Thailand
- Tanasait Ngawhirunpat
- Pharmaceutical Development of Green Innovations Group (PDGIG), Department of Industrial Pharmacy, Faculty of Pharmacy, Silpakorn University, Nakhon Pathom, Thailand
- Theerasak Rojanarata
- Pharmaceutical Development of Green Innovations Group (PDGIG), Department of Industrial Pharmacy, Faculty of Pharmacy, Silpakorn University, Nakhon Pathom, Thailand
- Praneet Opanasopit
- Pharmaceutical Development of Green Innovations Group (PDGIG), Department of Industrial Pharmacy, Faculty of Pharmacy, Silpakorn University, Nakhon Pathom, Thailand.
5
Hao Y, Wang H, Liu X, Gai W, Hu S, Liu W, Miao Z, Gan Y, Yu X, Shi R, Tan Y, Kang T, Hai A, Zhao Y, Fu Y, Tang Y, Ye L, Liu J, Liang X, Ke B. Deep simulated annealing for the discovery of novel dental anesthetics with local anesthesia and anti-inflammatory properties. Acta Pharm Sin B 2024; 14:3086-3109. [PMID: 39027234 PMCID: PMC11252475 DOI: 10.1016/j.apsb.2024.01.019] [Received: 11/22/2023] [Revised: 01/04/2024] [Accepted: 01/22/2024] [Indexed: 07/20/2024] Open
Abstract
Multifunctional therapeutics have emerged as a solution to the constraints imposed by drugs with singular or insufficient therapeutic effects. The primary challenge is to integrate diverse pharmacophores within a single-molecule framework. To address this, we introduced DeepSA, a novel edit-based generative framework that utilizes deep simulated annealing for the modification of articaine, a well-known local anesthetic. DeepSA integrates deep neural networks into metaheuristics, effectively constraining molecular space during compound generation. This framework employs a sophisticated objective function that accounts for scaffold preservation, anti-inflammatory properties, and covalent constraints. Through a sequence of local editing to navigate the molecular space, DeepSA successfully identified AT-17, a derivative exhibiting potent analgesic properties and significant anti-inflammatory activity in various animal models. Mechanistic insights into AT-17 revealed its dual mode of action: selective inhibition of NaV1.7 and 1.8 channels, contributing to its prolonged local anesthetic effects, and suppression of inflammatory mediators via modulation of the NLRP3 inflammasome pathway. These findings not only highlight the efficacy of AT-17 as a multifunctional drug candidate but also highlight the potential of DeepSA in facilitating AI-enhanced drug discovery, particularly within stringent chemical constraints.
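As a rough illustration of the simulated annealing loop that edit-based frameworks such as DeepSA build on, the sketch below runs plain Metropolis-style annealing on a toy integer objective. It is not the DeepSA implementation: the objective, the neighbor move, and the geometric cooling schedule are all placeholder assumptions, and a real system would replace `toy_score` with learned property predictors and `toy_propose` with molecular edits.

```python
import math
import random


def simulated_annealing(score, propose, start, steps=2000,
                        t_start=1.0, t_end=0.01, seed=0):
    """Maximize `score` by accepting local edits with a temperature-dependent probability."""
    rng = random.Random(seed)
    current, current_score = start, score(start)
    best, best_score = current, current_score
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        temperature = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        candidate = propose(current, rng)
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse moves with probability exp(delta / T).
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = current, current_score
    return best, best_score


# Toy stand-ins (hypothetical): the "molecule" is an integer and an edit is +/-1.
def toy_score(x):
    return -(x - 7) ** 2


def toy_propose(x, rng):
    return x + rng.choice((-1, 1))


best, best_score = simulated_annealing(toy_score, toy_propose, start=-20)
print(best, best_score)
```

The loop keeps the best state seen so far separately from the current state, so occasional downhill moves accepted at high temperature never lose a good candidate.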
Affiliation(s)
- Yihang Hao
- State Key Laboratory of Oral Diseases and National Clinical Research Center for Oral Diseases, West China Hospital of Stomatology, Sichuan University, Chengdu 610041, China
- Haofan Wang
- State Key Laboratory of Oral Diseases and National Clinical Research Center for Oral Diseases, West China Hospital of Stomatology, Sichuan University, Chengdu 610041, China
- Xianggen Liu
- College of Computer Science, Sichuan University, Chengdu 610065, China
- Wenrui Gai
- Department of Anesthesiology, Laboratory of Anesthesia and Critical Care Medicine, National-Local Joint Engineering Research Centre of Translational Medicine of Anesthesiology, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu 610041, China
- Shilong Hu
- Department of Anesthesiology, Laboratory of Anesthesia and Critical Care Medicine, National-Local Joint Engineering Research Centre of Translational Medicine of Anesthesiology, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu 610041, China
- Wencheng Liu
- Department of Anesthesiology, Laboratory of Anesthesia and Critical Care Medicine, National-Local Joint Engineering Research Centre of Translational Medicine of Anesthesiology, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu 610041, China
- Zhuang Miao
- Department of Anesthesiology, Laboratory of Anesthesia and Critical Care Medicine, National-Local Joint Engineering Research Centre of Translational Medicine of Anesthesiology, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu 610041, China
- Yu Gan
- Department of Anesthesiology, Laboratory of Anesthesia and Critical Care Medicine, National-Local Joint Engineering Research Centre of Translational Medicine of Anesthesiology, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu 610041, China
- Xianghua Yu
- State Key Laboratory of Oral Diseases and National Clinical Research Center for Oral Diseases, West China Hospital of Stomatology, Sichuan University, Chengdu 610041, China
- Rongjia Shi
- State Key Laboratory of Oral Diseases and National Clinical Research Center for Oral Diseases, West China Hospital of Stomatology, Sichuan University, Chengdu 610041, China
- Yongzhen Tan
- State Key Laboratory of Oral Diseases and National Clinical Research Center for Oral Diseases, West China Hospital of Stomatology, Sichuan University, Chengdu 610041, China
- Ting Kang
- Department of Anesthesiology, Laboratory of Anesthesia and Critical Care Medicine, National-Local Joint Engineering Research Centre of Translational Medicine of Anesthesiology, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu 610041, China
- Ao Hai
- Department of Anesthesiology, Laboratory of Anesthesia and Critical Care Medicine, National-Local Joint Engineering Research Centre of Translational Medicine of Anesthesiology, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu 610041, China
- Yi Zhao
- Department of Anesthesiology, Laboratory of Anesthesia and Critical Care Medicine, National-Local Joint Engineering Research Centre of Translational Medicine of Anesthesiology, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu 610041, China
- Yihang Fu
- State Key Laboratory of Oral Diseases and National Clinical Research Center for Oral Diseases, West China Hospital of Stomatology, Sichuan University, Chengdu 610041, China
- Yaling Tang
- State Key Laboratory of Oral Diseases and National Clinical Research Center for Oral Diseases, West China Hospital of Stomatology, Sichuan University, Chengdu 610041, China
- Ling Ye
- State Key Laboratory of Oral Diseases and National Clinical Research Center for Oral Diseases, West China Hospital of Stomatology, Sichuan University, Chengdu 610041, China
- Jin Liu
- Department of Anesthesiology, Laboratory of Anesthesia and Critical Care Medicine, National-Local Joint Engineering Research Centre of Translational Medicine of Anesthesiology, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu 610041, China
- Xinhua Liang
- State Key Laboratory of Oral Diseases and National Clinical Research Center for Oral Diseases, West China Hospital of Stomatology, Sichuan University, Chengdu 610041, China
- Bowen Ke
- Department of Anesthesiology, Laboratory of Anesthesia and Critical Care Medicine, National-Local Joint Engineering Research Centre of Translational Medicine of Anesthesiology, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu 610041, China
6
Li Y, Liu B, Deng J, Guo Y, Du H. Image-based molecular representation learning for drug development: a survey. Brief Bioinform 2024; 25:bbae294. [PMID: 38920347 PMCID: PMC11200195 DOI: 10.1093/bib/bbae294] [Received: 03/12/2024] [Revised: 05/19/2024] [Accepted: 06/08/2024] [Indexed: 06/27/2024] Open
Abstract
Artificial intelligence (AI) powered drug development has received remarkable attention in recent years. It addresses the limitations of traditional experimental methods that are costly and time-consuming. While there have been many surveys attempting to summarize related research, they only focus on general AI or specific aspects such as natural language processing and graph neural network. Considering the rapid advance on computer vision, using the molecular image to enable AI appears to be a more intuitive and effective approach since each chemical substance has a unique visual representation. In this paper, we provide the first survey on image-based molecular representation for drug development. The survey proposes a taxonomy based on the learning paradigms in computer vision and reviews a large number of corresponding papers, highlighting the contributions of molecular visual representation in drug development. Besides, we discuss the applications, limitations and future directions in the field. We hope this survey could offer valuable insight into the use of image-based molecular representation learning in the context of drug development.
Affiliation(s)
- Yue Li
- Division of Gastroenterology, Dongzhimen Hospital, Beijing University of Chinese Medicine, No. 5 Haiyun Warehouse, 100700, Beijing, China
- Bingyan Liu
- School of Computer Science, Beijing University of Posts and Telecommunications, No.10 Xituchen Street, 100876, Beijing, China
- Jinyan Deng
- Division of Gastroenterology, Dongzhimen Hospital, Beijing University of Chinese Medicine, No. 5 Haiyun Warehouse, 100700, Beijing, China
- Yi Guo
- Division of Gastroenterology, Dongzhimen Hospital, Beijing University of Chinese Medicine, No. 5 Haiyun Warehouse, 100700, Beijing, China
- Hongbo Du
- Division of Gastroenterology, Dongzhimen Hospital, Beijing University of Chinese Medicine, No. 5 Haiyun Warehouse, 100700, Beijing, China
- Institute of Liver Disease, Beijing University of Chinese Medicine, No. 5 Haiyun Warehouse, 100700, Beijing, China
7
Xerxa E, Bajorath J. Data-oriented protein kinase drug discovery. Eur J Med Chem 2024; 271:116413. [PMID: 38636127 DOI: 10.1016/j.ejmech.2024.116413] [Received: 02/29/2024] [Revised: 04/06/2024] [Accepted: 04/11/2024] [Indexed: 04/20/2024]
Abstract
The continued growth of data from biological screening and medicinal chemistry provides opportunities for data-driven experimental design and decision making in early-phase drug discovery. Approaches adopted from data science help to integrate internal and public domain data and extract knowledge from historical in-house data. Protein kinase (PK) drug discovery is an exemplary area where large amounts of data are accumulating, providing a valuable knowledge base for discovery projects. Herein, the evolution of PK drug discovery and development of small molecular PK inhibitors (PKIs) is reviewed, highlighting milestone developments in the field and discussing exemplary studies providing a basis for increasing data orientation of PK discovery efforts.
Affiliation(s)
- Elena Xerxa
- Department of Life Science Informatics and Data Science, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Lamarr Institute for Machine Learning and Artificial Intelligence, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, D-53115, Bonn, Germany
- Jürgen Bajorath
- Department of Life Science Informatics and Data Science, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Lamarr Institute for Machine Learning and Artificial Intelligence, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, D-53115, Bonn, Germany.
8
He X, Chen Y, Wang S, Zhang G. Employing Graph Neural Networks for Predicting Electrode Average Voltages and Screening High-Voltage Sodium Cathode Materials. ACS Appl Mater Interfaces 2024. [PMID: 38703109 DOI: 10.1021/acsami.4c00624] [Indexed: 05/06/2024]
Abstract
For many years, humans have been relentlessly focused on enhancing battery longevity and boosting energy storage capacities. The performance and durability of a battery depend significantly on the material used for its electrodes. In this context, merging machine learning with density functional theory (DFT) calculations has emerged as a pivotal approach to advancing the exploration of battery crystal structures. We present a new method that combines a graph convolutional neural network (GNN) with a Transformer convolutional layer, which we call Transformer-GNN. To underscore its efficacy, we benchmarked Transformer-GNN against three established statistical machine learning models: Support Vector Machine, Random Forest, and XGBoost. We also developed a standard GNN, which we refer to as Basic-GNN. Additionally, we compared Basic-GNN with Transformer-GNN to highlight the improvements brought about by incorporating the Transformer convolutional layer. The Transformer-GNN model outperforms the other models, achieving the highest R² value of 0.82 and the lowest mean squared error of 0.3161. Our findings demonstrate that the Transformer-GNN can profoundly understand battery crystal structures, thus forging the path toward more sophisticated and durable battery systems. Leveraging the GNN model's voltage predictions in tandem with the capacity data sourced from the database, we screened and pinpointed Na(NiO₂)₂ as a high-voltage (higher than 5 V), high-capacity sodium cathode material. We conducted DFT calculations on Na(NiO₂)₂ and revealed the migration mechanism of the Na ions.
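For readers unfamiliar with the two metrics reported in this abstract, the following sketch computes the coefficient of determination (R²) and the mean squared error from their standard definitions. The voltage values are made-up illustration data, not results from the paper, and the function names are hypothetical.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, with SS_tot measured around the mean of y_true."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical average voltages in volts (illustration only).
y_true = [2.0, 3.0, 4.0, 5.0]
y_pred = [2.5, 3.0, 3.5, 5.0]

mse = mean_squared_error(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
print(mse, r2)
```

A higher R² and a lower mean squared error both indicate predictions closer to the reference values, which is the sense in which the abstract ranks the models.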
Affiliation(s)
- Xiaoyue He
- CAS Key Laboratory of Material for Energy Conversion, Department of Material Science and Engineering, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
- Yanxu Chen
- CAS Key Laboratory of Material for Energy Conversion, Department of Material Science and Engineering, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
- Shao Wang
- CAS Key Laboratory of Material for Energy Conversion, Department of Material Science and Engineering, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
- Genqiang Zhang
- CAS Key Laboratory of Material for Energy Conversion, Department of Material Science and Engineering, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
- Hefei National Research Center for Physical Sciences at the Microscale, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
9
Luginina AP, Khnykin AN, Khorn PA, Moiseeva OV, Safronova NA, Pospelov VA, Dashevskii DE, Belousov AS, Borschevskiy VI, Mishin AV. Rational Design of Drugs Targeting G-Protein-Coupled Receptors: Ligand Search and Screening. Biochemistry (Mosc) 2024; 89:958-972. [PMID: 38880655 DOI: 10.1134/s0006297924050158] [Received: 01/10/2024] [Revised: 02/22/2024] [Accepted: 02/23/2024] [Indexed: 06/18/2024]
Abstract
G protein-coupled receptors (GPCRs) are transmembrane proteins that participate in many physiological processes and represent major pharmacological targets. Recent advances in structural biology of GPCRs have enabled the development of drugs based on the receptor structure (structure-based drug design, SBDD). SBDD utilizes information about the receptor-ligand complex to search for suitable compounds, thus expanding the chemical space of possible receptor ligands without the need for experimental screening. The review describes the use of structure-based virtual screening (SBVS) for GPCR ligands and approaches for the functional testing of potential drug compounds, as well as discusses recent advances and successful examples in the application of SBDD for the identification of GPCR ligands.
Affiliation(s)
- Aleksandra P Luginina
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Andrey N Khnykin
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Polina A Khorn
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Olga V Moiseeva
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Skryabin Institute of Biochemistry and Physiology of Microorganisms, Russian Academy of Sciences, Pushchino, Moscow Region, 142290, Russia
- Nadezhda A Safronova
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Vladimir A Pospelov
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Dmitrii E Dashevskii
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Anatolii S Belousov
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Valentin I Borschevskiy
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia.
- Frank Laboratory of Neutron Physics, Joint Institute for Nuclear Research, Dubna, Moscow Region, 141980, Russia
- Alexey V Mishin
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia.
10
Ghandikota SK, Jegga AG. Application of artificial intelligence and machine learning in drug repurposing. Prog Mol Biol Transl Sci 2024; 205:171-211. [PMID: 38789178 DOI: 10.1016/bs.pmbts.2024.03.030] [Indexed: 05/26/2024]
Abstract
The purpose of drug repurposing is to leverage previously approved drugs for a particular disease indication and apply them to another disease. It can be seen as a faster and more cost-effective approach to drug discovery and a powerful tool for achieving precision medicine. In addition, drug repurposing can be used to identify therapeutic candidates for rare diseases and phenotypic conditions with limited information on disease biology. Machine learning and artificial intelligence (AI) methodologies have enabled the construction of effective, data-driven repurposing pipelines by integrating and analyzing large-scale biomedical data. Recent technological advances, especially in heterogeneous network mining and natural language processing, have opened up exciting new opportunities and analytical strategies for drug repurposing. In this review, we first introduce the challenges in repurposing approaches and highlight some success stories, including those during the COVID-19 pandemic. Next, we review some existing computational frameworks in the literature, organized on the basis of the type of biomedical input data analyzed and the computational algorithms involved. In conclusion, we outline some exciting new directions that drug repurposing research may take, as pioneered by the generative AI revolution.
Affiliation(s)
- Sudhir K Ghandikota
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
- Anil G Jegga
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, United States.
| |
Collapse
|
11
|
Wang Y, Stebe KJ, de la Fuente-Nunez C, Radhakrishnan R. Computational Design of Peptides for Biomaterials Applications. ACS Appl Bio Mater 2024; 7:617-625. [PMID: 36971822 PMCID: PMC11190638 DOI: 10.1021/acsabm.2c01023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
Abstract
Computer-aided molecular design and protein engineering emerge as promising and active subjects in bioengineering and biotechnological applications. On one hand, due to the advancing computing power in the past decade, modeling toolkits and force fields have been put to use for accurate multiscale modeling of biomolecules including lipid, protein, carbohydrate, and nucleic acids. On the other hand, machine learning emerges as a revolutionary data analysis tool that promises to leverage physicochemical properties and structural information obtained from modeling in order to build quantitative protein structure-function relationships. We review recent computational works that utilize state-of-the-art computational methods to engineer peptides and proteins for various emerging biomedical, antimicrobial, and antifreeze applications. We also discuss challenges and possible future directions toward developing a roadmap for efficient biomolecular design and engineering.
Affiliation(s)
- Yiming Wang
- Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
- Department of Chemical and Biomolecular Engineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
- Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
- Kathleen J Stebe
- Department of Chemical and Biomolecular Engineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
- Cesar de la Fuente-Nunez
- Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
- Department of Chemical and Biomolecular Engineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
- Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
- Machine Biology Group, Department of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
- Ravi Radhakrishnan
- Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
- Department of Chemical and Biomolecular Engineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
- Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
12
Zhang H, Huang J, Xie J, Huang W, Yang Y, Xu M, Lei J, Chen H. GRELinker: A Graph-Based Generative Model for Molecular Linker Design with Reinforcement and Curriculum Learning. J Chem Inf Model 2024; 64:666-676. [PMID: 38241022 DOI: 10.1021/acs.jcim.3c01700] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/13/2024]
Abstract
Fragment-based drug discovery (FBDD) is widely used in drug design. One useful strategy in FBDD is designing linkers for linking fragments to optimize their molecular properties. In the current study, we present a novel generative fragment linking model, GRELinker, which utilizes a gated-graph neural network combined with reinforcement and curriculum learning to generate molecules with desirable attributes. The model has been shown to be efficient in multiple tasks, including controlling log P, optimizing synthesizability or predicted bioactivity of compounds, and generating molecules with high 3D similarity but low 2D similarity to the lead compound. Specifically, our model outperforms the previously reported reinforcement learning (RL) built-in method DRlinker on these benchmark tasks. Moreover, GRELinker has been successfully used in an actual FBDD case to generate optimized molecules with enhanced affinities by employing the docking score as the scoring function in RL. Besides, the implementation of curriculum learning in our framework enables the generation of structurally complex linkers more efficiently. These results demonstrate the benefits and feasibility of GRELinker in linker design for molecular optimization and drug discovery.
Affiliation(s)
- Hao Zhang
- School of Pharmaceutical Science, Sun Yat-sen University, Guangzhou 510006, China
- Jinchao Huang
- School of Pharmaceutical Science, Sun Yat-sen University, Guangzhou 510006, China
- Junjie Xie
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
- Weifeng Huang
- School of Pharmaceutical Science, Sun Yat-sen University, Guangzhou 510006, China
- Yuedong Yang
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
- Mingyuan Xu
- Guangzhou National Laboratory, Guangzhou International Bio Island, No. 9 Xin Dao Huan Bei Road, Guangzhou 510005, China
- Jinping Lei
- School of Pharmaceutical Science, Sun Yat-sen University, Guangzhou 510006, China
- Hongming Chen
- Guangzhou National Laboratory, Guangzhou International Bio Island, No. 9 Xin Dao Huan Bei Road, Guangzhou 510005, China
13
|
Matevosyan M, Harutyunyan V, Abelyan N, Khachatryan H, Tirosyan I, Gabrielyan Y, Sahakyan V, Gevorgyan S, Arakelov V, Arakelov G, Zakaryan H. Design of new chemical entities targeting both native and H275Y mutant influenza A virus by deep reinforcement learning. J Biomol Struct Dyn 2023; 41:10798-10812. [PMID: 36541127 DOI: 10.1080/07391102.2022.2158936] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2022] [Accepted: 12/10/2022] [Indexed: 12/24/2022]
Abstract
Influenza virus remains a major public health challenge due to its high morbidity and mortality and seasonal surge. Although antiviral drugs against the influenza virus are widely used as a first-line defense, the virus undergoes rapid genetic changes, resulting in the emergence of drug-resistant strains. Thus, new antiviral drugs that can outwit resistant strains are of significant importance. Herein, we used a deep reinforcement learning (RL) algorithm to design new chemical entities (NCEs) that are able to bind to the native and H275Y mutant (oseltamivir-resistant) neuraminidases (NAs) of influenza A virus with better binding energy than oseltamivir. We generated more than 66211 NCEs, which were prioritized based on the filtering rules, structural alerts, and synthetic accessibility. Then, 18 NCEs with better MM/PBSA scores than oseltamivir were further analyzed in molecular dynamics (MD) simulations conducted for 100 ns. The MD experiments showed that 8 NCEs formed very stable complexes with the binding pocket of both native and H275Y mutant NAs of H1N1. Furthermore, most NCEs demonstrated much better binding affinity to group 2 (N2, N3, and N9) and influenza B virus NAs than oseltamivir. Although all 8 NCEs have non-sialic acid-like structures, they showed a similar binding mode as oseltamivir, indicating that it is possible to find new scaffolds with better binding and antiviral properties than sialic acid-like inhibitors. In conclusion, we have designed potential compounds as antiviral candidates for further synthesis and testing against wild-type and mutant influenza virus. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Vahram Arakelov
- Denovo Sciences Inc, Yerevan, Armenia
- Institute of Molecular Biology of National Academy of Sciences, Yerevan, Armenia
- Grigor Arakelov
- Denovo Sciences Inc, Yerevan, Armenia
- Institute of Molecular Biology of National Academy of Sciences, Yerevan, Armenia
- Hovakim Zakaryan
- Denovo Sciences Inc, Yerevan, Armenia
- Institute of Molecular Biology of National Academy of Sciences, Yerevan, Armenia
14
|
Sultanov A, Crivello JC, Rebafka T, Sokolovska N. Data-Driven Score-Based Models for Generating Stable Structures with Adaptive Crystal Cells. J Chem Inf Model 2023; 63:6986-6997. [PMID: 37947477 DOI: 10.1021/acs.jcim.3c00969] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2023]
Abstract
The discovery of new functional and stable materials is a big challenge due to its complexity. This work aims at the generation of new crystal structures with desired properties, such as chemical stability and specified chemical composition, by using machine learning generative models. Compared with the generation of molecules, crystal structures pose new difficulties arising from the periodic nature of the crystal and from the specific symmetry constraints related to the space group. In this work, score-based probabilistic models based on annealed Langevin dynamics, which have shown excellent performance in various applications, are adapted to the task of crystal generation. The novelty of the presented approach resides in the fact that the lattice of the crystal cell is not fixed. During the training of the model, the lattice is learned from the available data, whereas during the sampling of a new chemical structure, two denoising processes are used in parallel to generate the lattice along with the generation of the atomic positions. A multigraph crystal representation is introduced that respects symmetry constraints, yielding computational advantages and a better quality of the sampled structures. We show that our model is capable of generating new candidate structures in any chosen chemical system and crystal group without any additional training. To illustrate the functionality of the proposed method, a comparison of our model to other recent generative models based on descriptor-based metrics is provided.
Affiliation(s)
- Arsen Sultanov
- Univ Paris Est Creteil, CNRS, ICMPE, UMR 7182, 2 rue Henri Dunant, 94320 Thiais, France
- Jean-Claude Crivello
- Univ Paris Est Creteil, CNRS, ICMPE, UMR 7182, 2 rue Henri Dunant, 94320 Thiais, France
- CNRS-Saint-Gobain-NIMS, IRL 3629, Laboratory for Innovative Key Materials and Structures (LINK), 1-1 Namiki, 305-0044 Tsukuba, Japan
- Tabea Rebafka
- LPSM, Sorbonne Université, Université Paris Cité, CNRS, 75005 Paris, France
- Université Paris-Saclay, INRAE, MaIAGE, 78350 Jouy-en-Josas, France
15
|
Handa K, Thomas MC, Kageyama M, Iijima T, Bender A. On the difficulty of validating molecular generative models realistically: a case study on public and proprietary data. J Cheminform 2023; 15:112. [PMID: 37990215 PMCID: PMC10664602 DOI: 10.1186/s13321-023-00781-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Accepted: 11/10/2023] [Indexed: 11/23/2023] Open
Abstract
While a multitude of deep generative models have recently emerged there exists no best practice for their practically relevant validation. On the one hand, novel de novo-generated molecules cannot be refuted by retrospective validation (so that this type of validation is biased); but on the other hand prospective validation is expensive and then often biased by the human selection process. In this case study, we frame retrospective validation as the ability to mimic human drug design, by answering the following question: Can a generative model trained on early-stage project compounds generate middle/late-stage compounds de novo? To this end, we used experimental data that contains the elapsed time of a synthetic expansion following hit identification from five public (where the time series was pre-processed to better reflect realistic synthetic expansions) and six in-house project datasets, and used REINVENT as a widely adopted RNN-based generative model. After splitting the dataset and training REINVENT on early-stage compounds, we found that rediscovery of middle/late-stage compounds was much higher in public projects (at 1.60%, 0.64%, and 0.21% of the top 100, 500, and 5000 scored generated compounds) than in in-house projects (where the values were 0.00%, 0.03%, and 0.04%, respectively). Similarly, average single nearest neighbour similarity between early- and middle/late-stage compounds in public projects was higher between active compounds than inactive compounds; however, for in-house projects the converse was true, which makes rediscovery (if so desired) more difficult. We hence show that the generative model recovers very few middle/late-stage compounds from real-world drug discovery projects, highlighting the fundamental difference between purely algorithmic design and drug discovery as a real-world process. 
Evaluating de novo compound design approaches appears, based on the current study, difficult or even impossible to do retrospectively. Scientific Contribution: This contribution hence illustrates aspects of evaluating the performance of generative models in a real-world setting which have not been extensively described previously and which hopefully contribute to their further future development.
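The rediscovery percentages quoted in this abstract (the share of the top 100, 500, and 5000 scored generated compounds that match held-out middle/late-stage compounds) can be illustrated with a minimal sketch. The function name `rediscovery_rate`, the toy SMILES strings, and the use of exact string identity as the matching criterion are assumptions made for illustration; the study's own matching and scoring procedure is not described in this listing.

```python
from typing import List, Set, Tuple


def rediscovery_rate(scored: List[Tuple[str, float]], held_out: Set[str], k: int) -> float:
    """Percentage of the k best-scored generated compounds found in held_out.

    Assumes each compound is an already-canonicalised identifier string and
    that a higher score is better (both are illustrative assumptions).
    """
    top = sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
    if not top:
        return 0.0
    hits = sum(1 for compound, _ in top if compound in held_out)
    return 100.0 * hits / len(top)


# Hypothetical generated compounds with model scores, and a hypothetical
# set of held-out middle/late-stage compounds.
generated = [("CCO", 0.9), ("c1ccccc1", 0.8), ("CCN", 0.7), ("CC(=O)O", 0.4)]
late_stage = {"CCN", "CC(=O)O"}

for k in (2, 4):
    print(k, rediscovery_rate(generated, late_stage, k))
```

In practice the identifiers would need to be canonicalised consistently before comparison, since the same molecule can be written as several different SMILES strings.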
Affiliation(s)
- Koichi Handa
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.
- Toxicology & DMPK Research Department, Teijin Institute for Bio-Medical Research, Teijin Pharma Limited, 4-3-2 Asahigaoka, Hino-Shi, Tokyo, 191-8512, Japan.
- Morgan C Thomas
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK
- Michiharu Kageyama
- Toxicology & DMPK Research Department, Teijin Institute for Bio-Medical Research, Teijin Pharma Limited, 4-3-2 Asahigaoka, Hino-Shi, Tokyo, 191-8512, Japan
- Takeshi Iijima
- Toxicology & DMPK Research Department, Teijin Institute for Bio-Medical Research, Teijin Pharma Limited, 4-3-2 Asahigaoka, Hino-Shi, Tokyo, 191-8512, Japan
- Andreas Bender
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.
16
|
Wang S, Wang L, Li F, Bai F. DeepSA: a deep-learning driven predictor of compound synthesis accessibility. J Cheminform 2023; 15:103. [PMID: 37919805 PMCID: PMC10621138 DOI: 10.1186/s13321-023-00771-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Accepted: 10/20/2023] [Indexed: 11/04/2023] Open
Abstract
With the continuous development of artificial intelligence technology, more and more computational models for generating new molecules are being developed. However, we are often confronted with the question of whether these compounds are easy or difficult to synthesize, that is, their synthetic accessibility. In this study, a deep-learning-based computational model called DeepSA was proposed to predict the synthesis accessibility of compounds, providing a useful tool for choosing molecules. DeepSA is a chemical language model that was developed by training on a dataset of 3,593,053 molecules using various natural language processing (NLP) algorithms, offering advantages over state-of-the-art methods and achieving a much higher area under the receiver operating characteristic curve (AUROC), i.e., 89.6%, in discriminating those molecules that are difficult to synthesize. This helps users select less expensive molecules for synthesis, reducing the time and cost required for drug discovery and development. Interestingly, a comparison of DeepSA with a Graph Attention-based method shows that using SMILES alone can also efficiently visualize and extract a compound's informative features. DeepSA is available online on our group's web server (https://bailab.siais.shanghaitech.edu.cn/services/deepsa/), and the code is available at https://github.com/Shihang-Wang-58/DeepSA.
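For readers unfamiliar with the AUROC metric cited above, the following is a minimal sketch of how it can be computed from labels and predicted scores, using the pairwise-ranking definition (the probability that a randomly chosen positive is scored above a randomly chosen negative). The labels and scores are made-up illustrative values, not DeepSA outputs, and the helper name `auroc` is an assumption.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = hard to synthesize, 0 = easy to synthesize.
labels = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
print(round(auroc(labels, scores), 4))
```

The quadratic pairwise loop is chosen for readability; for large datasets a rank-based implementation from an established library would be the more practical choice.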
Affiliation(s)
- Shihang Wang
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210, China
- Lin Wang
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210, China
- Fenglei Li
- School of Information Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210, China
- Fang Bai
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210, China.
- School of Information Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210, China.
- Shanghai Clinical Research and Trial Center, Shanghai, 201210, China.
17
|
Ilnicka A, Schneider G. Designing molecules with autoencoder networks. Nat Comput Sci 2023; 3:922-933. [PMID: 38177601 DOI: 10.1038/s43588-023-00548-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Accepted: 10/03/2023] [Indexed: 01/06/2024]
Abstract
Autoencoders are versatile tools in molecular informatics. These unsupervised neural networks serve diverse tasks such as data-driven molecular representation and constructive molecular design. This Review explores their algorithmic foundations and applications in drug discovery, highlighting the most active areas of development and the contributions autoencoder networks have made in advancing this field. We also explore the challenges and prospects concerning the utilization of autoencoders and the various adaptations of this neural network architecture in molecular design.
Affiliation(s)
- Agnieszka Ilnicka
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
- Gisbert Schneider
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland.
18
|
Ge H, Ji B, Fang J, Wang J, Li J, Wang J. Discovery of Potent and Selective CB2 Agonists Utilizing a Function-Based Computational Screening Protocol. ACS Chem Neurosci 2023; 14:3941-3958. [PMID: 37823773 PMCID: PMC10623575 DOI: 10.1021/acschemneuro.3c00580] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/07/2023] [Accepted: 09/22/2023] [Indexed: 10/13/2023] Open
Abstract
Nowadays, the identification of agonists and antagonists represents a great challenge in computer-aided drug design. In this work, we developed a computational protocol enabling us to design/screen novel chemicals that are likely to serve as selective CB2 agonists. The principle of this protocol is that by calculating the ligand-residue interaction profile (LRIP) of a ligand binding to a specific target, the agonist-antagonist function of a compound is then able to be determined after statistical analysis and free energy calculations. This computational protocol was successfully applied in CB2 agonist development starting from a lead compound, and a success rate of 70% was achieved. The functions of the synthesized derivatives were determined by in vitro functional assays. Moreover, the identified potent CB2 agonists and antagonists strongly interact with the key residues identified using the already known potent CB2 agonists/antagonists. The analysis of the interaction profile of compound 6, a potent agonist, showed strong interactions with F2.61, I186, and F2.64, while compound 39, a potent antagonist, showed strong interactions with L17, W6.48, V6.51, and C7.42. Still, some residues including V3.32, T3.33, S7.39, F183, W5.43, and I3.29 are hotspots for both CB2 agonists and antagonists. More significantly, we identified three hotspot residues in the loop, including I186 for agonists, L17 for antagonists, and F183 for both. These hotspot residues are typically not considered in CB1/CB2 rational ligand design. In conclusion, LRIP is a useful concept in rationally designing a compound to possess a certain function.
Affiliation(s)
- Haixia Ge
- School of Life Sciences, Huzhou University, Huzhou 313000, China
- Beihong Ji
- Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, School of Pharmacy, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
- Jiahui Fang
- Chinese Academy of Sciences Key Laboratory of Receptor Research, National Center for Drug Screening, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Jiayang Wang
- School of Life Sciences, Huzhou University, Huzhou 313000, China
- Jing Li
- Chinese Academy of Sciences Key Laboratory of Receptor Research, National Center for Drug Screening, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Junmei Wang
- Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, School of Pharmacy, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
19
|
Li CH, Tabor DP. Generative organic electronic molecular design informed by quantum chemistry. Chem Sci 2023; 14:11045-11055. [PMID: 37860647 PMCID: PMC10583709 DOI: 10.1039/d3sc03781a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2023] [Accepted: 09/11/2023] [Indexed: 10/21/2023] Open
Abstract
Generative molecular design strategies have emerged as promising alternatives to trial-and-error approaches for exploring and optimizing within large chemical spaces. To date, generative models with reinforcement learning approaches have frequently used low-cost methods to evaluate the quality of the generated molecules, enabling many loops through the generative model. However, for functional molecular materials tasks, such low-cost methods are either not available or would require the generation of large amounts of training data to train surrogate machine learning models. In this work, we develop a framework that connects the REINVENT reinforcement learning framework with excited state quantum chemistry calculations to discover molecules with specified molecular excited state energy levels, specifically molecules with excited state landscapes that would serve as promising singlet fission or triplet-triplet annihilation materials. We employ a two-step curriculum strategy to first find a set of diverse promising molecules, then demonstrate the framework's ability to exploit a more focused chemical space with anthracene derivatives. Under this protocol, we show that the framework can find desired molecules and improve Pareto fronts for targeted properties versus synthesizability. Moreover, we are able to find several different design principles used by chemists for the design of singlet fission and triplet-triplet annihilation molecules.
Affiliation(s)
- Cheng-Han Li
- Department of Chemistry, Texas A&M University, College Station, TX 77842, USA
- Daniel P Tabor
- Department of Chemistry, Texas A&M University, College Station, TX 77842, USA
20
|
Erikawa D, Yasuo N, Suzuki T, Nakamura S, Sekijima M. Gargoyles: An Open Source Graph-Based Molecular Optimization Method Based on Deep Reinforcement Learning. ACS Omega 2023; 8:37431-37441. [PMID: 37841174 PMCID: PMC10568706 DOI: 10.1021/acsomega.3c05430] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 09/13/2023] [Indexed: 10/17/2023]
Abstract
Automatic optimization methods for compounds in the vast compound space are important for drug discovery and material design. Several machine learning-based molecular generative models for drug discovery have been proposed, but most of these methods generate compounds from scratch and are not suitable for exploring and optimizing user-defined compounds. In this study, we developed a compound optimization method based on molecular graphs using deep reinforcement learning. This method searches for compounds on a fragment-by-fragment basis and at high density by generating fragments to be added atom by atom. Experimental results confirmed that the quantitative estimate of drug-likeness (QED), the optimization target set in this study, was enhanced by searching around the starting compound. As a use case, we successfully enhanced the activity of a compound by targeting dopamine receptor D2 (DRD2). The generated compounds remain structurally similar to the starting compounds while showing increased activity, indicating that this method is suitable for optimizing molecules from a given compound. The source code is available at https://github.com/sekijima-lab/GARGOYLES.
Affiliation(s)
- Daiki Erikawa
- Department of Computer Science, Tokyo Institute of Technology, 4259-J3-23, Nagatsuta-cho, Midori-ku, Yokohama 226-8501, Japan
- Nobuaki Yasuo
- Academy for Convergence of Materials and Informatics (TAC-MI), Tokyo Institute of Technology, S6-23, Ookayama, Meguro-ku, Tokyo 152-8550, Japan
- Takamasa Suzuki
- Department of Computer Science, Tokyo Institute of Technology, 4259-J3-23, Nagatsuta-cho, Midori-ku, Yokohama 226-8501, Japan
- Shogo Nakamura
- Department of Life Science and Technology, Tokyo Institute of Technology, 4259-J3-23, Nagatsuta-cho, Midori-ku, Yokohama 226-8501, Japan
- Masakazu Sekijima
- Department of Computer Science, Tokyo Institute of Technology, 4259-J3-23, Nagatsuta-cho, Midori-ku, Yokohama 226-8501, Japan
21
|
Lamanna G, Delre P, Marcou G, Saviano M, Varnek A, Horvath D, Mangiatordi GF. GENERA: A Combined Genetic/Deep-Learning Algorithm for Multiobjective Target-Oriented De Novo Design. J Chem Inf Model 2023; 63:5107-5119. [PMID: 37556857 PMCID: PMC10466378 DOI: 10.1021/acs.jcim.3c00963] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 06/26/2023] [Indexed: 08/11/2023]
Abstract
This study introduces a new de novo design algorithm called GENERA that combines the capabilities of a deep-learning algorithm for automated drug-like analogue design, called DeLA-Drug, with a genetic algorithm for generating molecules with desired target-oriented properties. Specifically, GENERA was applied to the angiotensin-converting enzyme 2 (ACE2) target, which is implicated in many pathological conditions, including COVID-19. The ability of GENERA to de novo design promising candidates for a specific target was assessed using two docking programs, PLANTS and GLIDE. A fitness function based on the Pareto dominance resulting from computed PLANTS and GLIDE scores was applied to demonstrate the algorithm's ability to perform multiobjective optimizations effectively. GENERA can quickly generate focused libraries that produce better scores compared to a starting set of known ACE-2 binders. This study is the first to utilize a DL-based algorithm designed for analogue generation as a mutational operator within a GA framework, representing an innovative approach to target-oriented de novo design.
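The Pareto-dominance idea behind the fitness function mentioned above can be sketched as follows. This is a generic illustration, not GENERA's implementation: the function names, the toy score pairs, and the convention that a lower (more negative) docking score is better on both objectives are all assumptions.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly better on one.

    Lower values are treated as better (an assumption for docking-like scores).
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scores: List[Tuple[float, float]]) -> List[int]:
    """Indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical (score_1, score_2) pairs standing in for two docking programs.
candidates = [(-90.0, -7.5), (-85.0, -8.2), (-80.0, -7.0), (-95.0, -6.0)]
front = pareto_front(candidates)
print(front)  # candidate 2 is dominated by candidate 0, so it is excluded
```

A genetic algorithm could then rank or select individuals by whether they lie on this front, which avoids having to collapse the two scores into a single weighted sum.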
Affiliation(s)
- Giuseppe Lamanna
- Chemistry Department, University of Bari “Aldo Moro”, Via E. Orabona, 4, I-70125 Bari, Italy
- CNR − Institute of Crystallography, Via Amendola 122/o, 70126 Bari, Italy
- Pietro Delre
- CNR − Institute of Crystallography, Via Amendola 122/o, 70126 Bari, Italy
- Gilles Marcou
- Laboratoire de Chémoinformatique UMR7140, 4 rue Blaise Pascal, 67000 Strasbourg, France
- Michele Saviano
- CNR − Institute of Crystallography, Via Vivaldi 43, 81100 Caserta, Italy
- Alexandre Varnek
- Laboratoire de Chémoinformatique UMR7140, 4 rue Blaise Pascal, 67000 Strasbourg, France
- Dragos Horvath
- Laboratoire de Chémoinformatique UMR7140, 4 rue Blaise Pascal, 67000 Strasbourg, France
22
|
Singh S, Kumar R, Payra S, Singh SK. Artificial Intelligence and Machine Learning in Pharmacological Research: Bridging the Gap Between Data and Drug Discovery. Cureus 2023; 15:e44359. [PMID: 37779744 PMCID: PMC10539991 DOI: 10.7759/cureus.44359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/31/2023] [Indexed: 10/03/2023] Open
Abstract
Artificial intelligence (AI) has transformed pharmacological research through machine learning, deep learning, and natural language processing. These advancements have greatly influenced drug discovery, development, and precision medicine. AI algorithms analyze vast biomedical data identifying potential drug targets, predicting efficacy, and optimizing lead compounds. AI has diverse applications in pharmacological research, including target identification, drug repurposing, virtual screening, de novo drug design, toxicity prediction, and personalized medicine. AI improves patient selection, trial design, and real-time data analysis in clinical trials, leading to enhanced safety and efficacy outcomes. Post-marketing surveillance utilizes AI-based systems to monitor adverse events, detect drug interactions, and support pharmacovigilance efforts. Machine learning models extract patterns from complex datasets, enabling accurate predictions and informed decision-making, thus accelerating drug discovery. Deep learning, specifically convolutional neural networks (CNN), excels in image analysis, aiding biomarker identification and optimizing drug formulation. Natural language processing facilitates the mining and analysis of scientific literature, unlocking valuable insights and information. However, the adoption of AI in pharmacological research raises ethical considerations. Ensuring data privacy and security, addressing algorithm bias and transparency, obtaining informed consent, and maintaining human oversight in decision-making are crucial ethical concerns. The responsible deployment of AI necessitates robust frameworks and regulations. The future of AI in pharmacological research is promising, with integration with emerging technologies like genomics, proteomics, and metabolomics offering the potential for personalized medicine and targeted therapies. 
Collaboration among academia, industry, and regulatory bodies is essential for the ethical implementation of AI in drug discovery and development. Continuous research and development in AI techniques and comprehensive training programs will empower scientists and healthcare professionals to fully exploit AI's potential, leading to improved patient outcomes and innovative pharmacological interventions.
Affiliation(s)
- Shruti Singh
- Department of Pharmacology, All India Institute of Medical Sciences, Patna, IND
- Rajesh Kumar
- Department of Pharmacology, All India Institute of Medical Sciences, Patna, IND
- Shuvasree Payra
- Department of Pharmacology, All India Institute of Medical Sciences, Patna, IND
- Sunil K Singh
- Department of Pharmacology, All India Institute of Medical Sciences, Patna, IND
|
23
|
Vora LK, Gholap AD, Jetha K, Thakur RRS, Solanki HK, Chavda VP. Artificial Intelligence in Pharmaceutical Technology and Drug Delivery Design. Pharmaceutics 2023; 15:1916. [PMID: 37514102 PMCID: PMC10385763 DOI: 10.3390/pharmaceutics15071916] [Citation(s) in RCA: 61] [Impact Index Per Article: 61.0] [Received: 05/06/2023] [Revised: 06/28/2023] [Accepted: 07/04/2023] [Indexed: 07/30/2023] Open
Abstract
Artificial intelligence (AI) has emerged as a powerful tool that harnesses anthropomorphic knowledge and provides expedited solutions to complex challenges. Remarkable advancements in AI technology and machine learning present a transformative opportunity in the drug discovery, formulation, and testing of pharmaceutical dosage forms. By utilizing AI algorithms that analyze extensive biological data, including genomics and proteomics, researchers can identify disease-associated targets and predict their interactions with potential drug candidates. This enables a more efficient and targeted approach to drug discovery, thereby increasing the likelihood of successful drug approvals. Furthermore, AI can contribute to reducing development costs by optimizing research and development processes. Machine learning algorithms assist in experimental design and can predict the pharmacokinetics and toxicity of drug candidates. This capability enables the prioritization and optimization of lead compounds, reducing the need for extensive and costly animal testing. Personalized medicine approaches can be facilitated through AI algorithms that analyze real-world patient data, leading to more effective treatment outcomes and improved patient adherence. This comprehensive review explores the wide-ranging applications of AI in drug discovery, drug delivery dosage form designs, process optimization, testing, and pharmacokinetics/pharmacodynamics (PK/PD) studies. This review provides an overview of various AI-based approaches utilized in pharmaceutical technology, highlighting their benefits and drawbacks. Nevertheless, the continued investment in and exploration of AI in the pharmaceutical industry offer exciting prospects for enhancing drug development processes and patient care.
Affiliation(s)
- Lalitkumar K Vora
- School of Pharmacy, Queen's University Belfast, 97 Lisburn Road, Belfast BT9 7BL, UK
- Amol D Gholap
- Department of Pharmaceutics, St. John Institute of Pharmacy and Research, Palghar 401404, Maharashtra, India
- Keshava Jetha
- Department of Pharmaceutics and Pharmaceutical Technology, L. M. College of Pharmacy, Ahmedabad 380009, Gujarat, India
- Ph.D. Section, Gujarat Technological University, Ahmedabad 382424, Gujarat, India
- Hetvi K Solanki
- Pharmacy Section, L. M. College of Pharmacy, Ahmedabad 380009, Gujarat, India
- Vivek P Chavda
- Department of Pharmaceutics and Pharmaceutical Technology, L. M. College of Pharmacy, Ahmedabad 380009, Gujarat, India
|
24
|
Chenthamarakshan V, Hoffman SC, Owen CD, Lukacik P, Strain-Damerell C, Fearon D, Malla TR, Tumber A, Schofield CJ, Duyvesteyn HM, Dejnirattisai W, Carrique L, Walter TS, Screaton GR, Matviiuk T, Mojsilovic A, Crain J, Walsh MA, Stuart DI, Das P. Accelerating drug target inhibitor discovery with a deep generative foundation model. SCIENCE ADVANCES 2023; 9:eadg7865. [PMID: 37343087 PMCID: PMC10284550 DOI: 10.1126/sciadv.adg7865] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/20/2023] [Accepted: 05/17/2023] [Indexed: 06/23/2023]
Abstract
Inhibitor discovery for emerging drug-target proteins is challenging, especially when target structure or active molecules are unknown. Here, we experimentally validate the broad utility of a deep generative framework trained at-scale on protein sequences, small molecules, and their mutual interactions-unbiased toward any specific target. We performed a protein sequence-conditioned sampling on the generative foundation model to design small-molecule inhibitors for two dissimilar targets: the spike protein receptor-binding domain (RBD) and the main protease from SARS-CoV-2. Despite using only the target sequence information during the model inference, micromolar-level inhibition was observed in vitro for two candidates out of four synthesized for each target. The most potent spike RBD inhibitor exhibited activity against several variants in live virus neutralization assays. These results establish that a single, broadly deployable generative foundation model for accelerated inhibitor discovery is effective and efficient, even in the absence of target structure or binder information.
Affiliation(s)
- Samuel C. Hoffman
- IBM Research, Thomas J. Watson Research Center, Yorktown Heights, New York, NY, USA
- C. David Owen
- Diamond Light Source Ltd., Harwell Science and Innovation Campus, OX11 0DE Didcot, UK
- Research Complex at Harwell, Harwell Science and Innovation Campus, OX11 0FA Didcot, UK
- Petra Lukacik
- Diamond Light Source Ltd., Harwell Science and Innovation Campus, OX11 0DE Didcot, UK
- Research Complex at Harwell, Harwell Science and Innovation Campus, OX11 0FA Didcot, UK
- Claire Strain-Damerell
- Diamond Light Source Ltd., Harwell Science and Innovation Campus, OX11 0DE Didcot, UK
- Research Complex at Harwell, Harwell Science and Innovation Campus, OX11 0FA Didcot, UK
- Daren Fearon
- Diamond Light Source Ltd., Harwell Science and Innovation Campus, OX11 0DE Didcot, UK
- Research Complex at Harwell, Harwell Science and Innovation Campus, OX11 0FA Didcot, UK
- Tika R. Malla
- Chemistry Research Laboratory, Department of Chemistry and the Ineos Oxford Institute for Antimicrobial Research, University of Oxford, 12 Mansfield Road, OX1 3TA Oxford, UK
- Anthony Tumber
- Chemistry Research Laboratory, Department of Chemistry and the Ineos Oxford Institute for Antimicrobial Research, University of Oxford, 12 Mansfield Road, OX1 3TA Oxford, UK
- Christopher J. Schofield
- Chemistry Research Laboratory, Department of Chemistry and the Ineos Oxford Institute for Antimicrobial Research, University of Oxford, 12 Mansfield Road, OX1 3TA Oxford, UK
- Helen M.E. Duyvesteyn
- Division of Structural Biology, University of Oxford, The Wellcome Centre for Human Genetics, Headington, Oxford, UK
- Wanwisa Dejnirattisai
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, UK
- Loic Carrique
- Division of Structural Biology, University of Oxford, The Wellcome Centre for Human Genetics, Headington, Oxford, UK
- Thomas S. Walter
- Division of Structural Biology, University of Oxford, The Wellcome Centre for Human Genetics, Headington, Oxford, UK
- Gavin R. Screaton
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, UK
- Jason Crain
- IBM Research Europe, Hartree Centre, Daresbury WA4 4AD, UK
- Department of Biochemistry, University of Oxford, Oxford OX1 3QU, UK
- Martin A. Walsh
- Diamond Light Source Ltd., Harwell Science and Innovation Campus, OX11 0DE Didcot, UK
- Research Complex at Harwell, Harwell Science and Innovation Campus, OX11 0FA Didcot, UK
- David I. Stuart
- Diamond Light Source Ltd., Harwell Science and Innovation Campus, OX11 0DE Didcot, UK
- Division of Structural Biology, University of Oxford, The Wellcome Centre for Human Genetics, Headington, Oxford, UK
- Payel Das
- IBM Research, Thomas J. Watson Research Center, Yorktown Heights, New York, NY, USA
|
25
|
Ciepliński T, Danel T, Podlewska S, Jastrzȩbski S. Generative Models Should at Least Be Able to Design Molecules That Dock Well: A New Benchmark. J Chem Inf Model 2023. [PMID: 37224003 DOI: 10.1021/acs.jcim.2c01355] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 05/26/2023]
Abstract
Designing compounds with desired properties is a key element of the drug discovery process. However, measuring progress in the field has been challenging due to the lack of realistic retrospective benchmarks, and the large cost of prospective validation. To close this gap, we propose a benchmark based on docking, a widely used computational method for assessing molecule binding to a protein. Concretely, the goal is to generate drug-like molecules that are scored highly by SMINA, a popular docking software. We observe that various graph-based generative models fail to propose molecules with a high docking score when trained using a realistically sized training set. This suggests a limitation of the current incarnation of models for de novo drug design. Finally, we also include simpler tasks in the benchmark based on a simpler scoring function. We release the benchmark as an easy to use package available at https://github.com/cieplinski-tobiasz/smina-docking-benchmark. We hope that our benchmark will serve as a stepping stone toward the goal of automatically generating promising drug candidates.
Affiliation(s)
- Tobiasz Ciepliński
- Faculty of Mathematics and Computer Science, Jagiellonian University, Łojasiewicza 6, 30-348 Kraków, Poland
- Tomasz Danel
- Faculty of Mathematics and Computer Science, Jagiellonian University, Łojasiewicza 6, 30-348 Kraków, Poland
- Sabina Podlewska
- Maj Institute of Pharmacology, Polish Academy of Sciences, Smȩtna 12, 31-343 Kraków, Poland
- Stanisław Jastrzȩbski
- Faculty of Mathematics and Computer Science, Jagiellonian University, Łojasiewicza 6, 30-348 Kraków, Poland
- Molecule.one, Al. Jerozolimskie 96, 00-807 Warsaw, Poland
|
26
|
Yoshimori A, Bajorath J. Motif2Mol: Prediction of New Active Compounds Based on Sequence Motifs of Ligand Binding Sites in Proteins Using a Biochemical Language Model. Biomolecules 2023; 13:biom13050833. [PMID: 37238703 DOI: 10.3390/biom13050833] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/16/2023] [Revised: 05/05/2023] [Accepted: 05/12/2023] [Indexed: 05/28/2023] Open
Abstract
In drug design, the prediction of new active compounds from protein sequence data has only been attempted in a few studies thus far. This prediction task is principally challenging because global protein sequence similarity has strong evolutionary and structural implications, but is often only vaguely related to ligand binding. Deep language models adapted from natural language processing offer new opportunities to attempt such predictions via machine translation by directly relating amino acid sequences and chemical structures to each other based on textual molecular representations. Herein, we introduce a biochemical language model with transformer architecture for the prediction of new active compounds from sequence motifs of ligand binding sites. In a proof-of-concept application on inhibitors of more than 200 human kinases, the Motif2Mol model revealed promising learning characteristics and an unprecedented ability to consistently reproduce known inhibitors of different kinases.
Affiliation(s)
- Atsushi Yoshimori
- Institute for Theoretical Medicine, Inc., 26-1 Muraoka-Higashi 2-Chome, Fujisawa 251-0012, Japan
- Jürgen Bajorath
- Department of Life Science Informatics and Data Science, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, 53115 Bonn, Germany
|
27
|
Qian X, Dai X, Luo L, Lin M, Xu Y, Zhao Y, Huang D, Qiu H, Liang L, Liu H, Liu Y, Gu L, Lu T, Chen Y, Zhang Y. An Interpretable Multitask Framework BiLAT Enables Accurate Prediction of Cyclin-Dependent Protein Kinase Inhibitors. J Chem Inf Model 2023. [PMID: 37171216 DOI: 10.1021/acs.jcim.3c00473] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/13/2023]
Abstract
The cyclin-dependent protein kinases (CDKs) are protein-serine/threonine kinases with crucial effects on the regulation of cell cycle and transcription. CDKs can be a hallmark of cancer since their excessive expression could lead to impaired cell proliferation. However, the selectivity profile of most developed CDK inhibitors is insufficient, which has hindered the therapeutic use of CDK inhibitors. In this study, we propose a multitask deep learning framework called BiLAT based on SMILES representation for the prediction of the inhibitory activity of molecules on eight CDK subtypes (CDK1, 2, 4-9). The framework is mainly composed of an improved bidirectional long short-term memory module BiLSTM and the encoder layer of the Transformer framework. Additionally, the data enhancement method of SMILES enumeration is applied to improve the performance of the model. Compared with baseline predictive models based on three conventional machine learning methods and two multitask deep learning algorithms, BiLAT achieves the best performance with the highest average AUC, ACC, F1-score, and MCC values of 0.938, 0.894, 0.911, and 0.715 for the test set. Moreover, we constructed a targeted external data set CDK-Dec for the CDK family, which mainly contains bait values screened by 3D similarity with active compounds. This dataset was utilized in the subsequent evaluation of our model. It is worth mentioning that the BiLAT model is interpretable and can be used by chemists to design and synthesize compounds with improved activity. To further verify the generalization ability of the multitask BiLAT model, we also conducted another evaluation on three public datasets (Tox21, ClinTox, and SIDER). Compared with several currently popular models, BiLAT shows the best performance on two datasets. These results indicate that BiLAT is an effective tool for accelerating drug discovery.
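For readers unfamiliar with the threshold-based metrics quoted in this abstract (ACC, F1-score, MCC), the following is a minimal sketch of how they follow from binary confusion-matrix counts. The function name `binary_metrics` and the example counts are hypothetical illustrations, not taken from the BiLAT paper, and AUC is omitted because it requires ranked prediction scores rather than counts.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, F1-score, and Matthews correlation coefficient
    from binary confusion-matrix counts (hypothetical helper)."""
    total = tp + fp + tn + fn
    acc = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    # MCC uses all four cells, so it stays informative on imbalanced data.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"acc": acc, "f1": f1, "mcc": mcc}


# Made-up counts for illustration only.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
```

Because MCC penalizes errors in both classes, it can be noticeably lower than ACC or F1 on the same predictions, which is consistent with the pattern of values reported above.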
Affiliation(s)
- Xu Qian
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Xiaowen Dai
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Lin Luo
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Mingde Lin
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Yuan Xu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Yang Zhao
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Dingfang Huang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Haodi Qiu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Li Liang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Haichun Liu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Yingbo Liu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Lingxi Gu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Tao Lu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- State Key Laboratory of Natural Medicines, China Pharmaceutical University, 24 Tongjiaxiang, Nanjing 210009, China
- Yadong Chen
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Yanmin Zhang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
|
28
|
Thomas M, Bender A, de Graaf C. Integrating structure-based approaches in generative molecular design. Curr Opin Struct Biol 2023; 79:102559. [PMID: 36870277 DOI: 10.1016/j.sbi.2023.102559] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 11/11/2022] [Revised: 01/23/2023] [Accepted: 01/31/2023] [Indexed: 03/06/2023]
Abstract
Generative molecular design for drug discovery and development has seen a recent resurgence promising to improve the efficiency of the design-make-test-analyse cycle; by computationally exploring much larger chemical spaces than traditional virtual screening techniques. However, most generative models thus far have only utilized small-molecule information to train and condition de novo molecule generators. Here, we instead focus on recent approaches that incorporate protein structure into de novo molecule optimization in an attempt to maximize the predicted on-target binding affinity of generated molecules. We summarize these structure integration principles into either distribution learning or goal-directed optimization and for each case whether the approach is protein structure-explicit or implicit with respect to the generative model. We discuss recent approaches in the context of this categorization and provide our perspective on the future direction of the field.
Affiliation(s)
- Morgan Thomas
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, CB2 1EW, UK.
- Andreas Bender
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, CB2 1EW, UK. https://twitter.com/@AndreasBenderUK
- Chris de Graaf
- Sosei Heptares, Steinmetz Building, Granta Park, Great Abington, Cambridge, CB21 6DG, UK. https://twitter.com/@Chris_de_Graaf
|
29
|
Kresoja KP, Unterhuber M, Wachter R, Thiele H, Lurz P. A cardiologist's guide to machine learning in cardiovascular disease prognosis prediction. Basic Res Cardiol 2023; 118:10. [PMID: 36939941 PMCID: PMC10027799 DOI: 10.1007/s00395-023-00982-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/22/2022] [Revised: 02/21/2023] [Accepted: 02/26/2023] [Indexed: 03/21/2023]
Abstract
A modern-day physician is faced with a vast abundance of clinical and scientific data, by far surpassing the capabilities of the human mind. Until the last decade, advances in data availability have not been accompanied by analytical approaches. The advent of machine learning (ML) algorithms might improve the interpretation of complex data and should help to translate the near endless amount of data into clinical decision-making. ML has become part of our everyday practice and might even further change modern-day medicine. It is important to acknowledge the role of ML in prognosis prediction of cardiovascular disease. The present review aims to prepare the modern physician and researcher for the challenges that ML might bring, explaining basic concepts but also caveats that might arise when using these methods. Further, a brief overview of currently established classical and emerging concepts of ML disease prediction in the fields of omics, imaging and basic science is presented.
Affiliation(s)
- Karl-Patrik Kresoja
- Department of Internal Medicine/Cardiology, Heart Center Leipzig at University of Leipzig, Struempellstr. 39, 04289, Leipzig, Germany
- Leipzig Heart Institute, Leipzig Heart Science at Heart Center Leipzig, Leipzig, Germany
- Matthias Unterhuber
- Department of Internal Medicine/Cardiology, Heart Center Leipzig at University of Leipzig, Struempellstr. 39, 04289, Leipzig, Germany
- Leipzig Heart Institute, Leipzig Heart Science at Heart Center Leipzig, Leipzig, Germany
- Rolf Wachter
- Department of Cardiology, University Hospital Leipzig, Leipzig, Germany
- Clinic for Cardiology and Pneumology, University Medicine Göttingen, Göttingen, Germany
- German Cardiovascular Research Center (DZHK), Partner Site Göttingen, Göttingen, Germany
- Holger Thiele
- Department of Internal Medicine/Cardiology, Heart Center Leipzig at University of Leipzig, Struempellstr. 39, 04289, Leipzig, Germany.
- Leipzig Heart Institute, Leipzig Heart Science at Heart Center Leipzig, Leipzig, Germany.
- Philipp Lurz
- Department of Internal Medicine/Cardiology, Heart Center Leipzig at University of Leipzig, Struempellstr. 39, 04289, Leipzig, Germany.
- Leipzig Heart Institute, Leipzig Heart Science at Heart Center Leipzig, Leipzig, Germany.
|
30
|
Potlitz F, Link A, Schulig L. Advances in the discovery of new chemotypes through ultra-large library docking. Expert Opin Drug Discov 2023; 18:303-313. [PMID: 36714919 DOI: 10.1080/17460441.2023.2171984] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 01/31/2023]
Abstract
INTRODUCTION: The size and complexity of virtual screening libraries in drug discovery have skyrocketed in recent years, reaching up to multiple billions of accessible compounds. However, virtual screening of such ultra-large libraries poses several challenges associated with preparing the libraries, sampling, and pre-selection of suitable compounds. The utilization of artificial intelligence (AI)-assisted screening approaches, such as deep learning, poses a promising countermeasure to deal with this rapidly expanding chemical space. For example, various AI-driven methods were recently successfully used to identify novel small molecule inhibitors of the SARS-CoV-2 main protease (Mpro). AREAS COVERED: This review focuses on presenting various kinds of virtual screening methods suitable for dealing with ultra-large libraries. Challenges associated with these computational methodologies are discussed, and recent advances are highlighted in the example of the discovery of novel Mpro inhibitors targeting the SARS-CoV-2 virus. EXPERT OPINION: With the rapid expansion of the virtual chemical space, the methodologies for docking and screening such quantities of molecules need to keep pace. Employment of AI-driven screening compounds has already been shown to be effective in a range from a few thousand to multiple billion compounds, furthered by de novo generation of drug-like molecules without human interference.
Affiliation(s)
- Felix Potlitz
- Department of Pharmaceutical and Medicinal Chemistry, Institute of Pharmacy, University of Greifswald, Germany
- Andreas Link
- Department of Pharmaceutical and Medicinal Chemistry, Institute of Pharmacy, University of Greifswald, Germany
- Lukas Schulig
- Department of Pharmaceutical and Medicinal Chemistry, Institute of Pharmacy, University of Greifswald, Germany
|
31
|
Schoenmaker L, Béquignon OJM, Jespers W, van Westen GJP. UnCorrupt SMILES: a novel approach to de novo design. J Cheminform 2023; 15:22. [PMID: 36788579 PMCID: PMC9926805 DOI: 10.1186/s13321-023-00696-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 10/13/2022] [Accepted: 02/06/2023] [Indexed: 02/16/2023] Open
Abstract
Generative deep learning models have emerged as a powerful approach for de novo drug design as they aid researchers in finding new molecules with desired properties. Despite continuous improvements in the field, a subset of the outputs that sequence-based de novo generators produce cannot be progressed due to errors. Here, we propose to fix these invalid outputs post hoc. In similar tasks, transformer models from the field of natural language processing have been shown to be very effective. Therefore, here this type of model was trained to translate invalid Simplified Molecular-Input Line-Entry System (SMILES) into valid representations. The performance of this SMILES corrector was evaluated on four representative methods of de novo generation: a recurrent neural network (RNN), a target-directed RNN, a generative adversarial network (GAN), and a variational autoencoder (VAE). This study has found that the percentage of invalid outputs from these specific generative models ranges between 4 and 89%, with different models having different error-type distributions. Post hoc correction of SMILES was shown to increase model validity. The SMILES corrector trained with one error per input alters 60-90% of invalid generator outputs and fixes 35-80% of them. However, a higher error detection and performance was obtained for transformer models trained with multiple errors per input. In this case, the best model was able to correct 60-95% of invalid generator outputs. Further analysis showed that these fixed molecules are comparable to the correct molecules from the de novo generators based on novelty and similarity. Additionally, the SMILES corrector can be used to expand the amount of interesting new molecules within the targeted chemical space. Introducing different errors into existing molecules yields novel analogs with a uniqueness of 39% and a novelty of approximately 20%. 
The results of this research demonstrate that SMILES correction is a viable post hoc extension and can enhance the search for better drug candidates.
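The uniqueness and novelty figures quoted in this abstract are set-based ratios. The sketch below shows one common way to compute them. The exact definitions used by the authors are not given here, so the two functions, their names, and the toy SMILES lists are assumptions for illustration. The comparison is on raw strings, which is only meaningful if all SMILES have already been canonicalized by a cheminformatics toolkit.

```python
def uniqueness(generated):
    """Fraction of generated molecules that are distinct.
    Assumed definition: distinct strings / all generated strings."""
    if not generated:
        return 0.0
    return len(set(generated)) / len(generated)


def novelty(generated, reference):
    """Fraction of distinct generated molecules absent from a reference set
    (e.g. the training data). Assumed definition, not the paper's."""
    unique = set(generated)
    if not unique:
        return 0.0
    return len(unique - set(reference)) / len(unique)


# Toy, hand-written SMILES strings; assumed to be canonical already.
generated = ["CCO", "CCO", "CCN", "c1ccccc1", "CCN"]
reference = ["CCO", "CCC"]

u = uniqueness(generated)          # 3 distinct out of 5 generated
n = novelty(generated, reference)  # 2 of the 3 distinct are not in reference
print(f"uniqueness={u:.2f}, novelty={n:.2f}")
```

Other conventions exist (for example, computing novelty over all valid outputs rather than over distinct ones), so reported percentages are only comparable when the definitions match.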
Affiliation(s)
- Linde Schoenmaker
- Computational Drug Discovery, Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands
- Olivier J. M. Béquignon
- Computational Drug Discovery, Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands
- Willem Jespers
- Computational Drug Discovery, Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands
- Gerard J. P. van Westen
- Computational Drug Discovery, Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands
|
32
|
Constâncio AS, Tsunoda DF, Silva HDFN, da Silveira JM, Carvalho DR. Deception detection with machine learning: A systematic review and statistical analysis. PLoS One 2023; 18:e0281323. [PMID: 36757928 PMCID: PMC9910662 DOI: 10.1371/journal.pone.0281323] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/15/2022] [Accepted: 01/20/2023] [Indexed: 02/10/2023] Open
Abstract
Several studies applying Machine Learning to deception detection have been published in the last decade. A rich and complex set of settings, approaches, theories, and results is now available. Therefore, one may find it difficult to identify trends, successful paths, gaps, and opportunities for contribution. The present literature review aims to provide the state of research regarding deception detection with Machine Learning. We followed the PRISMA protocol and retrieved 648 articles from ACM Digital Library, IEEE Xplore, Scopus, and Web of Science. 540 of them were screened (108 were duplicates). A final corpus of 81 documents has been summarized as mind maps. Metadata was extracted and has been encoded as Python dictionaries to support a statistical analysis scripted in the Python programming language, and available as a collection of Jupyter Lab Notebooks in a GitHub repository. Neural Networks, Support Vector Machines, Random Forest, Decision Tree and K-nearest Neighbor are the five most explored techniques. The studies report a detection performance ranging from 51% to 100%, with 19 works reaching an accuracy rate above 0.9. Monomodal, Bimodal, and Multimodal approaches were exploited and achieved various accuracy levels for detection. Bimodal and Multimodal approaches have become a trend over Monomodal ones, although there are high-performance examples of the latter. Of the studies that exploit language and linguistic features, 75% are dedicated to English. The findings include observations of the following: language and culture, emotional features, psychological traits, cognitive load, facial cues, complexity, performance, and Machine Learning topics. We also present a dataset benchmark. Main conclusions are that labeled datasets from real-life data are scarce. Also, there is still room for new approaches for deception detection with Machine Learning, especially if focused on languages and cultures other than English-based. 
Further research would greatly contribute by providing new labeled and multimodal datasets for deception detection, both for English and other languages.
|
33
|
Danel T, Łęski J, Podlewska S, Podolak IT. Docking-based generative approaches in the search for new drug candidates. Drug Discov Today 2023; 28:103439. [PMID: 36372330 DOI: 10.1016/j.drudis.2022.103439] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 06/30/2022] [Revised: 10/08/2022] [Accepted: 11/08/2022] [Indexed: 11/13/2022]
Abstract
Despite the popularity of virtual screening (VS) of existing compound libraries, the search for new potential drug candidates also takes advantage of generative protocols, where new compound suggestions are enumerated using various algorithms. To increase the activity potency of generative approaches, they have recently been coupled with molecular docking, a leading methodology of structure-based drug design (SBDD). In this review, we summarize progress since docking-based generative models emerged. We propose a new taxonomy for these methods and discuss their importance for the field of computer-aided drug design (CADD). In addition, we discuss the most promising directions for the further development of generative protocols coupled with docking.
Affiliation(s)
- Tomasz Danel
- Faculty of Mathematics and Computer Science, Jagiellonian University, 6 Łojasiewicza Street, 30-348 Kraków, Poland
- Jan Łęski
- Faculty of Mathematics and Computer Science, Jagiellonian University, 6 Łojasiewicza Street, 30-348 Kraków, Poland
- Sabina Podlewska
- Maj Institute of Pharmacology, Polish Academy of Sciences, Department of Medicinal Chemistry, 31-343 Kraków, Smętna Street 12, Poland
- Igor T Podolak
- Faculty of Mathematics and Computer Science, Jagiellonian University, 6 Łojasiewicza Street, 30-348 Kraków, Poland
|
34
|
Choi J, Seo S, Park S. COMA: efficient structure-constrained molecular generation using contractive and margin losses. J Cheminform 2023; 15:8. [PMID: 36658602 PMCID: PMC9850577 DOI: 10.1186/s13321-023-00679-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2022] [Accepted: 01/04/2023] [Indexed: 01/20/2023] Open
Abstract
BACKGROUND Structure-constrained molecular generation is a promising approach to drug discovery. The goal of structure-constrained molecular generation is to produce a novel molecule that is similar to a given source molecule (e.g. hit molecules) but has enhanced chemical properties (for lead optimization). Many structure-constrained molecular generation models with superior performance in improving chemical properties have been proposed; however, they still have difficulty producing many novel molecules that satisfy both the high structural similarities to each source molecule and improved molecular properties. METHODS We propose a structure-constrained molecular generation model that utilizes contractive and margin loss terms to simultaneously achieve property improvement and high structural similarity. The proposed model has two training phases; a generator first learns molecular representation vectors using metric learning with contractive and margin losses and then explores optimized molecular structure for target property improvement via reinforcement learning. RESULTS We demonstrate the superiority of our proposed method by comparing it with various state-of-the-art baselines and through ablation studies. Furthermore, we demonstrate the use of our method in drug discovery using an example of sorafenib-like molecular generation in patients with drug resistance.
Affiliation(s)
- Jonghwan Choi
- Department of Computer Science, Yonsei University, Yonsei-ro 50, 03722 Seoul, Republic of Korea (grid.15444.30; ISNI 0000 0004 0470 5454); UBLBio Corporation, Yeongtong-ro 237, 16679 Suwon, Gyeonggi-do, Republic of Korea
- Sangmin Seo
- Department of Computer Science, Yonsei University, Yonsei-ro 50, 03722 Seoul, Republic of Korea (grid.15444.30; ISNI 0000 0004 0470 5454); UBLBio Corporation, Yeongtong-ro 237, 16679 Suwon, Gyeonggi-do, Republic of Korea
- Sanghyun Park
- Department of Computer Science, Yonsei University, Yonsei-ro 50, 03722 Seoul, Republic of Korea (grid.15444.30; ISNI 0000 0004 0470 5454)
|
35
|
Chan L, Kumar R, Verdonk M, Poelking C. A multilevel generative framework with hierarchical self-contrasting for bias control and transparency in structure-based ligand design. Nat Mach Intell 2022. [DOI: 10.1038/s42256-022-00564-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/12/2022]
|
36
|
Bort W, Mazitov D, Horvath D, Bonachera F, Lin A, Marcou G, Baskin I, Madzhidov T, Varnek A. Inverse QSAR: Reversing Descriptor-Driven Prediction Pipeline Using Attention-Based Conditional Variational Autoencoder. J Chem Inf Model 2022; 62:5471-5484. [PMID: 36332178 DOI: 10.1021/acs.jcim.2c01086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Abstract
In order to better formalize it, the notorious inverse-QSAR problem (finding structures of given QSAR-predicted properties) is considered in this paper as a two-step process including (i) finding "seed" descriptor vectors corresponding to user-constrained QSAR model output values and (ii) identifying the chemical structures best matching the "seed" vectors. The main development effort here was focused on the latter stage, proposing a new attention-based conditional variational autoencoder neural-network architecture based on recent developments in attention-based methods. The obtained results show that this workflow was capable of generating compounds predicted to display desired activity while being completely novel compared to the training database (ChEMBL). Moreover, the generated compounds show acceptable druglikeness and synthetic accessibility. Both pharmacophore and docking studies were carried out as "orthogonal" in silico validation methods, proving that some of the de novo structures are, beyond being predicted active by 2D-QSAR models, clearly able to match binding 3D pharmacophores and bind the protein pocket.
Affiliation(s)
- William Bort
- Laboratory of Chemoinformatics, UMR 7140 University of Strasbourg/CNRS, 4 rue Blaise Pascal, 67000 Strasbourg, France
- Daniyar Mazitov
- Laboratory of Chemoinformatics and Molecular Modeling, A. M. Butlerov Institute of Chemistry, Kazan Federal University, 18, Kremlyovskaya str., 420008 Kazan, Russia
- Dragos Horvath
- Laboratory of Chemoinformatics, UMR 7140 University of Strasbourg/CNRS, 4 rue Blaise Pascal, 67000 Strasbourg, France
- Fanny Bonachera
- Laboratory of Chemoinformatics, UMR 7140 University of Strasbourg/CNRS, 4 rue Blaise Pascal, 67000 Strasbourg, France
- Arkadii Lin
- Laboratory of Chemoinformatics, UMR 7140 University of Strasbourg/CNRS, 4 rue Blaise Pascal, 67000 Strasbourg, France
- Gilles Marcou
- Laboratory of Chemoinformatics, UMR 7140 University of Strasbourg/CNRS, 4 rue Blaise Pascal, 67000 Strasbourg, France
- Igor Baskin
- Department of Material Science and Engineering, Technion-Israel Institute of Technology, 3200003 Haifa, Israel
- Timur Madzhidov
- Laboratory of Chemoinformatics and Molecular Modeling, A. M. Butlerov Institute of Chemistry, Kazan Federal University, 18, Kremlyovskaya str., 420008 Kazan, Russia
- Alexandre Varnek
- Laboratory of Chemoinformatics, UMR 7140 University of Strasbourg/CNRS, 4 rue Blaise Pascal, 67000 Strasbourg, France
|
37
|
Unterhuber M, Kresoja KP, Lurz P, Thiele H. Artificial intelligence in proteomics: new frontiers from risk prediction to treatment? Eur Heart J 2022; 43:4525-4527. [PMID: 35869894 DOI: 10.1093/eurheartj/ehac391] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/14/2022] Open
Affiliation(s)
- Matthias Unterhuber
- Department of Cardiology, Heart Center Leipzig at University of Leipzig, Strümpellstraße 39, Leipzig 04289, Germany
- Karl-Patrik Kresoja
- Department of Cardiology, Heart Center Leipzig at University of Leipzig, Strümpellstraße 39, Leipzig 04289, Germany
- Philipp Lurz
- Department of Cardiology, Heart Center Leipzig at University of Leipzig, Strümpellstraße 39, Leipzig 04289, Germany
- Holger Thiele
- Department of Cardiology, Heart Center Leipzig at University of Leipzig, Strümpellstraße 39, Leipzig 04289, Germany; Leipzig Heart Institute at Heart Center Leipzig, Russenstraße 69A, Leipzig 04289, Germany
|
38
|
Li Y, Zhang L, Wang Y, Zou J, Yang R, Luo X, Wu C, Yang W, Tian C, Xu H, Wang F, Yang X, Li L, Yang S. Generative deep learning enables the discovery of a potent and selective RIPK1 inhibitor. Nat Commun 2022; 13:6891. [PMID: 36371441 PMCID: PMC9653409 DOI: 10.1038/s41467-022-34692-w] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 04/12/2022] [Accepted: 11/03/2022] [Indexed: 11/13/2022] Open
Abstract
The retrieval of hit/lead compounds with novel scaffolds during early drug development is an important but challenging task. Various generative models have been proposed to create drug-like molecules. However, the capacity of these generative models to design wet-lab-validated and target-specific molecules with novel scaffolds has hardly been verified. We herein propose a generative deep learning (GDL) model, a distribution-learning conditional recurrent neural network (cRNN), to generate tailor-made virtual compound libraries for given biological targets. The GDL model is then applied to RIPK1. Virtual screening against the generated tailor-made compound library and subsequent bioactivity evaluation lead to the discovery of a potent and selective RIPK1 inhibitor with a previously unreported scaffold, RI-962. This compound displays potent in vitro activity in protecting cells from necroptosis, and good in vivo efficacy in two inflammatory models. Collectively, the findings prove the capacity of our GDL model in generating hit/lead compounds with unreported scaffolds, highlighting a great potential of deep learning in drug discovery.
Affiliation(s)
- Yueshan Li
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Liting Zhang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Yifei Wang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Jun Zou
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Ruicheng Yang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Xinling Luo
- Key Laboratory of Drug Targeting and Drug Delivery System of Ministry of Education, West China School of Pharmacy, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Chengyong Wu
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Wei Yang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Chenyu Tian
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Haixing Xu
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Falu Wang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Xin Yang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Linli Li
- Key Laboratory of Drug Targeting and Drug Delivery System of Ministry of Education, West China School of Pharmacy, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Shengyong Yang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
|
39
|
Bieniek MK, Cree B, Pirie R, Horton JT, Tatum NJ, Cole DJ. An open-source molecular builder and free energy preparation workflow. Commun Chem 2022; 5:136. [PMID: 36320862 PMCID: PMC9607723 DOI: 10.1038/s42004-022-00754-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2022] [Accepted: 10/11/2022] [Indexed: 01/27/2023] Open
Abstract
Automated free energy calculations for the prediction of binding free energies of congeneric series of ligands to a protein target are growing in popularity, but building reliable initial binding poses for the ligands is challenging. Here, we introduce the open-source FEgrow workflow for building user-defined congeneric series of ligands in protein binding pockets for input to free energy calculations. For a given ligand core and receptor structure, FEgrow enumerates and optimises the bioactive conformations of the grown functional group(s), making use of hybrid machine learning/molecular mechanics potential energy functions where possible. Low energy structures are optionally scored using the gnina convolutional neural network scoring function, and output for more rigorous protein-ligand binding free energy predictions. We illustrate use of the workflow by building and scoring binding poses for ten congeneric series of ligands bound to targets from a standard, high quality dataset of protein-ligand complexes. Furthermore, we build a set of 13 inhibitors of the SARS-CoV-2 main protease from the literature, and use free energy calculations to retrospectively compute their relative binding free energies. FEgrow is freely available at https://github.com/cole-group/FEgrow, along with a tutorial.
Affiliation(s)
- Mateusz K. Bieniek
- School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, NE1 7RU UK
- Ben Cree
- School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, NE1 7RU UK
- Rachael Pirie
- School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, NE1 7RU UK
- Joshua T. Horton
- School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, NE1 7RU UK
- Natalie J. Tatum
- Newcastle University Centre for Cancer, Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne, NE2 4HH UK
- Daniel J. Cole
- School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, NE1 7RU UK
|
40
|
Naseri P, Goussetis G, Fonseca NJG, Hum SV. Synthesis of multi-band reflective polarizing metasurfaces using a generative adversarial network. Sci Rep 2022; 12:17006. [PMID: 36220834 PMCID: PMC9554045 DOI: 10.1038/s41598-022-20851-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2022] [Accepted: 09/20/2022] [Indexed: 12/03/2022] Open
Abstract
Electromagnetic linear-to-circular polarization converters with wide- and multi-band capabilities can simplify antenna systems where circular polarization is required. Multi-band solutions are attractive in satellite communication systems, which commonly have the additional requirement that the sense of polarization is reversed between adjacent bands. However, the design of these structures using conventional ad hoc methods relies heavily on empirical methods. Here, we employ a data-driven approach integrated with a generative adversarial network to explore the design space of the polarizer meta-atom thoroughly. Dual-band and triple-band reflective polarizers with stable performance over incident angles up to and including 30°, corresponding to typical reflector antenna system requirements, are synthesized using the proposed method. The feasibility and performance of the designed polarizer are validated through measurements of a fabricated prototype.
Affiliation(s)
- Parinaz Naseri
- The Edward S. Rogers Sr. Department of Electrical & Computer Engineering, University of Toronto, Toronto, Canada
- George Goussetis
- Institute of Sensors Signals and Systems, Heriot-Watt University, Edinburgh, Scotland
- Nelson J G Fonseca
- Antenna and Sub-Millimetre Waves Section, European Space Agency (ESA), Noordwijk, The Netherlands
- Sean V Hum
- The Edward S. Rogers Sr. Department of Electrical & Computer Engineering, University of Toronto, Toronto, Canada
|
41
|
Interpretable Machine Learning Models for Molecular Design of Tyrosine Kinase Inhibitors Using Variational Autoencoders and Perturbation-Based Approach of Chemical Space Exploration. Int J Mol Sci 2022; 23:ijms231911262. [PMID: 36232566 PMCID: PMC9569663 DOI: 10.3390/ijms231911262] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/29/2022] [Revised: 09/21/2022] [Accepted: 09/21/2022] [Indexed: 11/17/2022] Open
Abstract
In the current study, we introduce an integrative machine learning strategy for the autonomous molecular design of protein kinase inhibitors using variational autoencoders and a novel cluster-based perturbation approach for exploration of the chemical latent space. The proposed strategy combines autoencoder-based embedding of small molecules with a cluster-based perturbation approach for efficient navigation of the latent space and a feature-based kinase inhibition likelihood classifier that guides optimization of the molecular properties and targeted molecular design. In the proposed generative approach, molecules sharing similar structures tend to cluster in the latent space, and interpolating between two molecules in the latent space enables smooth changes in the molecular structures and properties. The results demonstrated that the proposed strategy can efficiently explore the latent space of small molecules and kinase inhibitors along interpretable directions to guide the generation of novel family-specific kinase molecules that display a significant scaffold diversity and optimal biochemical properties. Through assessment of the latent-based and chemical feature-based binary and multiclass classifiers, we developed a robust probabilistic evaluator of kinase inhibition likelihood that is specifically tailored to guide the molecular design of novel SRC kinase molecules. The generated molecules originating from LCK and ABL1 kinase inhibitors yielded ~40% of novel and valid SRC kinase compounds with high kinase inhibition likelihood probability values (p > 0.75) and high similarity (Tanimoto coefficient > 0.6) to the known SRC inhibitors. 
By combining the molecular perturbation design with the kinase inhibition likelihood analysis and similarity assessments, we showed that the proposed molecular design strategy can produce novel valid molecules and transform known inhibitors of different kinase families into potential chemical probes of the SRC kinase with excellent physicochemical profiles and high similarity to the known SRC kinase drugs. The results of our study suggest that task-specific manipulation of a biased latent space may be an important direction for more effective task-oriented and target-specific autonomous chemical design models.
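The two selection thresholds quoted in this abstract (inhibition likelihood p > 0.75 and Tanimoto coefficient > 0.6 to known SRC inhibitors) can be illustrated with a minimal sketch. It is not the authors' code: fingerprints are represented here as plain sets of "on" bit indices, and the function names `tanimoto` and `passes_filter` are hypothetical.

```python
from typing import Iterable, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0  # convention assumed here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def passes_filter(
    candidate: Set[int],
    likelihood: float,
    references: Iterable[Set[int]],
    p_min: float = 0.75,    # likelihood threshold quoted in the abstract
    sim_min: float = 0.6,   # similarity threshold quoted in the abstract
) -> bool:
    """Keep a candidate only if both thresholds are strictly exceeded."""
    # Best similarity to any known reference compound (0.0 if there are none).
    best = max((tanimoto(candidate, ref) for ref in references), default=0.0)
    return likelihood > p_min and best > sim_min
```

In practice the bit sets would come from a cheminformatics toolkit and the likelihood from a trained classifier; both are outside the scope of this sketch.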
|
42
|
Li C, Wang C, Sun M, Zeng Y, Yuan Y, Gou Q, Wang G, Guo Y, Pu X. Correlated RNN Framework to Quickly Generate Molecules with Desired Properties for Energetic Materials in the Low Data Regime. J Chem Inf Model 2022; 62:4873-4887. [PMID: 35998331 DOI: 10.1021/acs.jcim.2c00997] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/10/2023]
Abstract
Motivated by the challenge of deep learning in the low-data regime and the urgent demand for intelligent design of highly energetic materials, we explore a correlated deep learning framework, which consists of three recurrent neural networks (RNNs) correlated by the transfer learning strategy, to efficiently generate new energetic molecules with a high detonation velocity when very limited data are available. To avoid dependence on an external big data set, data augmentation by fragment shuffling of 303 energetic compounds is utilized to produce 500,000 molecules to pretrain the RNN, through which the model can learn sufficient structure knowledge. Then the pretrained RNN is fine-tuned on the 303 energetic compounds to generate 7153 molecules similar to the energetic compounds. In order to more reliably screen the molecules with a high detonation velocity, SMILES enumeration augmentation coupled with the pretrained knowledge is utilized to build an RNN-based prediction model, through which R² is boosted from 0.4446 to 0.9572. The comparable performance with the transfer learning strategy based on an existing big database (ChEMBL) to produce the energetic molecules and drug-like ones further supports the effectiveness and generality of our strategy in the low data regime. High-precision quantum mechanics calculations further confirm that 35 new molecules present a higher detonation velocity and lower synthetic accessibility than the classic explosive RDX, along with good thermal stability. In particular, three new molecules are comparable to caged CL-20 in the detonation velocity. All the source codes and the data set are freely available at https://github.com/wangchenghuidream/RNNMGM.
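The abstract reports prediction quality as the coefficient of determination, R². As a reminder of what that number measures, here is a minimal sketch using the standard definition R² = 1 − SS_res / SS_tot; the function name `r_squared` is an illustrative assumption, and the abstract does not state which implementation the authors used.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot
```

A value of 1 means perfect predictions, and 0 means the model does no better than predicting the mean of the observed values.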
Affiliation(s)
- Chuan Li
- College of Computer Science, Sichuan University, Chengdu 610064, China
- Chenghui Wang
- College of Computer Science, Sichuan University, Chengdu 610064, China
- Ming Sun
- College of Chemistry, Sichuan University, Chengdu 610064, China
- Yan Zeng
- College of Computer Science, Sichuan University, Chengdu 610064, China
- Yuan Yuan
- College of Management, Southwest University for Nationalities, Chengdu 610041, China
- Qiaolin Gou
- College of Chemistry, Sichuan University, Chengdu 610064, China
- Guangchuan Wang
- College of Computer Science, Sichuan University, Chengdu 610064, China
- Yanzhi Guo
- College of Chemistry, Sichuan University, Chengdu 610064, China
- Xuemei Pu
- College of Chemistry, Sichuan University, Chengdu 610064, China
|
43
|
Prediction and Design of Cyclodextrin Inclusion Complexes formation via Machine Learning-based Strategies. Chem Eng Sci 2022. [DOI: 10.1016/j.ces.2022.117946] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
|
44
|
Xie W, Wang F, Li Y, Lai L, Pei J. Advances and Challenges in De Novo Drug Design Using Three-Dimensional Deep Generative Models. J Chem Inf Model 2022; 62:2269-2279. [PMID: 35544331 DOI: 10.1021/acs.jcim.2c00042] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Indexed: 11/30/2022]
Abstract
A persistent goal for de novo drug design is to generate novel chemical compounds with desirable properties in a labor-, time-, and cost-efficient manner. Deep generative models provide alternative routes to this goal. Numerous model architectures and optimization strategies have been explored in recent years, most of which have been developed to generate two-dimensional molecular structures. Some generative models aiming at three-dimensional (3D) molecule generation have also been proposed, gaining attention for their unique advantages and potential to directly design drug-like molecules in a target-conditioning manner. This review highlights current developments in 3D molecular generative models combined with deep learning and discusses future directions for de novo drug design.
Affiliation(s)
- Weixin Xie
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Fanhao Wang
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Yibo Li
- Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Luhua Lai
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China; Peking-Tsinghua Center for Life Science at BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Jianfeng Pei
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
|
45
|
Lu C, Liu S, Shi W, Yu J, Zhou Z, Zhang X, Lu X, Cai F, Xia N, Wang Y. Systemic evolutionary chemical space exploration for drug discovery. J Cheminform 2022; 14:19. [PMID: 35365231 PMCID: PMC8973791 DOI: 10.1186/s13321-022-00598-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2021] [Accepted: 03/11/2022] [Indexed: 11/29/2022] Open
Abstract
Chemical space exploration is a major task of the hit-finding process during the pursuit of novel chemical entities. Compared with other screening technologies, computational de novo design has become a popular approach to overcome the limitation of current chemical libraries. Here, we report a de novo design platform named systemic evolutionary chemical space explorer (SECSE). The platform was conceptually inspired by fragment-based drug design, miniaturizing a “lego-building” process within the pocket of a certain target. The key to virtual hit generation was thereby turned into a computational search problem. To enhance search and optimization, human intelligence and deep learning were integrated. Application of SECSE against phosphoglycerate dehydrogenase (PHGDH) proved its potential in finding novel and diverse small molecules that are attractive starting points for further validation. This platform is open-sourced and the code is available at http://github.com/KeenThera/SECSE.
Affiliation(s)
- Chong Lu
- Keen Therapeutics Co., Ltd., Shanghai, China
- Shien Liu
- Keen Therapeutics Co., Ltd., Shanghai, China
- Weihua Shi
- Keen Therapeutics Co., Ltd., Shanghai, China
- Jun Yu
- Keen Therapeutics Co., Ltd., Shanghai, China
- Zhou Zhou
- Keen Therapeutics Co., Ltd., Shanghai, China
- Xiaoli Lu
- Keen Therapeutics Co., Ltd., Shanghai, China
- Faji Cai
- Keen Therapeutics Co., Ltd., Shanghai, China
- Yikai Wang
- Keen Therapeutics Co., Ltd., Shanghai, China
|
46
|
Cao JS, Xu RZ, Luo JY, Feng Q, Fang F. Rapid quantification of intracellular polyhydroxyalkanoates via fluorescence techniques: A critical review. Bioresour Technol 2022; 350:126906. [PMID: 35227918 DOI: 10.1016/j.biortech.2022.126906] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/05/2022] [Revised: 02/14/2022] [Accepted: 02/22/2022] [Indexed: 06/14/2023]
Abstract
Polyhydroxyalkanoates (PHA) are promising bioplastics with excellent physicochemical properties and biodegradability, whereas PHA products suffer from high manufacturing costs. To reduce costs of PHA production, experiments with mixed microbial cultures and low-cost substrates have been conducted widely, where rapid and robust PHA quantification methods are necessary. Compared with traditional gas chromatography methods, PHA fluorescence quantification (PHA-FQ) methods may be quicker, safer and more suitable for modern experiments with high throughput requirements. However, practical applications of PHA-FQ methods are still limited. Therefore, this review provides a comprehensive understanding of PHA-FQ methods. Performance of PHA-staining fluorochromes, relevant spectral properties, and important staining procedures are summarized. Current developments of PHA-FQ protocols are critically reviewed. Main considerations needed to make PHA-FQ protocol reliable are comprehensively discussed. Finally, potential improvements in various aspects of PHA-FQ methods are highlighted. This review could help researchers develop more effective PHA-FQ methods and facilitate future experiments related to PHA.
Affiliation(s)
- Jia-Shun Cao
- Key Laboratory of Integrated Regulation and Resource Development on Shallow Lakes, Ministry of Education, College of Environment, Hohai University, Nanjing 210098, China
- Run-Ze Xu
- Key Laboratory of Integrated Regulation and Resource Development on Shallow Lakes, Ministry of Education, College of Environment, Hohai University, Nanjing 210098, China
- Jing-Yang Luo
- Key Laboratory of Integrated Regulation and Resource Development on Shallow Lakes, Ministry of Education, College of Environment, Hohai University, Nanjing 210098, China
- Qian Feng
- Key Laboratory of Integrated Regulation and Resource Development on Shallow Lakes, Ministry of Education, College of Environment, Hohai University, Nanjing 210098, China
- Fang Fang
- Key Laboratory of Integrated Regulation and Resource Development on Shallow Lakes, Ministry of Education, College of Environment, Hohai University, Nanjing 210098, China
|
47
|
Creanza TM, Lamanna G, Delre P, Contino M, Corriero N, Saviano M, Mangiatordi GF, Ancona N. DeLA-Drug: A Deep Learning Algorithm for Automated Design of Druglike Analogues. J Chem Inf Model 2022; 62:1411-1424. [PMID: 35294184 DOI: 10.1021/acs.jcim.2c00205] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Indexed: 12/12/2022]
Abstract
In this paper, we present a deep learning algorithm for automated design of druglike analogues (DeLA-Drug), a recurrent neural network (RNN) model composed of two long short-term memory (LSTM) layers and conceived for data-driven generation of similar-to-bioactive compounds. DeLA-Drug captures the syntax of SMILES strings of more than 1 million compounds belonging to the ChEMBL28 database and, by employing a new strategy called sampling with substitutions (SWS), generates molecules starting from a single user-defined query compound. Remarkably, the algorithm preserves druglikeness and synthetic accessibility of the known bioactive compounds present in the ChEMBL28 repository. The absence of any time-demanding fine-tuning procedure enables DeLA-Drug to perform a fast generation of focused libraries for further high-throughput screening and makes it a suitable tool for performing de novo design even in low-data regimes. To provide a concrete idea of its applicability, DeLA-Drug was applied to the cannabinoid receptor subtype 2 (CB2R), a known target involved in different pathological conditions such as cancer and neurodegeneration. DeLA-Drug, available as a free web platform (http://www.ba.ic.cnr.it/softwareic/deladrugportal/), can help medicinal chemists interested in generating analogues of compounds already available in their laboratories and, for this reason, good candidates for an easy and low-cost synthesis.
Collapse
Affiliation(s)
- Teresa Maria Creanza
- CNR─Institute of Intelligent Industrial Technologies and Systems for Advanced Manufacturing, Via Amendola 122/o, 70126 Bari, Italy
- Giuseppe Lamanna
- Chemistry Department, University of Bari "Aldo Moro", via E. Orabona, 4, I-70125 Bari, Italy; CNR─Institute of Crystallography, Via Amendola 122/o, 70126 Bari, Italy
- Pietro Delre
- Chemistry Department, University of Bari "Aldo Moro", via E. Orabona, 4, I-70125 Bari, Italy; CNR─Institute of Crystallography, Via Amendola 122/o, 70126 Bari, Italy
- Marialessandra Contino
- Department of Pharmacy─Pharmaceutical Sciences, University of Bari "Aldo Moro", via E. Orabona, 4, I-70125 Bari, Italy
- Nicola Corriero
- CNR─Institute of Crystallography, Via Amendola 122/o, 70126 Bari, Italy
- Michele Saviano
- CNR─Institute of Crystallography, Via Amendola 122/o, 70126 Bari, Italy
- Nicola Ancona
- CNR─Institute of Intelligent Industrial Technologies and Systems for Advanced Manufacturing, Via Amendola 122/o, 70126 Bari, Italy
|
48
|
De Novo Molecular Design of Caspase-6 Inhibitors by a GRU-Based Recurrent Neural Network Combined with a Transfer Learning Approach. Pharmaceuticals (Basel) 2021; 14:ph14121249. [PMID: 34959651 PMCID: PMC8706867 DOI: 10.3390/ph14121249] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/04/2021] [Revised: 11/21/2021] [Accepted: 11/24/2021] [Indexed: 12/31/2022] Open
Abstract
Due to their potential in the treatment of neurodegenerative diseases, caspase-6 inhibitors have attracted widespread attention. However, existing caspase-6 inhibitors have deficiencies that restrict their clinical development and application. Therefore, there is an urgent need to develop novel caspase-6 candidate inhibitors. Herein, a gated recurrent unit (GRU)-based recurrent neural network (RNN) combined with transfer learning was used to build a molecular generative model of caspase-6 inhibitors. The results showed that the GRU-based RNN model can accurately learn the SMILES grammar of about 2.4 million chemical molecules, including ionic and isomeric compounds, and can generate potential caspase-6 inhibitors after transfer learning on 433 known caspase-6 inhibitors. For the novel molecules derived from the generative model, an optimal logistic regression model and Surflex-dock were employed to predict and rank inhibitory activity. Based on these predictions, three potential caspase-6 inhibitors with different scaffolds were selected as promising candidates for further research. Overall, this paper provides an efficient combined strategy for de novo molecular design of caspase-6 inhibitors.
|