1
Guichaoua G, Pinel P, Hoffmann B, Azencott CA, Stoven V. Drug-Target Interactions Prediction at Scale: The Komet Algorithm with the LCIdb Dataset. J Chem Inf Model 2024. [PMID: 39237105 DOI: 10.1021/acs.jcim.4c00422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/07/2024]
Abstract
Drug-target interactions (DTIs) prediction algorithms are used at various stages of the drug discovery process. In this context, specific problems such as deorphanization of a new therapeutic target or target identification of a drug candidate arising from phenotypic screens require large-scale predictions across the protein and molecule spaces. DTI prediction heavily relies on supervised learning algorithms that use known DTIs to learn associations between molecule and protein features, allowing for the prediction of new interactions based on learned patterns. The algorithms must be broadly applicable to enable reliable predictions, even in regions of the protein or molecule spaces where data may be scarce. In this paper, we address two key challenges to fulfill these goals: building large, high-quality training datasets and designing prediction methods that can scale, in order to be trained on such large datasets. First, we introduce LCIdb, a curated, large-sized dataset of DTIs, offering extensive coverage of both the molecule and druggable protein spaces. Notably, LCIdb contains a much higher number of molecules than publicly available benchmarks, expanding coverage of the molecule space. Second, we propose Komet (Kronecker Optimized METhod), a DTI prediction pipeline designed for scalability without compromising performance. Komet leverages a three-step framework, incorporating efficient computation choices tailored for large datasets and involving the Nyström approximation. Specifically, Komet employs a Kronecker interaction module for (molecule, protein) pairs, which efficiently captures determinants in DTIs, and whose structure allows for reduced computational complexity and quasi-Newton optimization, ensuring that the model can handle large training sets, without compromising on performance. Our method is implemented in open-source software, leveraging GPU parallel computation for efficiency. 
We demonstrate the interest of our pipeline on various datasets, showing that Komet displays superior scalability and prediction performance compared to state-of-the-art deep learning approaches. Additionally, we illustrate the generalization properties of Komet by showing its performance on an external dataset, and on the publicly available LH benchmark designed for scaffold hopping problems. Komet is available open source at https://komet.readthedocs.io and all datasets, including LCIdb, can be found at https://zenodo.org/records/10731712.
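The abstract above attributes part of Komet's scalability to the Kronecker structure of its interaction module. The following is a minimal sketch of the general linear-algebra identity that makes such structure cheap, not the Komet implementation: a product with a Kronecker matrix can be computed from the two small factors without ever forming the large matrix. The function name `kron_matvec` and the toy shapes are assumptions made for illustration.

```python
import numpy as np


def kron_matvec(A, B, x):
    """Compute (A kron B) @ x without materialising the Kronecker product.

    A has shape (m, n), B has shape (p, q), x has length n * q.
    Uses the row-major identity (A kron B) vec(X) = vec(A X B^T),
    where X is x reshaped to (n, q).
    """
    n = A.shape[1]
    q = B.shape[1]
    X = x.reshape(n, q)               # x[j * q + l] == X[j, l]
    return (A @ X @ B.T).reshape(-1)  # flatten the (m, p) result row by row


# Toy check against the explicit (and much larger) Kronecker matrix.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 4))   # hypothetical molecule-side factor
B = rng.normal(size=(2, 5))   # hypothetical protein-side factor
x = rng.normal(size=4 * 5)
fast = kron_matvec(A, B, x)
slow = np.kron(A, B) @ x
```

The explicit product costs memory proportional to `m * p * n * q`, while the factored form only touches the two small matrices, which is the kind of saving that matters on large training sets.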
Affiliation(s)
- Gwenn Guichaoua
  - Center for Computational Biology (CBIO), Mines Paris-PSL, 75006 Paris, France
  - Institut Curie, Université PSL, 75005 Paris, France
  - INSERM U900, 75005 Paris, France
- Philippe Pinel
  - Center for Computational Biology (CBIO), Mines Paris-PSL, 75006 Paris, France
  - Institut Curie, Université PSL, 75005 Paris, France
  - INSERM U900, 75005 Paris, France
  - Iktos SAS, 75017 Paris, France
- Chloé-Agathe Azencott
  - Center for Computational Biology (CBIO), Mines Paris-PSL, 75006 Paris, France
  - Institut Curie, Université PSL, 75005 Paris, France
  - INSERM U900, 75005 Paris, France
- Véronique Stoven
  - Center for Computational Biology (CBIO), Mines Paris-PSL, 75006 Paris, France
  - Institut Curie, Université PSL, 75005 Paris, France
  - INSERM U900, 75005 Paris, France
2
Bongini P, Pancino N, Bendjeddou A, Scarselli F, Maggini M, Bianchini M. Composite Graph Neural Networks for Molecular Property Prediction. Int J Mol Sci 2024; 25:6583. [PMID: 38928289 PMCID: PMC11203616 DOI: 10.3390/ijms25126583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Revised: 05/21/2024] [Accepted: 06/11/2024] [Indexed: 06/28/2024] Open
Abstract
Graph Neural Networks have proven to be very valuable models for the solution of a wide variety of problems on molecular graphs, as well as in many other research fields involving graph-structured data. Molecules are heterogeneous graphs composed of atoms of different species. Composite graph neural networks process heterogeneous graphs with multiple-state-updating networks, each one dedicated to a particular node type. This approach allows for the extraction of information from a graph more efficiently than standard graph neural networks that distinguish node types through a one-hot encoded type vector. We carried out extensive experimentation on eight molecular graph datasets and on a large number of both classification and regression tasks. The results we obtained clearly show that composite graph neural networks are far more efficient in this setting than standard graph neural networks.
Affiliation(s)
- Monica Bianchini
  - Department of Information Engineering and Mathematics, University of Siena, 53100 Siena, Italy; (P.B.); (N.P.); (A.B.); (F.S.); (M.M.)
3
Aksamit N, Hou J, Li Y, Ombuki-Berman B. Integrating transformers and many-objective optimization for drug design. BMC Bioinformatics 2024; 25:208. [PMID: 38849719 PMCID: PMC11161990 DOI: 10.1186/s12859-024-05822-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2024] [Accepted: 05/30/2024] [Indexed: 06/09/2024] Open
Abstract
BACKGROUND Drug design is a challenging and important task that requires the generation of novel and effective molecules that can bind to specific protein targets. Artificial intelligence algorithms have recently shown promising potential to expedite the drug design process. However, existing methods adopt multi-objective approaches, which limit the number of objectives. RESULTS In this paper, we expand this thread of research from the many-objective perspective, by proposing a novel framework that integrates a latent Transformer-based model for molecular generation with a drug design system that incorporates absorption, distribution, metabolism, excretion, and toxicity prediction, molecular docking, and many-objective metaheuristics. We compared the performance of two latent Transformer models (ReLSO and FragNet) on a molecular generation task and show that ReLSO outperforms FragNet in terms of reconstruction and latent space organization. We then explored six different many-objective metaheuristics based on evolutionary algorithms and particle swarm optimization on a drug design task involving potential drug candidates to human lysophosphatidic acid receptor 1, a cancer-related protein target. CONCLUSION We show that the multi-objective evolutionary algorithm based on dominance and decomposition performs the best in terms of finding molecules that satisfy many objectives, such as high binding affinity, low toxicity, and high drug-likeness. Our framework demonstrates the potential of combining Transformers and many-objective computational intelligence for drug design.
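The metaheuristics named in this abstract rely on Pareto dominance to compare candidates across several objectives at once. As a minimal sketch of that underlying concept only (not of the paper's framework or any of its six algorithms), the code below filters a set of candidates down to its non-dominated front. It assumes every objective is to be minimised; the function names and the toy scores are hypothetical.

```python
def dominates(a, b):
    """Return True if objective vector a Pareto-dominates b (minimisation).

    a dominates b when a is no worse on every objective and strictly
    better on at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (docking score, toxicity) pairs, both to be minimised.
candidates = [(1, 5), (2, 2), (3, 3), (5, 1)]
front = pareto_front(candidates)
```

Here `(3, 3)` is dropped because `(2, 2)` is better on both objectives, while the other three candidates represent different trade-offs and all remain on the front.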
Affiliation(s)
- Nicholas Aksamit
  - Department of Computer Science, Brock University, 1812 Sir Isaac Brock Way, St. Catharines, ON, L2S 3A1, Canada
- Jinqiang Hou
  - Department of Chemistry, Lakehead University, 955 Oliver Road, Thunder Bay, ON, P7B 5E1, Canada
  - Thunder Bay Regional Health Research Institute, 980 Oliver Road, Thunder Bay, ON, P7B 6V4, Canada
- Yifeng Li
  - Department of Computer Science, Brock University, 1812 Sir Isaac Brock Way, St. Catharines, ON, L2S 3A1, Canada.
  - Department of Biological Sciences, Brock University, 1812 Sir Isaac Brock Way, St. Catharines, ON, L2S 3A1, Canada.
- Beatrice Ombuki-Berman
  - Department of Computer Science, Brock University, 1812 Sir Isaac Brock Way, St. Catharines, ON, L2S 3A1, Canada.
4
Crouzet A, Lopez N, Riss Yaw B, Lepelletier Y, Demange L. The Millennia-Long Development of Drugs Associated with the 80-Year-Old Artificial Intelligence Story: The Therapeutic Big Bang? Molecules 2024; 29:2716. [PMID: 38930784 PMCID: PMC11206022 DOI: 10.3390/molecules29122716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2024] [Revised: 05/30/2024] [Accepted: 05/31/2024] [Indexed: 06/28/2024] Open
Abstract
The journey of drug discovery (DD) has evolved from ancient practices to modern technology-driven approaches, with Artificial Intelligence (AI) emerging as a pivotal force in streamlining and accelerating the process. Despite the vital importance of DD, it faces challenges such as high costs and lengthy timelines. This review examines the historical progression and current market of DD alongside the development and integration of AI technologies. We analyse the challenges encountered in applying AI to DD, focusing on drug design and protein-protein interactions. The discussion is enriched by presenting models that put forward the application of AI in DD. Three case studies are highlighted to demonstrate the successful application of AI in DD, including the discovery of a novel class of antibiotics and a small-molecule inhibitor that has progressed to phase II clinical trials. These cases underscore the potential of AI to identify new drug candidates and optimise the development process. The convergence of DD and AI embodies a transformative shift in the field, offering a path to overcome traditional obstacles. By leveraging AI, the future of DD promises enhanced efficiency and novel breakthroughs, heralding a new era of medical innovation even though there is still a long way to go.
Affiliation(s)
- Aurore Crouzet
  - UMR 8038 CNRS CiTCoM, Team PNAS, Faculté de Pharmacie, Université Paris Cité, 4 Avenue de l’Observatoire, 75006 Paris, France
  - W-MedPhys, 128 Rue la Boétie, 75008 Paris, France
- Nicolas Lopez
  - W-MedPhys, 128 Rue la Boétie, 75008 Paris, France
  - ENOES, 62 Rue de Miromesnil, 75008 Paris, France
  - Unité Mixte de Recherche «Institut de Physique Théorique (IPhT)» CEA-CNRS, UMR 3681, Bat 774, Route de l’Orme des Merisiers, 91191 St Aubin-Gif-sur-Yvette, France
- Benjamin Riss Yaw
  - UMR 8038 CNRS CiTCoM, Team PNAS, Faculté de Pharmacie, Université Paris Cité, 4 Avenue de l’Observatoire, 75006 Paris, France
- Yves Lepelletier
  - W-MedPhys, 128 Rue la Boétie, 75008 Paris, France
  - Université Paris Cité, Imagine Institute, 24 Boulevard Montparnasse, 75015 Paris, France
  - INSERM UMR 1163, Laboratory of Cellular and Molecular Basis of Normal Hematopoiesis and Hematological Disorders: Therapeutical Implications, 24 Boulevard Montparnasse, 75015 Paris, France
- Luc Demange
  - UMR 8038 CNRS CiTCoM, Team PNAS, Faculté de Pharmacie, Université Paris Cité, 4 Avenue de l’Observatoire, 75006 Paris, France
5
|
Qiu X, Wang H, Tan X, Fang Z. G-K BertDTA: A graph representation learning and semantic embedding-based framework for drug-target affinity prediction. Comput Biol Med 2024; 173:108376. [PMID: 38552281 DOI: 10.1016/j.compbiomed.2024.108376] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 03/21/2024] [Accepted: 03/24/2024] [Indexed: 04/17/2024]
Abstract
Developing new drugs is costly, time-consuming, and risky. Drug-target affinity (DTA), indicating the binding capability between drugs and target proteins, is a crucial indicator for drug development. Accurately predicting interaction strength between new drug-target pairs by analyzing previous experiments aids in screening potential drug molecules, repurposing them, and developing safe and effective medicines. Existing computational models for DTA prediction rely on strings or single-graph neural networks, lacking consideration of protein structure and molecular semantic information, leading to limited accuracy. Our experiments demonstrate that string-based methods may overlook protein conformations, causing a high root mean square error (RMSE) of 3.584 in affinity due to a lack of spatial context. Single graph networks also underperform on topology features, with a 6% lower confidence interval (CI) for activity classification. Absent semantic information also limits generalization across diverse compounds, resulting in an 18% increase in RMSE and a 5% increase in misclassifications within the quantification study, restricting potential drug discovery. To address these limitations, we propose G-K BertDTA, a novel framework for accurate DTA prediction incorporating protein features, molecular semantic features, and molecular structural information. In this proposed model, we represent drugs as graphs, with a GIN employed to learn the molecular topological information. For the extraction of protein structural features, we utilize a DenseNet architecture. A knowledge-based BERT semantic model is incorporated to obtain rich pre-trained semantic embeddings, thereby enhancing the feature information. We extensively evaluated our proposed approach on the publicly available benchmark datasets (i.e., KIBA and Davis), and experimental results demonstrate the promising performance of our method, which consistently outperforms previous state-of-the-art approaches.
Code is available at https://github.com/AmbitYuki/G-K-BertDTA.
Affiliation(s)
- Xihe Qiu
  - School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai, China
- Haoyu Wang
  - School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai, China
- Xiaoyu Tan
  - INF Technology (Shanghai) Co., Ltd., Shanghai, China
- Zhijun Fang
  - School of Computer Science and Technology, Donghua University, Shanghai, China.
6
Sutthibutpong T, Posansee K, Liangruksa M, Termsaithong T, Piyayotai S, Phitsuwan P, Saparpakorn P, Hannongbua S, Laomettachit T. Combining Deep Learning and Structural Modeling to Identify Potential Acetylcholinesterase Inhibitors from Hericium erinaceus. ACS OMEGA 2024; 9:16311-16321. [PMID: 38617639 PMCID: PMC11007777 DOI: 10.1021/acsomega.3c10459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2023] [Revised: 02/16/2024] [Accepted: 03/13/2024] [Indexed: 04/16/2024]
Abstract
Alzheimer's disease (AD) is the most common type of dementia, affecting over 50 million people worldwide. Currently, most approved medications for AD inhibit the activity of acetylcholinesterase (AChE), but these treatments often come with harmful side effects. There is growing interest in the use of natural compounds for disease prevention, alleviation, and treatment. This trend is driven by the anticipation that these substances may incur fewer side effects than existing medications. This research presents a computational approach combining machine learning with structural modeling to discover compounds from medicinal mushrooms with a high potential to inhibit the activity of AChE. First, we developed a deep neural network capable of rapidly screening a vast number of compounds to indicate their potential to inhibit AChE activity. Subsequently, we applied deep learning models to screen the compounds in the BACMUSHBASE database, which catalogs the bioactive compounds from cultivated and wild mushroom varieties local to Thailand, resulting in the identification of five promising compounds. Next, the five identified compounds underwent molecular docking techniques to calculate the binding energy between the compounds and AChE. This allowed us to refine the selection to two compounds, erinacerin A and hericenone B. Further analysis of the binding energy patterns between these compounds and the target protein revealed that both compounds displayed binding energy profiles similar to the combined characteristics of donepezil and galanthamine, the prescription drugs for AD. We propose that these two compounds, derived from Hericium erinaceus (also known as lion's mane mushroom), are suitable candidates for further research and development into symptom-alleviating AD medications.
Affiliation(s)
- Thana Sutthibutpong
  - Center of Excellence in Theoretical and Computational Science (TaCS-CoE), Faculty of Science, King Mongkut’s University of Technology Thonburi (KMUTT), Bangkok 10140, Thailand
  - Theoretical and Computational Physics Group, Department of Physics, King Mongkut’s University of Technology Thonburi (KMUTT), Bangkok 10140, Thailand
- Kewalin Posansee
  - Theoretical and Computational Physics Group, Department of Physics, King Mongkut’s University of Technology Thonburi (KMUTT), Bangkok 10140, Thailand
- Monrudee Liangruksa
  - National Nanotechnology Center (NANOTEC), National Science and Technology Development Agency (NSTDA), Pathum Thani 12120, Thailand
- Teerasit Termsaithong
  - Center of Excellence in Theoretical and Computational Science (TaCS-CoE), Faculty of Science, King Mongkut’s University of Technology Thonburi (KMUTT), Bangkok 10140, Thailand
  - Theoretical and Computational Physics Group, Department of Physics, King Mongkut’s University of Technology Thonburi (KMUTT), Bangkok 10140, Thailand
  - Learning Institute, King Mongkut’s University of Technology Thonburi (KMUTT), Bangkok 10140, Thailand
- Supanida Piyayotai
  - Learning Institute, King Mongkut’s University of Technology Thonburi (KMUTT), Bangkok 10140, Thailand
- Paripok Phitsuwan
  - Division of Biochemical Technology, School of Bioresources and Technology, King Mongkut’s University of Technology Thonburi, Bangkok 10150, Thailand
- Supa Hannongbua
  - Department of Chemistry, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
- Teeraphan Laomettachit
  - Center of Excellence in Theoretical and Computational Science (TaCS-CoE), Faculty of Science, King Mongkut’s University of Technology Thonburi (KMUTT), Bangkok 10140, Thailand
  - Theoretical and Computational Physics Group, Department of Physics, King Mongkut’s University of Technology Thonburi (KMUTT), Bangkok 10140, Thailand
  - Bioinformatics and Systems Biology Program, School of Bioresources and Technology, King Mongkut’s University of Technology Thonburi, Bangkok 10150, Thailand
7
Nam K, Shao Y, Major DT, Wolf-Watz M. Perspectives on Computational Enzyme Modeling: From Mechanisms to Design and Drug Development. ACS OMEGA 2024; 9:7393-7412. [PMID: 38405524 PMCID: PMC10883025 DOI: 10.1021/acsomega.3c09084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Revised: 01/15/2024] [Accepted: 01/19/2024] [Indexed: 02/27/2024]
Abstract
Understanding enzyme mechanisms is essential for unraveling the complex molecular machinery of life. In this review, we survey the field of computational enzymology, highlighting key principles governing enzyme mechanisms and discussing ongoing challenges and promising advances. Over the years, computer simulations have become indispensable in the study of enzyme mechanisms, with the integration of experimental and computational exploration now established as a holistic approach to gain deep insights into enzymatic catalysis. Numerous studies have demonstrated the power of computer simulations in characterizing reaction pathways, transition states, substrate selectivity, product distribution, and dynamic conformational changes for various enzymes. Nevertheless, significant challenges remain in investigating the mechanisms of complex multistep reactions, large-scale conformational changes, and allosteric regulation. Beyond mechanistic studies, computational enzyme modeling has emerged as an essential tool for computer-aided enzyme design and the rational discovery of covalent drugs for targeted therapies. Overall, enzyme design/engineering and covalent drug development can greatly benefit from our understanding of the detailed mechanisms of enzymes, such as protein dynamics, entropy contributions, and allostery, as revealed by computational studies. Such a convergence of different research approaches is expected to continue, creating synergies in enzyme research. This review, by outlining the ever-expanding field of enzyme research, aims to provide guidance for future research directions and facilitate new developments in this important and evolving field.
Affiliation(s)
- Kwangho Nam
  - Department of Chemistry and Biochemistry, University of Texas at Arlington, Arlington, Texas 76019, United States
- Yihan Shao
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, Oklahoma 73019-5251, United States
- Dan T. Major
  - Department of Chemistry and Institute for Nanotechnology & Advanced Materials, Bar-Ilan University, Ramat-Gan 52900, Israel
8
Wu X, Li W, Tu H. Big data and artificial intelligence in cancer research. Trends Cancer 2024; 10:147-160. [PMID: 37977902 DOI: 10.1016/j.trecan.2023.10.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/31/2023] [Revised: 10/17/2023] [Accepted: 10/20/2023] [Indexed: 11/19/2023]
Abstract
The field of oncology has witnessed an extraordinary surge in the application of big data and artificial intelligence (AI). AI development has made multiscale and multimodal data fusion and analysis possible. A new era of extracting information from complex big data is rapidly evolving. However, challenges related to efficient data curation, in-depth analysis, and utilization remain. We provide a comprehensive overview of the current state of the art in big data and computational analysis, highlighting key applications, challenges, and future opportunities in cancer research. By sketching the current landscape, we seek to foster a deeper understanding, facilitate the advancement of big data utilization in oncology, and call for interdisciplinary collaborations, ultimately contributing to improved patient outcomes and a profound understanding of cancer.
Affiliation(s)
- Xifeng Wu
  - Department of Big Data in Health Science, School of Public Health, Center of Clinical Big Data and Analytics of The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China; National Institute for Data Science in Health and Medicine, Zhejiang University, Hangzhou, Zhejiang, China.
- Wenyuan Li
  - Department of Big Data in Health Science, School of Public Health, Center of Clinical Big Data and Analytics of The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China; The Key Laboratory of Intelligent Preventive Medicine of Zhejiang Province, Hangzhou, Zhejiang, China
- Huakang Tu
  - Department of Big Data in Health Science, School of Public Health, Center of Clinical Big Data and Analytics of The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China; Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
9
Guzman-Pando A, Ramirez-Alonso G, Arzate-Quintana C, Camarillo-Cisneros J. Deep learning algorithms applied to computational chemistry. Mol Divers 2023:10.1007/s11030-023-10771-y. [PMID: 38151697 DOI: 10.1007/s11030-023-10771-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Accepted: 11/14/2023] [Indexed: 12/29/2023]
Abstract
Recently, there has been a significant increase in the use of deep learning techniques in the molecular sciences, which have shown high performance on datasets and the ability to generalize across data. However, no model has achieved perfect performance in solving all problems, and the pros and cons of each approach remain unclear to those new to the field. Therefore, this paper aims to review deep learning algorithms that have been applied to solve molecular challenges in computational chemistry. We propose a comprehensive categorization that encompasses two primary approaches: conventional deep learning and geometric deep learning models. This classification takes into account the distinct techniques employed by the algorithms within each approach. We present an up-to-date analysis of these algorithms, emphasizing their key features and open issues. This includes details of input descriptors, datasets used, open-source code availability, task solutions, and actual research applications, focusing on general applications rather than specific ones such as drug discovery. Furthermore, our report discusses trends and future directions in molecular algorithm design, including the input descriptors used for each deep learning model, GPU usage, training and forward processing time, model parameters, the most commonly used datasets, libraries, and optimization schemes. This information aids in identifying the most suitable algorithms for a given task. It also serves as a reference for the datasets and input data frequently used for each algorithm technique. In addition, it provides insights into the benefits and open issues of each technique, and supports the development of novel computational chemistry systems.
Affiliation(s)
- Abimael Guzman-Pando
  - Computational Chemistry Physics Laboratory, Facultad de Medicina y Ciencias Biomédicas, Universidad Autónoma de Chihuahua, Campus II, 31125, Chihuahua, Mexico
- Graciela Ramirez-Alonso
  - Faculty of Engineering, Universidad Autónoma de Chihuahua, Campus II, 31125, Chihuahua, Mexico
- Carlos Arzate-Quintana
  - Computational Chemistry Physics Laboratory, Facultad de Medicina y Ciencias Biomédicas, Universidad Autónoma de Chihuahua, Campus II, 31125, Chihuahua, Mexico
- Javier Camarillo-Cisneros
  - Computational Chemistry Physics Laboratory, Facultad de Medicina y Ciencias Biomédicas, Universidad Autónoma de Chihuahua, Campus II, 31125, Chihuahua, Mexico.
10
Mustali J, Yasuda I, Hirano Y, Yasuoka K, Gautieri A, Arai N. Unsupervised deep learning for molecular dynamics simulations: a novel analysis of protein-ligand interactions in SARS-CoV-2 Mpro. RSC Adv 2023; 13:34249-34261. [PMID: 38019981 PMCID: PMC10663885 DOI: 10.1039/d3ra06375e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Accepted: 11/06/2023] [Indexed: 12/01/2023] Open
Abstract
Molecular dynamics (MD) simulations, which are central to drug discovery, offer detailed insights into protein-ligand interactions. However, analyzing large MD datasets remains a challenge. Current machine-learning solutions are predominantly supervised and have data labelling and standardisation issues. In this study, we adopted an unsupervised deep-learning framework, previously benchmarked for rigid proteins, to study the more flexible SARS-CoV-2 main protease (Mpro). We ran MD simulations of Mpro with various ligands and refined the data by focusing on binding-site residues and time frames in stable protein conformations. The optimal descriptor chosen was the distance between the residues and the center of the binding pocket. Using this approach, a local dynamic ensemble was generated and fed into our neural network to compute Wasserstein distances across system pairs, revealing ligand-induced conformational differences in Mpro. Dimensionality reduction yielded an embedding map that correlated ligand-induced dynamics and binding affinity. Notably, the high-affinity compounds showed pronounced effects on the protein's conformations. We also identified the key residues that contributed to these differences. Our findings emphasize the potential of combining unsupervised deep learning with MD simulations to extract valuable information and accelerate drug discovery.
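The abstract compares systems through Wasserstein distances estimated by a neural network. As a much simpler stand-in, and not the authors' method, the sketch below computes the exact 1-Wasserstein distance between two one-dimensional empirical distributions with equal sample counts, which reduces to the mean absolute difference of the sorted samples. The function name and the toy residue-to-pocket distances are assumptions for illustration.

```python
import numpy as np


def wasserstein_1d(x, y):
    """Exact 1-Wasserstein distance between two equal-size 1-D samples.

    For empirical distributions with the same number of points, the
    optimal transport plan pairs the i-th smallest value of x with the
    i-th smallest value of y.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.mean(np.abs(x - y)))


# Hypothetical residue-to-pocket-centre distances (in angstroms) sampled
# from two simulations of the same protein with different ligands.
system_a = [0.0, 1.0, 2.0]
system_b = [1.0, 2.0, 3.0]
shift = wasserstein_1d(system_a, system_b)
```

A distance of zero means the two samples have the same distribution regardless of frame order, so the measure compares ensembles rather than individual trajectories.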
Affiliation(s)
- Jessica Mustali
  - Department of Electronics, Information and Bioengineering, Politecnico di Milano Italy
- Ikki Yasuda
  - Department of Mechanical Engineering, Keio University Japan
- Kenji Yasuoka
  - Department of Mechanical Engineering, Keio University Japan
- Alfonso Gautieri
  - Department of Electronics, Information and Bioengineering, Politecnico di Milano Italy
- Noriyoshi Arai
  - Department of Mechanical Engineering, Keio University Japan
11
Xia L, Xu L, Pan S, Niu D, Zhang B, Li Z. Drug-target binding affinity prediction using message passing neural network and self supervised learning. BMC Genomics 2023; 24:557. [PMID: 37730555 PMCID: PMC10510145 DOI: 10.1186/s12864-023-09664-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/20/2023] [Accepted: 09/09/2023] [Indexed: 09/22/2023] Open
Abstract
BACKGROUND Drug-target binding affinity (DTA) prediction is important for the rapid development of drug discovery. Compared to traditional methods, deep learning methods provide a new way for DTA prediction to achieve good performance without much knowledge of the biochemical background. However, there is still room for improvement in DTA prediction: (1) focusing only on atom-level information leads to an incomplete representation of the molecular graph; (2) self-supervised learning could be introduced for protein representation. RESULTS In this paper, a DTA prediction model using the deep learning method is proposed, which uses an undirected-CMPNN for molecular embedding and combines CPCProt and MLM models for protein embedding. An attention mechanism is introduced to discover the important part of the protein sequence. The proposed method is evaluated on the datasets Ki and Davis, and the model outperformed other deep learning methods. CONCLUSIONS The proposed model improves the performance of the DTA prediction, which provides a novel strategy for deep learning-based virtual screening methods.
Affiliation(s)
- Leiming Xia
  - College of Computer Science and Technology, Qingdao University, Qingdao, China
- Lei Xu
  - College of Computer Science and Technology, Qingdao University, Qingdao, China
- Shourun Pan
  - College of Computer Science and Technology, Qingdao University, Qingdao, China
- Dongjiang Niu
  - College of Computer Science and Technology, Qingdao University, Qingdao, China
- Beiyi Zhang
  - College of Computer Science and Technology, Qingdao University, Qingdao, China
- Zhen Li
  - College of Computer Science and Technology, Qingdao University, Qingdao, China.
12
Song Y, Chang S, Tian J, Pan W, Feng L, Ji H. A Comprehensive Comparative Analysis of Deep Learning Based Feature Representations for Molecular Taste Prediction. Foods 2023; 12:3386. [PMID: 37761095 PMCID: PMC10529232 DOI: 10.3390/foods12183386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Revised: 08/30/2023] [Accepted: 09/01/2023] [Indexed: 09/29/2023] Open
Abstract
Taste determination in small molecules is critical in food chemistry but traditional experimental methods can be time-consuming. Consequently, computational techniques have emerged as valuable tools for this task. In this study, we explore taste prediction using various molecular feature representations and assess the performance of different machine learning algorithms on a dataset comprising 2601 molecules. The results reveal that GNN-based models outperform other approaches in taste prediction. Moreover, consensus models that combine diverse molecular representations demonstrate improved performance. Among these, the molecular fingerprints + GNN consensus model emerges as the top performer, highlighting the complementary strengths of GNNs and molecular fingerprints. These findings have significant implications for food chemistry research and related fields. By leveraging these computational approaches, taste prediction can be expedited, leading to advancements in understanding the relationship between molecular structure and taste perception in various food components and related compounds.
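One common way to build a consensus of the kind described in this abstract is soft voting, that is, averaging the class probabilities produced by each model. The sketch below illustrates that idea only; the abstract does not state how the authors combined their models, and the function name `consensus_probabilities` and the probability values are hypothetical.

```python
import numpy as np

def consensus_probabilities(prob_by_model, weights=None):
    """Soft-voting consensus: (weighted) mean of per-model class probabilities.

    prob_by_model: list of arrays, each of shape (n_samples, n_classes).
    weights: optional per-model weights; None means a plain average.
    """
    stacked = np.stack([np.asarray(p, dtype=float) for p in prob_by_model])
    avg = np.average(stacked, axis=0, weights=weights)
    # Renormalize so each row is a valid probability distribution.
    return avg / avg.sum(axis=1, keepdims=True)

# Hypothetical outputs for 3 molecules over 2 taste classes (illustrative values).
fingerprint_probs = np.array([[0.9, 0.1], [0.4, 0.6], [0.3, 0.7]])
gnn_probs = np.array([[0.7, 0.3], [0.2, 0.8], [0.6, 0.4]])

consensus = consensus_probabilities([fingerprint_probs, gnn_probs])
predicted_class = consensus.argmax(axis=1)
```

Averaging probabilities rather than hard labels lets a confident model outweigh an uncertain one, which is one plausible reason complementary representations can help.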
Affiliation(s)
- Yu Song
  - Zhengzhou Research Base, State Key Laboratory of Cotton Biology, School of Agricultural Sciences, Zhengzhou University, Zhengzhou 450001, China
  - Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Shenzhen 518120, China
  - Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
- Sihao Chang
  - Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Shenzhen 518120, China
  - Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
- Jing Tian
  - Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Shenzhen 518120, China
  - Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
- Weihua Pan
  - Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Shenzhen 518120, China
  - Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
- Lu Feng
  - Zhengzhou Research Base, State Key Laboratory of Cotton Biology, School of Agricultural Sciences, Zhengzhou University, Zhengzhou 450001, China
- Hongchao Ji
  - Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Shenzhen 518120, China
  - Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
|
13
|
Hagg A, Kirschner KN. Open-Source Machine Learning in Computational Chemistry. J Chem Inf Model 2023; 63:4505-4532. [PMID: 37466636 PMCID: PMC10430767 DOI: 10.1021/acs.jcim.3c00643] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/04/2023] [Indexed: 07/20/2023]
Abstract
The field of computational chemistry has seen a significant increase in the integration of machine learning concepts and algorithms. In this Perspective, we surveyed 179 open-source software projects, with corresponding peer-reviewed papers published within the last 5 years, to better understand the topics within the field being investigated by machine learning approaches. For each project, we provide a short description, the link to the code, the accompanying license type, and whether the training data and resulting models are made publicly available. Based on those deposited in GitHub repositories, the most popular employed Python libraries are identified. We hope that this survey will serve as a resource to learn about machine learning or specific architectures thereof by identifying accessible codes with accompanying papers on a topic basis. To this end, we also include computational chemistry open-source software for generating training data and fundamental Python libraries for machine learning. Based on our observations and considering the three pillars of collaborative machine learning work, open data, open source (code), and open models, we provide some suggestions to the community.
Affiliation(s)
- Alexander Hagg
  - Institute of Technology, Resource and Energy-Efficient Engineering (TREE), University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
  - Department of Electrical Engineering, Mechanical Engineering and Technical Journalism, University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
- Karl N. Kirschner
  - Institute of Technology, Resource and Energy-Efficient Engineering (TREE), University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
  - Department of Computer Science, University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
|
14
|
Ren X, Yan CX, Zhai RX, Xu K, Li H, Fu XJ. Comprehensive survey of target prediction web servers for Traditional Chinese Medicine. Heliyon 2023; 9:e19151. [PMID: 37664753 PMCID: PMC10468387 DOI: 10.1016/j.heliyon.2023.e19151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2022] [Revised: 07/27/2023] [Accepted: 08/14/2023] [Indexed: 09/05/2023] Open
Abstract
Traditional Chinese medicine (TCM) is characterized by multi-components, multiple targets, and complex mechanisms of action and therefore has significant advantages in treating diseases. However, the clinical application of TCM prescriptions is limited due to the difficulty in elucidating the effective substances and the lack of current scientific evidence on the mechanisms of action. In recent years, the development of network pharmacology based on drug systems research has provided a new approach for understanding the complex systems represented by TCM. The determination of drug targets is the core of TCM network pharmacology research. Over the past years, many web tools for drug targets with various features have been developed to facilitate target prediction, significantly promoting drug discovery. Therefore, this review introduces the widely used web tools for compound-target interaction prediction databases and web resources in TCM pharmacology research, and it compares and analyzes each web tool based on their basic properties, including the underlying theory, algorithms, datasets, and search results. Finally, we present the remaining challenges for the promising future of compound-target interaction prediction in TCM pharmacology research. This work may guide researchers in choosing web tools for target prediction and may also help develop more TCM tools based on these existing resources.
Affiliation(s)
- Xia Ren
  - Shandong University of Traditional Chinese Medicine, Jinan 250355, China
  - Marine traditional Chinese medicine research center, Qingdao Academy of Traditional Chinese medicine, Shandong University of Traditional Chinese Medicine, Qingdao 266114, China
- Chun-Xiao Yan
  - Shandong University of Traditional Chinese Medicine, Jinan 250355, China
  - Marine traditional Chinese medicine research center, Qingdao Academy of Traditional Chinese medicine, Shandong University of Traditional Chinese Medicine, Qingdao 266114, China
- Run-Xiang Zhai
  - Shandong University of Traditional Chinese Medicine, Jinan 250355, China
  - Marine traditional Chinese medicine research center, Qingdao Academy of Traditional Chinese medicine, Shandong University of Traditional Chinese Medicine, Qingdao 266114, China
- Kuo Xu
  - Shandong University of Traditional Chinese Medicine, Jinan 250355, China
  - Marine traditional Chinese medicine research center, Qingdao Academy of Traditional Chinese medicine, Shandong University of Traditional Chinese Medicine, Qingdao 266114, China
- Hui Li
  - Shandong University of Traditional Chinese Medicine, Jinan 250355, China
  - Marine traditional Chinese medicine research center, Qingdao Academy of Traditional Chinese medicine, Shandong University of Traditional Chinese Medicine, Qingdao 266114, China
- Xian-Jun Fu
  - Shandong University of Traditional Chinese Medicine, Jinan 250355, China
  - Marine traditional Chinese medicine research center, Qingdao Academy of Traditional Chinese medicine, Shandong University of Traditional Chinese Medicine, Qingdao 266114, China
  - Shandong Engineering and Technology Research Center of Traditional Chinese Medicine, Jinan 250355, China
|
15
|
Andress C, Kappel K, Villena ME, Cuperlovic-Culf M, Yan H, Li Y. DAPTEV: Deep aptamer evolutionary modelling for COVID-19 drug design. PLoS Comput Biol 2023; 19:e1010774. [PMID: 37406007 DOI: 10.1371/journal.pcbi.1010774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2022] [Accepted: 06/13/2023] [Indexed: 07/07/2023] Open
Abstract
Typical drug discovery and development processes are costly, time consuming and often biased by expert opinion. Aptamers are short, single-stranded oligonucleotides (RNA/DNA) that bind to target proteins and other types of biomolecules. Compared with small-molecule drugs, aptamers can bind to their targets with high affinity (binding strength) and specificity (uniquely interacting with the target only). The conventional development process for aptamers utilizes a manual process known as Systematic Evolution of Ligands by Exponential Enrichment (SELEX), which is costly, slow, dependent on library choice and often produces aptamers that are not optimized. To address these challenges, in this research, we create an intelligent approach, named DAPTEV, for generating and evolving aptamer sequences to support aptamer-based drug discovery and development. Using the COVID-19 spike protein as a target, our computational results suggest that DAPTEV is able to produce structurally complex aptamers with strong binding affinities.
Affiliation(s)
- Cameron Andress
  - Department of Computer Science, Brock University, St. Catharines, Canada
- Kalli Kappel
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Hongbin Yan
  - Department of Chemistry, Brock University, St. Catharines, Canada
- Yifeng Li
  - Department of Computer Science, Brock University, St. Catharines, Canada
  - Department of Biological Sciences, Brock University, St. Catharines, Canada
|
16
|
Dalkıran A, Atakan A, Rifaioğlu AS, Martin MJ, Atalay RÇ, Acar AC, Doğan T, Atalay V. Transfer learning for drug-target interaction prediction. Bioinformatics 2023; 39:i103-i110. [PMID: 37387156 DOI: 10.1093/bioinformatics/btad234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/19/2023] [Indexed: 07/01/2023] Open
Abstract
MOTIVATION Utilizing AI-driven approaches for drug-target interaction (DTI) prediction requires large volumes of training data which are not available for the majority of target proteins. In this study, we investigate the use of deep transfer learning for the prediction of interactions between drug candidate compounds and understudied target proteins with scarce training data. The idea here is to first train a deep neural network classifier with a generalized source training dataset of large size and then to reuse this pre-trained neural network as an initial configuration for re-training/fine-tuning purposes with a small-sized specialized target training dataset. To explore this idea, we selected six protein families that have critical importance in biomedicine: kinases, G-protein-coupled receptors (GPCRs), ion channels, nuclear receptors, proteases, and transporters. In two independent experiments, the protein families of transporters and nuclear receptors were individually set as the target datasets, while the remaining five families were used as the source datasets. Several size-based target family training datasets were formed in a controlled manner to assess the benefit provided by the transfer learning approach. RESULTS Here, we present a systematic evaluation of our approach by pre-training a feed-forward neural network with source training datasets and applying different modes of transfer learning from the pre-trained source network to a target dataset. The performance of deep transfer learning is evaluated and compared with that of training the same deep neural network from scratch. We found that when the training dataset contains fewer than 100 compounds, transfer learning outperforms the conventional strategy of training the system from scratch, suggesting that transfer learning is advantageous for predicting binders to under-studied targets.
AVAILABILITY AND IMPLEMENTATION The source code and datasets are available at https://github.com/cansyl/TransferLearning4DTI. Our web-based service containing the ready-to-use pre-trained models is accessible at https://tl4dti.kansil.org.
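The pre-train then fine-tune idea in this abstract can be illustrated with a small feed-forward network. The sketch below is not the authors' code (that is in the repository linked above); it is a minimal NumPy illustration on synthetic data. The two transfer modes shown (full fine-tuning, and freezing the hidden layer as a feature extractor) are assumptions about what "different modes of transfer learning" could mean, and all names, sizes, and hyperparameters are hypothetical.

```python
import numpy as np

def init_mlp(n_in, n_hidden, rng):
    """One-hidden-layer binary classifier with small random weights."""
    return {
        "W1": rng.normal(0, 0.1, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0, 0.1, (n_hidden, 1)),
        "b2": np.zeros(1),
    }

def forward(params, X):
    H = np.maximum(0.0, X @ params["W1"] + params["b1"])  # ReLU hidden layer
    logits = H @ params["W2"] + params["b2"]
    return H, (1.0 / (1.0 + np.exp(-logits))).ravel()      # sigmoid output

def train(params, X, y, epochs=200, lr=0.1, freeze_hidden=False):
    """Gradient descent on mean binary cross-entropy; returns a new dict."""
    params = {k: v.copy() for k, v in params.items()}  # never mutate the input
    n = len(y)
    for _ in range(epochs):
        H, p = forward(params, X)
        d_logits = (p - y)[:, None] / n                # dLoss/dlogits
        grad_W2 = H.T @ d_logits
        grad_b2 = d_logits.sum(axis=0)
        if not freeze_hidden:
            d_H = d_logits @ params["W2"].T
            d_H[H <= 0] = 0.0                          # ReLU gradient mask
            params["W1"] -= lr * (X.T @ d_H)
            params["b1"] -= lr * d_H.sum(axis=0)
        params["W2"] -= lr * grad_W2
        params["b2"] -= lr * grad_b2
    return params

rng = np.random.default_rng(0)
# Synthetic stand-ins: a large "source" set and a small, related "target" set.
X_source = rng.normal(size=(500, 16))
y_source = (X_source[:, 0] + X_source[:, 1] > 0).astype(float)
X_target = rng.normal(size=(40, 16))
y_target = (X_target[:, 0] + 0.5 * X_target[:, 2] > 0).astype(float)

pretrained = train(init_mlp(16, 8, rng), X_source, y_source)
# Mode 1: full fine-tuning, all layers start from the pre-trained weights.
fine_tuned = train(pretrained, X_target, y_target, epochs=100, lr=0.05)
# Mode 2: feature extraction, hidden layer frozen and only the output layer retrained.
feature_extractor = train(pretrained, X_target, y_target, epochs=100, lr=0.05,
                          freeze_hidden=True)
# Baseline: the same architecture trained from scratch on the small target set.
scratch = train(init_mlp(16, 8, rng), X_target, y_target, epochs=100, lr=0.05)
```

Comparing `fine_tuned` and `feature_extractor` against `scratch` on held-out target data would mirror the evaluation design the abstract describes; this synthetic setup makes no claim about which would win.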
Affiliation(s)
- Alperen Dalkıran
  - Department of Computer Engineering, Middle East Technical University, Ankara 06800, Turkey
  - Department of Computer Engineering, Adana Alparslan Türkeş Science and Technology University, Adana 01250, Turkey
- Ahmet Atakan
  - Department of Computer Engineering, Middle East Technical University, Ankara 06800, Turkey
  - Department of Computer Engineering, Erzincan Binali Yıldırım University, Erzincan 24002, Turkey
- Ahmet S Rifaioğlu
  - Department of Computer Engineering, Iskenderun Technical University, Hatay 31200, Turkey
  - Faculty of Medicine, Institute for Computational Biomedicine, Heidelberg University and Heidelberg University Hospital, Heidelberg 69120, Germany
- Maria J Martin
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, Hinxton CB10 1SD, United Kingdom
- Rengül Çetin Atalay
  - Faculty of Pulmonary and Critical Care Medicine, the University of Chicago, Chicago, IL, 60637, United States
- Aybar C Acar
  - Cancer Systems Biology Laboratory (Kansil), Middle East Technical University, Ankara 06800, Turkey
- Tunca Doğan
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, Hinxton CB10 1SD, United Kingdom
  - Department of Computer Engineering, Hacettepe University, Ankara 06800, Turkey
- Volkan Atalay
  - Department of Computer Engineering, Middle East Technical University, Ankara 06800, Turkey
|
17
|
Chen P, Zheng H. Drug-target interaction prediction based on spatial consistency constraint and graph convolutional autoencoder. BMC Bioinformatics 2023; 24:151. [PMID: 37069493 PMCID: PMC10109239 DOI: 10.1186/s12859-023-05275-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Accepted: 04/05/2023] [Indexed: 04/19/2023] Open
Abstract
BACKGROUND Drug-target interaction (DTI) prediction plays an important role in drug discovery and repositioning. However, most of the computational methods used for identifying relevant DTIs do not consider the invariance of the nearest neighbour relationships between drugs or targets. In other words, they do not take into account the invariance of the topological relationships between nodes during representation learning. This may limit the performance of DTI prediction methods. RESULTS Here, we propose a novel graph convolutional autoencoder-based model, named SDGAE, to predict DTIs. As the graph convolutional network cannot handle isolated nodes in a network, a pre-processing step was applied to reduce the number of isolated nodes in the heterogeneous network and facilitate effective exploitation of the graph convolutional network. By maintaining the graph structure during representation learning, the nearest neighbour relationships between nodes in the embedding space remained as close as possible to those in the original space. CONCLUSIONS Overall, we demonstrated that SDGAE can automatically learn more informative and robust feature vectors of drugs and targets, thus exhibiting significantly improved predictive accuracy for DTIs.
Affiliation(s)
- Peng Chen
  - School of Computer Science and Technology, University of Science and Technology of China, Jinzhai Road 96, Hefei, 230027, People's Republic of China
  - Anhui Key Laboratory of Software Engineering in Computing and Communication, University of Science and Technology of China, Jinzhai Road 96, Hefei, 230027, People's Republic of China
- Haoran Zheng
  - School of Computer Science and Technology, University of Science and Technology of China, Jinzhai Road 96, Hefei, 230027, People's Republic of China
  - Anhui Key Laboratory of Software Engineering in Computing and Communication, University of Science and Technology of China, Jinzhai Road 96, Hefei, 230027, People's Republic of China
|
18
|
Fu Y, Fang Y, Gong S, Xue T, Wang P, She L, Huang J. Deep learning-based network pharmacology for exploring the mechanism of licorice for the treatment of COVID-19. Sci Rep 2023; 13:5844. [PMID: 37037848 PMCID: PMC10086012 DOI: 10.1038/s41598-023-31380-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/10/2022] [Accepted: 03/10/2023] [Indexed: 04/12/2023] Open
Abstract
Licorice, a traditional Chinese medicine, has been widely used for the treatment of COVID-19, but its active compounds and corresponding targets have not been fully elucidated. Therefore, this study proposed a deep learning-based network pharmacology approach to identify more potential active compounds and targets of licorice. Four compounds (quercetin, naringenin, liquiritigenin, and licoisoflavanone), two targets (SYK and JAK2), and the relevant pathways (P53, cAMP, and NF-kB) were predicted, all of which were confirmed by previous studies to be associated with SARS-CoV-2 infection. In addition, two new active compounds (glabrone and vestitol) and two new targets (PTEN and MAP3K8) were further validated by molecular docking and molecular dynamics simulations (simultaneous molecular dynamics), and the results showed that these active compounds bound well to COVID-19-related targets, including the main protease (Mpro), the spike protein (S-protein) and the angiotensin-converting enzyme 2 (ACE2). Overall, in this study, glabrone and vestitol from licorice were found to inhibit viral replication by inhibiting the activation of Mpro, S-protein and ACE2; related compounds in licorice may reduce the inflammatory response and inhibit apoptosis by acting on PTEN and MAP3K8. Therefore, licorice has been proposed as an effective candidate for the treatment of COVID-19 through PTEN, MAP3K8, Mpro, S-protein and ACE2.
Affiliation(s)
- Yu Fu
  - Alibaba Business School, Hangzhou Normal University, Hangzhou, 310000, China
- Yangyue Fang
  - Alibaba Business School, Hangzhou Normal University, Hangzhou, 310000, China
- Shuai Gong
  - Alibaba Business School, Hangzhou Normal University, Hangzhou, 310000, China
- Tao Xue
  - Alibaba Business School, Hangzhou Normal University, Hangzhou, 310000, China
- Peng Wang
  - Alibaba Business School, Hangzhou Normal University, Hangzhou, 310000, China
- Li She
  - Alibaba Business School, Hangzhou Normal University, Hangzhou, 310000, China
- Jianping Huang
  - Alibaba Business School, Hangzhou Normal University, Hangzhou, 310000, China
|
19
|
Khusial R, Bies RR, Akil A. Deep Learning Methods Applied to Drug Concentration Prediction of Olanzapine. Pharmaceutics 2023; 15:1139. [PMID: 37111625 PMCID: PMC10145228 DOI: 10.3390/pharmaceutics15041139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Revised: 03/17/2023] [Accepted: 03/31/2023] [Indexed: 04/07/2023] Open
Abstract
Pharmacometrics and the utilization of population pharmacokinetics play an integral role in model-informed drug discovery and development (MIDD). Recently, there has been a growth in the application of deep learning approaches to aid in areas within MIDD. In this study, a deep learning model, LSTM-ANN, was developed to predict olanzapine drug concentrations from the CATIE study. A total of 1527 olanzapine drug concentrations from 523 individuals along with 11 patient-specific covariates were used in model development. The hyperparameters of the LSTM-ANN model were optimized through a Bayesian optimization algorithm. A population pharmacokinetic model using the NONMEM model was constructed as a reference to compare to the performance of the LSTM-ANN model. The RMSE of the LSTM-ANN model was 29.566 in the validation set, while the RMSE of the NONMEM model was 31.129. Permutation importance revealed that age, sex, and smoking were highly influential covariates in the LSTM-ANN model. The LSTM-ANN model showed potential in the application of drug concentration predictions as it was able to capture the relationships within a sparsely sampled pharmacokinetic dataset and perform comparably to the NONMEM model.
Affiliation(s)
- Richard Khusial
  - Department of Pharmaceutical Sciences, College of Pharmacy, Mercer University, Atlanta, GA 30341, USA
- Robert R. Bies
  - Department of Pharmaceutical Sciences, School of Pharmacy and Pharmaceutical Sciences, University at Buffalo, Buffalo, NY 14214, USA
  - Institute for Artificial Intelligence and Data Science, University at Buffalo, Buffalo, NY 14260, USA
- Ayman Akil
  - Department of Pharmaceutical Sciences, College of Pharmacy, Mercer University, Atlanta, GA 30341, USA
|
20
|
Koutroumpa NM, Papavasileiou KD, Papadiamantis AG, Melagraki G, Afantitis A. A Systematic Review of Deep Learning Methodologies Used in the Drug Discovery Process with Emphasis on In Vivo Validation. Int J Mol Sci 2023; 24:6573. [PMID: 37047543 PMCID: PMC10095548 DOI: 10.3390/ijms24076573] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2022] [Revised: 03/24/2023] [Accepted: 03/28/2023] [Indexed: 04/05/2023] Open
Abstract
The discovery and development of new drugs are extremely long and costly processes. Recent progress in artificial intelligence has made a positive impact on the drug development pipeline. Numerous challenges have been addressed with the growing exploitation of drug-related data and the advancement of deep learning technology. Several model frameworks have been proposed to enhance the performance of deep learning algorithms in molecular design. However, only a few have had an immediate impact on drug development since computational results may not be confirmed experimentally. This systematic review aims to summarize the different deep learning architectures that are used in the drug discovery process and validated with further in vivo experiments. For each presented study, the proposed molecule or peptide that has been generated or identified by the deep learning model has been biologically evaluated in animal models. These state-of-the-art studies highlight that even if artificial intelligence in drug discovery is still in its infancy, it has great potential to accelerate the drug discovery cycle, reduce the required costs, and contribute to the integration of the 3R (Replacement, Reduction, Refinement) principles. Out of all the reviewed scientific articles, seven algorithms were identified: recurrent neural networks, specifically, long short-term memory (LSTM-RNNs), Autoencoders (AEs) and their Wasserstein Autoencoders (WAEs) and Variational Autoencoders (VAEs) variants; Convolutional Neural Networks (CNNs); Direct Message Passing Neural Networks (D-MPNNs); and Multitask Deep Neural Networks (MTDNNs). LSTM-RNNs were the most used architectures with molecules or peptide sequences as inputs.
Affiliation(s)
- Nikoletta-Maria Koutroumpa
  - Department of ChemoInformatics, NovaMechanics Ltd., Nicosia 1070, Cyprus
  - School of Chemical Engineering, National Technical University of Athens, 157 80 Athens, Greece
  - Division of Data Driven Innovation, Entelos Institute, Larnaca 6059, Cyprus
- Konstantinos D. Papavasileiou
  - Department of ChemoInformatics, NovaMechanics Ltd., Nicosia 1070, Cyprus
  - Division of Data Driven Innovation, Entelos Institute, Larnaca 6059, Cyprus
  - Department of ChemoInformatics, NovaMechanics MIKE., 185 45 Piraeus, Greece
- Anastasios G. Papadiamantis
  - Department of ChemoInformatics, NovaMechanics Ltd., Nicosia 1070, Cyprus
  - Division of Data Driven Innovation, Entelos Institute, Larnaca 6059, Cyprus
- Georgia Melagraki
  - Division of Physical Sciences & Applications, Hellenic Military Academy, 166 73 Vari, Greece
- Antreas Afantitis
  - Department of ChemoInformatics, NovaMechanics Ltd., Nicosia 1070, Cyprus
  - Division of Data Driven Innovation, Entelos Institute, Larnaca 6059, Cyprus
  - Department of ChemoInformatics, NovaMechanics MIKE., 185 45 Piraeus, Greece
|
21
|
Kwon Y, Park S, Lee J, Kang J, Lee HJ, Kim W. BEAR: A Novel Virtual Screening Method Based on Large-Scale Bioactivity Data. J Chem Inf Model 2023; 63:1429-1437. [PMID: 36821004 DOI: 10.1021/acs.jcim.2c01300] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 02/24/2023]
Abstract
Data-driven drug discovery exploits a comprehensive set of big data to provide an efficient path for the development of new drugs. Currently, publicly available bioassay data sets provide extensive information regarding the bioactivity profiles of millions of compounds. Using these large-scale drug screening data sets, we developed a novel in silico method to virtually screen hit compounds against protein targets, named BEAR (Bioactive compound Enrichment by Assay Repositioning). The underlying idea of BEAR is to reuse bioassay data for predicting hit compounds for targets other than their originally intended purposes, i.e., "assay repositioning". The BEAR approach differs from conventional virtual screening methods in that (1) it relies solely on bioactivity data and requires no physicochemical features of either the target or ligand. (2) Accordingly, structurally diverse candidates are predicted, allowing for scaffold hopping. (3) BEAR shows stable performance across diverse target classes, suggesting its general applicability. Large-scale cross-validation of more than a thousand targets showed that BEAR accurately predicted known ligands (median area under the curve = 0.87), proving that BEAR maintained a robust performance even in the validation set with additional constraints. In addition, a comparative analysis demonstrated that BEAR outperformed other machine learning models, including a recent deep learning model for ABC transporter family targets. We predicted P-gp and BCRP dual inhibitors using the BEAR approach and validated the predicted candidates using in vitro assays. The intracellular accumulation effects of mitoxantrone, a well-known P-gp/BCRP dual substrate for cancer treatment, confirmed nine out of 72 dual inhibitor candidates preselected by primary cytotoxicity screening. 
Consequently, these nine hits are novel and potent dual inhibitors for both P-gp and BCRP, solely predicted by bioactivity profiles without relying on any structural information of targets or ligands.
Affiliation(s)
- Sera Park
  - KaiPharm, Seoul 03760, Republic of Korea
- Jaeok Lee
  - College of Pharmacy, Research Institute of Pharmaceutical Science, Ewha Womans University, Seoul 03760, Republic of Korea
- Jiyeon Kang
  - College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Republic of Korea
- Hwa Jeong Lee
  - College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Republic of Korea
- Wankyu Kim
  - KaiPharm, Seoul 03760, Republic of Korea
  - Department of Life Sciences, College of Natural Science, Ewha Womans University, Seoul 03760, Republic of Korea
|
22
|
Identification of medicinal plant-based phytochemicals as a potential inhibitor for SARS-CoV-2 main protease (M pro) using molecular docking and deep learning methods. Comput Biol Med 2023; 157:106785. [PMID: 36931201 PMCID: PMC10008098 DOI: 10.1016/j.compbiomed.2023.106785] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 11/19/2022] [Revised: 02/15/2023] [Accepted: 03/10/2023] [Indexed: 03/14/2023]
Abstract
The highly transmissible and rapidly evolving coronavirus disease 2019 (COVID-19), a viral disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), triggered a global pandemic, and SARS-CoV-2 is one of the most researched viruses in academia. Effective drugs to treat people with COVID-19 have yet to be developed to reduce mortality and transmission. Studies on the SARS-CoV-2 virus identified that its main protease (Mpro) might be a potential therapeutic target for drug development, as this enzyme plays a key role in viral replication. In search of potential inhibitors of Mpro, we developed a phytochemical library consisting of 2431 phytochemicals from 104 Korean medicinal plants that exhibited medicinal and antioxidant properties. The library was screened by molecular docking, followed by revalidation by re-screening with a deep learning method. A recurrent neural network (RNN) was used to develop an inhibitory predictive model using a SARS coronavirus Mpro dataset. It was deployed to screen the top 12 compounds based on their docked binding affinity, which ranged from -8.0 to -8.9 kcal/mol. The top two lead compounds, Catechin gallate and Quercetin 3-O-malonylglucoside, were selected depending on inhibitory potency against Mpro. Interactions with the target protein active sites, including His41, Met49, Cys145, Met165, and Thr190, were also examined. Molecular dynamics simulation was performed to analyze root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (RG), solvent accessible surface area (SASA), and number of hydrogen bonds. Results confirmed the inflexible nature of the docked complexes. Absorption, distribution, metabolism, excretion, and toxicity (ADMET), as well as bioactivity prediction, confirmed the pharmaceutical activities of the lead compound. Findings of this research might help scientists to optimize compatible drugs for the treatment of COVID-19 patients.
|
23
|
Bongini P, Scarselli F, Bianchini M, Dimitri GM, Pancino N, Lio P. Modular Multi-Source Prediction of Drug Side-Effects With DruGNN. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:1211-1220. [PMID: 35576419 DOI: 10.1109/tcbb.2022.3175362] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/04/2023]
Abstract
Drug Side-Effects (DSEs) have a high impact on public health, care system costs, and drug discovery processes. Predicting the probability of side-effects, before their occurrence, is fundamental to reduce this impact, in particular on drug discovery. Candidate molecules could be screened before undergoing clinical trials, reducing the costs in time, money, and health of the participants. Drug side-effects are triggered by complex biological processes involving many different entities, from drug structures to protein-protein interactions. To predict their occurrence, it is necessary to integrate data from heterogeneous sources. In this work, such heterogeneous data is integrated into a graph dataset, expressively representing the relational information between different entities, such as drug molecules and genes. The relational nature of the dataset represents an important novelty for drug side-effect predictors. Graph Neural Networks (GNNs) are exploited to predict DSEs on our dataset with very promising results. GNNs are deep learning models that can process graph-structured data, with minimal information loss, and have been applied on a wide variety of biological tasks. Our experimental results confirm the advantage of using relationships between data entities, suggesting interesting future developments in this scope. The experimentation also shows the importance of specific subsets of data in determining associations between drugs and side-effects.
24
SuHAN: Substructural hierarchical attention network for molecular representation. J Mol Graph Model 2023; 119:108401. [PMID: 36584590 DOI: 10.1016/j.jmgm.2022.108401] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Revised: 12/16/2022] [Accepted: 12/23/2022] [Indexed: 12/26/2022]
Abstract
Recently, molecular representation and property exploration combined with neural networks have come to play a critical role in drug design and discovery. However, previous research in molecular representation relies heavily on manual feature extraction based on biological experiments, which may introduce noise into the molecular information at a high cost in time and money. In this paper, a novel method named Substructural Hierarchical Attention Network (SuHAN) is proposed to discover inherent characteristics of molecules for representation learning. Specifically, SuHAN is composed of two cascaded layers: an atom-level layer and a substructure-level layer. A molecule in SMILES format is divided into several substructural fragments by predefined partition rules, and the fragments are then fed into the atom-level layer and the substructure-level layer successively to obtain feature representations from two perspectives: an atomic view and a substructural view. In this way, prominent structural features that may be missed in global extraction are recovered from a fine-grained viewpoint and fused to reconstruct a representative pattern in an overall view. Experiments on biophysics and physiology datasets demonstrate that our model is competitive, with a significant improvement in both accuracy and stability. We confirmed that the substructural segments and progressive hierarchical networks lead to an effective molecular representation for downstream tasks. These results provide a novel perspective on reconstructing an overall pattern from locally prominent structures.
25
Kwak D, Choi J, Lee S. Rethinking Breast Cancer Diagnosis through Deep Learning Based Image Recognition. Sensors (Basel, Switzerland) 2023; 23:2307. [PMID: 36850906 PMCID: PMC9958611 DOI: 10.3390/s23042307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2022] [Revised: 02/07/2023] [Accepted: 02/10/2023] [Indexed: 06/18/2023]
Abstract
This paper explores techniques for diagnosing breast cancer using deep learning based medical image recognition. X-ray (mammography) images, ultrasound images, and histopathology images are used to improve diagnostic accuracy by classifying breast cancer and inferring the affected location. To this end, image recognition strategies that maximize diagnostic accuracy for each type of medical image data are investigated in terms of image classification (VGGNet19, ResNet50, DenseNet121, EfficientNet v2), image segmentation (UNet, ResUNet++, DeepLab v3), related loss functions (binary cross entropy, Dice loss, Tversky loss), and data augmentation. In the evaluations, when using filter-based data augmentation, ResNet50 showed the best performance in image classification, and UNet showed the best segmentation performance on both X-ray and ultrasound images. When the proposed image recognition strategies are applied, accuracy can be improved by 33.3% in image segmentation of X-ray images, 29.9% in image segmentation of ultrasound images, and 22.8% in image classification of histopathology images.
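For readers unfamiliar with the segmentation losses named in this abstract, the following is a minimal sketch of the standard Dice and Tversky losses on flattened masks. It is not taken from the cited paper: the function names, the `alpha`/`beta` defaults, and the smoothing term `eps` are illustrative assumptions.

```python
# Illustrative sketch only: standard textbook definitions, not the cited paper's code.
# Masks are flat sequences of values in [0, 1] (predictions) and {0, 1} (ground truth).

def tversky_index(pred, target, alpha=0.5, beta=0.5, eps=1e-7):
    """Tversky index TP / (TP + alpha*FP + beta*FN) on soft counts."""
    tp = sum(p * t for p, t in zip(pred, target))
    fp = sum(p * (1 - t) for p, t in zip(pred, target))
    fn = sum((1 - p) * t for p, t in zip(pred, target))
    # eps (an assumed smoothing constant) avoids division by zero on empty masks.
    return (tp + eps) / (tp + alpha * fp + beta * fn + eps)


def tversky_loss(pred, target, alpha=0.5, beta=0.5):
    """Loss form: 1 minus the Tversky index."""
    return 1.0 - tversky_index(pred, target, alpha, beta)


def dice_loss(pred, target):
    """Dice loss is the Tversky loss with alpha = beta = 0.5."""
    return tversky_loss(pred, target, alpha=0.5, beta=0.5)


if __name__ == "__main__":
    pred = [1, 1, 1, 0]      # hypothetical predicted mask
    target = [1, 0, 0, 0]    # hypothetical ground-truth mask
    print(dice_loss(pred, target))
    print(tversky_loss(pred, target, alpha=0.3, beta=0.7))
```

Setting `alpha` below `beta` penalizes false negatives more than false positives, which is the usual motivation for preferring the Tversky loss when the lesion class is rare.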
Affiliation(s)
- Deawon Kwak
  - Electronic Engineering Department, Dong Seoul University, Seongnam 13120, Republic of Korea
- Jiwoo Choi
  - Choi’s Breast Clinic, 197, Gwongwang-ro, Paldal-gu, Suwon-si 16489, Republic of Korea
- Sungjin Lee
  - Electronic Engineering Department, Dong Seoul University, Seongnam 13120, Republic of Korea
26
Predicting Potent Compounds Using a Conditional Variational Autoencoder Based upon a New Structure-Potency Fingerprint. Biomolecules 2023; 13:biom13020393. [PMID: 36830761 PMCID: PMC9953226 DOI: 10.3390/biom13020393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2022] [Revised: 02/07/2023] [Accepted: 02/16/2023] [Indexed: 02/22/2023] Open
Abstract
Prediction of the potency of bioactive compounds generally relies on linear or nonlinear quantitative structure-activity relationship (QSAR) models. Nonlinear models are generated using machine learning methods. We introduce a novel approach for potency prediction that depends on a newly designed molecular fingerprint (FP) representation. This structure-potency fingerprint (SPFP) combines different modules accounting for the structural features of active compounds and their potency values in a single bit string, hence unifying structure and potency representation. This encoding enables the derivation of a conditional variational autoencoder (CVAE) using SPFPs of training compounds and the application of the model to predict the SPFP potency module of test compounds using only their structure module as input. The SPFP-CVAE approach correctly predicts the potency values of compounds belonging to different activity classes with an accuracy comparable to support vector regression (SVR), representing the state-of-the-art in the field. In addition, highly potent compounds are predicted with accuracy very similar to that of SVR and deep neural networks.
27
Iglesias CF, Ristovski M, Bolic M, Cuperlovic-Culf M. rAAV Manufacturing: The Challenges of Soft Sensing during Upstream Processing. Bioengineering (Basel) 2023; 10:bioengineering10020229. [PMID: 36829723 PMCID: PMC9951952 DOI: 10.3390/bioengineering10020229] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/10/2022] [Revised: 01/31/2023] [Accepted: 02/02/2023] [Indexed: 02/11/2023] Open
Abstract
Recombinant adeno-associated virus (rAAV) is the most effective viral vector technology for directly translating the genomic revolution into medicinal therapies. However, the manufacturing of rAAV viral vectors remains challenging in the upstream processing with low rAAV yield in large-scale production and high cost, limiting the generalization of rAAV-based treatments. This situation can be improved by real-time monitoring of critical process parameters (CPP) that affect critical quality attributes (CQA). To achieve this aim, soft sensing combined with predictive modeling is an important strategy that can be used for optimizing the upstream process of rAAV production by monitoring critical process variables in real time. However, the development of soft sensors for rAAV production as a fast and low-cost monitoring approach is not an easy task. This review article describes four challenges and critically discusses the possible solutions that can enable the application of soft sensors for rAAV production monitoring. The challenges from a data scientist's perspective are (i) a predictor variable (soft-sensor inputs) set without AAV viral titer, (ii) multi-step forecasting, (iii) multiple process phases, and (iv) soft-sensor development composed of the mechanistic model.
Affiliation(s)
- Milica Ristovski
  - Faculty of Engineering, University of Ottawa, Ottawa, ON K1N 6N5, Canada
  - Faculty of Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada
- Miodrag Bolic
  - Faculty of Engineering, University of Ottawa, Ottawa, ON K1N 6N5, Canada
- Miroslava Cuperlovic-Culf
  - Digital Technologies Research Center, National Research Council, Ottawa, ON K1A 0R6, Canada
  - Department of Biochemistry, Microbiology, and Immunology, Faculty of Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada
  - Correspondence:
28
Shu Y, Hai Y, Cao L, Wu J. Deep-learning based approach to identify substrates of human E3 ubiquitin ligases and deubiquitinases. Comput Struct Biotechnol J 2023; 21:1014-1021. [PMID: 36733699 PMCID: PMC9883182 DOI: 10.1016/j.csbj.2023.01.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2022] [Revised: 01/16/2023] [Accepted: 01/16/2023] [Indexed: 01/19/2023] Open
Abstract
E3 ubiquitin ligases (E3s) and deubiquitinating enzymes (DUBs) play key roles in protein degradation. However, a large number of E3 substrate interactions (ESIs) and DUB substrate interactions (DSIs) remain elusive. Here, we present DeepUSI, a deep learning-based framework to identify ESIs and DSIs using the rich information present in protein sequences. Using the collected gold standard dataset, we systematically assessed key hyperparameters of model training, including those relevant to data sampling and the number of epochs. The performance of DeepUSI was thoroughly evaluated by multiple metrics, based on internal and external validation. Application of DeepUSI to cancer-associated E3 and DUB genes identified a list of druggable substrates with functional implications, warranting further investigation. Together, DeepUSI presents a new framework for predicting substrates of E3 ubiquitin ligases and deubiquitinases.
Key Words
- AUPRC, area under the PR curve
- AUROC, area under the ROC curve
- CNN, convolutional neural network
- DSI, DUB-substrate interaction
- DUB, deubiquitinating enzymes
- DUB-substrate interactions
- Deep learning
- E1, ubiquitin-activating enzymes
- E2, ubiquitin-conjugating enzymes
- E3, ubiquitin ligases
- E3-substrate interactions
- ESI, E3-substrate interaction
- GSP, gold standard positive dataset
- PR, precision recall
- Pan-cancer analysis
- ROC, receiver operating characteristic
- TCGA, The Cancer Genome Atlas
- UPS, ubiquitin-proteasome system
- Ubiquitin proteasome system
- Ubiquitination
Affiliation(s)
- Yixuan Shu
  - Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Center for Cancer Bioinformatics, Peking University Cancer Hospital & Institute, Beijing 100142, China
- Yanru Hai
  - Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Center for Cancer Bioinformatics, Peking University Cancer Hospital & Institute, Beijing 100142, China
- Lihua Cao
  - Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Center for Cancer Bioinformatics, Peking University Cancer Hospital & Institute, Beijing 100142, China
- Jianmin Wu
  - Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Center for Cancer Bioinformatics, Peking University Cancer Hospital & Institute, Beijing 100142, China
  - Peking University International Cancer Institute, Peking University, Beijing 100191, China
  - Correspondence to: Center for Cancer Bioinformatics, Peking University Cancer Hospital & Institute, 52 Fu-Cheng Road, Hai-Dian District, Beijing 100142, China.
29
Bakalis D, Lambrinidis G, Kourounakis A, Manis G. Contribution of Deep Learning in the Investigation of Possible Dual LOX-3 Inhibitors/DPPH Scavengers: The Case of Recently Synthesized Compounds. Bioengineering (Basel, Switzerland) 2022; 9:bioengineering9120800. [PMID: 36551006 PMCID: PMC9774961 DOI: 10.3390/bioengineering9120800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2022] [Revised: 11/18/2022] [Accepted: 12/09/2022] [Indexed: 12/15/2022]
Abstract
Even though non-steroidal anti-inflammatory drugs are the most effective treatment for inflammatory conditions, they have been linked to negative side effects. A promising approach to mitigating potential risks is the development of new compounds that combine anti-inflammatory with antioxidant activity to enhance activity and reduce toxicity. The implication of reactive oxygen species in inflammatory conditions has been extensively studied, based on the pro-inflammatory properties of generated free radicals. Drugs with dual activity (i.e., inhibiting inflammation-related enzymes, e.g., LOX-3, and scavenging free radicals, e.g., DPPH) could find various therapeutic applications, such as in cardiovascular or neurodegenerative disorders. The challenge we took on using deep learning was the creation of appropriate classification and regression models to discriminate pharmacological activity and selectivity, as well as to discover future compounds with dual activity prior to synthesis. An accurate filter algorithm was established, based on knowledge from compounds already evaluated in vitro, that can separate compounds with low, moderate, or high activity. In this study, we constructed a customized, highly effective one-dimensional convolutional neural network (CONV1D), with accuracy scores up to 95.2%, that was able to identify dual active compounds, being LOX-3 inhibitors and DPPH scavengers, as an indication of simultaneous anti-inflammatory and antioxidant activity. Additionally, we created a highly accurate regression model that predicted the effectiveness values of a set of recently synthesized compounds with anti-inflammatory activity, with a root mean square error of 0.8. Finally, we succeeded in observing the manner in which those newly synthesized compounds differ from each other with regard to a specific pharmacological target, using deep learning algorithms.
Affiliation(s)
- Dimitrios Bakalis
  - Department of Computer Science and Engineering, School of Engineering, University of Ioannina, 45110 Ioannina, Greece
- George Lambrinidis
  - Division of Pharmaceutical Chemistry, Department of Pharmacy, School of Health Sciences, National & Kapodistrian University of Athens, 15771 Athens, Greece
  - Correspondence:
- Angeliki Kourounakis
  - Division of Pharmaceutical Chemistry, Department of Pharmacy, School of Health Sciences, National & Kapodistrian University of Athens, 15771 Athens, Greece
- George Manis
  - Department of Computer Science and Engineering, School of Engineering, University of Ioannina, 45110 Ioannina, Greece
30
Askr H, Elgeldawi E, Aboul Ella H, Elshaier YAMM, Gomaa MM, Hassanien AE. Deep learning in drug discovery: an integrative review and future challenges. Artif Intell Rev 2022; 56:5975-6037. [PMID: 36415536 PMCID: PMC9669545 DOI: 10.1007/s10462-022-10306-1] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Accepted: 10/24/2022] [Indexed: 11/18/2022]
Abstract
Recently, using artificial intelligence (AI) in drug discovery has received much attention since it significantly reduces the time and cost of developing new drugs. Deep learning (DL)-based approaches are increasingly being used in all stages of drug development as DL technology advances and drug-related data grows. Therefore, this paper presents a systematic literature review (SLR) that integrates recent DL technologies and applications in drug discovery, including drug-target interactions (DTIs), drug-drug similarity interactions (DDIs), drug sensitivity and responsiveness, and drug-side effect predictions. We present a review of more than 300 articles published between 2000 and 2022. The benchmark data sets, the databases, and the evaluation measures are also presented. In addition, this paper provides an overview of how explainable AI (XAI) supports drug discovery problems. Drug dosing optimization and success stories are discussed as well. Finally, digital twinning (DT) and open issues are suggested as future research challenges for drug discovery problems. Challenges to be addressed and future research directions are identified, and an extensive bibliography is also included.
Affiliation(s)
- Heba Askr
  - Faculty of Computers and Artificial Intelligence, University of Sadat City, Sadat City, Egypt
- Enas Elgeldawi
  - Computer Science Department, Faculty of Science, Minia University, Minia, Egypt
- Heba Aboul Ella
  - Faculty of Pharmacy and Drug Technology, Chinese University in Egypt (CUE), Cairo, Egypt
- Mamdouh M. Gomaa
  - Computer Science Department, Faculty of Science, Minia University, Minia, Egypt
- Aboul Ella Hassanien
  - Faculty of Computers and Artificial Intelligence, Cairo University, Cairo, Egypt
31
Andrusenko I, Gemmi M. 3D electron diffraction for structure determination of small-molecule nanocrystals: A possible breakthrough for the pharmaceutical industry. Wiley Interdisciplinary Reviews. Nanomedicine and Nanobiotechnology 2022; 14:e1810. [PMID: 35595285 PMCID: PMC9539612 DOI: 10.1002/wnan.1810] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/01/2022] [Revised: 04/29/2022] [Accepted: 05/02/2022] [Indexed: 11/10/2022]
Abstract
Nanomedicine is among the most fascinating areas of research. Most of the newly discovered pharmaceutical polymorphs, as well as many newly synthesized or isolated natural products, appear only in the form of nanocrystals. The development of techniques that allow investigating the atomic structure of nanocrystalline materials is therefore one of the most important frontiers of crystallography. Some unique features of electrons, like their non-neutral charge and their strong interaction with matter, make this radiation suitable for imaging and detecting individual atoms, molecules, or nanoscale objects down to sub-angstrom resolution. In recent years, the development of three-dimensional (3D) electron diffraction (3D ED) has shown that electron diffraction can be successfully used to solve the crystal structure of nanocrystals and that most of its limiting factors, like dynamical scattering or limited completeness, can be easily overcome. This article is a review of the state of the art of this method with a specific focus on how it can be applied to beam-sensitive samples like small-molecule organic nanocrystals. This article is categorized under: Therapeutic Approaches and Drug Discovery > Emerging Technologies.
Affiliation(s)
- Iryna Andrusenko
  - Center for Materials Interfaces, Electron Crystallography, Istituto Italiano di Tecnologia, Pontedera
- Mauro Gemmi
  - Center for Materials Interfaces, Electron Crystallography, Istituto Italiano di Tecnologia, Pontedera
32
Pandey M, Radaeva M, Mslati H, Garland O, Fernandez M, Ester M, Cherkasov A. Ligand Binding Prediction Using Protein Structure Graphs and Residual Graph Attention Networks. Molecules 2022; 27:molecules27165114. [PMID: 36014351 PMCID: PMC9416537 DOI: 10.3390/molecules27165114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2022] [Revised: 08/03/2022] [Accepted: 08/09/2022] [Indexed: 11/25/2022] Open
Abstract
Computational prediction of ligand–target interactions is a crucial part of modern drug discovery as it helps to bypass high costs and labor demands of in vitro and in vivo screening. As the wealth of bioactivity data accumulates, it provides opportunities for the development of deep learning (DL) models with increasing predictive powers. Conventionally, such models were either limited to the use of very simplified representations of proteins or ineffective voxelization of their 3D structures. Herein, we present the development of the PSG-BAR (Protein Structure Graph-Binding Affinity Regression) approach that utilizes 3D structural information of the proteins along with 2D graph representations of ligands. The method also introduces attention scores to selectively weight protein regions that are most important for ligand binding. Results: The developed approach demonstrates the state-of-the-art performance on several binding affinity benchmarking datasets. The attention-based pooling of protein graphs enables identification of surface residues as critical residues for protein–ligand binding. Finally, we validate our model predictions against an experimental assay on a viral main protease (Mpro)—the hallmark target of SARS-CoV-2 coronavirus.
Affiliation(s)
- Mohit Pandey
  - Vancouver Prostate Centre, Department of Urologic Sciences, University of British Columbia, Vancouver, BC V6T 1Z2, Canada
- Mariia Radaeva
  - Vancouver Prostate Centre, Department of Urologic Sciences, University of British Columbia, Vancouver, BC V6T 1Z2, Canada
- Hazem Mslati
  - Vancouver Prostate Centre, Department of Urologic Sciences, University of British Columbia, Vancouver, BC V6T 1Z2, Canada
- Olivia Garland
  - Vancouver Prostate Centre, Department of Urologic Sciences, University of British Columbia, Vancouver, BC V6T 1Z2, Canada
- Michael Fernandez
  - Vancouver Prostate Centre, Department of Urologic Sciences, University of British Columbia, Vancouver, BC V6T 1Z2, Canada
- Martin Ester
  - School of Computing Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
- Artem Cherkasov
  - Vancouver Prostate Centre, Department of Urologic Sciences, University of British Columbia, Vancouver, BC V6T 1Z2, Canada
  - Correspondence:
33
Zhu S, Bai Q, Li L, Xu T. Drug repositioning in drug discovery of T2DM and repositioning potential of antidiabetic agents. Comput Struct Biotechnol J 2022; 20:2839-2847. [PMID: 35765655 PMCID: PMC9189996 DOI: 10.1016/j.csbj.2022.05.057] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/14/2022] [Revised: 05/30/2022] [Accepted: 05/30/2022] [Indexed: 12/19/2022] Open
Abstract
Repositioned or repurposed drugs account for a substantial share of the drugs entering the approval pipeline, which indicates that drug repositioning has huge market potential and value. Computational technologies such as machine learning methods have accelerated the process of drug repositioning in the last few decades. The repositioning potential of type 2 diabetes mellitus (T2DM) drugs for various diseases such as cancer, neurodegenerative diseases, and cardiovascular diseases has been widely studied. Hence, a summary of the repurposing of antidiabetic drugs is of great significance. In this review, we focus on machine learning methods for the development of new T2DM drugs and give an overview of the repurposing potential of existing antidiabetic agents.
Affiliation(s)
- Sha Zhu
  - Key Lab of Preclinical Study for New Drugs of Gansu Province, Institute of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Lanzhou University, Lanzhou, Gansu 730000, PR China
- Qifeng Bai
  - Key Lab of Preclinical Study for New Drugs of Gansu Province, Institute of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Lanzhou University, Lanzhou, Gansu 730000, PR China
  - Corresponding author.
34
Volkov M, Turk JA, Drizard N, Martin N, Hoffmann B, Gaston-Mathé Y, Rognan D. On the Frustration to Predict Binding Affinities from Protein-Ligand Structures with Deep Neural Networks. J Med Chem 2022; 65:7946-7958. [PMID: 35608179 DOI: 10.1021/acs.jmedchem.2c00487] [Citation(s) in RCA: 48] [Impact Index Per Article: 24.0] [Indexed: 02/08/2023]
Abstract
Accurate prediction of binding affinities from protein-ligand atomic coordinates remains a major challenge in early stages of drug discovery. Using modular message passing graph neural networks describing both the ligand and the protein in their free and bound states, we unambiguously evidence that an explicit description of protein-ligand noncovalent interactions does not provide any advantage with respect to ligand or protein descriptors. Simple models, inferring binding affinities of test samples from that of the closest ligands or proteins in the training set, already exhibit good performances, suggesting that memorization largely dominates true learning in the deep neural networks. The current study suggests considering only noncovalent interactions while omitting their protein and ligand atomic environments. Removing all hidden biases probably requires much denser protein-ligand training matrices and a coordinated effort of the drug design community to solve the necessary protein-ligand structures.
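The "simple models" mentioned in this abstract, which infer a test sample's affinity from the closest training ligand, can be illustrated with a nearest-neighbor baseline. This is a minimal sketch and not the authors' implementation: representing fingerprints as sets of integer bit indices, using Tanimoto similarity, and every value in the toy data are assumptions made for illustration.

```python
# Illustrative sketch only: a 1-nearest-neighbor affinity baseline.
# Fingerprints are hypothetical sets of "on" bit indices; the similarity
# measure (Tanimoto) is an assumption, as the abstract does not name one.

def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    union = a | b
    if not union:
        return 0.0  # two empty fingerprints: treat as dissimilar by convention
    return len(a & b) / len(union)


def nearest_neighbor_affinity(query_fp, training_set):
    """Return the affinity label of the most similar training ligand."""
    _, affinity = max(training_set, key=lambda item: tanimoto(query_fp, item[0]))
    return affinity


if __name__ == "__main__":
    # Made-up toy data: (fingerprint bits, affinity value).
    training = [
        ({1, 2, 3, 4}, 7.5),
        ({10, 11, 12}, 5.0),
        ({1, 2, 9}, 6.2),
    ]
    print(nearest_neighbor_affinity({1, 2, 3, 5}, training))
```

A baseline of this kind uses no interaction information at all, so a deep model that only matches it on a benchmark may be memorizing ligand or protein similarity rather than learning binding physics, which is the concern the abstract raises.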
Affiliation(s)
- Mikhail Volkov
  - Laboratoire d'innovation thérapeutique, UMR7200 CNRS-Université de Strasbourg, 74 route du Rhin, Illkirch 67400, France
- Didier Rognan
  - Laboratoire d'innovation thérapeutique, UMR7200 CNRS-Université de Strasbourg, 74 route du Rhin, Illkirch 67400, France
35
Das P, Pal V. Integrative analysis of chemical properties and functions of drugs for adverse drug reaction prediction based on multi-label deep neural network. J Integr Bioinform 2022; 19:jib-2022-0007. [PMID: 35585715 DOI: 10.1515/jib-2022-0007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2022] [Accepted: 03/16/2022] [Indexed: 11/15/2022] Open
Abstract
The prediction of adverse drug reactions (ADRs) is an important step in the drug discovery and design process. Different drug properties have been employed for ADR prediction, but the predictive capability of drug properties and drug functions used in an integrated manner is yet to be explored. In the present work, a methodology based on a multi-label deep neural network and MLSMOTE is proposed for ADR prediction. The proposed methodology has been applied to SMILES string data of drugs, data on 17 molecular descriptors of drugs, and drug function data, individually and in an integrated manner, for ADR prediction. The experimental results show that the combination of SMILES strings and drug functions outperformed other types of data with regard to ADR prediction capability.
Affiliation(s)
- Pranab Das
  - National Institute of Technology Meghalaya, Shillong, India
- Vipin Pal
  - National Institute of Technology Meghalaya, Shillong, India
36
DTIP-TC2A: An analytical framework for drug-target interactions prediction methods. Comput Biol Chem 2022; 99:107707. [DOI: 10.1016/j.compbiolchem.2022.107707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2021] [Revised: 05/01/2022] [Accepted: 05/26/2022] [Indexed: 11/18/2022]
37
Drug repurposing in silico screening platforms. Biochem Soc Trans 2022; 50:747-758. [PMID: 35285479 DOI: 10.1042/bst20200967] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 12/09/2021] [Revised: 02/08/2022] [Accepted: 02/21/2022] [Indexed: 12/15/2022]
Abstract
Over the last decade, for the first time, substantial efforts have been directed at the development of dedicated in silico platforms for drug repurposing, including initiatives targeting cancers and conditions as diverse as cryptosporidiosis, dengue, dental caries, diabetes, herpes, lupus, malaria, tuberculosis and Covid-19 related respiratory disease. This review outlines some of the exciting advances in the specific applications of in silico approaches to the challenge of drug repurposing and focuses particularly on where these efforts have resulted in the development of generic platform technologies of broad value to researchers involved in programmatic drug repurposing work. Recent advances in molecular docking methodologies and validation approaches, and their combination with machine learning or deep learning approaches are continually enhancing the precision of repurposing efforts. The meaningful integration of better understanding of molecular mechanisms with molecular pathway data and knowledge of disease networks is widening the scope for discovery of repurposing opportunities. The power of Artificial Intelligence is being gainfully exploited to advance progress in an integrated science that extends from the sub-atomic to the whole system level. There are many promising emerging developments but there are remaining challenges to be overcome in the successful integration of the new advances in useful platforms. In conclusion, the essential component requirements for development of powerful and well optimised drug repurposing screening platforms are discussed.
38
Zhao B, Kurgan L. Deep learning in prediction of intrinsic disorder in proteins. Comput Struct Biotechnol J 2022; 20:1286-1294. [PMID: 35356546 PMCID: PMC8927795 DOI: 10.1016/j.csbj.2022.03.003] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 01/10/2022] [Revised: 03/04/2022] [Accepted: 03/04/2022] [Indexed: 12/12/2022] Open
Abstract
Intrinsic disorder prediction is an active area in which over 100 predictors have been developed. We identify and investigate a recent trend towards the development of deep neural network (DNN)-based methods. The first DNN-based method was released in 2013, and since 2019 deep learners account for the majority of the new disorder predictors. We find that the 13 currently available DNN-based predictors are diverse in their topologies, the sizes of their networks, and the inputs that they utilize. We empirically show that the deep learners are statistically more accurate than other types of disorder predictors using the blind test dataset from the recent community assessment of intrinsic disorder predictions (CAID). We also identify several well-rounded DNN-based predictors that are accurate, fast and/or conveniently available. The popularity, favorable predictive performance and architectural flexibility suggest that deep networks are likely to fuel the development of future disorder predictors. Novel hybrid designs of deep networks could be used to adequately accommodate the diversity of types and flavors of intrinsic disorder. We also discuss the scarcity of DNN-based methods for the prediction of disordered binding regions and the need to develop more accurate methods for this prediction.
Affiliation(s)
- Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
|
39
|
Factors Determining Plasticity of Responses to Drugs. Int J Mol Sci 2022; 23:ijms23042068. [PMID: 35216184 PMCID: PMC8877660 DOI: 10.3390/ijms23042068] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/05/2022] [Revised: 02/07/2022] [Accepted: 02/10/2022] [Indexed: 12/16/2022]
Abstract
The plasticity of responses to drugs is an ever-present confounding factor for all aspects of pharmacology, influencing drug discovery and development, clinical use and the expectations of the patient. As an introduction to this Special Issue of the journal IJMS on pharmacological plasticity, we address the various levels at which plasticity appears and how such variability can be controlled, describing, with examples, the ways in which drug responses can be affected. The various levels include the molecular structures of drugs and their receptors, expression of genes for drug receptors and enzymes involved in metabolism, plasticity of cells targeted by drugs, tissues and clinical variables affected by whole-body processes, changes in geography and the environment, and the influence of time and duration of changes. The article provides a rarely considered bird's-eye view of the problem and is intended to emphasize the need for increased awareness of pharmacological plasticity and to encourage further debate.
|
40
|
Sabbadini R, Pesce E, Parodi A, Mustorgi E, Bruzzone S, Pedemonte N, Casale M, Millo E, Cichero E. Probing Allosteric Hsp70 Inhibitors by Molecular Modelling Studies to Expedite the Development of Novel Combined F508del CFTR Modulators. Pharmaceuticals (Basel) 2021; 14:ph14121296. [PMID: 34959696 PMCID: PMC8709398 DOI: 10.3390/ph14121296] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2021] [Revised: 12/07/2021] [Accepted: 12/09/2021] [Indexed: 11/16/2022]
Abstract
Cystic fibrosis (CF) is caused by different mutations related to the cystic fibrosis transmembrane regulator protein (CFTR), with F508del being the most common. Following the pioneering development of CFTR modulators, made possible by effective correctors or potentiators, more recent studies have strongly encouraged the administration of triple-combination therapeutics. However, combinations of molecules interacting with other proteins involved in the functionality of the CFTR channel have recently emerged as a promising approach to achieve a large rescue of F508del-CFTR. In this context, the design of compounds properly targeting the molecular chaperone Hsp70, such as the allosteric inhibitor MKT-077, proved to be effective for the development of indirect CFTR modulators, endowed with the ability to amplify the accumulation of the rescued protein. Herein we performed structure-based studies of a number of allosteric Hsp70 inhibitors, considering the recent X-ray crystallographic structure of the human enzyme. This allowed us to identify the main interactions supporting the binding mode of MKT-077, as well as of the related analogues. In particular, cation-π and π-π stacking with the conserved residue Tyr175 strongly stabilized inhibitor binding at the Hsp70 cavity. Molecular docking studies were followed by QSAR analysis and then by virtual screening of aminoaryl thiazoles (AATs; I-IIIa) as putative Hsp70 inhibitors. Their effectiveness as CFTR modulators was verified by biological assays, in combination with VX-809, whose positive results confirmed the reliability of the whole applied computational method. Along with this, the in silico prediction of absorption, distribution, metabolism, and excretion (ADME) properties highlighted, once more, that AATs may represent a chemical class to be further investigated for the rational design of novel combinations of compounds for CF treatment.
Affiliation(s)
- Roberto Sabbadini
- Department of Pharmacy, Section of Medicinal Chemistry, School of Medical and Pharmaceutical Sciences, University of Genoa, Viale Benedetto XV, 3, 16132 Genoa, Italy
- Emanuela Pesce
- UOC Genetica Medica, IRCCS Istituto Giannina Gaslini, Via Gerolamo Gaslini, 5, 16147 Genova, Italy; (E.P.); (N.P.)
- Alice Parodi
- Department of Experimental Medicine, Section of Biochemistry, University of Genoa, Viale Benedetto XV 1, 16132 Genoa, Italy; (A.P.); (S.B.)
- Eleonora Mustorgi
- Department of Pharmacy, Section of Chemistry and Food and Pharmaceutical Technologies, University of Genoa, Viale Cembrano, 4, 16148 Genoa, Italy; (E.M.); (M.C.)
- Santina Bruzzone
- Department of Experimental Medicine, Section of Biochemistry, University of Genoa, Viale Benedetto XV 1, 16132 Genoa, Italy; (A.P.); (S.B.)
- Nicoletta Pedemonte
- UOC Genetica Medica, IRCCS Istituto Giannina Gaslini, Via Gerolamo Gaslini, 5, 16147 Genova, Italy; (E.P.); (N.P.)
- Monica Casale
- Department of Pharmacy, Section of Chemistry and Food and Pharmaceutical Technologies, University of Genoa, Viale Cembrano, 4, 16148 Genoa, Italy; (E.M.); (M.C.)
- Enrico Millo
- Department of Experimental Medicine, Section of Biochemistry, University of Genoa, Viale Benedetto XV 1, 16132 Genoa, Italy; (A.P.); (S.B.)
- Elena Cichero
- Department of Pharmacy, Section of Medicinal Chemistry, School of Medical and Pharmaceutical Sciences, University of Genoa, Viale Benedetto XV, 3, 16132 Genoa, Italy
- Correspondence: (E.M.); (E.C.); Tel.: +10-335-3032-3033 (E.M.); +39-010-353-8350 (E.C.)
|