1
Gangwal A, Ansari A, Ahmad I, Azad AK, Wan Sulaiman WMA. Current strategies to address data scarcity in artificial intelligence-based drug discovery: A comprehensive review. Comput Biol Med 2024; 179:108734. [PMID: 38964243 DOI: 10.1016/j.compbiomed.2024.108734] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2024] [Revised: 06/01/2024] [Accepted: 06/08/2024] [Indexed: 07/06/2024]
Abstract
Artificial intelligence (AI) has played a vital role in computer-aided drug design (CADD). This development has been further accelerated by the increasing use of machine learning (ML), mainly deep learning (DL), and by advances in computing hardware and software. As a result, initial doubts about the application of AI in drug discovery have been dispelled, leading to significant benefits in medicinal chemistry. At the same time, it is crucial to recognize that AI is still in its infancy and faces a few limitations that need to be addressed to harness its full potential in drug discovery. Some notable limitations are insufficient, unlabeled, and non-uniform data; the resemblance of some AI-generated molecules to existing molecules; the unavailability of adequate benchmarks; intellectual property rights (IPR)-related hurdles in data sharing; poor understanding of biology; a focus on proxy data and ligands; and the lack of holistic methods for representing input (molecular structures) that would avoid pre-processing of input molecules (feature engineering). Input data are the major component of AI infrastructure: the success of most AI-driven efforts in drug discovery depends, among a few other factors, on the quality and quantity of the data used to train and test AI algorithms. Additionally, data-hungry DL approaches may fail to live up to their promise without sufficient data. Current literature suggests that a few methods can, to a certain extent, handle low-data settings effectively and improve the output of AI models in drug discovery. These are transfer learning (TL), active learning (AL), single or one-shot learning (OSL), multi-task learning (MTL), data augmentation (DA), and data synthesis (DS). A different method, federated learning (FL), enables proprietary data to be shared on a common platform to train ML models without compromising data privacy.
In this review, we compare and discuss these methods, their recent applications, and their limitations in modeling small-molecule data to improve the output of AI methods in drug discovery. The article also summarizes some other novel methods for handling inadequate data.
Affiliation(s)
- Amit Gangwal
- Department of Natural Product Chemistry, Shri Vile Parle Kelavani Mandal's Institute of Pharmacy, Dhule, 424001, Maharashtra, India.
- Azim Ansari
- Computer Aided Drug Design Center, Shri Vile Parle Kelavani Mandal's Institute of Pharmacy, Dhule, 424001, Maharashtra, India
- Iqrar Ahmad
- Department of Pharmaceutical Chemistry, Prof. Ravindra Nikam College of Pharmacy, Gondur, Dhule, 424002, Maharashtra, India.
- Abul Kalam Azad
- Faculty of Pharmacy, University College of MAIWP International, Batu Caves, 68100, Kuala Lumpur, Malaysia.
2
Noga M, Jurowski K. Toxicity of Bromo-DragonFLY as a New Psychoactive Substance: Application of In Silico Methods for the Prediction of Key Toxicological Parameters Important to Clinical and Forensic Toxicology. Chem Res Toxicol 2024. [PMID: 39119730 DOI: 10.1021/acs.chemrestox.4c00105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/10/2024]
Abstract
Bromo-DragonFLY is a synthetic new psychoactive substance (NPS) that has gained attention due to its powerful and long-lasting hallucinogenic effects, legal status, and widespread availability. This study aimed to use various in silico toxicology methods to predict key toxicological parameters for Bromo-DragonFLY, including acute toxicity (LD50), genotoxicity, cardiotoxicity, health effects, and the potential for endocrine disruption. The results indicate significant acute toxicity with noticeable variations across different species, a low likelihood of genotoxic potential suggesting potential DNA damage, and a notable risk of cardiotoxicity associated with inhibition of the hERG channel. Evaluation of endocrine disruption suggests a low probability of Bromo-DragonFLY interacting with the estrogen receptor α (ER-α), indicating minimal estrogenic activity. These insights from in silico investigations are important for advancing our understanding of this NPS in forensic and clinical toxicology. These initial toxicological examinations establish a foundation for future research efforts and contribute to developing risk assessment and management strategies for using and misusing NPS.
Affiliation(s)
- Maciej Noga
- Department of Regulatory and Forensic Toxicology, Institute of Medical Expertises in Łódź, Ul. Aleksandrowska 67/93, 91-205 Łódź, Poland
- Kamil Jurowski
- Department of Regulatory and Forensic Toxicology, Institute of Medical Expertises in Łódź, Ul. Aleksandrowska 67/93, 91-205 Łódź, Poland
- Laboratory of Innovative Toxicological Research and Analyzes, Institute of Medical Studies, Medical College, Rzeszów University, Al. Mjr. W. Kopisto 2a, 35-959 Rzeszów, Poland
3
Lavecchia A. Navigating the frontier of drug-like chemical space with cutting-edge generative AI models. Drug Discov Today 2024; 29:104133. [PMID: 39103144 DOI: 10.1016/j.drudis.2024.104133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Revised: 07/20/2024] [Accepted: 07/31/2024] [Indexed: 08/07/2024]
Abstract
Deep generative models (GMs) have transformed the exploration of drug-like chemical space (CS) by generating novel molecules through complex, nontransparent processes, bypassing direct structural similarity. This review examines five key architectures for CS exploration: recurrent neural networks (RNNs), variational autoencoders (VAEs), generative adversarial networks (GANs), normalizing flows (NF), and Transformers. It discusses molecular representation choices, training strategies for focused CS exploration, evaluation criteria for CS coverage, and related challenges. Future directions include refining models, exploring new notations, improving benchmarks, and enhancing interpretability to better understand biologically relevant molecular properties.
Affiliation(s)
- Antonio Lavecchia
- 'Drug Discovery' Laboratory, Department of Pharmacy, University of Naples Federico II, I-80131 Naples, Italy.
4
Lavecchia A. Advancing drug discovery with deep attention neural networks. Drug Discov Today 2024; 29:104067. [PMID: 38925473 DOI: 10.1016/j.drudis.2024.104067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2024] [Revised: 06/10/2024] [Accepted: 06/19/2024] [Indexed: 06/28/2024]
Abstract
In the dynamic field of drug discovery, deep attention neural networks are revolutionizing our approach to complex data. This review explores the attention mechanism and its extended architectures, including graph attention networks (GATs), transformers, bidirectional encoder representations from transformers (BERT), generative pre-trained transformers (GPTs) and bidirectional and auto-regressive transformers (BART). Delving into their core principles and multifaceted applications, we uncover their pivotal roles in catalyzing de novo drug design, predicting intricate molecular properties and deciphering elusive drug-target interactions. Despite challenges, these attention-based architectures hold unparalleled promise to drive transformative breakthroughs and accelerate progress in pharmaceutical research.
Affiliation(s)
- Antonio Lavecchia
- Drug Discovery Laboratory, Department of Pharmacy, University of Napoli Federico II, I-80131 Naples, Italy.
5
Vittoria Togo M, Mastrolorito F, Orfino A, Graps EA, Tondo AR, Altomare CD, Ciriaco F, Trisciuzzi D, Nicolotti O, Amoroso N. Where developmental toxicity meets explainable artificial intelligence: state-of-the-art and perspectives. Expert Opin Drug Metab Toxicol 2024; 20:561-577. [PMID: 38141160 DOI: 10.1080/17425255.2023.2298827] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/05/2023] [Accepted: 12/20/2023] [Indexed: 12/24/2023]
Abstract
INTRODUCTION The application of Artificial Intelligence (AI) to predictive toxicology is rapidly increasing, particularly aiming to develop non-testing methods that effectively address ethical concerns and reduce economic costs. In this context, Developmental Toxicity (Dev Tox) stands as a key human health endpoint, especially significant for safeguarding maternal and child well-being. AREAS COVERED This review outlines the existing methods employed in Dev Tox predictions and underscores the benefits of utilizing New Approach Methodologies (NAMs), specifically focusing on eXplainable Artificial Intelligence (XAI), which proves highly efficient in constructing reliable and transparent models aligned with recommendations from international regulatory bodies. EXPERT OPINION The limited availability of high-quality data and the absence of dependable Dev Tox methodologies render XAI an appealing avenue for systematically developing interpretable and transparent models, which hold immense potential for both scientific evaluations and regulatory decision-making.
Affiliation(s)
- Maria Vittoria Togo
- Department of Pharmacy - Pharmaceutical Sciences, Università degli Studi di Bari "Aldo Moro", Bari, Italy
- Fabrizio Mastrolorito
- Department of Pharmacy - Pharmaceutical Sciences, Università degli Studi di Bari "Aldo Moro", Bari, Italy
- Angelica Orfino
- Department of Pharmacy - Pharmaceutical Sciences, Università degli Studi di Bari "Aldo Moro", Bari, Italy
- Elisabetta Anna Graps
- ARESS Puglia - Agenzia Regionale strategica per la Salute ed il Sociale, Presidenza della Regione Puglia, Bari, Italy
- Anna Rita Tondo
- Department of Pharmacy - Pharmaceutical Sciences, Università degli Studi di Bari "Aldo Moro", Bari, Italy
- Cosimo Damiano Altomare
- Department of Pharmacy - Pharmaceutical Sciences, Università degli Studi di Bari "Aldo Moro", Bari, Italy
- Fulvio Ciriaco
- Department of Chemistry, Università degli Studi di Bari "Aldo Moro", Bari, Italy
- Daniela Trisciuzzi
- Department of Pharmacy - Pharmaceutical Sciences, Università degli Studi di Bari "Aldo Moro", Bari, Italy
- Orazio Nicolotti
- Department of Pharmacy - Pharmaceutical Sciences, Università degli Studi di Bari "Aldo Moro", Bari, Italy
- Nicola Amoroso
- Department of Pharmacy - Pharmaceutical Sciences, Università degli Studi di Bari "Aldo Moro", Bari, Italy
6
Saunders A, Harrington PDB. Advances in Activity/Property Prediction from Chemical Structures. Crit Rev Anal Chem 2024; 54:135-147. [PMID: 35482792 DOI: 10.1080/10408347.2022.2066461] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
Recent technological advancement in AI modeling of molecular property databases has significantly expanded the opportunities for drug design and development. Quantitative structure-activity relationships (QSARs) are shown to provide more accurate predictions with regards to biological activity as well as toxicological assessment. By using a combination of in-silico models or by combining disparate structure-activity databases, researchers have been able to improve accuracy for a variety of drug discovery and analysis methods, generating viable compounds, which in certain cases, can be synthesized and further studied in vitro to find candidates for potential development. Additionally, the development of compounds of determined toxicology can be discontinued earlier, allowing alternative routes to be evaluated, preventing wasted time and resources. Although the progress that has been made is tremendous, expert review is still necessary for most in-silico generated predictions. Regardless, the scientific community continues to move ever closer to completely automated drug discovery and evaluation.
Affiliation(s)
- Arianne Saunders
- Department of Chemistry and Biochemistry, Ohio University, Athens, Ohio, USA
7
Rosa LS, Argolo CO, Nascimento CM, Pimentel AS. Identifying Substructures That Facilitate Compounds to Penetrate the Blood-Brain Barrier via Passive Transport Using Machine Learning Explainer Models. ACS Chem Neurosci 2024; 15:2144-2159. [PMID: 38723285 PMCID: PMC11157485 DOI: 10.1021/acschemneuro.3c00840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Revised: 04/15/2024] [Accepted: 04/16/2024] [Indexed: 06/06/2024] Open
Abstract
The local interpretable model-agnostic explanation (LIME) method was used to interpret two machine learning models of compounds penetrating the blood-brain barrier. The classification models, Random Forest, ExtraTrees, and Deep Residual Network, were trained and validated using the blood-brain barrier penetration dataset, which shows the penetrability of compounds in the blood-brain barrier. LIME was able to create explanations for such penetrability, highlighting the most important substructures of molecules that affect drug penetration in the barrier. The simple and intuitive outputs prove the applicability of this explainable model to interpreting the permeability of compounds across the blood-brain barrier in terms of molecular features. LIME explanations were filtered with a weight equal to or greater than 0.1 to obtain only the most relevant explanations. The results showed several structures that are important for blood-brain barrier penetration. In general, it was found that some compounds with nitrogenous substructures are more likely to permeate the blood-brain barrier. The application of these structural explanations may help the pharmaceutical industry and potential drug synthesis research groups to synthesize active molecules more rationally.
Affiliation(s)
- Lucca Caiaffa Santos Rosa
- Departamento de Química, Pontifícia Universidade Católica do Rio de Janeiro, Rio de Janeiro, RJ 22453-900, Brazil
- Caio Oliveira Argolo
- Departamento de Química, Pontifícia Universidade Católica do Rio de Janeiro, Rio de Janeiro, RJ 22453-900, Brazil
- Andre Silva Pimentel
- Departamento de Química, Pontifícia Universidade Católica do Rio de Janeiro, Rio de Janeiro, RJ 22453-900, Brazil
8
Gangwal A, Lavecchia A. Unleashing the power of generative AI in drug discovery. Drug Discov Today 2024; 29:103992. [PMID: 38663579 DOI: 10.1016/j.drudis.2024.103992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Revised: 03/22/2024] [Accepted: 04/18/2024] [Indexed: 05/04/2024]
Abstract
Artificial intelligence (AI) is revolutionizing drug discovery by enhancing precision, reducing timelines and costs, and enabling AI-driven computer-aided drug design. This review focuses on recent advancements in deep generative models (DGMs) for de novo drug design, exploring diverse algorithms and their profound impact. It critically analyses the challenges that are intricately interwoven into these technologies, proposing strategies to unlock their full potential. It features case studies of both successes and failures in advancing drugs to clinical trials with AI assistance. Last, it outlines a forward-looking plan for optimizing DGMs in de novo drug design, thereby fostering faster and more cost-effective drug development.
Affiliation(s)
- Amit Gangwal
- Department of Natural Product Chemistry, Shri Vile Parle Kelavani Mandal's Institute of Pharmacy, Dhule 424001, Maharashtra, India
- Antonio Lavecchia
- "Drug Discovery" Laboratory, Department of Pharmacy, University of Naples Federico II, I-80131 Naples, Italy.
9
Zhou Y, Wang Z, Huang Z, Li W, Chen Y, Yu X, Tang Y, Liu G. In silico prediction of ocular toxicity of compounds using explainable machine learning and deep learning approaches. J Appl Toxicol 2024; 44:892-907. [PMID: 38329145 DOI: 10.1002/jat.4586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 01/16/2024] [Accepted: 01/16/2024] [Indexed: 02/09/2024]
Abstract
The accurate identification of chemicals with ocular toxicity is of paramount importance in health hazard assessment. In contemporary chemical toxicology, there is a growing emphasis on refining, reducing, and replacing animal testing in safety evaluations. Therefore, the development of robust computational tools is crucial for regulatory applications. The performance of predictive models is heavily reliant on the quality and quantity of data. In this investigation, we amalgamated the most extensive dataset (4901 compounds) sourced from governmental GHS-compliant databases and literature to develop binary classification models of chemical ocular toxicity. We employed 12 molecular representations in conjunction with six machine learning algorithms and two deep learning algorithms to create a series of binary classification models. The findings indicated that the deep learning method GCN outperformed the machine learning models in cross-validation, achieving an impressive AUC of 0.915. However, the top-performing machine learning model (RF-Descriptor) demonstrated excellent performance with an AUC of 0.869 on the test set and was therefore selected as the best model. To enhance model interpretability, we conducted the SHAP method and attention weights analysis. The two approaches offered visual depictions of the relevance of key descriptors and substructures in predicting ocular toxicity of chemicals. Thus, we successfully struck a delicate balance between data quality and model interpretability, rendering our model valuable for predicting and comprehending potential ocular-toxic compounds in the early stages of drug discovery.
Affiliation(s)
- Yiqing Zhou
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Ze Wang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Zejun Huang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Weihua Li
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Yuanting Chen
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Xinxin Yu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Yun Tang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Guixia Liu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
10
Sirocchi C, Biancucci F, Donati M, Bogliolo A, Magnani M, Menotta M, Montagna S. Exploring machine learning for untargeted metabolomics using molecular fingerprints. Comput Methods Programs Biomed 2024; 250:108163. [PMID: 38626559 DOI: 10.1016/j.cmpb.2024.108163] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Revised: 03/15/2024] [Accepted: 04/03/2024] [Indexed: 04/18/2024]
Abstract
BACKGROUND Metabolomics, the study of substrates and products of cellular metabolism, offers valuable insights into an organism's state under specific conditions and has the potential to revolutionise preventive healthcare and pharmaceutical research. However, analysing large metabolomics datasets remains challenging, with available methods relying on limited and incompletely annotated metabolic pathways. METHODS This study, inspired by well-established methods in drug discovery, employs machine learning on metabolite fingerprints to explore the relationship of their structure with responses in experimental conditions beyond known pathways, shedding light on metabolic processes. It evaluates fingerprinting effectiveness in representing metabolites, addressing challenges like class imbalance, data sparsity, high dimensionality, duplicate structural encoding, and interpretable features. Feature importance analysis is then applied to reveal key chemical configurations affecting classification, identifying related metabolite groups. RESULTS The approach is tested on two datasets: one on Ataxia Telangiectasia and another on endothelial cells under low oxygen. Machine learning on molecular fingerprints predicts metabolite responses effectively, and feature importance analysis aligns with known metabolic pathways, unveiling new affected metabolite groups for further study. CONCLUSION In conclusion, the presented approach leverages the strengths of drug discovery to address critical issues in metabolomics research and aims to bridge the gap between these two disciplines. This work lays the foundation for future research in this direction, possibly exploring alternative structural encodings and machine learning models.
Affiliation(s)
- Christel Sirocchi
- Department of Pure and Applied Sciences, University of Urbino, Piazza della Repubblica, 13, Urbino, 61029, Italy.
- Federica Biancucci
- Department of Biomolecular Sciences, University of Urbino, Via Saffi 2, Urbino, 61029, Italy
- Matteo Donati
- Department of Pure and Applied Sciences, University of Urbino, Piazza della Repubblica, 13, Urbino, 61029, Italy
- Alessandro Bogliolo
- Department of Pure and Applied Sciences, University of Urbino, Piazza della Repubblica, 13, Urbino, 61029, Italy
- Mauro Magnani
- Department of Biomolecular Sciences, University of Urbino, Via Saffi 2, Urbino, 61029, Italy
- Michele Menotta
- Department of Biomolecular Sciences, University of Urbino, Via Saffi 2, Urbino, 61029, Italy
- Sara Montagna
- Department of Pure and Applied Sciences, University of Urbino, Piazza della Repubblica, 13, Urbino, 61029, Italy
11
Gyebi GA, Ogunyemi OM, Ibrahim IM, Ogunro OB, Afolabi SO, Ojo RJ, Anyanwu GO, El-Saber Batiha G, Adebayo JO. Identification of potential inhibitors of cholinergic and β-secretase enzymes from phytochemicals derived from Gongronema latifolium Benth leaf: an integrated computational analysis. Mol Divers 2024; 28:1305-1322. [PMID: 37338673 DOI: 10.1007/s11030-023-10658-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2022] [Accepted: 05/13/2023] [Indexed: 06/21/2023]
Abstract
Neurodegenerative disorders (NDDs) are associated with increased activities of the brain acetylcholinesterase (AChE), butyrylcholinesterase (BChE) and β-secretase enzyme (BACE1). Inhibition of these enzymes affords a therapeutic option for managing NDDs such as Alzheimer's disease (AD) and Parkinson's disease (PD). Although Gongronema latifolium Benth (GL) has been widely documented in ethnopharmacological and scientific reports for the management of NDDs, there is a paucity of information on its underlying mechanism and neurotherapeutic constituents. Herein, 152 previously reported Gongronema latifolium-derived phytochemicals (GLDP) were screened against hAChE, hBChE and hBACE-1 using molecular docking, molecular dynamics (MD) simulations, free energy of binding calculations and cluster analysis. The computational analysis identified silymarin, alpha-amyrin and teraxeron with the highest binding energies (-12.3, -11.2, -10.5 kcal/mol) for hAChE, hBChE and hBACE-1 respectively as compared with those of the reference inhibitors (-12.3, -9.8 and -9.4 for donepezil, propidium and aminoquinoline compound respectively). These best docked phytochemicals were found to be orientated in the hydrophobic gorge where they interacted with the choline-binding pocket in the A-site and P-site of the cholinesterase and subsites S1, S3, S3' and flip (67-75) residues of the pocket of the BACE-1. The best docked phytochemicals complexed with the target proteins were stable in a 100 ns molecular dynamics simulation. The interactions with the catalytic residues were preserved during the simulation as observed from the MMGBSA decomposition and cluster analyses. These phytocompounds, most notably silymarin, which demonstrated high binding tendencies to both cholinesterases, were identified as potential neurotherapeutics subject to further investigation.
Affiliation(s)
- Gideon Ampoma Gyebi
- Department of Biochemistry, Faculty of Science and Technology, P.M.B 005, Karu, Nasarawa State, Nigeria.
- Natural Products and Structural (Bio-Chem)-informatics Research Laboratory (NpsBC-Rl), Bingham University, Nasarawa, Nigeria.
- Oludare M Ogunyemi
- Nutritional and Industrial Biochemistry Unit, Department of Biochemistry, Faculty of Basic Medical Sciences, College of Medicine, University of Ibadan, Ibadan, Nigeria
- Ibrahim M Ibrahim
- Department of Biophysics, Faculty of Sciences, Cairo University, Giza, Egypt
- Olalekan B Ogunro
- Department of Biological Sciences, KolaDaisi University, Ibadan, Nigeria
- Saheed O Afolabi
- Faculty of Basic Medical Sciences, Department of Pharmacology and Therapeutics, University of Ilorin, Ilorin, Nigeria
- Rotimi J Ojo
- Department of Biochemistry, Faculty of Computing and Applied Sciences, Baze University, Abuja, Nigeria
- Gabriel O Anyanwu
- Department of Biochemistry, Faculty of Science and Technology, P.M.B 005, Karu, Nasarawa State, Nigeria
- Gaber El-Saber Batiha
- Department of Pharmacology and Therapeutics, Faculty of Veterinary Medicine, Damanhour University, Damanhour, AlBeheira, 22511, Egypt
- Joseph O Adebayo
- Department of Biochemistry, Faculty of Life Sciences, University of Ilorin, Ilorin, Nigeria
12
Taub R, Savir Y. SAF: Smart Aggregation Framework for Revealing Atoms Importance Rank and Improving Prediction Rates in Drug Discovery. J Chem Inf Model 2024; 64:4021-4030. [PMID: 38695490 PMCID: PMC11134513 DOI: 10.1021/acs.jcim.4c00107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Revised: 04/16/2024] [Accepted: 04/16/2024] [Indexed: 05/28/2024]
Abstract
Machine learning, and representation learning in particular, has the potential to facilitate drug discovery by screening a large chemical space in silico. A successful approach for representing molecules is to treat them as graphs and utilize graph neural networks. One of the key limitations of such methods is the necessity to represent compounds with different numbers of atoms, which requires aggregating the atom's information. Common aggregation operators, such as averaging, result in a loss of information at the atom level. In this work, we propose a novel aggregating approach where each atom is weighted nonlinearly using the Boltzmann distribution with a hyperparameter analogous to temperature. We show that using this weighted aggregation improves the ability of the gold standard message-passing neural network to predict antibiotic activity. Moreover, by changing the temperature hyperparameter, our approach can reveal the atoms that are important for activity prediction in a smooth and consistent way, thus providing a novel regulated attention mechanism for graph neural networks. We further validate our method by showing that it recapitulates the functional group in β-lactam antibiotics. The ability of our approach to rank the atoms' importance for a desired function can be used within any graph neural network to provide interpretability of the results and predictions at the node level.
Affiliation(s)
- Ronen Taub
- Department of Physiology, Biophysics & Systems Biology, Medicine Faculty, Technion IIT, Haifa 3525422, Israel
- Yonatan Savir
- Department of Physiology, Biophysics & Systems Biology, Medicine Faculty, Technion IIT, Haifa 3525422, Israel
13
Kumar N, Acharya V. Advances in machine intelligence-driven virtual screening approaches for big-data. Med Res Rev 2024; 44:939-974. [PMID: 38129992 DOI: 10.1002/med.21995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Revised: 07/15/2023] [Accepted: 10/29/2023] [Indexed: 12/23/2023]
Abstract
Virtual screening (VS) is an integral and ever-evolving domain of the drug discovery framework. VS is traditionally classified into ligand-based (LB) and structure-based (SB) approaches. Machine intelligence, or artificial intelligence, has wide applications in the drug discovery domain to reduce time and resource consumption. In combination with machine intelligence algorithms, VS has emerged as a revolutionary, progressive technology that learns within robust decision orders for data curation and hit molecule screening from large VS libraries in minutes or hours. The exponential growth of chemical and biological data, which has evolved into "big-data" in the public domain, demands modern and advanced machine intelligence-driven VS approaches to screen hit molecules from ultra-large VS libraries. VS has evolved from individual approaches (LB and SB) to integrated LB and SB techniques that explore various ligand and target protein aspects for an enhanced rate of appropriate hit molecule prediction. Current trends demand advanced and intelligent solutions to handle enormous data in the drug discovery domain for screening and optimizing hits or leads with few or no false-positive hits. Following the big-data drift and tremendous growth in computational architecture, we present this review. The article categorizes and emphasizes individual VS techniques, details the literature on machine learning implementation, covers modern machine intelligence approaches and their limitations, and deliberates on future prospects.
Affiliation(s)
- Neeraj Kumar
  - Artificial Intelligence for Computational Biology Lab (AICoB), Biotechnology Division, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
  - Academy of Scientific and Innovative Research, Ghaziabad, India
- Vishal Acharya
  - Artificial Intelligence for Computational Biology Lab (AICoB), Biotechnology Division, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
  - Academy of Scientific and Innovative Research, Ghaziabad, India
|
14
|
Abad-Zapatero C. Artificial intelligence (AI) and alternative variables (AV) in drug discovery: A promising alliance. Drug Discov Today 2024; 29:103978. [PMID: 38599277 DOI: 10.1016/j.drudis.2024.103978] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2024] [Revised: 03/31/2024] [Accepted: 04/02/2024] [Indexed: 04/12/2024]
Affiliation(s)
- Celerino Abad-Zapatero
  - Institute for Tuberculosis Research, Center for Biomolecular Sciences, Department of Pharmaceutical Sciences, College of Pharmacy, University of Illinois Chicago, Chicago, IL, 60607, United States.
|
15
|
Gasperini D, Howe GA. Phytohormones in a universe of regulatory metabolites: lessons from jasmonate. Plant Physiol 2024; 195:135-154. [PMID: 38290050 PMCID: PMC11060663 DOI: 10.1093/plphys/kiae045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Revised: 01/05/2024] [Accepted: 01/05/2024] [Indexed: 02/01/2024]
Abstract
Small-molecule phytohormones exert control over plant growth, development, and stress responses by coordinating the patterns of gene expression within and between cells. Increasing evidence indicates that currently recognized plant hormones are part of a larger group of regulatory metabolites that have acquired signaling properties during the evolution of land plants. This rich assortment of chemical signals reflects the tremendous diversity of plant secondary metabolism, which offers evolutionary solutions to the daunting challenges of sessility and other unique aspects of plant biology. A major gap in our current understanding of plant regulatory metabolites is the lack of insight into the direct targets of these compounds. Here, we illustrate the blurred distinction between classical phytohormones and other bioactive metabolites by highlighting the major scientific advances that transformed the view of jasmonate from an interesting floral scent to a potent transcriptional regulator. Lessons from jasmonate research generally apply to other phytohormones and thus may help provide a broad understanding of regulatory metabolite-protein interactions. In providing a framework that links small-molecule diversity to transcriptional plasticity, we hope to stimulate future research to explore the evolution, functions, and mechanisms of perception of a broad range of plant regulatory metabolites.
Affiliation(s)
- Debora Gasperini
  - Department of Molecular Signal Processing, Leibniz Institute of Plant Biochemistry, Halle 06120, Germany
- Gregg A Howe
  - Department of Energy-Plant Research Laboratory, Michigan State University, East Lansing, MI 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
  - Plant Resilience Institute, Michigan State University, East Lansing, MI 42284, USA
|
16
|
Daoud S, Taha M. Protein characteristics substantially influence the propensity of activity cliffs among kinase inhibitors. Sci Rep 2024; 14:9058. [PMID: 38643174 PMCID: PMC11032345 DOI: 10.1038/s41598-024-59501-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2023] [Accepted: 04/11/2024] [Indexed: 04/22/2024] Open
Abstract
Activity cliffs (ACs) are pairs of structurally similar molecules with significantly different affinities for a biotarget, posing a challenge in computer-assisted drug discovery. This study focuses on protein kinases, which are significant therapeutic targets; some kinases exhibit ACs while others do not, despite having numerous inhibitors. The hypothesis that the presence of ACs depends on the target protein and its complete structural context is explored. Machine learning models were developed to link protein properties to ACs, revealing specific tripeptide sequences and overall protein properties as critical factors in AC occurrence. The study highlights the importance of considering the entire protein matrix rather than just the binding site in understanding ACs. This research provides valuable insights for drug discovery and design, paving the way for addressing AC-related challenges in modern computational approaches.
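The activity-cliff definition in this abstract (similar structures, very different affinities) can be illustrated with a short sketch. This is not the authors' method. The feature sets, the Tanimoto threshold of 0.8, and the 2-log-unit potency gap are assumptions chosen for illustration, and the compound names and values are hypothetical toy data.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of structural features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_activity_cliffs(compounds, sim_threshold=0.8, potency_gap=2.0):
    """Return pairs that are structurally similar but differ strongly in potency.

    compounds: dict mapping a name to (feature set, potency on a log scale).
    Both thresholds are illustrative assumptions, not values from the source.
    """
    cliffs = []
    for (n1, (f1, p1)), (n2, (f2, p2)) in combinations(compounds.items(), 2):
        similar = tanimoto(f1, f2) >= sim_threshold
        far_apart = abs(p1 - p2) >= potency_gap
        if similar and far_apart:
            cliffs.append((n1, n2))
    return cliffs


# Hypothetical toy data: integer IDs stand in for fingerprint bits.
compounds = {
    "cpd_A": (frozenset({1, 2, 3, 4, 5}), 8.5),
    "cpd_B": (frozenset({1, 2, 3, 4, 5, 6}), 5.9),
    "cpd_C": (frozenset({7, 8, 9}), 8.4),
}
cliffs = find_activity_cliffs(compounds)
print(cliffs)
```

In practice the feature sets would come from a cheminformatics fingerprint, and the thresholds would be tuned to the assay and target in question.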
Affiliation(s)
- Safa Daoud
  - Department of Pharmaceutical Chemistry and Pharmacognosy, Faculty of Pharmacy, Applied Sciences Private University, Amman, Jordan.
- Mutasem Taha
  - Department of Pharmaceutical Sciences, Faculty of Pharmacy, University of Jordan, Amman, Jordan.
|
17
|
Duda J, Podlewska S. Prediction of probability distributions of molecular properties: towards more efficient virtual screening and better understanding of compound representations. Mol Divers 2024; 28:437-448. [PMID: 36586082 DOI: 10.1007/s11030-022-10589-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2022] [Accepted: 12/18/2022] [Indexed: 01/01/2023]
Abstract
Various in silico approaches to predicting the activity and properties of chemical compounds now form the basis of computer-aided drug design. While there is a general focus on predicting values, it is mathematically more appropriate to predict probability distributions, which offers additional possibilities such as the evaluation of uncertainty, higher moments, and quantiles. In this study, we applied the Hierarchical Correlation Reconstruction approach to assess several ADMET properties of chemical compounds. It uses multiple linear regression to independently assess multiple moments, which are then combined into a predicted probability distribution. The method enables inexpensive selection of compounds whose properties are nearly certain to fall into a particular range during virtual screening, and automatic rejection of predictions characterized by high uncertainty; however, unlike currently used virtual screening methods, it focuses on predicting the property distribution, not its actual value. Moreover, the presented protocol enables detection of structural features that should be carefully considered when optimizing compounds towards a particular property, and it provides a deeper understanding of the examined compound representations.
Affiliation(s)
- Jarosław Duda
  - Faculty of Mathematics and Computer Science, Jagiellonian University, Łojasiewicza 6, 30-348, Kraków, Poland
- Sabina Podlewska
  - Department of Medicinal Chemistry, Maj Institute of Pharmacology, Polish Academy of Sciences, Smętna Street 12, 31-343, Kraków, Poland.
|
18
|
Wang Z, Wang S, Li Y, Guo J, Wei Y, Mu Y, Zheng L, Li W. A new paradigm for applying deep learning to protein-ligand interaction prediction. Brief Bioinform 2024; 25:bbae145. [PMID: 38581420 PMCID: PMC10998640 DOI: 10.1093/bib/bbae145] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Revised: 02/21/2024] [Accepted: 03/18/2024] [Indexed: 04/08/2024] Open
Abstract
Protein-ligand interaction prediction presents a significant challenge in drug design. Numerous machine learning and deep learning (DL) models have been developed to accurately identify docking poses of ligands and active compounds against specific targets. However, current models often suffer from inadequate accuracy or lack practical physical significance in their scoring systems. In this research paper, we introduce IGModel, a novel approach that utilizes the geometric information of protein-ligand complexes as input for predicting the root mean square deviation of docking poses and the binding strength (pKd, the negative value of the logarithm of binding affinity) within the same prediction framework. This ensures that the output scores carry intuitive meaning. We extensively evaluate the performance of IGModel on various docking power test sets, including the CASF-2016 benchmark, PDBbind-CrossDocked-Core and DISCO set, consistently achieving state-of-the-art accuracies. Furthermore, we assess IGModel's generalizability and robustness by evaluating it on unbiased test sets and sets containing target structures generated by AlphaFold2. The exceptional performance of IGModel on these sets demonstrates its efficacy. Additionally, we visualize the latent space of protein-ligand interactions encoded by IGModel and conduct interpretability analysis, providing valuable insights. This study presents a novel framework for DL-based prediction of protein-ligand interactions, contributing to the advancement of this field. The IGModel is available at GitHub repository https://github.com/zchwang/IGModel.
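The two quantities this abstract names as prediction targets, the root mean square deviation (RMSD) of a docking pose and pKd, can be written out as a minimal sketch. It is not taken from IGModel. It assumes Kd is given in mol/L, that the two poses list the same atoms in the same order, and that no superposition or symmetry correction is applied.

```python
import math


def pkd_from_kd(kd_molar: float) -> float:
    """pKd is the negative base-10 logarithm of the dissociation constant (mol/L)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return -math.log10(kd_molar)


def rmsd(pose_a, pose_b) -> float:
    """Root mean square deviation between two poses with matched atom order.

    Each pose is a list of (x, y, z) tuples; no alignment is performed here.
    """
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and the same length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b)
    )
    return math.sqrt(squared / len(pose_a))


# Hypothetical values: a 1 nM binder and a pose shifted by 1 angstrom along z.
example_pkd = pkd_from_kd(1e-9)
example_rmsd = rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 1), (1, 0, 1)])
print(example_pkd, example_rmsd)
```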
Affiliation(s)
- Zechen Wang
  - School of Physics, Shandong University, South Shanda Road, 250100 Shandong, China
- Sheng Wang
  - Shanghai Zelixir Biotech, Xiangke Road, 200030, Shanghai, China
- Yangyang Li
  - School of Physics, Shandong University, South Shanda Road, 250100 Shandong, China
- Jingjing Guo
  - Centre in Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Rua de Luís Gonzaga Gomes, Macao, China
- Yanjie Wei
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Xueyuan Road 1068, Shenzhen, 518055 Guang Dong, China
- Yuguang Mu
  - School of Biological Sciences, Nanyang Technological University, Singapore
- Liangzhen Zheng
  - Shanghai Zelixir Biotech, Xiangke Road, 200030, Shanghai, China
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Xueyuan Road 1068, Shenzhen, 518055 Guang Dong, China
- Weifeng Li
  - School of Physics, Shandong University, South Shanda Road, 250100 Shandong, China
|
19
|
Banat R, Daoud S, Taha MO. Ligand-based pharmacophore modeling and machine learning for the discovery of potent aurora A kinase inhibitory leads of novel chemotypes. Mol Divers 2024:10.1007/s11030-024-10814-y. [PMID: 38446372 DOI: 10.1007/s11030-024-10814-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Accepted: 01/19/2024] [Indexed: 03/07/2024]
Abstract
Aurora-A (AURKA) is a serine/threonine protein kinase involved in the regulation of numerous processes of cell division. Numerous studies have demonstrated a strong association between AURKA and cancer. AURKA is overexpressed in many cancers, such as colon, breast, and prostate cancers. Consequently, AURKA has emerged as a promising target for therapeutic intervention in cancer management. Herein, we describe a computational workflow for the discovery of novel AURKA inhibitory leads, starting with ligand-based assessment of the pharmacophoric space of six diverse sets of inhibitors. Subsequently, machine learning/QSAR modeling was coupled with a genetic function algorithm to search for the best possible combination of machine learner, ligand-based pharmacophore(s), and molecular descriptors capable of explaining variation in anti-AURKA bioactivities within a collected list of inhibitors. Two learners achieved acceptable structure/activity correlations, namely random forests and extreme gradient boosting (XGBoost). Three pharmacophores emerged in the successful ML models. These were then used as 3D search queries to mine the National Cancer Institute database for novel anti-AURKA leads. The 38 top-ranking hits were assessed in vitro for their anti-AURKA bioactivities. Among them, three compounds exhibited promising dose-response curves, with experimental IC50 values ranging from sub-micromolar to low micromolar. Remarkably, two of these compounds are of novel chemotypes.
Affiliation(s)
- Rajaa Banat
  - Department of Pharmaceutical Sciences, Faculty of Pharmacy, University of Jordan, Amman, Jordan
- Safa Daoud
  - Department of Pharmaceutical Chemistry and Pharmacognosy, Faculty of Pharmacy, Applied Sciences Private University, Amman, Jordan
- Mutasem Omar Taha
  - Department of Pharmaceutical Sciences, Faculty of Pharmacy, University of Jordan, Amman, Jordan.
|
20
|
Carbone MR, Maffettone PM, Qu X, Yoo S, Lu D. Accurate, Uncertainty-Aware Classification of Molecular Chemical Motifs from Multimodal X-ray Absorption Spectroscopy. J Phys Chem A 2024. [PMID: 38416723 DOI: 10.1021/acs.jpca.3c06910] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/01/2024]
Abstract
Accurate classification of molecular chemical motifs from experimental measurement is an important problem in molecular physics, chemistry, and biology. In this work, we present neural network ensemble classifiers for predicting the presence (or lack thereof) of 41 different chemical motifs on small molecules from simulated C, N, and O K-edge X-ray absorption near-edge structure (XANES) spectra. Our classifiers not only achieve class-balanced accuracies of more than 0.95 but also accurately quantify uncertainty. We also show that including multiple XANES modalities improves predictions notably on average, demonstrating a "multimodal advantage" over any single modality. In addition to structure refinement, our approach can be generalized to broad applications with molecular design pipelines.
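The terms "class-balanced accuracy" and ensemble-based uncertainty used in this abstract can be sketched as follows. The abstract does not say how the authors compute uncertainty; using the spread (standard deviation) of the ensemble members' predicted probabilities is an assumption made here for illustration, and all labels and probabilities are hypothetical.

```python
from statistics import mean, pstdev


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls for binary labels 0 and 1."""
    recalls = []
    for label in (0, 1):
        idx = [i for i, y in enumerate(y_true) if y == label]
        if idx:  # skip a class that never occurs in y_true
            recalls.append(sum(y_pred[i] == label for i in idx) / len(idx))
    return mean(recalls)


def ensemble_predict(member_probs, threshold=0.5):
    """Combine the member probabilities for one sample.

    Returns (label, mean probability, spread). The spread is an assumed
    uncertainty proxy: the population standard deviation across members.
    """
    p = mean(member_probs)
    return int(p >= threshold), p, pstdev(member_probs)


# Hypothetical imbalanced toy labels: the motif is present in 2 of 8 molecules.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0]
bal_acc = balanced_accuracy(y_true, y_pred)

confident = ensemble_predict([0.90, 0.80, 0.85])
uncertain = ensemble_predict([0.10, 0.90, 0.50])
print(bal_acc, confident, uncertain)
```

Plain accuracy on the toy labels is 7/8, while the balanced value is lower because the rare class is only half recovered. That difference is the reason a class-balanced metric is reported for imbalanced motif labels.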
Affiliation(s)
- Matthew R Carbone
  - Computational Science Initiative, Brookhaven National Laboratory, Upton, New York 11973, United States
- Phillip M Maffettone
  - National Synchrotron Light Source II, Brookhaven National Laboratory, Upton, New York 11973, United States
- Xiaohui Qu
  - Center for Functional Nanomaterials, Brookhaven National Laboratory, Upton, New York 11973, United States
- Shinjae Yoo
  - Computational Science Initiative, Brookhaven National Laboratory, Upton, New York 11973, United States
- Deyu Lu
  - Center for Functional Nanomaterials, Brookhaven National Laboratory, Upton, New York 11973, United States
|
21
|
Lindley S, Lu Y, Shukla D. The Experimentalist's Guide to Machine Learning for Small Molecule Design. ACS Appl Bio Mater 2024; 7:657-684. [PMID: 37535819 PMCID: PMC10880109 DOI: 10.1021/acsabm.3c00054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2023] [Accepted: 07/17/2023] [Indexed: 08/05/2023]
Abstract
Initially part of the field of artificial intelligence, machine learning (ML) has become a booming research area since branching out into its own field in the 1990s. After three decades of refinement, ML algorithms have accelerated scientific developments across a variety of research topics. The field of small molecule design is no exception, and an increasing number of researchers are applying ML techniques in their pursuit of discovering, generating, and optimizing small molecule compounds. The goal of this review is to provide simple, yet descriptive, explanations of some of the most commonly utilized ML algorithms in the field of small molecule design along with those that are highly applicable to an experimentally focused audience. The algorithms discussed here span across three ML paradigms: supervised learning, unsupervised learning, and ensemble methods. Examples from the published literature will be provided for each algorithm. Some common pitfalls of applying ML to biological and chemical data sets will also be explained, alongside a brief summary of a few more advanced paradigms, including reinforcement learning and semi-supervised learning.
Affiliation(s)
- Sarah E. Lindley
  - Department of Bioengineering, University of Illinois, Urbana−Champaign, Illinois 61801, United States
- Yiyang Lu
  - Department of Chemical and Biomolecular Engineering, University of Illinois, Urbana−Champaign, Illinois 61801, United States
- Diwakar Shukla
  - Department of Bioengineering, University of Illinois, Urbana−Champaign, Illinois 61801, United States
  - Department of Chemical and Biomolecular Engineering, University of Illinois, Urbana−Champaign, Illinois 61801, United States
  - Center for Biophysics & Computational Biology, University of Illinois, Urbana−Champaign, Illinois 61801, United States
  - Department of Plant Biology, University of Illinois, Urbana−Champaign, Illinois 61801, United States
|
22
|
Mastrolorito F, Togo MV, Gambacorta N, Trisciuzzi D, Giannuzzi V, Bonifazi F, Liantonio A, Imbrici P, De Luca A, Altomare CD, Ciriaco F, Amoroso N, Nicolotti O. TISBE: A Public Web Platform for the Consensus-Based Explainable Prediction of Developmental Toxicity. Chem Res Toxicol 2024; 37:323-339. [PMID: 38200616 DOI: 10.1021/acs.chemrestox.3c00310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/12/2024]
Abstract
Despite being extremely relevant for the protection of prenatal and neonatal health, the developmental toxicity (Dev Tox) is a highly complex endpoint whose molecular rationale is still largely unknown. The lack of availability of high-quality data as well as robust nontesting methods makes its understanding even more difficult. Thus, the application of new explainable alternative methods is of utmost importance, with Dev Tox being one of the most animal-intensive research themes of regulatory toxicology. Descending from TIRESIA (Toxicology Intelligence and Regulatory Evaluations for Scientific and Industry Applications), the present work describes TISBE (TIRESIA Improved on Structure-Based Explainability), a new public web platform implementing four fundamental advancements for in silico analyses: a three times larger dataset, a transparent XAI (explainable artificial intelligence) framework employing a fragment-based fingerprint coding, a novel consensus classifier based on five independent machine learning models, and a new applicability domain (AD) method based on a double top-down approach for better estimating the prediction reliability. The training set (TS) includes as many as 1008 chemicals annotated with experimental toxicity values. Based on a 5-fold cross-validation, a median value of 0.410 for the Matthews correlation coefficient was calculated; TISBE was very effective, with a median value of sensitivity and specificity equal to 0.984 and 0.274, respectively. TISBE was applied on two external pools made of 1484 bioactive compounds and 85 pediatric drugs taken from ChEMBL (Chemical European Molecular Biology Laboratory) and TEDDY (Task-Force in Europe for Drug Development in the Young) repositories, respectively. Notably, TISBE gives users the option to clearly spot the molecular fragments responsible for the toxicity or the safety of a given chemical query and is available for free at https://prometheus.farmacia.uniba.it/tisbe.
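The metrics quoted in this abstract (Matthews correlation coefficient, sensitivity, specificity) and the idea of a consensus over five models can be made concrete with a short sketch. The abstract does not state TISBE's consensus rule, so the simple majority vote below is an assumption, and the confusion-matrix counts are hypothetical rather than TISBE results.

```python
import math


def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: share of toxic compounds flagged as toxic."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True negative rate: share of safe compounds predicted as safe."""
    return tn / (tn + fp)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def majority_vote(votes) -> int:
    """Assumed consensus rule: label 1 if more than half of the models vote 1."""
    return 1 if 2 * sum(votes) > len(votes) else 0


# Hypothetical counts: almost every toxic compound is caught, many safe ones are not.
tp, fn, tn, fp = 90, 10, 30, 70
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
score = mcc(tp, tn, fp, fn)
consensus = majority_vote([1, 1, 0, 1, 0])  # five hypothetical model outputs
print(sens, spec, score, consensus)
```

With these toy counts, high sensitivity combined with low specificity yields only a modest MCC, which shows why the three figures are best read together.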
Affiliation(s)
- Fabrizio Mastrolorito
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Maria Vittoria Togo
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Nicola Gambacorta
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Daniela Trisciuzzi
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Viviana Giannuzzi
  - Fondazione per la Ricerca Farmacologica Gianni Benzi Onlus, 70010 Valenzano (BA), Italy
- Fedele Bonifazi
  - Fondazione per la Ricerca Farmacologica Gianni Benzi Onlus, 70010 Valenzano (BA), Italy
- Antonella Liantonio
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Paola Imbrici
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Annamaria De Luca
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Cosimo Damiano Altomare
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Fulvio Ciriaco
  - Dipartimento di Chimica, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Nicola Amoroso
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy
- Orazio Nicolotti
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
|
23
|
Hassan J, Saeed SM, Deka L, Uddin MJ, Das DB. Applications of Machine Learning (ML) and Mathematical Modeling (MM) in Healthcare with Special Focus on Cancer Prognosis and Anticancer Therapy: Current Status and Challenges. Pharmaceutics 2024; 16:260. [PMID: 38399314 PMCID: PMC10892549 DOI: 10.3390/pharmaceutics16020260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 01/29/2024] [Accepted: 02/07/2024] [Indexed: 02/25/2024] Open
Abstract
The use of data-driven high-throughput analytical techniques, which has given rise to computational oncology, is undisputed. The widespread use of machine learning (ML) and mathematical modeling (MM)-based techniques is widely acknowledged. These two approaches have fueled the advancement in cancer research and eventually led to the uptake of telemedicine in cancer care. For diagnostic, prognostic, and treatment purposes concerning different types of cancer research, vast databases of varied information with manifold dimensions are required, and indeed, all this information can only be managed by an automated system developed utilizing ML and MM. In addition, MM is being used to probe the relationship between the pharmacokinetics and pharmacodynamics (PK/PD interactions) of anti-cancer substances to improve cancer treatment, and also to refine the quality of existing treatment models by being incorporated at all steps of research and development related to cancer and in routine patient care. This review will serve as a consolidation of the advancement and benefits of ML and MM techniques with a special focus on the area of cancer prognosis and anticancer therapy, leading to the identification of challenges (data quantity, ethical consideration, and data privacy) which are yet to be fully addressed in current studies.
Affiliation(s)
- Jasmin Hassan
  - Drug Delivery & Therapeutics Lab, Dhaka 1212, Bangladesh; (J.H.); (S.M.S.)
- Lipika Deka
  - Faculty of Computing, Engineering and Media, De Montfort University, Leicester LE1 9BH, UK
- Md Jasim Uddin
  - Department of Pharmaceutical Technology, Faculty of Pharmacy, Universiti Malaya, Kuala Lumpur 50603, Malaysia
- Diganta B. Das
  - Department of Chemical Engineering, Loughborough University, Loughborough LE11 3TU, UK
|
24
|
Adebar N, Keupp J, Emenike VN, Kühlborn J, Vom Dahl L, Möckel R, Smiatek J. Scientific Deep Machine Learning Concepts for the Prediction of Concentration Profiles and Chemical Reaction Kinetics: Consideration of Reaction Conditions. J Phys Chem A 2024; 128:929-944. [PMID: 38271617 DOI: 10.1021/acs.jpca.3c06265] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/27/2024]
Abstract
Emerging concepts from scientific deep machine learning such as physics-informed neural networks (PINNs) enable a data-driven approach for the study of complex kinetic problems. We present an extended framework that combines the advantages of PINNs with the detailed consideration of experimental parameter variations for the simulation and prediction of chemical reaction kinetics. The approach is based on truncated Taylor series expansions for the underlying fundamental equations, whereby the external variations can be interpreted as perturbations of the kinetic parameters. Accordingly, our method allows for an efficient consideration of experimental parameter settings and their influence on the concentration profiles and reaction kinetics. A particular advantage of our approach, in addition to the consideration of univariate and multivariate parameter variations, is the robust model-based exploration of the parameter space to determine optimal reaction conditions in combination with advanced reaction insights. The benefits of this concept are demonstrated for higher-order chemical reactions including catalytic and oscillatory systems in combination with small amounts of training data. All predicted values show a high level of accuracy, demonstrating the broad applicability and flexibility of our approach.
Affiliation(s)
- Niklas Adebar
  - Development NCE, Chemical Development, Boehringer Ingelheim Pharma GmbH & Co. KG, D-55218 Ingelheim (Rhein), Germany
- Julian Keupp
  - Development NCE, Chemical Development, Boehringer Ingelheim Pharma GmbH & Co. KG, D-55218 Ingelheim (Rhein), Germany
- Victor N Emenike
  - HP BioP Launch and Innovation, Boehringer Ingelheim Pharma GmbH & Co. KG, D-55218 Ingelheim (Rhein), Germany
- Jonas Kühlborn
  - Development NCE, Chemical Development, Boehringer Ingelheim Pharma GmbH & Co. KG, D-55218 Ingelheim (Rhein), Germany
- Lisa Vom Dahl
  - Development NCE, Analytical Development, Boehringer Ingelheim Pharma GmbH & Co. KG, D-55218 Ingelheim (Rhein), Germany
- Robert Möckel
  - Development NCE, Chemical Development, Boehringer Ingelheim Pharma GmbH & Co. KG, D-55218 Ingelheim (Rhein), Germany
- Jens Smiatek
  - Institute for Computational Physics, University of Stuttgart, D-70569 Stuttgart, Germany
  - Development NCE, Strategy NCEs, Boehringer Ingelheim Pharma GmbH & Co. KG, D-88397 Biberach (Riss), Germany
|
25
|
Shafiq M, Sherwani ZA, Mushtaq M, Nur-E-Alam M, Ahmad A, Ul-Haq Z. A deep learning-based theoretical protocol to identify potentially isoform-selective PI3Kα inhibitors. Mol Divers 2024:10.1007/s11030-023-10799-0. [PMID: 38305819 DOI: 10.1007/s11030-023-10799-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Accepted: 12/22/2023] [Indexed: 02/03/2024]
Abstract
Phosphoinositide 3-kinase alpha (PI3Kα) is one of the most frequently dysregulated kinases and is known for its pivotal role in many oncogenic diseases. While the side effects linked to existing drugs against PI3Kα-induced cancers provide an avenue for further research, the significant structural conservation among PI3Ks makes it extremely difficult to develop new isoform-selective PI3Kα inhibitors. Embracing this challenge, we herein designed a hybrid protocol by integrating machine learning (ML) with in silico drug-designing strategies. A deep learning classification model was developed and trained on the physicochemical descriptor data of known PI3Kα inhibitors and used as a screening filter for a database of small molecules. This approach led to the prediction of 662 compounds with features appropriate for PI3Kα inhibitors. Subsequently, multiphase molecular docking was applied to further characterize the predicted hits in terms of their binding affinities and binding modes in the targeted cavity of PI3Kα. As a result, a total of 12 compounds were identified, and their best poses highlighted the efficiency of these ligands in maintaining interactions with the crucial residues of the protein that are targeted to inhibit its activity. Notably, potential activity of compound 12 in counteracting PI3Kα function was found in a previous in vitro study. Following the drug-likeness and pharmacokinetic characterizations, six compounds (compounds 1, 2, 3, 6, 7, and 11) with suitable ADME-T profiles and promising bioavailability were selected. The mechanistic studies in dynamic mode further endorsed the potential of the identified hits, particularly compounds 1, 2, and 11, in blocking the ATP-binding site of the receptor with higher binding affinities than the native inhibitor, alpelisib (BYL-719). These outcomes support the reliability of the developed classification model and the devised computational strategy for identifying new isoform-selective drug candidates for PI3Kα inhibition.
Affiliation(s)
- Muhammad Shafiq
  - H.E.J. Research Institute of Chemistry, International Center for Chemical and Biological Sciences, University of Karachi, Karachi, 75270, Pakistan
- Zaid Anis Sherwani
  - Dr. Panjwani Center for Molecular Medicine and Drug Research, International Center for Chemical and Biological Sciences, University of Karachi, Karachi, 75270, Pakistan
- Mamona Mushtaq
  - Dr. Panjwani Center for Molecular Medicine and Drug Research, International Center for Chemical and Biological Sciences, University of Karachi, Karachi, 75270, Pakistan
- Mohammad Nur-E-Alam
  - Department of Pharmacognosy, College of Pharmacy, King Saud University, P.O. Box. 2457, Riyadh, 11451, Kingdom of Saudi Arabia
- Aftab Ahmad
  - Department of Biomedical and Pharmaceutical Sciences, Chapman University School of Pharmacy, Irvine, CA, 92618, USA
- Zaheer Ul-Haq
  - Dr. Panjwani Center for Molecular Medicine and Drug Research, International Center for Chemical and Biological Sciences, University of Karachi, Karachi, 75270, Pakistan.
|
26
|
Li Y, Tao C, Fu D, Jafvert CT, Zhu T. Integrating molecular descriptors for enhanced prediction: Shedding light on the potential of pH to model hydrated electron reaction rates for organic compounds. Chemosphere 2024; 349:140984. [PMID: 38122944 DOI: 10.1016/j.chemosphere.2023.140984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2023] [Revised: 12/13/2023] [Accepted: 12/14/2023] [Indexed: 12/23/2023]
Abstract
The hydrated electron reaction rate constant (ke-aq) is an important parameter for determining reductive degradation efficiency and for mitigating the ecological risk of organic compounds (OCs). However, OC species morphology and the concentration of hydrated electrons (e-aq) in water vary with pH, complicating OC fate assessment. This study introduced pH as an environmental variable to develop models of ke-aq from 701 data points using 3 descriptor types: (i) molecular descriptors (MD), (ii) quantum chemical descriptors (QCD), and (iii) the combination of both (MD + QCD). Models were screened using 2 descriptor screening methods (MLR and RF) and 14 machine learning (ML) algorithms. The introduction of QCDs that characterized the electronic structure of OCs greatly improved model performance while requiring fewer descriptors. The optimal model, MLR-XGBoost(MD + QCD), which included pH, achieved the most satisfactory prediction: R²tra = 0.988, Q²boot = 0.861, R²test = 0.875, and Q²test = 0.873. The mechanistic interpretation using the SHAP method further revealed that QCDs, polarizability, volume, and pH had a great influence on the reductive degradation of OCs by e-aq. Overall, the electrochemical parameters (QCDs, pH) related to the solvent and solute are significant and should be considered in any future ML modeling that assesses the fate of OCs in aquatic environments.
Affiliation(s)
- Yi Li
- School of Environmental Science and Engineering, Yangzhou University, Yangzhou, 225127, Jiangsu, China
| | - Cuicui Tao
- School of Environmental Science and Engineering, Yangzhou University, Yangzhou, 225127, Jiangsu, China
| | - Dafang Fu
- School of Civil Engineering, Southeast University, Nanjing, 210096, China
| | - Chad T Jafvert
- Lyles School of Civil Engineering, and Environmental & Ecological Engineering, Purdue University, West Lafayette, IN, 47907, USA
| | - Tengyi Zhu
- School of Environmental Science and Engineering, Yangzhou University, Yangzhou, 225127, Jiangsu, China.
| |
|
27
|
Oselusi SO, Dube P, Odugbemi AI, Akinyede KA, Ilori TL, Egieyeh E, Sibuyi NR, Meyer M, Madiehe AM, Wyckoff GJ, Egieyeh SA. The role and potential of computer-aided drug discovery strategies in the discovery of novel antimicrobials. Comput Biol Med 2024; 169:107927. [PMID: 38184864 DOI: 10.1016/j.compbiomed.2024.107927] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Revised: 12/25/2023] [Accepted: 01/01/2024] [Indexed: 01/09/2024]
Abstract
Antimicrobial resistance (AMR) has become more of a concern in recent decades, particularly in infections associated with global public health threats. The development of new antibiotics is crucial to ensuring infection control and eradicating AMR. Although drug discovery and development are essential processes in the transformation of a drug candidate from the laboratory to the bedside, they are often very complicated, expensive, and time-consuming. The pharmaceutical sector is continuously innovating strategies to reduce research costs and accelerate the development of new drug candidates. Computer-aided drug discovery (CADD) has emerged as a powerful and promising technology that renews the hope of researchers for the faster identification, design, and development of cheaper, less resource-intensive, and more efficient drug candidates. In this review, we discuss an overview of AMR, the potential, and limitations of CADD in AMR drug discovery, and case studies of the successful application of this technique in the rapid identification of various drug candidates. This review will aid in achieving a better understanding of available CADD techniques in the discovery of novel drug candidates against resistant pathogens and other infectious agents.
Affiliation(s)
- Samson O Oselusi
- DSI/Mintek Nanotechnology Innovation Centre (NIC), Biolabels Node, Department of Biotechnology, University of the Western Cape, Private Bag X17, Bellville, Cape Town, 7535, South Africa
| | - Phumuzile Dube
- DSI/Mintek Nanotechnology Innovation Centre (NIC), Biolabels Node, Department of Biotechnology, University of the Western Cape, Private Bag X17, Bellville, Cape Town, 7535, South Africa
| | - Adeshina I Odugbemi
- South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Cape Town, 7535, South Africa
| | - Kolajo A Akinyede
- Department of Science Technology, Biochemistry Unit, The Federal Polytechnic P.M.B.5351, Ado Ekiti, 360231, Nigeria
| | - Tosin L Ilori
- School of Pharmacy, University of the Western Cape, Bellville, Cape Town, 7535, South Africa
| | - Elizabeth Egieyeh
- School of Pharmacy, University of the Western Cape, Bellville, Cape Town, 7535, South Africa
| | - Nicole Rs Sibuyi
- DSI/Mintek Nanotechnology Innovation Centre (NIC), Biolabels Node, Department of Biotechnology, University of the Western Cape, Private Bag X17, Bellville, Cape Town, 7535, South Africa
| | - Mervin Meyer
- DSI/Mintek Nanotechnology Innovation Centre (NIC), Biolabels Node, Department of Biotechnology, University of the Western Cape, Private Bag X17, Bellville, Cape Town, 7535, South Africa
| | - Abram M Madiehe
- DSI/Mintek Nanotechnology Innovation Centre (NIC), Biolabels Node, Department of Biotechnology, University of the Western Cape, Private Bag X17, Bellville, Cape Town, 7535, South Africa
| | - Gerald J Wyckoff
- School of Pharmacy, Division of Pharmacology and Pharmaceutical Sciences, University of Missouri, Kansas City, MO, 64110-2446, United States
| | - Samuel A Egieyeh
- School of Pharmacy, University of the Western Cape, Bellville, Cape Town, 7535, South Africa.
| |
|
28
|
Nopour R. Design of risk prediction model for esophageal cancer based on machine learning approach. Heliyon 2024; 10:e24797. [PMID: 38312629 PMCID: PMC10835323 DOI: 10.1016/j.heliyon.2024.e24797] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Revised: 01/11/2024] [Accepted: 01/15/2024] [Indexed: 02/06/2024] Open
Abstract
Background and aim: Esophageal cancer (EC) is a highly prevalent and progressive disease. Early prediction of EC risk in the population is crucial in preventing this disease and enhancing the overall health of individuals. So far, few studies have been conducted on predicting the EC risk based on the prediction models, and most of them focused on statistical methods. The ML approach obtained efficient predictive insights into the clinical domain. Therefore, this study aims to develop a risk prediction model for EC based on risk factors and by leveraging the ML approach to stratify the high-risk EC people and obtain efficient preventive purposes at the community level. Material and methods: The current retrospective study was performed from 2018 to 2022 in Sari City based on 3256 EC and non-EC cases. The six selected algorithms, including Random Forest (RF), eXtreme Gradient Boosting (XG-Boost), Bagging, K-Nearest Neighbor (K-NN), Support Vector Machine (SVM), and Artificial Neural Networks (ANNs), were used to develop the risk prediction model for EC and achieve the preventive purposes. Results: Comparing the performance efficiency of algorithms revealed that the XG-Boost model gained the best predictability for EC risk with AU-ROC = 0.92 and AU-ROC-test = 0.889 for internal and validation states, respectively. Based on the XG-Boost, the factors, including sex, drinking hot liquids, fruit consumption, achalasia, and vegetable consumption, were considered the five top predictors of EC risk. Conclusion: This study showed that the XG-Boost could provide insight into the early prediction of the EC risk for people and clinical providers to stratify the high-risk group of EC and achieve preventive measures based on modifying the risk factors associated with EC and other clinical solutions.
Affiliation(s)
- Raoof Nopour
- Department of Health Information Management, Student Research Committee, School of Health Management and Information Sciences Branch, Iran University of Medical Sciences, Tehran, Iran
| |
|
29
|
Chang H, Zhang Z, Tian J, Bai T, Xiao Z, Wang D, Qiao R, Li C. Machine Learning-Based Virtual Screening and Identification of the Fourth-Generation EGFR Inhibitors. ACS OMEGA 2024; 9:2314-2324. [PMID: 38250375 PMCID: PMC10795152 DOI: 10.1021/acsomega.3c06225] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 11/06/2023] [Accepted: 11/15/2023] [Indexed: 01/23/2024]
Abstract
Epidermal growth factor receptor (EGFR) plays a pivotal regulatory role in treating patients with advanced non-small cell lung cancer (NSCLC). Following the emergence of the EGFR tertiary CIS C797S mutation, all types of inhibitors lose their inhibitory activity, necessitating the urgent development of new inhibitors. Computer systems employ machine learning methods to process substantial volumes of data and construct models that enable more accurate predictions of the outcomes of new inputs. The purpose of this article is to uncover innovative fourth-generation epidermal growth factor receptor tyrosine kinase inhibitors (EGFR-TKIs) with the aid of machine learning techniques. The paper's data set was high-dimensional and sparse, encompassing both structured and unstructured descriptors. To address this considerable challenge, we introduced a fusion framework to select critical molecule descriptors by integrating the full quadratic effect model and the Lasso model. Based on structural descriptors obtained from the full quadratic effect model, we conceived and synthesized a variety of small-molecule inhibitors. These inhibitors demonstrated potent inhibitory effects on the two mutated kinases L858R/T790M/C797S and Del19/T790M/C797S. Moreover, we applied our model to virtual screening, successfully identifying four hit compounds. We have evaluated the ADME characteristics of these hits and look forward to conducting activity evaluations on them in the future to discover a new generation of EGFR-TKIs.
Affiliation(s)
- Hao Chang
- State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, P. R. China
| | - Zeyu Zhang
- School of Mathematics and Statistics, Beijing Institute of Technology, Beijing 100081, P. R. China
| | - Jiaxin Tian
- State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, P. R. China
| | - Tian Bai
- School of Mathematics and Statistics, Beijing Institute of Technology, Beijing 100081, P. R. China
| | - Zijie Xiao
- State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, P. R. China
| | - Dianpeng Wang
- School of Mathematics and Statistics, Beijing Institute of Technology, Beijing 100081, P. R. China
| | - Renzhong Qiao
- State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, P. R. China
| | - Chao Li
- State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, P. R. China
| |
|
30
|
Fan ZX, Chao SD. A Machine Learning Force Field for Bio-Macromolecular Modeling Based on Quantum Chemistry-Calculated Interaction Energy Datasets. Bioengineering (Basel) 2024; 11:51. [PMID: 38247928 PMCID: PMC11154266 DOI: 10.3390/bioengineering11010051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Revised: 12/23/2023] [Accepted: 12/25/2023] [Indexed: 01/23/2024] Open
Abstract
Accurate energy data from noncovalent interactions are essential for constructing force fields for molecular dynamics simulations of bio-macromolecular systems. There are two important practical issues in the construction of a reliable force field with the hope of balancing the desired chemical accuracy and working efficiency. One is to determine a suitable quantum chemistry level of theory for calculating interaction energies. The other is to use a suitable continuous energy function to model the quantum chemical energy data. For the first issue, we have recently calculated the intermolecular interaction energies using the SAPT0 level of theory, and we have systematically organized these energies into the ab initio SOFG-31 (homodimer) and SOFG-31-heterodimer datasets. In this work, we re-calculate these interaction energies by using the more advanced SAPT2 level of theory with a wider series of basis sets. Our purpose is to determine the SAPT level of theory proper for interaction energies with respect to the CCSD(T)/CBS benchmark chemical accuracy. Next, to utilize these energy datasets, we employ one of the well-developed machine learning techniques, called the CLIFF scheme, to construct a general-purpose force field for biomolecular dynamics simulations. Here we use the SOFG-31 dataset and the SOFG-31-heterodimer dataset as the training and test sets, respectively. Our results demonstrate that using the CLIFF scheme can reproduce a diverse range of dimeric interaction energy patterns with only a small training set. The overall errors for each SAPT energy component, as well as the SAPT total energy, are all well below the desired chemical accuracy of ~1 kcal/mol.
Affiliation(s)
- Zhen-Xuan Fan
- Institute of Applied Mechanics, National Taiwan University, Taipei 106, Taiwan
| | - Sheng D. Chao
- Institute of Applied Mechanics, National Taiwan University, Taipei 106, Taiwan
- Center for Quantum Science and Engineering, National Taiwan University, Taipei 106, Taiwan
| |
|
31
|
Siramshetty VB, Xu X, Shah P. Artificial Intelligence in ADME Property Prediction. Methods Mol Biol 2024; 2714:307-327. [PMID: 37676606 DOI: 10.1007/978-1-0716-3441-7_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/08/2023]
Abstract
Absorption, distribution, metabolism, and excretion (ADME) are key properties of a small molecule that govern pharmacokinetic profiles and impact its efficacy and safety. Computational methods such as machine learning and artificial intelligence have gained significant interest in both academic and industrial settings to predict pharmacokinetic properties of small molecules. These methods are applied in drug discovery to optimize chemical libraries, prioritize hits from biological screens, and optimize ADME properties of lead molecules. In recent years, the drug discovery community has witnessed the use of a range of neural network architectures such as deep neural networks, recurrent neural networks, graph neural networks, and transformer neural networks, which marked a paradigm shift in computer-aided drug design and development. This chapter discusses recent developments with an emphasis on their application to predict ADME properties.
Affiliation(s)
- Vishal B Siramshetty
- National Center for Advancing Translational Sciences, Rockville, MD, USA
- Department of Safety Assessment, Genentech, Inc., South San Francisco, CA, USA
| | - Xin Xu
- National Center for Advancing Translational Sciences, Rockville, MD, USA
| | - Pranav Shah
- National Center for Advancing Translational Sciences, Rockville, MD, USA.
| |
|
32
|
Talevi A. Computer-Aided Drug Discovery and Design: Recent Advances and Future Prospects. Methods Mol Biol 2024; 2714:1-20. [PMID: 37676590 DOI: 10.1007/978-1-0716-3441-7_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/08/2023]
Abstract
Computer-aided drug discovery and design involve the use of information technologies to identify and develop, on a rational ground, chemical compounds that align with a set of desired physicochemical and biological properties. In its most common form, it involves the identification and/or modification of an active scaffold (or the combination of known active scaffolds), although de novo drug design from scratch is also possible. Traditionally, the drug discovery and design processes have focused on the molecular determinants of the interactions between drug candidates and their known or intended pharmacological target(s). Nevertheless, in modern times, drug discovery and design are conceived as a particularly complex multiparameter optimization task, due to the complicated, often conflicting, property requirements. This chapter provides an updated overview of in silico approaches for identifying active scaffolds and guiding the subsequent optimization process. Recent groundbreaking advances in the field have also analyzed the integration of state-of-the-art machine learning approaches in every step of the drug discovery process (from prediction of target structure to customized molecular docking scoring functions), integration of multilevel omics data, and the use of a diversity of computational approaches to assist target validation and assess plausible binding pockets.
Affiliation(s)
- Alan Talevi
- Laboratory of Bioactive Compound Research and Development (LIDeB), Faculty of Exact Sciences, National University of La Plata (UNLP), La Plata, Argentina.
- Argentinean National Council of Scientific and Technical Research (CONICET), La Plata, Argentina.
| |
|
33
|
B S N, P K KN, Akey KS, Sankaran S, Raman RK, Natarajan J, Selvaraj J. Vitamin D analog calcitriol for breast cancer therapy; an integrated drug discovery approach. J Biomol Struct Dyn 2023; 41:11017-11043. [PMID: 37054526 DOI: 10.1080/07391102.2023.2199866] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2022] [Accepted: 12/11/2022] [Indexed: 04/15/2023]
Abstract
As breast cancer remains a leading cause of cancer death globally, it is essential to develop an affordable breast cancer therapy for underdeveloped countries. Drug repurposing offers the potential to address gaps in breast cancer treatment. Molecular networking studies were performed for a drug repurposing approach using heterogeneous data. The PPI networks were built to select the target genes from the EGFR overexpression signaling pathway and its associated family members. The selected genes EGFR, ErbB2, ErbB4 and ErbB3 were allowed to interact with 2637 drugs, leading to the construction of PDI networks of 78, 61, 15 and 19 drugs, respectively. As drugs approved for treating non-cancer-related diseases or disorders are clinically safe, effective, and affordable, these drugs were given considerable attention. Calcitriol showed more significant binding affinities with all four receptors than the standard neratinib. The RMSD, RMSF, and H-bond analysis of protein-ligand complexes from molecular dynamics simulation (100 ns) confirmed the stable binding of calcitriol with ErbB2 and EGFR receptors. In addition, MMGBSA and MMPBSA also affirmed the docking results. These in-silico results were validated with in-vitro cytotoxicity studies in SK-BR-3 and Vero cells. The IC50 value of calcitriol (43.07 mg/ml) was found to be lower than that of neratinib (61.50 mg/ml) in SK-BR-3 cells. In Vero cells the IC50 value of calcitriol (431.05 mg/ml) was higher than that of neratinib (404.95 mg/ml). This demonstrates that calcitriol suggestively downregulated SK-BR-3 cell viability in a dose-dependent manner. These findings indicate that calcitriol showed better cytotoxicity and decreased the proliferation rate of breast cancer cells more than neratinib. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Nagaraj B S
- Department of Pharmaceutical Chemistry, JSS College of Pharmacy, JSS Academy of Higher Education and Research, Ooty, Tamilnadu, India
| | - Krishnan Namboori P K
- Amrita Molecular Modeling and Synthesis (AMMAS) Research lab, Amrita Vishwavidyapeetham, Coimbatore, Tamilnadu, India
| | - Krishna Swaroop Akey
- Department of Pharmaceutical Chemistry, JSS College of Pharmacy, JSS Academy of Higher Education and Research, Ooty, Tamilnadu, India
| | - Sathianarayanan Sankaran
- Department of Pharmaceutical Chemistry, Faculty of Pharmacy, Karpagam Academy of Higher Education, Coimbatore, Tamilnadu, India
| | - Rajesh Kumar Raman
- Department of Pharmaceutical Biotechnology, JSS College of Pharmacy, JSS Academy of Higher Education and Research, Ooty, Tamilnadu, India
| | - Jawahar Natarajan
- Department of Pharmaceutics, JSS College of Pharmacy, JSS Academy of Higher Education and Research, Ooty, Tamilnadu, India
| | - Jubie Selvaraj
- Department of Pharmaceutical Chemistry, JSS College of Pharmacy, JSS Academy of Higher Education and Research, Ooty, Tamilnadu, India
| |
|
34
|
Tiwari PC, Pal R, Chaudhary MJ, Nath R. Artificial intelligence revolutionizing drug development: Exploring opportunities and challenges. Drug Dev Res 2023; 84:1652-1663. [PMID: 37712494 DOI: 10.1002/ddr.22115] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/16/2023] [Revised: 08/14/2023] [Accepted: 09/04/2023] [Indexed: 09/16/2023]
Abstract
By harnessing artificial intelligence (AI) algorithms and machine learning techniques, the entire drug discovery process stands to undergo a profound transformation, offering a myriad of advantages. Foremost among these is the ability of AI to conduct swift and efficient screenings of expansive compound libraries, significantly augmenting the identification of potential drug candidates. Moreover, AI algorithms can prove instrumental in predicting the efficacy and safety profiles of candidate compounds, thus endowing invaluable insights and reducing reliance on extensive preclinical and clinical testing. This predictive capacity of AI has the potential to streamline the drug development pipeline and enhance the success rate of clinical trials, ultimately resulting in the emergence of more efficacious and safer therapeutic agents. However, the deployment of AI in drug discovery introduces certain challenges that warrant attention. A primary hurdle entails the imperative acquisition of high-quality and diverse data. Furthermore, ensuring the interpretability of AI models assumes critical importance in securing regulatory endorsement and cultivating trust within scientific and medical communities. Addressing ethical considerations, including data privacy and mitigating bias, represents an additional momentous challenge, requiring assiduous navigation. In this review, we provide an intricate and comprehensive overview of the multifaceted challenges intrinsic to conventional drug development paradigms, while simultaneously interrogating the efficacy of AI in effectively surmounting these formidable obstacles.
Affiliation(s)
- Prafulla C Tiwari
- Department of Pharmacology and Therapeutics, King George's Medical University, Lucknow, Uttar Pradesh, India
| | - Rishi Pal
- Department of Pharmacology and Therapeutics, King George's Medical University, Lucknow, Uttar Pradesh, India
| | - Manju J Chaudhary
- Department of Physiology, Government Medical College, Kannauj, Uttar Pradesh, India
| | - Rajendra Nath
- Department of Pharmacology and Therapeutics, King George's Medical University, Lucknow, Uttar Pradesh, India
| |
|
35
|
Jaradat NJ, Hatmal M, Alqudah D, Taha MO. Computational workflow for discovering small molecular binders for shallow binding sites by integrating molecular dynamics simulation, pharmacophore modeling, and machine learning: STAT3 as case study. J Comput Aided Mol Des 2023; 37:659-678. [PMID: 37597062 DOI: 10.1007/s10822-023-00528-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2023] [Accepted: 07/26/2023] [Indexed: 08/21/2023]
Abstract
STAT3 belongs to a family of seven transcription factors. It plays an important role in activating the transcription of various genes involved in a variety of cellular processes. High levels of STAT3 are detected in several types of cancer. Hence, STAT3 inhibition is considered a promising therapeutic anti-cancer strategy. However, since STAT3 inhibitors bind to the shallow SH2 domain of the protein, hydration water molecules are expected to play a significant role in ligand binding, complicating the discovery of potent binders. To remedy this issue, we herein propose to extract pharmacophores from molecular dynamics (MD) frames of a potent co-crystallized ligand complexed within the STAT3 SH2 domain. Subsequently, we employ a genetic function algorithm coupled with machine learning (GFA-ML) to explore the optimal combination of MD-derived pharmacophores that can account for the variations in bioactivity among a list of inhibitors. To enhance the dataset, the training and testing lists were augmented nearly 100-fold by considering multiple conformers of the ligands. A single significant pharmacophore emerged after 188 ns of MD simulation to represent STAT3-ligand binding. Screening the National Cancer Institute (NCI) database with this model identified one low-micromolar inhibitor that most likely binds to the SH2 domain of STAT3 and inhibits this pathway.
Affiliation(s)
- Nour Jamal Jaradat
- Department of Medical Laboratory Sciences, Faculty of Applied Health Sciences, The Hashemite University, P.O. Box 330127, Zarqa, 13133, Jordan
| | - Mamon Hatmal
- Department of Medical Laboratory Sciences, Faculty of Applied Health Sciences, The Hashemite University, P.O. Box 330127, Zarqa, 13133, Jordan
| | - Dana Alqudah
- Cell Therapy Center, the University of Jordan, Amman, 11942, Jordan
| | - Mutasem Omar Taha
- Department of Pharmaceutical Sciences, Faculty of Pharmacy, University of Jordan, Amman, Jordan.
| |
|
36
|
Mastropietro A, Feldmann C, Bajorath J. Calculation of exact Shapley values for explaining support vector machine models using the radial basis function kernel. Sci Rep 2023; 13:19561. [PMID: 37949930 PMCID: PMC10638308 DOI: 10.1038/s41598-023-46930-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/01/2023] [Accepted: 11/07/2023] [Indexed: 11/12/2023] Open
Abstract
Machine learning (ML) algorithms are extensively used in pharmaceutical research. Most ML models have black-box character, thus preventing the interpretation of predictions. However, rationalizing model decisions is of critical importance if predictions should aid in experimental design. Accordingly, in interdisciplinary research, there is growing interest in explaining ML models. Methods devised for this purpose are a part of the explainable artificial intelligence (XAI) spectrum of approaches. In XAI, the Shapley value concept originating from cooperative game theory has become popular for identifying features determining predictions. The Shapley value concept has been adapted as a model-agnostic approach for explaining predictions. Since the computational time required for Shapley value calculations scales exponentially with the number of features used, local approximations such as Shapley additive explanations (SHAP) are usually required in ML. The support vector machine (SVM) algorithm is one of the most popular ML methods in pharmaceutical research and beyond. SVM models are often explained using SHAP. However, there is only limited correlation between SHAP and exact Shapley values, as previously demonstrated for SVM calculations using the Tanimoto kernel, which limits SVM model explanation. Since the Tanimoto kernel is a special kernel function mostly applied for assessing chemical similarity, we have developed the Shapley value-expressed radial basis function (SVERAD), a computationally efficient approach for the calculation of exact Shapley values for SVM models based upon radial basis function kernels that are widely applied in different areas. SVERAD is shown to produce meaningful explanations of SVM predictions.
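The exponential scaling mentioned in this abstract can be made concrete with a brute-force calculation. The following is a minimal sketch, not the SVERAD method itself: it computes exact Shapley values by enumerating every coalition of features, which requires on the order of 2^n evaluations of the value function. The names `exact_shapley` and `toy_value`, and the toy game they operate on, are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def exact_shapley(value, n_features):
    """Exact Shapley values for a set function `value` over features 0..n-1.

    `value` takes a frozenset of feature indices and returns a number.
    Every subset of the remaining features is visited for each feature,
    so the cost grows exponentially with n_features.
    """
    players = range(n_features)
    n_fact = factorial(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Shapley weight for coalitions of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n_features - size - 1) / n_fact
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_value(coalition):
    """Hypothetical game: additive feature effects plus one pairwise interaction."""
    base = {0: 1.0, 1: 2.0, 2: 3.0}
    total = sum(base[j] for j in coalition)
    if 0 in coalition and 1 in coalition:
        total += 1.0  # interaction between features 0 and 1
    return total


phi = exact_shapley(toy_value, 3)
print(phi)
```

In this toy game the interaction term is shared equally between features 0 and 1, and the values sum to the difference between the full coalition and the empty one. With three features the enumeration is trivial, but each added feature doubles the number of coalitions, which is why approximations or kernel-specific closed forms are needed in practice.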
Affiliation(s)
- Andrea Mastropietro
- Department of Computer, Control and Management Engineering "Antonio Ruberti", Sapienza University of Rome, 00185, Rome, Italy
| | - Christian Feldmann
- Department of Life Science Informatics and Data Science, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, 53115, Bonn, Germany
| | - Jürgen Bajorath
- Department of Life Science Informatics and Data Science, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, 53115, Bonn, Germany.
| |
|
37
|
Moreira-Filho JT, Neves BJ, Cajas RA, Moraes JD, Andrade CH. Artificial intelligence-guided approach for efficient virtual screening of hits against Schistosoma mansoni. Future Med Chem 2023; 15:2033-2050. [PMID: 37937522 DOI: 10.4155/fmc-2023-0152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Accepted: 10/06/2023] [Indexed: 11/09/2023] Open
Abstract
Background: The impact of schistosomiasis, which affects over 230 million people, emphasizes the urgency of developing new antischistosomal drugs. Artificial intelligence is vital in accelerating the drug discovery process. Methodology & results: We developed classification and regression machine learning models to predict the schistosomicidal activity of compounds not experimentally tested. The prioritized compounds were tested on schistosomula and adult stages of Schistosoma mansoni. Four compounds demonstrated significant activity against schistosomula, with 50% effective concentration values ranging from 9.8 to 32.5 μM, while exhibiting no toxicity in animal and human cell lines. Conclusion: These findings represent a significant step forward in the discovery of antischistosomal drugs. Further optimization of these active compounds can pave the way for their progression into preclinical studies.
Affiliation(s)
- José Teófilo Moreira-Filho
- Laboratory of Molecular Modeling and Drug Design (LabMol), Faculdade de Farmácia, Universidade Federal de Goiás, Goiânia, 74605-170, Brazil
| | - Bruno Junior Neves
- Laboratory of Molecular Modeling and Drug Design (LabMol), Faculdade de Farmácia, Universidade Federal de Goiás, Goiânia, 74605-170, Brazil
| | - Rayssa Araujo Cajas
- Research Center on Neglected Diseases (NPDN), Universidade Guarulhos, Guarulhos, 07023-070, Brazil
| | - Josué de Moraes
- Research Center on Neglected Diseases (NPDN), Universidade Guarulhos, Guarulhos, 07023-070, Brazil
| | - Carolina Horta Andrade
- Laboratory of Molecular Modeling and Drug Design (LabMol), Faculdade de Farmácia, Universidade Federal de Goiás, Goiânia, 74605-170, Brazil
- Center for the Research and Advancement in Fragments and molecular Targets (CRAFT), School of Pharmaceutical Sciences at Ribeirao Preto, University of São Paulo, Ribeirão Preto, SP, Brazil
| |
|
38
|
Pratap Reddy Gajulapalli V. Development of Kinase-Centric Drugs: A Computational Perspective. ChemMedChem 2023; 18:e202200693. [PMID: 37442809 DOI: 10.1002/cmdc.202200693] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/20/2022] [Revised: 07/12/2023] [Accepted: 07/12/2023] [Indexed: 07/15/2023]
Abstract
Kinases are prominent drug targets in the pharmaceutical and research community due to their involvement in signal transduction, physiological responses, and upon dysregulation, in diseases such as cancer, neurological and autoimmune disorders. Several FDA-approved small-molecule drugs have been developed to combat human diseases since Gleevec was approved for the treatment of chronic myelogenous leukemia. Kinases were considered "undruggable" in the beginning. Several FDA-approved small-molecule drugs have become available in recent years. Most of these drugs target ATP-binding sites, but a few target allosteric sites. Among kinases that belong to the same family, the catalytic domain shows high structural and sequence conservation. Inhibitors of ATP-binding sites can cause off-target binding. Because members of the same family have similar sequences and structural patterns, often complex relationships between kinases and inhibitors are observed. To design and develop drugs with desired selectivity, it is essential to understand the target selectivity for kinase inhibitors. To create new inhibitors with the desired selectivity, several experimental methods have been designed to profile the kinase selectivity of small molecules. Experimental approaches are often expensive, laborious, time-consuming, and limited by the available kinases. Researchers have used computational methodologies to address these limitations in the design and development of effective therapeutics. Many computational methods have been developed over the last few decades, either to complement experimental findings or to forecast kinase inhibitor activity and selectivity. The purpose of this review is to provide insight into recent advances in theoretical/computational approaches for the design of new kinase inhibitors with the desired selectivity and optimization of existing inhibitors.
|
39
|
Gambacorta N, Ciriaco F, Amoroso N, Altomare CD, Bajorath J, Nicolotti O. CIRCE: Web-Based Platform for the Prediction of Cannabinoid Receptor Ligands Using Explainable Machine Learning. J Chem Inf Model 2023; 63:5916-5926. [PMID: 37675493 DOI: 10.1021/acs.jcim.3c00914] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 09/08/2023]
Abstract
The endocannabinoid system, which includes cannabinoid receptor 1 and 2 subtypes (CB1R and CB2R, respectively), is responsible for the onset of various pathologies including neurodegeneration, cancer, neuropathic and inflammatory pain, obesity, and inflammatory bowel disease. Given the high similarity of CB1R and CB2R, generating subtype-selective ligands is still an open challenge. In this work, the Cannabinoid Iterative Revaluation for Classification and Explanation (CIRCE) compound prediction platform has been generated based on explainable machine learning to support the design of selective CB1R and CB2R ligands. Multilayer classifiers were combined with Shapley value analysis to facilitate explainable predictions. In test calculations, CIRCE predictions reached ∼80% accuracy and structural features determining ligand predictions were rationalized. CIRCE was designed as a web-based prediction platform that is made freely available as a part of our study.
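To make the idea of Shapley value analysis concrete, the following is a minimal sketch of exact Shapley values for a toy scoring function. It is not CIRCE's implementation: the feature names (`amide`, `indole`, `halogen`) and the scoring function `toy_score` are hypothetical, and exact enumeration over all orderings is only practical for a handful of features.

```python
from itertools import permutations


def shapley_values(features, value):
    """Exact Shapley values: average each feature's marginal contribution
    over every ordering in which features can be added to the coalition."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        coalition = frozenset()
        for f in order:
            with_f = coalition | {f}
            totals[f] += value(with_f) - value(coalition)
            coalition = with_f
    return {f: total / len(orderings) for f, total in totals.items()}


def toy_score(present):
    """Hypothetical classifier score as a function of the structural
    features present in a ligand (illustrative numbers only)."""
    score = 0.0
    if "amide" in present:
        score += 0.3
    if "indole" in present:
        score += 0.2
    if "amide" in present and "indole" in present:
        score += 0.1  # interaction term, split equally by Shapley
    return score


phi = shapley_values(["amide", "indole", "halogen"], toy_score)
for name, contribution in phi.items():
    print(f"{name}: {contribution:.3f}")
```

The contributions sum to the difference between the score with all features and the score with none, which is the property that makes Shapley values useful for attributing a prediction to individual structural features.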
Affiliation(s)
- Nicola Gambacorta
  - Dipartimento di Farmacia Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70125 Bari, Italy
  - Department of Life Science Informatics and Data Science, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, D-53115 Bonn, Germany
- Fulvio Ciriaco
  - Dipartimento di Chimica, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70125 Bari, Italy
- Nicola Amoroso
  - Dipartimento di Farmacia Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70125 Bari, Italy
- Cosimo Damiano Altomare
  - Dipartimento di Farmacia Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70125 Bari, Italy
- Jürgen Bajorath
  - Department of Life Science Informatics and Data Science, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, D-53115 Bonn, Germany
- Orazio Nicolotti
  - Dipartimento di Farmacia Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70125 Bari, Italy
|
40
|
Lee M, Min K. AmorProt: Amino Acid Molecular Fingerprints Repurposing-Based Protein Fingerprint. Biochemistry 2023; 62:2700-2709. [PMID: 37622182 DOI: 10.1021/acs.biochem.3c00253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/26/2023]
Abstract
As protein therapeutics play an important role in almost all medical fields, numerous studies have been conducted on proteins using artificial intelligence. Artificial intelligence has enabled data-driven predictions without the need for expensive experiments. Nevertheless, unlike the various molecular fingerprint algorithms that have been developed, protein fingerprint algorithms have rarely been studied. In this study, we proposed the amino acid molecular fingerprints repurposing-based protein (AmorProt) fingerprint, a protein sequence representation method that effectively uses the molecular fingerprints corresponding to 20 amino acids. Subsequently, the performances of the tree-based machine learning and artificial neural network models were compared using (1) amyloid classification and (2) isoelectric point regression. Finally, the applicability and advantages of the developed platform were demonstrated through a case study and the following experiments: (3) comparison of dataset dependence with feature-based methods, (4) feature importance analysis, and (5) protein space analysis. Consequently, the significantly improved model performance and data-set-independent versatility of the AmorProt fingerprint were verified. The results revealed that the current protein representation method can be applied to various fields related to proteins, such as predicting their fundamental properties or interaction with ligands.
Affiliation(s)
- Myeonghun Lee
  - School of Systems Biomedical Science, Soongsil University, 369 Sangdo-ro, Dongjak-gu, Seoul 06978, Republic of Korea
- Kyoungmin Min
  - School of Mechanical Engineering, Soongsil University, 369 Sangdo-ro, Dongjak-gu, Seoul 06978, Republic of Korea
|
41
|
Yang M, Yang B, Duan G, Wang J. ITRPCA: a new model for computational drug repositioning based on improved tensor robust principal component analysis. Front Genet 2023; 14:1271311. [PMID: 37795241 PMCID: PMC10545866 DOI: 10.3389/fgene.2023.1271311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2023] [Accepted: 08/23/2023] [Indexed: 10/06/2023] Open
Abstract
Background: Drug repositioning is considered a promising drug development strategy with the goal of discovering new uses for existing drugs. Compared with experimental screening for drug discovery, computational drug repositioning offers lower cost and higher efficiency and has therefore become a hot topic in bioinformatics. However, sparse samples, multi-source information, and noise make it difficult to accurately identify potential drug-associated indications. Methods: In this article, we propose a new scheme with improved tensor robust principal component analysis (ITRPCA) in multi-source data to predict promising drug-disease associations. First, we use a weighted k-nearest neighbor (WKNN) approach to increase the overall density of the drug-disease association matrix, which assists in prediction. Second, a drug tensor with five frontal slices and a disease tensor with two frontal slices are constructed using multi-similarity matrices and an updated association matrix. The two target tensors naturally integrate multiple sources of data from the drug-side aspect and the disease-side aspect, respectively. Third, ITRPCA is employed to isolate the low-rank tensor and noise information in the tensor. In this step, an additional range constraint is incorporated to ensure that all the predicted entry values of a low-rank tensor are within the specific interval. Finally, we focus on identifying promising drug indications by analyzing drug-disease association pairs derived from the low-rank drug and low-rank disease tensors. Results: We evaluate the effectiveness of the ITRPCA method by comparing it with five prominent existing drug repositioning methods. This evaluation is carried out using 10-fold cross-validation and independent testing experiments. Our numerical results show that ITRPCA not only yields higher prediction accuracy but also exhibits remarkable computational efficiency. Furthermore, case studies demonstrate the practical effectiveness of our method.
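The first step described above, densifying a sparse association matrix with weighted nearest neighbors, can be illustrated with a small sketch. This is one plausible drug-side variant and not necessarily the authors' exact WKNN procedure: the function name `wknn_fill`, the decay factor, and the toy matrices are all assumptions made for illustration.

```python
def wknn_fill(assoc, sim, k=2, decay=0.5):
    """Densify a drug-disease association matrix (rows = drugs).

    For each drug, estimate its association profile from its k most similar
    drugs, weighting the r-th neighbour by decay**r * similarity, and keep
    the larger of the known value and the estimate.
    """
    n = len(assoc)
    filled = []
    for i in range(n):
        # k most similar other drugs, most similar first
        neighbors = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: sim[i][j],
            reverse=True,
        )[:k]
        weights = [decay ** rank * sim[i][j] for rank, j in enumerate(neighbors)]
        norm = sum(sim[i][j] for j in neighbors)
        row = []
        for col in range(len(assoc[i])):
            estimate = 0.0
            if norm > 0:
                estimate = sum(w * assoc[j][col] for w, j in zip(weights, neighbors)) / norm
            row.append(max(assoc[i][col], estimate))
        filled.append(row)
    return filled


# Hypothetical toy data: 3 drugs x 3 diseases; drug 2 has no known associations.
assoc = [[1, 0, 0], [1, 1, 0], [0, 0, 0]]
sim = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.3], [0.1, 0.3, 1.0]]
filled = wknn_fill(assoc, sim)
for row in filled:
    print([round(v, 3) for v in row])
```

Known associations are never lowered, and every estimate stays within [0, 1] because the weights are bounded by the similarities used for normalization.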
Affiliation(s)
- Mengyun Yang
  - School of Mechanical and Energy Engineering, Shaoyang University, Shaoyang, China
  - School of Computer Science, Hunan First Normal University, Changsha, China
- Bin Yang
  - School of Mechanical and Energy Engineering, Shaoyang University, Shaoyang, China
- Guihua Duan
  - School of Computer Science and Engineering, Central South University, Changsha, China
- Jianxin Wang
  - School of Computer Science and Engineering, Central South University, Changsha, China
|
42
|
Alnammi M, Liu S, Ericksen SS, Ananiev GE, Voter AF, Guo S, Keck JL, Hoffmann FM, Wildman SA, Gitter A. Evaluating Scalable Supervised Learning for Synthesize-on-Demand Chemical Libraries. J Chem Inf Model 2023; 63:5513-5528. [PMID: 37625010 PMCID: PMC10538940 DOI: 10.1021/acs.jcim.3c00912] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Indexed: 08/27/2023]
Abstract
Traditional small-molecule drug discovery is a time-consuming and costly endeavor. High-throughput chemical screening can only assess a tiny fraction of drug-like chemical space. The strong predictive power of modern machine-learning methods for virtual chemical screening enables training models on known active and inactive compounds and extrapolating to much larger chemical libraries. However, there has been limited experimental validation of these methods in practical applications on large commercially available or synthesize-on-demand chemical libraries. Through a prospective evaluation with the bacterial protein-protein interaction PriA-SSB, we demonstrate that ligand-based virtual screening can identify many active compounds in large commercial libraries. We use cross-validation to compare different types of supervised learning models and select a random forest (RF) classifier as the best model for this target. When predicting the activity of more than 8 million compounds from Aldrich Market Select, the RF substantially outperforms a naïve baseline based on chemical structure similarity. 48% of the RF's 701 selected compounds are active. The RF model easily scales to score one billion compounds from the synthesize-on-demand Enamine REAL database. We tested 68 chemically diverse top predictions from Enamine REAL and observed 31 hits (46%), including one with an IC50 value of 1.3 μM.
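As a rough illustration of what a structure-similarity baseline looks like, the sketch below ranks library compounds by their maximum Tanimoto similarity to known actives and recomputes the reported Enamine REAL hit rate. The abstract does not specify the similarity measure or fingerprints used, so the Tanimoto coefficient, the toy fingerprints (sets of on-bit indices), and the compound names are all assumptions.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two fingerprints given as
    collections of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank_by_similarity(library, actives):
    """Score each library compound by its maximum similarity to any known
    active and return (name, score) pairs, best first."""
    scored = {
        name: max(tanimoto(fp, active) for active in actives)
        for name, fp in library.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Hypothetical fingerprints: sets of on-bit indices.
known_actives = [{1, 2, 3, 8}, {2, 5, 8, 9}]
library = {
    "cmpd_A": {1, 2, 3, 7},
    "cmpd_B": {4, 6, 10},
    "cmpd_C": {2, 5, 8, 9},
}
ranking = rank_by_similarity(library, known_actives)
print(ranking)

# Hit rate reported in the abstract: 31 hits among 68 tested compounds.
hit_rate_percent = round(100 * 31 / 68)
print(f"Enamine REAL hit rate: {hit_rate_percent}%")
```

A baseline of this kind needs no training, which is why it is a natural reference point for judging how much a supervised model such as the random forest adds.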
Affiliation(s)
- Moayad Alnammi
  - Department of Computer Sciences, University of Wisconsin−Madison, Madison, Wisconsin 53706, United States
  - Morgridge Institute for Research, Madison, Wisconsin 53715, United States
  - Department of Information and Computer Science, King Fahd University of Petroleum & Minerals, Dhahran 31261, Saudi Arabia
- Shengchao Liu
  - Department of Computer Sciences, University of Wisconsin−Madison, Madison, Wisconsin 53706, United States
  - Morgridge Institute for Research, Madison, Wisconsin 53715, United States
- Spencer S. Ericksen
  - Small Molecule Screening Facility, University of Wisconsin−Madison, Madison, Wisconsin 53792, United States
- Gene E. Ananiev
  - Small Molecule Screening Facility, University of Wisconsin−Madison, Madison, Wisconsin 53792, United States
- Andrew F. Voter
  - Department of Biomolecular Chemistry, University of Wisconsin−Madison, Madison, Wisconsin 53706, United States
- Song Guo
  - Small Molecule Screening Facility, University of Wisconsin−Madison, Madison, Wisconsin 53792, United States
- James L. Keck
  - Department of Biomolecular Chemistry, University of Wisconsin−Madison, Madison, Wisconsin 53706, United States
- F. Michael Hoffmann
  - Small Molecule Screening Facility, University of Wisconsin−Madison, Madison, Wisconsin 53792, United States
  - McArdle Laboratory for Cancer Research, University of Wisconsin−Madison, Madison, Wisconsin 53705, United States
- Scott A. Wildman
  - Small Molecule Screening Facility, University of Wisconsin−Madison, Madison, Wisconsin 53792, United States
- Anthony Gitter
  - Department of Computer Sciences, University of Wisconsin−Madison, Madison, Wisconsin 53706, United States
  - Morgridge Institute for Research, Madison, Wisconsin 53715, United States
  - Department of Biostatistics and Medical Informatics, University of Wisconsin−Madison, Madison, Wisconsin 53792, United States
|
43
|
Zhu Y, Zhao L, Wen N, Wang J, Wang C. DataDTA: a multi-feature and dual-interaction aggregation framework for drug-target binding affinity prediction. Bioinformatics 2023; 39:btad560. [PMID: 37688568 PMCID: PMC10516524 DOI: 10.1093/bioinformatics/btad560] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/03/2022] [Revised: 05/09/2023] [Accepted: 09/07/2023] [Indexed: 09/11/2023] Open
Abstract
MOTIVATION: Accurate prediction of drug-target binding affinity (DTA) is crucial for drug discovery. The increase in the publication of large-scale DTA datasets enables the development of various computational methods for DTA prediction. Numerous deep learning-based methods have been proposed to predict affinities, some of which utilize only original sequence information or complex structures, but the effective combination of various sources of information with protein-binding pockets has not been fully exploited. Therefore, a new method that integrates available key information is urgently needed to predict DTA and accelerate the drug discovery process. RESULTS: In this study, we propose a novel deep learning-based predictor termed DataDTA to estimate the affinities of drug-target pairs. DataDTA utilizes descriptors of predicted pockets and sequences of proteins, as well as low-dimensional molecular features and SMILES strings of compounds as inputs. Specifically, the pockets were predicted from the three-dimensional structure of proteins and their descriptors were extracted as the partial input features for DTA prediction. The molecular representation of compounds based on algebraic graph features was collected to supplement the input information of targets. Furthermore, to ensure effective learning of multiscale interaction features, a dual-interaction aggregation neural network strategy was developed. DataDTA was compared with state-of-the-art methods on different datasets, and the results showed that DataDTA is a reliable prediction tool for affinity estimation. Specifically, the concordance index (CI) of DataDTA is 0.806 and the Pearson correlation coefficient (R) is 0.814 on the test dataset, both higher than those of the other methods. AVAILABILITY AND IMPLEMENTATION: The code and datasets of DataDTA are available at https://github.com/YanZhu06/DataDTA.
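For readers unfamiliar with the concordance index quoted above, the following is a minimal sketch of how CI is commonly computed for affinity prediction: the fraction of compound pairs with different measured affinities whose predicted values are ordered the same way, with tied predictions counted as half. This is the conventional definition, not code from the DataDTA repository, and the affinity values are made up for illustration.

```python
def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs (different true affinities) whose
    predictions are ranked in the same order; prediction ties count 0.5."""
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # pairs with equal true affinity are not comparable
            comparable += 1
            high, low = (i, j) if y_true[i] > y_true[j] else (j, i)
            if y_pred[high] > y_pred[low]:
                concordant += 1.0
            elif y_pred[high] == y_pred[low]:
                concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical measured and predicted affinities (e.g. pKd values).
measured = [5.0, 6.2, 7.1, 8.4]
predicted = [5.3, 6.0, 7.5, 7.2]
ci = concordance_index(measured, predicted)
print(f"CI = {ci:.3f}")
```

A CI of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported 0.806 means roughly four out of five comparable pairs are ordered correctly.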
Affiliation(s)
- Yan Zhu
  - Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
- Lingling Zhao
  - Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
- Naifeng Wen
  - School of Mechanical and Electrical Engineering, Dalian Minzu University, Dalian 116600, China
- Junjie Wang
  - Department of Medical Informatics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Chunyu Wang
  - Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
|
44
|
Williams AH, Zhan CG. Staying Ahead of the Game: How SARS-CoV-2 has Accelerated the Application of Machine Learning in Pandemic Management. BioDrugs 2023; 37:649-674. [PMID: 37464099 DOI: 10.1007/s40259-023-00611-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/28/2023] [Indexed: 07/20/2023]
Abstract
In recent years, machine learning (ML) techniques have garnered considerable interest for their potential use in accelerating the rate of drug discovery. With the emergence of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic, the utilization of ML has become even more crucial in the search for effective antiviral medications. The pandemic has presented the scientific community with a unique challenge, and the rapid identification of potential treatments has become an urgent priority. Researchers have been able to accelerate the process of identifying drug candidates, repurposing existing drugs, and designing new compounds with desirable properties using machine learning in drug discovery. To train predictive models, ML techniques in drug discovery rely on the analysis of large datasets, including both experimental and clinical data. These models can be used to predict the biological activities, potential side effects, and interactions with specific target proteins of drug candidates. This strategy has proven to be an effective method for identifying potential coronavirus disease 2019 (COVID-19) and other disease treatments. This paper offers a thorough analysis of the various ML techniques implemented to combat COVID-19, including supervised and unsupervised learning, deep learning, and natural language processing. The paper discusses the impact of these techniques on pandemic drug development, including the identification of potential treatments, the understanding of the disease mechanism, and the creation of effective and safe therapeutics. The lessons learned can be applied to future outbreaks and drug discovery initiatives.
Affiliation(s)
- Alexander H Williams
  - Molecular Modeling and Biopharmaceutical Center, University of Kentucky, 789 South Limestone Street, Lexington, KY, 40536, USA
  - Department of Pharmaceutical Sciences, College of Pharmacy, University of Kentucky, 789 South Limestone Street, Lexington, KY, 40536, USA
  - GSK Upper Providence, 1250 S. Collegeville Road, Collegeville, PA, 19426, USA
- Chang-Guo Zhan
  - Molecular Modeling and Biopharmaceutical Center, University of Kentucky, 789 South Limestone Street, Lexington, KY, 40536, USA
  - Department of Pharmaceutical Sciences, College of Pharmacy, University of Kentucky, 789 South Limestone Street, Lexington, KY, 40536, USA
|
45
|
Teng S, Yin C, Wang Y, Chen X, Yan Z, Cui L, Wei L. MolFPG: Multi-level fingerprint-based Graph Transformer for accurate and robust drug toxicity prediction. Comput Biol Med 2023; 164:106904. [PMID: 37453376 DOI: 10.1016/j.compbiomed.2023.106904] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/23/2023] [Revised: 03/20/2023] [Accepted: 04/10/2023] [Indexed: 07/18/2023]
Abstract
Drug toxicity prediction is essential to drug development: it can help screen out compounds with potential toxicity and reduce the cost and risk of animal experiments and clinical trials. However, traditional handcrafted feature-based and molecular-graph-based approaches are insufficient for molecular representation learning. To address the problem, we developed an innovative molecular fingerprint Graph Transformer framework (MolFPG) with a global-aware module for interpretable toxicity prediction. Our approach encodes compounds using multiple molecular fingerprinting techniques and integrates Graph Transformer-based molecular representation for feature learning and toxicity prediction. Experimental results show that our proposed approach has high accuracy and reliability in predicting drug toxicity. In addition, we explored the relationship between drug features and toxicity through an interpretive analysis, which improved the interpretability of the approach. Our results highlight the potential of Graph Transformers and multi-level fingerprints for accelerating the drug discovery process by reliably and effectively flagging drug safety concerns. We believe that our study will provide vital support and reference for further development in the field of drug development and toxicity assessment.
Affiliation(s)
- Saisai Teng
  - School of Software, Shandong University, Jinan, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
- Chenglin Yin
  - School of Software, Shandong University, Jinan, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
- Yu Wang
  - School of Software, Shandong University, Jinan, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
- Zhongmin Yan
  - School of Software, Shandong University, Jinan, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
- Lizhen Cui
  - School of Software, Shandong University, Jinan, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
- Leyi Wei
  - School of Software, Shandong University, Jinan, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
|
46
|
Miao Y, Ma H, Huang J. Recent Advances in Toxicity Prediction: Applications of Deep Graph Learning. Chem Res Toxicol 2023; 36:1206-1226. [PMID: 37562046 DOI: 10.1021/acs.chemrestox.2c00384] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/12/2023]
Abstract
The development of new drugs is time-consuming and expensive, and as such, accurately predicting the potential toxicity of a drug candidate is crucial in ensuring its safety and efficacy. Recently, deep graph learning has become prevalent in this field due to its computational power and cost efficiency. Many novel deep graph learning methods aid toxicity prediction and further promote drug development. This review aims to connect fundamental knowledge with burgeoning deep graph learning methods. We first summarize the essential components of deep graph learning models for toxicity prediction, including molecular descriptors, molecular representations, evaluation metrics, validation methods, and data sets. Furthermore, based on various graph-related representations of molecules, we introduce several representative studies and methods for toxicity prediction from the perspective of GNN architectures and graph pretrained models. Compared to other types of models, deep graph models not only offer higher accuracy and efficiency but also provide more intuitive insights, which is significant for the development of model interpretation and generalization ability. Graph pretrained models are emerging because they can extract prominent features from large-scale unlabeled molecular graph data and improve the performance of downstream toxicity prediction tasks. We hope this survey can serve as a handbook for individuals interested in exploring deep graph learning for toxicity prediction.
Affiliation(s)
- Yuwei Miao
  - Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, Texas 76019, United States
- Hehuan Ma
  - Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, Texas 76019, United States
- Junzhou Huang
  - Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, Texas 76019, United States
|
47
|
Kırboğa KK, Abbasi S, Küçüksille EU. Explainability and white box in drug discovery. Chem Biol Drug Des 2023; 102:217-233. [PMID: 37105727 DOI: 10.1111/cbdd.14262] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/12/2022] [Revised: 03/24/2023] [Accepted: 04/12/2023] [Indexed: 04/29/2023]
Abstract
Recently, artificial intelligence (AI) techniques have been increasingly used to overcome the challenges in drug discovery. Although traditional AI techniques generally have high accuracy rates, their decision processes and patterns can be difficult to explain, which makes it hard to understand and make sense of the outputs of algorithms used in drug discovery. Explainable AI (XAI) techniques allow the causes and consequences of the decision process to be better understood, which can help further improve the drug discovery process and support the right decisions. To address this issue, XAI emerged as a process and method that securely captures the results and outputs of machine learning (ML) and deep learning (DL) algorithms. Using techniques such as SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) has made the drug targeting phase clearer and more understandable. XAI methods are expected to reduce time and cost in future computational drug discovery studies. This review provides a comprehensive overview of XAI-based drug discovery and development prediction, and of XAI mechanisms that increase confidence in AI and modeling methods. The limitations and future directions of XAI in drug discovery are also discussed.
Affiliation(s)
- Kevser Kübra Kırboğa
  - Bioengineering Department, Bilecik Seyh Edebali University, Bilecik, Turkey
  - Informatics Institute, Istanbul Technical University, Maslak, Turkey
- Sumra Abbasi
  - Department of Biological Sciences, National of Medical Sciences, Rawalpindi, Pakistan
- Ecir Uğur Küçüksille
  - Department of Computer Engineering, Süleyman Demirel University, Isparta, Turkey
|
48
|
Das S, Babu A, Medha T, Ramanathan G, Mukherjee AG, Wanjari UR, Murali R, Kannampuzha S, Gopalakrishnan AV, Renu K, Sinha D, George Priya Doss C. Molecular mechanisms augmenting resistance to current therapies in clinics among cervical cancer patients. Med Oncol 2023; 40:149. [PMID: 37060468 PMCID: PMC10105157 DOI: 10.1007/s12032-023-01997-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2023] [Accepted: 03/10/2023] [Indexed: 04/16/2023]
Abstract
Cervical cancer (CC) is the fourth leading cause of cancer death (~324,000 deaths annually) among women internationally, with 85% of these deaths reported in developing regions, particularly sub-Saharan Africa and Southeast Asia. Human papillomavirus (HPV) is considered the major driver of CC, and with the availability of the prophylactic vaccine, HPV-associated CC is expected to be eliminated soon. However, female patients with advanced-stage cervical cancer demonstrated a high recurrence rate (50-70%) within two years of completing radiochemotherapy. Currently, 90% of failures in chemotherapy occur during the invasion and metastasis of cancers and are related to drug resistance. Although molecular targeted therapies have shown promising results in the lab, they have had little success in patients because tumor heterogeneity fuels resistance to these therapies and allows tumors to bypass the targeted signaling pathway. The last two decades have seen the emergence of immunotherapy, especially immune checkpoint blockade (ICB) therapies, as an effective treatment against metastatic tumors. Unfortunately, only a small subgroup of patients (<20%) have benefited from this approach, reflecting disease heterogeneity and the manifestation of primary or acquired resistance over time. Thus, understanding the mechanisms driving drug resistance in CC could significantly improve the quality of medical care for cancer patients and steer them to accurate, individualized treatment. The rise of artificial intelligence and machine learning has also been a pivotal factor in cancer drug discovery. With the advancement in such technology, cervical cancer screening and diagnosis are expected to become easier. This review will systematically discuss the different tumor-intrinsic and extrinsic mechanisms that CC cells use to adapt to and resist current treatments, and outline novel strategies to overcome cancer drug resistance.
Affiliation(s)
- Soumik Das
  - School of Biosciences and Technology, Vellore Institute of Technology (VIT), Vellore, Tamil Nadu, 632014, India
- Achsha Babu
  - School of Biosciences and Technology, Vellore Institute of Technology (VIT), Vellore, Tamil Nadu, 632014, India
- Tamma Medha
  - School of Biosciences and Technology, Vellore Institute of Technology (VIT), Vellore, Tamil Nadu, 632014, India
- Gnanasambandan Ramanathan
  - School of Biosciences and Technology, Vellore Institute of Technology (VIT), Vellore, Tamil Nadu, 632014, India
- Anirban Goutam Mukherjee
  - School of Biosciences and Technology, Vellore Institute of Technology (VIT), Vellore, Tamil Nadu, 632014, India
- Uddesh Ramesh Wanjari
  - School of Biosciences and Technology, Vellore Institute of Technology (VIT), Vellore, Tamil Nadu, 632014, India
- Reshma Murali
  - School of Biosciences and Technology, Vellore Institute of Technology (VIT), Vellore, Tamil Nadu, 632014, India
- Sandra Kannampuzha
  - School of Biosciences and Technology, Vellore Institute of Technology (VIT), Vellore, Tamil Nadu, 632014, India
- Kaviyarasi Renu
  - Department of Biochemistry, Centre of Molecular Medicine and Diagnostics (COMManD), Saveetha Dental College & Hospitals, Saveetha Institute of Medical and Technical Sciences, Saveetha University, Chennai, 600077, Tamil Nadu, India
- Debottam Sinha
  - Faculty of Medicine, Frazer Institute, The University of Queensland, Brisbane, QLD, Australia
- C George Priya Doss
  - School of Biosciences and Technology, Vellore Institute of Technology (VIT), Vellore, Tamil Nadu, 632014, India
|
49
|
Rullo R, Cerchia C, Nasso R, Romanelli V, Vendittis ED, Masullo M, Lavecchia A. Novel Reversible Inhibitors of Xanthine Oxidase Targeting the Active Site of the Enzyme. Antioxidants (Basel) 2023; 12:antiox12040825. [PMID: 37107199 PMCID: PMC10135315 DOI: 10.3390/antiox12040825] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 03/02/2023] [Revised: 03/22/2023] [Accepted: 03/24/2023] [Indexed: 03/30/2023] Open
Abstract
Xanthine oxidase (XO) is a flavoprotein catalysing the oxidation of hypoxanthine to xanthine and then to uric acid, while simultaneously producing reactive oxygen species. Altered functions of XO may lead to severe pathological conditions, including gout-causing hyperuricemia and oxidative damage of tissues. These findings prompted research studies aimed at targeting the activity of this crucial enzyme. During the course of a virtual screening study aimed at the discovery of novel inhibitors targeting another oxidoreductase, superoxide dismutase, we identified four compounds with non-purine-like structures, namely ALS-1, -8, -15 and -28, that were capable of causing direct inhibition of XO. The kinetic studies of their inhibition mechanism allowed a definition of these compounds as competitive inhibitors of XO. The most potent molecule was ALS-28 (Ki 2.7 ± 1.5 µM), followed by ALS-8 (Ki 4.5 ± 1.5 µM) and by the less potent ALS-15 (Ki 23 ± 9 µM) and ALS-1 (Ki 41 ± 14 µM). Docking studies shed light on the molecular basis of the inhibitory activity of ALS-28, which obstructs the enzyme cavity channel for substrate entry, consistent with the competitive mechanism observed in kinetic studies. Moreover, the structural features emerging from the docked poses of ALS-8, -15 and -1 may explain their lower inhibitory power with respect to ALS-28. All these structurally unrelated compounds represent valuable candidates for further elaboration into promising lead compounds.
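To show what the reported Ki values imply under a competitive mechanism, the sketch below applies the standard competitive-inhibition form of the Michaelis-Menten equation, in which the inhibitor raises the apparent Km by a factor of 1 + [I]/Ki. The Ki values are taken from the abstract; the Km, substrate concentration, and inhibitor concentration are hypothetical placeholders, not measurements from the study.

```python
# Ki values (µM) as reported in the abstract.
KI_UM = {"ALS-28": 2.7, "ALS-8": 4.5, "ALS-15": 23.0, "ALS-1": 41.0}


def apparent_km_factor(inhibitor_um, ki_um):
    """Factor by which a competitive inhibitor raises the apparent Km."""
    return 1.0 + inhibitor_um / ki_um


def relative_rate(substrate_um, km_um, inhibitor_um, ki_um):
    """Reaction rate as a fraction of Vmax under competitive inhibition:
    v / Vmax = [S] / (Km * (1 + [I]/Ki) + [S])."""
    km_app = km_um * apparent_km_factor(inhibitor_um, ki_um)
    return substrate_um / (km_app + substrate_um)


# Hypothetical assay conditions (assumptions for illustration only).
KM_UM = 5.0
SUBSTRATE_UM = 10.0
INHIBITOR_UM = 10.0

rates = {
    name: relative_rate(SUBSTRATE_UM, KM_UM, INHIBITOR_UM, ki)
    for name, ki in KI_UM.items()
}
for name, rate in rates.items():
    factor = apparent_km_factor(INHIBITOR_UM, KI_UM[name])
    print(f"{name}: apparent Km x{factor:.2f}, v/Vmax = {rate:.2f}")
```

Because Vmax is unchanged under competitive inhibition, raising the substrate concentration would eventually overcome any of the four inhibitors in this model; the lower the Ki, the more substrate that takes.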
|
50
|
Mensa S, Sahin E, Tacchino F, Kl Barkoutsos P, Tavernelli I. Quantum machine learning framework for virtual screening in drug discovery: a prospective quantum advantage. MACHINE LEARNING: SCIENCE AND TECHNOLOGY 2023. [DOI: 10.1088/2632-2153/acb900] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/19/2023] Open
Abstract
Machine learning for ligand-based virtual screening (LB-VS) is an important in-silico tool for discovering new drugs in a faster and more cost-effective manner, especially for emerging diseases such as COVID-19. In this paper, we propose a general-purpose framework combining a classical Support Vector Classifier algorithm with quantum kernel estimation for LB-VS on real-world databases, and we argue in favor of its prospective quantum advantage. Indeed, we heuristically prove that our quantum integrated workflow can, at least in some relevant instances, provide a tangible advantage compared to state-of-the-art classical algorithms operating on the same datasets, showing strong dependence on the target and the feature selection method. Finally, we test our algorithm on IBM Quantum processors using ADRB2 and COVID-19 datasets, showing that hardware simulations provide results in line with the predicted performances and can surpass classical equivalents.
|