1
Ambreen S, Umar M, Noor A, Jain H, Ali R. Advanced AI and ML frameworks for transforming drug discovery and optimization: With innovative insights in polypharmacology, drug repurposing, combination therapy and nanomedicine. Eur J Med Chem 2025; 284:117164. [PMID: 39721292 DOI: 10.1016/j.ejmech.2024.117164] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2024] [Revised: 11/24/2024] [Accepted: 11/27/2024] [Indexed: 12/28/2024]
Abstract
Artificial Intelligence (AI) and Machine Learning (ML) are transforming drug discovery by overcoming traditional challenges such as high costs, long timelines, and frequent failures. AI-driven approaches streamline key phases, including target identification, lead optimization, de novo drug design, and drug repurposing. Frameworks such as deep neural networks (DNNs), convolutional neural networks (CNNs), and deep reinforcement learning (DRL) models have shown promise in identifying drug targets, optimizing delivery systems, and accelerating drug repurposing. Generative adversarial networks (GANs) and variational autoencoders (VAEs) aid de novo drug design by creating novel drug-like compounds with desired properties. Case studies, such as DDR1 kinase inhibitors designed using generative models and CDK20 inhibitors developed via structure-based methods, highlight AI's ability to produce highly specific therapeutics. Models like SNF-CVAE and DeepDR further advance drug repurposing by uncovering new therapeutic applications for existing drugs. Advanced ML algorithms enhance precision in predicting drug efficacy, toxicity, and ADME-Tox properties, reducing development costs and improving drug-target interactions. AI also supports polypharmacology by optimizing multi-target drug interactions and enhances combination therapy through predictions of drug synergies and antagonisms. In nanomedicine, AI models like CURATE.AI and the Hartung algorithm optimize personalized treatments by predicting toxicological risks and making real-time dosing adjustments with high accuracy. Despite this potential, challenges such as data quality, model interpretability, and ethical concerns must be addressed. High-quality datasets, transparent models, and unbiased algorithms are essential for reliable AI applications. As AI continues to evolve, it is poised to revolutionize drug discovery and personalized medicine, advancing therapeutic development and patient care.
Affiliation(s)
- Subiya Ambreen
  - Department of Pharmaceutical Chemistry, Delhi Institute of Pharmaceutical Sciences and Research (DIPSAR), DPSRU, Pushp Vihar, New Delhi, 110017, India
- Mohammad Umar
  - Department of Pharmaceutical Chemistry, Delhi Institute of Pharmaceutical Sciences and Research (DIPSAR), DPSRU, Pushp Vihar, New Delhi, 110017, India
- Aaisha Noor
  - Department of Pharmaceutical Chemistry, Delhi Institute of Pharmaceutical Sciences and Research (DIPSAR), DPSRU, Pushp Vihar, New Delhi, 110017, India
- Himangini Jain
  - Department of Pharmaceutical Chemistry, Delhi Institute of Pharmaceutical Sciences and Research (DIPSAR), DPSRU, Pushp Vihar, New Delhi, 110017, India
- Ruhi Ali
  - Department of Pharmaceutical Chemistry, Delhi Institute of Pharmaceutical Sciences and Research (DIPSAR), DPSRU, Pushp Vihar, New Delhi, 110017, India

2
Brahma R, Moon S, Shin JM, Cho KH. AiGPro: a multi-tasks model for profiling of GPCRs for agonist and antagonist. J Cheminform 2025; 17:12. [PMID: 39881398 PMCID: PMC11780767 DOI: 10.1186/s13321-024-00945-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2024] [Accepted: 12/27/2024] [Indexed: 01/31/2025]
Abstract
G protein-coupled receptors (GPCRs) play vital roles in various physiological processes, making them attractive drug discovery targets. Meanwhile, deep learning techniques have revolutionized drug discovery by facilitating efficient tools for expediting the identification and optimization of ligands. However, existing models for GPCRs often focus on a single target or a small subset of GPCRs, or employ binary classification, constraining their applicability for high-throughput virtual screening. To address these issues, we introduce AiGPro, a novel multitask model designed to predict small molecule agonists (EC50) and antagonists (IC50) across the 231 human GPCRs, making it a first-in-class solution for large-scale GPCR profiling. Leveraging multi-scale context aggregation and bidirectional multi-head cross-attention mechanisms, our approach demonstrates that ensemble models may not be necessary for predicting complex GPCR states and small molecule interactions. Through extensive validation using stratified tenfold cross-validation, AiGPro achieves robust performance with Pearson's correlation coefficient of 0.91, indicating broad generalizability. This breakthrough sets a new standard in GPCR studies, outperforming previous studies. Moreover, our first-in-class multi-tasking model can predict agonist and antagonist activities across a wide range of GPCRs, offering a comprehensive perspective on ligand bioactivity within this diverse superfamily. To facilitate easy accessibility, we have deployed a web-based platform for model access at https://aicadd.ssu.ac.kr/AiGPro .
Scientific Contribution
We introduce a deep learning-based multi-task model to generalize the agonist and antagonist bioactivity prediction for GPCRs accurately. The model is implemented on a user-friendly web server to facilitate rapid screening of small-molecule libraries, expediting GPCR-targeted drug discovery.
Covering a diverse set of 231 GPCR targets, the platform delivers a robust, scalable solution for advancing GPCR-focused therapeutic development. The proposed framework incorporates an innovative dual-label prediction strategy, enabling the simultaneous classification of molecules as agonists, antagonists, or both. Each prediction is further accompanied by a confidence score, offering a quantitative measure of activity likelihood. This advancement moves beyond conventional models focusing solely on binding affinity, providing a more comprehensive understanding of ligand-receptor interactions. At the core of our model lies the Bi-Directional Multi-Head Cross-Attention (BMCA) module, a novel architecture that captures forward and backward contextual embeddings of protein and ligand features. By leveraging BMCA, the model effectively integrates structural and sequence-level information, ensuring a precise representation of molecular interactions. Results show that this approach is highly accurate in binding affinity predictions and consistent across diverse GPCR families. By unifying agonist and antagonist bioactivity prediction into a single model architecture, we bridge a critical gap in GPCR modeling. This enhances prediction accuracy and accelerates virtual screening workflows, offering a valuable and innovative solution for advancing GPCR-targeted drug discovery.
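The abstract above reports a Pearson's correlation coefficient of 0.91 between predicted and measured activities. The following is a minimal sketch of how such a coefficient is computed from paired values. The function name `pearson_r` and the example numbers are hypothetical and are not taken from AiGPro; the sketch assumes activities are expressed on a log scale such as pEC50.

```python
import math


def pearson_r(measured, predicted):
    """Pearson correlation between two equally long sequences of numbers."""
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_m = sum(measured) / len(measured)
    mean_p = sum(predicted) / len(predicted)
    # Covariance and variances are left unnormalised; the 1/n factors cancel.
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_m == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_m * var_p)


# Hypothetical measured vs. predicted pEC50 values (illustrative only).
measured = [5.1, 6.3, 7.0, 8.2, 6.8]
predicted = [5.4, 6.0, 7.3, 7.9, 6.5]
r = pearson_r(measured, predicted)
```

In a cross-validation setting, one would typically compute this value on the held-out fold of each split and report the average, although the abstract does not state the exact aggregation used.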
Affiliation(s)
- Rahul Brahma
  - School of Systems Biomedical Science, Soongsil University, 369 Sangdo-ro, Dongjak-gu, 06978, Seoul, Republic of Korea
- Sunghyun Moon
  - School of Systems Biomedical Science, Soongsil University, 369 Sangdo-ro, Dongjak-gu, 06978, Seoul, Republic of Korea
- Jae-Min Shin
  - AzothBio, Rm. DA724 Hyundai Knowledge Industry Center, Hanam-si, Gyeonggi-do, Republic of Korea
- Kwang-Hwi Cho
  - School of Systems Biomedical Science, Soongsil University, 369 Sangdo-ro, Dongjak-gu, 06978, Seoul, Republic of Korea

3
Jobe A, Vijayan R. Orphan G protein-coupled receptors: the ongoing search for a home. Front Pharmacol 2024; 15:1349097. [PMID: 38495099 PMCID: PMC10941346 DOI: 10.3389/fphar.2024.1349097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2023] [Accepted: 02/15/2024] [Indexed: 03/19/2024]
Abstract
G protein-coupled receptors (GPCRs) make up the largest receptor superfamily, accounting for 4% of protein-coding genes. Despite the prevalence of such transmembrane receptors, a significant number remain orphans, lacking identified endogenous ligands. Since their conception, the reverse pharmacology approach has been used to characterize such receptors. However, the multifaceted and nuanced nature of GPCR signaling poses a great challenge to their pharmacological elucidation. Considering their therapeutic relevance, the search for native orphan GPCR ligands continues. Despite limited structural input in terms of 3D crystallized structures, with advances in machine-learning approaches, there has been great progress with respect to accurate ligand prediction. Though such an approach proves valuable given that ligand scarcity is the greatest hurdle to orphan GPCR deorphanization, the future pairings of the remaining orphan GPCRs may not necessarily take a one-size-fits-all approach but should be more comprehensive in accounting for numerous nuanced possibilities to cover the full spectrum of GPCR signaling.
Affiliation(s)
- Amie Jobe
  - Department of Biology, College of Science, United Arab Emirates University, Al Ain, United Arab Emirates
- Ranjit Vijayan
  - Department of Biology, College of Science, United Arab Emirates University, Al Ain, United Arab Emirates
  - The Big Data Analytics Center, United Arab Emirates University, Al Ain, United Arab Emirates
  - Zayed Bin Sultan Center for Health Sciences, United Arab Emirates University, Al Ain, United Arab Emirates

4
Remington JM, McKay KT, Beckage NB, Ferrell JB, Schneebeli ST, Li J. GPCRLigNet: rapid screening for GPCR active ligands using machine learning. J Comput Aided Mol Des 2023; 37:147-156. [PMID: 36840893 PMCID: PMC10379640 DOI: 10.1007/s10822-023-00497-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2022] [Accepted: 02/03/2023] [Indexed: 02/26/2023]
Abstract
Molecules with bioactivity towards G protein-coupled receptors represent a subset of the vast space of small drug-like molecules. Here, we compare machine learning models, including dilated graph convolutional networks, that conduct binary classification to quickly identify molecules with activity towards G protein-coupled receptors. The models are trained and validated using a large set of over 600,000 active, inactive, and decoy compounds. The best performing machine learning model, dubbed GPCRLigNet, was a surprisingly simple feedforward dense neural network mapping from Morgan fingerprints to activity. Incorporation of GPCRLigNet into a high-throughput virtual screening workflow is demonstrated with molecular docking towards a particular G protein-coupled receptor, the pituitary adenylate cyclase-activating polypeptide receptor type 1. Through rigorous comparison of docking scores for molecules selected with and without using GPCRLigNet, we demonstrate an enrichment of potentially potent molecules using GPCRLigNet. This work provides a proof of principle that GPCRLigNet can effectively hone the chemical search space towards ligands with G protein-coupled receptor activity.
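The abstract describes the best model as a simple feedforward dense network mapping fingerprints to activity. The sketch below illustrates only the forward pass of such a network. It is not the GPCRLigNet architecture: the layer sizes, the weights, and the 4-bit toy vector standing in for a Morgan fingerprint are all assumptions made for illustration (real Morgan fingerprints are usually hundreds to thousands of bits and are computed with a cheminformatics toolkit).

```python
import math


def dense(inputs, weights, biases):
    """One fully connected layer: each output is a weighted sum plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_activity(fingerprint, params):
    """Return a probability-like activity score in (0, 1) for one bit vector."""
    hidden = relu(dense(fingerprint, params["w1"], params["b1"]))
    logit = dense(hidden, params["w2"], params["b2"])[0]
    return sigmoid(logit)


# Hypothetical, hand-picked parameters: 4 input bits -> 2 hidden units -> 1 output.
toy_params = {
    "w1": [[1.0, -1.0, 0.5, 0.0], [0.0, 1.0, -0.5, 1.0]],
    "b1": [0.0, 0.0],
    "w2": [[1.0, -1.0]],
    "b2": [0.0],
}

score_a = predict_activity([1, 0, 1, 0], toy_params)  # logit = 1.5
score_b = predict_activity([0, 1, 0, 1], toy_params)  # logit = -2.0
```

In a screening workflow of the kind described, molecules whose score exceeds a chosen threshold would be passed on to the more expensive docking step; the threshold itself is a design choice the abstract does not specify.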
Affiliation(s)
- Jacob M Remington
  - Department of Chemistry, University of Vermont, Burlington, VT, 05405, USA
- Kyle T McKay
  - Department of Chemistry, University of Vermont, Burlington, VT, 05405, USA
- Noah B Beckage
  - Department of Chemistry, University of Vermont, Burlington, VT, 05405, USA
- Jonathon B Ferrell
  - Department of Chemistry, University of Vermont, Burlington, VT, 05405, USA
- Severin T Schneebeli
  - Department of Chemistry, University of Vermont, Burlington, VT, 05405, USA
  - Department of Industrial and Physical Pharmacy, Department of Chemistry, Purdue University, West Lafayette, IN, 47906, USA
  - Department of Pathology, University of Vermont, Burlington, VT, 05405, USA
- Jianing Li
  - Department of Chemistry, University of Vermont, Burlington, VT, 05405, USA
  - Department of Pathology, University of Vermont, Burlington, VT, 05405, USA
  - Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, IN, 47906, USA

5
Shaw TI, Zhao B, Li Y, Wang H, Wang L, Manley B, Stewart PA, Karolak A. Multi-omics approach to identifying isoform variants as therapeutic targets in cancer patients. Front Oncol 2022; 12:1051487. [PMID: 36505834 PMCID: PMC9730332 DOI: 10.3389/fonc.2022.1051487] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/22/2022] [Accepted: 11/07/2022] [Indexed: 11/25/2022]
Abstract
Cancer-specific alternatively spliced events (ASEs) play a role in cancer pathogenesis and can be targeted by immunotherapy, oligonucleotide therapy, and small molecule inhibition. However, identifying actionable ASE targets remains challenging due to uncertainty about their protein products, structural impact, and proteoform (protein isoform) function. Here we argue that an integrated multi-omics profiling strategy can overcome these challenges, allowing us to mine this untapped source of targets for therapeutic development. In this review, we provide an overview of current multi-omics strategies for characterizing ASEs by utilizing the transcriptome, proteome, and state-of-the-art algorithms for protein structure prediction. We discuss limitations and knowledge gaps associated with each technology and its informatics analytics. Finally, we discuss future directions that will enable the full integration of multi-omics data for ASE target discovery.
Affiliation(s)
- Timothy I. Shaw
  - Department of Biostatistics and Bioinformatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, United States. *Correspondence: Timothy I. Shaw
- Bi Zhao
  - Department of Machine Learning, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, United States
- Yuxin Li
  - Center for Proteomics and Metabolomics, St. Jude Children’s Research Hospital, Memphis, TN, United States
- Hong Wang
  - Center for Proteomics and Metabolomics, St. Jude Children’s Research Hospital, Memphis, TN, United States
- Liang Wang
  - Department of Tumor Biology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, United States
- Brandon Manley
  - Department of Genitourinary Oncology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, United States
- Paul A. Stewart
  - Department of Biostatistics and Bioinformatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, United States
- Aleksandra Karolak
  - Department of Machine Learning, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, United States

6
Zhang H, Zhang T, Saravanan KM, Liao L, Wu H, Zhang H, Zhang H, Pan Y, Wu X, Wei Y. DeepBindBC: a practical deep learning method for identifying native-like protein-ligand complexes in virtual screening. Methods 2022; 205:247-262. [PMID: 35878751 DOI: 10.1016/j.ymeth.2022.07.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/05/2022] [Revised: 06/29/2022] [Accepted: 07/12/2022] [Indexed: 12/18/2022]
Abstract
Identifying native-like protein-ligand complexes (PLCs) from an abundance of docking decoys is critical for large-scale virtual drug screening in early-stage lead discovery. Providing reliable predictions is still a challenge for most current affinity prediction models because of a lack of non-binding data during model training, the loss of critical physicochemical features, and difficulties in learning abstract information with a limited number of neural layers. In this work, we propose a deep learning model, DeepBindBC, for classifying putative ligands as binding or non-binding. Our model incorporates information on non-binding interactions, making it more suitable for real applications. A ResNet model architecture and a more detailed atom type representation allow implicit features to be learned more accurately. Here, we show that DeepBindBC outperforms Autodock Vina, Pafnucy, and DLSCORE on three DUD.E testing sets. Moreover, DeepBindBC identified a novel human pancreatic α-amylase binder validated by a fluorescence spectral experiment (Ka = 1.0 × 10^5 M). Furthermore, DeepBindBC can be used as a core component of a hybrid virtual screening pipeline that incorporates many other complementary methods, such as DFCNN, Autodock Vina docking, and pocket molecular dynamics simulation. Additionally, an online web server based on the model is available at http://cbblab.siat.ac.cn/DeepBindBC/index.php for the user's convenience. Our model and the web server provide alternative tools in the early steps of drug discovery by providing accurate identification of native-like PLCs.
Affiliation(s)
- Haiping Zhang
  - Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, PR China
  - Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, PR China
- Tingting Zhang
  - School of Medicine, Shenzhen University, Shenzhen, Guangdong Province 518060, PR China
- Konda Mani Saravanan
  - Department of Biotechnology, Bharath Institute of Higher Education and Research, Chennai 600073, Tamil Nadu, India
- Linbu Liao
  - College of Software Technology, Zhejiang University, Zhejiang Province 315048, PR China
- Hao Wu
  - Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, PR China
- Haishan Zhang
  - Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, PR China
- Huiling Zhang
  - Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, PR China
- Yi Pan
  - Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, PR China
- Xuli Wu
  - School of Medicine, Shenzhen University, Shenzhen, Guangdong Province 518060, PR China
- Yanjie Wei
  - Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, PR China

7
Optimized Multiscale Entropy Model Based on Resting-State fMRI for Appraising Cognitive Performance in Healthy Elderly. Comput Math Methods Med 2022; 2022:2484081. [PMID: 35712004 PMCID: PMC9197667 DOI: 10.1155/2022/2484081] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2021] [Revised: 05/18/2022] [Accepted: 05/19/2022] [Indexed: 11/29/2022]
Abstract
Many studies have indicated that an entropy model can capture the dynamic characteristics of resting-state functional magnetic resonance imaging (rfMRI) signals. However, there are problems of subjectivity and lack of uniform standards in the selection of model parameters relying on experience when using the entropy model to analyze rfMRI. To address this issue, an optimized multiscale entropy (MSE) model was proposed to confirm the parameters objectively. All healthy elderly volunteers were divided into two groups, namely, excellent and poor, by the scores estimated through traditional scale tests before the rfMRI scan. The parameters of the MSE model were optimized with the help of sensitivity parameters such as receiver operating characteristic (ROC) and area under the ROC curve (AUC) in a comparison study between the two groups. The brain regions with significant differences in entropy values were considered biomarkers. Their entropy values were regarded as feature vectors to use as input for the probabilistic neural network in the classification of cognitive scores. Classification accuracy of 80.05% was obtained using machine learning. These results show that the optimized MSE model can accurately select the brain regions sensitive to cognitive performance and objectively select fixed parameters for MSE. This work was expected to provide the basis for entropy to test the cognitive scores of the healthy elderly.
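For readers unfamiliar with multiscale entropy, the following is a minimal sketch of the standard formulation: coarse-grain the signal at each scale, then compute sample entropy on the coarse-grained series. It is not the optimized model from the paper. The parameter names `m` (template length), `r` (tolerance), and `scales` are the usual ones in the MSE literature and are assumed here, since the abstract does not name the parameters it optimizes.

```python
import math


def coarse_grain(series, scale):
    """Average consecutive non-overlapping windows of length `scale`."""
    n_windows = len(series) // scale  # a trailing partial window is dropped
    return [sum(series[i * scale:(i + 1) * scale]) / scale
            for i in range(n_windows)]


def sample_entropy(series, m, r):
    """Sample entropy: -ln(A / B), where B and A count template pairs of
    length m and m + 1 whose Chebyshev distance is within r."""
    n = len(series)

    def count_matches(length):
        # Use the same number of templates (n - m) for both lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        matches = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # self-matches excluded
                dist = max(abs(a - b)
                           for a, b in zip(templates[i], templates[j]))
                if dist <= r:  # some implementations use a strict "<"
                    matches += 1
        return matches

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined when no matches are found
    return -math.log(a / b)


def multiscale_entropy(series, scales, m, r):
    """Sample entropy of the coarse-grained series at each scale factor."""
    return [sample_entropy(coarse_grain(series, s), m, r) for s in scales]
```

The tolerance `r` is often set as a fraction of the signal's standard deviation; choices like this are presumably the kind of experience-based parameter selection the abstract aims to replace with an objective, AUC-driven procedure.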

8
Dankwah KO, Mohl JE, Begum K, Leung MY. What Makes GPCRs from Different Families Bind to the Same Ligand? Biomolecules 2022; 12:863. [PMID: 35883418 PMCID: PMC9313020 DOI: 10.3390/biom12070863] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/15/2022] [Revised: 06/09/2022] [Accepted: 06/19/2022] [Indexed: 12/10/2022]
Abstract
G protein-coupled receptors (GPCRs) are the largest class of cell-surface receptor proteins with important functions in signal transduction and often serve as therapeutic drug targets. With the rapidly growing public data on three dimensional (3D) structures of GPCRs and GPCR-ligand interactions, computational prediction of GPCR ligand binding becomes a convincing option to high throughput screening and other experimental approaches during the beginning phases of ligand discovery. In this work, we set out to computationally uncover and understand the binding of a single ligand to GPCRs from several different families. Three-dimensional structural comparisons of the GPCRs that bind to the same ligand revealed local 3D structural similarities and often these regions overlap with locations of binding pockets. These pockets were found to be similar (based on backbone geometry and side-chain orientation using APoc), and they correlate positively with electrostatic properties of the pockets. Moreover, the more similar the pockets, the more likely a ligand binding to the pockets will interact with similar residues, have similar conformations, and produce similar binding affinities across the pockets. These findings can be exploited to improve protein function inference, drug repurposing and drug toxicity prediction, and accelerate the development of new drugs.
Affiliation(s)
- Kwabena Owusu Dankwah
  - Computational Science Program, The University of Texas at El Paso, El Paso, TX 79968, USA
- Jonathon E. Mohl
  - Computational Science Program, The University of Texas at El Paso, El Paso, TX 79968, USA
  - Bioinformatics Program, The University of Texas at El Paso, El Paso, TX 79968, USA
  - Department of Mathematical Sciences, The University of Texas at El Paso, El Paso, TX 79968, USA
  - Border Biomedical Research Center, The University of Texas at El Paso, El Paso, TX 79968, USA
- Khodeza Begum
  - Bioinformatics Program, The University of Texas at El Paso, El Paso, TX 79968, USA
  - Border Biomedical Research Center, The University of Texas at El Paso, El Paso, TX 79968, USA
- Ming-Ying Leung
  - Computational Science Program, The University of Texas at El Paso, El Paso, TX 79968, USA
  - Bioinformatics Program, The University of Texas at El Paso, El Paso, TX 79968, USA
  - Department of Mathematical Sciences, The University of Texas at El Paso, El Paso, TX 79968, USA
  - Border Biomedical Research Center, The University of Texas at El Paso, El Paso, TX 79968, USA

9
Binding site identification of G protein-coupled receptors through a 3D Zernike polynomials-based method: application to C. elegans olfactory receptors. J Comput Aided Mol Des 2022; 36:11-24. [PMID: 34977999 PMCID: PMC8831295 DOI: 10.1007/s10822-021-00434-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/15/2021] [Accepted: 11/18/2021] [Indexed: 11/01/2022]
Abstract
Studying the binding processes of G protein-coupled receptors (GPCRs) is of particular interest both to better understand the molecular mechanisms that regulate signaling between the extracellular and intracellular environments and for drug design purposes. In this study, we propose a new computational approach for identifying the binding site for a specific ligand on a GPCR. The method is based on the Zernike polynomials and performs the ligand-GPCR association through a shape complementarity analysis of the local molecular surfaces. The method is parameter-free and, working on hundreds of experimentally determined GPCR-ligand complexes, it can distinguish binding pockets from randomly sampled regions on the receptor surface, obtaining an area under the ROC curve of 0.77. Given the importance of C. elegans both as a model organism and in terms of applications, we then investigated its olfactory receptors, building a list of associations between 21 GPCRs belonging to its olfactory neurons and a set of possible ligands. Thus, the method not only enables rapid and efficient screening of drugs proposed for GPCRs, which are key targets in many pathologies, but also lays the groundwork for computational mutagenesis aimed at increasing or decreasing the binding affinity between ligands and receptors.
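The area under the ROC curve quoted above can be read as the probability that a true binding pocket receives a better score than a randomly sampled surface region. A minimal sketch of that pairwise (Mann-Whitney) formulation follows. The function name and the example scores are hypothetical, and the sketch assumes that a higher score means better shape complementarity, which may not match the sign convention used by the Zernike-based method.

```python
def roc_auc(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half a correct pair."""
    if not positive_scores or not negative_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical complementarity scores: true pockets vs. random surface patches.
pocket_scores = [0.9, 0.4]
random_patch_scores = [0.5, 0.1]
auc = roc_auc(pocket_scores, random_patch_scores)  # 3 of 4 pairs correct -> 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported 0.77 indicates a useful but imperfect discriminator.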

10
Velloso JPL, Ascher DB, Pires DEV. pdCSM-GPCR: predicting potent GPCR ligands with graph-based signatures. Bioinform Adv 2021; 1:vbab031. [PMID: 34901870 PMCID: PMC8651072 DOI: 10.1093/bioadv/vbab031] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 05/27/2021] [Revised: 09/30/2021] [Accepted: 11/02/2021] [Indexed: 01/26/2023]
Abstract
MOTIVATION: G protein-coupled receptors (GPCRs) can selectively bind to many types of ligands, ranging from light-sensitive compounds and ions to hormones, pheromones and neurotransmitters, modulating cell physiology. Considering their role in many essential cellular processes, they are one of the most targeted protein families, with over a third of all approved drugs modulating GPCR signalling. Despite this, the large diversity of receptors and their multipass transmembrane architectures make the identification and development of novel, specific, and safe GPCR ligands a challenge. While computational approaches have the potential to assist GPCR drug development, they have presented limited performance and generalization capabilities. Here, we explored the use of graph-based signatures to develop pdCSM-GPCR, a method capable of rapidly and accurately screening potential GPCR ligands.
RESULTS: Bioactivity data (IC50, EC50, Ki and Kd) for individual GPCRs were curated. After curation, we used the data to develop predictive models for 36 major GPCR targets, across 4 classes (A, B, C and F). Our models compose the most comprehensive computational resource for GPCR bioactivity prediction to date. Across stratified 10-fold cross-validation and blind tests, our approach achieved Pearson's correlations of up to 0.89, significantly outperforming previous methods. Interpreting our results, we identified common important features of potent GPCR ligands, which tend to have bicyclic rings, leading to higher levels of aromaticity. We believe pdCSM-GPCR will be an invaluable tool to assist screening efforts, enriching compound libraries and ranking candidates for further experimental validation.
AVAILABILITY AND IMPLEMENTATION: pdCSM-GPCR predictive models and datasets used have been made available via a freely accessible and easy-to-use web server at http://biosig.unimelb.edu.au/pdcsm_gpcr/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.
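Stratified 10-fold cross-validation, mentioned in the results above, keeps the composition of each fold close to that of the full dataset. The sketch below shows one simple way to build such folds. It is an illustration only: the abstract does not say what the data were stratified on, so the use of coarse activity bins as strata is an assumption, and the function name is hypothetical. The split is deterministic, whereas a real pipeline would normally shuffle within each stratum first.

```python
from collections import defaultdict


def stratified_folds(strata, k):
    """Assign sample indices to k folds so that each stratum is spread
    as evenly as possible across the folds (round-robin dealing)."""
    if k < 2:
        raise ValueError("k must be at least 2")
    by_stratum = defaultdict(list)
    for index, label in enumerate(strata):
        by_stratum[label].append(index)

    folds = [[] for _ in range(k)]
    next_fold = 0
    for label in sorted(by_stratum):
        for index in by_stratum[label]:
            folds[next_fold % k].append(index)
            next_fold += 1
    return folds


# Hypothetical activity bins for ten compounds (assumed strata).
bins = ["high"] * 6 + ["low"] * 4
folds = stratified_folds(bins, 2)
# Each fold receives three "high" and two "low" compounds.
```

Each fold then serves once as the held-out test set while the model is trained on the remaining folds.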
Affiliation(s)
- João Paulo L Velloso
  - Fundação Oswaldo Cruz, Instituto René Rachou, Belo Horizonte 30190-009, Brazil
  - Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne 3052, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne 3052, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Australia
  - Universidade Federal de Minas Gerais, Belo Horizonte 31270-901, Brazil
- David B Ascher
  - Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne 3052, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne 3052, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Australia
  - Baker Department of Cardiometabolic Health, Melbourne Medical School, University of Melbourne, Melbourne 3052, Australia
  - Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
- Douglas E V Pires
  - Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne 3052, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne 3052, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Australia
  - School of Computing and Information Systems, University of Melbourne, Melbourne 3053, Australia

11
Selvaraj C, Chandra I, Singh SK. Artificial intelligence and machine learning approaches for drug design: challenges and opportunities for the pharmaceutical industries. Mol Divers 2021; 26:1893-1913. [PMID: 34686947 PMCID: PMC8536481 DOI: 10.1007/s11030-021-10326-z] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 04/05/2021] [Accepted: 09/24/2021] [Indexed: 12/27/2022]
Abstract
The global spread of COVID-19 has made pharmaceutical drug development an urgent and intensely active research area. Developing new drug molecules to overcome any disease is a costly and lengthy process, but the process continues uninterrupted. The critical point in drug design is to use the available data resources to find novel leads. Once the drug target is identified, several interdisciplinary areas work together with artificial intelligence (AI) and machine learning (ML) methods to obtain enriched drug candidates. These AI and ML methods are applied in every step of computer-aided drug design, and integrating them results in a high success rate of hit compounds. In addition, the integration of AI and ML with high-dimensional data, together with their powerful capacity, has taken the field a step forward. Predicting clinical trial outcomes with integrated AI/ML models could further decrease trial costs while also improving the success rate. In this review, we discuss the backend of AI and ML methods in supporting computer-aided drug design, along with the challenges and opportunities for the pharmaceutical industry. AI- and ML-based prediction draws on the available information and data to support high-throughput virtual screening. With this integration of AI and ML, the success rate of hit identification has gained momentum, yielding novel drugs.
Affiliation(s)
- Chandrabose Selvaraj
  - CADD and Molecular Modelling Lab, Department of Bioinformatics, Alagappa University, Science Block, Karaikudi, Tamil Nadu, 630004, India
- Ishwar Chandra
  - CADD and Molecular Modelling Lab, Department of Bioinformatics, Alagappa University, Science Block, Karaikudi, Tamil Nadu, 630004, India
- Sanjeev Kumar Singh
  - CADD and Molecular Modelling Lab, Department of Bioinformatics, Alagappa University, Science Block, Karaikudi, Tamil Nadu, 630004, India
|
12
|
Ali MH, Khan DM, Jamal K, Ahmad Z, Manzoor S, Khan Z. Prediction of Multidrug-Resistant Tuberculosis Using Machine Learning Algorithms in SWAT, Pakistan. Journal of Healthcare Engineering 2021; 2021:2567080. [PMID: 34512933 PMCID: PMC8426057 DOI: 10.1155/2021/2567080] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 05/26/2021] [Accepted: 08/18/2021] [Indexed: 11/20/2022]
Abstract
In this paper, we have focused on machine learning (ML) feature selection (FS) algorithms for identifying and diagnosing multidrug-resistant (MDR) tuberculosis (TB). MDR-TB is a universal public health problem, and its early detection has been one of the burning issues. The present study has been conducted in the Malakand Division of Khyber Pakhtunkhwa, Pakistan, to further add to the knowledge on the disease and to deal with the issues of identification and early detection of MDR-TB by ML algorithms. These models also identify the most important factors causing MDR-TB infection, whose study gives additional insights into the matter. ML algorithms such as random forest (RF), k-nearest neighbors, support vector machine (SVM), logistic regression, least absolute shrinkage and selection operator (LASSO), artificial neural networks (ANNs), and decision trees are applied to analyse the case-control dataset. This study reveals that close contacts of MDR-TB patients, smoking, depression, previous TB history, improper treatment, and interruption in first-line TB treatment have a great impact on the status of MDR. Accordingly, weight loss, chest pain, hemoptysis, and fatigue are important symptoms. Based on accuracy, sensitivity, and specificity, SVM and RF are the suggested models to be used for patients' classifications.
Affiliation(s)
- Mian Haider Ali
  - Department of Statistics, Abdul Wali Khan University, Mardan, Pakistan
  - Programmatic Management of Drug-Resistant Tuberculosis, Saidu Teaching Hospital, Swat, Pakistan
- Khalid Jamal
  - Programmatic Management of Drug-Resistant Tuberculosis, Saidu Teaching Hospital, Swat, Pakistan
- Zubair Ahmad
  - Department of Statistics, Yazd University, P.O. Box 89175-741, Yazd, Iran
- Sadaf Manzoor
  - Department of Statistics, Islamia College Peshawar, Peshawar, Pakistan
- Zardad Khan
  - Department of Statistics, Abdul Wali Khan University, Mardan, Pakistan
|
13
|
GPCR_LigandClassify.py; a rigorous machine learning classifier for GPCR targeting compounds. Sci Rep 2021; 11:9510. [PMID: 33947911 PMCID: PMC8097070 DOI: 10.1038/s41598-021-88939-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/05/2020] [Accepted: 04/12/2021] [Indexed: 02/02/2023] Open
Abstract
The current study describes the construction of various ligand-based machine learning models to be used for drug repurposing against the family of G-Protein Coupled Receptors (GPCRs). In building these models, we collected >500,000 data points, encompassing experimentally measured molecular association data of >160,000 unique ligands against >250 GPCRs. These data points were retrieved from the GPCR-Ligand Association (GLASS) database. We have used diverse molecular featurization methods to describe the input molecules. Multiple supervised ML algorithms were developed, tested and compared for their accuracy, F scores, and Matthews correlation coefficient (MCC) scores. Our data suggest that, combined with molecular fingerprinting, ensemble decision trees and gradient boosted trees ML algorithms come close to the accuracy of the more sophisticated deep neural net (DNN)-based algorithms. On a test dataset, these models displayed an excellent performance, reaching ~90% classification accuracy. Additionally, we showcase a few examples where our models were able to identify interesting connections between known drugs from the DrugBank database and members of the GPCR family of receptors. Our findings are in excellent agreement with previously reported experimental observations in the literature. We hope the models presented in this paper synergize with the currently ongoing interest in applying machine learning modeling in the field of drug repurposing and computational drug discovery in general.
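The three comparison metrics named in this abstract (accuracy, F score, MCC) can be illustrated with a minimal sketch for the binary case. The function name `binary_metrics` and the toy labels are hypothetical, and the F score is assumed here to be the balanced F1; none of this is taken from the paper's code.

```python
import math


def binary_metrics(y_true, y_pred):
    """Return accuracy, F1 and MCC for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)

    # F1 is the harmonic mean of precision and recall.
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0

    # MCC uses all four confusion-matrix cells; by convention it is
    # reported as 0 when any marginal count is zero.
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0

    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


# Hypothetical toy labels: 1 = ligand binds the receptor, 0 = does not.
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(binary_metrics(toy_true, toy_pred))
```

Unlike accuracy, MCC stays informative when the two classes are imbalanced, which is one common reason to report it alongside accuracy and F scores.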
|
14
|
Karimi S, Ahmadi M, Goudarzi F, Ferdousi R. A computational model for GPCR-ligand interaction prediction. J Integr Bioinform 2020; 18:155-165. [PMID: 34171942 PMCID: PMC7790179 DOI: 10.1515/jib-2019-0084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2019] [Accepted: 11/25/2020] [Indexed: 11/25/2022] Open
Abstract
G protein-coupled receptors (GPCRs) play an essential role in critical human activities, and they are considered targets for a wide range of drugs. Because of these crucial roles, GPCRs are a major focus of pharmaceutical research, and they have been investigated extensively. Experimental laboratory research is very costly in terms of time and expenses, and accordingly, there is a marked tendency to use computational methods as an alternative. In this study, a prediction model based on machine learning (ML) approaches was developed to predict GPCR-ligand interactions. Decision tree (DT), random forest (RF), multilayer perceptron (MLP), support vector machine (SVM), and Naive Bayes (NB) were the algorithms investigated. After several optimization steps, the receiver operating characteristic (ROC) values for the DT, RF, MLP, SVM, and NB algorithms were 95.2, 98.1, 96.3, 95.5, and 97.3, respectively. Accordingly, the final model was based on the RF algorithm. Compared with others, the current computational study focused on interactions of a specific and important type of protein (GPCRs) and examined different types of sequence-based features to obtain more accurate results. Drug science researchers could widely use the prediction model developed in this study. The developed predictor was applied to 16,132 GPCR-ligand pairs, and about 6778 potential interactions were predicted.
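The abstract does not say how its ROC values were computed. Assuming they denote the area under the ROC curve (AUC) expressed as a percentage, a minimal sketch of that quantity is given below. It uses the pairwise-ranking definition of AUC; the function name `roc_auc` and the toy scores are hypothetical and not taken from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    AUC equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one, with
    ties counted as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = interacting GPCR-ligand pair, 0 = non-interacting.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
print(f"AUC = {100 * roc_auc(toy_labels, toy_scores):.1f}%")
```

The double loop is quadratic in the number of examples, which is acceptable for a sketch; a sort-based rank computation would be the usual choice for large datasets.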
Affiliation(s)
- Shiva Karimi
  - Health Information Management Department, Paramedical School, Kermanshah University of Medical Sciences, Kermanshah, Iran
- Maryam Ahmadi
  - Department of Health Information Management, School of Management and Medical Information Sciences, Iran University of Medical Sciences, Tehran, Iran
- Farjam Goudarzi
  - Regenerative Medicine Research Center, Kermanshah University of Medical Sciences, Kermanshah, Iran
- Reza Ferdousi
  - Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran
|