1
Zhu Y, Zhao L, Wen N, Wang J, Wang C. DataDTA: a multi-feature and dual-interaction aggregation framework for drug-target binding affinity prediction. Bioinformatics 2023; 39:btad560. [PMID: 37688568 PMCID: PMC10516524 DOI: 10.1093/bioinformatics/btad560] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/03/2022] [Revised: 05/09/2023] [Accepted: 09/07/2023] [Indexed: 09/11/2023] Open
Abstract
MOTIVATION: Accurate prediction of drug-target binding affinity (DTA) is crucial for drug discovery. The growing number of published large-scale DTA datasets enables the development of various computational methods for DTA prediction. Numerous deep learning-based methods have been proposed to predict affinities, some of which utilize only original sequence information or complex structures, but the effective combination of multiple information sources with protein-binding pockets has not been fully explored. Therefore, a new method that integrates the available key information is urgently needed to predict DTA and accelerate the drug discovery process. RESULTS: In this study, we propose a novel deep learning-based predictor termed DataDTA to estimate the affinities of drug-target pairs. DataDTA takes as inputs descriptors of predicted pockets and sequences of proteins, as well as low-dimensional molecular features and SMILES strings of compounds. Specifically, the pockets were predicted from the three-dimensional structure of proteins and their descriptors were extracted as partial input features for DTA prediction. The molecular representation of compounds based on algebraic graph features was collected to supplement the input information of targets. Furthermore, to ensure effective learning of multiscale interaction features, a dual-interaction aggregation neural network strategy was developed. DataDTA was compared with state-of-the-art methods on different datasets, and the results showed that DataDTA is a reliable prediction tool for affinity estimation. Specifically, the concordance index (CI) of DataDTA is 0.806 and the Pearson correlation coefficient (R) is 0.814 on the test dataset, both higher than those of the other methods. AVAILABILITY AND IMPLEMENTATION: The code and datasets of DataDTA are available at https://github.com/YanZhu06/DataDTA.
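For readers unfamiliar with the two metrics quoted in this abstract, the following is a minimal sketch of how a concordance index and a Pearson correlation coefficient are commonly computed for affinity predictions. It is not taken from the DataDTA code base; the function names `concordance_index` and `pearson_r` and the toy values are assumptions made for illustration, and the tie handling (0.5 credit for tied predictions) follows a common convention that the paper may or may not use.

```python
from itertools import combinations
from math import sqrt


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Pairs with equal true affinity are skipped; tied predictions earn 0.5
    (an assumed, commonly used convention).
    """
    concordant = 0.0
    comparable = 0
    for (t_i, p_i), (t_j, p_j) in combinations(zip(y_true, y_pred), 2):
        if t_i == t_j:
            continue  # not a comparable pair
        comparable += 1
        agreement = (t_i - t_j) * (p_i - p_j)
        if agreement > 0:
            concordant += 1.0
        elif p_i == p_j:
            concordant += 0.5
    return concordant / comparable if comparable else float("nan")


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Toy affinities (hypothetical values, not data from the paper).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 3.0, 2.0, 4.0]
ci = concordance_index(measured, predicted)
r = pearson_r(measured, predicted)
```

In this toy case, five of the six comparable pairs are ranked correctly, so the concordance index is 5/6. The pairwise loop is quadratic in the number of samples, which is acceptable for a sketch but slow for large test sets.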
Affiliation(s)
- Yan Zhu
  - Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
- Lingling Zhao
  - Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
- Naifeng Wen
  - School of Mechanical and Electrical Engineering, Dalian Minzu University, Dalian 116600, China
- Junjie Wang
  - Department of Medical Informatics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Chunyu Wang
  - Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
2
Singha M, Pu L, Stanfield BA, Uche IK, Rider PJF, Kousoulas KG, Ramanujam J, Brylinski M. Artificial intelligence to guide precision anticancer therapy with multitargeted kinase inhibitors. BMC Cancer 2022; 22:1211. [PMID: 36434556 PMCID: PMC9694576 DOI: 10.1186/s12885-022-10293-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/20/2022] [Accepted: 11/07/2022] [Indexed: 11/26/2022] Open
Abstract
BACKGROUND: Vast amounts of rapidly accumulating biological data related to cancer and remarkable progress in the field of artificial intelligence (AI) have paved the way for precision oncology. Our recent contribution to this area of research is CancerOmicsNet, an AI-based system to predict the therapeutic effects of multitargeted kinase inhibitors across various cancers. This approach was previously demonstrated to outperform other deep learning methods, graph kernel models, molecular docking, and drug binding pocket matching. METHODS: CancerOmicsNet integrates multiple heterogeneous data by utilizing a deep graph learning model with sophisticated attention propagation mechanisms to extract highly predictive features from cancer-specific networks. The AI-based system was devised to provide more accurate and robust predictions than data-driven therapeutic discovery using gene signature reversion. RESULTS: Selected CancerOmicsNet predictions obtained for "unseen" data are positively validated against the biomedical literature and by live-cell time course inhibition assays performed against breast, pancreatic, and prostate cancer cell lines. Encouragingly, six molecules exhibited dose-dependent antiproliferative activities, with the pan-CDK inhibitor JNJ-7706621 and the Src inhibitor PP1 being the most potent against the pancreatic cancer cell line Panc 04.03. CONCLUSIONS: CancerOmicsNet is a promising AI-based platform to help guide the development of new approaches in precision oncology involving a variety of tumor types and therapeutics.
Affiliation(s)
- Manali Singha
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Limeng Pu
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA 70803, USA
- Brent A. Stanfield
  - Department of Pathobiological Sciences, School of Veterinary Medicine, Louisiana State University, Baton Rouge, LA 70803, USA
- Ifeanyi K. Uche
  - Department of Pathobiological Sciences, School of Veterinary Medicine, Louisiana State University, Baton Rouge, LA 70803, USA
  - Division of Biotechnology and Molecular Medicine, Department of Pathobiological Sciences, School of Veterinary Medicine, Louisiana State University, Baton Rouge, LA 70803, USA
  - School of Medicine, Louisiana State University Health Sciences Center, New Orleans, LA 70112, USA
- Paul J. F. Rider
  - Department of Pathobiological Sciences, School of Veterinary Medicine, Louisiana State University, Baton Rouge, LA 70803, USA
  - Division of Biotechnology and Molecular Medicine, Department of Pathobiological Sciences, School of Veterinary Medicine, Louisiana State University, Baton Rouge, LA 70803, USA
- Konstantin G. Kousoulas
  - Department of Pathobiological Sciences, School of Veterinary Medicine, Louisiana State University, Baton Rouge, LA 70803, USA
  - Division of Biotechnology and Molecular Medicine, Department of Pathobiological Sciences, School of Veterinary Medicine, Louisiana State University, Baton Rouge, LA 70803, USA
- J. Ramanujam
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA 70803, USA
  - Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA 70803, USA
- Michal Brylinski
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA 70803, USA
3
Shi W, Singha M, Pu L, Srivastava G, Ramanujam J, Brylinski M. GraphSite: Ligand Binding Site Classification with Deep Graph Learning. Biomolecules 2022; 12:1053. [PMID: 36008947 PMCID: PMC9405584 DOI: 10.3390/biom12081053] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/03/2022] [Revised: 07/18/2022] [Accepted: 07/20/2022] [Indexed: 12/10/2022] Open
Abstract
The binding of small organic molecules to protein targets is fundamental to a wide array of cellular functions. It is also routinely exploited to develop new therapeutic strategies against a variety of diseases. On that account, the ability to effectively detect and classify ligand binding sites in proteins is of paramount importance to modern structure-based drug discovery. These complex and non-trivial tasks require sophisticated algorithms from the field of artificial intelligence to achieve a high prediction accuracy. In this communication, we describe GraphSite, a deep learning-based method utilizing a graph representation of local protein structures and a state-of-the-art graph neural network to classify ligand binding sites. Using neural weighted message passing layers to effectively capture the structural, physicochemical, and evolutionary characteristics of binding pockets mitigates model overfitting and improves the classification accuracy. Indeed, comprehensive cross-validation benchmarks against a large dataset of binding pockets belonging to 14 diverse functional classes demonstrate that GraphSite yields a class-weighted F1-score of 81.7%, outperforming other approaches such as molecular docking and binding site matching. Further, it also generalizes well to unseen data with an F1-score of 70.7%, which is the expected performance in real-world applications. We also discuss new directions to improve and extend GraphSite in the future.
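As a brief illustration of the class-weighted F1-score mentioned in this abstract, the sketch below computes a per-class F1 and averages it with weights proportional to each class's number of true samples. This is one common definition (support weighting); whether GraphSite uses exactly this weighting is an assumption, and the function name `class_weighted_f1` and the toy labels are hypothetical.

```python
from collections import Counter


def class_weighted_f1(y_true, y_pred):
    """Average of per-class F1 scores, weighted by each class's support.

    Support weighting is assumed here; other weighting schemes exist.
    """
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n_true - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n_true / total) * f1
    return score


# Toy pocket classes (hypothetical labels, not data from the paper).
true_classes = ["heme", "heme", "heme", "nucleotide"]
pred_classes = ["heme", "heme", "nucleotide", "nucleotide"]
weighted_f1 = class_weighted_f1(true_classes, pred_classes)
```

Here the "heme" class has F1 = 0.8 with weight 3/4 and the "nucleotide" class has F1 = 2/3 with weight 1/4, so the weighted score is about 0.767. With imbalanced classes, this weighting lets frequent classes dominate the score, which is worth keeping in mind when comparing it with a macro-averaged F1.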
Affiliation(s)
- Wentao Shi
  - Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA 70803, USA
- Manali Singha
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Limeng Pu
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA 70803, USA
- Gopal Srivastava
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Jagannathan Ramanujam
  - Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA 70803, USA
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA 70803, USA
- Michal Brylinski
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA 70803, USA
4
Choudhury C, Arul Murugan N, Deva Priyakumar U. Structure-based drug repurposing: traditional and advanced AI/ML-aided methods. Drug Discov Today 2022; 27:1847-1861. [PMID: 35301148 PMCID: PMC8920090 DOI: 10.1016/j.drudis.2022.03.006] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 10/04/2021] [Revised: 02/16/2022] [Accepted: 03/10/2022] [Indexed: 02/08/2023]
Abstract
The current global health emergency in the form of the Coronavirus 2019 (COVID-19) pandemic has highlighted the need for fast, accurate, and efficient drug discovery pipelines. Traditional drug discovery projects relying on in vitro high-throughput screening (HTS) involve large investments and sophisticated experimental set-ups, affordable only to big biopharmaceutical companies. In this scenario, application of efficient state-of-the-art computational methods and modern artificial intelligence (AI)-based algorithms for rapid screening of repurposable chemical space [approved drugs and natural products (NPs) with proven pharmacokinetic profiles] to identify the initial leads is a powerful option to save resources and time. Structure-based drug repurposing is a popular in silico repurposing approach. In this review, we discuss traditional and modern AI-based computational methods and tools applied at various stages for structure-based drug discovery (SBDD) pipelines. Additionally, we highlight the role of generative models in generating molecules with scaffolds from repurposable chemical space. Teaser: This review highlights the importance of repurposable chemical space, and the contributions of conventional in silico approaches and modern machine-learning algorithms for rapid structure-based drug repurposing.
Affiliation(s)
- Chinmayee Choudhury
  - Department of Experimental Medicine and Biotechnology, Postgraduate Institute of Medical Education and Research, Sector-12, Chandigarh 160012, India
- N Arul Murugan
  - Department of Computer Science, School of Electrical Engineering and Computer Sciences, KTH Royal Institute of Technology, S-100 44, Stockholm, Sweden
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi 110020, India
- U Deva Priyakumar
  - Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500 032, India
5
Shi W, Singha M, Srivastava G, Pu L, Ramanujam J, Brylinski M. Pocket2Drug: An Encoder-Decoder Deep Neural Network for the Target-Based Drug Design. Front Pharmacol 2022; 13:837715. [PMID: 35359869 PMCID: PMC8962739 DOI: 10.3389/fphar.2022.837715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2021] [Accepted: 02/10/2022] [Indexed: 11/13/2022] Open
Abstract
Computational modeling is an essential component of modern drug discovery. One of its most important applications is to select promising drug candidates for pharmacologically relevant target proteins. Because of continuing advances in structural biology, putative binding sites for small organic molecules are being discovered in numerous proteins linked to various diseases. These valuable data offer new opportunities to build efficient computational models predicting binding molecules for target sites through the application of data mining and machine learning. In particular, deep neural networks are powerful techniques capable of learning from complex data in order to make informed drug binding predictions. In this communication, we describe Pocket2Drug, a deep graph neural network model to predict binding molecules for a given ligand binding site. This approach first learns the conditional probability distribution of small molecules from a large dataset of pocket structures with supervised training, followed by the sampling of drug candidates from the trained model. Comprehensive benchmarking simulations show that using Pocket2Drug significantly improves the chances of finding molecules binding to target pockets compared to traditional drug selection procedures. Specifically, known binders are generated for as many as 80.5% of targets present in the testing set, which consists of data dissimilar from those used to train the deep graph neural network model. Overall, Pocket2Drug is a promising computational approach to inform the discovery of novel biopharmaceuticals.
Affiliation(s)
- Wentao Shi
  - Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA, United States
- Manali Singha
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, United States
- Gopal Srivastava
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, United States
- Limeng Pu
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, United States
- J. Ramanujam
  - Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA, United States
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, United States
- Michal Brylinski
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, United States
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, United States
- *Correspondence: Michal Brylinski,
6
Liu G, Singha M, Pu L, Neupane P, Feinstein J, Wu HC, Ramanujam J, Brylinski M. GraphDTI: A robust deep learning predictor of drug-target interactions from multiple heterogeneous data. J Cheminform 2021; 13:58. [PMID: 34380569 PMCID: PMC8356453 DOI: 10.1186/s13321-021-00540-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 03/08/2021] [Accepted: 07/31/2021] [Indexed: 12/22/2022] Open
Abstract
Traditional techniques to identify macromolecular targets for drugs utilize solely the information on a query drug and a putative target. Nonetheless, the mechanisms of action of many drugs depend not only on their binding affinity toward a single protein, but also on the signal transduction through cascades of molecular interactions leading to certain phenotypes. Although using protein-protein interaction networks and drug-perturbed gene expression profiles can facilitate system-level investigations of drug-target interactions, utilizing such large and heterogeneous data poses notable challenges. To improve the state of the art in drug target identification, we developed GraphDTI, a robust machine learning framework integrating the molecular-level information on drugs, proteins, and binding sites with the system-level information on gene expression and protein-protein interactions. In order to properly evaluate the performance of GraphDTI, we compiled a high-quality benchmarking dataset and devised a new cluster-based cross-validation protocol. Encouragingly, GraphDTI not only yields an AUC of 0.996 against the validation dataset, but it also generalizes well to unseen data with an AUC of 0.939, significantly outperforming other predictors. Finally, selected examples of identified drug-target interactions are validated against the biomedical literature. Numerous applications of GraphDTI include the investigation of drug polypharmacological effects, side effects through off-target binding, and repositioning opportunities.
Affiliation(s)
- Guannan Liu
  - Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA, 70803, USA
- Manali Singha
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA
- Limeng Pu
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, 70803, USA
- Prasanga Neupane
  - Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA, 70803, USA
- Joseph Feinstein
  - Department of Computer Science, Brown University, Providence, RI, 02902, USA
- Hsiao-Chun Wu
  - Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA, 70803, USA
- J Ramanujam
  - Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA, 70803, USA
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, 70803, USA
- Michal Brylinski
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, 70803, USA
7
Feinstein J, Shi W, Ramanujam J, Brylinski M. Bionoi: A Voronoi Diagram-Based Representation of Ligand-Binding Sites in Proteins for Machine Learning Applications. Methods Mol Biol 2021; 2266:299-312. [PMID: 33759134 DOI: 10.1007/978-1-0716-1209-5_17] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/13/2022]
Abstract
Bionoi is a new software tool for generating Voronoi representations of ligand-binding sites in proteins for machine learning applications. Unlike many other deep learning models in biomedicine, Bionoi utilizes off-the-shelf convolutional neural network architectures, reducing the development work without sacrificing performance. When initially generating images of binding sites, users have the option to color the Voronoi cells based on either one of six structural, physicochemical, and evolutionary properties, or a blend of all six individual properties. Encouragingly, after inputting images generated by Bionoi into the convolutional autoencoder, the network was able to effectively learn the most salient features of binding pockets. The accuracy of the generated model is evaluated both visually and numerically through the reconstruction of binding site images from the latent feature space. The generated feature vectors capture various properties of binding sites well and thus can be applied in a multitude of machine learning projects. As a demonstration, we trained the ResNet-18 architecture from Microsoft on Bionoi images to show that it is capable of effectively classifying nucleotide- and heme-binding pockets against a large dataset of control pockets binding a variety of small molecules. Bionoi is freely available to the research community at https://github.com/CSBG-LSU/BionoiNet.
Affiliation(s)
- Joseph Feinstein
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, USA
- Wentao Shi
  - Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA, USA
- J Ramanujam
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, USA
  - Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA, USA
- Michal Brylinski
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, USA
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA