1. Jia Y, Dong H, Li L, Wang F, Juan L, Wang Y, Guo H, Zhao T. xQTLatlas: a comprehensive resource for human cellular-resolution multi-omics genetic regulatory landscape. Nucleic Acids Res 2024:gkae837. [PMID: 39351883 DOI: 10.1093/nar/gkae837] [Received: 07/22/2024] [Revised: 08/26/2024] [Accepted: 09/13/2024] [Indexed: 10/03/2024] Open
Abstract
Understanding how genetic variants influence molecular phenotypes in different cellular contexts is crucial for elucidating the molecular and cellular mechanisms behind complex traits, which in turn has spurred significant advances in research into molecular quantitative trait locus (xQTL) at the cellular level. With the rapid proliferation of data, there is a critical need for a comprehensive and accessible platform to integrate this information. To meet this need, we developed xQTLatlas (http://www.hitxqtl.org.cn/), a database that provides a multi-omics genetic regulatory landscape at cellular resolution. xQTLatlas compiles xQTL summary statistics from 151 cell types and 339 cell states across 55 human tissues. It organizes these data into 20 xQTL types, based on four distinct discovery strategies, and spans 13 molecular phenotypes. Each entry in xQTLatlas is meticulously annotated with comprehensive metadata, including the origin of the tissue, cell type, cell state and the QTL discovery strategies utilized. Additionally, xQTLatlas features multiscale data exploration tools and a suite of interactive visualizations, facilitating in-depth analysis of cell-level xQTL. xQTLatlas provides a valuable resource for deepening our understanding of the impact of functional variants on molecular phenotypes in different cellular environments, thereby facilitating extensive research efforts.
Affiliation(s)
- Yuran Jia
  - Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
- Hongchao Dong
  - Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
- Linhao Li
  - School of Medicine and Health, Harbin Institute of Technology, Harbin 150001, China
- Fang Wang
  - Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
- Liran Juan
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin 150001, China
- Yadong Wang
  - School of Medicine and Health, Harbin Institute of Technology, Harbin 150001, China
  - Zhengzhou Research Institute, Harbin Institute of Technology, Harbin 450000, China
- Hongzhe Guo
  - Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
  - Zhengzhou Research Institute, Harbin Institute of Technology, Harbin 450000, China
- Tianyi Zhao
  - Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
  - School of Medicine and Health, Harbin Institute of Technology, Harbin 150001, China
  - Zhengzhou Research Institute, Harbin Institute of Technology, Harbin 450000, China
2. Liu Y, Xia X, Gong Y, Song B, Zeng X. SSR-DTA: Substructure-aware multi-layer graph neural networks for drug-target binding affinity prediction. Artif Intell Med 2024; 157:102983. [PMID: 39321746 DOI: 10.1016/j.artmed.2024.102983] [Received: 09/12/2023] [Revised: 09/10/2024] [Accepted: 09/13/2024] [Indexed: 09/27/2024]
Abstract
Accurate prediction of drug-target binding affinity (DTA) is essential in the field of drug discovery. Recently, scientists have been attempting to utilize artificial intelligence prediction to screen out a significant number of ineffective compounds, thereby mitigating labor and financial losses. While graph neural networks (GNNs) have been applied to DTA, existing GNNs have limitations in effectively extracting substructural features across various sizes. Functional groups play a crucial role in modulating molecular properties, but existing GNNs struggle with feature extraction from certain motifs due to scale mismatches. Additionally, sequence-based models for target proteins lack the integration of structural information. To address these limitations, we present SSR-DTA, a multi-layer graph network capable of adapting to diverse structural sizes, which can extract richer biological features, thereby improving the robustness and accuracy of predictions. Multi-layer GNNs enable the capture of molecular motifs across different scales, ranging from atomic to macrocyclic motifs. Furthermore, we introduce BiGNN to simultaneously learn sequence and structural information. Sequence information corresponds to the primary structure of proteins, while graph information represents the tertiary structure. BiGNN assimilates richer information compared to sequence-based methods while mitigating the impact of errors from predicted structures, resulting in more accurate predictions. Through rigorous experimental evaluations conducted on four benchmark datasets, we demonstrate the superiority of SSR-DTA over state-of-the-art models. Particularly, in comparison to state-of-the-art models, SSR-DTA demonstrates an impressive 20% reduction in mean squared error on the Davis dataset and a 5% reduction on the KIBA dataset, underscoring its potential as a valuable tool for advancing DTA prediction.
Affiliation(s)
- Yuansheng Liu
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410086, Hunan, China
  - Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, Anhui University, Hefei, 230601, Anhui, China
- Xinyan Xia
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410086, Hunan, China
- Yongshun Gong
  - School of Software, Shandong University, Jinan, 250100, Shandong, China
- Bosheng Song
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410086, Hunan, China
- Xiangxiang Zeng
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410086, Hunan, China
3. Wang S, Liu Y, Zhang Y, Zhang K, Song X, Zhang Y, Pang S. CHL-DTI: A Novel High-Low Order Information Convergence Framework for Effective Drug-Target Interaction Prediction. Interdiscip Sci 2024; 16:568-578. [PMID: 38483753 DOI: 10.1007/s12539-024-00608-z] [Received: 07/22/2023] [Revised: 01/05/2024] [Accepted: 01/07/2024] [Indexed: 09/19/2024]
Abstract
Recognizing drug-target interactions (DTI) is a pivotal element of drug discovery. Traditional wet-lab experiments, although valuable, are time-consuming and costly. Recently, computational methods grounded in network learning have demonstrated great advantages through effective topological feature extraction and have attracted extensive research attention. However, most existing network-based learning methods consider only the low-order binary correlation between an individual drug and target, neglecting the potential higher-order correlation information derived from multiple drugs and targets. High-order information, as an essential component, is complementary to low-order information. Hence, incorporating higher-order associations between drugs and targets, while adequately integrating them with the existing lower-order information, could potentially yield substantial breakthroughs in predicting drug-target interactions. We propose a novel dual-channel network-based learning model, CHL-DTI, that converges high-order information from hypergraphs and low-order information from ordinary graphs for drug-target interaction prediction. The convergence of high- and low-order information in CHL-DTI is manifested in two key aspects. First, during the feature extraction stage, the model integrates both high-level semantic information and low-level topological information by combining hypergraphs and ordinary graphs. Second, CHL-DTI fully fuses the newly introduced drug-protein pair (DPP) hypergraph network structure with ordinary topological network structure information. Extensive experiments on three public datasets show the superior performance of CHL-DTI in DTI prediction tasks compared with state-of-the-art (SOTA) methods. The source code of CHL-DTI is available at https://github.com/UPCLyy/CHL-DTI.
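To make the distinction between high-order and low-order information concrete, the following is a minimal, generic sketch and not the CHL-DTI implementation. It assumes a toy node numbering (drugs 0 and 1, targets 2 and 3) and two made-up hyperedges; the function names `incidence_matrix` and `clique_expansion` are hypothetical. A hyperedge can join any number of nodes in one relation, whereas an ordinary graph only stores pairwise links.

```python
def incidence_matrix(num_nodes, hyperedges):
    """Build a num_nodes x num_hyperedges 0/1 incidence matrix.

    Entry [i][j] is 1 when node i belongs to hyperedge j.
    """
    H = [[0] * len(hyperedges) for _ in range(num_nodes)]
    for j, members in enumerate(hyperedges):
        for i in members:
            H[i][j] = 1
    return H


def clique_expansion(num_nodes, hyperedges):
    """Flatten hyperedges into an ordinary pairwise adjacency matrix.

    Every pair of nodes that shares a hyperedge becomes an edge, so the
    information that they formed one group is no longer explicit.
    """
    A = [[0] * num_nodes for _ in range(num_nodes)]
    for members in hyperedges:
        for u in members:
            for v in members:
                if u != v:
                    A[u][v] = 1
    return A


# Toy data (assumption): nodes 0-1 are drugs, nodes 2-3 are targets.
toy_hyperedges = [(0, 1, 2), (1, 3)]
H = incidence_matrix(4, toy_hyperedges)
A = clique_expansion(4, toy_hyperedges)
```

In this toy case the first column of `H` records that nodes 0, 1 and 2 participate in a single three-way relation, while `A` only records the three pairwise links among them.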
Affiliation(s)
- Shudong Wang
  - College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, 266580, China
- Yingye Liu
  - College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, 266580, China
- Yuanyuan Zhang
  - College of Information and Control Engineering, Qingdao University of Technology, Qingdao, 266520, China
- Kuijie Zhang
  - College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, 266580, China
- Xuanmo Song
  - College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, 266580, China
- Yu Zhang
  - College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, 266580, China
- Shanchen Pang
  - College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, 266580, China
4. Long S, Tang X, Si X, Kong T, Zhu Y, Wang C, Qi C, Mu Z, Liu J. TriFusion enables accurate prediction of miRNA-disease association by a tri-channel fusion neural network. Commun Biol 2024; 7:1067. [PMID: 39215090 PMCID: PMC11364641 DOI: 10.1038/s42003-024-06734-0] [Received: 04/03/2024] [Accepted: 08/13/2024] [Indexed: 09/04/2024] Open
Abstract
The identification of miRNA-disease associations is crucial for early disease prevention and treatment. However, it is still a computational challenge to accurately predict such associations due to improper information encoding. Previous methods characterize miRNA-disease associations only from single levels, causing the loss of multi-level association information. In this study, we propose TriFusion, a powerful and interpretable deep learning framework for miRNA-disease association prediction. It develops a tri-channel architecture to encode the association features of miRNAs and diseases from different levels and designs a feature fusion encoder to smoothly fuse these features. After training and testing, TriFusion outperforms other leading methods and offers strong interpretability through its learned representations. Furthermore, TriFusion is applied to three high-risk sexually associated cancers (ovarian, breast, and prostate cancers) and exhibits remarkable ability in the identification of miRNAs associated with the three diseases.
Affiliation(s)
- Sheng Long
  - School of Mathematics and Statistics, Shandong University, Weihai, China
- Xiaoran Tang
  - School of Mathematics and Statistics, Shandong University, Weihai, China
- Xinyi Si
  - School of Mathematics and Statistics, Shandong University, Weihai, China
- Tongxin Kong
  - School of Mathematics and Statistics, Shandong University, Weihai, China
- Yanhao Zhu
  - School of Mathematics and Statistics, Shandong University, Weihai, China
- Chuanzhi Wang
  - School of Mathematics and Statistics, Shandong University, Weihai, China
- Chenqing Qi
  - School of Mathematics and Statistics, Shandong University, Weihai, China
- Zengchao Mu
  - School of Mathematics and Statistics, Shandong University, Weihai, China
- Juntao Liu
  - School of Mathematics and Statistics, Shandong University, Weihai, China
5. Jia ZC, Yang X, Wu YK, Li M, Das D, Chen MX, Wu J. The Art of Finding the Right Drug Target: Emerging Methods and Strategies. Pharmacol Rev 2024; 76:896-914. [PMID: 38866560 PMCID: PMC11334170 DOI: 10.1124/pharmrev.123.001028] [Received: 08/21/2023] [Revised: 05/28/2024] [Accepted: 05/31/2024] [Indexed: 06/14/2024] Open
Abstract
Drug targets are specific molecules in biological tissues and body fluids that interact with drugs. Drug target discovery is a key component of drug discovery and is essential for the development of new drugs in areas such as cancer therapy and precision medicine. Traditional in vitro or in vivo target discovery methods are time-consuming and labor-intensive, limiting the pace of drug discovery. With the development of modern discovery methods, the discovery and application of various emerging technologies have greatly improved the efficiency of drug discovery, shortened the cycle time, and reduced the cost. This review provides a comprehensive overview of various emerging drug target discovery strategies, including computer-assisted approaches, drug affinity response target stability, multiomics analysis, gene editing, and nonsense-mediated mRNA degradation, and discusses the effectiveness and limitations of the various approaches, as well as their application in real cases. Through the review of the aforementioned contents, a general overview of the development of novel drug targets and disease treatment strategies will be provided, and a theoretical basis will be provided for those who are engaged in pharmaceutical science research. SIGNIFICANCE STATEMENT: Target-based drug discovery has been the main approach to drug discovery in the pharmaceutical industry for the past three decades. Traditional drug target discovery methods based on in vivo or in vitro validation are time-consuming and costly, greatly limiting the development of new drugs. Therefore, the development and selection of new methods in the drug target discovery process is crucial.
Affiliation(s)
- Zi-Chang Jia, Xue Yang, Yi-Kun Wu, Min Li, Debatosh Das, Mo-Xian Chen, Jian Wu
  - State Key Laboratory of Green Pesticide, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Center for R&D of Fine Chemicals of Guizhou University, Guiyang, China (Z.-C.J., X.Y., Y.-K.W., M.-X.C., J.W.); The Oak Ridge Institute for Science and Education, Oak Ridge, Tennessee (D.D.); and State Key Laboratory of Crop Biology, College of Life Science, Shandong Agricultural University, Taian, Shandong, China (M.L.)
6. Lavecchia A. Advancing drug discovery with deep attention neural networks. Drug Discov Today 2024; 29:104067. [PMID: 38925473 DOI: 10.1016/j.drudis.2024.104067] [Received: 05/17/2024] [Revised: 06/10/2024] [Accepted: 06/19/2024] [Indexed: 06/28/2024]
Abstract
In the dynamic field of drug discovery, deep attention neural networks are revolutionizing our approach to complex data. This review explores the attention mechanism and its extended architectures, including graph attention networks (GATs), transformers, bidirectional encoder representations from transformers (BERT), generative pre-trained transformers (GPTs) and bidirectional and auto-regressive transformers (BART). Delving into their core principles and multifaceted applications, we uncover their pivotal roles in catalyzing de novo drug design, predicting intricate molecular properties and deciphering elusive drug-target interactions. Despite challenges, these attention-based architectures hold unparalleled promise to drive transformative breakthroughs and accelerate progress in pharmaceutical research.
Affiliation(s)
- Antonio Lavecchia
  - Drug Discovery Laboratory, Department of Pharmacy, University of Napoli Federico II, I-80131 Naples, Italy
7. Tian Z, Dai Y, Hu F, Shen Z, Xu H, Zhang H, Xu J, Hu Y, Diao Y, Li H. Enhancing Chemical Reaction Monitoring with a Deep Learning Model for NMR Spectra Image Matching to Target Compounds. J Chem Inf Model 2024; 64:5624-5633. [PMID: 38979856 DOI: 10.1021/acs.jcim.4c00522] [Indexed: 07/10/2024]
Abstract
In the synthetic laboratory, researchers typically rely on nuclear magnetic resonance (NMR) spectra to elucidate the structures of synthesized products and confirm whether they match the desired target compounds. As chemical synthesis technology evolves toward intelligence and continuity, efficient computer-assisted structure elucidation (CASE) techniques are required to replace time-consuming manual analysis and provide the necessary speed. However, current CASE methods typically aim to derive precise chemical structures from spectroscopic data, and they suffer from drawbacks such as low accuracy, high computational cost, and reliance on chemical libraries. In meticulously designed chemical synthesis reactions, researchers prioritize confirming from NMR spectra that the target product has been obtained, rather than identifying the specific product obtained. For this purpose, we developed a binary classification model, termed MatCS, to directly predict the relationship between NMR spectra images (including 1H NMR and 13C NMR) and the molecular structure of the target compound. After evaluating various feature extraction methods, MatCS employs a combination of Graph Attention Networks and Graph Convolutional Networks to learn the structural features of molecular graphs, and the pretrained ResNet101 network with a Convolutional Block Attention Module to extract features from NMR spectra images. The results show that on a challenging Testsim data set, in which the spectra of similar molecular structures are difficult to distinguish, MatCS achieves an F1-score of 0.81 and an AUC value of 0.87. It also exhibited commendable performance on an external SDBS data set containing experimental NMR spectra, showcasing substantial potential for structural verification tasks in real automated chemical synthesis.
Affiliation(s)
- ZiJing Tian
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Yan Dai
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Feng Hu
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- ZiHao Shen
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- HongLing Xu
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- HongWen Zhang
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- JinHang Xu
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- YuTing Hu
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- YanYan Diao
  - Innovation Center for AI and Drug Discovery, School of Pharmacy, East China Normal University, Shanghai 200062, China
  - Lingang Laboratory, Shanghai 200031, China
- HongLin Li
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
  - Innovation Center for AI and Drug Discovery, School of Pharmacy, East China Normal University, Shanghai 200062, China
  - Lingang Laboratory, Shanghai 200031, China
8. Zhu Y, Ning C, Zhang N, Wang M, Zhang Y. GSRF-DTI: a framework for drug-target interaction prediction based on a drug-target pair network and representation learning on a large graph. BMC Biol 2024; 22:156. [PMID: 39020316 PMCID: PMC11256582 DOI: 10.1186/s12915-024-01949-3] [Received: 09/11/2023] [Accepted: 07/01/2024] [Indexed: 07/19/2024] Open
Abstract
BACKGROUND: Identification of potential drug-target interactions (DTIs) with high accuracy is a key step in drug discovery and repositioning, especially concerning specific drug targets. Traditional experimental methods for identifying the DTIs are arduous, time-intensive, and financially burdensome. In addition, robust computational methods have been developed for predicting the DTIs and are widely applied in drug discovery research. However, advancing more precise algorithms for predicting DTIs is essential to meet the stringent standards demanded by drug discovery.
RESULTS: We proposed a novel method called GSRF-DTI, which integrates networks with a deep learning algorithm to identify DTIs. Firstly, GSRF-DTI learned the embedding representation of drugs and targets by integrating multiple drug association information and target association information, respectively. Then, GSRF-DTI considered the influence of drug-target pair (DTP) association on DTI prediction to construct a drug-target pair network (DTP-NET). Next, we utilized GraphSAGE on DTP-NET to learn the potential features of the network and applied random forest (RF) to predict the DTIs. Furthermore, we conducted ablation experiments to validate the necessity of integrating different types of network features for identifying DTIs. It is worth noting that GSRF-DTI proposed three novel DTIs.
CONCLUSIONS: GSRF-DTI not only considered the influence of the interaction relationship between drug and target but also considered the impact of DTP association relationship on DTI prediction. We initially use GraphSAGE to aggregate the neighbor information of nodes for better identification. Experimental analysis on Luo's dataset and the newly constructed dataset revealed that the GSRF-DTI framework outperformed several state-of-the-art methods significantly.
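The neighbor-aggregation idea behind GraphSAGE, mentioned in the abstract above, can be sketched as follows. This is a simplified illustration under stated assumptions, not the GSRF-DTI code: it shows only the mean aggregator and the concatenation with a node's own features, and it omits the learned weight matrix, the nonlinearity, and neighbor sampling that full GraphSAGE uses. The function name `mean_aggregate` and the toy node labels are hypothetical.

```python
def mean_aggregate(features, neighbors):
    """One simplified GraphSAGE-style step with a mean aggregator.

    features:  dict mapping node -> list of floats (all the same length)
    neighbors: dict mapping node -> list of neighboring nodes
    Returns a dict mapping node -> own features concatenated with the
    element-wise mean of its neighbors' features (zeros if no neighbors).
    Assumption: no learned weights or activation are applied here.
    """
    out = {}
    for node, feat in features.items():
        nbrs = neighbors.get(node, [])
        if nbrs:
            agg = [
                sum(features[n][k] for n in nbrs) / len(nbrs)
                for k in range(len(feat))
            ]
        else:
            agg = [0.0] * len(feat)
        out[node] = list(feat) + agg
    return out


# Toy graph (assumption): three nodes standing in for drug-target pairs.
toy_features = {"p1": [1.0, 0.0], "p2": [0.0, 1.0], "p3": [1.0, 1.0]}
toy_neighbors = {"p1": ["p2", "p3"], "p2": ["p1"], "p3": []}
embeddings = mean_aggregate(toy_features, toy_neighbors)
```

In a pipeline of the kind the abstract describes, vectors like `embeddings` would then be passed to a separate classifier such as a random forest; that step is left out of this sketch.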
Affiliation(s)
- Yongdi Zhu
  - School of Mathematics and Statistics, Shandong University, Weihai, Shandong, China
- Chunhui Ning
  - School of Mathematics and Statistics, Shandong University, Weihai, Shandong, China
- Naiqian Zhang
  - School of Mathematics and Statistics, Shandong University, Weihai, Shandong, China
- Mingyi Wang
  - Department of Central Lab, Weihai Municipal Hospital, Weihai, Shandong, China
- Yusen Zhang
  - School of Mathematics and Statistics, Shandong University, Weihai, Shandong, China
9. Hu X, Sun Z, Nian Y, Wang Y, Dang Y, Li F, Feng J, Yu E, Tao C. Self-Explainable Graph Neural Network for Alzheimer Disease and Related Dementias Risk Prediction: Algorithm Development and Validation Study. JMIR Aging 2024; 7:e54748. [PMID: 38976869 PMCID: PMC11263893 DOI: 10.2196/54748] [Received: 11/20/2023] [Revised: 03/31/2024] [Accepted: 06/02/2024] [Indexed: 07/10/2024] Open
Abstract
BACKGROUND: Alzheimer disease and related dementias (ADRD) rank as the sixth leading cause of death in the United States, underlining the importance of accurate ADRD risk prediction. While recent advancements in ADRD risk prediction have primarily relied on imaging analysis, not all patients undergo medical imaging before an ADRD diagnosis. Merging machine learning with claims data can reveal additional risk factors and uncover interconnections among diverse medical codes.
OBJECTIVE: The study aims to use graph neural networks (GNNs) with claims data for ADRD risk prediction. Addressing the lack of human-interpretable reasons behind these predictions, we introduce an innovative, self-explainable method to evaluate relationship importance and its influence on ADRD risk prediction.
METHODS: We used a variationally regularized encoder-decoder GNN (variational GNN [VGNN]) integrated with our proposed relation importance method for estimating ADRD likelihood. This self-explainable method can provide a feature-importance explanation in the context of ADRD risk prediction, leveraging relational information within a graph. Three scenarios with 1-year, 2-year, and 3-year prediction windows, respectively, were created to assess the model's efficiency. Random forest (RF) and light gradient boost machine (LGBM) were used as baselines. By using this method, we further clarify the key relationships for ADRD risk prediction.
RESULTS: In scenario 1, the VGNN model showed area under the receiver operating characteristic (AUROC) scores of 0.7272 and 0.7480 for the small subset and the matched cohort data set. It outperformed RF and LGBM by 10.6% and 9.1%, respectively, on average. In scenario 2, it achieved AUROC scores of 0.7125 and 0.7281, surpassing the other models by 10.5% and 8.9%, respectively. Similarly, in scenario 3, AUROC scores of 0.7001 and 0.7187 were obtained, exceeding the baseline models by 10.1% and 8.5%, respectively. These results clearly demonstrate the significant superiority of the graph-based approach over the tree-based models (RF and LGBM) in predicting ADRD. Furthermore, the integration of the VGNN model and our relation importance interpretation could provide valuable insight into paired factors that may contribute to or delay ADRD progression.
CONCLUSIONS: Using our innovative self-explainable method with claims data enhances ADRD risk prediction and provides insights into the impact of interconnected medical code relationships. This methodology not only enables ADRD risk modeling but also shows potential for other image analysis predictions using claims data.
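Because the results above are reported as AUROC scores, a short sketch of how that metric can be computed may help when reading them. This is a generic illustration, not the study's evaluation code: it uses the pairwise (Mann-Whitney) formulation, in which AUROC is the fraction of positive-negative pairs where the positive case receives the higher score, with ties counted as one half. The function name `auroc` and the toy labels and scores are made up for illustration.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 ground-truth values
    scores: iterable of predicted risk scores, same length as labels
    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (assumption): four patients, two of whom develop the outcome.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.1]
toy_auc = auroc(toy_labels, toy_scores)
```

The quadratic pairwise loop is chosen here for readability; a rank-based computation gives the same value more efficiently on large cohorts.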
Affiliation(s)
- Xinyue Hu
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Jacksonville, FL, United States
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
- Zenan Sun
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
- Yi Nian
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
- Yichen Wang
  - Division of Hospital Medicine at Perelman School of Medicine, The University of Pennsylvania, Philadelphia, PA, United States
- Yifang Dang
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
- Fang Li
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Jacksonville, FL, United States
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
- Jingna Feng
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Jacksonville, FL, United States
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
- Evan Yu
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
- Cui Tao
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Jacksonville, FL, United States
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
10. Wu H, Liu J, Zhang R, Lu Y, Cui G, Cui Z, Ding Y. A review of deep learning methods for ligand based drug virtual screening. Fundamental Research 2024; 4:715-737. [PMID: 39156568 PMCID: PMC11330120 DOI: 10.1016/j.fmre.2024.02.011] [Received: 10/30/2023] [Revised: 01/10/2024] [Accepted: 02/18/2024] [Indexed: 08/20/2024] Open
Abstract
Drug discovery is costly and time consuming, and modern drug discovery endeavors are progressively reliant on computational methodologies, aiming to mitigate temporal and financial expenditures associated with the process. In particular, the time required for vaccine and drug discovery is prolonged during emergency situations such as the coronavirus 2019 pandemic. Recently, the performance of deep learning methods in drug virtual screening has been particularly prominent. It has become a concern for researchers how to summarize the existing deep learning in drug virtual screening, select different models for different drug screening problems, exploit the advantages of deep learning models, and further improve the capability of deep learning in drug virtual screening. This review first introduces the basic concepts of drug virtual screening, common datasets, and data representation methods. Then, large numbers of common deep learning methods for drug virtual screening are compared and analyzed. In addition, a dataset of different sizes is constructed independently to evaluate the performance of each deep learning model for the difficult problem of large-scale ligand virtual screening. Finally, the existing challenges and future directions in the field of virtual screening are presented.
Affiliation(s)
- Hongjie Wu
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
- Junkai Liu
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
- Runhua Zhang
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
- Yaoyao Lu
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
- Guozeng Cui
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
- Zhiming Cui
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
- Yijie Ding
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China
| |
|
11
|
Yao R, Shen Z, Xu X, Ling G, Xiang R, Song T, Zhai F, Zhai Y. Knowledge mapping of graph neural networks for drug discovery: a bibliometric and visualized analysis. Front Pharmacol 2024; 15:1393415. [PMID: 38799167 PMCID: PMC11116974 DOI: 10.3389/fphar.2024.1393415] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Accepted: 04/12/2024] [Indexed: 05/29/2024] Open
Abstract
Introduction: In recent years, graph neural networks have been extensively applied to drug discovery research. Although researchers have made significant progress in this field, bibliometric research on it remains scarce. The purpose of this study is to conduct a comprehensive bibliometric analysis of graph neural network applications in drug discovery in order to identify current research hotspots and trends, as well as to serve as a reference for future research. Methods: Publications from 2017 to 2023 about the application of graph neural networks in drug discovery were collected from the Web of Science Core Collection. Bibliometrix, VOSviewer, and Citespace were mainly used for bibliometric studies. Results and Discussion: In this paper, a total of 652 papers from 48 countries/regions were included. Research interest in this field is continuously increasing. China and the United States have a significant advantage in terms of funding, the number of publications, and collaborations with other institutions and countries. Although some cooperation networks have been formed in this field, extensive worldwide cooperation still needs to be strengthened. The keyword analysis showed that graph neural networks have primarily been applied to drug-target interaction, drug repurposing, and drug-drug interaction, while the graph convolutional neural network and its related optimization methods are currently the core algorithms in this field. Data availability and ethical supervision, balancing computing resources, and developing novel graph neural network models with better interpretability are the key technical issues currently faced. This paper analyzes the current state, hot spots, and trends of graph neural network applications in drug discovery through bibliometric approaches, as well as the current issues and challenges in this field. These findings provide researchers with valuable insights on the current status and future directions of this field.
Affiliation(s)
- Fei Zhai
- Faculty of Medical Device, Shenyang Pharmaceutical University, Shenyang, China
- Yuxuan Zhai
- Faculty of Medical Device, Shenyang Pharmaceutical University, Shenyang, China
| |
|
12
|
Wang R, Zhou Z, Wu X, Jiang X, Zhuo L, Liu M, Li H, Fu X, Yao X. An Effective Plant Small Secretory Peptide Recognition Model Based on Feature Correction Strategy. J Chem Inf Model 2024; 64:2798-2806. [PMID: 37643082 DOI: 10.1021/acs.jcim.3c00868] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/31/2023]
Abstract
Plant small secretory peptides (SSPs) play an important role in the regulation of biological processes in plants. Accurately predicting SSPs enables efficient exploration of their functions. Traditional experimental verification methods are very reliable and accurate, but they require expensive equipment and a lot of time. Machine learning speeds up the prediction of SSPs, but unstable feature extraction limits this type of method. Therefore, this paper proposes a new feature-correction-based model for SSP recognition in plants, abbreviated as SE-SSP. The model has three main advantages. First, the use of transformer encoders can better reveal implicit features. Second, a feature correction module suitable for sequences, named 2-D SENET, adaptively adjusts the features to obtain a more robust feature representation. Third, multiple stacked linear modules further extract deep information from the samples. In addition, training based on a contrastive learning strategy can alleviate the problem of sparse samples. We conduct experiments on publicly available data sets, and the results verify that our model shows excellent performance. The proposed model can be used as a convenient and effective SSP prediction tool in the future. Our data and code are publicly available at https://github.com/wrab12/SE-SSP/.
Affiliation(s)
- Rui Wang
- Wenzhou University of Technology, 325000 Wenzhou, China
- Zhecheng Zhou
- Wenzhou University of Technology, 325000 Wenzhou, China
- Xiaonan Wu
- Wenzhou University of Technology, 325000 Wenzhou, China
- Xin Jiang
- Wenzhou University of Technology, 325000 Wenzhou, China
- Linlin Zhuo
- Wenzhou University of Technology, 325000 Wenzhou, China
- Mingzhe Liu
- Wenzhou University of Technology, 325000 Wenzhou, China
- Hao Li
- Central South University, 410083 Changsha, China
- Xiangzheng Fu
- Faculty of Applied Sciences, Macao Polytechnic University, 999078, Macao
- Xiaojun Yao
- Faculty of Applied Sciences, Macao Polytechnic University, 999078, Macao
| |
|
13
|
Zhang Z, Zhao L, Gao M, Chen Y, Wang J, Wang C. PPII-AEAT: Prediction of protein-protein interaction inhibitors based on autoencoders with adversarial training. Comput Biol Med 2024; 172:108287. [PMID: 38503089 DOI: 10.1016/j.compbiomed.2024.108287] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Revised: 02/21/2024] [Accepted: 03/12/2024] [Indexed: 03/21/2024]
Abstract
Protein-protein interactions (PPIs) have shown increasing potential as novel drug targets. The design and development of small molecule inhibitors targeting specific PPIs are crucial for the prevention and treatment of related diseases. Accordingly, effective computational methods are highly desired to meet the emerging need for the large-scale accurate prediction of PPI inhibitors. However, existing machine learning models rely heavily on the manual screening of features and lack generalizability. Here, we propose a new PPI inhibitor prediction method based on autoencoders with adversarial training (named PPII-AEAT) that can adaptively learn molecule representation to cope with different PPI targets. First, Extended-connectivity fingerprints and Mordred descriptors are employed to extract the primary features of small molecular compounds. Then, an autoencoder architecture is trained in three phases to learn high-level representations and predict inhibitory scores. We evaluate PPII-AEAT on nine PPI targets and two different tasks, including the PPI inhibitor identification task and inhibitory potency prediction task. The experimental results show that our proposed PPII-AEAT outperforms state-of-the-art methods.
Affiliation(s)
- Zitong Zhang
- Faculty of Computing, Harbin Institute of Technology, Harbin, 150001, China
- Lingling Zhao
- Faculty of Computing, Harbin Institute of Technology, Harbin, 150001, China
- Mengyao Gao
- Faculty of Computing, Harbin Institute of Technology, Harbin, 150001, China
- Yuanlong Chen
- Faculty of Computing, Harbin Institute of Technology, Harbin, 150001, China
- Junjie Wang
- Department of Medical Informatics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, 211166, China
- Chunyu Wang
- Faculty of Computing, Harbin Institute of Technology, Harbin, 150001, China.
| |
|
14
|
Zhang Y, Li S, Meng K, Sun S. Machine Learning for Sequence and Structure-Based Protein-Ligand Interaction Prediction. J Chem Inf Model 2024; 64:1456-1472. [PMID: 38385768 DOI: 10.1021/acs.jcim.3c01841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2024]
Abstract
Developing new drugs is expensive and time-consuming. Accurately predicting the interaction between drugs and targets will likely change how drugs are discovered. Machine learning-based protein-ligand interaction prediction has demonstrated significant potential. In this paper, computational methods that use sequence and structure to study protein-ligand interactions are examined. The paper starts with an overview of the data sets applied in this area, as well as the various approaches for representing proteins and ligands. Then, sequence-based and structure-based classification criteria are used to categorize and summarize both the classical machine learning models and deep learning models employed in protein-ligand interaction studies. Moreover, the evaluation methods and interpretability of these models are discussed. Furthermore, the diverse applications of protein-ligand interaction models in drug research are presented. Lastly, the current challenges and future directions in this field are addressed.
Affiliation(s)
- Yunjiang Zhang
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
- Shuyuan Li
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
- Kong Meng
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
- Shaorui Sun
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
| |
|
15
|
E Z, Qiao G, Wang G, Li Y. GSL-DTI: Graph structure learning network for Drug-Target interaction prediction. Methods 2024; 223:136-145. [PMID: 38360082 DOI: 10.1016/j.ymeth.2024.01.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2023] [Revised: 12/23/2023] [Accepted: 01/29/2024] [Indexed: 02/17/2024] Open
Abstract
MOTIVATION: Drug-target interaction prediction is an important area of research that aims to predict whether there is an interaction between a drug molecule and its target protein. It plays a critical role in drug discovery and development by facilitating the identification of potential drug candidates and expediting the overall process. Given the time-consuming, expensive, and high-risk nature of traditional drug discovery methods, the prediction of drug-target interactions has become an indispensable tool. Using machine learning and deep learning to tackle this class of problems has become a mainstream approach, and graph-based models have recently received much attention in this field. However, many current graph-based Drug-Target Interaction (DTI) prediction methods rely on manually defined rules to construct the Drug-Protein Pair (DPP) network during the DPP representation learning process. Such methods fail to capture the true underlying relationships between drug molecules and target proteins. RESULTS: We propose GSL-DTI, an automatic graph structure learning model for predicting drug-target interactions (DTIs). Initially, we integrate large-scale heterogeneous networks using a graph convolution network based on meta-paths, effectively learning the representations of drugs and target proteins. Subsequently, we construct drug-protein pairs based on these representations. In contrast to previous studies that construct DPP networks based on manual rules, our method introduces an automatic graph structure learning approach. This approach applies a filter gate to the affinity scores of DPPs and relies on the classification loss of downstream tasks to guide the learning of the underlying DPP network structure. Based on the learned DPP network, we transform the prediction of drug-target interactions into a node classification problem. Comprehensive experiments conducted on three public datasets have shown the superiority of GSL-DTI in DTI prediction. Additionally, GSL-DTI provides a fresh perspective for advancing research in graph structure learning for DTI prediction.
Affiliation(s)
- Zixuan E
- College of Computer and Control Engineering, Northeast Forestry University, Harbin 150006, China
- Guanyu Qiao
- College of Computer and Control Engineering, Northeast Forestry University, Harbin 150006, China
- Guohua Wang
- College of Computer and Control Engineering, Northeast Forestry University, Harbin 150006, China.
- Yang Li
- College of Computer and Control Engineering, Northeast Forestry University, Harbin 150006, China.
| |
|
16
|
Huang Z, Xiao Q, Xiong T, Shi W, Yang Y, Li G. Predicting Drug-Protein Interactions through Branch-Chain Mining and multi-dimensional attention network. Comput Biol Med 2024; 171:108127. [PMID: 38350397 DOI: 10.1016/j.compbiomed.2024.108127] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2023] [Revised: 01/26/2024] [Accepted: 02/06/2024] [Indexed: 02/15/2024]
Abstract
Identifying drug-protein interactions (DPIs) is crucial in drug discovery and repurposing. Computational methods for precise DPI identification can expedite development timelines and reduce expenses compared with conventional experimental methods. Lately, deep learning techniques have been employed for predicting DPIs, enhancing these processes. Nevertheless, the limitations observed in prior studies, where many extract features from complete drug and protein entities, overlooking the crucial theoretical foundation that pharmacological responses are often correlated with specific substructures, can lead to poor predictive performance. Furthermore, certain substructure-focused research confines its exploration to a solitary fragment category, such as a functional group. In this study, addressing these constraints, we present an end-to-end framework termed BCMMDA for predicting DPIs. The framework considers various substructure types, including branch chains, common substructures, and specific fragments. We designed a specific feature learning module by combining our proposed multi-dimensional attention mechanism with convolutional neural networks (CNNs). Deep CNNs assist in capturing the synergistic effects among these fragment sets, enabling the extraction of relevant features of drugs and proteins. Meanwhile, the multi-dimensional attention mechanism refines the relationship between drug and protein features by assigning attention vectors to each drug compound and amino acid. This mechanism empowers the model to further concentrate on pivotal substructures and elements, thereby improving its ability to identify essential interactions in DPI prediction. We evaluated the performance of BCMMDA on four well-known benchmark datasets. The results indicated that BCMMDA outperformed state-of-the-art baseline models, demonstrating significant improvement in performance.
Affiliation(s)
- Zhuo Huang
- College of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China
- Qiu Xiao
- College of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China; MOE-LCSM, School of Mathematics and Statistics, Hunan Normal University, Changsha, 410081, China; College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China.
- Tuo Xiong
- College of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China
- Wanwan Shi
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
- Yide Yang
- Key Laboratory of Molecular Epidemiology of Hunan Province, School of Medicine, Hunan Normal University, Changsha, 410006, China.
- Guanghui Li
- School of Information Engineering, East China Jiaotong University, Nanchang, 330013, China.
| |
|
17
|
Chang Z, Zhu R, Liu J, Shang J, Dai L. HGSMDA: miRNA-Disease Association Prediction Based on HyperGCN and Sørensen-Dice Loss. Noncoding RNA 2024; 10:9. [PMID: 38392964 PMCID: PMC10893088 DOI: 10.3390/ncrna10010009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 01/19/2024] [Accepted: 01/24/2024] [Indexed: 02/25/2024] Open
Abstract
Biological research has demonstrated the significance of identifying miRNA-disease associations for disease prevention, diagnosis, and treatment. However, inferring these associations through experiments on biological subjects is both costly and inefficient. Consequently, there is a pressing need for novel approaches that offer enhanced accuracy and effectiveness. Presently, the predominant methods for predicting disease associations rely on Graph Convolutional Network (GCN) techniques. However, the GCN algorithm aggregates locally: at each layer it incorporates information only from the immediate neighboring nodes of a given node. Consequently, GCN cannot simultaneously aggregate information from multiple nodes. This constraint significantly impacts the predictive efficacy of the model. To tackle this problem, we propose a novel approach, based on HyperGCN and Sørensen-Dice loss (HGSMDA), for predicting associations between miRNAs and diseases. In the initial phase, we developed multiple networks to represent the similarity between miRNAs and diseases and employed GCNs to extract information from diverse perspectives. Subsequently, we introduce HyperGCN to construct a miRNA-disease heteromorphic hypergraph using hypernodes and train GCN on the graph to aggregate information. Finally, we utilized the Sørensen-Dice loss function to evaluate the degree of similarity between the predicted outcomes and the ground truth values, thereby enabling the prediction of associations between miRNAs and diseases. To assess the soundness of our methodology, an extensive series of experiments was conducted using the Human MicroRNA Disease Database (HMDD v3.2) as the dataset. The experimental outcomes indicate that HGSMDA exhibits remarkable efficacy when compared to alternative methodologies. Furthermore, the predictive capacity of HGSMDA was corroborated through a case study focused on colon cancer. These findings strongly imply that HGSMDA represents a dependable and valid framework, thereby offering a novel avenue for investigating the intricate association between miRNAs and diseases.
Affiliation(s)
- Rong Zhu
- School of Computer Science, Qufu Normal University, Rizhao 276826, China; (Z.C.); (J.L.); (J.S.); (L.D.)
| | | | | | | |
|
18
|
Veleiro U, de la Fuente J, Serrano G, Pizurica M, Casals M, Pineda-Lucena A, Vicent S, Ochoa I, Gevaert O, Hernaez M. GeNNius: an ultrafast drug-target interaction inference method based on graph neural networks. Bioinformatics 2024; 40:btad774. [PMID: 38134424 PMCID: PMC10766589 DOI: 10.1093/bioinformatics/btad774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2023] [Revised: 11/20/2023] [Accepted: 12/21/2023] [Indexed: 12/24/2023] Open
Abstract
MOTIVATION: Drug-target interaction (DTI) prediction is a relevant but challenging task in the drug repurposing field. In-silico approaches have drawn particular attention as they can reduce the costs and time commitment associated with traditional methodologies. Yet, current state-of-the-art methods present several limitations. First, existing DTI prediction approaches are computationally expensive, which hinders the use of large networks and the exploitation of available datasets. Second, the generalization of DTI prediction methods to unseen datasets remains unexplored, although studying it could improve the development of DTI inference approaches in terms of accuracy and robustness. RESULTS: In this work, we introduce GeNNius (Graph Embedding Neural Network Interaction Uncovering System), a Graph Neural Network (GNN)-based method that outperforms state-of-the-art models in terms of both accuracy and time efficiency across a variety of datasets. We also demonstrated its power to uncover new interactions by evaluating previously unknown DTIs for each dataset. We further assessed the generalization capability of GeNNius by training and testing it on different datasets, showing that this framework can potentially improve the DTI prediction task by training on large datasets and testing on smaller ones. Finally, we qualitatively investigated the embeddings generated by GeNNius, revealing that the GNN encoder maintains biological information after the graph convolutions while diffusing this information through nodes, eventually distinguishing protein families in the node embedding space. AVAILABILITY AND IMPLEMENTATION: GeNNius code is available at https://github.com/ubioinformat/GeNNius.
Affiliation(s)
- Uxía Veleiro
- CIMA University of Navarra, IdiSNA, 31008 Pamplona, Spain
- Jesús de la Fuente
- TECNUN, University of Navarra, 20016 San Sebastian, Spain
- Center for Data Science, New York University, New York, NY 10012, United States
- Guillermo Serrano
- CIMA University of Navarra, IdiSNA, 31008 Pamplona, Spain
- TECNUN, University of Navarra, 20016 San Sebastian, Spain
- Marija Pizurica
- Stanford Center for Biomedical Informatics Research, Department of Medicine and Department Biomedical Data Science, Stanford University, Stanford, CA 94305, United States
- Internet Technology and Data Science LAB (IDLab), Ghent University, Gent 9052, Belgium
- Mikel Casals
- TECNUN, University of Navarra, 20016 San Sebastian, Spain
- Silve Vicent
- CIMA University of Navarra, IdiSNA, 31008 Pamplona, Spain
- Idoia Ochoa
- TECNUN, University of Navarra, 20016 San Sebastian, Spain
- Instituto de Ciencia de los Datos e Inteligencia Artificial (DATAI), University of Navarra, 31008 Pamplona, Spain
- Olivier Gevaert
- Stanford Center for Biomedical Informatics Research, Department of Medicine and Department Biomedical Data Science, Stanford University, Stanford, CA 94305, United States
- Mikel Hernaez
- CIMA University of Navarra, IdiSNA, 31008 Pamplona, Spain
- Instituto de Ciencia de los Datos e Inteligencia Artificial (DATAI), University of Navarra, 31008 Pamplona, Spain
| |
|
19
|
Zhang C, Sheng Q, Zhao N, Huang S, Zhao Y. DNA hypomethylation mediates immune response in pan-cancer. Epigenetics 2023; 18:2192894. [PMID: 36945884 PMCID: PMC10038033 DOI: 10.1080/15592294.2023.2192894] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 03/23/2023] Open
Abstract
Abnormal DNA methylation is a fundamental epigenetic characteristic of cancer. Here we demonstrate that aberrant DNA methylation can modulate the tumour immune microenvironment in 16 cancer types. Differential DNA methylation in promoter regions can regulate the transcriptomic pattern of immune-related genes, and DNA hypomethylation mainly participates in the processes of immunity, carcinogenesis and immune infiltration. Moreover, many cancer types share immune-related functions, such as activation of the innate immune response, the interferon gamma response and the NOD-like receptor signalling pathway. DNA methylation can further help identify molecular subtypes of kidney renal clear cell carcinoma. These subtypes are characterized by DNA methylation pattern, major histocompatibility complex, cytolytic activity, cytotoxic T lymphocytes and tumour mutation burden, and the subtype with a hypomethylation pattern shows unstable immune status. We then investigate the DNA methylation pattern of exhaustion-related marker genes and further demonstrate the role of hypomethylation in the tumour immune microenvironment. In summary, our findings support the use of hypomethylation as a biomarker to understand the mechanism of the tumour immune environment.
Affiliation(s)
- Chunlong Zhang
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, Heilongjiang, China
- Qi Sheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Ning Zhao
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, Heilongjiang, China
- Shan Huang
- The Second Affiliated Hospital, Harbin Medical University, Harbin, Heilongjiang, China
- Yuming Zhao
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, Heilongjiang, China
| |
|
20
|
Wang J, Xiao Y, Shang X, Peng J. Predicting drug-target binding affinity with cross-scale graph contrastive learning. Brief Bioinform 2023; 25:bbad516. [PMID: 38221904 PMCID: PMC10788681 DOI: 10.1093/bib/bbad516] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Revised: 12/04/2023] [Accepted: 12/07/2023] [Indexed: 01/16/2024] Open
Abstract
Identifying the binding affinity between a drug and its target is essential in drug discovery and repurposing. Numerous computational approaches have been proposed for understanding these interactions. However, most existing methods utilize only the molecular structure information of drugs and targets or only the interaction information of drug-target bipartite networks. They may fail to combine the molecule-scale and network-scale features to obtain high-quality representations. In this study, we propose CSCo-DTA, a novel cross-scale graph contrastive learning approach for drug-target binding affinity prediction. The proposed model combines features learned from the molecular scale and the network scale to capture information from both local and global perspectives. We conducted experiments on two benchmark datasets, and the proposed model outperformed existing state-of-the-art methods. The ablation experiment demonstrated the significance and efficacy of the multi-scale features and cross-scale contrastive learning modules in improving prediction performance. Moreover, we applied CSCo-DTA to predict novel potential targets for Erlotinib and validated the predicted targets with molecular docking analysis.
Affiliation(s)
- Jingru Wang
- School of Computer Science, Northwestern Polytechnical University, Xi’an, 710072, China
- Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi’an, 710072, China
- The National Engineering Laboratory for Integrated Aerospace-Ground-Ocean Big Data Application Technology, Xi’an, 710072, China
- Yihang Xiao
- School of Computer Science, Northwestern Polytechnical University, Xi’an, 710072, China
- Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi’an, 710072, China
- Xuequn Shang
- School of Computer Science, Northwestern Polytechnical University, Xi’an, 710072, China
- Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi’an, 710072, China
- The National Engineering Laboratory for Integrated Aerospace-Ground-Ocean Big Data Application Technology, Xi’an, 710072, China
- Jiajie Peng
- School of Computer Science, Northwestern Polytechnical University, Xi’an, 710072, China
- Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi’an, 710072, China
- The National Engineering Laboratory for Integrated Aerospace-Ground-Ocean Big Data Application Technology, Xi’an, 710072, China
- Research and Development Institute of Northwestern Polytechnical University in Shenzhen, Shenzhen, 518000, China
| |
|
21
|
Liu L, Zhang Q, Wei Y, Zhao Q, Liao B. A Biological Feature and Heterogeneous Network Representation Learning-Based Framework for Drug-Target Interaction Prediction. Molecules 2023; 28:6546. [PMID: 37764321 PMCID: PMC10535805 DOI: 10.3390/molecules28186546] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Revised: 09/06/2023] [Accepted: 09/07/2023] [Indexed: 09/29/2023] Open
Abstract
The prediction of drug-target interaction (DTI) is crucial to drug discovery. Although the interactions between the drug and target can be accurately verified by traditional biochemical experiments, the determination of DTI through biochemical experiments is a time-consuming, laborious, and expensive process. Therefore, we propose a learning-based framework named BG-DTI for drug-target interaction prediction. Our model combines two main approaches based on biological features and heterogeneous networks to identify interactions between drugs and targets. First, we extract original features from the sequence to encode each drug and target. Later, we further consider the relationships among various biological entities by constructing drug-drug similarity networks and target-target similarity networks. Furthermore, a graph convolutional network and a graph attention network in the graph representation learning module help us learn the features representation of drugs and targets. After obtaining the features from graph representation learning modules, these features are combined into fusion descriptors for drug-target pairs. Finally, we send the fusion descriptors and labels to a random forest classifier for predicting DTI. The evaluation results show that BG-DTI achieves an average AUC of 0.938 and an average AUPR of 0.930, which is better than those of five existing state-of-the-art methods. We believe that BG-DTI can facilitate the development of drug discovery or drug repurposing.
Affiliation(s)
- Liwei Liu
- College of Science, Dalian Jiaotong University, Dalian 116028, China; (L.L.); (Q.Z.)
- Key Laboratory of Computational Science and Application of Hainan Province, Hainan Normal University, Haikou 571158, China
| | - Qi Zhang
- College of Science, Dalian Jiaotong University, Dalian 116028, China; (L.L.); (Q.Z.)
| | - Yuxiao Wei
- College of Software, Dalian Jiaotong University, Dalian 116028, China;
| | - Qi Zhao
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114051, China
| | - Bo Liao
- Key Laboratory of Computational Science and Application of Hainan Province, Hainan Normal University, Haikou 571158, China
|
22
|
Yao K, Wang X, Li W, Zhu H, Jiang Y, Li Y, Tian T, Yang Z, Liu Q, Liu Q. Semi-supervised heterogeneous graph contrastive learning for drug-target interaction prediction. Comput Biol Med 2023; 163:107199. [PMID: 37421738 DOI: 10.1016/j.compbiomed.2023.107199] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/02/2023] [Revised: 04/15/2023] [Accepted: 06/19/2023] [Indexed: 07/10/2023]
Abstract
Identification of drug-target interactions (DTIs) is an important step in drug discovery and drug repositioning. In recent years, graph-based methods have attracted great attention and show advantages in predicting potential DTIs. However, these methods face the problem that the known DTIs are very limited and expensive to obtain, which decreases the generalization ability of the methods. Self-supervised contrastive learning is independent of labeled DTIs, which can mitigate the impact of the problem. Therefore, we propose a framework SHGCL-DTI for predicting DTIs, which supplements the classical semi-supervised DTI prediction task with an auxiliary graph contrastive learning module. Specifically, we generate representations for the nodes through the neighbor view and meta-path view, and define positive and negative pairs to maximize the similarity between positive pairs from different views. Subsequently, SHGCL-DTI reconstructs the original heterogeneous network to predict the potential DTIs. The experiments on the public dataset show that SHGCL-DTI achieves significant improvement in different scenarios, compared with existing state-of-the-art methods. We also demonstrate through an ablation study that the contrastive learning module improves the prediction performance and generalization ability of SHGCL-DTI. In addition, we have found several novel predicted DTIs supported by the biological literature. The data and source code are available at: https://github.com/TOJSSE-iData/SHGCL-DTI.
Affiliation(s)
- Kainan Yao
- School of Software Engineering, Tongji University, 4800 Caoan Road, Jiading District, Shanghai, 201804, China
| | - Xiaowen Wang
- School of Software Engineering, Tongji University, 4800 Caoan Road, Jiading District, Shanghai, 201804, China
| | - Wannian Li
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, 1239 Siping Road, Yangpu District, Shanghai, 200092, China.
| | - Hongming Zhu
- School of Software Engineering, Tongji University, 4800 Caoan Road, Jiading District, Shanghai, 201804, China
| | - Yizhi Jiang
- School of Software Engineering, Tongji University, 4800 Caoan Road, Jiading District, Shanghai, 201804, China
| | - Yulong Li
- School of Software Engineering, Tongji University, 4800 Caoan Road, Jiading District, Shanghai, 201804, China
| | - Tongxuan Tian
- School of Software Engineering, Tongji University, 4800 Caoan Road, Jiading District, Shanghai, 201804, China
| | - Zhaoyi Yang
- The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, No. 96, JinZhai Road Baohe District, Hefei, 230001, Anhui, China.
| | - Qi Liu
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, 1239 Siping Road, Yangpu District, Shanghai, 200092, China.
| | - Qin Liu
- School of Software Engineering, Tongji University, 4800 Caoan Road, Jiading District, Shanghai, 201804, China.
|
23
|
Wang S, Song X, Zhang Y, Zhang K, Liu Y, Ren C, Pang S. MSGNN-DTA: Multi-Scale Topological Feature Fusion Based on Graph Neural Networks for Drug-Target Binding Affinity Prediction. Int J Mol Sci 2023; 24:ijms24098326. [PMID: 37176031 PMCID: PMC10179712 DOI: 10.3390/ijms24098326] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Revised: 05/03/2023] [Accepted: 05/04/2023] [Indexed: 05/15/2023] Open
Abstract
The accurate prediction of drug-target binding affinity (DTA) is an essential step in drug discovery and drug repositioning. Although deep learning methods have been widely adopted for DTA prediction, the complexity of extracting drug and target protein features hampers the accuracy of these predictions. In this study, we propose a novel model for DTA prediction named MSGNN-DTA, which leverages a fused multi-scale topological feature approach based on graph neural networks (GNNs). To address the challenge of accurately extracting drug and target protein features, we introduce a gated skip-connection mechanism during the feature learning process to fuse multi-scale topological features, resulting in information-rich representations of drugs and proteins. Our approach constructs drug atom graphs, motif graphs, and weighted protein graphs to fully extract topological information and provide a comprehensive understanding of underlying molecular interactions from multiple perspectives. Experimental results on two benchmark datasets demonstrate that MSGNN-DTA outperforms the state-of-the-art models in all evaluation metrics, showcasing the effectiveness of the proposed approach. Moreover, the study conducts a case study based on already FDA-approved drugs in the DrugBank dataset to highlight the potential of the MSGNN-DTA framework in identifying drug candidates for specific targets, which could accelerate the process of virtual screening and drug repositioning.
Affiliation(s)
- Shudong Wang
- Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
| | - Xuanmo Song
- Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
| | - Yuanyuan Zhang
- School of Information and Control Engineering, Qingdao University of Technology, Qingdao 266525, China
| | - Kuijie Zhang
- Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
| | - Yingye Liu
- Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
| | - Chuanru Ren
- Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
| | - Shanchen Pang
- Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
|
24
|
Zulfiqar H, Ahmed Z, Kissanga Grace-Mercure B, Hassan F, Zhang ZY, Liu F. Computational prediction of promotors in Agrobacterium tumefaciens strain C58 by using the machine learning technique. Front Microbiol 2023; 14:1170785. [PMID: 37125199 PMCID: PMC10133480 DOI: 10.3389/fmicb.2023.1170785] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Accepted: 03/17/2023] [Indexed: 05/02/2023] Open
Abstract
Promoters are genomic regions upstream of genes that are bound by RNA polymerase to initiate gene transcription. Because the promoter is the most critical element of gene expression, recognizing promoters is crucial to understanding the regulation of gene expression. This study aimed to develop a machine learning-based model to predict promoters in Agrobacterium tumefaciens (A. tumefaciens) strain C58. In the model, promoter sequences were encoded by three different kinds of feature descriptors, namely, accumulated nucleotide frequency, k-mer nucleotide composition, and binary encodings. The obtained features were optimized by using correlation and the mRMR-based algorithm. These optimized features were input into a random forest (RF) classifier to discriminate promoter sequences from non-promoter sequences in A. tumefaciens strain C58. Ten-fold cross-validation showed that the proposed model could yield an overall accuracy of 0.837. This model will aid the study of promoters in A. tumefaciens strain C58.
Affiliation(s)
- Hasan Zulfiqar
- Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China, Huzhou, China
- School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Zahoor Ahmed
- Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China, Huzhou, China
| | - Bakanina Kissanga Grace-Mercure
- School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Farwa Hassan
- School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Zhao-Yue Zhang
- School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Fen Liu
- Department of Radiation Oncology, Peking University Cancer Hospital (Inner Mongolia Campus), Affiliated Cancer Hospital of Inner Mongolia Medical University, Inner Mongolia Cancer Hospital, Hohhot, China
|
25
|
Jiang Y, Jin S, Jin X, Xiao X, Wu W, Liu X, Zhang Q, Zeng X, Yang G, Niu Z. Pharmacophoric-constrained heterogeneous graph transformer model for molecular property prediction. Commun Chem 2023; 6:60. [PMID: 37012352 PMCID: PMC10070395 DOI: 10.1038/s42004-023-00857-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 08/05/2022] [Accepted: 03/20/2023] [Indexed: 04/05/2023] Open
Abstract
Informative representation of molecules is a crucial prerequisite in AI-driven drug design and discovery. Pharmacophore information including functional groups and chemical reactions can indicate molecular properties, which have not been fully exploited by prior atom-based molecular graph representation. To obtain a more informative representation of molecules for better molecule property prediction, we propose the Pharmacophoric-constrained Heterogeneous Graph Transformer (PharmHGT). We design a pharmacophoric-constrained multi-view molecular representation graph, enabling PharmHGT to extract vital chemical information from functional substructures and chemical reactions. With a carefully designed pharmacophoric-constrained multi-view molecular representation graph, PharmHGT can learn more chemical information from molecular functional substructures and chemical reaction information. Extensive downstream experiments prove that PharmHGT achieves remarkably superior performance over the state-of-the-art models (the performance of our model is up to 1.55% in ROC-AUC and 0.272 in RMSE higher than the best baseline model) on molecular properties prediction. The ablation study and case study show that our proposed molecular graph representation method and heterogeneous graph transformer model can better capture the pharmacophoric structure and chemical information features. Further visualization studies also indicated a better representation capacity achieved by our model.
Affiliation(s)
| | - Shuting Jin
- MindRank AI Ltd., 310000, Hangzhou, China
- School of Informatics, Xiamen University, 361005, Xiamen, China
- National Institute for Data Science in Health and Medicine, Xiamen University, 361005, Xiamen, China
| | - Xurui Jin
- MindRank AI Ltd., 310000, Hangzhou, China
| | | | - Wenfan Wu
- MindRank AI Ltd., 310000, Hangzhou, China
| | - Xiangrong Liu
- School of Informatics, Xiamen University, 361005, Xiamen, China
- National Institute for Data Science in Health and Medicine, Xiamen University, 361005, Xiamen, China
| | - Qiang Zhang
- School of Informatics, Zhejiang University, 310013, Hangzhou, China
| | - Xiangxiang Zeng
- School of Information Science and Engineering, Hunan University, 410082, Changsha, Hunan, China
| | - Guang Yang
- National Heart and Lung Institute, Imperial College London, London, UK.
| | - Zhangming Niu
- MindRank AI Ltd., 310000, Hangzhou, China.
- National Heart and Lung Institute, Imperial College London, London, UK.
|
26
|
Li M, Cai X, Xu S, Ji H. Metapath-aggregated heterogeneous graph neural network for drug-target interaction prediction. Brief Bioinform 2023; 24:6966534. [PMID: 36592060 DOI: 10.1093/bib/bbac578] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 07/01/2022] [Revised: 11/03/2022] [Accepted: 11/26/2022] [Indexed: 01/03/2023] Open
Abstract
Drug-target interaction (DTI) prediction is an essential step in drug repositioning. A few graph neural network (GNN)-based methods have been proposed for DTI prediction using heterogeneous biological data. However, existing GNN-based methods only aggregate information from directly connected nodes restricted in a drug-related or a target-related network and are incapable of capturing high-order dependencies in the biological heterogeneous graph. In this paper, we propose a metapath-aggregated heterogeneous graph neural network (MHGNN) to capture complex structures and rich semantics in the biological heterogeneous graph for DTI prediction. Specifically, MHGNN enhances heterogeneous graph structure learning and high-order semantics learning by modeling high-order relations via metapaths. Additionally, MHGNN enriches high-order correlations between drug-target pairs (DTPs) by constructing a DTP correlation graph with DTPs as nodes. We conduct extensive experiments on three biological heterogeneous datasets. MHGNN favorably surpasses 17 state-of-the-art methods over 6 evaluation metrics, which verifies its efficacy for DTI prediction. The code is available at https://github.com/Zora-LM/MHGNN-DTI.
Affiliation(s)
- Mei Li
- Tianjin Key Laboratory of Network and Data Security Technology, China; College of Computer Science, Nankai University, 300350, Tianjin, China
| | - Xiangrui Cai
- Tianjin Key Laboratory of Network and Data Security Technology, China; College of Computer Science, Nankai University, 300350, Tianjin, China
| | - Sihan Xu
- Tianjin Key Laboratory of Network and Data Security Technology, China; College of Cyber Science, Nankai University, 300350, Tianjin, China
| | - Hua Ji
- Tianjin Key Laboratory of Network and Data Security Technology, China; College of Computer Science, Nankai University, 300350, Tianjin, China
|
27
|
Talat A, Khan AU. Artificial intelligence as a smart approach to develop antimicrobial drug molecules: A paradigm to combat drug-resistant infections. Drug Discov Today 2023; 28:103491. [PMID: 36646245 DOI: 10.1016/j.drudis.2023.103491] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 08/31/2022] [Revised: 01/01/2023] [Accepted: 01/05/2023] [Indexed: 01/15/2023]
Abstract
Antimicrobial resistance (AMR) is a silent pandemic with the third highest global mortality. The antibiotic development pipeline is scarce even though AMR has escalated uncontrollably. Artificial intelligence (AI) is a revolutionary approach, accelerating drug discovery because of its fast pace, cost efficiency, lower labor requirements, and fewer chances of failure. AI has been used to discover several beta-lactamase inhibitors and antibiotic alternatives from antimicrobial peptides (AMPs), nonribosomal peptides, bacteriocins, and marine natural products. The significant recent increase in the use of AI platforms by pharmaceutical companies could result in the discovery of efficient antibiotic alternatives with lower chances of resistance generation.
Affiliation(s)
- Absar Talat
- Medical Microbiology and Molecular Biology Laboratory, Interdisciplinary Biotechnology Unit, Aligarh Muslim University, Aligarh, India
| | - Asad U Khan
- Medical Microbiology and Molecular Biology Laboratory, Interdisciplinary Biotechnology Unit, Aligarh Muslim University, Aligarh, India.
|
28
|
Su W, Deng S, Gu Z, Yang K, Ding H, Chen H, Zhang Z. Prediction of apoptosis protein subcellular location based on amphiphilic pseudo amino acid composition. Front Genet 2023; 14:1157021. [PMID: 36926588 PMCID: PMC10011625 DOI: 10.3389/fgene.2023.1157021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2023] [Accepted: 02/20/2023] [Indexed: 03/08/2023] Open
Abstract
Introduction: Apoptosis proteins play an important role in the process of cell apoptosis, which makes the rate of cell proliferation and death reach a relative balance. Because the function of an apoptosis protein is closely related to its subcellular location, it is of great significance to study the subcellular locations of apoptosis proteins. Many efforts in bioinformatics research have been aimed at predicting their subcellular location. However, the subcellular localization of apoptotic proteins needs to be carefully studied. Methods: In this paper, based on amphiphilic pseudo amino acid composition and the support vector machine algorithm, a new method was proposed for the prediction of apoptosis proteins' subcellular location. Results and Discussion: The method achieved good performance on three data sets. The Jackknife test accuracy of the three data sets reached 90.5%, 93.9% and 84.0%, respectively. Compared with previous methods, the prediction accuracies of APACC_SVM were improved.
Affiliation(s)
- Wenxia Su
- College of Science, Inner Mongolia Agriculture University, Hohhot, China
| | - Shuyi Deng
- School of Life Science and Technology, Center for Information Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Zhifeng Gu
- School of Life Science and Technology, Center for Information Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Keli Yang
- Nonlinear Research Institute, Baoji University of Arts and Sciences, Baoji, China
| | - Hui Ding
- School of Life Science and Technology, Center for Information Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Hui Chen
- School of Healthcare Technology, Chengdu Neusoft University, Chengdu, China
| | - Zhaoyue Zhang
- School of Life Science and Technology, Center for Information Biology, University of Electronic Science and Technology of China, Chengdu, China; School of Healthcare Technology, Chengdu Neusoft University, Chengdu, China
|
29
|
Dong R, Yang H, Ai C, Duan G, Wang J, Guo F. DeepBLI: A Transferable Multichannel Model for Detecting β-Lactamase-Inhibitor Interaction. J Chem Inf Model 2022; 62:5830-5840. [PMID: 36245217 DOI: 10.1021/acs.jcim.2c01008] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
Abstract
Pathogens producing β-lactamase pose a great challenge to antibiotic-resistant infection treatment; thus, it is urgent to discover novel β-lactamase inhibitors for drug development. Conventional high-throughput screening is very costly, and structure-based virtual screening is limited with mechanisms. In this study, we construct a novel multichannel deep neural network (DeepBLI) for β-lactamase inhibitor screening, pretrained with a label reversal KIBA data set and fine-tuned on β-lactamase-inhibitor pairs from BindingDB. First, the pairs of encoders (Conv and Att) fuse the information spatially and sequentially for both enzymes and inhibitors. Then, a co-attention module creates the connection between the inhibitor and enzyme embeddings. Finally, the multichannel outputs are fused with an element-wise product and then fed into 3-layer fully connected networks to predict interactions. Compared with state-of-the-art methods, DeepBLI yields an AUROC of 0.9240 and an AUPRC of 0.9715, which indicates that it can identify new β-lactamase-inhibitor interactions. To demonstrate its prediction ability, an application of DeepBLI is described to screen potential inhibitor compounds for metallo-β-lactamase AIM-1 and repurpose rottlerin for four classes of β-lactamase targets, showing the possibility of being a broad-spectrum inhibitor. DeepBLI provides an effective way for antibacterial drug development, contributing to antibiotic-resistant therapeutics.
Affiliation(s)
- Ruihan Dong
- Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
| | - Hongpeng Yang
- Department of Computer Science and Engineering, University of South Carolina, Columbia, South Carolina 29208, United States
| | - Chengwei Ai
- College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
| | - Guihua Duan
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Jianxin Wang
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Fei Guo
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
|
30
|
Tian Z, Peng X, Fang H, Zhang W, Dai Q, Ye Y. MHADTI: predicting drug-target interactions via multiview heterogeneous information network embedding with hierarchical attention mechanisms. Brief Bioinform 2022; 23:6761042. [PMID: 36242566 DOI: 10.1093/bib/bbac434] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/05/2022] [Revised: 08/19/2022] [Accepted: 09/08/2022] [Indexed: 12/14/2022] Open
Abstract
MOTIVATION: Discovering drug-target interactions (DTIs) is a crucial step in drug development, for example in the identification of drug side effects and in drug repositioning. Since identifying DTIs by wet-lab biological experiments is time-consuming and costly, many computational approaches have been proposed and have become an efficient way to infer potential interactions. Although extensive effort has been invested in this task, the prediction accuracy still needs to be improved. In particular, heterogeneous network-based approaches do not fully consider the complex structure and rich semantic information in these heterogeneous networks. Therefore, it is still a challenge to predict DTIs efficiently. RESULTS: In this study, we develop a novel method via Multiview heterogeneous information network embedding with Hierarchical Attention mechanisms to discover potential Drug-Target Interactions (MHADTI). Firstly, MHADTI constructs different similarity networks for drugs and targets by utilizing their multisource information. Combined with the known DTI network, three drug-target heterogeneous information networks (HINs) with different views are established. Secondly, MHADTI learns embeddings of drugs and targets from multiview HINs with hierarchical attention mechanisms, which include the node-level, semantic-level and graph-level attentions. Lastly, MHADTI employs the multilayer perceptron to predict DTIs with the learned deep feature representations. The hierarchical attention mechanisms could fully consider the importance of nodes, meta-paths and graphs in learning the feature representations of drugs and targets, which makes their embeddings more comprehensive. Extensive experimental results demonstrate that MHADTI performs better than other SOTA prediction models. Moreover, analysis of prediction results for some drugs and targets of interest further indicates that MHADTI has advantages in discovering DTIs.
AVAILABILITY AND IMPLEMENTATION: https://github.com/pxystudy/MHADTI.
Affiliation(s)
- Zhen Tian
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
| | - Xiangyu Peng
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
| | - Haichuan Fang
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
| | - Wenjie Zhang
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
| | - Qiguo Dai
- School of Computer Science and Engineering, Dalian Minzu University, Dalian 116600, China
| | - Yangdong Ye
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
|
31
|
Xiao J, Liu M, Huang Q, Sun Z, Ning L, Duan J, Zhu S, Huang J, Lin H, Yang H. Analysis and modeling of myopia-related factors based on questionnaire survey. Comput Biol Med 2022; 150:106162. [PMID: 36252365 DOI: 10.1016/j.compbiomed.2022.106162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2022] [Revised: 09/12/2022] [Accepted: 10/01/2022] [Indexed: 11/03/2022]
Abstract
With the rapid development of science and technology, the trend toward myopia at younger ages is becoming increasingly significant. The latest national survey done by the Chinese government found that more than 80% of Chinese teenagers suffer from myopia. Adolescent myopia is closely related to living environment, heredity, and living habits. Quantifying the relationship between myopia and living environment, heredity, and living habits is conducive to the prevention and intervention of adolescent myopia. In this study, we investigated the relationships between four main factors (environment, habits, parental vision, and demographic) and myopia status by analyzing the questionnaire data. Data were collected from Chengdu, China in 2021, including 2808 myopia samples and 5693 non-myopia samples, with a total of 22 features. Then, these 22 features were input into three machine learning algorithms to discriminate the two classes of samples. Results show that the computational model could produce an AUC of 0.768. To pick out the features that play the most important roles in classification, we used an incremental feature selection strategy to screen the 22 features. As a result, we found that the 4 most influential features with XGBoost could achieve a competitive AUC of 0.764. To further investigate the risk and protective factors affecting adolescent myopia, we used OR values derived from MLE-LR to analyze the relationship between the 22 features and adolescent myopia. Results showed that the age variable was the most significant risk factor for myopia, followed by the myopia status of parents. The most protective factor for eyesight is the measure taken by the children, followed by the distance between books and eyes when reading. These discoveries can guide the prevention and control of myopia in children and adolescents.
Affiliation(s)
- Jianqiang Xiao
- Eye School, Chengdu University of Traditional Chinese Medicine, Ineye Hospital of Chengdu University of TCM, China
| | - Mujiexin Liu
- Eye School, Chengdu University of Traditional Chinese Medicine, Ineye Hospital of Chengdu University of TCM, China
| | - Qinlai Huang
- School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China
| | - Zijie Sun
- School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China
| | - Lin Ning
- School of Healthcare Technology, Chengdu Neusoft University, Chengdu, 611844, China
| | - Junguo Duan
- Eye School, Chengdu University of Traditional Chinese Medicine, Ineye Hospital of Chengdu University of TCM, China
| | - Siquan Zhu
- Eye School, Chengdu University of Traditional Chinese Medicine, Ineye Hospital of Chengdu University of TCM, China
| | - Jian Huang
- School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China
| | - Hao Lin
- School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China.
| | - Hui Yang
- School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China; School of Computer Science, Chengdu University of Information Technology, Chengdu, 611844, China.
|
32
|
GCHN-DTI: Predicting drug-target interactions by graph convolution on heterogeneous networks. Methods 2022; 206:101-107. [PMID: 36058415 DOI: 10.1016/j.ymeth.2022.08.016] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/02/2022] [Revised: 08/17/2022] [Accepted: 08/29/2022] [Indexed: 11/22/2022] Open
Abstract
Determining the interaction of drug and target plays a key role in the process of drug development and discovery. Computational methods can predict new interactions and speed up the process of drug development. In recent studies, network-based approaches have been proposed to predict drug-target interactions. However, these methods cannot fully utilize the node information from heterogeneous networks. Therefore, we propose a method based on a heterogeneous graph convolutional neural network for drug-target interaction prediction, GCHN-DTI (Predicting drug-target interactions by graph convolution on heterogeneous networks), to predict potential DTIs. GCHN-DTI integrates network information from drug-target interactions, drug-drug interactions, drug-similarities, target-target interactions, and target-similarities. Then, the graph convolution operation is used in the heterogeneous network to obtain the node embedding of the drugs and the targets. Furthermore, we incorporate an attention mechanism between graph convolutional layers to combine node embedding from each layer. Finally, the drug-target interaction score is predicted based on the node embedding of the drugs and the targets. Our model uses fewer network types and achieves higher prediction performance. In addition, the prediction performance of the model will be significantly improved on the dataset with a higher proportion of positive samples. The experimental evaluations show that GCHN-DTI outperforms several state-of-the-art prediction methods.
|
33
|
Yeh SJ, Yeh TY, Chen BS. Systems Drug Discovery for Diffuse Large B Cell Lymphoma Based on Pathogenic Molecular Mechanism via Big Data Mining and Deep Learning Method. Int J Mol Sci 2022; 23:ijms23126732. [PMID: 35743172 PMCID: PMC9224183 DOI: 10.3390/ijms23126732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2022] [Revised: 06/10/2022] [Accepted: 06/15/2022] [Indexed: 02/01/2023]
Abstract
Diffuse large B cell lymphoma (DLBCL) is an aggressive heterogeneous disease. The most common subtypes of DLBCL include the germinal center B-cell (GCB) type and the activated B-cell (ABC) type. To learn more about the pathogenesis of the two DLBCL subtypes (i.e., DLBCL ABC and DLBCL GCB), we first construct a candidate genome-wide genetic and epigenetic network (GWGEN) by big database mining. With the help of genome-wide microarray data for the two DLBCL subtypes, we identify their real GWGENs via system identification and model order selection approaches. Afterward, the core GWGENs of the two DLBCL subtypes are extracted from the real GWGENs by the principal network projection (PNP) method. By comparing core signaling pathways and investigating pathogenic mechanisms, we are able to identify pathogenic biomarkers as drug targets for DLBCL ABC and DLBCL GCB, respectively. Furthermore, we perform drug discovery considering drug-target interaction ability, drug regulation ability, and drug toxicity. For this step, a deep neural network (DNN)-based drug-target interaction (DTI) model is trained in advance to predict potential drug candidates with a higher probability of interacting with the identified biomarkers. Consequently, two drug combinations are proposed to alleviate DLBCL ABC and DLBCL GCB, respectively.
|
34
|
Wang S, Xu D, Gao B, Yan S, Sun Y, Tang X, Jiao Y, Huang S, Zhang S. Heterogeneity Analysis of Bladder Cancer Based on DNA Methylation Molecular Profiling. Front Oncol 2022; 12:915542. [PMID: 35747826 PMCID: PMC9209659 DOI: 10.3389/fonc.2022.915542] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/08/2022] [Accepted: 05/13/2022] [Indexed: 11/13/2022]
Abstract
Bladder cancer is a highly complex and heterogeneous malignancy. Tumor heterogeneity is a barrier to effective diagnosis and treatment of bladder cancer. Human carcinogenesis is closely related to abnormal gene expression, and DNA methylation is an important regulatory factor of gene expression. Therefore, it is of great significance for bladder cancer research to characterize tumor heterogeneity by integrating genetic and epigenetic characteristics. This study explored specific molecular subtypes based on DNA methylation status and identified subtype-specific characteristics using patient samples from the TCGA database for which DNA methylation and gene expression were measured simultaneously. The results were validated using an independent cohort from the GEO database. Four DNA methylation molecular subtypes of bladder cancer with different prognostic states were obtained. In addition, subtype-specific DNA methylation markers were identified using an information entropy-based algorithm to represent the unique molecular characteristics of each subtype and were verified in the test set. The results of this study can provide an important reference for clinicians making treatment decisions.
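The abstract does not spell out the information entropy-based algorithm, so the following Python sketch only illustrates the general idea under stated assumptions: a site whose methylation signal is concentrated in one subtype has low Shannon entropy across subtypes and is therefore a candidate subtype-specific marker. The function names, the site labels and the per-subtype mean methylation values are all hypothetical, and the authors' actual scoring may differ.

```python
import math


def subtype_entropy(mean_methylation):
    """Shannon entropy (bits) of a site's methylation distributed over subtypes.

    Assumption: the per-subtype mean methylation levels are normalized to
    sum to 1 and treated as a probability distribution.
    """
    total = sum(mean_methylation)
    if total <= 0:
        raise ValueError("methylation levels must sum to a positive value")
    probs = [m / total for m in mean_methylation]
    return -sum(p * math.log2(p) for p in probs if p > 0)


def rank_markers(site_profiles):
    """Order site names from most subtype-specific (lowest entropy) to least."""
    return sorted(site_profiles, key=lambda site: subtype_entropy(site_profiles[site]))


# Hypothetical mean methylation levels of three sites in four subtypes.
profiles = {
    "site_A": [0.90, 0.05, 0.03, 0.02],  # concentrated in one subtype
    "site_B": [0.50, 0.50, 0.50, 0.50],  # identical in all subtypes
    "site_C": [0.40, 0.40, 0.10, 0.10],  # shared by two subtypes
}
ranking = rank_markers(profiles)
print(ranking)
```

A site with identical levels in all four subtypes reaches the maximum entropy of 2 bits, so it ranks last as a marker candidate.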
Affiliation(s)
- Shuyu Wang
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Dali Xu
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Bo Gao
- Department of Radiology, The Second Affiliated Hospital of Harbin Medical University, Harbin, China
- Shuhan Yan
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Yiwei Sun
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Xinxing Tang
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Yanjia Jiao
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Shan Huang
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin, China
- *Correspondence: Shumei Zhang; Shan Huang
- Shumei Zhang
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- *Correspondence: Shumei Zhang; Shan Huang
|
35
|
Tran HNT, Thomas JJ, Ahamed Hassain Malim NH. DeepNC: a framework for drug-target interaction prediction with graph neural networks. PeerJ 2022; 10:e13163. [PMID: 35578674 PMCID: PMC9107302 DOI: 10.7717/peerj.13163] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/25/2021] [Accepted: 03/03/2022] [Indexed: 01/12/2023]
Abstract
The exploration of drug-target interactions (DTI) is an essential stage in the drug development pipeline. Thanks to computational models, notably deep learning approaches, scientists have been able to shorten the time spent on this stage. Widely practiced deep learning algorithms such as convolutional neural networks and recurrent neural networks are commonly employed in DTI prediction projects. However, they can hardly utilize the natural graph structure of molecular inputs. For that reason, a graph neural network (GNN) is a suitable choice for learning the chemical and structural characteristics of molecules, since it represents molecular compounds as graphs and learns the compound features from those graphs. In an effort to construct an advanced deep learning-based model for DTI prediction, we propose Deep Neural Computation (DeepNC), a framework utilizing three GNN algorithms: Generalized Aggregation Networks (GENConv), Graph Convolutional Networks (GCNConv), and Hypergraph Convolution-Hypergraph Attention (HypergraphConv). In short, our framework learns the features of drugs and targets through GNN layers and a 1-D convolution network, respectively. Then, the representations of the drugs and targets are fed into fully-connected layers to predict the binding affinity values. The DeepNC models were evaluated on two benchmark datasets (Davis, Kiba) and one independently proposed dataset (Allergy) to confirm that they are suitable for predicting the binding affinity of drugs and targets. Moreover, compared with baseline methods addressing the same problem, DeepNC improves performance in terms of mean square error and concordance index.
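The two evaluation metrics named at the end of this abstract can be computed as in the Python sketch below. It uses the standard pairwise definition of the concordance index for continuous labels (pairs with equal true affinity are skipped, tied predictions count as half concordant); whether the authors handled ties exactly this way is an assumption, and the affinity and prediction values are made up for illustration.

```python
from itertools import combinations


def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted affinities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Pairs with equal true values are not comparable and are skipped;
    pairs with equal predictions contribute 0.5.
    """
    concordant = 0.0
    comparable = 0
    for (t_i, p_i), (t_j, p_j) in combinations(zip(y_true, y_pred), 2):
        if t_i == t_j:
            continue
        comparable += 1
        if p_i == p_j:
            concordant += 0.5
        elif (p_i > p_j) == (t_i > t_j):
            concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs: all true values are equal")
    return concordant / comparable


# Hypothetical binding affinities and model predictions.
affinities = [5.0, 6.2, 7.1, 8.4]
predictions = [5.3, 6.0, 7.5, 7.4]

print(mean_squared_error(affinities, predictions))
print(concordance_index(affinities, predictions))
```

Here five of the six comparable pairs are ordered correctly, so the concordance index is 5/6, while the mean squared error is about 0.3225. The two metrics are complementary: the first measures ranking quality, the second measures how far the predicted values are from the true ones.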
Affiliation(s)
- Huu Ngoc Tran Tran
- Department of Computing, UOW Malaysia, KDU Penang University College, George Town, Penang, Malaysia
- J. Joshua Thomas
- Department of Computing, UOW Malaysia, KDU Penang University College, George Town, Penang, Malaysia
|
36
|
Lou Z, Cheng Z, Li H, Teng Z, Liu Y, Tian Z. Predicting miRNA-disease associations via learning multimodal networks and fusing mixed neighborhood information. Brief Bioinform 2022; 23:6582005. [PMID: 35524503 DOI: 10.1093/bib/bbac159] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Received: 02/10/2022] [Revised: 03/29/2022] [Accepted: 04/10/2022] [Indexed: 12/13/2022]
Abstract
MOTIVATION In recent years, a large number of biological experiments have shown that miRNAs play an important role in understanding disease pathogenesis. The discovery of miRNA-disease associations is beneficial for disease diagnosis and treatment. Since inferring these associations through biological experiments is time-consuming and expensive, researchers have sought to identify the associations using computational approaches. Graph Convolutional Networks (GCNs), which exhibit excellent performance in link prediction problems, have been successfully used in miRNA-disease association prediction. However, GCNs only consider 1st-order neighborhood information at one layer and fail to capture information from high-order neighbors when learning miRNA and disease representations through information propagation. Therefore, how to effectively and explicitly aggregate information from the high-order neighborhood remains challenging. RESULTS To address this challenge, we propose a novel method called mixed neighborhood information for miRNA-disease association (MINIMDA), which fuses mixed high-order neighborhood information of miRNAs and diseases in multimodal networks. First, MINIMDA constructs the integrated miRNA similarity network and the integrated disease similarity network from their respective multisource information. Then, the embedding representations of miRNAs and diseases are obtained by fusing mixed high-order neighborhood information from the multimodal networks, namely the integrated miRNA similarity network, the integrated disease similarity network and the miRNA-disease association network. Finally, we concatenate the multimodal embedding representations of miRNAs and diseases and feed them into a multilayer perceptron (MLP) to predict their underlying associations. Extensive experimental results show that MINIMDA is superior to other state-of-the-art methods overall. 
Moreover, the outstanding performance on case studies for esophageal cancer, colon tumor and lung cancer further demonstrates the effectiveness of MINIMDA. AVAILABILITY AND IMPLEMENTATION https://github.com/chengxu123/MINIMDA and http://120.79.173.96/.
Affiliation(s)
- Zhengzheng Lou
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Zhaoxu Cheng
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Hui Li
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Zhixia Teng
- College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China
- Yang Liu
- Departments of Cerebrovascular Diseases, The Second Affiliated Hospital of Zhengzhou University, Zhengzhou 450000, China
- Zhen Tian
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
|
37
|
Nag S, Baidya ATK, Mandal A, Mathew AT, Das B, Devi B, Kumar R. Deep learning tools for advancing drug discovery and development. 3 Biotech 2022; 12:110. [PMID: 35433167 PMCID: PMC8994527 DOI: 10.1007/s13205-022-03165-8] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 11/06/2021] [Accepted: 03/18/2022] [Indexed: 12/26/2022]
Abstract
A few decades ago, drug discovery and development were limited to a group of medicinal chemists working in a lab with an enormous amount of testing, validation, and synthetic procedures, all contributing to considerable investments of time and wealth to get one drug into the clinic. Advancements in computational techniques, combined with a boom in multi-omics data, led to the development of various bioinformatics/pharmacoinformatics/cheminformatics tools that have helped speed up the drug development process. With the advent of artificial intelligence (AI), machine learning (ML) and deep learning (DL), the conventional drug discovery process has been further rationalized. Extensive biological data, in the form of big data present in various databases across the globe, act as the raw material for ML/DL-based approaches and help in the accurate identification of patterns and models, which can be used to identify therapeutically active molecules with much less investment of time, workforce and wealth. In this review, we begin by introducing the general concepts of the drug discovery pipeline, followed by an outline of the fields in the drug discovery process where ML/DL can be utilized. We also introduce ML and DL along with their applications, various learning methods, and training models used to develop ML/DL-based algorithms. Furthermore, we summarize various DL-based tools existing in the public domain and their application in the drug discovery paradigm, including DL tools for the identification of drug targets and drug-target interactions such as DeepCPI, DeepDTA, WideDTA, PADME, DeepAffinity, and DeepPocket. 
Additionally, we have discussed various DL-based models used in protein structure prediction, de novo design of new chemical scaffolds, virtual screening of chemical libraries for hit identification, absorption, distribution, metabolism, excretion, and toxicity (ADMET) prediction, metabolite prediction, clinical trial design, and oral bioavailability prediction. In the end, we have tried to shed light on some of the successful ML/DL-based models used in the drug discovery and development pipeline while also discussing the current challenges and prospects of the application of DL tools in drug discovery and development. We believe that this review will be useful for medicinal and computational chemists searching for DL tools for use in their drug discovery projects.
Affiliation(s)
- Sagorika Nag
- Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Anurag T. K. Baidya
- Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Abhimanyu Mandal
- Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Alen T. Mathew
- Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Bhanuranjan Das
- Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Bharti Devi
- Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Rajnish Kumar
- Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
|