1
Zheng Y, Ma Y, Xiong Q, Zhu K, Weng N, Zhu Q. The role of artificial intelligence in the development of anticancer therapeutics from natural polyphenols: Current advances and future prospects. Pharmacol Res 2024; 208:107381. [PMID: 39218422 DOI: 10.1016/j.phrs.2024.107381] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2024] [Revised: 08/06/2024] [Accepted: 08/26/2024] [Indexed: 09/04/2024]
Abstract
Natural polyphenols, abundant in the human diet, are derived from a wide variety of sources. Numerous preclinical studies have demonstrated their significant anticancer properties against various malignancies, making them valuable resources for drug development. However, traditional experimental methods for developing anticancer therapies from natural polyphenols are time-consuming and labor-intensive. Recently, artificial intelligence has shown promising advancements in drug discovery. Integrating AI technologies into the development process for natural polyphenols can substantially reduce development time and enhance efficiency. In this study, we review the crucial roles of natural polyphenols in anticancer treatment and explore the potential of AI technologies to aid in drug development. Specifically, we discuss the application of AI in key stages such as drug structure prediction, virtual drug screening, prediction of biological activity, and drug-target protein interaction, highlighting the potential to revolutionize the development of natural polyphenol-based anticancer therapies.
Affiliation(s)
- Ying Zheng
  - Division of Abdominal Tumor Multimodality Treatment, Cancer Center, West China Hospital, Sichuan University, No.37 Guoxue Alley, Chengdu, Sichuan 610041, China
- Yifei Ma
  - Division of Abdominal Tumor Multimodality Treatment, Cancer Center, West China Hospital, Sichuan University, No.37 Guoxue Alley, Chengdu, Sichuan 610041, China
- Qunli Xiong
  - Division of Abdominal Tumor Multimodality Treatment, Cancer Center, West China Hospital, Sichuan University, No.37 Guoxue Alley, Chengdu, Sichuan 610041, China
- Kai Zhu
  - Department of Medical Oncology, Clinical Oncology School of Fujian Medical University, Fujian Cancer Hospital, Fujian 350011, PR China
- Ningna Weng
  - Department of Medical Oncology, Clinical Oncology School of Fujian Medical University, Fujian Cancer Hospital, Fujian 350011, PR China
- Qing Zhu
  - Division of Abdominal Tumor Multimodality Treatment, Cancer Center, West China Hospital, Sichuan University, No.37 Guoxue Alley, Chengdu, Sichuan 610041, China.

2
de Alencar Morais Lima W, de Souza JG, García-Villén F, Loureiro JL, Raffin FN, Fernandes MAC, Souto EB, Severino P, Barbosa RDM. Next-generation pediatric care: nanotechnology-based and AI-driven solutions for cardiovascular, respiratory, and gastrointestinal disorders. World J Pediatr 2024:10.1007/s12519-024-00834-x. [PMID: 39192003 DOI: 10.1007/s12519-024-00834-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Accepted: 07/21/2024] [Indexed: 08/29/2024]
Abstract
BACKGROUND Global pediatric healthcare reveals significant morbidity and mortality rates linked to respiratory, cardiac, and gastrointestinal disorders in children and newborns, mostly due to the complexity of therapeutic management in pediatrics and neonatology, which stems from the lack of suitable dosage forms for these patients and often renders them "therapeutic orphans". The development and application of pediatric drug formulations encounter numerous challenges, including physiological heterogeneity within age groups, limited profitability for the pharmaceutical industry, and ethical and clinical constraints. Many drugs are used unlicensed or off-label, posing a high risk of toxicity and reduced efficacy. Despite these circumstances, some regulatory changes are under way, driving research innovation in this field.
DATA SOURCES Up-to-date peer-reviewed journal articles, books, government and institutional reports, data repositories and databases were used as main data sources.
RESULTS Among the main strategies proposed to address the current pediatric care situation, nanotechnology is especially promising for pediatric respiratory diseases since it offers non-invasive, versatile, tunable, site-specific drug release. Tissue engineering is in the spotlight as a strategy to address pediatric cardiac diseases, together with theragnostic systems. The integration of nanotechnology and theragnostics stands poised to refine and propel nanomedicine approaches, ushering in an era of innovative and personalized drug delivery for pediatric patients. Finally, the intersection of drug repurposing and artificial intelligence tools in pediatric healthcare holds great potential. This promises to enhance efficiency not only in drug development in general but also in the pediatric field, hopefully boosting clinical trials for this population.
CONCLUSIONS Despite the long road ahead, the deepening of nanotechnology, the evolution of tissue engineering, and the combination of traditional techniques with artificial intelligence are the most recently reported strategies in the specific field of pediatric therapeutics.
Affiliation(s)
- Jackson G de Souza
  - InovAI Lab, nPITI/IMD, Federal University of Rio Grande Do Norte, Natal, RN, 59078-970, Brazil
- Fátima García-Villén
  - Department of Pharmacy and Pharmaceutical Technology, School of Pharmacy, University of Granada, Campus of Cartuja, 18071, Granada, Spain.
- Julia Lira Loureiro
  - Laboratory of Galenic Pharmacy, Department of Pharmacy, Federal University of Rio Grande Do Norte, Natal, 59012-570, Brazil
- Fernanda Nervo Raffin
  - Laboratory of Galenic Pharmacy, Department of Pharmacy, Federal University of Rio Grande Do Norte, Natal, 59012-570, Brazil
- Marcelo A C Fernandes
  - InovAI Lab, nPITI/IMD, Federal University of Rio Grande Do Norte, Natal, RN, 59078-970, Brazil
  - Department of Computer Engineering and Automation, Federal University of Rio Grande Do Norte, Natal, RN, 59078-970, Brazil
- Eliana B Souto
  - Laboratory of Pharmaceutical Technology, Faculty of Pharmacy, University of Porto, Rua Jorge de Viterbo Ferreira, 228, 4050-313, Porto, Portugal
- Patricia Severino
  - Industrial Biotechnology Program, University of Tiradentes (UNIT), Aracaju, Sergipe, 49032-490, Brazil
- Raquel de M Barbosa
  - Department of Pharmacy and Pharmaceutical Technology, School of Pharmacy, University of Seville, C/Professor García González, 2, 41012, Seville, Spain.

3
Li Y, Liang W, Peng L, Zhang D, Yang C, Li KC. Predicting Drug-Target Interactions Via Dual-Stream Graph Neural Network. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:948-958. [PMID: 36074878 DOI: 10.1109/tcbb.2022.3204188] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/15/2023]
Abstract
Drug-target interaction prediction is a crucial stage in drug discovery. However, brute-force search over a compound database is financially infeasible. The number of measured drug-target interaction records has increased in recent years, and the rich drug/protein-related information allows the usage of graph machine learning. Despite the advances in deep learning-enabled drug-target interaction prediction, there are still open challenges: (1) the rich and complex relationships between drugs and proteins can be explored further; (2) the intermediate node is not calibrated in the heterogeneous graph. To tackle the above issues, this paper proposes a framework named DSG-DTI. Specifically, DSG-DTI combines a heterogeneous graph autoencoder with heterogeneous attention network-based matrix completion. Our framework ensures that the known types of nodes (e.g., drug, target, side effects, diseases) are precisely embedded into high-dimensional space with our pretraining techniques. Also, the attention-based heterogeneous graph-based matrix completion achieves highly competitive results via effective long-range dependency extraction. We verify our model on two public benchmarks. The results show that the proposed scheme effectively predicts drug-target interactions and can generalize to newly registered drugs and targets with slight performance degradation, outperforming the other baselines in accuracy.

4
Abubakar ML, Kapoor N, Sharma A, Gambhir L, Jasuja ND, Sharma G. Artificial Intelligence in Drug Identification and Validation: A Scoping Review. Drug Res (Stuttg) 2024; 74:208-219. [PMID: 38830370 DOI: 10.1055/a-2306-8311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2024]
Abstract
The end-to-end process in the discovery of drugs involves therapeutic candidate identification, validation of identified targets, identification of hit compound series, lead identification and optimization, characterization, and formulation and development. The process is lengthy, expensive, tedious, and inefficient, with a large attrition rate for novel drug discovery. Today, the pharmaceutical industry is focused on improving the drug discovery process. Finding and selecting acceptable drug candidates effectively can significantly impact the price and profitability of new medications. Aside from the cost, there is a need to reduce the end-to-end process time by limiting the number of experiments at various stages. To achieve this, artificial intelligence (AI) has been utilized at various stages of drug discovery. The present study aims to identify the recent work that has developed AI-based models at various stages of drug discovery, identify the stages that need more attention, present the taxonomy of AI methods in drug discovery, and provide research opportunities. The study identified all publications from January 2016 to September 1, 2023 that were indexed in electronic databases including Scopus, NCBI PubMed, MEDLINE, Anthropology Plus, Embase, APA PsycInfo, SOCIndex, and CINAHL. Data were extracted using a standardized form, and possible research prospects are presented based on the analysis of the extracted data.
Affiliation(s)
- Neha Kapoor
  - School of Applied Sciences, Suresh Gyan Vihar University, Jaipur, Rajasthan, India
- Asha Sharma
  - Department of Zoology, Swargiya P. N. K. S. Govt. PG College, Dausa, Rajasthan, India
- Lokesh Gambhir
  - School of Basic and Applied Sciences, Shri Guru Ram Rai University, Dehradun, Uttarakhand, India
- Gaurav Sharma
  - School of Applied Sciences, Suresh Gyan Vihar University, Jaipur, Rajasthan, India

5
Abdul Raheem AK, Dhannoon BN. Comprehensive Review on Drug-target Interaction Prediction - Latest Developments and Overview. Curr Drug Discov Technol 2024; 21:e010923220652. [PMID: 37680152 DOI: 10.2174/1570163820666230901160043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2023] [Revised: 05/29/2023] [Accepted: 07/18/2023] [Indexed: 09/09/2023]
Abstract
Drug-target interactions (DTIs) are an important part of the drug development process. When the drug (a chemical molecule) binds to a target (proteins or nucleic acids), it modulates the biological behavior/function of the target, returning it to its normal state. Predicting DTIs plays a vital role in the drug discovery (DD) process as it has the potential to enhance efficiency and reduce costs. However, DTI prediction poses significant challenges and expenses due to the time-consuming and costly nature of experimental assays. As a result, researchers have increased their efforts to identify the association between medications and targets in the hopes of speeding up drug development and shortening the time to market. This paper provides a detailed discussion of the initial stage in drug discovery, namely drug-target interactions. It focuses on exploring the application of machine learning methods within this step. Additionally, we aim to conduct a comprehensive review of relevant papers and databases utilized in this field. Drug-target interaction prediction covers a wide range of applications: drug discovery, prediction of adverse effects and drug repositioning. The prediction of drug-target interactions can be categorized into three main computational methods: docking simulation approaches, ligand-based methods, and machine-learning techniques.
Affiliation(s)
- Ali K Abdul Raheem
  - Software Department, College of Information Technology, University of Babylon, Hillah, Babil, Iraq
  - University of Warith Al-Anbiyaa, Kerbala, Iraq
- Ban N Dhannoon
  - Department of Computer Science, College of Science, Al-Nahrain University, Baghdad, Iraq

6
Chebanov DK, Misyurin VA, Shubina IZ. An algorithm for drug discovery based on deep learning with an example of developing a drug for the treatment of lung cancer. Frontiers in Bioinformatics 2023; 3:1225149. [PMID: 38025397 PMCID: PMC10666046 DOI: 10.3389/fbinf.2023.1225149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2023] [Accepted: 10/02/2023] [Indexed: 12/01/2023] Open
Abstract
In this study, we present an algorithmic framework integrated within the created software platform tailored for the discovery of novel small-molecule anti-tumor agents. Our approach was exemplified in the context of combatting lung cancer. In the initial phase, target identification for therapeutic intervention was accomplished. Leveraging deep learning, we scrutinized gene expression profiles, focusing on those associated with adverse clinical outcomes in lung cancer patients. Augmenting this, generative adversarial neural (GAN) networks were employed to amass additional patient data. This effort yielded a subset of genes definitively linked to unfavorable prognoses. We further employed deep learning to delineate genes capable of discriminating between normal and tumor tissues based on expression patterns. The remaining genes were earmarked as potential targets for precision lung cancer therapy. Subsequently, a dedicated module was formulated to predict the interactions between inhibitors and proteins. To achieve this, protein amino acid sequences and chemical compound formulations engaged in protein interactions were encoded into vectorized representations. Additionally, a deep learning-based component was developed to forecast IC50 values through experimentation on cell lines. Virtual pre-clinical trials employing these inhibitors facilitated the selection of pertinent cell lines for subsequent laboratory assays. In summary, our study culminated in the derivation of several small-molecule formulas projected to bind selectively to specific proteins. This algorithmic platform holds promise in accelerating the identification and design of anti-tumor compounds, a critical pursuit in advancing targeted cancer therapies.
Affiliation(s)
- Dmitrii K. Chebanov
  - Department of Molecular Biology of Cancer, BioAlg Corp., Covina, CA, United States
- Vsevolod A. Misyurin
  - Department of Molecular Biology of Cancer, BioAlg Corp., Covina, CA, United States
- Irina Zh. Shubina
  - The Russian Melanoma Professional Association (Melanoma.PRO), Moscow, Russia

7
Liu L, Zhang Q, Wei Y, Zhao Q, Liao B. A Biological Feature and Heterogeneous Network Representation Learning-Based Framework for Drug-Target Interaction Prediction. Molecules 2023; 28:6546. [PMID: 37764321 PMCID: PMC10535805 DOI: 10.3390/molecules28186546] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Revised: 09/06/2023] [Accepted: 09/07/2023] [Indexed: 09/29/2023] Open
Abstract
The prediction of drug-target interaction (DTI) is crucial to drug discovery. Although the interactions between the drug and target can be accurately verified by traditional biochemical experiments, the determination of DTI through biochemical experiments is a time-consuming, laborious, and expensive process. Therefore, we propose a learning-based framework named BG-DTI for drug-target interaction prediction. Our model combines two main approaches based on biological features and heterogeneous networks to identify interactions between drugs and targets. First, we extract original features from the sequence to encode each drug and target. Later, we further consider the relationships among various biological entities by constructing drug-drug similarity networks and target-target similarity networks. Furthermore, a graph convolutional network and a graph attention network in the graph representation learning module help us learn the feature representations of drugs and targets. After obtaining the features from graph representation learning modules, these features are combined into fusion descriptors for drug-target pairs. Finally, we send the fusion descriptors and labels to a random forest classifier for predicting DTI. The evaluation results show that BG-DTI achieves an average AUC of 0.938 and an average AUPR of 0.930, which is better than those of five existing state-of-the-art methods. We believe that BG-DTI can facilitate drug discovery or drug repurposing.
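The AUC and AUPR figures quoted in this abstract are ranking metrics over scored drug-target pairs. As a rough illustration only (this is not BG-DTI's evaluation code; the function names, labels, and scores below are hypothetical), the following sketch computes ROC AUC by pairwise comparison and uses average precision as one common estimator of AUPR. Other implementations may interpolate the precision-recall curve differently and give slightly different values.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative pair")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def average_precision(labels, scores):
    """Average precision: mean of precision@k over the ranks k of the positives."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive pair")
    return total / hits


# Hypothetical predictions for six drug-target pairs (1 = known interaction).
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

# 7 of the 9 positive-negative pairs are ordered correctly, so AUC = 7/9.
print(round(roc_auc(labels, scores), 3))
print(round(average_precision(labels, scores), 3))
```

The pairwise AUC is quadratic in the number of pairs, which is fine for a toy example; a rank-based formulation would be the usual choice for benchmark-sized data.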
Affiliation(s)
- Liwei Liu
  - College of Science, Dalian Jiaotong University, Dalian 116028, China
  - Key Laboratory of Computational Science and Application of Hainan Province, Hainan Normal University, Haikou 571158, China
- Qi Zhang
  - College of Science, Dalian Jiaotong University, Dalian 116028, China
- Yuxiao Wei
  - College of Software, Dalian Jiaotong University, Dalian 116028, China
- Qi Zhao
  - School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114051, China
- Bo Liao
  - Key Laboratory of Computational Science and Application of Hainan Province, Hainan Normal University, Haikou 571158, China

8
Ye Q, Zhang X, Lin X. Drug-Target Interaction Prediction via Graph Auto-Encoder and Multi-Subspace Deep Neural Networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:2647-2658. [PMID: 36107905 DOI: 10.1109/tcbb.2022.3206907] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/15/2023]
Abstract
Computational prediction of drug-target interaction (DTI) is important for new drug discovery. Currently, the deep neural network (DNN) has been widely used in DTI prediction. However, parameters of the DNN could be insufficiently trained and features of the data could be insufficiently utilized, because the DTI data is limited and its dimension is very high. To deal with the above problems, in this paper, a graph auto-encoder and multi-subspace deep neural network (GAEMSDNN) is designed. GAEMSDNN enhances its learning ability with a graph auto-encoder, a subspace layer and an ensemble layer. The graph auto-encoder can preserve the reconstruction information. The subspace layer can obtain different strong feature subsets. The ensemble layer in the GAEMSDNN can comprehensively utilize these strong feature subsets in a unified optimization framework. As a result, more features can be extracted from the network input and the DNN can be better trained. In experiments, the results of GAEMSDNN are significantly improved compared to the previous methods, which validates the effectiveness of our strategies.

9
Li H, Gao Q, Zhang Z, Zhang Y, Ren G. Spatial and temporal prediction of secondary crashes combining stacked sparse auto-encoder and long short-term memory. Accident; Analysis and Prevention 2023; 191:107205. [PMID: 37413700 DOI: 10.1016/j.aap.2023.107205] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/11/2023] [Revised: 06/26/2023] [Accepted: 07/02/2023] [Indexed: 07/08/2023]
Abstract
Secondary crashes occur within the spatial and temporal impact area of primary crashes, resulting in traffic delays and safety problems. While most existing studies focus on the likelihood of secondary crashes, predicting the spatio-temporal location of secondary crashes could offer valuable insights for implementing prevention strategies. This includes guiding the deployment of emergency response measures and determining appropriate speed limits. The main objective of this study is to develop a prediction method for the spatial and temporal locations of secondary crashes. A hybrid deep learning model SSAE-LSTM is proposed by combining stacked sparse auto-encoder (SSAE) and long short-term memory network (LSTM). Traffic and crash data on the California I-880 highway covering the period of 2017-2021 are collected. The identification of secondary crashes is performed by the speed contour map method. The time and distance gaps between primary and secondary crashes are modeled using multiple 5-minute interval traffic variables as inputs. Multiple models are developed for benchmarking purposes, including PCA-LSTM, which incorporates principal component analysis (PCA) and LSTM, SSAE-SVM, which incorporates SSAE and support vector machine (SVM), and back propagation neural network (BPNN). The performance comparison indicates that the hybrid SSAE-LSTM model outperforms the other models in terms of both spatial and temporal prediction. In particular, SSAE4-LSTM1 (with 4 SSAE layers and 1 LSTM layer) demonstrates superior spatial prediction performance, while SSAE4-LSTM2 (with 4 SSAE layers and 2 LSTM layers) excels in temporal prediction. A joint spatio-temporal evaluation is also conducted to measure the overall accuracy of the optimal models over different permitted spatio-temporal ranges. Finally, practical suggestions are provided for secondary crash prevention.
Affiliation(s)
- Haojie Li
  - School of Transportation, Southeast University, China; Jiangsu Key Laboratory of Urban ITS, China; Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, China.
- Qi Gao
  - School of Transportation, Southeast University, China; Jiangsu Key Laboratory of Urban ITS, China; Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, China
- Ziqian Zhang
  - School of Transportation, Southeast University, China; Jiangsu Key Laboratory of Urban ITS, China; Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, China
- Yingheng Zhang
  - School of Transportation, Southeast University, China; Jiangsu Key Laboratory of Urban ITS, China; Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, China
- Gang Ren
  - School of Transportation, Southeast University, China; Jiangsu Key Laboratory of Urban ITS, China; Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, China

10
Zhou L, Wang Y, Peng L, Li Z, Luo X. Identifying potential drug-target interactions based on ensemble deep learning. Front Aging Neurosci 2023; 15:1176400. [PMID: 37396659 PMCID: PMC10309650 DOI: 10.3389/fnagi.2023.1176400] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/28/2023] [Accepted: 05/10/2023] [Indexed: 07/04/2023] Open
Abstract
Introduction Drug-target interaction prediction is one important step in drug research and development. Experimental methods are time consuming and laborious.
Methods In this study, we developed a novel DTI prediction method called EnGDD by combining initial feature acquisition, dimensional reduction, and DTI classification based on Gradient boosting neural network, Deep neural network, and Deep Forest.
Results EnGDD was compared with seven state-of-the-art DTI prediction methods (BLM-NII, NRLMF, WNNGIP, NEDTP, DTi2Vec, RoFDT, and MolTrans) on the nuclear receptor, GPCR, ion channel, and enzyme datasets under cross validations on drugs, targets, and drug-target pairs, respectively. EnGDD computed the best recall, accuracy, F1-score, AUC, and AUPR under the majority of conditions, demonstrating its powerful DTI identification performance. EnGDD predicted that D00182 and hsa2099, D07871 and hsa1813, DB00599 and hsa2562, D00002 and hsa10935 have higher interaction probabilities among unknown drug-target pairs and may be potential DTIs on the four datasets, respectively. In particular, D00002 (Nadide) was identified to interact with hsa10935 (Mitochondrial peroxiredoxin3) whose up-regulation might be used to treat neurodegenerative diseases. Finally, EnGDD was used to find possible drug targets for Parkinson's disease and Alzheimer's disease after confirming its DTI identification performance. The results show that D01277, D04641, and D08969 may be applied to the treatment of Parkinson's disease through targeting hsa1813 (dopamine receptor D2) and D02173, D02558, and D03822 may offer clues for treating patients with Alzheimer's disease through targeting hsa5743 (prostaglandin-endoperoxide synthase 2). The above prediction results need further biomedical validation.
Discussion We anticipate that our proposed EnGDD model can help discover potential therapeutic clues for various diseases including neurodegenerative diseases.
Affiliation(s)
- Liqian Zhou
  - School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Yuzhuang Wang
  - School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Lihong Peng
  - School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Zejun Li
  - School of Computer Science, Hunan Institute of Technology, Hengyang, China
- Xueming Luo
  - School of Computer Science, Hunan University of Technology, Zhuzhou, China

11
Yousefi N, Yazdani-Jahromi M, Tayebi A, Kolanthai E, Neal CJ, Banerjee T, Gosai A, Balasubramanian G, Seal S, Ozmen Garibay O. BindingSite-AugmentedDTA: enabling a next-generation pipeline for interpretable prediction models in drug repurposing. Brief Bioinform 2023; 24:7140297. [PMID: 37096593 DOI: 10.1093/bib/bbad136] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/11/2022] [Revised: 03/02/2022] [Accepted: 03/16/2023] [Indexed: 04/26/2023] Open
Abstract
While research into drug-target interaction (DTI) prediction is fairly mature, generalizability and interpretability are not always addressed in the existing works in this field. In this paper, we propose a deep learning (DL)-based framework, called BindingSite-AugmentedDTA, which improves drug-target affinity (DTA) predictions by reducing the search space of potential-binding sites of the protein, thus making the binding affinity prediction more efficient and accurate. Our BindingSite-AugmentedDTA is highly generalizable as it can be integrated with any DL-based regression model, and it significantly improves their prediction performance. Also, unlike many existing models, our model is highly interpretable due to its architecture and self-attention mechanism, which can provide a deeper understanding of its underlying prediction mechanism by mapping attention weights back to protein-binding sites. The computational results confirm that our framework can enhance the prediction performance of seven state-of-the-art DTA prediction algorithms in terms of four widely used evaluation metrics, including concordance index, mean squared error, modified squared correlation coefficient ($r^2_m$) and the area under the precision curve. We also contribute to three benchmark drug-target interaction datasets by including additional information on the 3D structure of all proteins contained in those datasets, which include the two most commonly used datasets, namely KIBA and Davis, as well as the data from the IDG-DREAM drug-kinase binding prediction challenge. Furthermore, we experimentally validate the practical potential of our proposed framework through in-lab experiments. The relatively high agreement between computationally predicted and experimentally observed binding interactions supports the potential of our framework as the next-generation pipeline for prediction models in drug repurposing.
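Two of the regression metrics named in this abstract, concordance index and mean squared error, are easy to state precisely. The sketch below is a generic illustration under stated assumptions, not the evaluation code used in the paper: the function names and the affinity values are hypothetical, and the concordance index shown is the common pairwise definition in which pairs with equal true affinity are skipped and tied predictions count as half.

```python
def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order."""
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # pairs with equal true affinity are not comparable
            comparable += 1
            diff_true = y_true[i] - y_true[j]
            diff_pred = y_pred[i] - y_pred[j]
            if diff_pred == 0:
                concordant += 0.5  # tied predictions get half credit
            elif (diff_true > 0) == (diff_pred > 0):
                concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def mean_squared_error(y_true, y_pred):
    """Average squared difference between measured and predicted affinities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical measured vs. predicted affinities for four drug-target pairs.
measured = [5.0, 6.2, 7.1, 8.4]
predicted = [5.5, 6.0, 8.0, 7.5]

# Five of the six pairs are ordered correctly; only the last two are swapped.
print(round(concordance_index(measured, predicted), 4))
print(round(mean_squared_error(measured, predicted), 4))
```

The concordance index rewards correct ranking regardless of scale, while mean squared error penalizes the size of each miss, which is one reason DTA papers tend to report both.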
Affiliation(s)
- Niloofar Yousefi
  - Industrial Engineering and Management Systems, University of Central Florida, 32816, 4000 Central Florida Blvd., Orlando, FL, USA
- Mehdi Yazdani-Jahromi
  - Computer Science, University of Central Florida, 32816, 4000 Central Florida Blvd., Orlando, FL, USA
- Aida Tayebi
  - Industrial Engineering and Management Systems, University of Central Florida, 32816, 4000 Central Florida Blvd., Orlando, FL, USA
- Elayaraja Kolanthai
  - College of Medicine, Bionix Cluster, University of Central Florida, 4000 Central Florida Blvd., Orlando 32816, FL, USA
- Craig J Neal
  - College of Medicine, Bionix Cluster, University of Central Florida, 4000 Central Florida Blvd., Orlando 32816, FL, USA
- Tanumoy Banerjee
  - Department of Mechanical Engineering and Mechanics, Lehigh University, Bethlehem 18015, PA, USA
- Ganesh Balasubramanian
  - Department of Mechanical Engineering and Mechanics, Lehigh University, Bethlehem 18015, PA, USA
- Sudipta Seal
  - College of Medicine, Bionix Cluster, University of Central Florida, 4000 Central Florida Blvd., Orlando 32816, FL, USA
  - Advanced Materials Processing and Analysis Center, Department of Materials Science and Engineering, University of Central Florida, 4000 Central Florida Blvd., Orlando 32816, FL, USA
- Ozlem Ozmen Garibay
  - Industrial Engineering and Management Systems, University of Central Florida, 32816, 4000 Central Florida Blvd., Orlando, FL, USA

12
Zhao Q, Duan G, Yang M, Cheng Z, Li Y, Wang J. AttentionDTA: Drug-Target Binding Affinity Prediction by Sequence-Based Deep Learning With Attention Mechanism. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:852-863. [PMID: 35471889 DOI: 10.1109/tcbb.2022.3170365] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 05/04/2023]
Abstract
The identification of drug-target relations (DTRs) is important in drug development. A large number of methods treat DTRs as drug-target interactions (DTIs), a binary classification problem. The main drawbacks of these methods are the lack of reliable negative samples and the absence of many important aspects of DTRs, including their dose dependence and quantitative affinities. With the recent increase in published drug-protein binding affinity data, DTR prediction can be viewed as a regression problem of drug-target affinities (DTAs), which reflect how tightly the drug binds to the target and can present more detailed and specific information than DTIs. The growth of affinity data enables the use of deep learning architectures, which have been shown to be among the state-of-the-art methods in binding affinity prediction. Although relatively effective, due to the black-box nature of deep learning, these models are less biologically interpretable. In this study, we propose a deep learning-based model, named AttentionDTA, which uses an attention mechanism to predict DTAs. Different from the models using 3D structures of drug-target complexes or graph representations of drugs and proteins, the novelty of our work is to use an attention mechanism to focus on key subsequences which are important in drug and protein sequences when predicting their affinity. We use two separate one-dimensional convolutional neural networks (1D-CNNs) to extract the semantic information of the drug's SMILES string and the protein's amino acid sequence. Furthermore, a two-side multi-head attention mechanism is developed and embedded in our model to explore the relationship between drug features and protein features. We evaluate our model on three established DTA benchmark datasets, Davis, Metz, and KIBA. AttentionDTA outperforms the state-of-the-art deep learning methods under different evaluation metrics.
The results show that the attention-based model can effectively extract protein features related to drug information and drug features related to protein information to better predict drug target affinities. It is worth mentioning that we test our model on IC50 dataset, which provides the binding sites between drugs and proteins, to evaluate the ability of our model to locate binding sites. Finally, we visualize the attention weight to demonstrate the biological significance of the model. The source code of AttentionDTA can be downloaded from https://github.com/zhaoqichang/AttentionDTA_TCBB.
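The attention step described in this abstract can be illustrated with a minimal sketch of scaled dot-product attention in which drug-side features attend over protein-side features. This is not the AttentionDTA implementation: it is single-head, has no learned projections, and the feature vectors and the names `cross_attention`, `drug_feats`, and `protein_feats` are hypothetical toy stand-ins for the 1D-CNN outputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query row attends over all key rows.

    Returns (outputs, weights); weights[i][j] is how strongly query i
    attends to key j, and each weights[i] sums to 1.
    """
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        )
    return outputs, weights


# Toy inputs (assumed): 2 drug positions and 3 protein positions, 2 channels each.
drug_feats = [[1.0, 0.0], [0.0, 1.0]]
protein_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Drug positions query the protein positions; attn can be inspected or plotted
# to see which protein positions each drug position weights most heavily.
attended, attn = cross_attention(drug_feats, protein_feats, protein_feats)
```

Swapping the roles of the two inputs gives the protein-to-drug direction, which is one plausible reading of the "two-side" attention the abstract mentions.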
|
13
|
Chu T, Nguyen TT, Hai BD, Nguyen QH, Nguyen T. Graph Transformer for Drug Response Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:1065-1072. [PMID: 36107906 DOI: 10.1109/tcbb.2022.3206888] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Indexed: 05/04/2023]
Abstract
Background: Previous models have shown that learning drug features from their graph representation is more efficient than learning from their string or numeric representations. Furthermore, integrating multi-omics data of cell lines increases the performance of drug response prediction. However, these models have shown drawbacks in extracting drug features from the graph representation and in incorporating redundant information from multi-omics data. This paper proposes a deep learning model, GraTransDRP, to improve drug representation and reduce information redundancy. First, the Graph transformer was utilized to extract the drug representation more efficiently. Next, convolutional neural networks were used to learn the mutation, methylation, and transcriptomics features. However, the dimension of the transcriptomics features was up to 17737. Therefore, KernelPCA was applied to the transcriptomics features to reduce the dimension and transform them into a dense representation before putting them through the CNN model. Finally, drug and omics features were combined to predict a response value by a fully connected network. Experimental results show that our model outperforms some state-of-the-art methods, including GraphDRP and GraOmicDRP.
|
14
|
DoubleSG-DTA: Deep Learning for Drug Discovery: Case Study on the Non-Small Cell Lung Cancer with EGFR T790M Mutation. Pharmaceutics 2023; 15:pharmaceutics15020675. [PMID: 36839996 PMCID: PMC9965659 DOI: 10.3390/pharmaceutics15020675] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2023] [Revised: 02/05/2023] [Accepted: 02/14/2023] [Indexed: 02/19/2023] Open
Abstract
Drug-targeted therapies are promising approaches to treating tumors, and research on receptor-ligand interactions for discovering high-affinity targeted drugs has been accelerating drug development. This study presents a mechanism-driven deep learning-based computational model, termed DoubleSG-DTA, that learns double drug sequences, protein sequences, and drug graphs to project drug-target affinities (DTAs). We deployed lightweight graph isomorphism networks to aggregate drug graph representations and discriminate between molecular structures, and stacked multilayer squeeze-and-excitation networks to selectively enhance spatial features of drug and protein sequences. In addition, cross-multi-head attentions were constructed to further model the non-covalent molecular docking behavior. Multiple cross-validation experiments on various datasets indicated that DoubleSG-DTA consistently outperformed all previously reported works. To showcase the value of DoubleSG-DTA, we applied it to generate promising hit compounds from natural products for Non-Small Cell Lung Cancer harboring the EGFR T790M mutation, which were consistent with reported laboratory studies. Afterward, we further investigated the interpretability of the graph-based "black box" model and highlighted the active structures that contributed the most. DoubleSG-DTA thus provides a powerful and interpretable framework that extrapolates for potential chemicals to modulate the systemic response to disease.
|
15
|
A Novel Autoencoder-Based Feature Selection Method for Drug-Target Interaction Prediction with Human-Interpretable Feature Weights. Symmetry (Basel) 2023. [DOI: 10.3390/sym15010192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2023] Open
Abstract
Drug-target interaction prediction provides important information that could be exploited for drug discovery, drug design, and drug repurposing. Chemogenomic approaches for predicting drug-target interaction assume that similar receptors bind to similar ligands. Capturing this similarity in so-called “fingerprints” and combining the target and ligand fingerprints provide an efficient way to search for protein-ligand pairs that are more likely to interact. In this study, we constructed drug and target fingerprints by employing features extracted from the DrugBank. However, the number of extracted features is quite large, necessitating an effective feature selection mechanism since some features can be redundant or irrelevant to drug-target interaction prediction problems. Although such feature selection methods are readily available in the literature, usually they act as black boxes and do not provide any quantitative information about why a specific feature is preferred over another. To alleviate this lack of human interpretability, we proposed a novel feature selection method in which we used an autoencoder as a symmetric learning method and compared the proposed method to some popular feature selection algorithms, such as Kbest, Variance Threshold, and Decision Tree. The results of a detailed performance study, in which we trained six Multi-Layer Perceptron (MLP) Networks of different sizes and configurations for prediction, demonstrate that the proposed method yields superior results compared to the aforementioned methods.
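For orientation, one of the baseline selectors named in this abstract, the variance threshold, can be sketched in a few lines. This is a generic illustration and not the paper's autoencoder-based method; the function name `variance_threshold` and the toy fingerprint rows are assumptions made for the example.

```python
def variance_threshold(rows, threshold=0.0):
    """Keep the feature columns whose population variance exceeds `threshold`.

    rows: list of equal-length numeric feature vectors (one per sample).
    Returns (kept_column_indices, reduced_rows).
    """
    n = len(rows)
    n_features = len(rows[0])
    keep = []
    for j in range(n_features):
        col = [r[j] for r in rows]
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / n
        if var > threshold:
            keep.append(j)
    return keep, [[r[j] for j in keep] for r in rows]


# Toy binary fingerprints (assumed): column 0 is constant, so it carries no signal.
fingerprints = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
]
kept, reduced = variance_threshold(fingerprints)
# kept == [1, 2]: the constant column 0 is dropped.
```

A selector like this ranks features by spread alone and says nothing about relevance to the interaction label.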
|
16
|
Wang YX, Yang Z, Wang WX, Huang YX, Zhang Q, Li JJ, Tang YP, Yue SJ. Methodology of network pharmacology for research on Chinese herbal medicine against COVID-19: A review. Journal of Integrative Medicine 2022; 20:477-487. [PMID: 36182651 PMCID: PMC9508683 DOI: 10.1016/j.joim.2022.09.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 01/06/2022] [Accepted: 08/15/2022] [Indexed: 12/09/2022]
Abstract
Traditional Chinese medicine, as a complementary and alternative medicine, has been practiced for thousands of years in China and possesses remarkable clinical efficacy. Thus, systematic analysis and examination of the mechanistic links between Chinese herbal medicine (CHM) and the complex human body can benefit contemporary understandings by carrying out qualitative and quantitative analysis. With increasing attention, the approach of network pharmacology has begun to unveil the mystery of CHM by constructing the heterogeneous network relationship of "herb-compound-target-pathway," which corresponds to the holistic mechanisms of CHM. By integrating computational techniques into network pharmacology, the efficiency and accuracy of active compound screening and target fishing have been improved at an unprecedented pace. This review dissects the core innovations to the network pharmacology approach that were developed in the years since 2015 and highlights how this tool has been applied to understanding the coronavirus disease 2019 and refining the clinical use of CHM to combat it.
Affiliation(s)
- Yi-Xuan Wang
- Key Laboratory of Shaanxi Administration of Traditional Chinese Medicine for TCM Compatibility, State Key Laboratory of Research & Development of Characteristic Qin Medicine Resources (Cultivation), and Shaanxi Collaborative Innovation Center of Chinese Medicinal Resources Industrialization, Shaanxi University of Chinese Medicine, Xi'an 712046, Shaanxi Province, China; Department of Scientific Research, Shaanxi Provincial People's Hospital, Xi'an 710068, Shaanxi Province, China
| | - Zhen Yang
- Key Laboratory of Shaanxi Administration of Traditional Chinese Medicine for TCM Compatibility, State Key Laboratory of Research & Development of Characteristic Qin Medicine Resources (Cultivation), and Shaanxi Collaborative Innovation Center of Chinese Medicinal Resources Industrialization, Shaanxi University of Chinese Medicine, Xi'an 712046, Shaanxi Province, China
| | - Wen-Xiao Wang
- Key Laboratory of Shaanxi Administration of Traditional Chinese Medicine for TCM Compatibility, State Key Laboratory of Research & Development of Characteristic Qin Medicine Resources (Cultivation), and Shaanxi Collaborative Innovation Center of Chinese Medicinal Resources Industrialization, Shaanxi University of Chinese Medicine, Xi'an 712046, Shaanxi Province, China
| | - Yu-Xi Huang
- Key Laboratory of Shaanxi Administration of Traditional Chinese Medicine for TCM Compatibility, State Key Laboratory of Research & Development of Characteristic Qin Medicine Resources (Cultivation), and Shaanxi Collaborative Innovation Center of Chinese Medicinal Resources Industrialization, Shaanxi University of Chinese Medicine, Xi'an 712046, Shaanxi Province, China
| | - Qiao Zhang
- Key Laboratory of Shaanxi Administration of Traditional Chinese Medicine for TCM Compatibility, State Key Laboratory of Research & Development of Characteristic Qin Medicine Resources (Cultivation), and Shaanxi Collaborative Innovation Center of Chinese Medicinal Resources Industrialization, Shaanxi University of Chinese Medicine, Xi'an 712046, Shaanxi Province, China
| | - Jia-Jia Li
- Key Laboratory of Shaanxi Administration of Traditional Chinese Medicine for TCM Compatibility, State Key Laboratory of Research & Development of Characteristic Qin Medicine Resources (Cultivation), and Shaanxi Collaborative Innovation Center of Chinese Medicinal Resources Industrialization, Shaanxi University of Chinese Medicine, Xi'an 712046, Shaanxi Province, China
| | - Yu-Ping Tang
- Key Laboratory of Shaanxi Administration of Traditional Chinese Medicine for TCM Compatibility, State Key Laboratory of Research & Development of Characteristic Qin Medicine Resources (Cultivation), and Shaanxi Collaborative Innovation Center of Chinese Medicinal Resources Industrialization, Shaanxi University of Chinese Medicine, Xi'an 712046, Shaanxi Province, China
| | - Shi-Jun Yue
- Key Laboratory of Shaanxi Administration of Traditional Chinese Medicine for TCM Compatibility, State Key Laboratory of Research & Development of Characteristic Qin Medicine Resources (Cultivation), and Shaanxi Collaborative Innovation Center of Chinese Medicinal Resources Industrialization, Shaanxi University of Chinese Medicine, Xi'an 712046, Shaanxi Province, China.
|
17
|
Xie F, Yang Z, Song J, Dai Q, Duan X. DHNLDA: A Novel Deep Hierarchical Network Based Method for Predicting lncRNA-Disease Associations. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:3395-3403. [PMID: 34543201 DOI: 10.1109/tcbb.2021.3113326] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Recent studies have found that lncRNA (long non-coding RNA), a class of ncRNA (non-coding RNA), is not only involved in many biological processes but also abnormally expressed in many complex diseases. Accurate identification of lncRNA-disease associations is of great significance for understanding the function of lncRNA and disease mechanisms. In this paper, a deep learning framework named DHNLDA, consisting of a stacked autoencoder (SAE), a multi-scale ResNet and a stacked ensemble module, was constructed to predict lncRNA-disease associations; it integrates multiple biological data sources and constructs feature matrices. The integrated biological data include the similarities and interactions of lncRNAs, diseases and miRNAs. The feature matrices are obtained by node2vec embedding and feature extraction, respectively. Then, the SAE and the multi-scale ResNet are used to learn the complementary information between nodes, and the high-level features of node attributes are obtained. Finally, the fused high-level features are input into the stacked ensemble module to obtain the prediction results for lncRNA-disease associations. The experimental results of five-fold cross-validation show that the AUC of DHNLDA reaches 0.975, better than the existing methods. Case studies of stomach cancer, breast cancer and lung cancer have shown the great ability of DHNLDA to discover potential lncRNA-disease associations.
|
18
|
Wei Q, Zhang Q, Gao H, Song T, Salhi A, Yu B. DEEPStack-RBP: Accurate identification of RNA-binding proteins based on autoencoder feature selection and deep stacking ensemble classifier. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/31/2022]
|
19
|
Zhang Y, Hu Y, Li H, Liu X. Drug-protein interaction prediction via variational autoencoders and attention mechanisms. Front Genet 2022; 13:1032779. [PMID: 36313473 PMCID: PMC9614151 DOI: 10.3389/fgene.2022.1032779] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/31/2022] [Accepted: 09/30/2022] [Indexed: 09/29/2023] Open
Abstract
During the process of drug discovery, exploring drug-protein interactions (DPIs) is a key step. With the rapid development of biological data, computer-aided methods are much faster than biological experiments. Deep learning methods have become popular and are mainly used to extract the characteristics of drugs and proteins for further DPI prediction. Since the prediction of DPIs through machine learning cannot fully extract effective features, in our work, we propose a deep learning framework that uses variational autoencoders and attention mechanisms; it utilizes convolutional neural networks (CNNs) to obtain local features and attention mechanisms to obtain important information about drugs and proteins, which is very important for predicting DPIs. Compared with some machine learning methods on the C. elegans and human datasets, our approach performs better. On the BindingDB dataset, its accuracy (ACC) and area under the curve (AUC) reach 0.862 and 0.913, respectively. To verify the robustness of the model, multiclass classification tasks are performed on the Davis and KIBA datasets, and the ACC values reach 0.850 and 0.841, respectively, thus further demonstrating the effectiveness of the model.
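The two metrics reported in this abstract, ACC and AUC, can be computed for a binary interaction classifier as in the following minimal sketch. The labels and scores are made-up toy values, not data from the paper, and the 0.5 decision threshold is an assumption.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the 0/1 label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy predictions (assumed): 1 = interacting pair, 0 = non-interacting pair.
labels = [1, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.2]

acc = accuracy(labels, scores)  # 0.75: the positive scored 0.35 is missed
auc = roc_auc(labels, scores)   # 0.75: 3 of 4 positive/negative pairs are ordered correctly
```

ACC depends on the chosen threshold while AUC depends only on the ranking of scores, which is why the two are usually reported together.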
Affiliation(s)
- Yue Zhang
- School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
|
20
|
Yue R, Dutta A. Computational systems biology in disease modeling and control, review and perspectives. NPJ Syst Biol Appl 2022; 8:37. [PMID: 36192551 PMCID: PMC9528884 DOI: 10.1038/s41540-022-00247-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 03/30/2022] [Accepted: 09/05/2022] [Indexed: 02/02/2023] Open
Abstract
Omics-based approaches have become increasingly influential in identifying disease mechanisms and drug responses. Considering that diseases and drug responses are co-expressed and regulated in the relevant omics data interactions, the traditional way of grabbing omics data from single isolated layers cannot always obtain valuable inference. Also, drugs have adverse effects that may impair patients, and launching new medicines for diseases is costly. To resolve the above difficulties, systems biology is applied to predict potential molecular interactions by integrating omics data from genomic, proteomic, transcriptional, and metabolic layers. Combined with known drug reactions, the resulting models improve medicines' therapeutical performance by re-purposing the existing drugs and combining drug molecules without off-target effects. Based on the identified computational models, drug administration control laws are designed to balance toxicity and efficacy. This review introduces biomedical applications and analyses of interactions among gene, protein and drug molecules for modeling disease mechanisms and drug responses. The therapeutical performance can be improved by combining the predictive and computational models with drug administration designed by control laws. The challenges are also discussed for its clinical uses in this work.
Affiliation(s)
- Rongting Yue
- Department of Electrical and Computer Engineering, University of Connecticut, 371 Fairfield Way, Storrs, CT, 06269, USA.
| | - Abhishek Dutta
- Department of Electrical and Computer Engineering, University of Connecticut, 371 Fairfield Way, Storrs, CT, 06269, USA
|
21
|
Severino AGV, de Lima JMM, de Araújo FMU. Industrial Soft Sensor Optimized by Improved PSO: A Deep Representation-Learning Approach. Sensors (Basel, Switzerland) 2022; 22:s22186887. [PMID: 36146235 PMCID: PMC9505118 DOI: 10.3390/s22186887] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/01/2022] [Revised: 08/14/2022] [Accepted: 08/16/2022] [Indexed: 06/07/2023]
Abstract
Soft sensors based on deep learning approaches are growing in popularity due to their ability to extract high-level features from training, improving soft sensors' performance. In the training process of such a deep model, the set of hyperparameters is critical to achieve generalization and reliability. However, choosing the training hyperparameters is a complex task. Usually, a random approach defines the set of hyperparameters, which may not be adequate given the high number of sets and the soft sensing purposes. This work proposes the RB-PSOSAE, a Representation-Based Particle Swarm Optimization with a modified evaluation function to optimize the hyperparameter set of a Stacked AutoEncoder-based soft sensor. The evaluation function considers the mean square error (MSE) of validation and the representation of the features extracted through mutual information (MI) analysis in the pre-training step. By doing this, the RB-PSOSAE computes hyperparameters capable of supporting the training process to generate models with improved generalization and relevant hidden features. As a result, the proposed method can generate more than 16.4% improvement in RMSE compared to another standard PSO-based method and, in some cases, more than 50% improvement compared to traditional methods applied to the same real-world nonlinear industrial process. Thus, the results demonstrate better prediction performance than traditional and state-of-the-art methods.
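A plain particle swarm optimizer over a hyperparameter vector gives a rough picture of the search this abstract builds on. The sketch below is standard global-best PSO, not RB-PSOSAE: it omits the mutual-information term, and `toy_validation_error` is a hypothetical stand-in for training a stacked autoencoder and measuring validation MSE. The inertia and acceleration constants are common textbook defaults, assumed here.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 w=0.729, c1=1.49445, c2=1.49445, seed=0):
    """Global-best PSO; returns (best_position, best_value) within `bounds`."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Clamp the new position to the search bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_validation_error(params):
    """Hypothetical error surface with its minimum at log10(lr) = -3, width = 64."""
    log_lr, width = params
    return (log_lr + 3.0) ** 2 + (width - 64.0) ** 2 / 1000.0


best, best_val = pso_minimize(toy_validation_error,
                              bounds=[(-6.0, -1.0), (8.0, 256.0)])
```

In a real setting each objective call would train a model, so the particle count and iteration budget dominate the cost of the search.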
|
22
|
Guo LX, You ZH, Wang L, Yu CQ, Zhao BW, Ren ZH, Pan J. A novel circRNA-miRNA association prediction model based on structural deep neural network embedding. Brief Bioinform 2022; 23:6694810. [DOI: 10.1093/bib/bbac391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2022] [Revised: 07/14/2022] [Accepted: 08/11/2022] [Indexed: 11/14/2022] Open
Abstract
A growing body of clinical evidence shows that circular ribonucleic acids (circRNAs) perform a very important function in complex diseases by participating in the transcriptional and translational regulation of microRNA (miRNA) target genes. However, discovering circRNA-miRNA associations with high-throughput techniques based on traditional biological experiments, which depend on strict conditions and environments, is labor-intensive, expensive, time-consuming, and inefficient. In this paper, we proposed a novel computational model, called WSCD, based on Word2vec, Structural Deep Network Embedding (SDNE), a Convolutional Neural Network and a Deep Neural Network, which predicts potential circRNA-miRNA associations. Specifically, the WSCD model extracts attribute features and behaviour features by word embedding and graph embedding algorithms, respectively, and ultimately feeds them into a feature fusion model constructed by combining the Convolutional Neural Network and Deep Neural Network to deduce potential circRNA-miRNA interactions. The proposed method was evaluated on a dataset and obtained a prediction accuracy and an area under the receiver operating characteristic curve of 81.61% and 0.8898, respectively, which is shown to be much more accurate than the state-of-the-art models and classifier models. In addition, 23 miRNA-related circRNAs from the top 30 were confirmed in relevant experiences. These results indicate that WSCD would be a helpful and reliable supplementary method for predicting potential miRNA-circRNA associations compared to wet laboratory experiments.
Affiliation(s)
- Lu-Xiang Guo
- College of Information Engineering, Xijing University, Xi’an 710123, China
| | - Zhu-Hong You
- School of Computer Science, Northwestern Polytechnical University, Xi’an 710129, China
| | - Lei Wang
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China
- College of Information Science and Engineering, Zaozhuang University, Shandong 277100, China
| | - Chang-Qing Yu
- College of Information Engineering, Xijing University, Xi’an 710123, China
| | - Bo-Wei Zhao
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
| | - Zhong-Hao Ren
- College of Information Engineering, Xijing University, Xi’an 710123, China
| | - Jie Pan
- Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, College of Life Science, Northwest University, Xi’an 710069, China
|
23
|
Pu Y, Li J, Tang J, Guo F. DeepFusionDTA: Drug-Target Binding Affinity Prediction With Information Fusion and Hybrid Deep-Learning Ensemble Model. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2760-2769. [PMID: 34379594 DOI: 10.1109/tcbb.2021.3103966] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Indexed: 06/13/2023]
Abstract
Identification of drug-target interaction (DTI) is the most important issue in the broad field of drug discovery. Using purely biological experiments to verify drug-target binding profiles takes a lot of time and effort, so computational technologies for this task have great benefits in reducing the drug search space. Most computational methods for predicting DTI are formulated as a binary classification problem, which ignores the influence of binding strength. Therefore, drug-target binding affinity prediction is still a challenging issue. Currently, many studies extract only sequence information, which lacks a feature-rich representation, but we consider more spatial features in order to merge various data in drug and target spaces. In this study, we propose a two-stage deep neural network ensemble model for predicting drug-target binding affinity, called DeepFusionDTA, via various information analysis modules. The first stage utilizes sequence and structure information to generate a fusion feature map of the candidate protein and drug pair through various deep learning-based analysis modules. The second stage applies a bagging-based ensemble learning strategy for regression prediction, and we obtain outstanding results by combining the advantages of various algorithms in efficient feature abstraction and regression calculation. Importantly, compared with the existing prediction tool DeepDTA, our novel method, DeepFusionDTA, delivers a 1.5 percent CI increase on the KIBA dataset and a 1.0 percent increase on the Davis dataset. Furthermore, the ideas we have offered can be applied to in-silico screening of the interaction space, to provide novel DTIs which can be experimentally pursued. The codes and data are available from https://github.com/guofei-tju/DeepFusionDTA.
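The CI reported in this abstract is the concordance index, a ranking metric commonly used for affinity regression. The following is a minimal sketch of the usual pairwise definition; the affinity values are invented for illustration and do not come from KIBA or Davis.

```python
def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Pairs with equal true affinity are skipped; ties in the predictions
    count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # not comparable
            comparable += 1
            hi, lo = (i, j) if y_true[i] > y_true[j] else (j, i)
            if y_pred[hi] > y_pred[lo]:
                concordant += 1.0
            elif y_pred[hi] == y_pred[lo]:
                concordant += 0.5
    if comparable == 0:
        raise ValueError("no pairs with distinct true affinities")
    return concordant / comparable


# Toy affinities (assumed): the model swaps the order of the two strongest binders.
true_affinity = [5.0, 6.2, 7.1, 8.4]
pred_affinity = [5.3, 6.0, 7.5, 7.2]

ci = concordance_index(true_affinity, pred_affinity)  # 5 of 6 pairs correct, about 0.833
```

A CI of 0.5 corresponds to random ordering and 1.0 to a perfect ranking, so gains of a percent or so near the top of the range reflect a small number of re-ordered pairs.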
|
24
|
Dasgupta A, Bakshi A, Mukherjee S, Das K, Talukdar S, Chatterjee P, Mondal S, Das P, Ghosh S, Som A, Roy P, Kundu R, Sarkar A, Biswas A, Paul K, Basak S, Manna K, Saha C, Mukhopadhyay S, Bhattacharyya NP, De RK. Epidemiological challenges in pandemic coronavirus disease (COVID-19): Role of artificial intelligence. Wiley Interdisciplinary Reviews. Data Mining and Knowledge Discovery 2022; 12:e1462. [PMID: 35942397 PMCID: PMC9350133 DOI: 10.1002/widm.1462] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/04/2020] [Revised: 03/28/2022] [Accepted: 04/28/2022] [Indexed: 05/02/2023]
Abstract
The world is now experiencing a major health calamity due to the coronavirus disease (COVID-19) pandemic, caused by severe acute respiratory syndrome coronavirus 2. The foremost challenge facing the scientific community is to explore the growth and transmission capability of the virus. The use of artificial intelligence (AI), such as deep learning, in (i) rapid disease detection from x-ray, computed tomography (CT), or high-resolution CT (HRCT) images, (ii) accurate prediction of the epidemic patterns and their saturation throughout the globe, (iii) forecasting the disease and psychological impact on the population from social networking data, and (iv) prediction of drug-protein interactions for repurposing drugs, has attracted much attention. In the present study, we describe the role of various AI-based technologies for rapid and efficient detection from CT images, complementing quantitative real-time polymerase chain reaction and immunodiagnostic assays. AI-based technologies to anticipate the current pandemic pattern, prevent the spread of disease, and detect face masks are also discussed. We inspect how the virus transmits depending on different factors. We investigate the deep learning technique to assess the affinity of the most probable drugs to treat COVID-19. This article is categorized under: Application Areas > Health Care; Algorithmic Development > Biological Data Mining; Technologies > Machine Learning.
Affiliation(s)
- Abhijit Dasgupta
- Department of Data Science, School of Interdisciplinary Studies, University of Kalyani, Kalyani, Nadia, West Bengal, India
| | - Abhisek Bakshi
- Department of Information Technology, Bengal Institute of Technology, Kolkata, West Bengal, India
| | - Srijani Mukherjee
- Department of Data Science, School of Interdisciplinary Studies, University of Kalyani, Kalyani, Nadia, West Bengal, India
| | - Kuntal Das
- Department of Data Science, School of Interdisciplinary Studies, University of Kalyani, Kalyani, Nadia, West Bengal, India
| | - Soumyajeet Talukdar
- Department of Data Science, School of Interdisciplinary Studies, University of Kalyani, Kalyani, Nadia, West Bengal, India
| | - Pratyayee Chatterjee
- Department of Data Science, School of Interdisciplinary Studies, University of Kalyani, Kalyani, Nadia, West Bengal, India
| | - Sagnik Mondal
- Department of Data Science, School of Interdisciplinary Studies, University of Kalyani, Kalyani, Nadia, West Bengal, India
| | - Puspita Das
- Department of Data Science, School of Interdisciplinary Studies, University of Kalyani, Kalyani, Nadia, West Bengal, India
| | - Subhrojit Ghosh
- Department of Data Science, School of Interdisciplinary Studies, University of Kalyani, Kalyani, Nadia, West Bengal, India
| | - Archisman Som
- Department of Data Science, School of Interdisciplinary Studies, University of Kalyani, Kalyani, Nadia, West Bengal, India
| | - Pritha Roy
- Department of Data Science, School of Interdisciplinary Studies, University of Kalyani, Kalyani, Nadia, West Bengal, India
| | - Rima Kundu
- Department of Data Science, School of Interdisciplinary Studies, University of Kalyani, Kalyani, Nadia, West Bengal, India
| | - Akash Sarkar
- Department of Data Science, School of Interdisciplinary Studies, University of Kalyani, Kalyani, Nadia, West Bengal, India
| | - Arnab Biswas
- Department of Data Science, School of Interdisciplinary Studies, University of Kalyani, Kalyani, Nadia, West Bengal, India
| | - Karnelia Paul
- Department of Biotechnology, University of Calcutta, Kolkata, West Bengal, India
| | - Sujit Basak
- Department of Physiology and Biophysics, Stony Brook University, Stony Brook, New York, USA
| | - Krishnendu Manna
- Department of Food and Nutrition, University of Kalyani, Kalyani, Nadia, West Bengal, India
| | - Chinmay Saha
- Department of Genome Science, School of Interdisciplinary Studies, University of Kalyani, Kalyani, Nadia, West Bengal, India
| | - Satinath Mukhopadhyay
- Department of Endocrinology and Metabolism, Institute of Post Graduate Medical Education and Research and Seth Sukhlal Karnani Memorial Hospital, Kolkata, West Bengal, India
| | - Nitai P. Bhattacharyya
- Department of Endocrinology and Metabolism, Institute of Post Graduate Medical Education and Research and Seth Sukhlal Karnani Memorial Hospital, Kolkata, West Bengal, India
| | - Rajat K. De
- Machine Intelligence Unit, Indian Statistical Institute, Kolkata, West Bengal, India
|
25
|
Cheng Z, Yan C, Wu FX, Wang J. Drug-Target Interaction Prediction Using Multi-Head Self-Attention and Graph Attention Network. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2208-2218. [PMID: 33956632 DOI: 10.1109/tcbb.2021.3077905] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Indexed: 06/12/2023]
Abstract
Identifying drug-target interactions (DTIs) is an important step in the process of new drug discovery and drug repositioning. Accurate predictions of DTIs can improve the efficiency of drug discovery and development. Although rapid advances in deep learning technologies have generated various computational methods, it is still appealing to further investigate how to design efficient networks for predicting DTIs. In this study, we propose an end-to-end deep learning method (called MHSADTI) to predict DTIs based on the graph attention network and multi-head self-attention mechanism. First, the characteristics of drugs and proteins are extracted by the graph attention network and multi-head self-attention mechanism, respectively. Then, the attention scores are used to consider which amino acid subsequence in a protein is more important for the drug when predicting its interactions. Finally, we predict DTIs by a fully connected layer after obtaining the feature vectors of drugs and proteins. MHSADTI takes advantage of the self-attention mechanism to capture long-range contextual relationships in amino acid sequences and to make DTI predictions interpretable. More effective molecular characteristics are also obtained by the attention mechanism in graph attention networks. Multiple cross-validation experiments are adopted to assess the performance of MHSADTI. The experiments on four datasets, human, C. elegans, DUD-E and DrugBank, show that our method outperforms the state-of-the-art methods in terms of AUC, Precision, Recall, AUPR and F1-score. In addition, the case studies further demonstrate that our method can provide effective visualizations to interpret the prediction results from biological insights.
|
26
|
Detecting Drug–Target Interactions with Feature Similarity Fusion and Molecular Graphs. Biology 2022; 11:biology11070967. [PMID: 36101348 PMCID: PMC9312204 DOI: 10.3390/biology11070967] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2022] [Revised: 06/12/2022] [Accepted: 06/24/2022] [Indexed: 12/03/2022]
Abstract
Simple Summary: Accurate identification of potential targets for drugs to interact with can accelerate drug development. The identification of drug–target interactions can provide insights into hidden drug efficacy. This paper presents a prediction model based on feature similarity fusion that can identify crucial features of drugs and targets to help predict drug–target interactions. Abstract: The key to drug discovery is the identification of a target and a corresponding drug compound. Effective identification of drug–target interactions facilitates drug discovery. In this paper, drug similarity and target similarity are considered, and graphical representations are used to extract internal structural information and intermolecular interaction information about drugs and targets. First, drug similarity and target similarity are fused using the similarity network fusion (SNF) method. Then, the graph isomorphism network (GIN) is used to extract features carrying information about the internal structure of drug molecules. For target proteins, feature extraction is carried out using TextCNN to efficiently capture the features of target protein sequences. Three different divisions (CVD, CVP, CVT) are used on the standard dataset, and experiments are carried out separately to validate the performance of the model for drug–target interaction prediction. The experimental results show that our method achieves better results on AUC and AUPR. The docking results also show the superiority of the proposed model in predicting drug–target interactions.
|
27
|
Ahmed F, Lee JW, Samantasinghar A, Kim YS, Kim KH, Kang IS, Memon FH, Lim JH, Choi KH. SperoPredictor: An Integrated Machine Learning and Molecular Docking-Based Drug Repurposing Framework With Use Case of COVID-19. Front Public Health 2022; 10:902123. [PMID: 35784208 PMCID: PMC9244710 DOI: 10.3389/fpubh.2022.902123] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 03/22/2022] [Accepted: 05/02/2022] [Indexed: 12/13/2022] Open
Abstract
The global spread of the SARS coronavirus 2 (SARS-CoV-2), its manifestation in human hosts as a contagious disease, and its variants have induced a pandemic resulting in the deaths of over 6,000,000 people. Extensive efforts have been devoted to drug research to cure COVID-19 and curb its spread, but only one drug has received FDA approval so far. Traditional drug discovery is inefficient, costly, and unable to react to pandemic threats. Drug repurposing represents an effective strategy for drug discovery and reduces the time and cost compared to de novo drug discovery. In this study, a generic drug repurposing framework (SperoPredictor) has been developed which systematically integrates various types of drug and disease data and takes advantage of machine learning (Random Forest, Tree Ensemble, and Gradient Boosted Trees) to repurpose potential drug candidates against any disease of interest. Drug and disease data for FDA-approved drugs (n = 2,865), containing four drug features and three disease features, were collected from chemical and biological databases and integrated in the form of drug-disease association tables. The resulting dataset was split into 70% for training, 15% for testing, and the remaining 15% for validation. The testing and validation accuracies of the models were 99.3% for Random Forest and 99.03% for Tree Ensemble. In practice, SperoPredictor identified 25 potential drug candidates against 6 human host-target proteomes identified from a systematic review of journals. Literature-based validation indicated that 12 of the 25 predicted drugs (48%) have already been used for COVID-19; subsequent molecular docking and re-docking indicated 4 of 13 drugs (30%) as potential candidates against COVID-19 to be validated pre-clinically and clinically. Finally, SperoPredictor results illustrated the ability of the platform to be rapidly deployed to repurpose drugs as a rapid response to emergent situations (like COVID-19 and other pandemics).
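The 70%/15%/15% partition mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the row count (1,000), the fixed seed and the function name are hypothetical, and the abstract does not say whether the authors shuffled or stratified the data.

```python
import random

def split_70_15_15(rows, seed=0):
    # Shuffle a copy so the caller's data is left untouched.
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    # Integer arithmetic avoids floating-point rounding surprises.
    n_train = n * 70 // 100
    n_test = n * 15 // 100
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    # Whatever remains (about 15%) is used for validation.
    validation = shuffled[n_train + n_test:]
    return train, test, validation

# Hypothetical stand-in for rows of a drug-disease association table.
rows = list(range(1000))
train, test, validation = split_70_15_15(rows)
```

Keeping the three subsets disjoint is what makes the reported testing and validation accuracies independent of the training data.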
Affiliation(s)
- Faheem Ahmed
  - Department of Mechatronics Engineering, Jeju National University, Jeju, South Korea
- Jae Wook Lee
  - Department of Mechatronics Engineering, Jeju National University, Jeju, South Korea
  - BioSpero, Inc., Jeju, South Korea
- Kyung Hwan Kim
  - Department of Mechatronics Engineering, Jeju National University, Jeju, South Korea
- In Suk Kang
  - Department of Mechatronics Engineering, Jeju National University, Jeju, South Korea
- Fida Hussain Memon
  - Department of Mechatronics Engineering, Jeju National University, Jeju, South Korea
- Jong Hwan Lim
  - Department of Mechatronics Engineering, Jeju National University, Jeju, South Korea
- Kyung Hyun Choi
  - Department of Mechatronics Engineering, Jeju National University, Jeju, South Korea
  - BioSpero, Inc., Jeju, South Korea
|
28
|
Zhao L, Zhu Y, Wang J, Wen N, Wang C, Cheng L. A brief review of protein-ligand interaction prediction. Comput Struct Biotechnol J 2022; 20:2831-2838. [PMID: 35765652 PMCID: PMC9189993 DOI: 10.1016/j.csbj.2022.06.004] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 03/14/2022] [Revised: 05/30/2022] [Accepted: 06/01/2022] [Indexed: 01/21/2023] Open
Abstract
The task of identifying protein–ligand interactions (PLIs) plays a prominent role in the field of drug discovery. However, it is infeasible to identify potential PLIs via costly and laborious in vitro experiments. There is a need to develop computational PLI prediction approaches to speed up the drug discovery process. In this review, we briefly introduce various computational PLI prediction approaches. We discuss these approaches, in particular machine learning-based methods, with illustrations of different emphases based on mainstream trends. Moreover, we analyze three research directions that can be further explored in future studies.
Affiliation(s)
- Lingling Zhao
  - Faculty of Computing, Harbin Institute of Technology, Harbin, China
- Yan Zhu
  - Faculty of Computing, Harbin Institute of Technology, Harbin, China
- Junjie Wang
  - Department of Medical Informatics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, China
- Naifeng Wen
  - School of Mechanical and Electrical Engineering, Dalian Minzu University, Dalian, China
- Chunyu Wang
  - Faculty of Computing, Harbin Institute of Technology, Harbin, China
  - Corresponding authors.
- Liang Cheng
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
  - NHC and CAMS Key Laboratory of Molecular Probe and Targeted Theranostics, Harbin Medical University, Harbin, China
  - Corresponding authors.
|
29
|
Rani P, Dutta K, Kumar V. Artificial intelligence techniques for prediction of drug synergy in malignant diseases: Past, present, and future. Comput Biol Med 2022; 144:105334. [DOI: 10.1016/j.compbiomed.2022.105334] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/06/2021] [Revised: 02/13/2022] [Accepted: 02/13/2022] [Indexed: 12/22/2022]
|
30
|
Nam S, Kim D, Jung W, Zhu Y. Understanding the Research Landscape of Deep Learning in Biomedical Science: Scientometric Analysis. J Med Internet Res 2022; 24:e28114. [PMID: 35451980 PMCID: PMC9077503 DOI: 10.2196/28114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2021] [Revised: 05/30/2021] [Accepted: 02/20/2022] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Advances in biomedical research using deep learning techniques have generated a large volume of related literature. However, there is a lack of scientometric studies that provide a bird's-eye view of them. This absence has led to a partial and fragmented understanding of the field and its progress. OBJECTIVE This study aimed to gain a quantitative and qualitative understanding of the scientific domain by analyzing diverse bibliographic entities that represent the research landscape from multiple perspectives and levels of granularity. METHODS We searched and retrieved 978 deep learning studies in biomedicine from the PubMed database. A scientometric analysis was performed by analyzing the metadata, content of influential works, and cited references. RESULTS In the process, we identified the current leading fields, major research topics and techniques, knowledge diffusion, and research collaboration. There was a predominant focus on applying deep learning, especially convolutional neural networks, to radiology and medical imaging, whereas a few studies focused on protein or genome analysis. Radiology and medical imaging also appeared to be the most significant knowledge sources and an important field in knowledge diffusion, followed by computer science and electrical engineering. A coauthorship analysis revealed various collaborations among engineering-oriented and biomedicine-oriented clusters of disciplines. CONCLUSIONS This study investigated the landscape of deep learning research in biomedicine and confirmed its interdisciplinary nature. Although it has been successful, we believe that there is a need for diverse applications in certain areas to further boost the contributions of deep learning in addressing biomedical research problems. We expect the results of this study to help researchers and communities better align their present and future work.
Affiliation(s)
- Seojin Nam
  - Department of Library and Information Science, Sungkyunkwan University, Seoul, Republic of Korea
- Donghun Kim
  - Department of Library and Information Science, Sungkyunkwan University, Seoul, Republic of Korea
- Woojin Jung
  - Department of Library and Information Science, Sungkyunkwan University, Seoul, Republic of Korea
- Yongjun Zhu
  - Department of Library and Information Science, Yonsei University, Seoul, Republic of Korea
|
31
|
A Novel Deep Neural Network Technique for Drug–Target Interaction. Pharmaceutics 2022; 14:pharmaceutics14030625. [PMID: 35336000 PMCID: PMC8954728 DOI: 10.3390/pharmaceutics14030625] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/08/2022] [Revised: 03/08/2022] [Accepted: 03/08/2022] [Indexed: 01/20/2023] Open
Abstract
Drug discovery (DD) is a time-consuming and expensive process. Thus, the industry employs strategies such as drug repositioning and drug repurposing, which allow already approved drugs to be applied to a different disease, as occurred in the first months of 2020 during the COVID-19 pandemic. The prediction of drug–target interactions (DTIs) is an essential part of the DD process because it can accelerate it and reduce the required costs. In silico DTI prediction has used approaches based on molecular docking simulations, as well as similarity-based, network-based and graph-based approaches. This paper presents MPS2IT-DTI, a DTI prediction model obtained from research conducted in the following steps: the definition of a new method for encoding molecule and protein sequences onto images, and the definition of a deep-learning approach based on a convolutional neural network in order to create a new method for DTI prediction. Training results obtained with the Davis and KIBA datasets show that MPS2IT-DTI is viable compared to other state-of-the-art (SOTA) approaches in terms of performance and complexity of the neural network model. With the Davis dataset, we obtained 0.876 for the concordance index and 0.276 for the MSE; with the KIBA dataset, we obtained 0.836 and 0.226 for the concordance index and the MSE, respectively. Moreover, the MPS2IT-DTI model represents molecule and protein sequences as images instead of treating them as an NLP task and, as such, does not employ an embedding layer, which is present in other models.
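The two metrics reported in this abstract, the concordance index and the mean squared error (MSE), can be computed as in the sketch below. It uses the common pairwise definition of the concordance index, in which tied predictions count as half; the affinity values are made up for illustration and do not come from the Davis or KIBA datasets.

```python
def mean_squared_error(y_true, y_pred):
    # Average squared difference between measured and predicted affinities.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def concordance_index(y_true, y_pred):
    # Fraction of comparable pairs (different true values) whose predicted
    # order matches the true order; tied predictions count as 0.5.
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # pairs with equal true values are not comparable
            comparable += 1
            diff_true = y_true[i] - y_true[j]
            diff_pred = y_pred[i] - y_pred[j]
            if diff_pred == 0:
                concordant += 0.5
            elif (diff_true > 0) == (diff_pred > 0):
                concordant += 1.0
    return concordant / comparable

# Hypothetical binding affinities and predictions.
y_true = [5.0, 6.0, 7.0, 8.0]
y_pred = [5.5, 5.0, 7.5, 8.0]
ci = concordance_index(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
```

In this toy case five of the six comparable pairs are ordered correctly, so `ci` is 5/6, and `mse` is 0.375. A concordance index of 0.5 corresponds to random ordering and 1.0 to a perfect ranking.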
|
32
|
Yang L, Li LP, Yi HC. DeepWalk based method to predict lncRNA-miRNA associations via lncRNA-miRNA-disease-protein-drug graph. BMC Bioinformatics 2022; 22:621. [PMID: 35216549 PMCID: PMC8875942 DOI: 10.1186/s12859-022-04579-0] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 10/26/2021] [Accepted: 01/18/2022] [Indexed: 01/12/2023] Open
Abstract
BACKGROUND Long non-coding RNAs (lncRNAs) play a crucial role in diverse biological processes and have been confirmed to be associated with various diseases. The physiological roles and functions of lncRNAs remain largely uncharacterized. MicroRNAs (miRNAs), which are usually 20-24 nucleotides long, play several critical regulatory roles in cells. LncRNAs can act as sponges that adsorb miRNAs and thereby indirectly regulate transcription and translation. Thus, the identification of lncRNA-miRNA associations is essential and valuable. RESULTS In our work, we present DWLMI to infer the potential associations between lncRNAs and miRNAs by representing them as vectors via a lncRNA-miRNA-disease-protein-drug graph. Specifically, DeepWalk is used to learn the behavior representation of vertices. Fingerprint, k-mer and MeSH descriptors are used to learn the attribute representation of vertices. By combining these two kinds of information, unknown lncRNA-miRNA associations can be predicted by a random forest classifier. Under five-fold cross-validation, the proposed DWLMI model obtained an average prediction accuracy of 95.22%, with a sensitivity of 94.35% and an AUC of 98.56%. CONCLUSIONS The experimental results demonstrated that DWLMI can effectively predict potential lncRNA-miRNA associated pairs, and the results can provide new insight for non-coding RNA researchers in the field of combining biological big data with deep learning.
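The k-mer descriptor named in this abstract turns a nucleotide sequence into a fixed-length attribute vector. A minimal sketch follows, assuming normalized 3-mer frequencies over the RNA alphabet; the value of k, the normalization and the example sequence are illustrative choices, not details taken from DWLMI.

```python
from itertools import product

def kmer_frequencies(sequence, k=3, alphabet="ACGU"):
    # Enumerate all possible k-mers in a fixed order so that every
    # sequence maps to a vector of the same length (4**k for RNA).
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    # Slide a window of width k along the sequence.
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if window in counts:  # skip windows with unexpected symbols
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]

# Hypothetical short RNA fragment.
features = kmer_frequencies("AUGGCUAUGG", k=3)
```

The resulting 64-dimensional vector can be concatenated with a graph embedding of the same vertex before being passed to a classifier such as a random forest.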
Affiliation(s)
- Long Yang
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Li-Ping Li
  - College of Grassland and Environmental Science, Xinjiang Agricultural University, Urumqi, 830052, China
- Hai-Cheng Yi
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
|
33
|
Fu H, Huang F, Liu X, Qiu Y, Zhang W. MVGCN: data integration through multi-view graph convolutional network for predicting links in biomedical bipartite networks. Bioinformatics 2022; 38:426-434. [PMID: 34499148 DOI: 10.1093/bioinformatics/btab651] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 04/24/2021] [Revised: 08/07/2021] [Accepted: 09/06/2021] [Indexed: 02/05/2023] Open
Abstract
MOTIVATION There are various interaction/association bipartite networks in biomolecular systems. Identifying unobserved links in biomedical bipartite networks helps to understand the underlying molecular mechanisms of human complex diseases and thus benefits the diagnosis and treatment of diseases. Although a great number of computational methods have been proposed to predict links in biomedical bipartite networks, most of them heavily depend on features and structures involving the bioentities in one specific bipartite network, which limits the generalization capacity of applying the models to other bipartite networks. Meanwhile, bioentities usually have multiple features, and how to leverage them has also been challenging. RESULTS In this study, we propose a novel multi-view graph convolution network (MVGCN) framework for link prediction in biomedical bipartite networks. We first construct a multi-view heterogeneous network (MVHN) by combining the similarity networks with the biomedical bipartite network, and then perform a self-supervised learning strategy on the bipartite network to obtain node attributes as initial embeddings. Further, a neighborhood information aggregation (NIA) layer is designed for iteratively updating the embeddings of nodes by aggregating information from inter- and intra-domain neighbors in every view of the MVHN. Next, we combine embeddings of multiple NIA layers in each view, and integrate multiple views to obtain the final node embeddings, which are then fed into a discriminator to predict the existence of links. Extensive experiments show MVGCN performs better than or on par with baseline methods and has the generalization capacity on six benchmark datasets involving three typical tasks. AVAILABILITY AND IMPLEMENTATION Source code and data can be downloaded from https://github.com/fuhaitao95/MVGCN. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Haitao Fu
  - College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Feng Huang
  - College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Xuan Liu
  - College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Yang Qiu
  - College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Wen Zhang
  - College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
|
34
|
Nguyen T, Nguyen GTT, Nguyen T, Le DH. Graph Convolutional Networks for Drug Response Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:146-154. [PMID: 33606633 DOI: 10.1109/tcbb.2021.3060430] [Citation(s) in RCA: 39] [Impact Index Per Article: 19.5] [Indexed: 06/12/2023]
Abstract
BACKGROUND Drug response prediction is an important problem in computational personalized medicine. Many machine-learning-based methods, especially deep learning-based ones, have been proposed for this task. However, these methods often represent drugs as strings, which are not a natural way to depict molecules. Also, interpretation (e.g., which mutations or copy number aberrations contribute to the drug response) has not been considered thoroughly. METHODS In this study, we propose a novel method, GraphDRP, based on graph convolutional networks for the problem. In GraphDRP, drugs were represented as molecular graphs directly capturing the bonds among atoms, while cell lines were depicted as binary vectors of genomic aberrations. Representative features of drugs and cell lines were learned by convolution layers and then combined to represent each drug-cell line pair. Finally, the response value of each drug-cell line pair was predicted by a fully-connected neural network. Four variants of graph convolutional networks were used for learning the features of drugs. RESULTS We found that GraphDRP outperforms tCNNS in all performance measures for all experiments. Also, through saliency maps of the resulting GraphDRP models, we discovered the contribution of the genomic aberrations to the responses. CONCLUSION Representing drugs as graphs can improve the performance of drug response prediction. Availability of data and materials: Data and source code can be downloaded at https://github.com/hauldhut/GraphDRP.
|
35
|
Song T, Wang G, Ding M, Rodriguez-Paton A, Wang X, Wang S. Network-Based Approaches for Drug Repositioning. Mol Inform 2021; 41:e2100200. [PMID: 34970871 DOI: 10.1002/minf.202100200] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/09/2021] [Accepted: 12/05/2021] [Indexed: 12/25/2022]
Abstract
With deep learning creeping up into the ranks of big data, new models based on deep learning and massive data have made great leaps forward rapidly in the field of drug repositioning. However, there is no relevant review to summarize the transformations and development process of models and their data in the field of drug repositioning. Among all the computational methods, network-based methods play an extraordinary role. In view of these circumstances, understanding and comparing existing network-based computational methods applied in drug repositioning will help us recognize the cutting-edge technologies and offer valuable information for relevant researchers. Therefore, in this review, we present an interpretation of the series of important network-based methods applied in drug repositioning, together with their comparisons and development process.
Affiliation(s)
- Tao Song
  - College of Computer Science and Technology, China University of Petroleum, Qingdao, 266580, China
  - Department of Artificial Intelligence, Faculty of Computer Science, Polytechnical University of Madrid, Campus de Montegancedo, Boadilla del Monte, 28660, Madrid, Spain
- Gan Wang
  - College of Computer Science and Technology, China University of Petroleum, Qingdao, 266580, China
- Mao Ding
  - Department of Neurology Medicine, The Second Hospital, Cheeloo College of Medicine, Shandong University, Ji Nan Shi, Jinan, 250033, China
- Alfonso Rodriguez-Paton
  - Department of Artificial Intelligence, Faculty of Computer Science, Polytechnical University of Madrid, Campus de Montegancedo, Boadilla del Monte, 28660, Madrid, Spain
- Xun Wang
  - College of Computer Science and Technology, China University of Petroleum, Qingdao, 266580, China
  - China High Performance Computer Research Center, Institute of Computer Technology, Chinese Academy of Science, Beijing, 100190, Beijing, China
- Shudong Wang
  - College of Computer Science and Technology, China University of Petroleum, Qingdao, 266580, China
|
36
|
Xie G, Li J, Gu G, Sun Y, Lin Z, Zhu Y, Wang W. BGMSDDA: a bipartite graph diffusion algorithm with multiple similarity integration for drug-disease association prediction. Mol Omics 2021; 17:997-1011. [PMID: 34610633 DOI: 10.1039/d1mo00237f] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/21/2022]
Abstract
Drug repositioning, a method that relies on the information from the original drug-disease association matrix, aims to identify new indications for existing drugs and is expected to greatly reduce the cost and time of drug development. However, most current drug repositioning methods use the original drug-disease association matrix directly, without preconditioning. Because relatively few associations between drugs and diseases have been determined from actual observations, the original drug-disease association matrix used in the prediction is sparse, which affects the performance of the prediction method. A method for mining similar features of drugs and diseases is still lacking. To solve these problems, we developed a bipartite graph diffusion algorithm with multiple similarity integration for drug-disease association prediction (BGMSDDA). First, the weighted K nearest known neighbors (WKNKN) algorithm was used to reconstruct the drug-disease association matrix. Second, an effective method was designed to extract similar characteristics of drugs and diseases by integrating linear neighborhood similarity and Gaussian kernel similarity. Finally, bipartite graph diffusion was used to infer undiscovered drug-disease associations. In 10-fold cross-validation experiments, BGMSDDA showed excellent performance on two datasets, with AUC values of 0.939 (Fdataset) and 0.954 (Cdataset), and AUPR values of 0.466 (Fdataset) and 0.565 (Cdataset). Furthermore, to evaluate the accuracy of the results of BGMSDDA, we conducted case studies on three medically used drugs selected from Fdataset and Cdataset and validated the predicted associated diseases of each drug against several databases. Based on the results obtained, BGMSDDA was demonstrated to be useful for predicting drug-disease associations.
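The "Gaussian kernel similarity" in this abstract is not defined here. One widely used form in association prediction is the Gaussian interaction profile kernel, sketched below under that assumption; the bandwidth normalization, the function name and the toy association matrix are illustrative and may differ from what BGMSDDA actually uses.

```python
import math

def gip_kernel(association, bandwidth=1.0):
    # association: binary matrix, one row per drug, one column per disease.
    # Each row is the drug's "interaction profile".
    n = len(association)
    sq_norms = [sum(v * v for v in row) for row in association]
    mean_sq = sum(sq_norms) / n
    if mean_sq == 0:
        raise ValueError("association matrix has no known associations")
    # Normalize the bandwidth by the average squared profile norm.
    gamma = bandwidth / mean_sq
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist_sq = sum((a - b) ** 2
                          for a, b in zip(association[i], association[j]))
            kernel[i][j] = math.exp(-gamma * dist_sq)
    return kernel

# Hypothetical 3 drugs x 3 diseases association matrix.
association = [[1, 0, 1],
               [1, 0, 0],
               [0, 1, 0]]
K = gip_kernel(association)
```

Drugs that share known diseases receive a similarity close to 1, while drugs with disjoint profiles receive a value near 0. The same function applied to the transposed matrix gives a disease-disease similarity.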
Affiliation(s)
- Guobo Xie
  - School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Jianming Li
  - School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Guosheng Gu
  - School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Yuping Sun
  - School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Zhiyi Lin
  - School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Yinting Zhu
  - School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Weiming Wang
  - School of Computer Science, Guangdong University of Technology, Guangzhou, China
|
37
|
Wu X, Zeng W, Lin F, Zhou X. NeuRank: learning to rank with neural networks for drug-target interaction prediction. BMC Bioinformatics 2021; 22:567. [PMID: 34836495 PMCID: PMC8620576 DOI: 10.1186/s12859-021-04476-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/18/2021] [Accepted: 11/08/2021] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND Experimental verification of a drug discovery process is expensive and time-consuming. Therefore, recently, the demand to more efficiently and effectively identify drug-target interactions (DTIs) has intensified. RESULTS We treat the prediction of DTIs as a ranking problem and propose a neural network architecture, NeuRank, to address it. Also, we assume that similar drug compounds are likely to interact with similar target proteins. Thus, in our model, we add drug and target similarities, which are very effective at improving the prediction of DTIs. Then, we develop NeuRank from a point-wise to a pair-wise, and further to list-wise model. CONCLUSION Finally, results from extensive experiments on five public data sets (DrugBank, Enzymes, Ion Channels, G-Protein-Coupled Receptors, and Nuclear Receptors) show that, in identifying DTIs, our models achieve better performance than other state-of-the-art methods.
Affiliation(s)
- Xiujin Wu
  - School of Informatics, Xiamen University, Xiamen, China
- Wenhua Zeng
  - School of Informatics, Xiamen University, Xiamen, China
- Fan Lin
  - School of Informatics, Xiamen University, Xiamen, China
- Xiuze Zhou
  - Shuye Technology Co., Ltd., Hangzhou, China
|
38
|
Zhang Y, Jiang Z, Chen C, Wei Q, Gu H, Yu B. DeepStack-DTIs: Predicting Drug-Target Interactions Using LightGBM Feature Selection and Deep-Stacked Ensemble Classifier. Interdiscip Sci 2021; 14:311-330. [PMID: 34731411 DOI: 10.1007/s12539-021-00488-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 06/26/2021] [Revised: 10/19/2021] [Accepted: 10/21/2021] [Indexed: 12/12/2022]
Abstract
Accurate prediction of drug-target interactions (DTIs), which is often used in the fields of drug discovery and drug repositioning, is regarded as a key challenge in the study of drug science. In this paper, a new method called DeepStack-DTIs is proposed to predict DTIs. First, for the target protein, pseudo-position specific score matrix, pseudo amino acid composition and SPIDER3 are used to extract different feature information. Meanwhile, the path-based fingerprint features of each drug are extracted. Then, the synthetic minority oversampling technique (SMOTE) and light gradient boosting machine (LightGBM) are used for data balancing and feature selection, respectively. Finally, the processed features are input to a deep-stacked ensemble classifier composed of gated recurrent unit (GRU), deep neural network (DNN), support vector machine (SVM), eXtreme gradient boosting (XGBoost) and logistic regression (LR) to predict DTIs. Under five-fold cross-validation, the proposed method achieves higher prediction accuracy than existing methods on the gold standard dataset. To evaluate the predictive power of DeepStack-DTIs, we validate the method on another dataset and predict the drug-target interaction network. The results indicate that DeepStack-DTIs has better predictive ability than the other methods and provides novel insights for the prediction of DTIs. DeepStack-DTIs is a novel method for drug-target interaction prediction. PsePSSM, PseAAC and SPIDER3 are fused to convert protein sequence information, and FP2 to convert drug molecule information, into numerical features. The SMOTE algorithm is used to balance the dataset, and the LightGBM feature selection algorithm is employed to remove redundant and irrelevant features and select the optimal feature subset. This optimal feature subset is input into the deep-stacked ensemble classifier to predict drug-target interactions. The experimental results show that the DeepStack-DTIs method can significantly improve the prediction accuracy of drug-target interactions.
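The data-balancing step named in this abstract, SMOTE, creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbors. The sketch below shows that core idea only; it is a simplified stand-in rather than the authors' pipeline or a library implementation, and the function name, `k` and the toy feature vectors are hypothetical.

```python
import random

def smote(minority, n_synthetic, k=2, seed=0):
    # minority: list of feature vectors from the under-represented class
    # (at least two samples are required).
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbor = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbor.
        lam = rng.random()
        synthetic.append([a + lam * (b - a) for a, b in zip(base, neighbor)])
    return synthetic

# Hypothetical minority-class samples (e.g. known interacting pairs).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = smote(minority, n_synthetic=4)
```

Because every synthetic point lies on a segment between two real minority samples, the new samples stay inside the region the minority class already occupies.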
Collapse
Affiliation(s)
- Yan Zhang
- College of Mechanical and Electrical Engineering, Qingdao University of Science and Technology, Qingdao, 266061, China.,College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, 266061, China.,Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, 266061, China
| | - Zhiwen Jiang
- College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, 266061, China.,Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, 266061, China
| | - Cheng Chen
- School of Computer Science and Technology, Shandong University, Qingdao, 266237, China
| | - Qinqin Wei
- College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, 266061, China.,Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, 266061, China
| | - Haiming Gu
- College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, 266061, China.,Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, 266061, China
| | - Bin Yu
- College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, 266061, China. .,Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, 266061, China. .,Key Laboratory of Computational Science and Application of Hainan Province, Haikou, 571158, China.
| |
|
39
|
Wang L, You ZH, Li JQ, Huang YA. IMS-CDA: Prediction of CircRNA-Disease Associations From the Integration of Multisource Similarity Information With Deep Stacked Autoencoder Model. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:5522-5531. [PMID: 33027025 DOI: 10.1109/tcyb.2020.3022852] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 06/11/2023]
Abstract
Emerging evidence indicates that circular RNA (circRNA) plays an indispensable role in the pathogenesis of human complex diseases and in many critical biological processes. Using circRNA as a molecular marker or therapeutic target opens up a new avenue for the treatment and detection of human complex diseases. Traditional biological experiments, however, are usually limited to a small scale and are time-consuming, so the development of effective and feasible computational approaches for predicting circRNA-disease associations is increasingly favored. In this study, we propose a new computational method, called IMS-CDA, to predict potential circRNA-disease associations based on multisource biological information. More specifically, IMS-CDA combines the information from the disease semantic similarity and the Jaccard and Gaussian interaction profile kernel similarity of diseases and circRNAs, and extracts the hidden features using the stacked autoencoder (SAE) algorithm of deep learning. After training with the rotation forest (RF) classifier, IMS-CDA achieves 88.08% area under the ROC curve with 88.36% accuracy at a sensitivity of 91.38% on the CIRCR2Disease dataset. Compared with the state-of-the-art support vector machine and K-nearest neighbor models and with different descriptor models, IMS-CDA achieves the best overall performance. In the case studies, eight of the top 15 circRNA-disease associations with the highest prediction scores were confirmed by recent literature. These results indicate that IMS-CDA has an outstanding ability to predict new circRNA-disease associations and can provide reliable candidates for biological experiments.
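The two interaction-profile similarities named in this abstract can be sketched as follows. The abstract does not give the exact formulas, so this assumes the commonly used definitions: Jaccard similarity on binary association profiles, and a Gaussian interaction profile (GIP) kernel whose bandwidth is normalised by the mean squared profile norm. The association matrix `assoc` and the parameter `gamma_prime` are hypothetical.

```python
import math


def jaccard(a, b):
    """Jaccard similarity of two binary interaction profiles."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel matrix over the rows of `profiles`.
    Assumes at least one profile is non-zero."""
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in p) for p in profiles) / n
    gamma = gamma_prime / mean_sq_norm  # bandwidth normalised by mean norm

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        [math.exp(-gamma * sq_dist(profiles[i], profiles[j])) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical circRNA-by-disease association matrix (rows: circRNAs).
assoc = [[1, 1, 0], [1, 0, 0], [0, 0, 1]]
kernel = gip_kernel(assoc)
jac_01 = jaccard(assoc[0], assoc[1])
```

Disease-side similarities would be obtained the same way from the columns of the association matrix, that is, by applying both functions to its transpose.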
|
40
|
Xuan P, Hu K, Cui H, Zhang T, Nakaguchi T. Learning multi-scale heterogeneous representations and global topology for drug-target interaction prediction. IEEE J Biomed Health Inform 2021; 26:1891-1902. [PMID: 34673498 DOI: 10.1109/jbhi.2021.3121798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Abstract
Identification of drug-target interactions (DTIs) plays a critical role in drug discovery and repositioning. Deep integration of the inter-connections and intra-similarities between heterogeneous multi-source data related to drugs and targets, however, is a challenging issue. We propose a DTI prediction model that learns from drug- and protein-related multi-scale attributes and from the global topology formed by heterogeneous connections. First, a drug-protein-disease heterogeneous network (RPD-Net) is constructed to associate diverse similarities, interactions and associations across nodes. Second, we propose a multi-scale pairwise deep representation learning module consisting of a new embedding strategy to integrate diverse inter-relations and intra-relations, and dilation convolutions for multi-scale deep representation extraction. A global topology learning module is also proposed, composed of a strategy based on non-negative matrix factorization (NMF) to extract topology from RPD-Net and a new relational-level attention mechanism for discriminative topology embedding. Experimental results on a public dataset demonstrate improved performance over state-of-the-art methods and the contributions of our major innovations. Evaluation by top-k recall rates and case studies on five drugs further show the effectiveness of the model in retrieving potential target candidates for drugs.
|
41
|
Autoencoder-based detection of the residues involved in G protein-coupled receptor signaling. Sci Rep 2021; 11:19867. [PMID: 34615896 PMCID: PMC8494915 DOI: 10.1038/s41598-021-99019-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/14/2021] [Accepted: 09/17/2021] [Indexed: 11/08/2022] Open
Abstract
Regulator binding and mutations alter protein dynamics. The transmission of the signal of these alterations to distant sites through protein motion results in changes in protein expression and cell function. The detection of residues involved in signal transmission contributes to an elucidation of the mechanisms underlying processes as vast as cellular function and disease pathogenesis. We developed an autoencoder (AE) based method that detects residues essential for signaling by comparing the fluctuation data, particularly the time fluctuation of the side-chain distances between residues, during molecular dynamics simulations between the ligand-bound and -unbound forms or wild-type and mutant forms of proteins. Here, the AE-based method was applied to the G protein-coupled receptor (GPCR) system, particularly a class A-type GPCR, CXCR4, to detect the essential residues involved in signaling. Among the residues involved in the signaling of the homolog CXCR2, which were extracted from the literature based on the complex structures of the ligand and G protein, our method could detect more than half of the essential residues involved in G protein signaling, including those spanning the fifth and sixth transmembrane helices in the intracellular region, despite the lack of information regarding the interaction with G protein in our CXCR4 models.
|
42
|
Prediction of Drug-Target Interactions by Combining Dual-Tree Complex Wavelet Transform with Ensemble Learning Method. Molecules 2021; 26:molecules26175359. [PMID: 34500792 PMCID: PMC8433937 DOI: 10.3390/molecules26175359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2021] [Revised: 08/27/2021] [Accepted: 08/30/2021] [Indexed: 11/17/2022] Open
Abstract
Identification of drug–target interactions (DTIs) is vital for drug discovery. However, traditional biological approaches have some unavoidable shortcomings, such as being time-consuming and expensive. Therefore, there is an urgent need to develop novel and effective computational methods to predict DTIs in order to shorten the development cycles of new drugs. In this study, we present a novel computational approach to identify DTIs, which uses protein sequence information and the dual-tree complex wavelet transform (DTCWT). More specifically, a position-specific scoring matrix (PSSM) was computed from the target protein sequence to obtain its evolutionary information. Then, DTCWT was used to extract representative features from the PSSM, which were combined with the drug fingerprint features to form the feature descriptors. Finally, these descriptors were sent to the Rotation Forest (RoF) model for classification. A 5-fold cross validation (CV) was adopted on four datasets (Enzyme, Ion Channel, GPCRs (G-protein-coupled receptors), and NRs (Nuclear Receptors)) to validate the proposed model; our method yielded high average accuracies of 89.21%, 85.49%, 81.02%, and 74.44%, respectively. To further verify the performance of our model, we compared the RoF classifier with two state-of-the-art algorithms: the support vector machine (SVM) and the k-nearest neighbor (KNN) classifier. We also compared it with some other published methods. Moreover, the prediction results for the independent dataset further indicated that our method is effective for predicting potential DTIs. Thus, we believe that our method is suitable for facilitating drug discovery and development.
|
43
|
Ru X, Ye X, Sakurai T, Zou Q, Xu L, Lin C. Current status and future prospects of drug-target interaction prediction. Brief Funct Genomics 2021; 20:312-322. [PMID: 34189559 DOI: 10.1093/bfgp/elab031] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/08/2021] [Revised: 06/01/2021] [Accepted: 06/04/2021] [Indexed: 01/09/2023] Open
Abstract
Drug-target interaction prediction is important for drug development and drug repurposing. Many computational methods have been proposed for drug-target interaction prediction because of their potential to reduce time and cost. In this review, we introduce the molecular docking and machine learning-based methods that have been widely applied to drug-target interaction prediction. In particular, machine learning-based methods are divided into different types according to the data processing form and task type. For each type of method, we provide a specific description and propose some solutions to improve its capability. Knowledge of heterogeneous networks and learning to rank is also summarized in this review. As far as we know, this is the first comprehensive review that summarizes heterogeneous networks and learning to rank in drug-target interaction prediction. Moreover, we propose three aspects that can be explored in depth in future research.
Affiliation(s)
| | - Xiucai Ye
- Department of Computer Science, and Center for Artificial Intelligence Research (C-AIR), University of Tsukuba
| | - Tetsuya Sakurai
- Department of Computer Science and is the director of the C-AIR, University of Tsukuba
| | - Quan Zou
- University of Electronic Science and Technology of China
| | - Lei Xu
- School of Electronic and Communication Engineering, Shenzhen Polytechnic
| | | |
|
44
|
Xu L, Ru X, Song R. Application of Machine Learning for Drug-Target Interaction Prediction. Front Genet 2021; 12:680117. [PMID: 34234813 PMCID: PMC8255962 DOI: 10.3389/fgene.2021.680117] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/13/2021] [Accepted: 05/28/2021] [Indexed: 11/13/2022] Open
Abstract
Exploring drug–target interactions by biomedical experiments requires a lot of human, financial, and material resources. To save time and cost to meet the needs of the present generation, machine learning methods have been introduced into the prediction of drug–target interactions. The large amount of available drug and target data in existing databases, the evolving and innovative computer technologies, and the inherent characteristics of various types of machine learning have made machine learning techniques the mainstream method for drug–target interaction prediction research. In this review, details of the specific applications of machine learning in drug–target interaction prediction are summarized, the characteristics of each algorithm are analyzed, and the issues that need to be further addressed and explored for future research are discussed. The aim of this review is to provide a sound basis for the construction of high-performance models.
Affiliation(s)
- Lei Xu
- School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, China
| | - Xiaoqing Ru
- Department of Computer Science, University of Tsukuba, Tsukuba, Japan
| | - Rong Song
- School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, China
| |
|
45
|
Predicting Drug-Target Interactions Based on the Ensemble Models of Multiple Feature Pairs. Int J Mol Sci 2021; 22:ijms22126598. [PMID: 34202954 PMCID: PMC8234024 DOI: 10.3390/ijms22126598] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2021] [Revised: 06/09/2021] [Accepted: 06/16/2021] [Indexed: 11/30/2022] Open
Abstract
Background: The prediction of drug–target interactions (DTIs) is of great significance in drug development. Traditional experimental methods are time-consuming and expensive. Machine learning can reduce the cost of prediction but is limited by imbalanced datasets and by the problem of selecting essential features. Methods: A prediction method based on the Ensemble model of Multiple Feature Pairs (Ensemble-MFP) is introduced. First, three negative sets are generated according to the Euclidean distance of three feature pairs. Then, the negative samples of the validation set/test set are randomly selected from the union of the three negative sets in the validation set/test set. At the same time, the weighted ensemble model is optimized and applied to the test set. Results: The area under the receiver operating characteristic curve (area under ROC, AUC) in three out of four sub-datasets of the gold standard datasets was more than 94.0% in the prediction of new drugs. The effectiveness of the proposed method is also shown by comparison with state-of-the-art methods and by demonstration of predicted drug–target pairs. Conclusion: Ensemble-MFP can weigh the existing feature pairs and performs well for general prediction on new drugs.
|
46
|
Zeng Y, Chen X, Luo Y, Li X, Peng D. Deep drug-target binding affinity prediction with multiple attention blocks. Brief Bioinform 2021; 22:6231754. [PMID: 33866349 PMCID: PMC8083346 DOI: 10.1093/bib/bbab117] [Citation(s) in RCA: 42] [Impact Index Per Article: 14.0] [Received: 12/16/2021] [Revised: 02/12/2021] [Accepted: 03/13/2021] [Indexed: 11/23/2022] Open
Abstract
Drug-target interaction (DTI) prediction has drawn increasing interest due to its important position in the drug discovery process. Many studies have introduced computational models that treat DTI prediction as a regression task and directly predict the binding affinity of drug-target pairs. However, existing studies (i) ignore the essential correlations between atoms when encoding drug compounds and (ii) model the interaction of drug-target pairs simply by concatenation. Based on those observations, in this study, we propose an end-to-end model with multiple attention blocks to predict the binding affinity scores of drug-target pairs. Our proposed model offers the abilities to (i) encode the correlations between atoms by a relation-aware self-attention block and (ii) model the interaction of drug representations and target representations by the multi-head attention block. Experimental results of DTI prediction on two benchmark datasets show that our approach outperforms existing methods, benefiting from the correlation information encoded by the relation-aware self-attention block and the interaction information extracted by the multi-head attention block. Moreover, we conduct experiments on the effect of the max relative position length and find the best max relative position length value k ∈ {3, 5}. Furthermore, we apply our model to predict the binding affinity of Corona Virus Disease 2019 (COVID-19)-related genome sequences and 3137 FDA-approved drugs.
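The idea of modelling drug-target interaction with attention rather than concatenation can be illustrated with a single-head scaled dot-product attention sketch. It is a simplification and not the authors' model: there are no learned projections, no multiple heads and no relation-aware position terms, and the token vectors and the name `cross_attention` are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value vectors, one output row per query.
        outputs.append(
            [sum(wi * v[c] for wi, v in zip(w, values)) for c in range(len(values[0]))]
        )
    return outputs, weights


# Hypothetical embeddings: two drug tokens attend over three protein tokens.
drug_tokens = [[1.0, 0.0], [0.0, 1.0]]
protein_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, weights = cross_attention(drug_tokens, protein_tokens, protein_tokens)
```

Each row of `weights` shows how strongly one drug token attends to each protein token, which is the interaction signal that plain concatenation of the two representations cannot express.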
Affiliation(s)
- Yuni Zeng
- College of Computer Science, Sichuan University, Chengdu, Sichuan,610065, China
| | - Xiangru Chen
- College of Computer Science, Sichuan University, Chengdu, Sichuan,610065, China
| | - Yujie Luo
- Shenzhen Peng Cheng Laboratory, Shenzhen, 518052, China
| | - Xuedong Li
- Chengdu Sobey Digital Technology Co., Ltd, Chengdu, 610041,China
| | - Dezhong Peng
- College of Computer Science, Sichuan University, Chengdu, Sichuan,610065, China
| |
|
47
|
Yang S, Zhu F, Ling X, Liu Q, Zhao P. Intelligent Health Care: Applications of Deep Learning in Computational Medicine. Front Genet 2021; 12:607471. [PMID: 33912213 PMCID: PMC8075004 DOI: 10.3389/fgene.2021.607471] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 09/17/2020] [Accepted: 03/05/2021] [Indexed: 12/24/2022] Open
Abstract
With the progress of medical technology, the biomedical field has entered the era of big data, and on this basis, driven by artificial intelligence technology, computational medicine has emerged. The effective information contained in these large biomedical datasets needs to be extracted to promote the development of precision medicine. Traditionally, machine learning methods are used to mine biomedical data for features, which generally relies on feature engineering and expert domain knowledge and requires tremendous time and human resources. Unlike traditional approaches, deep learning, as a cutting-edge branch of machine learning, can automatically learn complex and robust features from raw data without the need for feature engineering. The applications of deep learning in medical imaging, electronic health records, genomics, and drug development are reviewed, and the findings suggest that deep learning has a clear advantage in making full use of biomedical data and improving the level of medical care. Deep learning plays an increasingly important role in the field of medical health and has broad application prospects. However, problems and challenges of deep learning in computational medicine still exist, including insufficient data, interpretability, data privacy, and heterogeneity. Analysis and discussion of these problems provide a reference for improving the application of deep learning in medical health.
Affiliation(s)
- Sijie Yang
- School of Computer Science and Technology, Soochow University, Suzhou, China
| | - Fei Zhu
- School of Computer Science and Technology, Soochow University, Suzhou, China
| | - Xinghong Ling
- School of Computer Science and Technology, Soochow University, Suzhou, China
- WenZheng College of Soochow University, Suzhou, China
| | - Quan Liu
- School of Computer Science and Technology, Soochow University, Suzhou, China
| | - Peiyao Zhao
- School of Computer Science and Technology, Soochow University, Suzhou, China
| |
|
48
|
Mahmud SMH, Chen W, Liu Y, Awal MA, Ahmed K, Rahman MH, Moni MA. PreDTIs: prediction of drug-target interactions based on multiple feature information using gradient boosting framework with data balancing and feature selection techniques. Brief Bioinform 2021; 22:6168499. [PMID: 33709119 PMCID: PMC7989622 DOI: 10.1093/bib/bbab046] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 10/13/2020] [Revised: 01/25/2021] [Accepted: 01/29/2021] [Indexed: 12/13/2022] Open
Abstract
Discovering drug–target (protein) interactions (DTIs) is of great significance for researching and developing novel drugs and offers a tremendous advantage to pharmaceutical industries and patients. However, the prediction of DTIs using wet-lab experimental methods is generally expensive and time-consuming. Therefore, different machine learning-based methods have been developed for this purpose, but many unknown interactions still remain to be discovered. Furthermore, data imbalance and feature dimensionality are critical challenges in drug-target datasets; they can degrade classifier performance and have not yet been adequately addressed. This paper proposes a novel drug–target interaction prediction method called PreDTIs. First, the feature vectors of the protein sequence are extracted by the pseudo-position-specific scoring matrix (PsePSSM), dipeptide composition (DC) and pseudo amino acid composition (PseAAC), and the drug is encoded with MACCS substructure fingerprints. In addition, we propose a FastUS algorithm to handle the class imbalance problem and develop a MoIFS algorithm to remove irrelevant and redundant features and obtain the optimal features. Finally, the balanced and optimal features are provided to the LightGBM classifier to identify DTIs, and 5-fold cross-validation is applied to evaluate the prediction ability of the proposed method. Prediction results indicate that the proposed model PreDTIs is significantly superior to other existing methods in predicting DTIs, and our model could be used to discover new drugs for unknown disorders or infections, such as coronavirus disease 2019, using existing drug compounds and severe acute respiratory syndrome coronavirus 2 protein sequences.
Affiliation(s)
- S M Hasan Mahmud
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
| | - Wenyu Chen
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
| | - Yongsheng Liu
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
| | - Md Abdul Awal
- Electronics and Communication Engineering Discipline, Khulna University, Khulna 9208, Bangladesh
| | - Kawsar Ahmed
- Department of Information and Communication Technology, Mawlana Bhashani Science and Technology University, Santosh, Tangail-1902, Bangladesh
| | - Md Habibur Rahman
- Department of Computer Science and Engineering, Islamic University, Kushtia-7003, Bangladesh
| | - Mohammad Ali Moni
- UNSW Digital Health, WHO Center for eHealth, School of Public Health and Community Medicine, Faculty of Medicine, The University of New South Wales, Sydney, Australia
| |
|
49
|
Shim J, Hong ZY, Sohn I, Hwang C. Prediction of drug-target binding affinity using similarity-based convolutional neural network. Sci Rep 2021; 11:4416. [PMID: 33627791 PMCID: PMC7904939 DOI: 10.1038/s41598-021-83679-y] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 10/20/2020] [Accepted: 01/18/2021] [Indexed: 12/02/2022] Open
Abstract
Identifying novel drug–target interactions (DTIs) plays an important role in drug discovery. Most of the computational methods developed for predicting DTIs use binary classification, whose goal is to determine whether or not a drug–target (DT) pair interacts. However, it is more meaningful, but also more challenging, to predict the binding affinity that describes the strength of the interaction between a DT pair. If the binding affinity is not sufficiently large, such a drug may not be useful. Therefore, methods for predicting DT binding affinities are very valuable. The increase in novel public affinity data available in DT-related databases enables advanced deep learning techniques to be used to predict binding affinities. In this paper, we propose a similarity-based model that applies a 2-dimensional (2D) convolutional neural network (CNN) to the outer products between column vectors of two similarity matrices for the drugs and targets to predict DT binding affinities. To the best of our knowledge, this is the first application of a 2D CNN in similarity-based DT binding affinity prediction. The validation results on multiple public datasets show that the proposed model is an effective approach for DT binding affinity prediction and can be quite helpful in the drug development process.
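The input construction described in this abstract, an outer product between one column of the drug similarity matrix and one column of the target similarity matrix, can be sketched as follows. The similarity values and the function name `outer_product_input` are hypothetical, and the CNN that would consume the resulting matrix is not shown.

```python
def outer_product_input(drug_sim, target_sim, i, j):
    """Build the 2D input for drug i and target j as the outer product of
    column i of the drug similarity matrix and column j of the target
    similarity matrix."""
    drug_col = [row[i] for row in drug_sim]
    target_col = [row[j] for row in target_sim]
    return [[d * t for t in target_col] for d in drug_col]


# Hypothetical similarity matrices: 3 drugs and 2 targets.
drug_sim = [
    [1.0, 0.2, 0.5],
    [0.2, 1.0, 0.1],
    [0.5, 0.1, 1.0],
]
target_sim = [
    [1.0, 0.4],
    [0.4, 1.0],
]

# Input "image" for the pair (drug 0, target 1): shape 3 x 2.
pair_input = outer_product_input(drug_sim, target_sim, i=0, j=1)
```

Entry (a, b) of the result multiplies the similarity of drug a to the query drug by the similarity of target b to the query target, so each DT pair becomes an image-like matrix of size (number of drugs) by (number of targets).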
Affiliation(s)
- Jooyong Shim
- Department of Statistics, Institute of Statistical Information, Inje University, Gimhae, Gyeongsangnamdo, South Korea
| | | | | | - Changha Hwang
- Department of Applied Statistics, Dankook University, Yongin, Gyeonggido, 16890, South Korea.
| |
|
50
|
Peng J, Wang Y, Guan J, Li J, Han R, Hao J, Wei Z, Shang X. An end-to-end heterogeneous graph representation learning-based framework for drug-target interaction prediction. Brief Bioinform 2021; 22:6124914. [PMID: 33517357 DOI: 10.1093/bib/bbaa430] [Citation(s) in RCA: 64] [Impact Index Per Article: 21.3] [Received: 10/21/2020] [Revised: 12/01/2020] [Accepted: 12/23/2020] [Indexed: 12/28/2022] Open
Abstract
Accurately identifying potential drug-target interactions (DTIs) is a key step in drug discovery. Although many experimental studies have been carried out to identify DTIs in the past few decades, biological experiment-based DTI identification is still time-consuming and expensive. Therefore, it is of great significance to develop effective computational methods for identifying DTIs. In this paper, we develop a novel 'end-to-end' learning-based framework based on heterogeneous 'graph' convolutional networks for 'DTI' prediction called end-to-end graph (EEG)-DTI. Given a heterogeneous network containing multiple types of biological entities (i.e. drug, protein, disease, side-effect), EEG-DTI learns the low-dimensional feature representation of drugs and targets using a graph convolutional network-based model and predicts DTIs based on the learned features. During the training process, EEG-DTI learns the feature representation of nodes in an end-to-end mode. The evaluation test shows that EEG-DTI performs better than existing state-of-the-art methods. The data and source code are available at: https://github.com/MedicineBiology-AI/EEG-DTI.
Affiliation(s)
- Jiajie Peng
- School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China
| | - Yuxian Wang
- School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China.,Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi'an 710072, China
| | - Jiaojiao Guan
- School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China.,Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi'an 710072, China
| | - Jingyi Li
- School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China.,Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi'an 710072, China
| | - Ruijiang Han
- School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China
| | - Jianye Hao
- College of Intelligence and Computing, Tianjin University, Tianjin 300072, China
| | - Zhongyu Wei
- School of Data Science, Fudan University, Shanghai 200433, China
| | - Xuequn Shang
- School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China.,Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi'an 710072, China
| |
|