1
Yang S, Bai M, Liu W, Li W, Zhong Z, Kwok LY, Dong G, Sun Z. Predicting Lactobacillus delbrueckii subsp. bulgaricus-Streptococcus thermophilus interactions based on a highly accurate semi-supervised learning method. SCIENCE CHINA. LIFE SCIENCES 2025; 68:558-574. [PMID: 39417929 DOI: 10.1007/s11427-023-2569-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Accepted: 03/15/2024] [Indexed: 10/19/2024]
Abstract
Lactobacillus delbrueckii subsp. bulgaricus (L. bulgaricus) and Streptococcus thermophilus (S. thermophilus) are commonly used starters in milk fermentation. Fermentation experiments have revealed that L. bulgaricus-S. thermophilus interactions (LbStI) substantially impact dairy product quality and production. Because traditional wet-lab experiments for screening interaction combinations are time-consuming and labor-intensive, an artificial intelligence-based method for screening interactive starter combinations is necessary. However, in current research on artificial intelligence-based interaction prediction in bioinformatics, most successful models adopt supervised learning methods, and interaction prediction with only a small number of labeled samples remains under-studied. Hence, this study aimed to develop a semi-supervised learning framework for predicting LbStI using genomic data from 362 isolates (181 per species). The framework consisted of a two-part model: a co-clustering prediction model (based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) dataset) and a Laplacian regularized least squares prediction model (based on K-mer analysis and gene composition datasets of all isolates). To enhance accuracy, we integrated the separate outcomes produced by each component of the two-part model to generate the final LbStI prediction results. Validation through milk fermentation experiments confirmed a high precision rate of 85% (17/20; validated with 20 randomly selected combinations of expected interacting isolates). Our data suggest that the biosynthetic pathways of cysteine, riboflavin, teichoic acid, and exopolysaccharides, as well as the ATP-binding cassette transport systems, contribute to the mutualistic relationship between these starter bacteria during milk fermentation. However, this finding requires further experimental verification. The presented model and data are valuable resources for academics and industry professionals interested in screening dairy starter cultures and understanding their interactions.
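The abstract names Laplacian regularized least squares (LapRLS) as one of the two model components but does not give its formulation. As a rough illustration only, the sketch below uses a common kernel-form closed solution, F = S (S + lam_a I + lam_i L S)^(-1) Y, where S is an isolate similarity matrix and L its normalized graph Laplacian. The function names, the regularization weights, and the toy matrices are all hypothetical and are not taken from the paper.

```python
import numpy as np

def normalized_laplacian(S):
    """Symmetric normalized graph Laplacian: I - D^{-1/2} S D^{-1/2}."""
    d_inv_sqrt = 1.0 / np.sqrt(S.sum(axis=1))
    return np.eye(S.shape[0]) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]

def laprls_scores(S, Y, lam_a=0.1, lam_i=0.5):
    """Closed-form LapRLS scores F = S (S + lam_a*I + lam_i*L*S)^{-1} Y.

    S: (n, n) symmetric similarity (kernel) matrix among isolates of one species.
    Y: (n, m) label matrix; 1 = known interaction, 0 = unknown/unlabeled.
    lam_a, lam_i: assumed regularization weights (hypothetical defaults).
    """
    n = S.shape[0]
    L = normalized_laplacian(S)
    A = S + lam_a * np.eye(n) + lam_i * (L @ S)
    return S @ np.linalg.solve(A, Y)

# Toy data (hypothetical): 3 isolates of one species, 2 of the other.
S = np.array([[1.0, 0.8, 0.1],
              [0.8, 1.0, 0.2],
              [0.1, 0.2, 1.0]])
Y = np.array([[1.0, 0.0],
              [0.0, 0.0],
              [0.0, 1.0]])
F = laprls_scores(S, Y)
print(F.round(3))
```

In this toy case the unlabeled isolate (row 1) is most similar to row 0, so it inherits a higher score for the first partner than row 2 does, which is the smoothing behaviour the Laplacian term is meant to provide.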
Affiliation(s)
- Shujuan Yang
- Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Collaborative Innovative Center for Lactic Acid Bacteria and Fermented Dairy Products, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Mei Bai
- Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Collaborative Innovative Center for Lactic Acid Bacteria and Fermented Dairy Products, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Weichi Liu
- College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application of Agriculture and Animal Husbandry, Hohhot, 010018, China
- Weicheng Li
- Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Collaborative Innovative Center for Lactic Acid Bacteria and Fermented Dairy Products, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Zhi Zhong
- Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Collaborative Innovative Center for Lactic Acid Bacteria and Fermented Dairy Products, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Lai-Yu Kwok
- Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Collaborative Innovative Center for Lactic Acid Bacteria and Fermented Dairy Products, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Gaifang Dong
- College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, 010018, China.
- Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application of Agriculture and Animal Husbandry, Hohhot, 010018, China.
- Zhihong Sun
- Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China.
- Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University, Hohhot, 010018, China.
- Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University, Hohhot, 010018, China.
- Collaborative Innovative Center for Lactic Acid Bacteria and Fermented Dairy Products, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China.
2
Li J, Zhang X, Li B, Li Z, Chen Z. MDFGNN-SMMA: prediction of potential small molecule-miRNA associations based on multi-source data fusion and graph neural networks. BMC Bioinformatics 2025; 26:13. [PMID: 39806287 PMCID: PMC11730471 DOI: 10.1186/s12859-025-06040-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2024] [Accepted: 01/06/2025] [Indexed: 01/16/2025] Open
Abstract
BACKGROUND MicroRNAs (miRNAs) are pivotal in the initiation and progression of complex human diseases and have been identified as targets for small molecule (SM) drugs. However, the expensive and time-intensive nature of conventional experimental techniques for identifying SM-miRNA associations highlights the need for efficient computational methods in this field. RESULTS In this study, we proposed a deep learning method called Multi-source Data Fusion and Graph Neural Networks for Small Molecule-MiRNA Association (MDFGNN-SMMA) to predict potential SM-miRNA associations. First, MDFGNN-SMMA extracted Atom Pairs fingerprints and Molecular ACCess System fingerprints to derive fusion feature vectors for small molecules (SMs). K-mer features were employed to generate the initial feature vectors for miRNAs. Second, cosine similarity measures were computed to construct the adjacency matrices for SMs and miRNAs, respectively. Third, these feature vectors and adjacency matrices were input into a model comprising GAT and GraphSAGE, which generated the final feature vectors for SMs and miRNAs. Finally, the averaged final feature vectors were used as input for a multilayer perceptron to predict the associations between SMs and miRNAs. CONCLUSIONS The performance of MDFGNN-SMMA was assessed using 10-fold cross-validation, demonstrating superior performance compared with four state-of-the-art models in terms of both AUC and AUPR. Moreover, the experimental results on an independent test set confirmed the model's generalization capability. Additionally, the efficacy of MDFGNN-SMMA was substantiated through three case studies. The findings indicated that among the top 50 predicted miRNAs associated with Cisplatin, 5-Fluorouracil, and Doxorubicin, 42, 36, and 36 miRNAs, respectively, were corroborated by existing literature and the RNAInter database.
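The first two steps described for miRNAs (K-mer feature vectors, then cosine similarity between them) can be sketched in a few lines. This is a minimal illustration, assuming k = 3, an ACGU alphabet, and raw counts as features; the abstract does not state these choices, nor how similarities are turned into an adjacency matrix, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt

def kmer_vector(seq, k=3, alphabet="ACGU"):
    """Count every overlapping k-mer and return counts in a fixed order.

    K-mers containing symbols outside `alphabet` are ignored.
    """
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts["".join(p)] for p in product(alphabet, repeat=k)]

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def similarity_matrix(vectors):
    """Pairwise cosine similarities as a list of rows."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]

# Made-up sequences for illustration only.
seqs = ["UGAGGUAGUAGGUUGUAUAGUU", "UGAGGUAGUAGGUUGUGUGGUU", "CAAAGUGCUUACAGUGCAGGUAG"]
sim = similarity_matrix([kmer_vector(s) for s in seqs])
print([[round(x, 2) for x in row] for row in sim])
```

A similarity matrix like `sim` would still need a rule (for example a threshold or top-k neighbours) to become a graph adjacency matrix; that rule is not specified in the abstract.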
Affiliation(s)
- Jianwei Li
- School of Artificial Intelligence, Hebei University of Technology, Tianjin, 300401, China
- Xukun Zhang
- School of Artificial Intelligence, Hebei University of Technology, Tianjin, 300401, China
- Bing Li
- School of Artificial Intelligence, Hebei University of Technology, Tianjin, 300401, China
- Ziyu Li
- School of Artificial Intelligence, Hebei University of Technology, Tianjin, 300401, China
- Zhenzhen Chen
- Beijing Institute of Heart Lung and Blood Vessel Diseases, Beijing Anzhen Hospital of Capital Medical University, Beijing, 101100, China.
3
Zhang Q, Wei Y, Liu L. A Domain Adaptive Interpretable Substructure-Aware Graph Attention Network for Drug-Drug Interaction Prediction. Interdiscip Sci 2025:10.1007/s12539-024-00680-5. [PMID: 39775539 DOI: 10.1007/s12539-024-00680-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2024] [Revised: 11/19/2024] [Accepted: 11/19/2024] [Indexed: 01/11/2025]
Abstract
Accurate prediction of drug-drug interaction (DDI) is essential to improve clinical efficacy, avoid adverse effects of drug combination therapy, and enhance drug safety. Recently, researchers have developed several computer-aided methods for DDI prediction. However, these methods lack the substructural features that are critical to drug interactions and do not generalize effectively across domains and differently distributed data. In this work, we present SAGAN, a domain adaptive interpretable substructure-aware graph attention network for DDI prediction. Based on an attention mechanism and an unsupervised clustering algorithm, we propose a new substructure segmentation method, which segments the drug molecule into multiple substructures, learns the mechanism of drug interaction from the perspective of interaction, and identifies important interaction regions between drugs. To enhance the generalization ability of the model, we improve and apply a conditional domain adversarial network to achieve cross-domain generalization by alternately optimizing the cross-entropy loss on the source domain and the adversarial loss of the domain discriminator. We evaluate and compare SAGAN with the state-of-the-art DDI prediction model on four real-world datasets for both in-domain and cross-domain scenarios, and show that SAGAN achieves the best overall performance. Moreover, the visualization results of the model show that SAGAN has achieved pharmacologically significant substructure extraction, which can help drug developers screen for some undiscovered local interaction sites, and provide important information for further drug structure optimization. The codes and datasets are available online at https://github.com/wyx2012/SAGAN.
Affiliation(s)
- Qi Zhang
- College of Science, Dalian Jiaotong University, Dalian, 116028, China
- Yuxiao Wei
- College of Software, Dalian Jiaotong University, Dalian, 116028, China
- Liwei Liu
- College of Science, Dalian Jiaotong University, Dalian, 116028, China.
4
Toprak A. Predicting human miRNA disease association with minimize matrix nuclear norm. Sci Rep 2024; 14:30815. [PMID: 39730483 DOI: 10.1038/s41598-024-81213-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2024] [Accepted: 11/25/2024] [Indexed: 12/29/2024] Open
Abstract
MicroRNAs (miRNAs) are non-coding RNA molecules that influence the development and progression of many diseases. Research has documented that miRNAs have a significant role in the prevention, diagnosis, and treatment of complex human diseases. Recently, scientists have devoted extensive resources to finding the connections between miRNAs and diseases. Since the experimental methods used to discover new miRNA-disease associations are time-consuming and expensive, many computational methods have been developed. In this research, a novel computational method based on matrix decomposition was proposed to predict new associations between miRNAs and diseases. Furthermore, the nuclear norm minimization method was employed to acquire breast cancer-associated miRNAs. We then evaluated the effectiveness of our method using two different cross-validation techniques, and the results were compared with seven different methods. Moreover, a case study on breast cancer further validated our technique, confirming its predictive accuracy. These experimental results demonstrate that our method is a reliable computational model for uncovering potential miRNA-disease relationships.
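The abstract does not say which solver is used for nuclear norm minimization. One standard building block is singular value soft-thresholding, which shrinks every singular value by a fixed amount and therefore lowers the nuclear norm (the sum of singular values). The sketch below applies it in a simple alternating loop that re-imposes the known entries; it is an illustrative assumption rather than the author's algorithm, and `svt`, `complete`, `tau`, and `n_iter` are hypothetical names and settings.

```python
import numpy as np

def svt(M, tau):
    """Singular value soft-thresholding: shrink each singular value by tau."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def complete(A, known, tau=0.5, n_iter=100):
    """Fill unknown entries of A with a low-nuclear-norm estimate.

    A:     (n, m) association matrix; values at unknown positions are ignored.
    known: boolean mask of the same shape, True where A is observed.
    """
    X = np.where(known, A, 0.0).astype(float)
    for _ in range(n_iter):
        X = svt(X, tau)        # push toward low rank
        X[known] = A[known]    # keep observed associations fixed
    return X

# Toy miRNA-disease matrix (hypothetical); entry (2, 2) is treated as unknown.
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 0.0, 0.0]])
known = np.ones_like(A, dtype=bool)
known[2, 2] = False
print(complete(A, known).round(3))
```

Scores at the unknown positions of the completed matrix can then be ranked to propose candidate associations.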
Affiliation(s)
- Ahmet Toprak
- Department of Electricity and Energy, Selcuk University, Konya, Turkey.
5
Liu T, Wang S, Zhang Y, Li Y, Liu Y, Huang S. TIWMFLP: Two-Tier Interactive Weighted Matrix Factorization and Label Propagation Based on Similarity Matrix Fusion for Drug-Disease Association Prediction. J Chem Inf Model 2024; 64:8641-8654. [PMID: 39486090 DOI: 10.1021/acs.jcim.4c01589] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2024]
Abstract
Accurately identifying new therapeutic uses for drugs is crucial for advancing pharmaceutical research and development. Matrix factorization is often used in association prediction due to its simplicity and high interpretability. However, existing matrix factorization models do not enable real-time interaction between molecular feature matrices and similarity matrices, nor do they consider the geometric structure of the matrices. Additionally, efficiently integrating multisource data remains a significant challenge. To address these issues, we propose a two-tier interactive weighted matrix factorization and label propagation model based on similarity matrix fusion (TIWMFLP) to assist in personalized treatment. First, we calculate the Gaussian and Laplace kernel similarities for drugs and diseases using known drug-disease associations. We then introduce a new multisource similarity fusion method, called similarity matrix fusion (SMF), to integrate these drug/disease similarities. SMF not only considers the different contributions represented by each neighbor but also incorporates drug-disease association information to enhance the contextual topological relationships and potential features of each drug/disease node in the network. Second, we innovatively developed a two-tier interactive weighted matrix factorization (TIWMF) method to process three biological networks. This method realizes for the first time the real-time interaction between the drug/disease feature matrix and its similarity matrix, allowing for a better capture of the complex relationships between drugs and diseases. Additionally, the weighted matrix of the drug/disease similarity matrix is introduced to preserve the underlying structure of the similarity matrix. Finally, the label propagation algorithm makes predictions based on the three updated biological networks. Experimental outcomes reveal that TIWMFLP consistently surpasses state-of-the-art models on four drug-disease data sets, two small molecule-miRNA data sets, and one miRNA-disease data set.
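The first step, kernel similarities computed from known associations, can be illustrated as follows. Each drug is represented by its row of the drug-disease association matrix (its interaction profile). The sketch assumes a Gaussian kernel on squared Euclidean distance and a Laplace kernel on L1 distance, with a fixed bandwidth `gamma`; the exact kernel definitions and bandwidths used in TIWMFLP are not given in the abstract, so these are stated assumptions, and the function names are hypothetical.

```python
from math import exp

def gaussian_kernel(profiles, gamma=1.0):
    """K[i][j] = exp(-gamma * ||p_i - p_j||_2^2) over association profiles."""
    return [[exp(-gamma * sum((a - b) ** 2 for a, b in zip(p, q)))
             for q in profiles] for p in profiles]

def laplace_kernel(profiles, gamma=1.0):
    """K[i][j] = exp(-gamma * ||p_i - p_j||_1); L1 distance is an assumption."""
    return [[exp(-gamma * sum(abs(a - b) for a, b in zip(p, q)))
             for q in profiles] for p in profiles]

# Toy drug-disease associations (hypothetical): rows are drugs, columns diseases.
associations = [[1, 0, 1],
                [1, 0, 1],
                [0, 1, 0]]
drug_gauss = gaussian_kernel(associations)
drug_laplace = laplace_kernel(associations)

# Disease similarities use the transposed matrix (columns as profiles).
disease_profiles = [list(col) for col in zip(*associations)]
disease_gauss = gaussian_kernel(disease_profiles)
print([[round(x, 3) for x in row] for row in drug_gauss])
```

Drugs with identical association profiles get similarity 1, and similarity decays as profiles diverge. For strictly binary profiles the two kernels coincide at equal `gamma`, so they differ in practice only when bandwidths or inputs differ.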
Affiliation(s)
- Tiyao Liu
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
- Shudong Wang
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
- Yuanyuan Zhang
- School of Information and Control Engineering, Qingdao University of Technology, Qingdao 266525, China
- Yunyin Li
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
- Yingye Liu
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
- Shiyuan Huang
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
6
Galeano D, Imrat, Haltom J, Andolino C, Yousey A, Zaksas V, Das S, Baylin SB, Wallace DC, Slack FJ, Enguita FJ, Wurtele ES, Teegarden D, Meller R, Cifuentes D, Beheshti A. sChemNET: a deep learning framework for predicting small molecules targeting microRNA function. Nat Commun 2024; 15:9149. [PMID: 39443444 PMCID: PMC11500171 DOI: 10.1038/s41467-024-49813-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Accepted: 06/14/2024] [Indexed: 10/25/2024] Open
Abstract
MicroRNAs (miRNAs) have been implicated in human disorders, from cancers to infectious diseases. Targeting miRNAs or their target genes with small molecules offers opportunities to modulate dysregulated cellular processes linked to diseases. Yet, predicting small molecules associated with miRNAs remains challenging due to the small size of small molecule-miRNA datasets. Herein, we develop a generalized deep learning framework, sChemNET, for predicting small molecules affecting miRNA bioactivity based on chemical structure and sequence information. sChemNET overcomes the limitation of sparse chemical information by an objective function that allows the neural network to learn chemical space from a large body of chemical structures yet unknown to affect miRNAs. We experimentally validated small molecules predicted to act on miR-451 or its targets and tested their role in erythrocyte maturation during zebrafish embryogenesis. We also tested small molecules targeting the miR-181 network and other miRNAs using in-vitro and in-vivo experiments. We demonstrate that our machine-learning framework can predict bioactive small molecules targeting miRNAs or their targets in humans and other mammalian organisms.
Affiliation(s)
- Diego Galeano
- Department of Electronics and Mechatronics Engineering, Facultad de Ingeniería, Universidad Nacional de Asunción - FIUNA, Luque, Paraguay.
- COVID-19 International Research Team, Medford, MA, USA.
- Imrat
- Department of Biochemistry and Cell Biology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Jeffrey Haltom
- COVID-19 International Research Team, Medford, MA, USA
- Center for Mitochondrial and Epigenomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Chaylen Andolino
- Department of Nutrition Science, Purdue University, Indiana, USA
- Purdue Institute for Cancer Research, Purdue University, Indiana, USA
- Aliza Yousey
- COVID-19 International Research Team, Medford, MA, USA
- Neuroscience Institute, Department of Neurobiology/ Department of Pharmacology and Toxicology, Morehouse School of Medicine, Atlanta, GA, USA
- Victoria Zaksas
- COVID-19 International Research Team, Medford, MA, USA
- Center for Translational Data Science, University of Chicago, Chicago, IL, USA
- Clever Research Lab, Springfield, IL, USA
- Saswati Das
- COVID-19 International Research Team, Medford, MA, USA
- Atal Bihari Vajpayee Institute of Medical Sciences and Dr Ram Manohar Lohia Hospital, New Delhi, India
- Stephen B Baylin
- COVID-19 International Research Team, Medford, MA, USA
- Sidney Kimmel Comprehensive Cancer Center and Department of Oncology, Johns Hopkins School of Medicine, Baltimore, MD, USA
- The Van Andel Institute, Grand Rapids, MI, USA
- Douglas C Wallace
- COVID-19 International Research Team, Medford, MA, USA
- Center for Mitochondrial and Epigenomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Frank J Slack
- Harvard Medical School Initiative for RNA Medicine, Department of Pathology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
- Francisco J Enguita
- COVID-19 International Research Team, Medford, MA, USA
- Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Lisboa, Portugal
- Eve Syrkin Wurtele
- Bioinformatics and Computational Biology Program, Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, USA
- Dorothy Teegarden
- Department of Nutrition Science, Purdue University, Indiana, USA
- Purdue Institute for Cancer Research, Purdue University, Indiana, USA
- Robert Meller
- COVID-19 International Research Team, Medford, MA, USA
- Neuroscience Institute, Department of Neurobiology/ Department of Pharmacology and Toxicology, Morehouse School of Medicine, Atlanta, GA, USA
- Daniel Cifuentes
- Department of Biochemistry and Cell Biology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Department of Virology, Immunology & Microbiology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Afshin Beheshti
- COVID-19 International Research Team, Medford, MA, USA
- Blue Marble Space Institute of Science, NASA Ames Research Center, Moffett Field, CA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- McGowan Institute for Regenerative Medicine - Center for Space Biomedicine, Department of Surgery, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
7
Sun XY, Hou ZJ, Zhang WG, Chen Y, Yao HB. HTFSMMA: Higher-Order Topological Guided Small Molecule-MicroRNA Associations Prediction. J Comput Biol 2024; 31:886-906. [PMID: 39109562 DOI: 10.1089/cmb.2024.0587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/10/2024] Open
Abstract
Small molecules (SMs) play a pivotal role in regulating microRNAs (miRNAs). Existing methods for predicting SM-miRNA associations have overlooked two crucial aspects: the incorporation of local topological features between nodes, which represent either SMs or miRNAs, and the effective fusion of node features with topological features. This study introduces a novel approach, termed high-order topological features for SM-miRNA association prediction (HTFSMMA), which specifically addresses these limitations. Initially, an association graph is formed by integrating SM-miRNA association data, SM similarity, and miRNA similarity. Subsequently, we focus on the local information of links and propose a target neighborhood graph convolutional network for extracting local topological features. Then, HTFSMMA employs graph attention networks to amalgamate these local features, thereby establishing a platform for the acquisition of high-order features through random walks. Finally, the extracted features are integrated into the multilayer perceptron to derive the association prediction scores. To demonstrate the performance of HTFSMMA, we conducted comprehensive evaluations including five-fold cross-validation, leave-one-out cross-validation (LOOCV), SM-fixed local LOOCV, and miRNA-fixed local LOOCV. The area under receiver operating characteristic curve values were 0.9958 ± 0.0024 (0.8722 ± 0.0021), 0.9986 (0.9504), 0.9974 (0.9111), and 0.9977 (0.9074), respectively. Our findings demonstrate the superior performance of HTFSMMA over existing approaches. In addition, three case studies and the DeLong test have confirmed the effectiveness of the proposed method. These results collectively underscore the significance of HTFSMMA in facilitating the inference of associations between SMs and miRNAs.
Affiliation(s)
- Xiao-Yan Sun
- School of Computer Science and Artificial Intelligence & Aliyun Big Data, Changzhou University, Changzhou, China
- Zhen-Jie Hou
- School of Computer Science and Artificial Intelligence & Aliyun Big Data, Changzhou University, Changzhou, China
- Wen-Guang Zhang
- School of Life Sciences, Inner Mongolia Agricultural University, Hohhot, China
- Yan Chen
- School of Computer Science and Artificial Intelligence & Aliyun Big Data, Changzhou University, Changzhou, China
- Hai-Bin Yao
- School of Computer Science and Artificial Intelligence & Aliyun Big Data, Changzhou University, Changzhou, China
8
Liu S, Yu J, Ni N, Wang Z, Chen M, Li Y, Xu C, Ding Y, Zhang J, Yao X, Liu H. Versatile Framework for Drug-Target Interaction Prediction by Considering Domain-Specific Features. J Chem Inf Model 2024; 64:5646-5656. [PMID: 38976879 DOI: 10.1021/acs.jcim.4c00403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Predicting drug-target interactions (DTIs) is one of the crucial tasks in drug discovery, but traditional wet-lab experiments are costly and time-consuming. Recently, deep learning has emerged as a promising tool for accelerating DTI prediction due to its powerful performance. However, the models trained on limited known DTI data struggle to generalize effectively to novel drug-target pairs. In this work, we propose a strategy to train an ensemble of models by capturing both domain-generic and domain-specific features (E-DIS) to learn diverse domain features and adapt them to out-of-distribution data. Multiple experts were trained on different domains to capture and align domain-specific information from various distributions without accessing any data from unseen domains. E-DIS provides a comprehensive representation of proteins and ligands by capturing diverse features. Experimental results on four benchmark data sets in both in-domain and cross-domain settings demonstrated that E-DIS significantly improved model performance and domain generalization compared to existing methods. Our approach presents a significant advancement in DTI prediction by combining domain-generic and domain-specific features, enhancing the generalization ability of the DTI prediction model.
Affiliation(s)
- Shuo Liu
- School of Pharmacy, Lanzhou University, Gansu 730000, China
- Huawei Technologies Co., Ltd., Hangzhou 310000, China
- Jialiang Yu
- Huawei Technologies Co., Ltd., Hangzhou 310000, China
- Ningxi Ni
- Huawei Technologies Co., Ltd., Hangzhou 310000, China
- Zidong Wang
- Huawei Technologies Co., Ltd., Hangzhou 310000, China
- Mengyun Chen
- Huawei Technologies Co., Ltd., Hangzhou 310000, China
- Yuquan Li
- College of Chemistry and Chemical Engineering, Lanzhou University, Gansu 730000, China
- Chen Xu
- Huawei Technologies Co., Ltd., Hangzhou 310000, China
- Yahao Ding
- Huawei Technologies Co., Ltd., Hangzhou 310000, China
- Jun Zhang
- Changping Laboratory, Beijing 102200, China
- Xiaojun Yao
- Faculty of Applied Sciences, Macao Polytechnic University, Macao SAR 999078, China
- Huanxiang Liu
- Faculty of Applied Sciences, Macao Polytechnic University, Macao SAR 999078, China
9
Wang S, Liu T, Ren C, Zhao Y, Qiao S, Zhang Y, Pang S. Heterogeneous graph inference with range constrainted L 2,1-collaborative matrix factorization for small molecule-miRNA association prediction. Comput Biol Chem 2024; 110:108078. [PMID: 38677013 DOI: 10.1016/j.compbiolchem.2024.108078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Revised: 04/03/2024] [Accepted: 04/16/2024] [Indexed: 04/29/2024]
Abstract
MicroRNAs (miRNAs) play a vital role in regulating gene expression and various biological processes. As a result, they have been identified as effective targets for small molecule (SM) drugs in disease treatment. Heterogeneous graph inference is a classical approach for predicting SM-miRNA associations, with commendable convergence accuracy and speed. However, most existing methods do not adequately address the inherent sparsity in SM-miRNA association networks, and imprecise SM/miRNA similarity metrics reduce the accuracy of predicting SM-miRNA associations. In this research, we proposed a heterogeneous graph inference with range constrained L2,1-collaborative matrix factorization (HGIRCLMF) method to predict potential SM-miRNA associations. First, we computed the multi-source similarities of SM/miRNA and integrated this similarity information into a comprehensive SM/miRNA similarity. This step improved the accuracy of SM and miRNA similarity, ensuring reliability for the subsequent heterogeneous graph inference. Second, we used a range constrained L2,1-collaborative matrix factorization (RCLMF) model to pre-populate the SM-miRNA association matrix with missing values. In this step, we developed a novel matrix decomposition method that enhances the robustness and formative nature of SM-miRNA edges between SM networks and miRNA networks. Next, we built a well-established SM-miRNA heterogeneous network utilizing the processed biological information. Finally, HGIRCLMF used this network data to infer unknown association pair scores. We implemented four cross-validation experiments on two distinct datasets, and HGIRCLMF acquired the highest areas under the curve, surpassing six state-of-the-art computational approaches. Furthermore, we performed three case studies to validate the predictive power of our method in practical application.
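For readers unfamiliar with the L2,1 norm named in the method, a small worked example may help. Under one common convention (assumed here, since the abstract does not define it), the L2,1 norm of a matrix is the sum of the Euclidean norms of its rows; the other convention sums over columns. Penalizing it tends to zero out whole rows at once, which is why it is often described as robust to outlying rows. The function names are hypothetical.

```python
from math import sqrt

def l21_norm(matrix):
    """Sum of the Euclidean (L2) norms of the rows (row-wise convention)."""
    return sum(sqrt(sum(x * x for x in row)) for row in matrix)

def frobenius_norm(matrix):
    """Square root of the sum of all squared entries, for comparison."""
    return sqrt(sum(x * x for row in matrix for x in row))

residual = [[3.0, 4.0],
            [6.0, 8.0]]
print(l21_norm(residual))        # row norms 5 and 10 are added
print(frobenius_norm(residual))  # row norms are combined in quadrature
```

Because row norms are added rather than squared, a single large row contributes linearly to the L2,1 norm instead of quadratically as in the squared Frobenius loss.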
Affiliation(s)
- Shudong Wang
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
- Tiyao Liu
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
- Chuanru Ren
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
- Yawu Zhao
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
| | - Sibo Qiao
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
| | - Yuanyuan Zhang
- School of Information and Control Engineering, Qingdao University of Technology, Qingdao 266525, China.
| | - Shanchen Pang
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
| |
|
10
|
Kalemati M, Zamani Emani M, Koohi S. DCGAN-DTA: Predicting drug-target binding affinity with deep convolutional generative adversarial networks. BMC Genomics 2024; 25:411. [PMID: 38724911 PMCID: PMC11080241 DOI: 10.1186/s12864-024-10326-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Accepted: 04/19/2024] [Indexed: 05/13/2024] Open
Abstract
BACKGROUND: In recent years, there has been a growing interest in utilizing computational approaches to predict drug-target binding affinity, aiming to expedite the early drug discovery process. To address the limitations of experimental methods, such as cost and time, several machine learning-based techniques have been developed. However, these methods encounter certain challenges, including the limited availability of training data, reliance on human intervention for feature selection and engineering, and a lack of validation approaches for robust evaluation in real-life applications. RESULTS: To mitigate these limitations, in this study, we propose a method for drug-target binding affinity prediction based on deep convolutional generative adversarial networks. Additionally, we conducted a series of validation experiments and implemented adversarial control experiments using straw models. These experiments serve to demonstrate the robustness and efficacy of our predictive models. We conducted a comprehensive evaluation of our method by comparing it to baselines and state-of-the-art methods. Two recently updated datasets, namely BindingDB and PDBBind, were used for this purpose. Our findings indicate that our method outperforms the alternative methods in terms of three performance measures when using warm-start data splitting settings. Moreover, when considering physicochemical-based cold-start data splitting settings, our method demonstrates superior predictive performance, particularly in terms of the concordance index. CONCLUSION: The results of our study affirm the practical value of our method and its superiority over alternative approaches in predicting drug-target binding affinity across multiple validation sets. This highlights the potential of our approach in accelerating drug repurposing efforts, facilitating novel drug discovery, and ultimately enhancing disease treatment.
The data and source code for this study were deposited in the GitHub repository, https://github.com/mojtabaze7/DCGAN-DTA . Furthermore, the web server for our method is accessible at https://dcgan.shinyapps.io/bindingaffinity/ .
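The concordance index mentioned in the abstract above is a ranking metric commonly used for binding affinity prediction. A minimal sketch of its usual pairwise definition follows; the function name and the tie handling (0.5 credit for tied predictions, pairs with tied true values skipped) are assumptions based on the standard formulation, not details taken from the DCGAN-DTA paper.

```python
from itertools import combinations


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Pairs with equal true affinities are skipped; tied predictions earn 0.5.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(y_true)), 2):
        if y_true[i] == y_true[j]:
            continue  # not comparable: the true order is undefined
        comparable += 1
        true_diff = y_true[i] - y_true[j]
        pred_diff = y_pred[i] - y_pred[j]
        if pred_diff == 0:
            concordant += 0.5
        elif (true_diff > 0) == (pred_diff > 0):
            concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs: all true values are equal")
    return concordant / comparable
```

A value of 1.0 means every pair is ranked correctly, 0.5 corresponds to random ordering, and 0.0 means every pair is reversed.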
Affiliation(s)
- Mahmood Kalemati
- Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
| | - Mojtaba Zamani Emani
- Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
| | - Somayyeh Koohi
- Department of Computer Engineering, Sharif University of Technology, Tehran, Iran.
| |
|
11
|
Xu P, Li C, Yuan J, Bao Z, Liu W. Predict lncRNA-drug associations based on graph neural network. Front Genet 2024; 15:1388015. [PMID: 38737125 PMCID: PMC11082279 DOI: 10.3389/fgene.2024.1388015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Accepted: 04/05/2024] [Indexed: 05/14/2024] Open
Abstract
LncRNAs are an essential type of non-coding RNA and have been reported to be involved in various human pathological conditions. Increasing evidence suggests that drugs can regulate lncRNA expression, which makes it possible to develop lncRNAs as therapeutic targets. Thus, developing in-silico methods to predict lncRNA-drug associations (LDAs) is a critical step for developing lncRNA-based therapies. In this study, we predict LDAs by using graph convolutional networks (GCN) and graph attention networks (GAT) based on lncRNA and drug similarity networks. Results show that our proposed method achieves good performance (average AUCs > 0.92) on five datasets. In addition, case studies and KEGG functional enrichment analysis further prove that the model can effectively identify novel LDAs. On the whole, this study provides a deep learning-based framework for predicting novel LDAs, which will accelerate the lncRNA-targeted drug development process.
Affiliation(s)
- Peng Xu
- Institute of Computational Science and Technology, Guangzhou University, Guangzhou, China
- School of Computer Science of Information Technology, Qiannan Normal University for Nationalities, Duyun, China
| | - Chuchu Li
- Institute of Computational Science and Technology, Guangzhou University, Guangzhou, China
| | - Jiaqi Yuan
- Institute of Computational Science and Technology, Guangzhou University, Guangzhou, China
| | - Zhenshen Bao
- College of Information Engineering, Taizhou University, Taizhou, Jiangsu, China
| | - Wenbin Liu
- Institute of Computational Science and Technology, Guangzhou University, Guangzhou, China
- Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangzhou, Guangdong, China
| |
|
12
|
Svensson E, Hoedt PJ, Hochreiter S, Klambauer G. HyperPCM: Robust Task-Conditioned Modeling of Drug-Target Interactions. J Chem Inf Model 2024; 64:2539-2553. [PMID: 38185877 PMCID: PMC11005051 DOI: 10.1021/acs.jcim.3c01417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2023] [Revised: 11/27/2023] [Accepted: 11/27/2023] [Indexed: 01/09/2024]
Abstract
A central problem in drug discovery is to identify the interactions between drug-like compounds and protein targets. Over the past few decades, various quantitative structure-activity relationship (QSAR) and proteo-chemometric (PCM) approaches have been developed to model and predict these interactions. While QSAR approaches solely utilize representations of the drug compound, PCM methods incorporate both representations of the protein target and the drug compound, enabling them to achieve above-chance predictive accuracy on previously unseen protein targets. Both QSAR and PCM approaches have recently been improved by machine learning and deep neural networks, which allow the development of drug-target interaction prediction models from measurement data. However, deep neural networks typically require large amounts of training data and cannot robustly adapt to new tasks, such as predicting interactions for unseen protein targets at inference time. In this work, we propose to use HyperNetworks to efficiently transfer information between tasks during inference and thus to accurately predict drug-target interactions on unseen protein targets. Our HyperPCM method reaches state-of-the-art performance compared to previous methods on multiple well-known benchmarks, including Davis, DUD-E, and a ChEMBL derived data set, and particularly excels at zero-shot inference involving unseen protein targets. Our method, as well as reproducible data preparation, is available at https://github.com/ml-jku/hyper-dti.
Affiliation(s)
- Emma Svensson
- ELLIS Unit Linz & Institute for Machine Learning, Johannes Kepler University, Linz 4040, Austria
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, 431 83, Sweden
| | - Pieter-Jan Hoedt
- ELLIS Unit Linz & Institute for Machine Learning, Johannes Kepler University, Linz 4040, Austria
| | - Sepp Hochreiter
- ELLIS Unit Linz & Institute for Machine Learning, Johannes Kepler University, Linz 4040, Austria
- Institute of Advanced Research in Artificial Intelligence (IARAI), Vienna 1030, Austria
| | - Günter Klambauer
- ELLIS Unit Linz & Institute for Machine Learning, Johannes Kepler University, Linz 4040, Austria
| |
|
13
|
Zhou Z, Zhuo L, Fu X, Lv J, Zou Q, Qi R. Joint masking and self-supervised strategies for inferring small molecule-miRNA associations. MOLECULAR THERAPY. NUCLEIC ACIDS 2024; 35:102103. [PMID: 38261851 PMCID: PMC10794920 DOI: 10.1016/j.omtn.2023.102103] [Citation(s) in RCA: 22] [Impact Index Per Article: 22.0] [Received: 09/26/2023] [Accepted: 12/13/2023] [Indexed: 01/25/2024]
Abstract
Inferring small molecule-miRNA associations (MMAs) is crucial for revealing the intricacies of biological processes and disease mechanisms. Deep learning, renowned for its exceptional speed and accuracy, is extensively used for predicting MMAs. However, given their heavy reliance on data, inaccuracies during data collection can make these methods susceptible to noise interference. To address this challenge, we introduce the joint masking and self-supervised (JMSS)-MMA model. This model synergizes graph autoencoders with a probability distribution-based masking strategy, effectively countering the impact of noisy data and enabling precise predictions of unknown MMAs. Operating in a self-supervised manner, it deeply encodes the relationship data of small molecules and miRNA through the graph autoencoder, delving into its latent information. Our masking strategy has successfully reduced data noise, enhancing prediction accuracy. To our knowledge, this is the pioneering integration of a masking strategy with graph autoencoders for MMA prediction. Furthermore, the JMSS-MMA model incorporates a node-degree-based decoder, deepening the understanding of the network's structure. Experiments on two mainstream datasets confirm the model's efficiency and precision, and ablation studies further attest to its robustness. We firmly believe that this model will revolutionize drug development, personalized medicine, and biomedical research.
Affiliation(s)
- Zhecheng Zhou
- School of Data Science and Artificial Intelligence, Wenzhou University of Technology, Wenzhou 325027, China
| | - Linlin Zhuo
- School of Data Science and Artificial Intelligence, Wenzhou University of Technology, Wenzhou 325027, China
| | - Xiangzheng Fu
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410012, China
| | - Juan Lv
- College of Traditional Chinese Medicine, Changsha Medical University, Changsha 410000, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 611730, China
| | - Ren Qi
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang, China
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
| |
|
14
|
Dehghan A, Abbasi K, Razzaghi P, Banadkuki H, Gharaghani S. CCL-DTI: contributing the contrastive loss in drug-target interaction prediction. BMC Bioinformatics 2024; 25:48. [PMID: 38291364 PMCID: PMC11264960 DOI: 10.1186/s12859-024-05671-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Accepted: 01/22/2024] [Indexed: 02/01/2024] Open
Abstract
BACKGROUND: Drug-Target Interaction (DTI) prediction uses a drug molecule and a protein sequence as inputs to predict the binding affinity value. In recent years, deep learning-based models have received more attention. These methods have two modules: the feature extraction module and the task prediction module. In most deep learning-based approaches, a simple task prediction loss (i.e., categorical cross entropy for the classification task and mean squared error for the regression task) is used to learn the model. In machine learning, contrastive-based loss functions are developed to learn a more discriminative feature space. In a deep learning-based model, extracting a more discriminative feature space leads to performance improvement for the task prediction module. RESULTS: In this paper, we have used multimodal knowledge as input and proposed an attention-based fusion technique to combine this knowledge. Also, we investigate how utilizing a contrastive loss function alongside the task prediction loss could help the approach to learn a more powerful model. Four contrastive loss functions are considered: (1) max-margin contrastive loss function, (2) triplet loss function, (3) Multi-class N-pair Loss Objective, and (4) NT-Xent loss function. The proposed model is evaluated using four well-known datasets: Wang et al. dataset, Luo's dataset, Davis, and KIBA datasets. CONCLUSIONS: Accordingly, after reviewing the state-of-the-art methods, we developed a multimodal feature extraction network by combining protein sequences and drug molecules, along with protein-protein interaction networks and drug-drug interaction networks. The results show it performs significantly better than the comparable state-of-the-art approaches.
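Two of the four contrastive losses listed in the abstract above can be illustrated with their textbook forms. The sketch below is an assumption-laden illustration, not the CCL-DTI implementation: the function names, the Euclidean distance, and the default margin of 1.0 are all hypothetical choices, and the abstract does not specify how the authors weight these terms against the task prediction loss.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def max_margin_contrastive_loss(u, v, same_class, margin=1.0):
    """Pull same-class pairs together; push different-class pairs at least `margin` apart."""
    d = euclidean(u, v)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Require the anchor to be closer to the positive than to the negative by `margin`."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)
```

In a joint objective, a term like these would typically be added to the task loss with a weighting coefficient, which would be a tunable hyperparameter.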
Affiliation(s)
- Alireza Dehghan
- Department of Bioinformatics, Kish International Campus, University of Tehran, Kish, 1417614411, Iran
| | - Karim Abbasi
- Laboratory of System Biology, Bioinformatics and Artificial Intelligence in Medicine (LBB&AI), Faculty of Mathematics and Computer Science, Kharazmi University, Tehran, 1417614411, Iran
| | - Parvin Razzaghi
- Department of Computer Science and Information Technology, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, 4513766731, Iran.
| | - Hossein Banadkuki
- Laboratory of Bioinformatics and Drug Design (LBD), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, 1417614411, Iran
| | - Sajjad Gharaghani
- Laboratory of Bioinformatics and Drug Design (LBD), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, 1417614411, Iran.
| |
|
15
|
Zhong Y, Shen C, Xi X, Luo Y, Ding P, Luo L. Multitask joint learning with graph autoencoders for predicting potential MiRNA-drug associations. Artif Intell Med 2023; 145:102665. [PMID: 37925217 DOI: 10.1016/j.artmed.2023.102665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Revised: 06/14/2023] [Accepted: 09/14/2023] [Indexed: 11/06/2023]
Abstract
The occurrence of many diseases is associated with miRNA abnormalities. Predicting potential drug-miRNA associations is of great importance for both disease treatment and new drug discovery. Most computation-based approaches learn one task at a time, ignoring the information contained in other tasks in the same domain. Multitask learning can effectively enhance the prediction performance of a single task by extending the valid information of related tasks. In this paper, we presented a multitask joint learning framework (MTJL) with a graph autoencoder for predicting the associations between drugs and miRNAs. First, we combined multiple pieces of information to construct a high-quality similarity network of both drugs and miRNAs and then used a graph autoencoder (GAE) to learn their embedding representations separately. Second, to further improve the embedding quality of drugs, we added an auxiliary task to classify drugs using the learned representations. Finally, the embedding representations of drugs and miRNAs were linearly transformed to obtain the predictive association scores between them. A comparison with other state-of-the-art models shows that MTJL has the best prediction performance, and ablation experiments show that the auxiliary task can enhance the embedding quality and improve the robustness of the model. In addition, we show that MTJL has high utility in predicting potential associations between drugs and miRNAs by conducting two case studies.
Affiliation(s)
- Yichen Zhong
- School of Computer Science, University of South China, Hengyang 421001, China
| | - Cong Shen
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410083, China
| | - Xiaoting Xi
- School of Computer Science, University of South China, Hengyang 421001, China
| | - Yuxun Luo
- School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan 411105, China
| | - Pingjian Ding
- School of Computer Science, University of South China, Hengyang 421001, China
| | - Lingyun Luo
- School of Computer Science, University of South China, Hengyang 421001, China.
| |
|
16
|
Alghushairy O, Ali F, Alghamdi W, Khalid M, Alsini R, Asiry O. Machine learning-based model for accurate identification of druggable proteins using light extreme gradient boosting. J Biomol Struct Dyn 2023; 42:12330-12341. [PMID: 37850427 DOI: 10.1080/07391102.2023.2269280] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/09/2023] [Accepted: 10/04/2023] [Indexed: 10/19/2023]
Abstract
The identification of druggable proteins (DPs) is significant for the development of new drugs, personalized medicine, understanding of disease mechanisms, drug repurposing, and economic benefits. By identifying new druggable targets, researchers can develop new therapies for a range of diseases, leading to better patient outcomes. Identification of DPs by machine learning strategies is more efficient and cost-effective than conventional methods. In this study, a computational predictor, namely Drug-LXGB, is introduced to enhance the identification of DPs. Features are extracted by composition, transition, and distribution (CTD), composition of K-spaced amino acid pair (CKSAAP), pseudo-position-specific scoring matrix (PsePSSM), and a novel descriptor, called multi-block pseudo amino acid composition (MB-PseAAC). The CTD, CKSAAP, PsePSSM, and MB-PseAAC feature vectors are integrated, and sequential forward selection is used as the feature selection algorithm. The selected features are provided to random forest, extreme gradient boosting, and light eXtreme gradient boosting (LXGB) classifiers. The predictive performance of these learning methods is measured via 10-fold cross-validation. The LXGB-based model achieves better results than other existing predictors. Our novel protocol will play an active role in designing novel drugs and would be fruitful for exploring potential targets. This study will help to capture a more universal view of a potential target. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Omar Alghushairy
- Department of Information Systems and Technology, College of Computer Science and Engineering, University of Jeddah, Jeddah, Saudi Arabia
| | - Farman Ali
- Department of Software Engineering, Sarhad University of Science and Information Technology Peshawar Mardan Campus, Peshawar, Pakistan
| | - Wajdi Alghamdi
- Department of Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Majdi Khalid
- Department of Computer Science, College of Computers and Information Systems, Umm Al-Qura University, Makkah, Saudi Arabia
| | - Raed Alsini
- Department of Information Systems, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Othman Asiry
- Department of Information Technology, College of Computing and Information Technology at Khulais, University of Jeddah, Jeddah, Saudi Arabia
| |
|
17
|
Gan Y, Liu W, Xu G, Yan C, Zou G. DMFDDI: deep multimodal fusion for drug-drug interaction prediction. Brief Bioinform 2023; 24:bbad397. [PMID: 37930025 DOI: 10.1093/bib/bbad397] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/27/2023] [Revised: 09/28/2023] [Accepted: 10/13/2023] [Indexed: 11/07/2023] Open
Abstract
Drug combination therapy has gradually become a promising treatment strategy for complex or co-existing diseases. As drug-drug interactions (DDIs) may cause unexpected adverse drug reactions, DDI prediction is an important task in pharmacology and clinical applications. Recently, researchers have proposed several deep learning methods to predict DDIs. However, these methods mainly exploit the chemical or biological features of drugs, which is insufficient and limits the performance of DDI prediction. Here, we propose a new deep multimodal feature fusion framework for DDI prediction, DMFDDI, which fuses the drug molecular graph, the DDI network and the biochemical similarity features of drugs to predict DDIs. To fully extract drug molecular structure, we introduce an attention-gated graph neural network for capturing the global features of the molecular graph and the local features of each atom. A sparse graph convolution network is introduced to learn the topological structure information of the DDI network. In the multimodal feature fusion module, an attention mechanism is used to efficiently fuse different features. To validate the performance of DMFDDI, we compare it with 10 state-of-the-art methods. The comparison results demonstrate that DMFDDI achieves better performance in DDI prediction. Our method DMFDDI is implemented in Python using the PyTorch machine-learning library, and it is freely available at https://github.com/DHUDEBLab/DMFDDI.git.
Affiliation(s)
- Yanglan Gan
- School of Computer Science and Technology, Donghua University, 2999 North Renmin Road, 201600, Shanghai, China
| | - Wenxiao Liu
- School of Computer Science and Technology, Donghua University, 2999 North Renmin Road, 201600, Shanghai, China
| | - Guangwei Xu
- School of Computer Science and Technology, Donghua University, 2999 North Renmin Road, 201600, Shanghai, China
| | - Cairong Yan
- School of Computer Science and Technology, Donghua University, 2999 North Renmin Road, 201600, Shanghai, China
| | - Guobing Zou
- School of Computer Engineering and Science, Shanghai University, 99 Shangda Road, 200444, Shanghai, China
| |
|
18
|
Wang S, Li Y, Zhang Y, Pang S, Qiao S, Zhang Y, Wang F. Generative Adversarial Matrix Completion Network based on Multi-Source Data Fusion for miRNA-Disease Associations Prediction. Brief Bioinform 2023; 24:bbad270. [PMID: 37482409 DOI: 10.1093/bib/bbad270] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/19/2023] [Revised: 06/16/2023] [Accepted: 07/04/2023] [Indexed: 07/25/2023] Open
Abstract
Numerous biological studies have shown that considering disease-associated micro RNAs (miRNAs) as potential biomarkers or therapeutic targets offers new avenues for the diagnosis of complex diseases. Computational methods have gradually been introduced to reveal disease-related miRNAs. Considering that previous models have not fused sufficiently diverse similarities, that their inappropriate fusion methods may lead to poor quality of the comprehensive similarity network and that their results are often limited by insufficiently known associations, we propose a computational model called Generative Adversarial Matrix Completion Network based on Multi-source Data Fusion (GAMCNMDF) for miRNA-disease association prediction. We create a diverse network connecting miRNAs and diseases, which is then represented using a matrix. The main task of GAMCNMDF is to complete the matrix and obtain the predicted results. The main innovations of GAMCNMDF are reflected in two aspects: GAMCNMDF integrates diverse data sources and employs a nonlinear fusion approach to update the similarity networks of miRNAs and diseases. Also, some additional information is provided to GAMCNMDF in the form of a 'hint' so that GAMCNMDF can work successfully even when complete data are not available. Compared with other methods, the outcomes of 10-fold cross-validation on two distinct databases validate the superior performance of GAMCNMDF with statistically significant results. It is worth mentioning that we apply GAMCNMDF in the identification of underlying small molecule-related miRNAs, yielding outstanding performance results in this specific domain. In addition, two case studies about two important neoplasms show that GAMCNMDF is a promising prediction method.
Affiliation(s)
- ShuDong Wang
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum (East China), 66 Changjiang Xi Lu, 266580, Shandong, China
| | - YunYin Li
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum (East China), 66 Changjiang Xi Lu, 266580, Shandong, China
| | - YuanYuan Zhang
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum (East China), 66 Changjiang Xi Lu, 266580, Shandong, China
| | - ShanChen Pang
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum (East China), 66 Changjiang Xi Lu, 266580, Shandong, China
| | - SiBo Qiao
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum (East China), 66 Changjiang Xi Lu, 266580, Shandong, China
| | - Yu Zhang
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum (East China), 66 Changjiang Xi Lu, 266580, Shandong, China
| | - FuYu Wang
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum (East China), 66 Changjiang Xi Lu, 266580, Shandong, China
| |
|
19
|
Wang S, Ren C, Zhang Y, Li Y, Pang S, Song T. Identifying potential small molecule-miRNA associations via Robust PCA based on γ-norm regularization. Brief Bioinform 2023; 24:bbad312. [PMID: 37670501 DOI: 10.1093/bib/bbad312] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/04/2023] [Revised: 07/18/2023] [Accepted: 08/10/2023] [Indexed: 09/07/2023] Open
Abstract
Dysregulation of microRNAs (miRNAs) is closely associated with refractory human diseases, and the identification of potential associations between small molecule (SM) drugs and miRNAs can provide valuable insights for clinical treatment. Existing computational techniques for inferring potential associations suffer from limitations in terms of accuracy and efficiency. To address these challenges, we devise a novel predictive model called RPCAΓNR, in which we propose a new Robust principal component analysis (PCA) framework based on γ-norm and l2,1-norm regularization and design an Augmented Lagrange Multiplier method to optimize it, thereby deriving the association scores. The Gaussian Interaction Profile Kernel Similarity is calculated to capture the similarity information of SMs and miRNAs in known associations. Through extensive evaluation, including Cross Validation Experiments, Independent Validation Experiment, Efficiency Analysis, Ablation Experiment, Matrix Sparsity Analysis, and Case Studies, RPCAΓNR outperforms state-of-the-art models concerning accuracy, efficiency and robustness. In conclusion, RPCAΓNR can significantly streamline the process of determining SM-miRNA associations, thus contributing to advancements in drug development and disease treatment.
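The Gaussian Interaction Profile (GIP) kernel similarity mentioned in the abstract above is usually computed from the rows of a binary association matrix. The following is a minimal sketch of the standard formulation, in which the bandwidth is normalised by the mean squared norm of the interaction profiles. The function name `gip_kernel` and the default `gamma_prime=1.0` are assumptions; the abstract does not give the authors' parameter settings.

```python
import math


def gip_kernel(adjacency, gamma_prime=1.0):
    """GIP kernel between the rows (interaction profiles) of a binary association matrix.

    K[i][j] = exp(-gamma * ||row_i - row_j||^2), with
    gamma = gamma_prime / (mean squared norm of the rows).
    """
    n = len(adjacency)
    mean_sq_norm = sum(sum(x * x for x in row) for row in adjacency) / n
    if mean_sq_norm == 0:
        raise ValueError("association matrix has no known associations")
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(adjacency[i], adjacency[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


# Hypothetical toy matrix: 3 small molecules (rows) by 3 miRNAs (columns).
toy = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
sm_similarity = gip_kernel(toy)
```

Passing the transposed matrix would give the corresponding miRNA-by-miRNA similarity.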
Affiliation(s)
- Shudong Wang
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum (East China), 66 Changjiang Xi Lu, 266580 Shandong, China
| | - Chuanru Ren
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum (East China), 66 Changjiang Xi Lu, 266580 Shandong, China
| | - Yulin Zhang
- College of Mathematics and Systems Science, Shandong University of Science and Technology, Xin An Street, 266590 Shandong, China
| | - Yunyin Li
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum (East China), 66 Changjiang Xi Lu, 266580 Shandong, China
| | - Shanchen Pang
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum (East China), 66 Changjiang Xi Lu, 266580 Shandong, China
| | - Tao Song
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum (East China), 66 Changjiang Xi Lu, 266580 Shandong, China
| |
|
20
|
Sun J, Xu M, Ru J, James-Bott A, Xiong D, Wang X, Cribbs AP. Small molecule-mediated targeting of microRNAs for drug discovery: Experiments, computational techniques, and disease implications. Eur J Med Chem 2023; 257:115500. [PMID: 37262996 PMCID: PMC11554572 DOI: 10.1016/j.ejmech.2023.115500] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Revised: 05/05/2023] [Accepted: 05/15/2023] [Indexed: 06/03/2023]
Abstract
Small molecules have been providing medical breakthroughs for human diseases for more than a century. Recently, identifying small molecule inhibitors that target microRNAs (miRNAs) has gained importance, despite the challenges posed by labour-intensive screening experiments and the significant efforts required for medicinal chemistry optimization. Numerous experimentally-verified cases have demonstrated the potential of miRNA-targeted small molecule inhibitors for disease treatment. This new approach is grounded in their posttranscriptional regulation of the expression of disease-associated genes. Reversing dysregulated gene expression using this mechanism may help control dysfunctional pathways. Furthermore, the ongoing improvement of algorithms has allowed for the integration of computational strategies built on top of laboratory-based data, facilitating a more precise and rational design and discovery of lead compounds. To complement the use of extensive pharmacogenomics data in prioritising potential drugs, our previous work introduced a computational approach based on only molecular sequences. Moreover, various computational tools for predicting molecular interactions in biological networks using similarity-based inference techniques have been accumulated in established studies. However, there are a limited number of comprehensive reviews covering both computational and experimental drug discovery processes. In this review, we provide a cohesive overview of both biological and computational applications in miRNA-targeted drug discovery, along with their disease implications and clinical significance. Finally, utilizing drug-target interaction (DTI) data from DrugBank, we showcase the effectiveness of deep learning for obtaining the physicochemical characterization of DTIs.
Affiliation(s)
- Jianfeng Sun
- Botnar Research Centre, Nuffield Department of Orthopedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, OX3 7LD, UK.
| | - Miaoer Xu
- Department of Biology, Emory University, Atlanta, GA, 30322, USA
| | - Jinlong Ru
- Chair of Prevention of Microbial Diseases, School of Life Sciences Weihenstephan, Technical University of Munich, Freising, 85354, Germany
| | - Anna James-Bott
- Botnar Research Centre, Nuffield Department of Orthopedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, OX3 7LD, UK
| | - Dapeng Xiong
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
| | - Xia Wang
- College of Animal Science and Technology, Northwest A&F University, Yangling, 712100, China.
| | - Adam P Cribbs
- Botnar Research Centre, Nuffield Department of Orthopedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, OX3 7LD, UK.
| |
Collapse
|
21
|
Qu J, Song Z, Cheng X, Jiang Z, Zhou J. Neighborhood-based inference and restricted Boltzmann machine for small molecule-miRNA associations prediction. PeerJ 2023; 11:e15889. [PMID: 37641598 PMCID: PMC10460564 DOI: 10.7717/peerj.15889] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Accepted: 07/21/2023] [Indexed: 08/31/2023] Open
Abstract
Background A growing number of experiments have shown that microRNAs (miRNAs) can be used as targets of small molecules (SMs) to regulate gene expression for treating diseases. Therefore, identifying SM-related miRNAs is helpful for the treatment of diseases in the domain of medical investigation. Methods This article presents a new computational model, NIRBMSMMA, which combines neighborhood-based inference (NI) and a restricted Boltzmann machine (RBM) to identify potential small molecule-miRNA associations. First, based on known SM-miRNA associations, SM similarity and miRNA similarity, NI was used to predict the score of an unknown SM-miRNA pair by summing the known associations between neighbors of the SM (miRNA) and the miRNA (SM). Second, utilizing a two-layered generative stochastic artificial neural network, RBM was used to predict SM-miRNA associations by learning the underlying probability distribution from known SM-miRNA associations. Finally, an ensemble learning model was used to combine NI and RBM for identifying potential SM-miRNA associations. Results We conducted global leave-one-out cross validation (LOOCV), miRNA-fixed LOOCV, SM-fixed LOOCV and five-fold cross validation to assess the performance of NIRBMSMMA based on three datasets. Results showed that NIRBMSMMA obtained areas under the curve (AUC) of 0.9912, 0.9875, 0.8376 and 0.9898 ± 0.0009 under global LOOCV, miRNA-fixed LOOCV, SM-fixed LOOCV and five-fold cross validation based on dataset 1, respectively. For dataset 2, the AUCs are 0.8645, 0.8720, 0.7066 and 0.8547 ± 0.0046 in turn. For dataset 3, the AUCs are 0.9884, 0.9802, 0.8239 and 0.9870 ± 0.0015 in turn. We also conducted case studies to further assess the predictive performance of NIRBMSMMA. These results show that the proposed model is a useful tool for predicting potential SM-miRNA associations.
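The neighborhood-based inference step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the top-k neighbor selection, the unweighted averaging of the two sides, and every name (`ni_score`, `top_k_neighbors`, the toy matrices) are assumptions made for illustration only.

```python
def top_k_neighbors(sim_row, self_index, k):
    """Indices of the k most similar entities, excluding the entity itself."""
    candidates = [i for i in range(len(sim_row)) if i != self_index]
    candidates.sort(key=lambda i: sim_row[i], reverse=True)
    return candidates[:k]


def ni_score(assoc, sm_sim, mi_sim, sm, mi, k=2):
    """Neighborhood-based score for the pair (sm, mi).

    assoc[s][m] is 1 for a known SM-miRNA association and 0 otherwise.
    """
    sm_neighbors = top_k_neighbors(sm_sim[sm], sm, k)
    mi_neighbors = top_k_neighbors(mi_sim[mi], mi, k)
    # Known associations between the SM's neighbors and the miRNA.
    sm_side = sum(assoc[s][mi] for s in sm_neighbors)
    # Known associations between the SM and the miRNA's neighbors.
    mi_side = sum(assoc[sm][m] for m in mi_neighbors)
    # Assumed combination rule: plain average of the two sides.
    return (sm_side + mi_side) / 2.0


# Hypothetical toy data: 3 small molecules x 3 miRNAs.
assoc = [[1, 0, 0],
         [1, 1, 0],
         [0, 0, 1]]
sm_sim = [[1.0, 0.9, 0.1],
          [0.9, 1.0, 0.2],
          [0.1, 0.2, 1.0]]
mi_sim = [[1.0, 0.8, 0.3],
          [0.8, 1.0, 0.1],
          [0.3, 0.1, 1.0]]

score = ni_score(assoc, sm_sim, mi_sim, sm=0, mi=1, k=1)
```

In this toy case the unknown pair (SM 0, miRNA 1) receives support from both sides, because the nearest neighbor of SM 0 is already linked to miRNA 1 and SM 0 is already linked to the nearest neighbor of miRNA 1.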
Affiliation(s)
- Jia Qu
- School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou, Jiangsu, China
- Zihao Song
- School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou, Jiangsu, China
- Xiaolong Cheng
- School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou, Jiangsu, China
- Zhibin Jiang
- Department of Computer Science and Engineering, Shaoxing University, Shaoxing, Zhejiang, China
- Jie Zhou
- Department of Computer Science and Engineering, Shaoxing University, Shaoxing, Zhejiang, China
|
22
|
Qu J, Song Z, Cheng X, Jiang Z, Zhou J. A new integrated framework for the identification of potential virus-drug associations. Front Microbiol 2023; 14:1179414. [PMID: 37675432 PMCID: PMC10478006 DOI: 10.3389/fmicb.2023.1179414] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/04/2023] [Accepted: 07/31/2023] [Indexed: 09/08/2023] Open
Abstract
Introduction With the increasingly serious problem of antiviral drug resistance, drug repurposing offers a time-efficient and cost-effective way to find potential therapeutic agents for disease. Computational models have the ability to quickly predict potential reusable drug candidates to treat diseases. Methods In this study, two matrix decomposition-based methods, i.e., Matrix Decomposition with Heterogeneous Graph Inference (MDHGI) and Bounded Nuclear Norm Regularization (BNNR), were integrated to predict anti-viral drugs. Moreover, global leave-one-out cross-validation (LOOCV), local LOOCV, and 5-fold cross-validation were implemented to evaluate the performance of the proposed model based on the DrugVirus dataset, which consists of 933 known associations between 175 drugs and 95 viruses. Results The results showed that the areas under the receiver operating characteristic curve (AUC) of global LOOCV and local LOOCV were 0.9035 and 0.8786, respectively. The average AUC and standard deviation of the 5-fold cross-validation for the DrugVirus dataset were 0.8856 ± 0.0032. We further implemented cross-validation based on MDAD and aBiofilm, respectively, to evaluate the performance of the model. In particular, the MDAD (aBiofilm) dataset contains 2,470 (2,884) known associations between 1,373 (1,470) drugs and 173 (140) microbes. In addition, two types of case studies were carried out to further verify the effectiveness of the model based on the DrugVirus and MDAD datasets. The results of the case studies supported the effectiveness of MHBVDA in identifying potential virus-drug associations as well as predicting potential drugs for new microbes.
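The AUC values reported in abstracts like this one summarize how well held-out known associations are ranked above unlabelled candidate pairs. The following is a minimal, generic sketch of the rank-based AUC, not the evaluation code of the study; the function name `auc` and the toy scores are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical prediction scores for four drug-virus pairs;
# label 1 marks a held-out known association.
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

The quadratic pairwise loop is chosen for readability; a sort-based formulation gives the same value more efficiently on large candidate sets.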
Affiliation(s)
- Jia Qu
- School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou, Jiangsu, China
- Zihao Song
- School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou, Jiangsu, China
- Xiaolong Cheng
- School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou, Jiangsu, China
- Zhibin Jiang
- School of Computer Science and Engineering, Shaoxing University, Shaoxing, Zhejiang, China
- Jie Zhou
- School of Computer Science and Engineering, Shaoxing University, Shaoxing, Zhejiang, China
|
23
|
Wang S, Liu T, Ren C, Wu W, Zhao Z, Pang S, Zhang Y. Predicting potential small molecule-miRNA associations utilizing truncated schatten p-norm. Brief Bioinform 2023; 24:bbad234. [PMID: 37366591 DOI: 10.1093/bib/bbad234] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/12/2023] [Revised: 06/05/2023] [Accepted: 06/06/2023] [Indexed: 06/28/2023] Open
Abstract
MicroRNAs (miRNAs) have significant implications in diverse human diseases and have proven to be effectively targeted by small molecules (SMs) for therapeutic interventions. However, current SM-miRNA association prediction models do not adequately capture SM/miRNA similarity. Matrix completion is an effective method for association prediction, but existing models use the nuclear norm instead of the rank function, which has some drawbacks. Therefore, we proposed a new approach for predicting SM-miRNA associations by utilizing the truncated Schatten p-norm (TSPN). First, the SM/miRNA similarity was preprocessed by incorporating the Gaussian interaction profile kernel similarity method. This identified more SM/miRNA similarities and significantly improved the SM-miRNA prediction accuracy. Next, we constructed a heterogeneous SM-miRNA network by combining biological information from three matrices and represented the network with its adjacency matrix. Finally, we constructed the prediction model by minimizing the truncated Schatten p-norm of this adjacency matrix, and we developed an efficient iterative algorithmic framework to solve the model. In this framework, we also used a weighted singular value shrinkage algorithm to avoid the problem of excessive singular value shrinkage. The truncated Schatten p-norm approximates the rank function more closely than the nuclear norm, so the predictions are more accurate. We performed four different cross-validation experiments on two separate datasets, and TSPN outperformed several state-of-the-art methods. In addition, published literature confirms a large number of the associations predicted by TSPN in four case studies. Therefore, TSPN is a reliable model for SM-miRNA association prediction.
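The Gaussian interaction profile (GIP) kernel similarity mentioned in this abstract is commonly computed as K(i, j) = exp(-γ‖IP(i) − IP(j)‖²), with γ set to a bandwidth parameter divided by the mean squared norm of the interaction profiles. The sketch below follows that common formulation; it is not taken from the TSPN code, and the function name, the default bandwidth of 1.0, and the toy profiles are assumptions.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel over the rows of `profiles`.

    profiles[i] is the binary interaction profile of entity i, for
    example the row of the SM-miRNA association matrix for one SM.
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    # Normalizing the bandwidth keeps the kernel comparable across datasets.
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2
                          for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


# Hypothetical profiles: entities 0 and 1 share identical interactions,
# entity 2 interacts with a different partner.
K = gip_kernel([[1, 0, 1],
                [1, 0, 1],
                [0, 1, 0]])
```

Entities with identical profiles get similarity 1, and similarity decays with the number of partners on which two profiles disagree.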
Affiliation(s)
- Shudong Wang
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
- Tiyao Liu
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
- Chuanru Ren
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
- Wenhao Wu
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
- Zhiyuan Zhao
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
- Shanchen Pang
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
- Yuanyuan Zhang
- College of Information and Control Engineering, Qingdao University of Technology, Qingdao 266580, China
|
24
|
Binatlı OC, Gönen M. MOKPE: drug-target interaction prediction via manifold optimization based kernel preserving embedding. BMC Bioinformatics 2023; 24:276. [PMID: 37407927 DOI: 10.1186/s12859-023-05401-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/22/2023] [Accepted: 06/25/2023] [Indexed: 07/07/2023] Open
Abstract
BACKGROUND In many applications of bioinformatics, data stem from distinct heterogeneous sources. One of the well-known examples is the identification of drug-target interactions (DTIs), which is of significant importance in drug discovery. In this paper, we propose a novel framework, manifold optimization based kernel preserving embedding (MOKPE), to efficiently solve the problem of modeling heterogeneous data. Our model projects heterogeneous drug and target data into a unified embedding space by preserving drug-target interactions and drug-drug, target-target similarities simultaneously. RESULTS We performed ten replications of ten-fold cross validation on four different drug-target interaction network data sets for predicting DTIs for previously unseen drugs. The classification evaluation metrics showed better or comparable performance compared to previous similarity-based state-of-the-art methods. We also evaluated MOKPE on predicting unknown DTIs of a given network. Our implementation of the proposed algorithm in R together with the scripts that replicate the reported experiments is publicly available at https://github.com/ocbinatli/mokpe .
Affiliation(s)
- Oğuz C Binatlı
- Graduate School of Sciences and Engineering, Koç University, 34450, Istanbul, Turkey
- Mehmet Gönen
- Department of Industrial Engineering, College of Engineering, Koç University, 34450, Istanbul, Turkey.
- School of Medicine, Koç University, 34450, Istanbul, Turkey.
|
25
|
Xiang H, Guo R, Liu L, Guo T, Huang Q. MSIF-LNP: microbial and human health association prediction based on matrix factorization noise reduction for similarity fusion and bidirectional linear neighborhood label propagation. Front Microbiol 2023; 14:1216811. [PMID: 37389340 PMCID: PMC10303805 DOI: 10.3389/fmicb.2023.1216811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Accepted: 05/25/2023] [Indexed: 07/01/2023] Open
Abstract
Studies have shown that microbes are closely related to human health. Clarifying the relationship between microbes and diseases that cause health problems can provide new solutions for the treatment, diagnosis, and prevention of diseases, and provide strong protection for human health. Currently, more and more similarity fusion methods are available to predict potential microbe-disease associations. However, existing methods have noise problems in the process of similarity fusion. To address this issue, we propose a method called MSIF-LNP that can efficiently and accurately identify potential connections between microbes and diseases, and thus clarify the relationship between microbes and human health. This method is based on matrix factorization denoising similarity fusion (MSIF) and bidirectional linear neighborhood propagation (LNP) techniques. First, we use non-linear iterative fusion to obtain a similarity network for microbes and diseases by fusing the initial microbe and disease similarities, and then reduce noise by using matrix factorization. Next, we use the initial microbe-disease association pairs as label information to perform linear neighborhood label propagation on the denoised similarity network of microbes and diseases. This enables us to obtain a score matrix for predicting microbe-disease relationships. We evaluate the predictive performance of MSIF-LNP and seven other advanced methods through 10-fold cross-validation, and the experimental results show that MSIF-LNP outperformed the other seven methods in terms of AUC. In addition, the analysis of cystic fibrosis and obesity cases further demonstrates the predictive ability of this method in practical applications.
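The label propagation step described in this abstract can be sketched as the iteration F ← αWF + (1 − α)Y, run once over the microbe similarity network and once over the disease similarity network, with the two score matrices averaged. This is a simplified illustration and not the MSIF-LNP implementation: row-normalized similarities stand in for the linear neighborhood reconstruction weights, and all names, the value of α, and the toy data are assumptions.

```python
def row_normalize(sim):
    """Scale each row to sum to 1 so propagation stays bounded."""
    out = []
    for row in sim:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def propagate(weights, labels, alpha=0.5, iterations=100):
    """Iterate F <- alpha * W F + (1 - alpha) * Y from F = Y."""
    n, m = len(labels), len(labels[0])
    scores = [row[:] for row in labels]
    for _ in range(iterations):
        scores = [[alpha * sum(weights[i][k] * scores[k][j] for k in range(n))
                   + (1 - alpha) * labels[i][j]
                   for j in range(m)]
                  for i in range(n)]
    return scores


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def bidirectional_scores(microbe_sim, disease_sim, labels, alpha=0.5):
    """Average the propagation results from the microbe and disease sides."""
    from_microbes = propagate(row_normalize(microbe_sim), labels, alpha)
    from_diseases = transpose(
        propagate(row_normalize(disease_sim), transpose(labels), alpha))
    return [[(a + b) / 2 for a, b in zip(r1, r2)]
            for r1, r2 in zip(from_microbes, from_diseases)]


# Hypothetical toy data: 3 microbes x 2 diseases; microbe 1 has no labels.
microbe_sim = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
disease_sim = [[0, 1], [1, 0]]
labels = [[1, 0], [0, 0], [0, 1]]
scores = bidirectional_scores(microbe_sim, disease_sim, labels)
```

The unlabelled microbe ends up with non-zero scores for both diseases because label information flows to it from its similar neighbors.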
Affiliation(s)
- Hui Xiang
- College of Physical Education, Southwest Forestry University, Kunming, Yunnan, China
- Rong Guo
- College of Physical Education, Southwest Forestry University, Kunming, Yunnan, China
- Li Liu
- College of Physical Education, Suzhou University, Suzhou, Anhui, China
- Tengjie Guo
- College of Physical Education, Yunnan Normal University, Kunming, Yunnan, China
- Quan Huang
- College of Physical Education, Southwest Forestry University, Kunming, Yunnan, China
|
26
|
Niu Z, Gao X, Xia Z, Zhao S, Sun H, Wang H, Liu M, Kong X, Ma C, Zhu H, Gao H, Liu Q, Yang F, Song X, Lu J, Zhou X. Prediction of small molecule drug-miRNA associations based on GNNs and CNNs. Front Genet 2023; 14:1201934. [PMID: 37323664 PMCID: PMC10268031 DOI: 10.3389/fgene.2023.1201934] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2023] [Accepted: 05/17/2023] [Indexed: 06/17/2023] Open
Abstract
MicroRNAs (miRNAs) play a crucial role in various biological processes and human diseases, and are considered as therapeutic targets for small molecules (SMs). Due to the time-consuming and expensive biological experiments required to validate SM-miRNA associations, there is an urgent need to develop new computational models to predict novel SM-miRNA associations. The rapid development of end-to-end deep learning models and the introduction of ensemble learning ideas provide us with new solutions. Based on the idea of ensemble learning, we integrate graph neural networks (GNNs) and convolutional neural networks (CNNs) to propose a miRNA and small molecule association prediction model (GCNNMMA). Firstly, we use GNNs to effectively learn the molecular structure graph data of small molecule drugs, while using CNNs to learn the sequence data of miRNAs. Secondly, since the black-box effect of deep learning models makes them difficult to analyze and interpret, we introduce attention mechanisms to address this issue. Finally, the neural attention mechanism allows the CNNs model to learn the sequence data of miRNAs to determine the weight of sub-sequences in miRNAs, and then predict the association between miRNAs and small molecule drugs. To evaluate the effectiveness of GCNNMMA, we implement two different cross-validation (CV) methods based on two different datasets. Experimental results show that the cross-validation results of GCNNMMA on both datasets are better than those of other comparison models. In a case study, Fluorouracil was found to be associated with five different miRNAs in the top 10 predicted associations, and published experimental literature confirmed that Fluorouracil is a metabolic inhibitor used to treat liver cancer, breast cancer, and other tumors. Therefore, GCNNMMA is an effective tool for mining the relationship between small molecule drugs and miRNAs relevant to diseases.
|
27
|
Quan Y, Xiong ZK, Zhang KX, Zhang QY, Zhang W, Zhang HY. Evolution-strengthened knowledge graph enables predicting the targetability and druggability of genes. PNAS NEXUS 2023; 2:pgad147. [PMID: 37188275 PMCID: PMC10178923 DOI: 10.1093/pnasnexus/pgad147] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2023] [Accepted: 04/21/2023] [Indexed: 05/17/2023]
Abstract
Identifying promising targets is a critical step in modern drug discovery, with causative genes of diseases being an important source of successful targets. Previous studies have found that the pathogeneses of various diseases are closely related to the evolutionary events of organisms. Accordingly, evolutionary knowledge can facilitate the prediction of causative genes and further accelerate target identification. With the development of modern biotechnology, massive biomedical data have been accumulated, and knowledge graphs (KGs) have emerged as a powerful approach for integrating and utilizing vast amounts of data. In this study, we constructed an evolution-strengthened knowledge graph (ESKG) and validated applications of ESKG in the identification of causative genes. More importantly, we developed an ESKG-based machine learning model named GraphEvo, which can effectively predict the targetability and the druggability of genes. We further investigated the explainability of the ESKG in druggability prediction by dissecting the evolutionary hallmarks of successful targets. Our study highlights the importance of evolutionary knowledge in biomedical research and demonstrates the potential power of ESKG in promising target identification. The data set of ESKG and the code of GraphEvo can be downloaded from https://github.com/Zhankun-Xiong/GraphEvo.
Affiliation(s)
- Ke-Xin Zhang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, Hubei 430070, P. R. China
- Qing-Ye Zhang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, Hubei 430070, P. R. China
- Wen Zhang
- To whom correspondence should be addressed: ;
|
28
|
Chen P, Zheng H. Drug-target interaction prediction based on spatial consistency constraint and graph convolutional autoencoder. BMC Bioinformatics 2023; 24:151. [PMID: 37069493 PMCID: PMC10109239 DOI: 10.1186/s12859-023-05275-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Accepted: 04/05/2023] [Indexed: 04/19/2023] Open
Abstract
BACKGROUND Drug-target interaction (DTI) prediction plays an important role in drug discovery and repositioning. However, most of the computational methods used for identifying relevant DTIs do not consider the invariance of the nearest neighbour relationships between drugs or targets. In other words, they do not take into account the invariance of the topological relationships between nodes during representation learning. It may limit the performance of the DTI prediction methods. RESULTS Here, we propose a novel graph convolutional autoencoder-based model, named SDGAE, to predict DTIs. As the graph convolutional network cannot handle isolated nodes in a network, a pre-processing step was applied to reduce the number of isolated nodes in the heterogeneous network and facilitate effective exploitation of the graph convolutional network. By maintaining the graph structure during representation learning, the nearest neighbour relationships between nodes in the embedding space remained as close as possible to the original space. CONCLUSIONS Overall, we demonstrated that SDGAE can automatically learn more informative and robust feature vectors of drugs and targets, thus exhibiting significantly improved predictive accuracy for DTIs.
Affiliation(s)
- Peng Chen
- School of Computer Science and Technology, University of Science and Technology of China, Jinzhai Road 96, Hefei, 230027, People's Republic of China
- Anhui Key Laboratory of Software Engineering in Computing and Communication, University of Science and Technology of China, Jinzhai Road 96, Hefei, 230027, People's Republic of China
- Haoran Zheng
- School of Computer Science and Technology, University of Science and Technology of China, Jinzhai Road 96, Hefei, 230027, People's Republic of China.
- Anhui Key Laboratory of Software Engineering in Computing and Communication, University of Science and Technology of China, Jinzhai Road 96, Hefei, 230027, People's Republic of China.
|
29
|
Wang S, Ren C, Zhang Y, Pang S, Qiao S, Wu W, Lin B. AMCSMMA: Predicting Small Molecule-miRNA Potential Associations Based on Accurate Matrix Completion. Cells 2023; 12:1123. [PMID: 37190032 DOI: 10.3390/cells12081123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2023] [Revised: 04/02/2023] [Accepted: 04/04/2023] [Indexed: 05/17/2023] Open
Abstract
Exploring potential associations between small molecule drugs (SMs) and microRNAs (miRNAs) is significant for drug development and disease treatment. Since biological experiments are expensive and time-consuming, we propose a computational model based on accurate matrix completion for predicting potential SM-miRNA associations (AMCSMMA). Initially, a heterogeneous SM-miRNA network is constructed, and its adjacency matrix is taken as the target matrix. An optimization framework is then proposed to recover the target matrix with the missing values by minimizing its truncated nuclear norm, an accurate, robust, and efficient approximation to the rank function. Finally, we design an effective two-step iterative algorithm to solve the optimization problem and obtain the prediction scores. After determining the optimal parameters, we conduct four kinds of cross-validation experiments based on two datasets, and the results demonstrate that AMCSMMA is superior to the state-of-the-art methods. In addition, we implement another validation experiment, in which more evaluation metrics in addition to the AUC are introduced and finally achieve great results. In two types of case studies, a large number of SM-miRNA pairs with high predictive scores are confirmed by the published experimental literature. In summary, AMCSMMA has superior performance in predicting potential SM-miRNA associations, which can provide guidance for biological experiments and accelerate the discovery of new SM-miRNA associations.
Affiliation(s)
- Shudong Wang
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
- Chuanru Ren
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
- Yulin Zhang
- College of Mathematics and Systems Science, Shandong University of Science and Technology, Qingdao 266580, China
- Shanchen Pang
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
- Sibo Qiao
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
- Wenhao Wu
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
- Boyang Lin
- College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
|
30
|
Castiglione F, Nardini C, Onofri E, Pedicini M, Tieri P. Explainable Drug Repurposing Approach From Biased Random Walks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:1009-1019. [PMID: 35839194 DOI: 10.1109/tcbb.2022.3191392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Drug repurposing is a highly active research area, aiming at finding novel uses for drugs that have been previously developed for other therapeutic purposes. Despite the proliferation of methodologies, success is still partial, and each approach offers its own particular advantages. In this composite landscape, we present a novel methodology focusing on an efficient mathematical procedure based on gene similarity scores and biased random walks which rely on robust drug-gene-disease association data sets. The recommendation mechanism is further unveiled by means of the Markov chain underlying the random walk process, hence providing explainability about how findings are suggested. Performance evaluation and the analysis of a case study on rheumatoid arthritis show that our approach is accurate in providing useful recommendations and is computationally efficient, compared with state-of-the-art drug repurposing approaches.
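A biased random walk of the kind this abstract refers to can be viewed as a Markov chain whose transition probabilities are proportional to edge weights. The sketch below uses a generic random walk with restart as a stand-in; it is not the authors' procedure, and the function names, the restart probability, and the toy weights are assumptions made for illustration.

```python
def transition_matrix(weights):
    """Row-normalize non-negative edge weights into transition probabilities.

    Isolated nodes (all-zero rows) get a self-loop so that every row
    remains a valid probability distribution.
    """
    matrix = []
    for i, row in enumerate(weights):
        total = sum(row)
        if total == 0:
            matrix.append([1.0 if j == i else 0.0 for j in range(len(row))])
        else:
            matrix.append([w / total for w in row])
    return matrix


def random_walk_with_restart(transition, start, restart=0.3, steps=200):
    """Approximate the stationary visiting probabilities of a walker
    that jumps back to `start` with probability `restart` at each step."""
    n = len(transition)
    p0 = [1.0 if i == start else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(steps):
        p = [(1 - restart) * sum(p[i] * transition[i][j] for i in range(n))
             + restart * p0[j]
             for j in range(n)]
    return p


# Hypothetical 3-node graph: the edge 0-1 carries three times the
# weight (for example, a similarity score) of the edge 0-2.
weights = [[0, 3, 1],
           [3, 0, 0],
           [1, 0, 0]]
visit_prob = random_walk_with_restart(transition_matrix(weights), start=0)
```

Because each entry of the transition matrix is an explicit probability, the path by which a node earns a high visiting probability can be traced back through the chain, which is the sense in which such a walk supports explanation.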
|
31
|
Yang X, Yang G, Chu J. The Computational Drug Repositioning Without Negative Sampling. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:1506-1517. [PMID: 36197871 DOI: 10.1109/tcbb.2022.3212051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Computational drug repositioning technology is an effective tool to accelerate drug development. Although this technique has been widely and successfully used in recent decades, many existing models still suffer from drawbacks related to the massive number of unvalidated drug-disease associations and the use of the inner product. The limitations of these works are mainly due to the following two reasons: firstly, previous works used negative sampling techniques to treat unvalidated drug-disease associations as negative samples, which is invalid in real-world settings; secondly, the inner product cannot fully take into account the feature information contained in the latent factors of drugs and diseases. In this paper, we propose a novel PUON framework for addressing the above deficiencies, which models the risk estimator of computational drug repositioning using only validated (Positive) and unvalidated (Unlabelled) drug-disease associations without employing negative sampling techniques. PUON also introduces an Outer Neighborhood-based classifier for modeling the cross-feature information of the latent factors. For a comprehensive comparison, we considered 6 popular baselines. Extensive experiments on four real-world datasets showed that the PUON model achieved the best performance based on 6 evaluation metrics.
|
32
|
Sun J, Ru J, Ramos-Mucci L, Qi F, Chen Z, Chen S, Cribbs AP, Deng L, Wang X. DeepsmirUD: Prediction of Regulatory Effects on microRNA Expression Mediated by Small Molecules Using Deep Learning. Int J Mol Sci 2023; 24:1878. [PMID: 36768205 PMCID: PMC9915273 DOI: 10.3390/ijms24031878] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/28/2022] [Revised: 12/26/2022] [Accepted: 01/12/2023] [Indexed: 01/21/2023] Open
Abstract
Aberrant miRNA expression has been associated with a large number of human diseases. Therefore, targeting miRNAs to regulate their expression levels has become an important therapy against diseases that stem from the dysfunction of pathways regulated by miRNAs. In recent years, small molecules have demonstrated enormous potential as drugs to regulate miRNA expression (i.e., SM-miR). A clear understanding of the mechanism of action of small molecules on the upregulation and downregulation of miRNA expression allows precise diagnosis and treatment of oncogenic pathways. However, outside of a slow and costly process of experimental determination, computational strategies to assist this on an ad hoc basis have yet to be formulated. In this work, we developed, to the best of our knowledge, the first cross-platform prediction tool, DeepsmirUD, to infer small-molecule-mediated regulatory effects on miRNA expression (i.e., upregulation or downregulation). This method is powered by 12 cutting-edge deep-learning frameworks and achieved AUC values of 0.843/0.984 and AUCPR values of 0.866/0.992 on two independent test datasets. With a complementarily constructed network inference approach based on similarity, we report a significantly improved accuracy of 0.813 in determining the regulatory effects of nearly 650 associated SM-miR relations, each formed with either novel small molecule or novel miRNA. By further integrating miRNA-cancer relationships, we established a database of potential pharmaceutical drugs from 1343 small molecules for 107 cancer diseases to understand the drug mechanisms of action and offer novel insight into drug repositioning. Furthermore, we have employed DeepsmirUD to predict the regulatory effects of a large number of high-confidence associated SM-miR relations. Taken together, our method shows promise to accelerate the development of potential miRNA targets and small molecule drugs.
Affiliation(s)
- Jianfeng Sun
- College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China
- Botnar Research Centre, Nuffield Department of Orthopedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
- Jinlong Ru
- Institute of Virology, Helmholtz Centre Munich—German Research Center for Environmental Health, 85764 Neuherberg, Germany
- Chair of Prevention of Microbial Diseases, School of Life Sciences Weihenstephan, Technical University of Munich, 85354 Freising, Germany
- Lorenzo Ramos-Mucci
- Botnar Research Centre, Nuffield Department of Orthopedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
- Fei Qi
- Institute of Genomics, School of Medicine, Huaqiao University, Xiamen 362021, China
- Zihao Chen
- Department of Computational Biology for Drug Discovery, Biolife Biotechnology Ltd., Zhumadian 463200, China
- Suyuan Chen
- Leibniz-Institut für Analytische Wissenschaften–ISAS–e.V., Otto-Hahn-Strasse 6b, 44227 Dortmund, Germany
- Adam P. Cribbs
- Botnar Research Centre, Nuffield Department of Orthopedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
- Li Deng
- Institute of Virology, Helmholtz Centre Munich—German Research Center for Environmental Health, 85764 Neuherberg, Germany
- Chair of Prevention of Microbial Diseases, School of Life Sciences Weihenstephan, Technical University of Munich, 85354 Freising, Germany
| | - Xia Wang
- College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ 85721, USA
33
Luo Y, Peng L, Shan W, Sun M, Luo L, Liang W. Machine learning in the development of targeting microRNAs in human disease. Front Genet 2023; 13:1088189. [PMID: 36685965 PMCID: PMC9845262 DOI: 10.3389/fgene.2022.1088189] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/03/2022] [Accepted: 12/12/2022] [Indexed: 01/05/2023]
Abstract
A microRNA (miRNA) is a small, single-stranded, non-coding ribonucleic acid that plays a crucial role in RNA silencing and can regulate gene expression. With the in-depth study of miRNAs in development and disease, miRNAs have become attractive targets for novel therapeutic strategies. Exploring miRNA-targeting therapy only through experiments is expensive and laborious, so it is essential to develop novel and efficient computational methods to narrow down the search. Recent advances in machine learning applied to biomedical informatics provide opportunities to explore miRNA-targeting drugs, thus promoting miRNA therapeutics. This review provides an overview of recent advances in miRNA-targeting therapeutics using machine learning. First, we describe the basics of predicting miRNA-targeting drugs, including pharmacogenomic data resources and data preprocessing. Then we present the primary machine learning algorithms and elaborate on their application in discovering relationships among miRNAs, drugs, and diseases. Finally, in light of the progress of miRNA-targeting therapeutics, we analyze and discuss the current challenges and opportunities that machine learning confronts.
Affiliation(s)
- Yuxun Luo
  - School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, China
  - Hunan Key Laboratory for Service Computing and Novel Software Technology, Xiangtan, China
- Li Peng
  - School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, China
  - Hunan Key Laboratory for Service Computing and Novel Software Technology, Xiangtan, China
- Wenyu Shan
  - School of Computer Science, University of South China, Hengyang, China
- Mengyue Sun
  - School of Polymer Science and Polymer Engineering, The University of Akron, Akron, OH, United States
- Lingyun Luo
  - School of Computer Science, University of South China, Hengyang, China
- Wei Liang (corresponding author)
  - School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, China
  - Hunan Key Laboratory for Service Computing and Novel Software Technology, Xiangtan, China
34
Li J, Lin H, Wang Y, Li Z, Wu B. Prediction of potential small molecule-miRNA associations based on heterogeneous network representation learning. Front Genet 2022; 13:1079053. [PMID: 36531225 PMCID: PMC9755196 DOI: 10.3389/fgene.2022.1079053] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/25/2022] [Accepted: 11/21/2022] [Indexed: 11/25/2023]
Abstract
MicroRNAs (miRNAs) are closely associated with the occurrence and development of many complex human diseases. A growing number of studies have shown that miRNAs are emerging as new therapeutic targets of small molecule (SM) drugs. Since traditional experimental methods are expensive and time-consuming, it is particularly important to find efficient computational approaches to predict potential small molecule-miRNA (SM-miRNA) associations. Considering that integrating multi-source heterogeneous information related to SM-miRNA association prediction would provide comprehensive insight into the features of both SMs and miRNAs, we proposed a novel model of Small Molecule-MiRNA Association prediction based on Heterogeneous Network Representation Learning (SMMA-HNRL) to predict potential SM-miRNA associations more precisely. In SMMA-HNRL, a novel heterogeneous information network was constructed with SM nodes, miRNA nodes and disease nodes. To access and utilize the topological information of the heterogeneous information network, feature vectors of SM and miRNA nodes were obtained by two different heterogeneous network representation learning algorithms (HeGAN and HIN2Vec), respectively, and merged with a connect operation. Finally, LightGBM was chosen as the classifier of SMMA-HNRL for predicting potential SM-miRNA associations. Ten-fold cross-validation was conducted to evaluate the prediction performance of SMMA-HNRL; it achieved an area under the ROC curve of 0.9875, which was superior to that of three other state-of-the-art models. On two independent validation datasets, the test results demonstrated the robustness of our model. Moreover, three case studies were performed. As a result, 35, 37, and 22 of the top 50 predicted miRNAs associated with 5-FU, cisplatin, and imatinib, respectively, were validated by the experimental literature, which confirmed the effectiveness of SMMA-HNRL.
The source code and experimental data of SMMA-HNRL are available at https://github.com/SMMA-HNRL/SMMA-HNRL.
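The abstract above reports an area under the ROC curve obtained with 10-fold cross-validation. As a rough illustration of what that metric measures, and not the SMMA-HNRL code itself, the following sketch computes the AUC as the fraction of positive/negative pairs ranked correctly and splits sample indices into folds. The function names `roc_auc` and `k_fold_indices` are hypothetical helpers introduced here for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    labels: iterable of 0/1 ground-truth values.
    scores: iterable of predicted scores, same length as labels.
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k=10):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size.

    Assumption for illustration: no shuffling or stratification is applied.
    """
    folds = []
    start = 0
    for i in range(k):
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds
```

In a cross-validation loop, each fold would serve once as the held-out set, and `roc_auc` would be evaluated on the scores predicted for that fold; the pairwise form is quadratic in the sample count, so it suits small examples rather than large benchmarks.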
Affiliation(s)
- Jianwei Li
  - School of Artificial Intelligence, Institute of Computational Medicine, Hebei University of Technology, Tianjin, China
  - Hebei Province Key Laboratory of Big Data Calculation, Hebei University of Technology, Tianjin, China
- Hongxin Lin
  - School of Artificial Intelligence, Institute of Computational Medicine, Hebei University of Technology, Tianjin, China
- Yinfei Wang
  - School of Artificial Intelligence, Institute of Computational Medicine, Hebei University of Technology, Tianjin, China
- Zhiguang Li
  - School of Artificial Intelligence, Institute of Computational Medicine, Hebei University of Technology, Tianjin, China
- Baoqin Wu
  - School of Artificial Intelligence, Institute of Computational Medicine, Hebei University of Technology, Tianjin, China
35
Li P, Tiwari P, Xu J, Qian Y, Ai C, Ding Y, Guo F. Sparse regularized joint projection model for identifying associations of non-coding RNAs and human diseases. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.110044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
36
Ni J, Cheng X, Ni T, Liang J. Identifying SM-miRNA associations based on layer attention graph convolutional network and matrix decomposition. Front Mol Biosci 2022; 9:1009099. [PMID: 36504714 PMCID: PMC9732030 DOI: 10.3389/fmolb.2022.1009099] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/05/2022] [Accepted: 11/03/2022] [Indexed: 11/27/2022]
Abstract
The accurate prediction of potential associations between microRNAs (miRNAs) and small molecule (SM) drugs can enhance our knowledge of how SMs cure endogenous miRNA-related diseases. Given that traditional methods for predicting SM-miRNA associations are time-consuming and arduous, a number of computational models have been proposed to predict potential SM-miRNA associations. However, several of these strategies failed to eliminate noise from the known SM-miRNA association information or failed to prioritize the most significant known SM-miRNA associations. Therefore, we proposed a model of Graph Convolutional Network with Layer Attention mechanism for SM-MiRNA Association prediction (GCNLASMMA). First, we obtained new SM-miRNA associations by matrix decomposition. The new SM-miRNA associations, as well as the integrated SM similarity and miRNA similarity, were subsequently incorporated into a heterogeneous network. Finally, a graph convolutional network with an attention mechanism was used to compute the reconstructed SM-miRNA association matrix. Furthermore, four types of cross-validation and two types of case studies were performed to assess the performance of GCNLASMMA. In cross-validation, GCNLASMMA achieved excellent performance under global Leave-One-Out Cross Validation (LOOCV), miRNA-fixed LOOCV, SM-fixed LOOCV and 5-fold cross-validation. Numerous hypothesized associations in the case studies were confirmed by the experimental literature. All of these results confirmed that GCNLASMMA is a trustworthy association inference method.
37
Peng L, Tu Y, Huang L, Li Y, Fu X, Chen X. DAESTB: inferring associations of small molecule-miRNA via a scalable tree boosting model based on deep autoencoder. Brief Bioinform 2022; 23:6827720. [PMID: 36377749 DOI: 10.1093/bib/bbac478] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 07/16/2022] [Revised: 09/28/2022] [Accepted: 10/08/2022] [Indexed: 11/16/2022]
Abstract
MicroRNAs (miRNAs) are closely related to a variety of human diseases: they not only regulate gene expression but also play an important role in human life activities and are viable targets of small molecule drugs for disease treatment. Current computational techniques for predicting potential associations between small molecules and miRNAs are not sufficiently accurate. Here, we proposed a new computational method based on a deep autoencoder and a scalable tree boosting model (DAESTB) to predict associations between small molecules and miRNAs. First, we constructed a high-dimensional feature matrix by integrating small molecule-small molecule similarity, miRNA-miRNA similarity and known small molecule-miRNA associations. Second, we reduced the feature dimensionality of the integrated matrix using a deep autoencoder to obtain the latent feature representation of each small molecule-miRNA pair. Finally, a scalable tree boosting model was used to predict potential small molecule-miRNA associations. Experiments on two datasets demonstrated the superiority of DAESTB over various state-of-the-art methods, with DAESTB achieving the best AUC value. Furthermore, in three case studies, a large number of the associations predicted by DAESTB were confirmed by the publicly accessible literature. We envision that DAESTB could serve as a useful biological model for predicting potential small molecule-miRNA associations.
Affiliation(s)
- Li Peng
  - College of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, Hunan, China
  - Hunan Key Laboratory for Service Computing and Novel Software Technology
- Yuan Tu
  - College of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, Hunan, China
- Li Huang
  - Academy of Arts and Design, Tsinghua University, Beijing, 10084, China
  - The Future Laboratory, Tsinghua University, Beijing, 10084, China
- Yang Li
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China
- Xiangzheng Fu
  - College of Information Science and Engineering, Hunan University, Changsha, 410082, Hunan, China
- Xiang Chen
  - College of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, Hunan, China
38
Huang L, Zhang L, Chen X. Updated review of advances in microRNAs and complex diseases: taxonomy, trends and challenges of computational models. Brief Bioinform 2022; 23:6686738. [PMID: 36056743 DOI: 10.1093/bib/bbac358] [Citation(s) in RCA: 63] [Impact Index Per Article: 21.0] [Received: 06/20/2022] [Revised: 07/24/2022] [Accepted: 07/30/2022] [Indexed: 12/12/2022]
Abstract
Since the problem was proposed in the late 2000s, microRNA-disease association (MDA) prediction has been implemented based on the data fusion paradigm. Integrating diverse data sources yields a more comprehensive research perspective, but it also poses a challenge to algorithm design: generating accurate, concise and consistent representations of the fused data. After more than a decade of research progress, a relatively simple algorithm such as a score function or a single computation layer may no longer be sufficient for further improving predictive performance. Advanced model design has become more frequent in recent years, particularly in the form of reasonably combining multiple algorithms, a process known as model fusion. In the current review, we present 29 state-of-the-art models and introduce a taxonomy of computational models for MDA prediction based on model fusion and non-fusion. The new taxonomy exhibits notable changes in the algorithmic architecture of models compared with that of earlier ones in the 2017 review by Chen et al. Moreover, we discuss the progress that has been made since 2017 towards overcoming the obstacles to effective MDA prediction and elaborate on how future models can be designed according to a set of new schemas. Lastly, we analyse the strengths and weaknesses of each model category in the proposed taxonomy and propose future research directions from diverse perspectives for enhancing model performance.
Affiliation(s)
- Li Huang
  - Academy of Arts and Design, Tsinghua University, Beijing, 10084, China
  - The Future Laboratory, Tsinghua University, Beijing, 10084, China
- Li Zhang
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
- Xing Chen
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
  - Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
39
Kang LP, Lin KB, Lu P, Yang F, Chen JP. Multitype drug interaction prediction based on the deep fusion of drug features and topological relationships. PLoS One 2022; 17:e0273764. [PMID: 36037188 PMCID: PMC9423685 DOI: 10.1371/journal.pone.0273764] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/16/2022] [Accepted: 08/14/2022] [Indexed: 11/21/2022]
Abstract
Drug–drug interaction (DDI) prediction has received considerable attention from industry and academia. Most existing methods predict DDIs from drug attributes or relationships with neighbors, which does not guarantee that informative drug embeddings for prediction will be obtained. To address this limitation, we propose a multitype drug interaction prediction method based on the deep fusion of drug features and topological relationships, abbreviated DM-DDI. The proposed method adopts a deep fusion strategy to combine drug features and topologies to learn representative drug embeddings for DDI prediction. Specifically, a deep neural network model is first used on the drug feature matrix to extract feature information, while a graph convolutional network model is employed to capture structural information from the adjacency matrix. Then, we adopt delivery operations that allow the two models to exchange information between layers, as well as an attention mechanism for a weighted fusion of the two learned embeddings before the output layer. Finally, the unified drug embeddings for the downstream task are obtained. We conducted extensive experiments on real-world datasets, and the experimental results demonstrated that DM-DDI achieved more accurate prediction results than state-of-the-art baselines. Furthermore, in two tasks that are more similar to real-world scenarios, DM-DDI outperformed other prediction methods for unknown drugs.
Affiliation(s)
- Li-Ping Kang
  - School of Computer and Information Engineering, Xiamen University of Technology, Xiamen, China
- Kai-Biao Lin (corresponding author)
  - School of Computer and Information Engineering, Xiamen University of Technology, Xiamen, China
  - Engineering Research Center of Big Data Application in Private Health Medicine, Fujian Provincial University, Putian, China
- Ping Lu
  - School of Economics and Management, Xiamen University of Technology, Xiamen, China
- Fan Yang
  - Department of Automation, Xiamen University, Xiamen, China
- Jin-Po Chen
  - School of Computer and Information Engineering, Xiamen University of Technology, Xiamen, China
40
Liang L, Liu Y, Kang B, Wang R, Sun MY, Wu Q, Meng XF, Lin JP. Large-scale comparison of machine learning algorithms for target prediction of natural products. Brief Bioinform 2022; 23:6675751. [PMID: 36007240 DOI: 10.1093/bib/bbac359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2022] [Revised: 07/26/2022] [Accepted: 07/31/2022] [Indexed: 11/13/2022]
Abstract
Natural products (NPs) and their derivatives are important resources for drug discovery. Many in silico target prediction methods have been reported; however, very few of them distinguish NPs from synthetic molecules. Because NPs and synthetic molecules differ in many characteristics, it is necessary to build target prediction models specific to NPs. Therefore, we collected the activity data of NPs and their derivatives from public databases and constructed four datasets: the NP dataset, the dataset of NPs and their first-class derivatives, the dataset of NPs and all their derivatives, and the ChEMBL26 compounds dataset. Conditions, including activity thresholds and input features, were explored to assess the performance of eight machine learning methods for target prediction of NPs: support vector machines (SVM), extreme gradient boosting, random forests, K-nearest neighbor, naive Bayes, feedforward neural networks (FNN), convolutional neural networks and recurrent neural networks. As a result, the dataset of NPs and all their derivatives was selected to build the best NP-specific models. Furthermore, consensus models and voting models were applied to improve the prediction performance. Further evaluations on the external validation set demonstrated that (1) the NP-specific model performed better on the target prediction of NPs than the traditional models trained on all ChEMBL26 compounds, and (2) the consensus model of FNN + SVM possessed the best overall performance, and the voting model can significantly improve recall and specificity.
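The abstract above mentions consensus and voting models but does not define them. As a hedged illustration of two common ways to combine classifiers, and not the authors' procedure, the sketch below shows hard majority voting over binary predictions and soft combination by averaging predicted probabilities. The function names `majority_vote` and `average_probability` and the tie-breaking rule are assumptions made for this example.

```python
def majority_vote(predictions):
    """Hard voting over binary labels.

    predictions: list of per-model lists of 0/1 labels, all the same length.
    Assumption for illustration: a tie (possible with an even number of
    models) is resolved as the negative class.
    """
    n_models = len(predictions)
    n_samples = len(predictions[0])
    voted = []
    for j in range(n_samples):
        positives = sum(model[j] for model in predictions)
        voted.append(1 if 2 * positives > n_models else 0)
    return voted


def average_probability(probabilities, threshold=0.5):
    """Soft combination: average per-model probabilities, then threshold.

    probabilities: list of per-model lists of probabilities in [0, 1].
    """
    n_models = len(probabilities)
    n_samples = len(probabilities[0])
    combined = []
    for j in range(n_samples):
        mean = sum(model[j] for model in probabilities) / n_models
        combined.append(1 if mean >= threshold else 0)
    return combined
```

Hard voting only needs each model's final label, whereas averaging uses the confidence of each model, so two moderately confident models can outweigh one that is barely above the threshold.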
Affiliation(s)
- Lu Liang
  - State Key Laboratory of Medicinal Chemical Biology, College of Pharmacy and Tianjin Key Laboratory of Molecular Drug Research, Nankai University, Haihe Education Park, 38 Tongyan Road, Tianjin 300353, China
- Ye Liu
  - State Key Laboratory of Medicinal Chemical Biology, College of Pharmacy and Tianjin Key Laboratory of Molecular Drug Research, Nankai University, Haihe Education Park, 38 Tongyan Road, Tianjin 300353, China
- Bo Kang
  - National Supercomputer Center in Tianjin, 10 Xinhuanxi Road, Tianjin Binhai New Area, Tianjin 300457, China
- Ru Wang
  - State Key Laboratory of Medicinal Chemical Biology, College of Pharmacy and Tianjin Key Laboratory of Molecular Drug Research, Nankai University, Haihe Education Park, 38 Tongyan Road, Tianjin 300353, China
- Meng-Yu Sun
  - State Key Laboratory of Medicinal Chemical Biology, College of Pharmacy and Tianjin Key Laboratory of Molecular Drug Research, Nankai University, Haihe Education Park, 38 Tongyan Road, Tianjin 300353, China
- Qi Wu
  - National Supercomputer Center in Tianjin, 10 Xinhuanxi Road, Tianjin Binhai New Area, Tianjin 300457, China
- Xiang-Fei Meng
  - National Supercomputer Center in Tianjin, 10 Xinhuanxi Road, Tianjin Binhai New Area, Tianjin 300457, China
- Jian-Ping Lin
  - State Key Laboratory of Medicinal Chemical Biology, College of Pharmacy and Tianjin Key Laboratory of Molecular Drug Research, Nankai University, Haihe Education Park, 38 Tongyan Road, Tianjin 300353, China
  - Biodesign Center, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, 32 West 7th Avenue, Tianjin Airport Economic Area, Tianjin 300308, China
  - Platform of Pharmaceutical Intelligence, Tianjin International Joint Academy of Biomedicine, Tianjin 300457, China
41
A Systematic Review of Clinical Validated and Potential miRNA Markers Related to the Efficacy of Fluoropyrimidine Drugs. DISEASE MARKERS 2022; 2022:1360954. [PMID: 36051356 PMCID: PMC9427288 DOI: 10.1155/2022/1360954] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2022] [Revised: 07/15/2022] [Accepted: 07/29/2022] [Indexed: 12/24/2022]
Abstract
Colorectal cancer (CRC) is becoming increasingly prevalent worldwide. Fluoropyrimidine drugs are the primary chemotherapy regimens in routine clinical practice for CRC. However, the survival rate of patients on fluoropyrimidine-based chemotherapy varies significantly among individuals. Biomarkers of fluoropyrimidine drugs' efficacy are needed to implement personalized medicine. This review summarizes microRNAs (miRNAs) related to fluoropyrimidine drugs, either because they affect metabolic enzymes or because they are associated with drug efficacy. We first outlined 42 miRNAs that may affect the metabolism of fluoropyrimidine drugs. Subsequently, we filtered another 41 miRNAs related to the efficacy of fluoropyrimidine drugs based on clinical trials. Bioinformatics analysis showed that most well-established miRNA biomarkers were significantly enriched in cancer pathways rather than in fluoropyrimidine drug metabolism pathways. The results also suggest that the miRNAs screened from patients with metastasis have a more critical role in cancer development than those from patients without metastasis. Five miRNAs are shared between these two lists. miR-21, miR-215, and miR-218 can suppress fluoropyrimidine drugs' catabolism. miR-326 and miR-328 can reduce the efflux of fluoropyrimidine drugs. These five miRNAs could jointly act by increasing intracellular levels of fluoropyrimidine drugs' cytotoxic metabolites, leading to better chemotherapy responses. In conclusion, we demonstrated that dynamic changes in transcriptional regulation via miRNAs might play significant roles in the efficacy and toxicity of fluoropyrimidine drugs. The reported miRNA biomarkers would help evaluate the efficacy of fluoropyrimidine-based chemotherapy and improve the prognosis of colorectal cancer patients.
42
Zhou H, Zhang N. miR-212-5p inhibits nasopharyngeal carcinoma progression by targeting METTL3. Open Med (Wars) 2022; 17:1241-1251. [PMID: 35892080 PMCID: PMC9281587 DOI: 10.1515/med-2022-0515] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/16/2021] [Revised: 05/30/2022] [Accepted: 06/03/2022] [Indexed: 11/15/2022]
Abstract
This study was conducted to investigate the effect of microRNA-212-5p (miR-212-5p) on the proliferation and apoptosis of nasopharyngeal carcinoma (NPC) cells. Microarray datasets (EXP00394 and EXP00660) were downloaded from the dbDEMC database, and the differentially expressed microRNAs between high-grade and low-grade NPC were analyzed. miR-212-5p and methyltransferase like 3 (METTL3) expression levels in NPC tissues and cells were determined by the quantitative real-time polymerase chain reaction and Western blot. Besides, the relationship between miR-212-5p expression and clinicopathological characteristics of patients was analyzed by the Chi-square test. Cell counting kit-8 assay, 5-ethynyl-2-deoxyuridine (EdU) assay, and flow cytometry were adopted to detect the effect of miR-212-5p on the cell proliferation and apoptosis. Kyoto Encyclopedia of Genes and Genomes and Gene Ontology analysis were performed to explore the potential biological functions and the signal pathways related to the target genes of miR-212-5p. Bioinformatics prediction and dual luciferase reporter gene assay were used to verify the relationship between miR-212-5p and METTL3 3' untranslated region. Besides, western blot was adopted to detect the expression of METTL3. Gene set enrichment analysis was performed to analyze the downstream pathways in which METTL3 was enriched. It was found that miR-212-5p was downregulated in NPC tissues, and the low miR-212-5p expression was associated with lymph node metastasis and poor differentiation. miR-212-5p overexpression inhibited the growth and promoted apoptosis of NPC cells; miR-212-5p inhibition functioned oppositely. Mechanistically, miR-212-5p inhibited the proliferation and promoted apoptosis of NPC cells via suppressing METTL3 expression. miR-212-5p/METTL3 was associated with processes of RNA transport and cell cycle. In conclusion, miR-212-5p inhibits the progression of NPC by targeting METTL3.
Affiliation(s)
- Hongyu Zhou
  - Department of Otorhinolaryngology Head and Neck Surgery, Wuhan Fourth Hospital, Wuhan 430033, Hubei, China
- Nana Zhang
  - Department of Otorhinolaryngology Head and Neck Surgery, Wuhan Fourth Hospital, Wuhan 430033, Hubei, China
43
DRDB: A Machine Learning Platform to Predict Chemical-Protein Interactions towards Diabetic Retinopathy. OXIDATIVE MEDICINE AND CELLULAR LONGEVITY 2022; 2022:1718353. [PMID: 35910835 PMCID: PMC9329024 DOI: 10.1155/2022/1718353] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/10/2022] [Revised: 06/17/2022] [Accepted: 06/22/2022] [Indexed: 11/17/2022]
Abstract
Diabetic retinopathy (DR), a microangiopathy caused by diabetes, affects approximately 93 million people worldwide. However, the drugs used to treat DR have limited efficacy and a variety of side effects. This is possibly because the complicated pathogenesis of DR is associated with multiple proteins. In this work, we attempted to identify potential drugs against DR-associated proteins and predict potential targets for drugs using in silico prediction of chemical-protein interactions (CPI) based on the multitarget quantitative structure-activity relationship (mt-QSAR) method. We developed 128 binary classifiers to predict the CPI for 15 DR targets using random forest (RF), k-nearest neighbours (KNN), support vector machine (SVM), and neural network (NN) algorithms with MACCS fingerprints, extended connectivity fingerprints (ECFP6), and protein descriptors. To facilitate the discovery of novel drugs and target identification using the 128 binary classifiers, a free web server (DRDB) was developed. Compound Danshen Dripping Pills (CDDP), composed of Salvia miltiorrhiza, Panax notoginseng, and borneol, is commonly used in the treatment of cardiovascular diseases. To explore the applicability of DRDB, the potential CPIs of CDDP in the treatment of DR were investigated based on DRDB. In vitro experimental validation demonstrated that cryptotanshinone and protocatechuic acid, two key components of CDDP, are capable of targeting ICAM-1, which is one of the key targets of DR. We hope that this work can facilitate the development of more effective clinical strategies for the treatment of DR.
44
Aldahdooh J, Vähä-Koskela M, Tang J, Tanoli Z. Using BERT to identify drug-target interactions from whole PubMed. BMC Bioinformatics 2022; 23:245. [PMID: 35729494 PMCID: PMC9214985 DOI: 10.1186/s12859-022-04768-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2021] [Accepted: 06/03/2022] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Drug-target interactions (DTIs) are critical for drug repurposing and elucidation of drug mechanisms, and are manually curated by large databases, such as ChEMBL, BindingDB, DrugBank and DrugTargetCommons. However, the number of curated articles likely constitutes only a fraction of all the articles that contain experimentally determined DTIs. Finding such articles and extracting the experimental information is a challenging task, and there is a pressing need for systematic approaches to assist the curation of DTIs. To this end, we applied Bidirectional Encoder Representations from Transformers (BERT) to identify such articles. Because DTI data depend closely on the type of assay used to generate them, we also aimed to incorporate functions to predict the assay format.
RESULTS: Our novel method identified 0.6 million articles (along with drug and protein information) that were not previously included in public DTI databases. Using 10-fold cross-validation, we obtained ~99% accuracy for identifying articles containing quantitative drug-target profiles. The micro-averaged F1 score for the prediction of assay format is 88%, which leaves room for improvement in future studies.
CONCLUSION: The BERT model in this study is robust and the proposed pipeline can be used to identify previously overlooked articles containing quantitative DTIs. Overall, our method provides a significant advancement in machine-assisted DTI extraction and curation. We expect it to be a useful addition to drug mechanism discovery and repurposing.
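The abstract above reports a micro-averaged F1 score for assay-format prediction. For readers unfamiliar with the metric, the following minimal sketch pools true positives, false positives and false negatives over all classes before computing F1. It is a generic illustration rather than the authors' evaluation code, and the function name `micro_f1` and the class labels in the usage note are hypothetical.

```python
def micro_f1(true_labels, predicted_labels):
    """Micro-averaged F1 for single-label multi-class predictions.

    Counts are pooled over all classes: F1 = 2*TP / (2*TP + FP + FN).
    """
    classes = set(true_labels) | set(predicted_labels)
    tp = fp = fn = 0
    for c in classes:
        for t, p in zip(true_labels, predicted_labels):
            if p == c and t == c:
                tp += 1          # predicted c and it was c
            elif p == c and t != c:
                fp += 1          # predicted c but it was another class
            elif p != c and t == c:
                fn += 1          # missed an instance of c
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0
```

For example, with true labels `["A", "A", "B", "C"]` and predictions `["A", "B", "B", "C"]`, three of four predictions are correct and `micro_f1` returns 0.75; when every sample has exactly one label, the micro-averaged F1 coincides with plain accuracy.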
Affiliation(s)
- Jehad Aldahdooh
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
  - Doctoral Programme in Computer Science, University of Helsinki, Helsinki, Finland
- Markus Vähä-Koskela
  - Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
- Jing Tang
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Ziaurrehman Tanoli
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
  - BioICAWtech, Helsinki, Finland
45
Peng L, Yang C, Huang L, Chen X, Fu X, Liu W. RNMFLP: Predicting circRNA-disease associations based on robust nonnegative matrix factorization and label propagation. Brief Bioinform 2022; 23:6582881. [PMID: 35534179 DOI: 10.1093/bib/bbac155] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 01/21/2022] [Revised: 03/09/2022] [Accepted: 04/06/2022] [Indexed: 12/22/2022]
Abstract
Circular RNAs (circRNAs) are a class of structurally stable endogenous noncoding RNA molecules. Increasing studies indicate that circRNAs play vital roles in human diseases. However, validating disease-related circRNAs in vivo is costly and time-consuming. A reliable and effective computational method to identify circRNA-disease associations deserves further studies. In this study, we propose a computational method called RNMFLP that combines robust nonnegative matrix factorization (RNMF) and label propagation algorithm (LP) to predict circRNA-disease associations. First, to reduce the impact of false negative data, the original circRNA-disease adjacency matrix is updated by matrix multiplication using the integrated circRNA similarity and the disease similarity information. Subsequently, the RNMF algorithm is used to obtain the restricted latent space to capture potential circRNA-disease pairs from the association matrix. Finally, the LP algorithm is utilized to predict more accurate circRNA-disease associations from the integrated circRNA similarity network and integrated disease similarity network, respectively. Fivefold cross-validation of four datasets shows that RNMFLP is superior to the state-of-the-art methods. In addition, case studies on lung cancer, hepatocellular carcinoma and colorectal cancer further demonstrate the reliability of our method to discover disease-related circRNAs.
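The abstract above names label propagation over similarity networks as the final prediction step without giving the update rule. A minimal sketch of one standard formulation is shown here, assuming the iterative update F ← αSF + (1 − α)Y with a row-normalized similarity matrix S and initial label vector Y. This is a generic textbook variant, not necessarily the exact scheme used in RNMFLP, and the names `row_normalize` and `label_propagation` as well as the default `alpha` are assumptions.

```python
def row_normalize(matrix):
    """Scale each row to sum to 1 (rows that sum to 0 are left as zeros)."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


def label_propagation(similarity, initial, alpha=0.5, max_iter=1000, tol=1e-9):
    """Iterate F <- alpha * S @ F + (1 - alpha) * Y until the scores stabilize.

    similarity: square list-of-lists of non-negative similarities.
    initial:    list of known association labels Y (e.g. 0/1), one per node.
    alpha:      weight on information from neighbours, with 0 < alpha < 1.
    """
    s = row_normalize(similarity)
    n = len(initial)
    scores = list(initial)
    for _ in range(max_iter):
        updated = [
            alpha * sum(s[i][j] * scores[j] for j in range(n))
            + (1 - alpha) * initial[i]
            for i in range(n)
        ]
        if max(abs(u - v) for u, v in zip(updated, scores)) < tol:
            return updated
        scores = updated
    return scores
```

On a three-node chain where only the first node carries a known label, the propagated scores decrease with distance from that node, which is the behaviour that lets unlabelled but similar entities receive non-zero association scores.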
Affiliation(s)
- Li Peng
- School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, Hunan, China.,Hunan Key Laboratory for Service computing and Novel Software Technology
| | - Cheng Yang
- School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, Hunan, China
| | - Li Huang
- Academy of Arts and Design, Tsinghua University, 10084, Beijing, China.,The Future Laboratory, Tsinghua University, 10084, Beijing, China
| | - Xiang Chen
- School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, Hunan, China
| | - Xiangzheng Fu
- College of Information Science and Engineering, Hunan University, Changsha, 410082, Hunan, China
| | - Wei Liu
- College of Information Engineering, Xiangtan University, Xiangtan, 411105, Hunan, China
| |
|
46
|
Deng L, Liu Z, Qian Y, Zhang J. Predicting circRNA-drug sensitivity associations via graph attention auto-encoder. BMC Bioinformatics 2022; 23:160. [PMID: 35508967 PMCID: PMC9066932 DOI: 10.1186/s12859-022-04694-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/24/2022] [Accepted: 04/20/2022] [Indexed: 11/18/2022] Open
Abstract
Background: Circular RNAs (circRNAs) play essential roles in cancer development and therapy resistance. Many studies have shown that circRNA is closely related to human health. The expression of circRNAs also affects the sensitivity of cells to drugs, thereby significantly affecting the efficacy of drugs. However, validating drug-related circRNAs through traditional biological experiments is time-consuming and expensive. Therefore, it is an important and urgent task to develop an effective computational method for predicting unknown circRNA-drug associations. Results: In this work, we propose a computational framework (GATECDA) based on a graph attention auto-encoder to predict circRNA-drug sensitivity associations. In GATECDA, we leverage multiple databases, containing the sequences of host genes of circRNAs, the structure of drugs, and circRNA-drug sensitivity associations. Based on the data, GATECDA employs a graph attention auto-encoder (GATE) to extract the low-dimensional representation of circRNA/drug, effectively retaining critical information in sparse high-dimensional features and realizing the effective fusion of nodes’ neighborhood information. Experimental results indicate that GATECDA achieves an average AUC of 89.18% under 10-fold cross-validation. Case studies further show the excellent performance of GATECDA. Conclusions: Many experimental results and case studies show that our proposed GATECDA method can effectively predict the circRNA-drug sensitivity associations.
Affiliation(s)
- Lei Deng
- School of Software, Xinjiang University, Urumqi, China.,School of Computer Science and Engineering, Central South University, Changsha, China
| | - Zixuan Liu
- School of Software, Xinjiang University, Urumqi, China
| | - Yurong Qian
- School of Software, Xinjiang University, Urumqi, China
| | - Jingpu Zhang
- School of Computer and Data Science, Henan University of Urban Construction, Pingdingshan, China.
| |
|
47
|
DTIP-TC2A: An analytical framework for drug-target interactions prediction methods. Comput Biol Chem 2022; 99:107707. [DOI: 10.1016/j.compbiolchem.2022.107707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2021] [Revised: 05/01/2022] [Accepted: 05/26/2022] [Indexed: 11/18/2022]
|
48
|
Sikander R, Ghulam A, Ali F. XGB-DrugPred: computational prediction of druggable proteins using eXtreme gradient boosting and optimized features set. Sci Rep 2022; 12:5505. [PMID: 35365726 PMCID: PMC8976041 DOI: 10.1038/s41598-022-09484-3] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 10/17/2021] [Accepted: 03/07/2022] [Indexed: 11/19/2022] Open
Abstract
Accurate identification of drug targets in the human body has great significance for designing novel drugs. Compared with traditional experimental methods, prediction of drug targets via machine learning algorithms has attracted the attention of many researchers because it is fast and accurate. In this study, we propose a machine learning-based method, namely XGB-DrugPred, for accurate prediction of druggable proteins. The features from primary protein sequences are extracted by group dipeptide composition, reduced amino acid alphabet, and novel encoder pseudo amino acid composition segmentation. To select the best feature set, eXtreme Gradient Boosting-recursive feature elimination is implemented. The best feature set is provided to eXtreme Gradient Boosting (XGB), Random Forest, and Extremely Randomized Tree classifiers for model training and prediction. The performance of these classifiers is evaluated by tenfold cross-validation. The empirical results show that the XGB-based predictor achieves the best results compared with other classifiers and existing methods in the literature.
Affiliation(s)
- Rahu Sikander
- School of Computer Science and Technology, Xidian University, Xi'an, 710071, China.
| | - Ali Ghulam
- Computerization and Network Section, Sindh Agriculture University, Tandojam, Pakistan
| | - Farman Ali
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
| |
|
49
|
Du BX, Qin Y, Jiang YF, Xu Y, Yiu SM, Yu H, Shi JY. Compound–protein interaction prediction by deep learning: Databases, descriptors and models. Drug Discov Today 2022; 27:1350-1366. [DOI: 10.1016/j.drudis.2022.02.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2021] [Revised: 11/19/2021] [Accepted: 02/28/2022] [Indexed: 11/24/2022]
|
50
|
Zhang ZW, Gao Z, Zheng CH, Li L, Qi SM, Wang YT. WVMDA: Predicting miRNA-Disease Association Based on Weighted Voting. Front Genet 2021; 12:742992. [PMID: 34659363 PMCID: PMC8511643 DOI: 10.3389/fgene.2021.742992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2021] [Accepted: 09/09/2021] [Indexed: 11/15/2022] Open
Abstract
An increasing number of experiments have verified that miRNA expression is related to human diseases. The miRNA expression profile may be an indicator for clinical diagnosis and provides a new direction for the prevention and treatment of complex diseases. In this work, we present a weighted voting-based model for predicting miRNA–disease association (WVMDA). To reasonably build a network of similarity, we established credibility similarity based on the reliability of known associations and used it to improve the original incomplete similarity. To eliminate noise interference as much as possible while maintaining more reliable similarity information, we developed a filter. More importantly, to ensure the fairness and efficiency of weighted voting, we focused on the design of the weighting. Finally, cross-validation experiments and case studies were undertaken to verify the efficacy of the proposed model. The results showed that WVMDA could efficiently identify miRNAs associated with the disease.
Affiliation(s)
- Zhen-Wei Zhang
- School of Cyberspace Security, Qufu Normal University, Qufu, China
| | - Zhen Gao
- School of Computer Science and Technology, Anhui University, Hefei, China
| | - Chun-Hou Zheng
- School of Cyberspace Security, Qufu Normal University, Qufu, China.,School of Computer Science and Technology, Anhui University, Hefei, China
| | - Lei Li
- School of Cyberspace Security, Qufu Normal University, Qufu, China
| | - Su-Min Qi
- School of Cyberspace Security, Qufu Normal University, Qufu, China
| | - Yu-Tian Wang
- School of Cyberspace Security, Qufu Normal University, Qufu, China
| |
|