Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Zhang L, Wang CC, Chen X. Predicting drug-target binding affinity through molecule representation block based on multi-head attention and skip connection. Brief Bioinform 2022;23:6782838. [PMID: 36411674 DOI: 10.1093/bib/bbac468] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 08/04/2022] [Revised: 09/13/2022] [Accepted: 09/29/2022] [Indexed: 11/22/2022] Open



Cited by Other Article(s)

Chen J, Zhu Y, Yuan Q. Predicting potential microbe-disease associations based on dual branch graph convolutional network. J Cell Mol Med 2024;28:e18571. [PMID: 39086148 PMCID: PMC11291560 DOI: 10.1111/jcmm.18571] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2024] [Revised: 06/15/2024] [Accepted: 06/27/2024] [Indexed: 08/02/2024] Open

Abstract

Studying the association between microbes and diseases not only aids in the prevention and diagnosis of diseases, but also provides crucial theoretical support for new drug development and personalized treatment. Due to the time-consuming and costly nature of laboratory-based biological tests to confirm the relationship between microbes and diseases, there is an urgent need for innovative computational frameworks to anticipate new associations between microbes and diseases. Here, we propose a novel computational approach based on a dual branch graph convolutional network (GCN) module, abbreviated as DBGCNMDA, for identifying microbe-disease associations. First, DBGCNMDA calculates the similarity matrix of diseases and microbes by integrating functional similarity and Gaussian association spectrum kernel (GAPK) similarity. Then, semantic information from different biological networks is extracted by two GCN modules from different perspectives. Finally, the scores of microbe-disease associations are predicted based on the extracted features. The main innovation of this method lies in the use of two types of information for microbe/disease similarity assessment. Additionally, we extend the disease nodes to address the issue of insufficient features due to low data dimensionality. We optimize the connectivity between the homogeneous entities using random walk with restart (RWR), and then use the optimized similarity matrix as the initial feature matrix. In terms of network understanding, we design a dual branch GCN module, namely GlobalGCN and LocalGCN, to fine-tune node representations by introducing side information, including homologous neighbour nodes. We evaluate the accuracy of the DBGCNMDA model using the five-fold cross-validation (5-fold-CV) technique. The results show that the area under the receiver operating characteristic curve (AUC) and area under the precision versus recall curve (AUPR) of the DBGCNMDA model in the 5-fold-CV are 0.9559 and 0.9630, respectively. The results from the case studies using published experimental data confirm a significant number of predicted associations, indicating that DBGCNMDA is an effective tool for predicting potential microbe-disease associations.
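The abstract above mentions refining a similarity matrix with random walk with restart (RWR) before using it as a feature matrix. The following is a minimal sketch of generic RWR on a small similarity matrix, not the authors' implementation. The function names, the restart probability of 0.5, and the convergence tolerance are assumptions chosen for illustration.

```python
def row_normalize(matrix):
    """Scale each row to sum to 1 so it can act as a transition matrix."""
    normalized = []
    for row in matrix:
        total = sum(row)
        # Leave all-zero rows as zeros to avoid division by zero.
        normalized.append([v / total if total > 0 else 0.0 for v in row])
    return normalized


def random_walk_with_restart(similarity, restart_prob=0.5, tol=1e-10, max_iter=1000):
    """Return a matrix whose row i is the RWR distribution seeded at node i.

    restart_prob, tol and max_iter are illustrative defaults (assumptions).
    """
    n = len(similarity)
    transition = row_normalize(similarity)
    result = []
    for seed in range(n):
        restart = [1.0 if j == seed else 0.0 for j in range(n)]
        p = restart[:]
        for _ in range(max_iter):
            # Follow an edge with probability (1 - r), jump back to the seed with probability r.
            walked = [sum(p[k] * transition[k][j] for k in range(n)) for j in range(n)]
            new_p = [(1 - restart_prob) * walked[j] + restart_prob * restart[j] for j in range(n)]
            delta = sum(abs(a - b) for a, b in zip(new_p, p))
            p = new_p
            if delta < tol:
                break
        result.append(p)
    return result
```

Each output row is a probability distribution over nodes, so nodes that are strongly connected to the seed (directly or through intermediaries) receive more mass than weakly connected ones.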


Zhao K, Zhao P, Wang S, Xia Y, Zhang G. FoldPAthreader: predicting protein folding pathway using a novel folding force field model derived from known protein universe. Genome Biol 2024;25:152. [PMID: 38862984 PMCID: PMC11167914 DOI: 10.1186/s13059-024-03291-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Accepted: 05/29/2024] [Indexed: 06/13/2024] Open

Kalemati M, Zamani Emani M, Koohi S. DCGAN-DTA: Predicting drug-target binding affinity with deep convolutional generative adversarial networks. BMC Genomics 2024;25:411. [PMID: 38724911 PMCID: PMC11080241 DOI: 10.1186/s12864-024-10326-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Accepted: 04/19/2024] [Indexed: 05/13/2024] Open

Abstract

BACKGROUND

In recent years, there has been a growing interest in utilizing computational approaches to predict drug-target binding affinity, aiming to expedite the early drug discovery process. To address the limitations of experimental methods, such as cost and time, several machine learning-based techniques have been developed. However, these methods encounter certain challenges, including the limited availability of training data, reliance on human intervention for feature selection and engineering, and a lack of validation approaches for robust evaluation in real-life applications.

RESULTS

To mitigate these limitations, in this study, we propose a method for drug-target binding affinity prediction based on deep convolutional generative adversarial networks. Additionally, we conducted a series of validation experiments and implemented adversarial control experiments using straw models. These experiments serve to demonstrate the robustness and efficacy of our predictive models. We conducted a comprehensive evaluation of our method by comparing it to baselines and state-of-the-art methods. Two recently updated datasets, namely the BindingDB and PDBBind, were used for this purpose. Our findings indicate that our method outperforms the alternative methods in terms of three performance measures when using warm-start data splitting settings. Moreover, when considering physicochemical-based cold-start data splitting settings, our method demonstrates superior predictive performance, particularly in terms of the concordance index.
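The concordance index cited above measures how well predicted affinities preserve the ranking of measured affinities. The sketch below follows the definition commonly used in the binding-affinity literature. The abstract does not give the authors' exact implementation, so the function name and the half-credit handling of tied predictions are assumptions.

```python
def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the measured order."""
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # pairs with equal measured affinity are not comparable
            comparable += 1
            true_diff = y_true[i] - y_true[j]
            pred_diff = y_pred[i] - y_pred[j]
            if pred_diff == 0:
                concordant += 0.5  # tied predictions earn half credit (assumed convention)
            elif (true_diff > 0) == (pred_diff > 0):
                concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to random ordering, and 0.0 means every pair is inverted.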

CONCLUSION

The results of our study affirm the practical value of our method and its superiority over alternative approaches in predicting drug-target binding affinity across multiple validation sets. This highlights the potential of our approach in accelerating drug repurposing efforts, facilitating novel drug discovery, and ultimately enhancing disease treatment. The data and source code for this study were deposited in the GitHub repository, https://github.com/mojtabaze7/DCGAN-DTA . Furthermore, the web server for our method is accessible at https://dcgan.shinyapps.io/bindingaffinity/ .


Svensson E, Hoedt PJ, Hochreiter S, Klambauer G. HyperPCM: Robust Task-Conditioned Modeling of Drug-Target Interactions. J Chem Inf Model 2024;64:2539-2553. [PMID: 38185877 PMCID: PMC11005051 DOI: 10.1021/acs.jcim.3c01417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2023] [Revised: 11/27/2023] [Accepted: 11/27/2023] [Indexed: 01/09/2024]

Zeng X, Li SJ, Lv SQ, Wen ML, Li Y. A comprehensive review of the recent advances on predicting drug-target affinity based on deep learning. Front Pharmacol 2024;15:1375522. [PMID: 38628639 PMCID: PMC11019008 DOI: 10.3389/fphar.2024.1375522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Accepted: 03/21/2024] [Indexed: 04/19/2024] Open

Peng L, Yang Y, Yang C, Li Z, Cheong N. HRGCNLDA: Forecasting of lncRNA-disease association based on hierarchical refinement graph convolutional neural network. Math Biosci Eng 2024;21:4814-4834. [PMID: 38872515 DOI: 10.3934/mbe.2024212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2024]

Qi H, Yu T, Yu W, Liu C. Drug-target affinity prediction with extended graph learning-convolutional networks. BMC Bioinformatics 2024;25:75. [PMID: 38365583 PMCID: PMC10874073 DOI: 10.1186/s12859-024-05698-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Accepted: 02/12/2024] [Indexed: 02/18/2024] Open

Abstract

BACKGROUND

High-performance computing plays a pivotal role in computer-aided drug design, a field that holds significant promise in pharmaceutical research. The prediction of drug-target affinity (DTA) is a crucial stage in this process, potentially accelerating drug development through rapid and extensive preliminary compound screening, while also minimizing resource utilization and costs. Recently, the incorporation of deep learning into DTA prediction and the enhancement of its accuracy have emerged as key areas of interest in the research community. Drugs and targets can be characterized through various methods, including structure-based, sequence-based, and graph-based representations. Despite the progress in structure and sequence-based techniques, they tend to provide limited feature information. Conversely, graph-based approaches have risen to prominence, attracting considerable attention for their comprehensive data representation capabilities. Recent studies have focused on constructing protein and drug molecular graphs using sequences and SMILES, subsequently deriving representations through graph neural networks. However, these graph-based approaches are limited by the use of a fixed adjacency matrix of protein and drug molecular graphs for graph convolution. This limitation restricts the learning of comprehensive feature representations from intricate compound and protein structures, consequently impeding the full potential of graph-based feature representation in DTA prediction. This, in turn, significantly impacts the models' generalization capabilities in the complex realm of drug discovery.

RESULTS

To tackle these challenges, we introduce GLCN-DTA, a model specifically designed for proficiency in DTA tasks. GLCN-DTA innovatively integrates a graph learning module into the existing graph architecture. This module is designed to learn a soft adjacency matrix, which effectively and efficiently refines the contextual structure of protein and drug molecular graphs. This advancement allows for learning richer structural information from protein and drug molecular graphs via graph convolution, specifically tailored for DTA tasks, compared to the conventional fixed adjacency matrix approach. A series of experiments have been conducted to validate the efficacy of the proposed GLCN-DTA method across diverse scenarios. The results demonstrate that GLCN-DTA possesses advantages in terms of robustness and high accuracy.
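To illustrate what a soft adjacency matrix can look like in contrast to a fixed 0/1 one, here is a minimal sketch. It is not the GLCN-DTA graph learning module, whose details the abstract does not give. It assumes a generic formulation in which each pair of nodes is scored by a negative weighted feature distance and each row is passed through a softmax; the function name and the `weights` vector (which would be learned in a real model) are hypothetical.

```python
import math


def soft_adjacency(features, weights):
    """Row-stochastic soft adjacency computed from node features.

    score(i, j) = -sum_k weights[k] * |x_i[k] - x_j[k]|, followed by a softmax over j.
    Both the scoring rule and the weights are illustrative assumptions.
    """
    n = len(features)
    adjacency = []
    for i in range(n):
        scores = [
            -sum(w * abs(a - b) for w, a, b in zip(weights, features[i], features[j]))
            for j in range(n)
        ]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]  # subtract the max for numerical stability
        total = sum(exps)
        adjacency.append([e / total for e in exps])
    return adjacency
```

Every entry is a continuous weight in (0, 1] and each row sums to 1, so a graph convolution using this matrix can weigh neighbours by feature similarity rather than treating all bonded or contacting nodes equally.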

CONCLUSIONS

The proposed GLCN-DTA model enhances DTA prediction performance by introducing a novel framework that synergizes graph learning operations with graph convolution operations, thereby achieving richer representations. GLCN-DTA does not distinguish between different protein classifications, including structurally ordered and intrinsically disordered proteins, focusing instead on improving feature representation. Therefore, it may be more effective in scenarios involving structurally ordered proteins and more limited in contexts involving intrinsically disordered proteins.


Dehghan A, Abbasi K, Razzaghi P, Banadkuki H, Gharaghani S. CCL-DTI: contributing the contrastive loss in drug-target interaction prediction. BMC Bioinformatics 2024;25:48. [PMID: 38291364 PMCID: PMC11264960 DOI: 10.1186/s12859-024-05671-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Accepted: 01/22/2024] [Indexed: 02/01/2024] Open

Xu F, Hu H, Lin H, Lu J, Cheng F, Zhang J, Li X, Shuai J. scGIR: deciphering cellular heterogeneity via gene ranking in single-cell weighted gene correlation networks. Brief Bioinform 2024;25:bbae091. [PMID: 38487851 PMCID: PMC10940817 DOI: 10.1093/bib/bbae091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Revised: 02/08/2024] [Accepted: 02/15/2024] [Indexed: 03/18/2024] Open

Peng CX, Liang F, Xia YH, Zhao KL, Hou MH, Zhang GJ. Recent Advances and Challenges in Protein Structure Prediction. J Chem Inf Model 2024;64:76-95. [PMID: 38109487 DOI: 10.1021/acs.jcim.3c01324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/20/2023]

Zeng X, Zhong KY, Jiang B, Li Y. Fusing Sequence and Structural Knowledge by Heterogeneous Models to Accurately and Interpretively Predict Drug-Target Affinity. Molecules 2023;28:8005. [PMID: 38138496 PMCID: PMC10745601 DOI: 10.3390/molecules28248005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2023] [Revised: 12/06/2023] [Accepted: 12/06/2023] [Indexed: 12/24/2023] Open

Li H, Wang S, Zheng W, Yu L. Multi-dimensional search for drug-target interaction prediction by preserving the consistency of attention distribution. Comput Biol Chem 2023;107:107968. [PMID: 37844375 DOI: 10.1016/j.compbiolchem.2023.107968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Revised: 09/27/2023] [Accepted: 10/05/2023] [Indexed: 10/18/2023]

Zhang Y, Liu C, Liu M, Liu T, Lin H, Huang CB, Ning L. Attention is all you need: utilizing attention in AI-enabled drug discovery. Brief Bioinform 2023;25:bbad467. [PMID: 38189543 PMCID: PMC10772984 DOI: 10.1093/bib/bbad467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2023] [Revised: 11/03/2023] [Accepted: 11/25/2023] [Indexed: 01/09/2024] Open

Zhang L, Wang CC, Zhang Y, Chen X. GPCNDTA: Prediction of drug-target binding affinity through cross-attention networks augmented with graph features and pharmacophores. Comput Biol Med 2023;166:107512. [PMID: 37788507 DOI: 10.1016/j.compbiomed.2023.107512] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 08/08/2023] [Revised: 08/28/2023] [Accepted: 09/19/2023] [Indexed: 10/05/2023]

Abstract

Drug-target affinity prediction is a challenging task in drug discovery. The latest computational models have limitations in mining edge information in molecule graphs, accessing knowledge in pharmacophores, integrating multimodal data of the same biomolecule and realizing effective interactions between two different biomolecules. To solve these problems, we proposed a method called Graph features and Pharmacophores augmented Cross-attention Networks based Drug-Target binding Affinity prediction (GPCNDTA). First, we utilized the GNN module, the linear projection unit and self-attention layer to correspondingly extract features of drugs and proteins. Second, we devised intramolecular and intermolecular cross-attention to respectively fuse and interact features of drugs and proteins. Finally, the linear projection unit was applied to gain final features of drugs and proteins, and the Multi-Layer Perceptron was employed to predict drug-target binding affinity. Three major innovations of GPCNDTA are as follows: (i) developing the residual CensNet and the residual EW-GCN to correspondingly extract features of drug and protein graphs, (ii) regarding pharmacophores as a new type of priors to heighten drug-target affinity prediction performance, and (iii) devising intramolecular and intermolecular cross-attention, in which the intramolecular cross-attention realizes the effective fusion of different modal data related to the same biomolecule, and the intermolecular cross-attention fulfills the information interaction between two different biomolecules in attention space. The test results on five benchmark datasets imply that GPCNDTA achieves the best performance compared with state-of-the-art computational models. Besides, relying on ablation experiments, we proved the effectiveness of GNN modules, pharmacophores and two cross-attention strategies in improving the prediction accuracy, stability and reliability of GPCNDTA. In case studies, we applied GPCNDTA to predict binding affinities between 3C-like proteinase and 185 drugs, and observed that most binding affinities predicted by GPCNDTA are close to corresponding experimental measurements.
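The cross-attention idea described above, in which one set of features queries another, can be sketched with plain scaled dot-product attention. This is a generic illustration and not the GPCNDTA architecture: it omits the learned query/key/value projections and multiple heads, and the function names are assumptions.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query row attends over all key/value rows.

    In an intermolecular setting the queries could be drug features and the
    keys/values protein features (an assumed usage, for illustration only).
    """
    d = len(keys[0])
    output = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value rows, one output row per query.
        output.append(
            [sum(w * v[c] for w, v in zip(weights, values)) for c in range(len(values[0]))]
        )
    return output
```

The output has one row per query and one column per value dimension, so drug features come out re-expressed as mixtures of the protein features they align with most strongly.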


Zhu HT, Xia YH, Zhang GJ. E2EDA: Protein Domain Assembly Based on End-to-End Deep Learning. J Chem Inf Model 2023;63:6451-6461. [PMID: 37788318 DOI: 10.1021/acs.jcim.3c01387] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 10/05/2023]

Alghushairy O, Ali F, Alghamdi W, Khalid M, Alsini R, Asiry O. Machine learning-based model for accurate identification of druggable proteins using light extreme gradient boosting. J Biomol Struct Dyn 2023:1-12. [PMID: 37850427 DOI: 10.1080/07391102.2023.2269280] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 06/09/2023] [Accepted: 10/04/2023] [Indexed: 10/19/2023]

Pan J, You Z, You W, Zhao T, Feng C, Zhang X, Ren F, Ma S, Wu F, Wang S, Sun Y. PTBGRP: predicting phage-bacteria interactions with graph representation learning on microbial heterogeneous information network. Brief Bioinform 2023;24:bbad328. [PMID: 37742053 DOI: 10.1093/bib/bbad328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Revised: 08/14/2023] [Accepted: 08/30/2023] [Indexed: 09/25/2023] Open

Affiliation(s)

Jie Pan: Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi'an 710069, China
Zhuhong You: School of Computer Science, Northwestern Polytechnical University, Xi'an 710129, China
Wencai You: Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi'an 710069, China
Tian Zhao: Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi'an 710069, China
Chenlu Feng: Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi'an 710069, China
Xuexia Zhang: North China Pharmaceutical Group, Shijiazhuang 050015, Hebei, China; National Microbial Medicine Engineering & Research Center, Shijiazhuang 050015, Hebei, China
Fengzhi Ren: North China Pharmaceutical Group, Shijiazhuang 050015, Hebei, China; National Microbial Medicine Engineering & Research Center, Shijiazhuang 050015, Hebei, China
Sanxing Ma: Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi'an 710069, China
Fan Wu: Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi'an 710069, China
Shiwei Wang: Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi'an 710069, China
Yanmei Sun: Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi'an 710069, China


Peng L, Tan J, Xiong W, Zhang L, Wang Z, Yuan R, Li Z, Chen X. Deciphering ligand-receptor-mediated intercellular communication based on ensemble deep learning and the joint scoring strategy from single-cell transcriptomic data. Comput Biol Med 2023;163:107137. [PMID: 37364528 DOI: 10.1016/j.compbiomed.2023.107137] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 02/26/2023] [Revised: 05/18/2023] [Accepted: 06/04/2023] [Indexed: 06/28/2023]

Abstract

BACKGROUND

Cell-cell communication in a tumor microenvironment is vital to tumorigenesis, tumor progression and therapy. Intercellular communication inference helps understand molecular mechanisms of tumor growth, progression and metastasis.

METHODS

Focusing on ligand-receptor co-expressions, in this study, we developed an ensemble deep learning framework, CellComNet, to decipher ligand-receptor-mediated cell-cell communication from single-cell transcriptomic data. First, credible ligand-receptor interactions (LRIs) are captured by integrating data arrangement, feature extraction, dimension reduction, and LRI classification based on an ensemble of heterogeneous Newton boosting machine and deep neural network. Next, known and identified LRIs are screened based on single-cell RNA sequencing (scRNA-seq) data in certain tissues. Finally, cell-cell communication is inferred by incorporating scRNA-seq data, the screened LRIs, and a joint scoring strategy that combines expression thresholding and expression product of ligands and receptors.

RESULTS

The proposed CellComNet framework was compared with four competing protein-protein interaction prediction models (PIPR, XGBoost, DNNXGB, and OR-RCNN) and obtained the best AUCs and AUPRs on four LRI datasets, elucidating the optimal LRI classification ability. CellComNet was further applied to analyze intercellular communication in human melanoma and head and neck squamous cell carcinoma (HNSCC) tissues. The results demonstrate that cancer-associated fibroblasts communicate highly with melanoma cells and that endothelial cells communicate strongly with HNSCC cells.

CONCLUSIONS

The proposed CellComNet framework efficiently identified credible LRIs and significantly improved cell-cell communication inference performance. We anticipate that CellComNet can contribute to anticancer drug design and tumor-targeted therapy.


Lv J, Liu G, Ju Y, Huang H, Sun Y. AADB: A Manually Collected Database for Combinations of Antibiotics With Adjuvants. IEEE/ACM Trans Comput Biol Bioinform 2023;20:2827-2836. [PMID: 37279138 DOI: 10.1109/tcbb.2023.3283221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]

Binatlı OC, Gönen M. MOKPE: drug-target interaction prediction via manifold optimization based kernel preserving embedding. BMC Bioinformatics 2023;24:276. [PMID: 37407927 DOI: 10.1186/s12859-023-05401-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/22/2023] [Accepted: 06/25/2023] [Indexed: 07/07/2023] Open

Wang F, Yang H, Wu Y, Peng L, Li X. SAELGMDA: Identifying human microbe-disease associations based on sparse autoencoder and LightGBM. Front Microbiol 2023;14:1207209. [PMID: 37415823 PMCID: PMC10320730 DOI: 10.3389/fmicb.2023.1207209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Accepted: 05/18/2023] [Indexed: 07/08/2023] Open

Abstract

Introduction

Identification of complex associations between diseases and microbes is important to understand the pathogenesis of diseases and design therapeutic strategies. Biomedical experiment-based Microbe-Disease Association (MDA) detection methods are expensive, time-consuming, and laborious.

Methods

Here, we developed a computational method called SAELGMDA for potential MDA prediction. First, microbe similarity and disease similarity are computed by integrating their functional similarity and Gaussian interaction profile kernel similarity. Second, each microbe-disease pair is represented as a feature vector by combining the microbe and disease similarity matrices. Next, the obtained feature vectors are mapped to a low-dimensional space based on a Sparse AutoEncoder. Finally, unknown microbe-disease pairs are classified based on a Light Gradient Boosting Machine (LightGBM).
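The Gaussian interaction profile kernel similarity mentioned in the first step can be sketched as follows. This is the formulation commonly used in association-prediction work, applied to the rows of a binary association matrix; it is not taken from the SAELGMDA code. The function name and the choice of setting the bandwidth numerator to 1 are assumptions.

```python
import math


def gip_kernel(association):
    """Gaussian interaction profile kernel between the rows of an association matrix.

    Row i is the interaction profile of entity i (for example, one microbe's
    known associations with every disease).
    """
    n = len(association)
    # Normalize the bandwidth by the mean squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in row) for row in association) / n
    if mean_sq_norm == 0:
        raise ValueError("association matrix has no known associations")
    gamma = 1.0 / mean_sq_norm  # bandwidth numerator of 1 is an assumed default
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(association[i], association[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel
```

Passing the microbe-by-disease matrix yields microbe similarities, and passing its transpose yields disease similarities; entities with identical known associations get a similarity of 1.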

Results

The proposed SAELGMDA method was compared with four state-of-the-art MDA methods (MNNMDA, GATMDA, NTSHMDA, and LRLSHMDA) under five-fold cross validations on diseases, microbes, and microbe-disease pairs on the HMDAD and Disbiome databases. The results show that SAELGMDA computed the best accuracy, Matthews correlation coefficient, AUC, and AUPR under the majority of conditions, outperforming the other four MDA prediction models. In particular, SAELGMDA obtained the best AUCs of 0.8358 and 0.9301 under cross validation on diseases, 0.9838 and 0.9293 under cross validation on microbes, and 0.9857 and 0.9358 under cross validation on microbe-disease pairs on the HMDAD and Disbiome databases. Colorectal cancer, inflammatory bowel disease, and lung cancer are diseases that severely threaten human health. We used the proposed SAELGMDA method to find possible microbes for the three diseases. The results demonstrate a potential association between Clostridium coccoides and colorectal cancer and another between Sphingomonadaceae and inflammatory bowel disease. In addition, Veillonella may be associated with autism. The inferred MDAs need further validation.

Conclusion

We anticipate that the proposed SAELGMDA method contributes to the identification of new MDAs.
