1
Lu S, Liang Y, Li L, Miao R, Liao S, Zou Y, Yang C, Ouyang D. Predicting potential microbe-disease associations based on auto-encoder and graph convolution network. BMC Bioinformatics 2023; 24:476. [PMID: 38097930 PMCID: PMC10722760 DOI: 10.1186/s12859-023-05611-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Accepted: 12/11/2023] [Indexed: 12/17/2023] Open
Abstract
A growing body of research has consistently demonstrated an intricate correlation between the human microbiome and human well-being. Microbes can affect the efficacy and toxicity of drugs through various pathways and influence the occurrence and metastasis of tumors. In clinical practice, it is crucial to elucidate the associations between microbes and diseases. Although traditional biological experiments identify these associations accurately, they are time-consuming, expensive, and sensitive to experimental conditions, so screening potential microbe-disease associations through extensive experiments is challenging. Computational methods can address these problems, but previous methods still make poor use of node features, and their prediction accuracy needs to be improved. To address this issue, we propose the DAEGCNDF model for predicting potential associations between microbes and diseases. Our model calculates four similarity features for each microbe and disease. These features are fused to obtain a comprehensive feature matrix representing microbes and diseases. Our model first uses a graph convolutional network module to extract low-rank features carrying graph information about microbes and diseases, and then uses a deep sparse auto-encoder to extract high-rank features of microbe-disease pairs; the low-rank and high-rank features are then concatenated to improve the utilization of node features. Finally, Deep Forest is used to predict potential microbe-disease relationships. The experimental results show that combining low-rank and high-rank features helps to improve model performance and that Deep Forest has better classification performance than the baseline model.
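As an illustration of the graph convolution step mentioned in this abstract, the sketch below applies one layer of the widely used propagation rule H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W). This is an assumption for illustration only: the abstract does not state which GCN variant DAEGCNDF uses, and the function names, the toy graph, and the weights are all hypothetical.

```python
import math

def normalized_adjacency(adj):
    """Return D^(-1/2) (A + I) D^(-1/2) for a square, non-negative adjacency matrix."""
    n = len(adj)
    # Add self-loops so every node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [[inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j] for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, features, weights):
    """One graph convolution: aggregate neighbours, project, then apply ReLU."""
    propagated = matmul(normalized_adjacency(adj), features)
    return [[max(0.0, v) for v in row] for row in matmul(propagated, weights)]

# Hypothetical toy graph: two nodes joined by a single edge.
adj = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0, 0.0], [0.0, 1.0]]   # one-hot node features (made up)
weights = [[1.0, 0.0], [0.0, 1.0]]    # identity weights, for readability only
hidden = gcn_layer(adj, features, weights)
```

With identity weights, each node's output is simply the degree-normalized average of its own features and its neighbour's, which makes the smoothing effect of the layer easy to see.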
Affiliation(s)
- Shanghui Lu
  - Faculty of Innovation Enginee, Macau University of Science and Technology, Avenida Wai Long, Taipa, 999078, Macao, Macao Special Administrative Region of China, China
  - School of Mathematics and Physics, Hechi University, No. 42, Longjiang, Hechi, 546300, Guangxi, China
- Yong Liang
  - Faculty of Innovation Enginee, Macau University of Science and Technology, Avenida Wai Long, Taipa, 999078, Macao, Macao Special Administrative Region of China, China
  - Peng Cheng Laboratory, Shenzhen, 518055, Guangdong, China
- Le Li
  - Faculty of Innovation Enginee, Macau University of Science and Technology, Avenida Wai Long, Taipa, 999078, Macao, Macao Special Administrative Region of China, China
- Rui Miao
  - Basic Teaching Department, Zhuhai Campus of Zunyi Medical University, Zhuhai, 519041, Guangdong, China
- Shuilin Liao
  - Faculty of Innovation Enginee, Macau University of Science and Technology, Avenida Wai Long, Taipa, 999078, Macao, Macao Special Administrative Region of China, China
- Yongfu Zou
  - School of Mathematics and Physics, Hechi University, No. 42, Longjiang, Hechi, 546300, Guangxi, China
- Chengjun Yang
  - School of Artificial Intelligence and Manufacturing, Hechi University, No. 42, Longjiang, Hechi, 546300, Guangxi, China
- Dong Ouyang
  - School of Biomedical Engineering, Guangdong Medical University, No. 1, Xincheng, Zhanjiang, 523808, Guangdong, China
2
Shokri Garjan H, Omidi Y, Poursheikhali Asghari M, Ferdousi R. In-silico computational approaches to study microbiota impacts on diseases and pharmacotherapy. Gut Pathog 2023; 15:10. [PMID: 36882861 PMCID: PMC9990230 DOI: 10.1186/s13099-023-00535-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2023] [Accepted: 02/21/2023] [Indexed: 03/09/2023] Open
Abstract
Thanks to advances in sequencing technology and microbiology, microorganisms have been linked to a variety of critical human diseases. The growing recognition of human microbe-disease relationships provides crucial insights into the underlying disease process from the perspective of pathogens, which is extremely useful for pathogenesis research, early diagnosis, and precision medicine and therapy. Microbe-based analysis of diseases and related drug discovery can predict new connections and mechanisms and provide new concepts. These phenomena have been studied via various in-silico computational approaches. This review elaborates on the computational work conducted on microbe-disease and microbe-drug topics, discusses the computational modeling approaches used for predicting associations, and provides comprehensive information on the related databases. Finally, we discuss potential prospects and obstacles in this field of study and outline some recommendations for further enhancing predictive capabilities.
Affiliation(s)
- Hassan Shokri Garjan
  - Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran
- Yadollah Omidi
  - Department of Pharmaceutical Sciences, Nova Southeastern University, College of Pharmacy, Fort Lauderdale, FL, USA
- Reza Ferdousi
  - Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran
3
Guan J, Zhang ZG, Liu Y, Wang M. A novel bi-directional heterogeneous network selection method for disease and microbial association prediction. BMC Bioinformatics 2022; 23:483. [PMID: 36376802 PMCID: PMC9664813 DOI: 10.1186/s12859-022-04961-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2022] [Accepted: 09/21/2022] [Indexed: 11/16/2022] Open
Abstract
Microorganisms in the human body have a great impact on human health. Understanding the potential relationships between microorganisms and diseases therefore helps to explain the pathogenesis of diseases and is of great significance for their prevention, diagnosis, and treatment. To predict potential microbe-disease relationships, we propose a new computational model. First, a bi-directional heterogeneous microbe-disease network is constructed by integrating multiple similarities, including Gaussian kernel similarity, microbial function similarity, disease semantic similarity, and disease symptom similarity. Second, the neighbor information of the network is learned by random walk. Finally, the selection model is used for information aggregation, and the microbe-disease node pairs are analyzed. Our method is superior to the existing methods in leave-one-out cross-validation and five-fold cross-validation. Moreover, in case studies of different diseases, our method was proven to be effective.
4
Munir T, Akbar MS, Ahmed S, Sarfraz A, Sarfraz Z, Sarfraz M, Felix M, Cherrez-Ojeda I. A Systematic Review of Internet of Things in Clinical Laboratories: Opportunities, Advantages, and Challenges. Sensors (Basel, Switzerland) 2022; 22:8051. [PMID: 36298402 PMCID: PMC9611742 DOI: 10.3390/s22208051] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/19/2022] [Revised: 10/10/2022] [Accepted: 10/11/2022] [Indexed: 06/16/2023]
Abstract
The Internet of Things (IoT) is the network of physical objects embedded with sensors, software, electronics, and online connectivity systems. This study explores the role of IoT in clinical laboratory processes. This systematic review was conducted adhering to the PRISMA Statement 2020 guidelines. We included IoT models and applications across preanalytical, analytical, and postanalytical laboratory processes. PubMed, Cochrane Central, CINAHL Plus, Scopus, IEEE, and the ACM Digital Library were searched from August 2015 to August 2022, and the data were tabulated. Cohen's coefficient of agreement was calculated to quantify inter-reviewer agreement; a total of 18 studies were included, with Cohen's coefficient computed to be 0.91. The included studies were divided into three classifications based on availability: preanalytical, analytical, and postanalytical. The majority (77.8%) of the studies were real-tested. Communication-based approaches were the most common (83.3%), followed by application-based approaches (44.4%) and sensor-based approaches (33.3%). Open issues and challenges across the included studies were scalability, costs and energy consumption, interoperability, privacy and security, and performance. In this study, we identified, classified, and evaluated IoT applicability in clinical laboratory systems. This study presents pertinent findings for IoT development across clinical laboratory systems, for which it is essential that more rigorous and efficient testing and studies be conducted in the future.
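For readers unfamiliar with the agreement statistic reported in this abstract, the following is a minimal sketch of Cohen's kappa for two reviewers, computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The screening decisions are made-up illustrative data, not taken from the review, and the function name is hypothetical.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)

# Hypothetical screening decisions for six candidate studies.
reviewer_1 = ["include", "include", "exclude", "exclude", "include", "exclude"]
reviewer_2 = ["include", "include", "exclude", "include", "include", "exclude"]
kappa = cohen_kappa(reviewer_1, reviewer_2)
```

In this toy example the reviewers agree on five of six items (p_o = 5/6) while chance agreement is 0.5, so kappa is 2/3; a value such as the 0.91 reported in the abstract indicates far closer agreement.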
Affiliation(s)
- Tahir Munir
  - Department of Research, Nishtar Medical University, Multan 66000, Pakistan
- Sadia Ahmed
  - Department of Research, Punjab Medical College, Faisalabad 38000, Pakistan
- Azza Sarfraz
  - Department of Pediatrics and Child Health, The Aga Khan University, Karachi 74800, Pakistan
- Zouina Sarfraz
  - Department of Research and Publications, Fatima Jinnah Medical University, Lahore 54000, Pakistan
- Muzna Sarfraz
  - Department of Research, King Edward Medical University, Lahore 54000, Pakistan
- Miguel Felix
  - Department of Pulmonology, Universidad Espíritu Santo, Samborondón 092301, Ecuador
- Ivan Cherrez-Ojeda
  - Department of Pulmonology, Universidad Espíritu Santo, Samborondón 092301, Ecuador
5
Hua M, Yu S, Liu T, Yang X, Wang H. MVGCNMDA: Multi-view Graph Augmentation Convolutional Network for Uncovering Disease-Related Microbes. Interdiscip Sci 2022; 14:669-682. [PMID: 35428964 DOI: 10.1007/s12539-022-00514-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/23/2021] [Revised: 03/06/2022] [Accepted: 03/13/2022] [Indexed: 06/14/2023]
Abstract
MOTIVATION: Exploring the interrelationships between microbes and disease can help microbiologists make decisions and plan treatments. Predicting new microbe-disease associations currently relies on biological experiments and domain knowledge, which is time-consuming and inefficient. Automated algorithms are used to uncover the intrinsic link between microbes and disease. However, due to data noise and inadequate understanding of the relevant biology, the efficient prediction of microbe-disease associations is still crucial. This study develops a multi-view graph augmentation convolutional network (MVGCNMDA) to predict potential disease-associated microbes. METHODS: First, we use two data augmentation methods, edge perturbation and node dropping, to remove the data noise in the preprocessing stage. Second, we calculate Gaussian interaction profile kernel similarity and cosine similarity, so that the Graph Convolutional Network (GCN) can fully use multi-view features. Then, the multi-view features are fed into the multi-attention block to learn the weights of different features adaptively. Finally, the embedding results are obtained using a Convolutional Neural Network (CNN) combiner, and matrix completion is used to predict the relationship between potential microbes and diseases. RESULTS: We test our model on the Human Microbe-Disease Association Database (HMDAD), Disbiome, and the Combined Dataset (Peryton and MicroPhenoDB). The area under the PR curve (AUPR), area under the ROC curve (AUC), F1 score, and recall are calculated to evaluate the performance of MVGCNMDA. The AUPR is 0.9440, the AUC is 0.9428, the F1 score is 0.9383, and the recall is 0.8858. The experiments show that our model can accurately predict potential microbe-disease associations compared with state-of-the-art works under global leave-one-out cross-validation (LOOCV) and fivefold cross-validation (fivefold CV).
To further verify the effectiveness of the proposed graph data augmentation, we designed five different settings in the ablation study. Furthermore, we present two case studies that validate the prediction of the potential association between microbes and diseases by MVGCNMDA.
Affiliation(s)
- Meifang Hua
  - School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
- Shengpeng Yu
  - School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
- Tianyu Liu
  - School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
- Xue Yang
  - School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
- Hong Wang
  - School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
6
Wang Y, Lei X, Lu C, Pan Y. Predicting Microbe-Disease Association Based on Multiple Similarities and LINE Algorithm. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2399-2408. [PMID: 34014827 DOI: 10.1109/tcbb.2021.3082183] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 06/12/2023]
Abstract
Numerous microbes have been found to have vital impacts on human health by affecting biological processes. Therefore, exploring potential associations between microbes and diseases will promote the understanding and diagnosis of diseases. In this study, we present a novel computational model, named MSLINE, to infer potential microbe-disease associations by integrating Multiple Similarities and Large-scale Information Network Embedding (LINE) based on known associations. Specifically, starting from the known microbe-disease associations in the Human Microbe-Disease Association Database, we first expand the set of known associations by collecting proven associations from the existing literature. We then construct a microbe-disease heterogeneous network (MDHN) by integrating known associations and multiple similarities (including Gaussian interaction profile kernel similarity, microbe function similarity, disease semantic similarity, and disease-symptom similarity). After that, we run random walk and the LINE algorithm on the MDHN to learn its structural information. Finally, we score the microbe-disease associations according to the structural information of every node. In leave-one-out cross-validation and 5-fold cross-validation, MSLINE performs better than other existing methods. Moreover, case studies of different diseases showed that MSLINE can predict potential microbe-disease associations efficiently.
7
Pardo-Diaz J, Poole PS, Beguerisse-Díaz M, Deane CM, Reinert G. Generating weighted and thresholded gene coexpression networks using signed distance correlation. Network Science (Cambridge University Press) 2022; 10:131-145. [PMID: 36217370 PMCID: PMC7613200 DOI: 10.1017/nws.2022.13] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Even within well-studied organisms, many genes lack useful functional annotations. One way to generate such functional information is to infer biological relationships between genes or proteins, using a network of gene coexpression data that includes functional annotations. Signed distance correlation has proved useful for the construction of unweighted gene coexpression networks. However, transforming correlation values into unweighted networks may lead to a loss of important biological information related to the intensity of the correlation. Here we introduce a principled method to construct weighted gene coexpression networks using signed distance correlation. These networks contain weighted edges only between those pairs of genes whose correlation value is higher than a given threshold. We analyse data from different organisms and find that networks generated with our method based on signed distance correlation are more stable and capture more biological information compared to networks obtained from Pearson correlation. Moreover, we show that signed distance correlation networks capture more biological information than unweighted networks based on the same metric. While we use biological data sets to illustrate the method, the approach is general and can be used to construct networks in other domains. Code and data are available on https://github.com/javier-pardodiaz/sdcorGCN.
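The construction described in this abstract can be sketched roughly as follows. The sketch assumes that signed distance correlation is the sample distance correlation multiplied by the sign of the Pearson correlation, and that an edge is kept, with the correlation as its weight, only when the value exceeds the threshold. Both points are assumptions made for illustration; the authors' own code is in the linked repository. All function names and the expression profiles are hypothetical.

```python
import math

def _centered_distances(x):
    """Pairwise |x_i - x_j| matrix, double-centered by row, column and grand means."""
    n = len(x)
    d = [[abs(x[i] - x[j]) for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in d]   # d is symmetric, so column means match
    grand = sum(row_mean) / n
    return [[d[i][j] - row_mean[i] - row_mean[j] + grand for j in range(n)]
            for i in range(n)]

def distance_correlation(x, y):
    """Sample distance correlation of two equal-length 1-D samples."""
    a, b = _centered_distances(x), _centered_distances(y)
    n = len(x)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar_x = sum(v * v for r in a for v in r) / n ** 2
    dvar_y = sum(v * v for r in b for v in r) / n ** 2
    denom = math.sqrt(dvar_x * dvar_y)
    return math.sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0

def signed_distance_correlation(x, y):
    """Distance correlation carrying the sign of the Pearson covariance (assumed)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sign = (cov > 0) - (cov < 0)
    return sign * distance_correlation(x, y)

def thresholded_network(profiles, threshold):
    """Weighted edges only between pairs whose value exceeds the threshold."""
    genes = sorted(profiles)
    edges = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            w = signed_distance_correlation(profiles[g], profiles[h])
            if w > threshold:
                edges[(g, h)] = w
    return edges

# Hypothetical expression profiles for three genes across four samples.
profiles = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [8, 6, 4, 2]}
network = thresholded_network(profiles, threshold=0.5)
```

In this toy data `g1` and `g2` rise together and keep an edge of weight close to 1, while `g3` is anticorrelated with both and is left unconnected.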
Affiliation(s)
- Philip S Poole
  - Department of Plant Sciences, University of Oxford, Oxford OX1 3RB, UK
- Gesine Reinert
  - Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
8
Wang L, Tan Y, Yang X, Kuang L, Ping P. Review on predicting pairwise relationships between human microbes, drugs and diseases: from biological data to computational models. Brief Bioinform 2022; 23:6553604. [PMID: 35325024 DOI: 10.1093/bib/bbac080] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 12/14/2021] [Revised: 02/14/2022] [Accepted: 02/15/2022] [Indexed: 12/11/2022] Open
Abstract
In recent years, with the rapid development of techniques in bioinformatics and life science, a considerable quantity of biomedical data has accumulated, based on which researchers have developed various computational approaches to discover potential associations between human microbes, drugs and diseases. This paper provides a comprehensive overview of recent advances in predicting potential correlations between microbes, drugs and diseases, from biological data to computational models. First, we introduce in detail the widely used datasets relevant to identifying potential relationships between microbes, drugs and diseases. Then, we divide a series of representative computational models into five major categories (network, matrix factorization, matrix completion, regularization and artificial neural network) for in-depth discussion and comparison. Finally, we analyse possible challenges and opportunities in this research area and outline some suggestions for further improving predictive performance.
Affiliation(s)
- Lei Wang
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Yaqin Tan
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Xiaoyu Yang
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Linai Kuang
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Pengyao Ping
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China
9
Wang L, Li H, Wang Y, Tan Y, Chen Z, Pei T, Zou Q. MDADP: A webserver integrating database and prediction tools for microbe-disease associations. IEEE J Biomed Health Inform 2022; 26:3427-3434. [PMID: 35254998 DOI: 10.1109/jbhi.2022.3156166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Abstract
A growing body of evidence has demonstrated that microbiota play important roles in the life processes of the human body. In recent years, various computational methods have been proposed for identifying potentially disease-associated microbes to save the costs of traditional biological experiments. However, the prediction performance of these methods is generally limited by outdated and incomplete datasets. Moreover, few studies so far provide visual predictive tools for inferring possible microbe-disease associations (MDAs). Hence, in this manuscript, we propose a novel webserver called MDADP to identify latent MDAs, which combines a new MDA database with interactive prediction tools for MDA studies. In the newly constructed MDA database, 2019 known MDAs between 58 diseases and 703 microbes were first collected manually. Then, by adopting the average ranking method and the co-confidence method respectively, eight representative computational models were integrated to identify potential disease-related microbes. As a result, MDADP provides not only interactive features for users to access and capture MDA entities, but also effective tools for users to identify candidate microbes for different diseases. To our knowledge, MDADP is the first online platform that incorporates a new MDA database with comprehensive MDA prediction tools. Therefore, we believe that it will be a valuable source of information for research in microbiology and disease-related fields. MDADP can be accessed at http://mdadp.leelab2997.cn.
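One plausible reading of the "average ranking method" named in this abstract is sketched below: each model ranks the candidate microbes by its own score, and the candidates are then ordered by their mean rank across models. The abstract does not specify the exact procedure, so this interpretation, the function name, the model names, and the scores are all assumptions for illustration.

```python
def average_ranking(model_scores):
    """Combine several models by mean rank.

    model_scores maps a model name to {candidate: score}; a higher score
    means the model considers the association more likely.
    """
    # Only candidates scored by every model can be ranked fairly.
    candidates = set.intersection(*(set(s) for s in model_scores.values()))
    rank_totals = {c: 0.0 for c in candidates}
    for scores in model_scores.values():
        # Rank 1 is the highest score; ties are broken by name for determinism.
        ordered = sorted(candidates, key=lambda c: (-scores[c], c))
        for rank, candidate in enumerate(ordered, start=1):
            rank_totals[candidate] += rank
    n_models = len(model_scores)
    mean_ranks = {c: total / n_models for c, total in rank_totals.items()}
    # Best (lowest) mean rank first.
    return sorted(mean_ranks.items(), key=lambda item: (item[1], item[0]))

# Hypothetical scores from three models for three candidate microbes.
model_scores = {
    "model_a": {"m1": 0.9, "m2": 0.5, "m3": 0.1},
    "model_b": {"m1": 0.4, "m2": 0.8, "m3": 0.3},
    "model_c": {"m1": 0.7, "m2": 0.6, "m3": 0.2},
}
ranking = average_ranking(model_scores)
```

Working with ranks rather than raw scores sidesteps the fact that different models score on different scales, which is one common motivation for rank-based ensembles.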
10
Song HS, Lindemann SR, Lee DY. Editorial: Predictive Modeling of Human Microbiota and Their Role in Health and Disease. Front Microbiol 2021; 12:782871. [PMID: 34917060 PMCID: PMC8668940 DOI: 10.3389/fmicb.2021.782871] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/24/2021] [Accepted: 11/11/2021] [Indexed: 12/23/2022] Open
Affiliation(s)
- Hyun-Seob Song
  - Department of Biological Systems Engineering, University of Nebraska-Lincoln, Lincoln, NE, United States
  - Department of Food Science and Technology, Nebraska Food for Health Center, University of Nebraska-Lincoln, Lincoln, NE, United States
- Stephen R Lindemann
  - Department of Food Science, Whistler Center for Carbohydrate Research, Purdue University, West Lafayette, IN, United States
  - Department of Nutrition Science, Purdue University, West Lafayette, IN, United States
- Dong-Yup Lee
  - School of Chemical Engineering, Sungkyunkwan University, Suwon, South Korea
11
Li H, Wang Y, Zhang Z, Tan Y, Chen Z, Wang X, Pei T, Wang L. Identifying Microbe-Disease Association Based on a Novel Back-Propagation Neural Network Model. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:2502-2513. [PMID: 32305935 DOI: 10.1109/tcbb.2020.2986459] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 06/11/2023]
Abstract
Over the years, abundant evidence has demonstrated that microbes living in the human body are closely related to human life activities and human diseases. However, traditional biological experiments are time-consuming and expensive, so predicting potential microbe-disease associations with computational methods has become a research topic in bioinformatics. In this study, a novel computational method called BPNNHMDA is proposed to identify potential microbe-disease associations. In BPNNHMDA, a novel neural network model is first designed to infer potential microbe-disease associations; its input is a matrix of known microbe-disease associations, and its output is a matrix of probabilities of potential microbe-disease associations. Moreover, in this neural network model, a new activation function based on the hyperbolic tangent function is designed to activate the hidden layer and the output layer, and the initial connection weights are optimized by adopting Gaussian Interaction Profile (GIP) kernel similarity for microbes, which improves the training speed of BPNNHMDA. Finally, to verify the performance of our prediction model, Leave-One-Out Cross Validation (LOOCV) and k-Fold Cross Validation (k-Fold CV) are applied to BPNNHMDA. Simulation results show that BPNNHMDA achieves reliable AUCs of 0.9242, 0.9127 ± 0.0009 and 0.8955 ± 0.0018 in LOOCV, 5-Fold CV and 2-Fold CV respectively, which are superior to previous state-of-the-art methods. Furthermore, case studies of inflammatory bowel disease (IBD), asthma and obesity demonstrate that BPNNHMDA has excellent prediction ability in practical applications as well.
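The AUC values quoted in this abstract summarize how well predicted scores separate known associations from unknown pairs. As a minimal sketch, the AUC can be computed as the fraction of (positive, negative) pairs in which the positive receives the higher score, with ties counted as one half. The labels and scores below are made up for illustration, and the function name is hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0      # positive correctly ranked above negative
            elif p == q:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))

# Hypothetical held-out pairs: 1 = known association, 0 = unknown pair.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]
auc = roc_auc(labels, scores)
```

Here five of the six positive-negative pairs are ordered correctly, giving an AUC of 5/6. The quadratic pairwise loop is chosen for clarity; rank-based formulas give the same value more efficiently on large datasets.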
12
Pardo-Diaz J, Bozhilova LV, Beguerisse-Díaz M, Poole PS, Deane CM, Reinert G. Robust gene coexpression networks using signed distance correlation. Bioinformatics 2021; 37:btab041. [PMID: 33523234 PMCID: PMC8557847 DOI: 10.1093/bioinformatics/btab041] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/24/2020] [Revised: 11/30/2020] [Accepted: 01/21/2021] [Indexed: 12/30/2022] Open
Abstract
MOTIVATION: Even within well-studied organisms, many genes lack useful functional annotations. One way to generate such functional information is to infer biological relationships between genes/proteins, using a network of gene coexpression data that includes functional annotations. However, the lack of trustworthy functional annotations can impede the validation of such networks. Hence, there is a need for a principled method to construct gene coexpression networks that capture biological information and are structurally stable even in the absence of functional information. RESULTS: We introduce the concept of signed distance correlation as a measure of dependency between two variables, and apply it to generate gene coexpression networks. Distance correlation offers a more intuitive approach to network construction than commonly used methods such as Pearson correlation and mutual information. We propose a framework to generate self-consistent networks using signed distance correlation purely from gene expression data, with no additional information. We analyse data from three different organisms to illustrate how networks generated with our method are more stable and capture more biological information compared to networks obtained from Pearson correlation or mutual information. SUPPLEMENTARY INFORMATION: Supplementary Information and code are available at Bioinformatics and https://github.com/javier-pardodiaz/sdcorGCN online.
Affiliation(s)
- Javier Pardo-Diaz
  - Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
  - Department of Plant Sciences, University of Oxford, Oxford OX1 3RB, UK
- Philip S Poole
  - Department of Plant Sciences, University of Oxford, Oxford OX1 3RB, UK
- Gesine Reinert
  - Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
13
Zhao Y, Wang CC, Chen X. Microbes and complex diseases: from experimental results to computational models. Brief Bioinform 2020; 22:5882184. [PMID: 32766753 DOI: 10.1093/bib/bbaa158] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 04/09/2020] [Revised: 06/19/2020] [Accepted: 06/22/2020] [Indexed: 12/13/2022] Open
Abstract
Studies have shown that the number of microbes in humans is almost 10 times that of cells. These microbes have been proven to play an important role in a variety of physiological processes, such as enhancing immunity, improving the digestion of gastrointestinal tract and strengthening metabolic function. In addition, in recent years, more and more research results have indicated that there are close relationships between the emergence of the human noncommunicable diseases and microbes, which provides a novel insight for us to further understand the pathogenesis of the diseases. An in-depth study about the relationships between diseases and microbes will not only contribute to exploring new strategies for the diagnosis and treatment of diseases but also significantly heighten the efficiency of new drugs development. However, applying the methods of biological experimentation to reveal the microbe-disease associations is costly and inefficient. In recent years, more and more researchers have constructed multiple computational models to predict microbes that are potentially associated with diseases. Here, we start with a brief introduction of microbes and databases as well as web servers related to them. Then, we mainly introduce four kinds of computational models, including score function-based models, network algorithm-based models, machine learning-based models and experimental analysis-based models. Finally, we summarize the advantages as well as disadvantages of them and set the direction for the future work of revealing microbe-disease associations based on computational models. We firmly believe that computational models are expected to be important tools in large-scale predictions of disease-related microbes.
Affiliation(s)
- Yan Zhao
  - School of Information and Control Engineering, China University of Mining
- Chun-Chun Wang
  - School of Information and Control Engineering, China University of Mining
- Xing Chen
  - School of Information and Control Engineering, China University of Mining
14
Wen Z, Yan C, Duan G, Li S, Wu FX, Wang J. A survey on predicting microbe-disease associations: biological data and computational methods. Brief Bioinform 2020; 22:5881365. [PMID: 34020541 DOI: 10.1093/bib/bbaa157] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 05/28/2020] [Revised: 06/18/2020] [Accepted: 06/22/2020] [Indexed: 02/06/2023] Open
Abstract
Various microbes have proved to be closely related to the pathogenesis of human diseases. While many computational methods for predicting human microbe-disease associations (MDAs) have been developed, few systematic reviews on these methods have been reported. In this study, we provide a comprehensive overview of the existing methods. Firstly, we introduce the data used in existing MDA prediction methods. Secondly, we classify those methods into different categories by their nature and describe their algorithms and strategies in detail. Next, experimental evaluations are conducted on representative methods using different similarity data and calculation methods to compare their prediction performances. Based on the principles of computational methods and experimental results, we discuss the advantages and disadvantages of those methods and propose suggestions for the improvement of prediction performances. Considering the problems of the MDA prediction at present stage, we discuss future work from three perspectives including data, methods and formulations at the end.
Affiliation(s)
- Zhongqi Wen: Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering at Central South University, Hunan, China
- Cheng Yan: School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
- Guihua Duan: School of Computer Science and Engineering, Central South University
- Suning Li: Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
- Fang-Xiang Wu: College of Engineering and the Department of Computer Sciences, University of Saskatchewan, Saskatoon, Canada
- Jianxin Wang: Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering at Central South University, Hunan, China
15
Lei X, Wang Y. Predicting Microbe-Disease Association by Learning Graph Representations and Rule-Based Inference on the Heterogeneous Network. Front Microbiol 2020; 11:579. [PMID: 32351464 PMCID: PMC7174569 DOI: 10.3389/fmicb.2020.00579] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 11/18/2019] [Accepted: 03/17/2020] [Indexed: 12/18/2022]
Abstract
A growing number of clinical observations imply that microbes have great effects on human diseases. Understanding the relations between microbes and diseases is of profound significance for disease prevention and therapy. In this paper, we propose a predictive model that uses known microbe-disease associations to discover potential ones by integrating Learning Graph Representations and a modified Scoring mechanism on the Heterogeneous network (called LGRSH). First, the similarity networks for microbes and diseases are obtained from Gaussian interaction profile kernel similarity. Then, we construct a heterogeneous network comprising these two similarity networks and the microbe-disease association network. After that, the embedding algorithm Node2vec is applied to learn representations of the nodes in the heterogeneous network. Finally, from these low-dimensional vector representations, we calculate the relevance between each microbe and disease using a modified rule-based inference method. Compared with three other methods, LRLSHMDA, KATZHMDA and BiRWHMDA, LGRSH performs better. Moreover, in case studies of asthma, chronic obstructive pulmonary disease and inflammatory bowel disease, 8, 8, and 10 of the top-10 predicted disease-related microbes were validated, respectively, demonstrating that LGRSH performs well in predicting potential microbe-disease associations.
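The first two steps described in this abstract (Gaussian interaction profile kernel similarity, then a heterogeneous network) can be sketched roughly as follows. This is a minimal illustration and not the LGRSH implementation: the function names, the toy association matrix and the bandwidth value of 1.0 are assumptions made here, the bandwidth normalisation follows the commonly used form of the kernel, and the Node2vec embedding and rule-based scoring steps are not shown.

```python
import numpy as np


def gip_kernel_similarity(profiles, bandwidth=1.0):
    """Gaussian interaction profile (GIP) kernel similarity between rows.

    Each row of `profiles` is the binary interaction profile of one node.
    `bandwidth` is an assumed default; the abstract does not state a value.
    """
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = np.sum(profiles ** 2, axis=1)
    mean_sq_norm = sq_norms.mean()
    if mean_sq_norm == 0:
        raise ValueError("association matrix contains no known associations")
    # Normalise the bandwidth by the mean squared norm of the profiles.
    gamma = bandwidth / mean_sq_norm
    # Pairwise squared Euclidean distances between profiles.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (profiles @ profiles.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-gamma * sq_dists)


def build_heterogeneous_adjacency(assoc):
    """Stack disease similarity, microbe similarity and associations.

    `assoc` is a (diseases x microbes) binary matrix; this layout is an
    assumption of the sketch.
    """
    assoc = np.asarray(assoc, dtype=float)
    disease_sim = gip_kernel_similarity(assoc)    # rows are disease profiles
    microbe_sim = gip_kernel_similarity(assoc.T)  # rows are microbe profiles
    return np.block([[disease_sim, assoc], [assoc.T, microbe_sim]])


# Hypothetical toy data: 2 diseases x 3 microbes.
toy_assoc = np.array([[1, 0, 1],
                      [0, 1, 0]])
het = build_heterogeneous_adjacency(toy_assoc)
print(het.shape)
```

The resulting square matrix has one node per disease followed by one node per microbe, and could serve as a weighted adjacency matrix for a node-embedding method.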
Affiliation(s)
- Xiujuan Lei: School of Computer Science, Shaanxi Normal University, Xi'an, China
- Yueyue Wang: School of Computer Science, Shaanxi Normal University, Xi'an, China
16
Jiang D, Armour CR, Hu C, Mei M, Tian C, Sharpton TJ, Jiang Y. Microbiome Multi-Omics Network Analysis: Statistical Considerations, Limitations, and Opportunities. Front Genet 2019; 10:995. [PMID: 31781153 PMCID: PMC6857202 DOI: 10.3389/fgene.2019.00995] [Citation(s) in RCA: 83] [Impact Index Per Article: 16.6] [Received: 02/13/2019] [Accepted: 09/18/2019] [Indexed: 12/21/2022]
Abstract
The advent of large-scale microbiome studies affords newfound analytical opportunities to understand how these communities of microbes operate and relate to their environment. However, the analytical methodology needed to model microbiome data and integrate them with other data constructs remains nascent. This emergent analytical toolset frequently ports over techniques developed in other multi-omics investigations, especially the growing array of statistical and computational techniques for integrating and representing data through networks. While network analysis has emerged as a powerful approach to modeling microbiome data, oftentimes by integrating these data with other types of omics data to discern their functional linkages, it is not always evident if the statistical details of the approach being applied are consistent with the assumptions of microbiome data or how they impact data interpretation. In this review, we overview some of the most important network methods for integrative analysis, with an emphasis on methods that have been applied or have great potential to be applied to the analysis of multi-omics integration of microbiome data. We compare advantages and disadvantages of various statistical tools, assess their applicability to microbiome data, and discuss their biological interpretability. We also highlight on-going statistical challenges and opportunities for integrative network analysis of microbiome data.
Affiliation(s)
- Duo Jiang: Department of Statistics, Oregon State University, Corvallis, OR, United States
- Courtney R Armour: Department of Microbiology, Oregon State University, Corvallis, OR, United States
- Chenxiao Hu: Department of Statistics, Oregon State University, Corvallis, OR, United States
- Meng Mei: Department of Statistics, Oregon State University, Corvallis, OR, United States
- Chuan Tian: Department of Statistics, Oregon State University, Corvallis, OR, United States
- Thomas J Sharpton: Department of Statistics, Oregon State University, Corvallis, OR, United States; Department of Microbiology, Oregon State University, Corvallis, OR, United States
- Yuan Jiang: Department of Statistics, Oregon State University, Corvallis, OR, United States