Reference Citation Analysis

For: Oudah M, Henschel A. Taxonomy-aware feature engineering for microbiome classification. BMC Bioinformatics 2018;19:227. [PMID: 29907097 PMCID: PMC6003080 DOI: 10.1186/s12859-018-2205-3] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 03/12/2017] [Accepted: 05/15/2018] [Indexed: 12/17/2022] Open

Cited by Other Article(s)

Sun W, Wang S, Bi J, Ning Z, Wang J, Hou H. Study on the response mechanisms and evolution prediction of groundwater microbial-toxicological indicators. WATER ENVIRONMENT RESEARCH : A RESEARCH PUBLICATION OF THE WATER ENVIRONMENT FEDERATION 2024;96:e11131. [PMID: 39327691 DOI: 10.1002/wer.11131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2024] [Revised: 08/10/2024] [Accepted: 08/28/2024] [Indexed: 09/28/2024]

Abstract

This study aims to investigate the response mechanisms of groundwater microbial-toxicological indicators, specifically total bacteria count (TBC) and total coliform count (TCC), to water quality indicators and environmental conditions. Using data from a water source in the western plateau of China, a predictive model focusing on TBC and TCC was developed. An orthogonal experimental design was employed to manipulate environmental conditions such as temperature, pH, and porosity, facilitating laboratory experiments. These experiments measured pH, chemical oxygen demand (COD), oxidation-reduction potential (ORP), TBC, and TCC at varying depths and environmental conditions. Principal component analysis elucidated the mechanisms by which water quality indicators and environmental conditions affect groundwater microbial-toxicological indicators. A prediction model for these indicators in plateau regions was established based on a backpropagation neural network (BP-NN), using TBC and TCC as target variables and the newly extracted principal components as influencing factors. The results demonstrate that environmental conditions and water quality indicators primarily influence the evolution of groundwater microbial-toxicological indicators by altering the ionic charge quantities, redox conditions, and temperature of the groundwater. The predictive model for groundwater microbial-toxicological indicators shows trends consistent with experimental outcomes, with an average relative error of less than 15%, meeting engineering requirements. PRACTITIONER POINTS: The values of total bacteria count (TBC) and total coliform count (TCC) under different conditions were obtained by column experiments. The influence mechanism of environmental conditions and groundwater indicators on TBC and TCC was elaborated by principal component analysis. TBC and TCC prediction models were established through the investigation of water sources in a plateau area and laboratory experiments.
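The abstract above reports model quality as an average relative error. As a minimal sketch of how such a figure is commonly computed, assuming the usual definition (the mean of |predicted - observed| / |observed|), which the abstract itself does not spell out, the hypothetical helper `mean_relative_error` below works through the arithmetic on made-up numbers:

```python
def mean_relative_error(observed, predicted):
    """Average of |predicted - observed| / |observed| over paired values.

    Assumed definition; the cited study may compute its error differently.
    Observed values must be non-zero.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs(p - o) / abs(o) for o, p in zip(observed, predicted)]
    return sum(ratios) / len(ratios)


# Illustrative values only, not data from the study.
observed_counts = [100.0, 200.0]
predicted_counts = [110.0, 180.0]
error = mean_relative_error(observed_counts, predicted_counts)
meets_threshold = error < 0.15  # the abstract cites a bound of 15%
```

Each prediction here is off by 10% of its observed value, so the average falls under the 15% bound quoted in the abstract.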

Hosseiniyan Khatibi SM, Dimaano NG, Veliz E, Sundaresan V, Ali J. Exploring and exploiting the rice phytobiome to tackle climate change challenges. PLANT COMMUNICATIONS 2024:101078. [PMID: 39233440 DOI: 10.1016/j.xplc.2024.101078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2024] [Revised: 08/07/2024] [Accepted: 09/02/2024] [Indexed: 09/06/2024]

Abstract

The future of agriculture is uncertain under the current climate change scenario. Climate change directly and indirectly affects the biotic and abiotic elements that control agroecosystems, jeopardizing the safety of the world's food supply. A new area that focuses on characterizing the phytobiome is emerging. The phytobiome comprises plants and their immediate surroundings, involving numerous interdependent microscopic and macroscopic organisms that affect the health and productivity of plants. Phytobiome studies primarily focus on the microbial communities associated with plants, which are referred to as the plant microbiome. The development of high-throughput sequencing technologies over the past 10 years has dramatically advanced our understanding of the structure, functionality, and dynamics of the phytobiome; however, comprehensive methods for using this knowledge are lacking, particularly for major crops such as rice. Considering the impact of rice production on world food security, gaining fresh perspectives on the interdependent and interrelated components of the rice phytobiome could enhance rice production and crop health, sustain rice ecosystem function, and combat the effects of climate change. Our review re-conceptualizes the complex dynamics of the microscopic and macroscopic components in the rice phytobiome as influenced by human interventions and changing environmental conditions driven by climate change. We also discuss interdisciplinary and systematic approaches to decipher and reprogram the sophisticated interactions in the rice phytobiome using novel strategies and cutting-edge technology. Merging the gigantic datasets and complex information on the rice phytobiome and their application in the context of regenerative agriculture could lead to sustainable rice farming practices that are resilient to the impacts of climate change.

Oliver A, Kay M, Lemay DG. TaxaHFE: a machine learning approach to collapse microbiome datasets using taxonomic structure. BIOINFORMATICS ADVANCES 2023;3:vbad165. [PMID: 38046097 PMCID: PMC10689668 DOI: 10.1093/bioadv/vbad165] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Revised: 09/28/2023] [Accepted: 11/27/2023] [Indexed: 12/05/2023]

Arjmandi M, Fattahi M, Motevassel M, Rezaveisi H. Evaluating algorithms of decision tree, support vector machine and regression for anode side catalyst data in proton exchange membrane water electrolysis. Sci Rep 2023;13:20309. [PMID: 37985795 PMCID: PMC10662483 DOI: 10.1038/s41598-023-47174-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Accepted: 11/09/2023] [Indexed: 11/22/2023] Open

Al-Aamri A, Kamarul Azman S, Daw Elbait G, Alsafar H, Henschel A. Critical assessment of on-premise approaches to scalable genome analysis. BMC Bioinformatics 2023;24:354. [PMID: 37735350 PMCID: PMC10512525 DOI: 10.1186/s12859-023-05470-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2023] [Accepted: 09/08/2023] [Indexed: 09/23/2023] Open

Xu W, Wang T, Wang N, Zhang H, Zha Y, Ji L, Chu Y, Ning K. Artificial intelligence-enabled microbiome-based diagnosis models for a broad spectrum of cancer types. Brief Bioinform 2023;24:7152257. [PMID: 37141141 DOI: 10.1093/bib/bbad178] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/26/2022] [Revised: 04/07/2023] [Accepted: 04/18/2023] [Indexed: 05/05/2023] Open

Affiliation(s)

Wei Xu Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center of Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
Teng Wang Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center of Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
Nan Wang Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center of Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
Haohong Zhang Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center of Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
Yuguo Zha Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center of Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
Lei Ji Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center of Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China; Geneis Beijing Co., Ltd., Beijing, 100102, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao 266000, China
Yuwen Chu Geneis Beijing Co., Ltd., Beijing, 100102, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao 266000, China; School of Electrical & Information Engineering, Anhui University of Technology, Anhui, 243002, China
Kang Ning Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center of Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China

Tapio M, Fischer D, Mäntysaari P, Tapio I. Rumen Microbiota Predicts Feed Efficiency of Primiparous Nordic Red Dairy Cows. Microorganisms 2023;11:1116. [PMID: 37317090 DOI: 10.3390/microorganisms11051116] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/24/2023] [Revised: 04/17/2023] [Accepted: 04/23/2023] [Indexed: 06/16/2023] Open

Kumar R, Yadav G, Kuddus M, Ashraf GM, Singh R. Unlocking the microbial studies through computational approaches: how far have we reached? ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2023;30:48929-48947. [PMID: 36920617 PMCID: PMC10016191 DOI: 10.1007/s11356-023-26220-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/23/2022] [Accepted: 02/24/2023] [Indexed: 04/16/2023]

Shen Y, Zhu J, Deng Z, Lu W, Wang H. EnsDeepDP: An Ensemble Deep Learning Approach for Disease Prediction Through Metagenomics. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023;20:986-998. [PMID: 36001521 DOI: 10.1109/tcbb.2022.3201295] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/04/2023]

Shtossel O, Isakov H, Turjeman S, Koren O, Louzoun Y. Ordering taxa in image convolution networks improves microbiome-based machine learning accuracy. Gut Microbes 2023;15:2224474. [PMID: 37345233 PMCID: PMC10288916 DOI: 10.1080/19490976.2023.2224474] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/15/2022] [Accepted: 06/08/2023] [Indexed: 06/23/2023] Open

Gavin PG, Kim KW, Craig ME, Hill MM, Hamilton-Williams EE. Multi-omic interactions in the gut of children at the onset of islet autoimmunity. MICROBIOME 2022;10:230. [PMID: 36527134 PMCID: PMC9756488 DOI: 10.1186/s40168-022-01425-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/18/2022] [Accepted: 11/11/2022] [Indexed: 06/17/2023]

Shen WX, Liang SR, Jiang YY, Chen YZ. Enhanced metagenomic deep learning for disease prediction and consistent signature recognition by restructured microbiome 2D representations. PATTERNS (NEW YORK, N.Y.) 2022;4:100658. [PMID: 36699735 PMCID: PMC9868677 DOI: 10.1016/j.patter.2022.100658] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/01/2022] [Revised: 07/15/2022] [Accepted: 11/15/2022] [Indexed: 12/23/2022]

Interpreting tree ensemble machine learning models with endoR. PLoS Comput Biol 2022;18:e1010714. [PMID: 36516158 PMCID: PMC9797088 DOI: 10.1371/journal.pcbi.1010714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Revised: 12/28/2022] [Accepted: 11/07/2022] [Indexed: 12/15/2022] Open

Abstract

Tree ensemble machine learning models are increasingly used in microbiome science as they are compatible with the compositional, high-dimensional, and sparse structure of sequence-based microbiome data. While such models are often good at predicting phenotypes based on microbiome data, they only yield limited insights into how microbial taxa may be associated. We developed endoR, a method to interpret tree ensemble models. First, endoR simplifies the fitted model into a decision ensemble. Then, it extracts information on the importance of individual features and their pairwise interactions, displaying them as an interpretable network. Both the endoR network and importance scores provide insights into how features, and interactions between them, contribute to the predictive performance of the fitted model. Adjustable regularization and bootstrapping help reduce the complexity and ensure that only essential parts of the model are retained. We assessed endoR on both simulated and real metagenomic data. We found endoR to have comparable accuracy to other common approaches while easing and enhancing model interpretation. Using endoR, we also confirmed published results on gut microbiome differences between cirrhotic and healthy individuals. Finally, we utilized endoR to explore associations between human gut methanogens and microbiome components. Indeed, these hydrogen consumers are expected to interact with fermenting bacteria in a complex syntrophic network. Specifically, we analyzed a global metagenome dataset of 2203 individuals and confirmed the previously reported association between Methanobacteriaceae and Christensenellales. Additionally, we observed that Methanobacteriaceae are associated with a network of hydrogen-producing bacteria. Our method accurately captures how tree ensembles use features and interactions between them to predict a response. 
As demonstrated by our applications, the resultant visualizations and summary outputs facilitate model interpretation and enable the generation of novel hypotheses about complex systems.

Loganathan T, Priya Doss C G. The influence of machine learning technologies in gut microbiome research and cancer studies - A review. Life Sci 2022;311:121118. [DOI: 10.1016/j.lfs.2022.121118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2022] [Revised: 10/19/2022] [Accepted: 10/19/2022] [Indexed: 11/18/2022]

Zeng W, Gautam A, Huson DH. DeepToA: An Ensemble Deep-Learning Approach to Predicting the Theater of Activity of a Microbiome. Bioinformatics 2022;38:4670-4676. [PMID: 36029249 DOI: 10.1093/bioinformatics/btac584] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2022] [Revised: 07/19/2022] [Accepted: 08/26/2022] [Indexed: 11/14/2022] Open

Zhou YH, Sun G. Improve the Colorectal Cancer Diagnosis Using Gut Microbiome Data. Front Mol Biosci 2022;9:921945. [PMID: 36032686 PMCID: PMC9415616 DOI: 10.3389/fmolb.2022.921945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2022] [Accepted: 06/16/2022] [Indexed: 11/17/2022] Open

McElhinney JMWR, Catacutan MK, Mawart A, Hasan A, Dias J. Interfacing Machine Learning and Microbial Omics: A Promising Means to Address Environmental Challenges. Front Microbiol 2022;13:851450. [PMID: 35547145 PMCID: PMC9083327 DOI: 10.3389/fmicb.2022.851450] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 01/09/2022] [Accepted: 03/14/2022] [Indexed: 11/13/2022] Open

Chen X, Zhu Z, Zhang W, Wang Y, Wang F, Yang J, Wong KC. Human disease prediction from microbiome data by multiple feature fusion and deep learning. iScience 2022;25:104081. [PMID: 35372808 PMCID: PMC8971930 DOI: 10.1016/j.isci.2022.104081] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/11/2021] [Revised: 09/16/2021] [Accepted: 03/13/2022] [Indexed: 10/29/2022] Open

Host phenotype classification from human microbiome data is mainly driven by the presence of microbial taxa. PLoS Comput Biol 2022;18:e1010066. [PMID: 35446845 PMCID: PMC9064115 DOI: 10.1371/journal.pcbi.1010066] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/13/2021] [Revised: 05/03/2022] [Accepted: 03/29/2022] [Indexed: 12/14/2022] Open

Abstract

Machine learning-based classification approaches are widely used to predict host phenotypes from microbiome data. Classifiers are typically employed by considering operational taxonomic units or relative abundance profiles as input features. Such types of data are intrinsically sparse, which opens the opportunity to make predictions from the presence/absence rather than the relative abundance of microbial taxa. This also poses the question whether it is the presence rather than the abundance of particular taxa to be relevant for discrimination purposes, an aspect that has been so far overlooked in the literature. In this paper, we aim at filling this gap by performing a meta-analysis on 4,128 publicly available metagenomes associated with multiple case-control studies. At species-level taxonomic resolution, we show that it is the presence rather than the relative abundance of specific microbial taxa to be important when building classification models. Such findings are robust to the choice of the classifier and confirmed by statistical tests applied to identifying differentially abundant/present taxa. Results are further confirmed at coarser taxonomic resolutions and validated on 4,026 additional 16S rRNA samples coming from 30 public case-control studies.

The composition of the human microbiome has been linked to a large number of different diseases. In this context, classification methodologies based on machine learning approaches have represented a promising tool for diagnostic purposes from metagenomics data. The link between microbial population composition and host phenotypes has been usually performed by considering taxonomic profiles represented by relative abundances of microbial species. In this study, we show that it is more the presence rather than the relative abundance of microbial taxa to be relevant to maximize classification accuracy. This is accomplished by conducting a meta-analysis on more than 4,000 shotgun metagenomes coming from 25 case-control studies and in which original relative abundance data are degraded to presence/absence profiles. Findings are also extended to 16S rRNA data and advance the research field in building prediction models directly from human microbiome data.
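The step of degrading relative abundance data to presence/absence profiles can be illustrated with a short sketch. This is not the authors' code: the helper name `to_presence_absence`, the `threshold` parameter, and the toy abundance values are all assumptions made for illustration, and the abstract does not state what cutoff, if any, the study applied.

```python
def to_presence_absence(profiles, threshold=0.0):
    """Map each relative abundance to 1 if it exceeds the threshold, else 0.

    `profiles` is a list of samples; each sample is a list of per-taxon
    relative abundances. The default threshold of 0.0 is an assumption.
    """
    return [
        [1 if abundance > threshold else 0 for abundance in sample]
        for sample in profiles
    ]


# Toy data: two samples, three taxa (illustrative values only).
abundances = [
    [0.60, 0.00, 0.40],
    [0.00, 0.25, 0.75],
]
presence = to_presence_absence(abundances)
```

A classifier would then be trained on `presence` instead of `abundances`, so that the two feature encodings can be compared on the same samples.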

Lin YC, Salleb-Aouissi A, Hooven TA. Interpretable prediction of necrotizing enterocolitis from machine learning analysis of premature infant stool microbiota. BMC Bioinformatics 2022;23:104. [PMID: 35337258 PMCID: PMC8953333 DOI: 10.1186/s12859-022-04618-w] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 10/26/2021] [Accepted: 02/23/2022] [Indexed: 12/18/2022] Open

Abstract

Background

Necrotizing enterocolitis (NEC) is a common, potentially catastrophic intestinal disease among very low birthweight premature infants. Affecting up to 15% of neonates born weighing less than 1500 g, NEC causes sudden-onset, progressive intestinal inflammation and necrosis, which can lead to significant bowel loss, multi-organ injury, or death. No unifying cause of NEC has been identified, nor is there any reliable biomarker that indicates an individual patient’s risk of the disease. Without a way to predict NEC in advance, the current medical strategy involves close clinical monitoring in an effort to treat babies with NEC as quickly as possible before irrecoverable intestinal damage occurs. In this report, we describe a novel machine learning application for generating dynamic, individualized NEC risk scores based on intestinal microbiota data, which can be determined from sequencing bacterial DNA from otherwise discarded infant stool. A central insight that differentiates our work from past efforts was the recognition that disease prediction from stool microbiota represents a specific subtype of machine learning problem known as multiple instance learning (MIL).

Results

We used a neural network-based MIL architecture, which we tested on independent datasets from two cohorts encompassing 3595 stool samples from 261 at-risk infants. Our report also introduces a new concept called the “growing bag” analysis, which applies MIL over time, allowing incorporation of past data into each new risk calculation. This approach allowed early, accurate NEC prediction, with a mean sensitivity of 86% and specificity of 90%. True-positive NEC predictions occurred an average of 8 days before disease onset. We also demonstrate that an attention-gated mechanism incorporated into our MIL algorithm permits interpretation of NEC risk, identifying several bacterial taxa that past work has associated with NEC, and potentially pointing the way toward new hypotheses about NEC pathogenesis. Our system is flexible, accepting microbiota data generated from targeted 16S or “shotgun” whole-genome DNA sequencing. It performs well in the setting of common, potentially confounding preterm neonatal clinical events such as perinatal cardiopulmonary depression, antibiotic administration, feeding disruptions, or transitions between breast feeding and formula.
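The "growing bag" idea described above, in which each new risk calculation incorporates all earlier samples, can be sketched as follows. This shows only how the bags might be built; the neural network and attention mechanism are not reproduced, and the function name `growing_bags` and the sample identifiers are hypothetical.

```python
def growing_bags(samples):
    """Return one bag per time point, each holding all samples seen so far.

    `samples` must already be ordered by collection time.
    """
    return [list(samples[: i + 1]) for i in range(len(samples))]


# Hypothetical stool sample identifiers for one infant, ordered by day.
stool_samples = ["day03", "day05", "day09"]
bags = growing_bags(stool_samples)
```

Under this reading, a multiple instance learning model would score each bag in turn, yielding a risk estimate that is updated as every new sample arrives.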

Conclusions

We have developed and validated a robust MIL-based system for NEC prediction from harmlessly collected premature infant stool. While this system was developed for NEC prediction, our MIL approach may also be applicable to other diseases characterized by changes in the human microbiota.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12859-022-04618-w.

Joishy TK, Jha A, Oudah M, Das S, Adak A, Deb D, Khan MR. Human Gut Microbes Associated with Systolic Blood Pressure. Int J Hypertens 2022;2022:2923941. [PMID: 35154822 PMCID: PMC8831042 DOI: 10.1155/2022/2923941] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/28/2021] [Accepted: 12/31/2021] [Indexed: 11/17/2022] Open

Xiang L, Jin X, Liu Y, Ma Y, Jian Z, Wei Z, Li H, Li Y, Wang K. Prediction of the occurrence of calcium oxalate kidney stones based on clinical and gut microbiota characteristics. World J Urol 2021;40:221-227. [PMID: 34427737 PMCID: PMC8813786 DOI: 10.1007/s00345-021-03801-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/03/2021] [Accepted: 08/02/2021] [Indexed: 02/08/2023] Open

Yang F, Zou Q. mAML: an automated machine learning pipeline with a microbiome repository for human disease classification. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2021;2020:5862399. [PMID: 32588040 PMCID: PMC7316531 DOI: 10.1093/database/baaa050] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 02/11/2020] [Revised: 05/27/2020] [Accepted: 06/03/2020] [Indexed: 12/20/2022]

Li H, Ni J, Qing H. Gut Microbiota: Critical Controller and Intervention Target in Brain Aging and Cognitive Impairment. Front Aging Neurosci 2021;13:671142. [PMID: 34248602 PMCID: PMC8267942 DOI: 10.3389/fnagi.2021.671142] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 02/23/2021] [Accepted: 05/07/2021] [Indexed: 12/12/2022] Open

Chen X, Liu L, Zhang W, Yang J, Wong KC. Human host status inference from temporal microbiome changes via recurrent neural networks. Brief Bioinform 2021;22:6307015. [PMID: 34151933 DOI: 10.1093/bib/bbab223] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/24/2021] [Revised: 04/21/2021] [Accepted: 04/21/2021] [Indexed: 01/04/2023] Open

Jasner Y, Belogolovski A, Ben-Itzhak M, Koren O, Louzoun Y. Microbiome Preprocessing Machine Learning Pipeline. Front Immunol 2021;12:677870. [PMID: 34220823 PMCID: PMC8250139 DOI: 10.3389/fimmu.2021.677870] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/08/2021] [Accepted: 05/07/2021] [Indexed: 11/13/2022] Open

Anyaso-Samuel S, Sachdeva A, Guha S, Datta S. Metagenomic Geolocation Prediction Using an Adaptive Ensemble Classifier. Front Genet 2021;12:642282. [PMID: 33959149 PMCID: PMC8093763 DOI: 10.3389/fgene.2021.642282] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/15/2020] [Accepted: 03/18/2021] [Indexed: 11/13/2022] Open

Marcos-Zambrano LJ, Karaduzovic-Hadziabdic K, Loncar Turukalo T, Przymus P, Trajkovik V, Aasmets O, Berland M, Gruca A, Hasic J, Hron K, Klammsteiner T, Kolev M, Lahti L, Lopes MB, Moreno V, Naskinova I, Org E, Paciência I, Papoutsoglou G, Shigdel R, Stres B, Vilne B, Yousef M, Zdravevski E, Tsamardinos I, Carrillo de Santa Pau E, Claesson MJ, Moreno-Indias I, Truu J. Applications of Machine Learning in Human Microbiome Studies: A Review on Feature Selection, Biomarker Identification, Disease Prediction and Treatment. Front Microbiol 2021;12:634511. [PMID: 33737920 PMCID: PMC7962872 DOI: 10.3389/fmicb.2021.634511] [Citation(s) in RCA: 126] [Impact Index Per Article: 42.0] [Received: 11/27/2020] [Accepted: 02/01/2021] [Indexed: 12/19/2022] Open

Abstract

The number of microbiome-related studies has notably increased the availability of data on human microbiome composition and function. These studies provide the essential material to deeply explore host-microbiome associations and their relation to the development and progression of various complex diseases. Improved data-analytical tools are needed to exploit all information from these biological datasets, taking into account the peculiarities of microbiome data, i.e., compositional, heterogeneous and sparse nature of these datasets. The possibility of predicting host-phenotypes based on taxonomy-informed feature selection to establish an association between microbiome and predict disease states is beneficial for personalized medicine. In this regard, machine learning (ML) provides new insights into the development of models that can be used to predict outputs, such as classification and prediction in microbiology, infer host phenotypes to predict diseases and use microbial communities to stratify patients by their characterization of state-specific microbial signatures. Here we review the state-of-the-art ML methods and respective software applied in human microbiome studies, performed as part of the COST Action ML4Microbiome activities. This scoping review focuses on the application of ML in microbiome studies related to association and clinical use for diagnostics, prognostics, and therapeutics. Although the data presented here is more related to the bacterial community, many algorithms could be applied in general, regardless of the feature type. This literature and software review covering this broad topic is aligned with the scoping review methodology. 
The manual identification of data sources has been complemented with: (1) automated publication search through digital libraries of the three major publishers using natural language processing (NLP) Toolkit, and (2) an automated identification of relevant software repositories on GitHub and ranking of the related research papers relying on learning to rank approach.

Affiliation(s)

Laura Judith Marcos-Zambrano Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, Madrid, Spain
Kanita Karaduzovic-Hadziabdic Faculty of Engineering and Natural Sciences, International University of Sarajevo, Sarajevo, Bosnia and Herzegovina
Tatjana Loncar Turukalo Faculty of Technical Sciences, University of Novi Sad, Novi Sad, Serbia
Piotr Przymus Faculty of Mathematics and Computer Science, Nicolaus Copernicus University, Toruń, Poland
Vladimir Trajkovik Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University, Skopje, North Macedonia
Oliver Aasmets Institute of Genomics, Estonian Genome Centre, University of Tartu, Tartu, Estonia; Department of Biotechnology, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
Magali Berland Université Paris-Saclay, INRAE, MGP, Jouy-en-Josas, France
Aleksandra Gruca Department of Computer Networks and Systems, Silesian University of Technology, Gliwice, Poland
Jasminka Hasic University Sarajevo School of Science and Technology, Sarajevo, Bosnia and Herzegovina
Karel Hron Department of Mathematical Analysis and Applications of Mathematics, Palacký University, Olomouc, Czechia
Thomas Klammsteiner Department of Microbiology, University of Innsbruck, Innsbruck, Austria
Mikhail Kolev South West University “Neofit Rilski”, Blagoevgrad, Bulgaria
Leo Lahti Department of Computing, University of Turku, Turku, Finland
Marta B. Lopes NOVA Laboratory for Computer Science and Informatics (NOVA LINCS), FCT, UNL, Caparica, Portugal; Centro de Matemática e Aplicações (CMA), FCT, UNL, Caparica, Portugal
Victor Moreno Oncology Data Analytics Program, Catalan Institute of Oncology (ICO), Barcelona, Spain; Colorectal Cancer Group, Institut de Recerca Biomedica de Bellvitge (IDIBELL), Barcelona, Spain; Consortium for Biomedical Research in Epidemiology and Public Health (CIBERESP), Barcelona, Spain; Department of Clinical Sciences, Faculty of Medicine, University of Barcelona, Barcelona, Spain
Irina Naskinova South West University “Neofit Rilski”, Blagoevgrad, Bulgaria
Elin Org Institute of Genomics, Estonian Genome Centre, University of Tartu, Tartu, Estonia
Inês Paciência EPIUnit – Instituto de Saúde Pública da Universidade do Porto, Porto, Portugal
Georgios Papoutsoglou Department of Computer Science, University of Crete, Heraklion, Greece
Rajesh Shigdel Department of Clinical Science, University of Bergen, Bergen, Norway
Blaz Stres Group for Microbiology and Microbial Biotechnology, Department of Animal Science, University of Ljubljana, Ljubljana, Slovenia
Baiba Vilne Bioinformatics Research Unit, Riga Stradins University, Riga, Latvia
Malik Yousef Department of Information Systems, Zefat Academic College, Zefat, Israel; Galilee Digital Health Research Center (GDH), Zefat Academic College, Zefat, Israel
Eftim Zdravevski Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University, Skopje, North Macedonia
Ioannis Tsamardinos Department of Computer Science, University of Crete, Heraklion, Greece
Enrique Carrillo de Santa Pau Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, Madrid, Spain
Marcus J. Claesson School of Microbiology & APC Microbiome Ireland, University College Cork, Cork, Ireland
Isabel Moreno-Indias Unidad de Gestión Clínica de Endocrinología y Nutrición, Instituto de Investigación Biomédica de Málaga (IBIMA), Hospital Clínico Universitario Virgen de la Victoria, Universidad de Málaga, Málaga, Spain; Centro de Investigación Biomédica en Red de Fisiopatología de la Obesidad y la Nutrición (CIBEROBN), Instituto de Salud Carlos III, Madrid, Spain
Jaak Truu Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia

Ghannam RB, Techtmann SM. Machine learning applications in microbial ecology, human microbiome studies, and environmental monitoring. Comput Struct Biotechnol J 2021;19:1092-1107. [PMID: 33680353 PMCID: PMC7892807 DOI: 10.1016/j.csbj.2021.01.028] [Citation(s) in RCA: 89] [Impact Index Per Article: 29.7] [Received: 10/02/2020] [Revised: 01/16/2021] [Accepted: 01/18/2021] [Indexed: 01/04/2023] Open

Reiman D, Farhat AM, Dai Y. Predicting Host Phenotype Based on Gut Microbiome Using a Convolutional Neural Network Approach. Methods Mol Biol 2021;2190:249-266. [PMID: 32804370 DOI: 10.1007/978-1-0716-0826-5_12] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 06/11/2023]

Gut microbiota and artificial intelligence approaches: A scoping review. HEALTH AND TECHNOLOGY 2020. [DOI: 10.1007/s12553-020-00486-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/19/2022]

Pérez-Cobas AE, Gomez-Valero L, Buchrieser C. Metagenomic approaches in microbial ecology: an update on whole-genome and marker gene sequencing analyses. Microb Genom 2020;6:mgen000409. [PMID: 32706331 PMCID: PMC7641418 DOI: 10.1099/mgen.0.000409] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Received: 12/18/2019] [Accepted: 06/30/2020] [Indexed: 12/23/2022] Open

Xia Y. Correlation and association analyses in microbiome study integrating multiomics in health and disease. PROGRESS IN MOLECULAR BIOLOGY AND TRANSLATIONAL SCIENCE 2020;171:309-491. [PMID: 32475527 DOI: 10.1016/bs.pmbts.2020.04.003] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Indexed: 02/07/2023]

Abstract

Correlation and association analyses are among the most widely used statistical methods in research fields, including microbiome and integrative multiomics studies. Correlation and association have two implications: dependence and co-occurrence. Microbiome data are structured as a phylogenetic tree and have several unique characteristics, including high dimensionality, compositionality, sparsity with excess zeros, and heterogeneity. These unique characteristics cause several statistical issues when analyzing microbiome data and integrating multiomics data, such as large p and small n, dependency, overdispersion, and zero-inflation. In microbiome research, on the one hand, classic correlation and association methods are still applied in real studies and used for the development of new methods; on the other hand, new methods have been developed to target statistical issues arising from unique characteristics of microbiome data. Here, we first provide a comprehensive view of classic and newly developed univariate correlation and association-based methods. We discuss the appropriateness and limitations of using classic methods and demonstrate how the newly developed methods mitigate the issues of microbiome data. Second, we emphasize that concepts of correlation and association analyses have been shifted by introducing network analysis, microbe-metabolite interactions, functional analysis, etc. Third, we introduce multivariate correlation and association-based methods, which are organized by the categories of exploratory, interpretive, and discriminatory analyses and classification methods. Fourth, we focus on the hypothesis testing of univariate and multivariate regression-based association methods, including alpha and beta diversities-based, count-based, and relative abundance (or compositional)-based association analyses. We demonstrate the characteristics and limitations of each approach. Fifth, we introduce two specific microbiome-based methods: phylogenetic tree-based association analysis and testing for survival outcomes. Sixth, we provide an overall view of longitudinal methods in the analysis of microbiome and omics data, which cover standard, static, regression-based time series methods, principal trend analysis, and newly developed univariate overdispersed and zero-inflated as well as multivariate distance/kernel-based longitudinal models. Finally, we comment on current association analysis and future directions of association analysis in microbiome and multiomics studies.

Reiman D, Metwally AA, Sun J, Dai Y. PopPhy-CNN: A Phylogenetic Tree Embedded Architecture for Convolutional Neural Networks to Predict Host Phenotype From Metagenomic Data. IEEE J Biomed Health Inform 2020;24:2993-3001. [PMID: 32396115 DOI: 10.1109/jbhi.2020.2993761] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Indexed: 11/05/2022]

Hooven TA, Lin AYC, Salleb-Aouissi A. Multiple Instance Learning for Predicting Necrotizing Enterocolitis in Premature Infants Using Microbiome Data. PROCEEDINGS OF THE ACM CONFERENCE ON HEALTH, INFERENCE, AND LEARNING 2020;2020:99-109. [PMID: 34318306 PMCID: PMC8313028 DOI: 10.1145/3368555.3384466] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 02/05/2023]

A Systematic Review on Supervised and Unsupervised Machine Learning Algorithms for Data Science. UNSUPERVISED AND SEMI-SUPERVISED LEARNING 2020. [DOI: 10.1007/978-3-030-22475-2_1] [Citation(s) in RCA: 88] [Impact Index Per Article: 22.0] [Indexed: 02/06/2023]

Taxonomy dimension reduction for colorectal cancer prediction. Comput Biol Chem 2019;83:107160. [DOI: 10.1016/j.compbiolchem.2019.107160] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 06/24/2019] [Revised: 11/02/2019] [Accepted: 11/04/2019] [Indexed: 02/01/2023]

LaPierre N, Ju CJT, Zhou G, Wang W. MetaPheno: A critical evaluation of deep learning and machine learning in metagenome-based disease prediction. Methods 2019;166:74-82. [PMID: 30885720 PMCID: PMC6708502 DOI: 10.1016/j.ymeth.2019.03.003] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Received: 12/01/2018] [Revised: 02/14/2019] [Accepted: 03/04/2019] [Indexed: 01/21/2023] Open

Cruz AF, Barka GD, Blum LEB, Tanaka T, Ono N, Kanaya S, Reineke A. Evaluation of microbial communities in peels of Brazilian tropical fruits by amplicon sequence analysis. Braz J Microbiol 2019;50:739-748. [PMID: 31073985 PMCID: PMC6863208 DOI: 10.1007/s42770-019-00088-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 09/26/2018] [Accepted: 03/20/2019] [Indexed: 10/26/2022] Open

Zhou YH, Gallins P. A Review and Tutorial of Machine Learning Methods for Microbiome Host Trait Prediction. Front Genet 2019;10:579. [PMID: 31293616 PMCID: PMC6603228 DOI: 10.3389/fgene.2019.00579] [Citation(s) in RCA: 95] [Impact Index Per Article: 19.0] [Received: 01/13/2019] [Accepted: 06/04/2019] [Indexed: 12/19/2022] Open

Qu K, Guo F, Liu X, Lin Y, Zou Q. Application of Machine Learning in Microbiology. Front Microbiol 2019;10:827. [PMID: 31057526 PMCID: PMC6482238 DOI: 10.3389/fmicb.2019.00827] [Citation(s) in RCA: 95] [Impact Index Per Article: 19.0] [Received: 01/31/2019] [Accepted: 04/01/2019] [Indexed: 02/01/2023] Open