1
Rojas-Velazquez D, Kidwai S, Kraneveld AD, Tonda A, Oberski D, Garssen J, Lopez-Rincon A. Methodology for biomarker discovery with reproducibility in microbiome data using machine learning. BMC Bioinformatics 2024; 25:26. [PMID: 38225565 PMCID: PMC10789030 DOI: 10.1186/s12859-024-05639-3] [Received: 12/03/2023] [Accepted: 01/04/2024] [Indexed: 01/17/2024]
Abstract
BACKGROUND In recent years, human microbiome studies have received increasing attention because the field is considered a potential source of clinical applications. With advances in omics technologies and AI, research on discovering potential biomarkers in the human microbiome using machine learning tools has produced positive outcomes. Despite these promising results, several issues persist in such studies, including datasets with small numbers of samples, inconsistent results, and a lack of uniform processing and methodologies; these and other factors lead to a lack of reproducibility in biomedical research. In this work, we propose a methodology that combines the DADA2 pipeline for 16S rRNA sequence processing with Recursive Ensemble Feature Selection (REFS) across multiple datasets to increase reproducibility and obtain robust and reliable results in biomedical research.
RESULTS Three experiments were performed analyzing microbiome data from patients/cases in Inflammatory Bowel Disease (IBD), Autism Spectrum Disorder (ASD), and Type 2 Diabetes (T2D). In each experiment, we found a biomarker signature in one dataset and applied it to two other datasets as further validation. The effectiveness of the proposed methodology was compared with other feature selection methods, such as K-Best with F-score, and with random selection as a baseline. The Area Under the Curve (AUC) was employed as a measure of diagnostic accuracy and as a metric for comparing the results of the proposed methodology with those of other feature selection methods. Additionally, we used the Matthews Correlation Coefficient (MCC) to evaluate the performance of the methodology and to compare it with other feature selection methods.
CONCLUSIONS We developed a methodology for reproducible biomarker discovery in 16S rRNA microbiome sequence analysis, addressing issues related to data dimensionality, inconsistent results, and validation across independent datasets. The findings from the three experiments, across 9 different datasets, show that the proposed methodology achieved higher accuracy than other feature selection methods. This methodology is a first approach to increasing reproducibility and providing robust and reliable results.
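The two evaluation metrics named in this abstract, AUC and MCC, can be illustrated with a minimal sketch. The function names `mcc` and `auc` and the toy labels and scores are assumptions made for illustration; they are not code or data from the study. AUC is computed here through its pairwise-ranking interpretation: the fraction of case/control pairs in which the case receives the higher score.

```python
from math import sqrt


def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels (1 = case, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison; ties count as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one control")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 3 cases followed by 3 controls.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]                 # hard class predictions
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6]     # continuous classifier scores

print(round(mcc(y_true, y_pred), 3))  # 0.333
print(round(auc(y_true, scores), 3))  # 0.889
```

MCC uses all four cells of the confusion matrix, so it stays informative when cases and controls are imbalanced, which is one common reason to report it alongside AUC.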
Affiliation(s)
- David Rojas-Velazquez
  - Division of Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Faculty of Science, University of Utrecht, Utrecht, The Netherlands
  - Department of Data Science, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands
- Sarah Kidwai
  - Division of Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Faculty of Science, University of Utrecht, Utrecht, The Netherlands
- Aletta D Kraneveld
  - Division of Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Faculty of Science, University of Utrecht, Utrecht, The Netherlands
  - Department of Neuroscience, Faculty of Science, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Alberto Tonda
  - UMR 518 MIA-PS, INRAE, Institut des Systèmes Complexes de Paris, Île-de-France (ISC-PIF) - UAR 3611 CNRS, Université Paris-Saclay, Paris, France
- Daniel Oberski
  - Department of Data Science, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands
- Johan Garssen
  - Division of Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Faculty of Science, University of Utrecht, Utrecht, The Netherlands
  - Global Centre of Excellence Immunology, Danone Nutricia Research, Utrecht, The Netherlands
- Alejandro Lopez-Rincon
  - Division of Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Faculty of Science, University of Utrecht, Utrecht, The Netherlands
  - Department of Data Science, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands
2
Peralta-Marzal LN, Rojas-Velazquez D, Rigters D, Prince N, Garssen J, Kraneveld AD, Perez-Pardo P, Lopez-Rincon A. A robust microbiome signature for autism spectrum disorder across different studies using machine learning. Sci Rep 2024; 14:814. [PMID: 38191575 PMCID: PMC10774349 DOI: 10.1038/s41598-023-50601-7] [Received: 07/04/2023] [Accepted: 12/21/2023] [Indexed: 01/10/2024]
Abstract
Autism spectrum disorder (ASD) is a highly complex neurodevelopmental disorder characterized by deficits in sociability and repetitive behaviour; however, there is great heterogeneity in the other comorbidities that accompany ASD. Recently, the gut microbiome has been proposed as a plausible contributing factor in ASD development, as individuals diagnosed with ASD often suffer from intestinal problems and show a distinct intestinal microbial composition. Nevertheless, gut microbiome studies in ASD rarely agree on the specific bacterial taxa involved in this disorder. Given the potential role of the gut microbiome in ASD pathophysiology, our aim is to investigate whether there is a set of bacterial taxa relevant for ASD classification by using a sibling-controlled dataset. Additionally, we aim to validate these results across two independent cohorts, as several confounding factors, such as lifestyle, influence both ASD and gut microbiome studies. A machine learning approach, recursive ensemble feature selection (REFS), was applied to 16S rRNA gene sequencing data from 117 subjects (60 ASD cases and 57 siblings), identifying 26 bacterial taxa that discriminate ASD cases from controls. The average area under the curve (AUC) of this specific set of bacteria in the sibling-controlled dataset was 81.6%. Moreover, we applied the selected bacterial taxa in a tenfold cross-validation scheme using two independent cohorts (223 samples in total: 125 ASD cases and 98 controls), obtaining average AUCs of 74.8% and 74%, respectively. Analysis of the gut microbiome using REFS identified a set of bacterial taxa that can be used to predict the ASD status of children in three distinct cohorts, with AUC over 80% for the best-performing classifiers. Our results indicate that the gut microbiome has a strong association with ASD and should not be disregarded as a potential target for therapeutic interventions. Furthermore, the proposed approach can be used to identify microbiome signatures in other 16S rRNA gene sequencing datasets.
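The validation step described in this abstract, scoring a fixed set of selected taxa with tenfold cross-validation and averaging the per-fold AUC, can be sketched as follows. Everything here is an illustrative assumption: the data are synthetic, the nearest-centroid scorer is a simple stand-in for the classifiers used in the study (which the abstract does not name), and the helper names (`stratified_folds`, `centroid_scores`, `cross_validated_auc`) are hypothetical.

```python
import random
from statistics import mean


def auc(y_true, scores):
    """AUC as the fraction of case/control pairs ranked correctly (ties = 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(y, k, seed=0):
    """Split sample indices into k folds, spreading each class evenly."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in (0, 1):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def centroid_scores(X_train, y_train, X_test):
    """Stand-in classifier: higher score = closer to the case centroid."""
    def centroid(label):
        rows = [x for x, y in zip(X_train, y_train) if y == label]
        return [mean(col) for col in zip(*rows)]

    def sqdist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    c0, c1 = centroid(0), centroid(1)
    return [sqdist(x, c0) - sqdist(x, c1) for x in X_test]


def cross_validated_auc(X, y, selected, k=10, seed=0):
    """Average AUC over k folds using only the columns listed in `selected`."""
    Xs = [[row[j] for j in selected] for row in X]
    fold_aucs = []
    for fold in stratified_folds(y, k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        scores = centroid_scores([Xs[i] for i in train],
                                 [y[i] for i in train],
                                 [Xs[i] for i in fold])
        fold_aucs.append(auc([y[i] for i in fold], scores))
    return mean(fold_aucs)


# Synthetic data: 20 controls and 20 cases, 5 features; only features 0 and 2
# differ between the groups (a toy analogue of a selected signature).
rng = random.Random(1)
y = [0] * 20 + [1] * 20
X = [[rng.gauss(2.0 if label and j in (0, 2) else 0.0, 1.0) for j in range(5)]
     for label in y]

avg_auc = cross_validated_auc(X, y, selected=[0, 2])
print(f"average tenfold AUC on the selected features: {avg_auc:.3f}")
```

Stratifying the folds keeps at least one case and one control in every held-out fold, which the per-fold AUC needs in order to be defined.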
Affiliation(s)
- Lucia N Peralta-Marzal
  - Division of Pharmacology, Faculty of Science, Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Utrecht, The Netherlands
- David Rojas-Velazquez
  - Division of Pharmacology, Faculty of Science, Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Utrecht, The Netherlands
  - Department of Data Science, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands
- Douwe Rigters
  - Division of Pharmacology, Faculty of Science, Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Utrecht, The Netherlands
- Naika Prince
  - Division of Pharmacology, Faculty of Science, Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Utrecht, The Netherlands
- Johan Garssen
  - Division of Pharmacology, Faculty of Science, Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Utrecht, The Netherlands
  - Global Centre of Excellence Immunology, Danone Nutricia Research, Utrecht, The Netherlands
- Aletta D Kraneveld
  - Division of Pharmacology, Faculty of Science, Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Utrecht, The Netherlands
  - Department of Neuroscience, Faculty of Science, VU University, Amsterdam, The Netherlands
- Paula Perez-Pardo
  - Division of Pharmacology, Faculty of Science, Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Utrecht, The Netherlands
- Alejandro Lopez-Rincon
  - Division of Pharmacology, Faculty of Science, Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Utrecht, The Netherlands
  - Department of Data Science, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands
3
Sardon-Prado O, Diaz-Garcia C, Corcuera-Elosegui P, Korta-Murua J, Valverde-Molina J, Sanchez-Solis M. Severe Asthma and Biological Therapies: Now and the Future. J Clin Med 2023; 12:5846. [PMID: 37762787 PMCID: PMC10532431 DOI: 10.3390/jcm12185846] [Received: 07/18/2023] [Revised: 08/18/2023] [Accepted: 08/25/2023] [Indexed: 09/29/2023]
Abstract
Recognition of phenotypic variability in pediatric asthma allows for a more personalized therapeutic approach. Knowledge of the underlying pathophysiological and molecular mechanisms (endotypes) of corresponding biomarkers and new treatments enables this strategy to progress. Biologic therapies for children with severe asthma are becoming more relevant in this sense. The T2 phenotype is the most prevalent in childhood and adolescence, and non-T2 phenotypes are usually rare. This document aims to review the mechanism of action, efficacy, and potential predictive and monitoring biomarkers of biological drugs, focusing on the pediatric population. The drugs currently available are omalizumab, mepolizumab, benralizumab, dupilumab, and tezepelumab, with some differences in administrative approval prescription criteria between the U.S. Food and Drug Administration (FDA) and the European Medicines Agency (EMA). Previously, we described the characteristics of severe asthma in children and its diagnostic and therapeutic management.
Affiliation(s)
- Olaia Sardon-Prado
  - Division of Paediatric Respiratory Medicine, Donostia University Hospital, 20014 San Sebastián, Spain
  - Department of Pediatrics, University of the Basque Country (UPV/EHU), 20014 Leioa, Spain
- Carolina Diaz-Garcia
  - Paediatric Pulmonology and Allergy Unit, Santa Lucia General University Hospital, 30202 Cartagena, Spain
- Paula Corcuera-Elosegui
  - Division of Paediatric Respiratory Medicine, Donostia University Hospital, 20014 San Sebastián, Spain
- Javier Korta-Murua
  - Division of Paediatric Respiratory Medicine, Donostia University Hospital, 20014 San Sebastián, Spain
- Jose Valverde-Molina
  - Department of Paediatrics, Santa Lucía General University Hospital, 30202 Cartagena, Spain
  - IMIB Biomedical Research Institute, 20120 Murcia, Spain
- Manuel Sanchez-Solis
  - IMIB Biomedical Research Institute, 20120 Murcia, Spain
  - Department of Pediatrics, University of Murcia, 20120 Murcia, Spain
  - Paediatric Allergy and Pulmonology Units, Virgen de la Arrixaca University Children’s Hospital, 20120 Murcia, Spain