1
Solfisburg QS, Baldini F, Baldwin-Hunter B, Austin GI, Lee HH, Park H, Freedberg DE, Lightdale CJ, Korem T, Abrams JA. The Salivary Microbiome and Predicted Metabolite Production Are Associated with Barrett's Esophagus and High-Grade Dysplasia or Adenocarcinoma. Cancer Epidemiol Biomarkers Prev 2024; 33:371-380. [PMID: 38117184 PMCID: PMC10955687 DOI: 10.1158/1055-9965.epi-23-0652] [Received: 06/05/2023] [Revised: 09/05/2023] [Accepted: 12/18/2023] [Indexed: 12/21/2023]
Abstract
BACKGROUND: Esophageal adenocarcinoma (EAC) is rising in incidence, and established risk factors do not explain this trend. Esophageal microbiome alterations have been associated with Barrett's esophagus (BE) and dysplasia and EAC. The oral microbiome is tightly linked to the esophageal microbiome; this study aimed to identify salivary microbiome-related factors associated with BE, dysplasia, and EAC.
METHODS: Clinical data and oral health history were collected from patients with and without BE. The salivary microbiome was characterized, assessing differential relative abundance of taxa by 16S rRNA gene sequencing and associations between microbiome composition and clinical features. Microbiome metabolic modeling was used to predict metabolite production.
RESULTS: A total of 244 patients (125 non-BE and 119 BE) were analyzed. Patients with high-grade dysplasia (HGD)/EAC had a significantly higher prevalence of tooth loss (P = 0.001). There were significant shifts with increased dysbiosis associated with HGD/EAC, independent of tooth loss, with the largest shifts within the genus Streptococcus. Modeling predicted significant shifts in the microbiome metabolic capacities, including increases in L-lactic acid and decreases in butyric acid and L-tryptophan production in HGD/EAC.
CONCLUSIONS: Marked dysbiosis in the salivary microbiome is associated with HGD and EAC, with notable increases within the genus Streptococcus and accompanying changes in predicted metabolite production. Further work is warranted to identify the biological significance of these alterations and to validate metabolic shifts.
IMPACT: There is an association between oral dysbiosis and HGD/EAC. Further work is needed to establish the diagnostic, predictive, and causal potential of this relationship.
Affiliation(s)
- Quinn S Solfisburg
  - Department of Medicine, Boston University School of Medicine, Boston, MA, USA
- Federico Baldini
  - Program for Mathematical Genomics, Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
- George I Austin
  - Program for Mathematical Genomics, Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
- Harry H Lee
  - Program for Mathematical Genomics, Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
- Heekuk Park
  - Department of Medicine, Columbia University Irving Medical Center, New York, NY, USA
  - Microbiome and Pathogen Genomics Collaborative Center, Department of Medicine, Columbia University Irving Medical Center, New York, NY, USA
- Daniel E Freedberg
  - Department of Medicine, Columbia University Irving Medical Center, New York, NY, USA
  - Digestive and Liver Disease Research Center, Columbia University Irving Medical Center, New York, NY, USA
- Charles J Lightdale
  - Department of Medicine, Columbia University Irving Medical Center, New York, NY, USA
- Tal Korem
  - Program for Mathematical Genomics, Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
  - Department of Obstetrics and Gynecology, Columbia University Irving Medical Center, New York, NY, USA
  - CIFAR Azrieli Global Scholars Program, CIFAR, Toronto, Canada
- Julian A Abrams
  - Department of Medicine, Columbia University Irving Medical Center, New York, NY, USA
  - Digestive and Liver Disease Research Center, Columbia University Irving Medical Center, New York, NY, USA
  - Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, NY, USA
2
Austin GI, Kav AB, Park H, Biermann J, Uhlemann AC, Korem T. Processing-bias correction with DEBIAS-M improves cross-study generalization of microbiome-based prediction models. bioRxiv 2024:2024.02.09.579716. [PMID: 38405914 PMCID: PMC10888995 DOI: 10.1101/2024.02.09.579716] [Indexed: 02/27/2024]
Abstract
Every step in common microbiome profiling protocols has variable efficiency for each microbe. For example, different DNA extraction kits may have different efficiency for Gram-positive and -negative bacteria. These variable efficiencies, combined with technical variation, create strong processing biases, which impede the identification of signals that are reproducible across studies and the development of generalizable and biologically interpretable prediction models. "Batch-correction" methods have been used to alleviate these issues computationally with some success. However, many make strong parametric assumptions which do not necessarily apply to microbiome data or processing biases, or require the use of an outcome variable, which risks overfitting. Lastly and importantly, existing transformations used to correct microbiome data are largely non-interpretable, and could, for example, introduce values to features that were initially mostly zeros. Altogether, processing bias currently compromises our ability to glean robust and generalizable biological insights from microbiome data. Here, we present DEBIAS-M (Domain adaptation with phenotype Estimation and Batch Integration Across Studies of the Microbiome), an interpretable framework for inference and correction of processing bias, which facilitates domain adaptation in microbiome studies. DEBIAS-M learns bias-correction factors for each microbe in each batch that simultaneously minimize batch effects and maximize cross-study associations with phenotypes. Using benchmarks of HIV and colorectal cancer classification from gut microbiome data, and cervical neoplasia prediction from cervical microbiome data, we demonstrate that DEBIAS-M outperforms batch-correction methods commonly used in the field. Notably, we show that the inferred bias-correction factors are stable, interpretable, and strongly associated with specific experimental protocols. 
Overall, we show that DEBIAS-M allows for better modeling of microbiome data and identification of interpretable signals that are reproducible across studies.
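The idea of per-microbe, per-batch correction factors can be illustrated with a small sketch. It is not the DEBIAS-M implementation: it assumes the correction is multiplicative (the abstract does not say so), it takes the factors as given rather than learning them, and the function name, batch labels, and factor values are all hypothetical.

```python
def apply_bias_correction(samples, batches, factors):
    """Rescale each taxon by its batch-specific factor, then renormalize.

    samples: list of relative-abundance vectors (one list of floats per sample)
    batches: batch label for each sample
    factors: hypothetical mapping of batch label -> per-taxon multiplicative factors
    """
    corrected = []
    for abundances, batch in zip(samples, batches):
        scaled = [a * f for a, f in zip(abundances, factors[batch])]
        total = sum(scaled)
        # Renormalize so the corrected sample is again a relative-abundance profile.
        corrected.append([s / total for s in scaled] if total > 0 else scaled)
    return corrected


# Toy data: two samples from two batches, three taxa (illustrative values only).
samples = [[0.5, 0.5, 0.0], [0.2, 0.8, 0.0]]
batches = ["kit_A", "kit_B"]
# Assumed factors: kit_B is taken to under-recover the first taxon four-fold.
factors = {"kit_A": [1.0, 1.0, 1.0], "kit_B": [4.0, 1.0, 1.0]}

corrected = apply_bias_correction(samples, batches, factors)
```

Under this assumption a taxon that is absent from a sample stays absent after correction, since a zero multiplied by any factor is still zero.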
Affiliation(s)
- George I. Austin
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
  - Program for Mathematical Genomics, Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
- Aya Brown Kav
  - Program for Mathematical Genomics, Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
- Heekuk Park
  - Division of Infectious Diseases, Columbia University Irving Medical Center, New York, NY, USA
- Jana Biermann
  - Program for Mathematical Genomics, Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
  - Department of Medicine, Division of Hematology/Oncology, Columbia University Irving Medical Center, New York, NY, USA
  - Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, NY, USA
- Anne-Catrin Uhlemann
  - Division of Infectious Diseases, Columbia University Irving Medical Center, New York, NY, USA
- Tal Korem
  - Program for Mathematical Genomics, Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
  - Department of Obstetrics and Gynecology, Columbia University Irving Medical Center, New York, NY, USA
3
Austin GI, Park H, Meydan Y, Seeram D, Sezin T, Lou YC, Firek BA, Morowitz MJ, Banfield JF, Christiano AM, Pe'er I, Uhlemann AC, Shenhav L, Korem T. Contamination source modeling with SCRuB improves cancer phenotype prediction from microbiome data. Nat Biotechnol 2023; 41:1820-1828. [PMID: 36928429 PMCID: PMC10504420 DOI: 10.1038/s41587-023-01696-w] [Received: 05/17/2022] [Accepted: 01/23/2023] [Indexed: 03/18/2023]
Abstract
Sequencing-based approaches for the analysis of microbial communities are susceptible to contamination, which could mask biological signals or generate artifactual ones. Methods for in silico decontamination using controls are routinely used, but do not make optimal use of information shared across samples and cannot handle taxa that only partially originate in contamination or leakage of biological material into controls. Here we present Source tracking for Contamination Removal in microBiomes (SCRuB), a probabilistic in silico decontamination method that incorporates shared information across multiple samples and controls to precisely identify and remove contamination. We validate the accuracy of SCRuB in multiple data-driven simulations and experiments, including induced contamination, and demonstrate that it outperforms state-of-the-art methods by an average of 15-20 times. We showcase the robustness of SCRuB across multiple ecosystems, data types and sequencing depths. Demonstrating its applicability to microbiome research, SCRuB facilitates improved predictions of host phenotypes, most notably the prediction of treatment response in melanoma patients using decontaminated tumor microbiome data.
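The notion of a taxon that only partially originates in contamination can be made concrete with a toy mixture model. The sketch below is not the SCRuB algorithm, which infers its parameters probabilistically across samples and controls; it assumes an observed profile is a mix of a true profile and a contaminant profile, takes the contaminant profile as the mean of the controls, and treats the contamination fraction as known. All names and numbers are hypothetical.

```python
def mean_profile(controls):
    """Average the control profiles taxon by taxon to estimate the contaminant."""
    n = len(controls)
    return [sum(column) / n for column in zip(*controls)]


def remove_contamination(observed, contaminant, fraction):
    """Subtract the assumed contaminant share from a relative-abundance profile.

    Assumes observed = (1 - fraction) * true + fraction * contaminant,
    with `fraction` supplied by the caller rather than estimated.
    """
    if not 0 <= fraction < 1:
        raise ValueError("fraction must be in [0, 1)")
    # Clip at zero: a taxon cannot have negative abundance after subtraction.
    residual = [max(o - fraction * c, 0.0) for o, c in zip(observed, contaminant)]
    total = sum(residual)
    return [r / total for r in residual] if total > 0 else residual


# Toy data: two negative controls and one sample over three taxa.
controls = [[0.0, 0.1, 0.9], [0.0, 0.3, 0.7]]
contaminant = mean_profile(controls)
observed = [0.6, 0.2, 0.2]

cleaned = remove_contamination(observed, contaminant, fraction=0.25)
```

In this toy case the second taxon is only reduced, not removed, because only part of its observed abundance is attributed to contamination, while the third taxon is attributed to contamination entirely.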
Affiliation(s)
- George I Austin
  - Department of Computer Science, Columbia University, New York, NY, USA
  - Program for Mathematical Genomics, Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
- Heekuk Park
  - Division of Infectious Diseases, Columbia University Irving Medical Center, New York, NY, USA
- Yoli Meydan
  - Program for Mathematical Genomics, Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
- Dwayne Seeram
  - Division of Infectious Diseases, Columbia University Irving Medical Center, New York, NY, USA
- Tanya Sezin
  - Department of Dermatology, Columbia University Irving Medical Center, New York, NY, USA
- Yue Clare Lou
  - Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
- Brian A Firek
  - Department of Surgery, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Michael J Morowitz
  - Department of Surgery, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Jillian F Banfield
  - Department of Earth and Planetary Science, University of California, Berkeley, CA, USA
  - Department of Environmental Science, Policy, and Management, University of California, Berkeley, CA, USA
  - Innovative Genomics Institute, University of California, Berkeley, CA, USA
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
- Angela M Christiano
  - Department of Dermatology, Columbia University Irving Medical Center, New York, NY, USA
  - Department of Genetics and Development, Columbia University Irving Medical Center, New York, NY, USA
- Itsik Pe'er
  - Department of Computer Science, Columbia University, New York, NY, USA
  - Program for Mathematical Genomics, Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
  - Data Science Institute, Columbia University, New York, NY, USA
- Anne-Catrin Uhlemann
  - Division of Infectious Diseases, Columbia University Irving Medical Center, New York, NY, USA
- Liat Shenhav
  - Center for Studies in Physics and Biology, Rockefeller University, New York, NY, USA
- Tal Korem
  - Program for Mathematical Genomics, Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
  - Department of Obstetrics and Gynecology, Columbia University Irving Medical Center, New York, NY, USA
  - CIFAR Azrieli Global Scholars Program, CIFAR, Toronto, Canada