1
Little A, Zhao N, Mikhaylova A, Zhang A, Ling W, Thibord F, Johnson AD, Raffield LM, Curran JE, Blangero J, O'Connell JR, Xu H, Rotter JI, Rich SS, Rice KM, Chen MH, Reiner A, Kooperberg C, Vu T, Hou L, Fornage M, Loos RJF, Kenny E, Mathias R, Becker L, Smith AV, Boerwinkle E, Yu B, Thornton T, Wu MC. General Kernel Machine Methods for Multi-Omics Integration and Genome-Wide Association Testing With Related Individuals. Genet Epidemiol 2025; 49:e22610. [PMID: 39812506 DOI: 10.1002/gepi.22610] [Received: 01/23/2024] [Revised: 09/18/2024] [Accepted: 12/17/2024] [Indexed: 01/16/2025]
Abstract
Integrating multi-omics data may help researchers understand the genetic underpinnings of complex traits and diseases. However, how best to integrate multi-omics data and use them to address pressing scientific questions remains a challenge. One important and topical problem is how to assess the aggregate effect of multiple genomic data types (e.g., genotypes and gene expression levels) on a phenotype, particularly while accommodating routine issues, such as having related subjects' data in analyses. In this paper, we extend an existing composite kernel machine regression model to integrate two multi-omics data types while accommodating general correlation structures among outcomes. Because they build on the kernel machine regression framework, our methods allow for the integration of high-dimensional omics data with small, nonlinear, and interactive effects, and accommodate general study designs. Here, we focus on scientific questions that aim to assess the association between a functional grouping (such as a gene or a pathway) and a quantitative trait of interest. We use kernel machine regression to integrate the two multi-omics data types, as they may relate to the trait, and perform a global test of association. We demonstrate the advantage of this approach over single data type association tests via simulation. Finally, we apply this method to a large, multi-ethnic data set to investigate how predicted gene expression and rare genetic variation may be related to two platelet traits.
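As a rough illustration of the idea in this abstract, the sketch below combines two kernels into a composite kernel and computes a variance-component score statistic while allowing a general outcome covariance. It is not the authors' implementation: the trace normalisation, the weight `rho`, the function names, and the toy data are all assumptions made for illustration, and the p-value calculation is omitted.

```python
import numpy as np

def linear_kernel(X):
    """Linear kernel (n x n similarity matrix) from an n x p feature matrix."""
    X = np.asarray(X, dtype=float)
    return X @ X.T

def composite_kernel(K1, K2, rho):
    """Weighted sum of two trace-normalised kernels; rho in [0, 1] (assumed form)."""
    K1 = K1 / np.trace(K1)
    K2 = K2 / np.trace(K2)
    return rho * K1 + (1.0 - rho) * K2

def score_statistic(y, covariates, V, K):
    """Score-type statistic Q = r' K r with r = V^{-1} (y - covariates @ beta_hat).

    V is the outcome covariance under the null (for example, built from a
    kinship matrix for related individuals); beta_hat is the GLS estimate.
    """
    Vinv = np.linalg.inv(V)
    A = covariates.T @ Vinv
    beta_hat = np.linalg.solve(A @ covariates, A @ y)
    r = Vinv @ (y - covariates @ beta_hat)
    return float(r @ K @ r)

# Toy data (hypothetical): 8 subjects, 5 variants, 2 expression features.
rng = np.random.default_rng(1)
n = 8
G = rng.integers(0, 3, size=(n, 5))   # genotype dosages 0/1/2
E = rng.normal(size=(n, 2))           # expression values
covariates = np.ones((n, 1))          # intercept only
V = np.eye(n)                         # independent subjects in this toy case
y = rng.normal(size=n)

K = composite_kernel(linear_kernel(G), linear_kernel(E), rho=0.5)
Q = score_statistic(y, covariates, V, K)
```

Replacing `V` with a covariance that includes a kinship component is what would accommodate related individuals in this sketch; the choice of `rho` and the null distribution of `Q` are left open here.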
Grants
- U.S. Department of Health and Human Services, National Institute on Minority Health and Health Disparities, National Institutes of Health, National Human Genome Research Institute, National Center for Research Resources, COPD Foundation, National Heart, Lung, and Blood Institute, National Science Foundation, National Institute on Aging, and National Institute of Neurological Disorders and Stroke.
Affiliation(s)
- Amarise Little
- Department of Biostatistics, University of Washington, Seattle, Washington, USA
- Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, USA
| | - Ni Zhao
- Department of Biostatistics, Johns Hopkins University, Baltimore, Maryland, USA
| | - Anna Mikhaylova
- Department of Biostatistics, University of Washington, Seattle, Washington, USA
| | - Angela Zhang
- Department of Biostatistics, University of Washington, Seattle, Washington, USA
| | - Wodan Ling
- Department of Population Health Sciences, Division of Biostatistics, Weill Cornell Medicine, New York, New York, USA
| | - Florian Thibord
- National Heart, Lung, and Blood Institute, Boston University's Framingham Heart Study, Framingham, Massachusetts, USA
- National Heart, Lung and Blood Institute, Population Sciences Branch, Framingham, Massachusetts, USA
| | - Andrew D Johnson
- National Heart, Lung, and Blood Institute, Boston University's Framingham Heart Study, Framingham, Massachusetts, USA
- National Heart, Lung and Blood Institute, Population Sciences Branch, Framingham, Massachusetts, USA
| | - Laura M Raffield
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Joanne E Curran
- Department of Human Genetics, University of Texas Rio Grande Valley School of Medicine, Brownsville, Texas, USA
- South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, Texas, USA
| | - John Blangero
- Department of Human Genetics, University of Texas Rio Grande Valley School of Medicine, Brownsville, Texas, USA
- South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, Texas, USA
| | | | - Huichun Xu
- Department of Medicine, University of Maryland, Baltimore, Maryland, USA
| | - Jerome I Rotter
- Department of Pediatrics, The Institute for Translational Genomics and Population Sciences, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, California, USA
| | - Stephen S Rich
- Center for Public Health Genomics, University of Virginia, Charlottesville, Virginia, USA
| | - Kenneth M Rice
- Department of Biostatistics, University of Washington, Seattle, Washington, USA
| | - Ming-Huei Chen
- National Heart, Lung, and Blood Institute, Boston University's Framingham Heart Study, Framingham, Massachusetts, USA
- National Heart, Lung and Blood Institute, Population Sciences Branch, Framingham, Massachusetts, USA
| | - Alexander Reiner
- Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, USA
| | - Charles Kooperberg
- Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, USA
| | - Thao Vu
- Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
| | - Lifang Hou
- Department of Preventive Medicine, Northwestern Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
| | - Myriam Fornage
- Brown Foundation Institute for Molecular Medicine, McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, Texas, USA
| | - Ruth J F Loos
- The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Eimear Kenny
- The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- The Center for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Pamela Sklar Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Rasika Mathias
- Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
| | - Lewis Becker
- Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
| | - Albert V Smith
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, Michigan, USA
| | - Eric Boerwinkle
- Department of Epidemiology, School of Public Health, University of Texas Health Science Center at Houston, Houston, Texas, USA
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, USA
| | - Bing Yu
- Department of Epidemiology, School of Public Health, University of Texas Health Science Center at Houston, Houston, Texas, USA
| | - Timothy Thornton
- Department of Biostatistics, University of Washington, Seattle, Washington, USA
| | - Michael C Wu
- Department of Biostatistics, University of Washington, Seattle, Washington, USA
- Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, USA
2
He M, Zhao N. A Mixed Effect Similarity Matrix Regression Model (SMRmix) for Integrating Multiple Microbiome Datasets at Community Level. bioRxiv 2024:2024.03.10.584315. [PMID: 38559012 PMCID: PMC10979838 DOI: 10.1101/2024.03.10.584315] [Indexed: 04/04/2024]
Abstract
BACKGROUND Recent studies have highlighted the importance of the human microbiota in health and disease. However, in many areas of research, individual microbiome studies often yield inconsistent results because of limited sample sizes and heterogeneity in study populations and experimental procedures. This inconsistency underscores the need for integrative analysis of multiple microbiome datasets. Despite this critical need, statistical methods that incorporate multiple microbiome datasets and account for study heterogeneity are not available in the literature. METHODS In this paper, we develop a mixed effect similarity matrix regression (SMRmix) approach for identifying community-level microbiome shifts between outcomes. SMRmix is closely connected to the microbiome kernel association test, which is one of the most popular approaches for this task but is applicable only to a single study. SMRmix enables researchers to consolidate findings from diverse microbiome studies. RESULTS Via extensive simulations, we show that SMRmix has well-controlled type I error and higher power than some potential competitors. We applied SMRmix to two real-world datasets. The first, from the HIV-reanalysis consortium, integrated data from 17 studies on gut dysbiosis in HIV. Our analysis confirmed consistent associations between the gut microbiome and HIV infection as well as MSM (men who have sex with men) status, demonstrating greater power than competing methods. The second dataset involved 11 studies on the gut microbiome in colorectal cancer; analysis with SMRmix confirmed significant dysbiosis in affected individuals compared to healthy controls. CONCLUSION SMRmix enables the integration of multiple studies while effectively managing study heterogeneity, and provides a powerful tool for uncovering consistent associations between diseases and community-level microbiome data.
3
Tsay JCJ, Darawshy F, Wang C, Kwok B, Wong KK, Wu BG, Sulaiman I, Zhou H, Isaacs B, Kugler MC, Sanchez E, Bain A, Li Y, Schluger R, Lukovnikova A, Collazo D, Kyeremateng Y, Pillai R, Chang M, Li Q, Vanguri RS, Becker AS, Moore WH, Thurston G, Gordon T, Moreira AL, Goparaju CM, Sterman DH, Tsirigos A, Li H, Segal LN, Pass HI. Lung Microbial and Host Genomic Signatures as Predictors of Prognosis in Early-Stage Adenocarcinoma. Cancer Epidemiol Biomarkers Prev 2024; 33:1433-1444. [PMID: 39225784 PMCID: PMC11530314 DOI: 10.1158/1055-9965.epi-24-0661] [Received: 05/06/2024] [Revised: 07/15/2024] [Accepted: 08/28/2024] [Indexed: 09/04/2024] [Open access]
Abstract
BACKGROUND Risk of early-stage lung adenocarcinoma recurrence after surgical resection is significant, and the postrecurrence median survival is approximately 2 years. Currently, there are no commercially available biomarkers that predict recurrence. In this study, we investigated whether microbial and host genomic signatures in the lung can predict recurrence. METHODS In 91 patients with early-stage (stage IA/IB) lung adenocarcinoma with extensive follow-up, we used 16S rRNA gene sequencing and host RNA sequencing to map the microbial and host transcriptomic landscape in tumor and adjacent unaffected lung samples. RESULTS Of 91 subjects, 23 had tumor recurrence over a 5-year period. In tumor samples, lung adenocarcinoma recurrence was associated with enrichment in Dialister and Prevotella, whereas in unaffected lung samples, recurrence was associated with enrichment in Sphingomonas and Alloiococcus. The associations of microbial and host genomic signatures with lung adenocarcinoma recurrence were stronger in adjacent unaffected lung samples than in the primary tumor. Among microbial-host features in the unaffected lung samples associated with recurrence, enrichment in Stenotrophomonas geniculata and Chryseobacterium was positively correlated with upregulation of IL2, IL3, IL17, EGFR, and HIF1 signaling pathways in the host transcriptome. In tumor samples, enrichment in Veillonellaceae (Dialister), Ruminococcaceae, Haemophilus influenzae, and Neisseria was positively correlated with upregulation of IL1, IL6, IL17, IFN, and tryptophan metabolism pathways. CONCLUSIONS Overall, modeling suggested that a combined microbial/transcriptome approach using unaffected lung samples had the best biomarker performance (AUC = 0.83). IMPACT This study suggests that lung adenocarcinoma recurrence is associated with distinct pathophysiologic mechanisms of microbial-host interactions in the unaffected lung rather than those present in the resected tumor.
Affiliation(s)
- Jun-Chieh J. Tsay
- Division of Pulmonary, Critical Care and Sleep Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Department of Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Division of Pulmonary and Critical Care Medicine, VA New York Harbor Healthcare System, New York, NY
| | - Fares Darawshy
- Division of Pulmonary, Critical Care and Sleep Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- The Institute of Pulmonary Medicine, Hadassah Medical Center, Jerusalem, Israel
- Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
| | - Chan Wang
- Department of Population Health, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
| | - Benjamin Kwok
- Division of Pulmonary, Critical Care and Sleep Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Department of Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
| | - Kendrew K. Wong
- Division of Pulmonary, Critical Care and Sleep Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Department of Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
| | - Benjamin G. Wu
- Division of Pulmonary, Critical Care and Sleep Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Department of Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Division of Pulmonary and Critical Care Medicine, VA New York Harbor Healthcare System, New York, NY
| | - Imran Sulaiman
- Division of Pulmonary, Critical Care and Sleep Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Department of Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Department of Respiratory Medicine, Royal College of Surgeons in Ireland, Dublin, Ireland
- Department of Respiratory Medicine, Beaumont Hospital, Dublin, Ireland
| | - Hua Zhou
- Applied Bioinformatics Laboratories, NYU Grossman School of Medicine New York, USA
| | - Bradley Isaacs
- Division of Pulmonary, Critical Care and Sleep Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
| | - Matthias C. Kugler
- Division of Pulmonary, Critical Care and Sleep Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Department of Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
| | - Elizabeth Sanchez
- Division of Pulmonary, Critical Care and Sleep Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Department of Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
| | - Alexander Bain
- Division of Pulmonary, Critical Care and Sleep Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Department of Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
| | - Yonghua Li
- Division of Pulmonary, Critical Care and Sleep Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Department of Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
| | - Rosemary Schluger
- Division of Pulmonary, Critical Care and Sleep Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Department of Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
| | - Alena Lukovnikova
- Division of Pulmonary, Critical Care and Sleep Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Department of Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
| | - Destiny Collazo
- Division of Pulmonary, Critical Care and Sleep Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Department of Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
| | - Yaa Kyeremateng
- Division of Pulmonary, Critical Care and Sleep Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Department of Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
| | - Ray Pillai
- Division of Pulmonary, Critical Care and Sleep Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Department of Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
| | - Miao Chang
- Division of Pulmonary, Critical Care and Sleep Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Department of Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
| | - Qingsheng Li
- Division of Pulmonary, Critical Care and Sleep Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Department of Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
| | - Rami S. Vanguri
- Division of Precision Medicine, Department of Medicine, NYU Grossman School of Medicine, New York, USA
| | - Anton S. Becker
- Department of Radiology, NYU Grossman School of Medicine New York, USA
| | - William H. Moore
- Department of Radiology, NYU Grossman School of Medicine New York, USA
| | - George Thurston
- Department of Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Department of Population Health, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Division of Environmental Medicine, Department of Medicine, NYU Grossman School of Medicine, New York, USA
| | - Terry Gordon
- Department of Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Division of Environmental Medicine, Department of Medicine, NYU Grossman School of Medicine, New York, USA
| | - Andre L. Moreira
- Department of Pathology, NYU Grossman School of Medicine New York, USA
| | - Chandra M. Goparaju
- Department of Cardiothoracic Surgery, NYU Grossman School of Medicine, New York, USA
| | - Daniel H. Sterman
- Division of Pulmonary, Critical Care and Sleep Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Department of Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Laura and Isaac Perlmutter Cancer Center, NYU Grossman School of Medicine, New York, NY
| | - Aristotelis Tsirigos
- Department of Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Applied Bioinformatics Laboratories, NYU Grossman School of Medicine New York, USA
- Division of Precision Medicine, Department of Medicine, NYU Grossman School of Medicine, New York, USA
| | - Huilin Li
- Department of Population Health, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
| | - Leopoldo N. Segal
- Division of Pulmonary, Critical Care and Sleep Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Department of Medicine, NYU Grossman School of Medicine, NYU Langone Health, New York, NY
- Laura and Isaac Perlmutter Cancer Center, NYU Grossman School of Medicine, New York, NY
| | - Harvey I. Pass
- Department of Cardiothoracic Surgery, NYU Grossman School of Medicine, New York, USA
- Laura and Isaac Perlmutter Cancer Center, NYU Grossman School of Medicine, New York, NY
4
Li Y, Lee T, Marin K, Hua X, Srinivasan S, Fredricks DN, Lee JR, Ling W. SurvBal: compositional microbiome balances for survival outcomes. Bioinformatics 2024; 40:btae612. [PMID: 39404767 PMCID: PMC11639162 DOI: 10.1093/bioinformatics/btae612] [Received: 01/03/2024] [Revised: 09/16/2024] [Accepted: 10/13/2024] [Indexed: 10/30/2024] [Open access]
Abstract
SUMMARY Identification of balances of bacterial taxa in relation to continuous and dichotomous outcomes is an increasingly frequent analytic objective in microbiome profiling experiments. SurvBal enables the selection of balances in relation to censored survival or time-to-event outcomes, which are of considerable interest in many biomedical studies. The package includes the most commonly used survival models, namely the Cox proportional hazards model and parametric survival models, which are used in combination with step-wise selection procedures to identify the optimal associated microbiome balance, i.e., the ratio of the geometric means of the relative abundances of two groups of taxa. AVAILABILITY AND IMPLEMENTATION The SurvBal R package and Shiny app can be accessed at https://github.com/yinglia/SurvBal and https://yinglistats.shinyapps.io/shinyapp-survbal/.
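The balance defined in this abstract (a ratio of geometric means of two groups of taxa) can be computed as in the minimal Python sketch below. This is not the SurvBal R package's code: the function name `balance`, the optional pseudocount, and the toy abundances are assumptions for illustration, and any normalising constant that a particular balance definition applies is omitted.

```python
import numpy as np

def balance(rel_abund, group_a, group_b, pseudo=0.0):
    """Ratio of geometric means of two taxa groups, per sample.

    rel_abund : (n_samples, n_taxa) array of relative abundances
    group_a, group_b : column indices of the two taxa groups
    pseudo : optional pseudocount added before taking logs (assumed convention)
    """
    x = np.asarray(rel_abund, dtype=float) + pseudo
    log_x = np.log(x)
    # The log of a geometric mean is the arithmetic mean of the logs.
    log_ratio = log_x[:, group_a].mean(axis=1) - log_x[:, group_b].mean(axis=1)
    return np.exp(log_ratio)

# Two toy samples with four taxa of interest (hypothetical values).
samples = np.array([[0.10, 0.40, 0.05, 0.20],
                    [0.25, 0.25, 0.25, 0.25]])
ratios = balance(samples, group_a=[0, 1], group_b=[2, 3])
```

For the first sample the geometric means are 0.2 and 0.1, so the ratio is 2; for the second sample both groups have the same geometric mean and the ratio is 1. In a survival setting, the log of this ratio would typically enter the model as a covariate.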
Affiliation(s)
- Ying Li
- Division of Biostatistics, Department of Population Health Sciences, Weill Cornell Medicine, New York, NY 10065, United States
| | - Teresa Lee
- Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA 98109, United States
| | - Kai Marin
- Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA 98109, United States
| | - Xing Hua
- Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA 98109, United States
| | - Sujatha Srinivasan
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA 98109, United States
| | - David N Fredricks
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA 98109, United States
- Department of Medicine, University of Washington, Seattle, WA 98195, United States
| | - John R Lee
- Division of Nephrology and Hypertension, Department of Medicine, Weill Cornell Medicine, New York, NY 10065, United States
- Department of Transplantation Medicine, New York Presbyterian Hospital–Weill Cornell Medical Center, New York, NY 10065, United States
| | - Wodan Ling
- Division of Biostatistics, Department of Population Health Sciences, Weill Cornell Medicine, New York, NY 10065, United States
5
Jang H, Koh H. A unified web cloud computing platform MiMedSurv for microbiome causal mediation analysis with survival responses. Sci Rep 2024; 14:20650. [PMID: 39232070 PMCID: PMC11374894 DOI: 10.1038/s41598-024-71852-y] [Received: 03/01/2024] [Accepted: 08/31/2024] [Indexed: 09/06/2024] [Open access]
Abstract
In human microbiome studies, mediation analysis has recently been spotlighted as a practical and powerful analytic tool to survey the causal roles of the microbiome as a mediator that explains the observed relationships between a medical treatment/environmental exposure and a human disease. We also note that, in clinical research, investigators often trace disease progression sequentially in time; as such, time-to-event (e.g., time-to-disease, time-to-cure) responses, known as survival responses, are prevalent as a surrogate variable for human health or disease. In this paper, we introduce a web cloud computing platform, named microbiome mediation analysis with survival responses (MiMedSurv), for comprehensive microbiome mediation analysis with survival responses in a user-friendly web environment. MiMedSurv extends our prior web cloud computing platform, microbiome mediation analysis (MiMed), to survival responses. Its two main distinguishing features are as follows. First, MiMedSurv conducts baseline exploratory non-mediational survival analysis, not involving the microbiome, to survey the disparity in survival response between medical treatments/environmental exposures. Then, MiMedSurv identifies the mediating roles of the microbiome in various aspects: (i) as a microbial ecosystem using ecological indices (e.g., alpha and beta diversity indices) and (ii) as individual microbial taxa in various hierarchies (e.g., phyla, classes, orders, families, genera, species). To illustrate its use, we survey the mediating roles of the gut microbiome between antibiotic treatment and time-to-type 1 diabetes. MiMedSurv is freely available on our web server (http://mimedsurv.micloud.kr).
Affiliation(s)
- Hyojung Jang
- Department of Applied Mathematics and Statistics, The State University of New York, Korea, Incheon, South Korea
| | - Hyunwook Koh
- Department of Applied Mathematics and Statistics, The State University of New York, Korea, Incheon, South Korea
6
Koh H. A general kernel machine regression framework using principal component analysis for jointly testing main and interaction effects: Applications to human microbiome studies. NAR Genom Bioinform 2024; 6:lqae148. [PMID: 39534501 PMCID: PMC11555437 DOI: 10.1093/nargab/lqae148] [Received: 07/31/2024] [Revised: 09/27/2024] [Accepted: 10/18/2024] [Indexed: 11/16/2024] [Open access]
Abstract
The effect of a treatment on a health or disease response can be modified by genetic or microbial variants. This is a matter of interaction effects between genetic or microbial variants and a treatment. To powerfully discover genetic or microbial biomarkers, it is crucial to incorporate such interaction effects in addition to the main effects. However, in the context of kernel machine regression analysis, existing methods cannot be used when a kernel is available but its underlying real variants are unknown. To address this limitation, I introduce a general kernel machine regression framework using principal component analysis for jointly testing main and interaction effects. It begins by extracting principal components from an input kernel through the singular value decomposition. Then, it employs the principal components as surrogate variants to construct three endogenous kernels for the main effects, the interaction effects, and both of them, respectively. Hence, it works with a kernel as an input without knowing its underlying real variants, and robustly detects the main effects, the interaction effects, or both. I also introduce its omnibus testing extension to multiple input kernels, named OmniK. I demonstrate its use for human microbiome studies.
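The first two steps described in this abstract (extracting principal components from an input kernel by singular value decomposition, then using them as surrogate variants) can be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: the function names, the truncation tolerance, and in particular the way the interaction and combined kernels are formed (linear kernels on treatment-scaled components, and their sum) are hypothetical choices.

```python
import numpy as np

def surrogate_components(K, tol=1e-10):
    """Principal components Z of a symmetric PSD kernel K, so that Z @ Z.T ~= K."""
    U, s, _ = np.linalg.svd(K)
    keep = s > tol * s.max()          # drop numerically zero components
    return U[:, keep] * np.sqrt(s[keep])

def endogenous_kernels(K, treatment):
    """Main-effect, interaction, and combined kernels built from surrogates.

    The linear-kernel construction here is an assumption for illustration.
    """
    Z = surrogate_components(K)
    t = np.asarray(treatment, dtype=float).reshape(-1, 1)
    ZI = Z * t                        # surrogate-by-treatment interaction terms
    K_main = Z @ Z.T
    K_int = ZI @ ZI.T
    K_both = K_main + K_int
    return K_main, K_int, K_both

# Toy input kernel (hypothetical): only K and the treatment are passed on.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
K = X @ X.T
treatment = np.array([0, 1, 0, 1, 1, 0])
K_main, K_int, K_both = endogenous_kernels(K, treatment)
```

Because the surrogates satisfy `Z @ Z.T ≈ K`, the main-effect kernel reproduces the input kernel, and the interaction kernel equals `K` scaled entrywise by the product of the two subjects' treatment values. This is why the construction needs only the kernel and not the underlying variants.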
Affiliation(s)
- Hyunwook Koh
- Department of Applied Mathematics and Statistics, The State University of New York, Korea, Incheon 21985, South Korea
7
Deek RA, Ma S, Lewis J, Li H. Statistical and computational methods for integrating microbiome, host genomics, and metabolomics data. eLife 2024; 13:e88956. [PMID: 38832759 PMCID: PMC11149933 DOI: 10.7554/elife.88956] [Received: 04/26/2023] [Accepted: 05/10/2024] [Indexed: 06/05/2024] [Open access]
Abstract
Large-scale microbiome studies are progressively utilizing multiomics designs, which include the collection of microbiome samples together with host genomics and metabolomics data. Despite the increasing number of data sources, there remains a bottleneck in understanding the relationships between different data modalities due to the limited number of statistical and computational methods for analyzing such data. Furthermore, little is known about the portability of general methods to the metagenomic setting and few specialized techniques have been developed. In this review, we summarize and implement some of the commonly used methods. We apply these methods to real data sets where shotgun metagenomic sequencing and metabolomics data are available for microbiome multiomics data integration analysis. We compare results across methods, highlight strengths and limitations of each, and discuss areas where statistical and computational innovation is needed.
Affiliation(s)
- Rebecca A Deek
- Department of Biostatistics, University of Pittsburgh, Pittsburgh, United States
| | - Siyuan Ma
- Department of Biostatistics, Vanderbilt School of Medicine, Nashville, United States
| | - James Lewis
- Division of Gastroenterology and Hepatology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
| | - Hongzhe Li
- Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
8
Liu S, Bradley P, Sun W. Neural network models for sequence-based TCR and HLA association prediction. PLoS Comput Biol 2023; 19:e1011664. [PMID: 37983288 PMCID: PMC10695368 DOI: 10.1371/journal.pcbi.1011664] [Received: 06/21/2023] [Revised: 12/04/2023] [Accepted: 11/06/2023] [Indexed: 11/22/2023] [Open access]
Abstract
T cells rely on their T cell receptors (TCRs) to discern foreign antigens presented by human leukocyte antigen (HLA) proteins. The TCRs of an individual contain a record of this individual's past immune activities, such as immune response to infections or vaccines. Mining the TCR data may recover useful information or biomarkers for immune related diseases or conditions. Some TCRs are observed only in the individuals with certain HLA alleles, and thus characterizing TCRs requires a thorough understanding of TCR-HLA associations. The extensive diversity of HLA alleles and the rareness of some HLA alleles present a formidable challenge for this task. Existing methods either treat HLA as a categorical variable or represent an HLA by its alphanumeric name, and have limited ability to generalize to the HLAs that are not seen in the training process. To address this challenge, we propose a neural network-based method named Deep learning Prediction of TCR-HLA association (DePTH) to predict TCR-HLA associations based on their amino acid sequences. We demonstrate that DePTH is capable of making reasonable predictions for TCR-HLA associations, even when neither the HLA nor the TCR have been included in the training dataset. Furthermore, we establish that DePTH can be used to quantify the functional similarities among HLA alleles, and that these HLA similarities are associated with the survival outcomes of cancer patients who received immune checkpoint blockade treatments.
Affiliation(s)
- Si Liu
- Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, United States of America
| | - Philip Bradley
- Herbold Computational Biology Program, Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, United States of America
- Institute for Protein Design, University of Washington, Seattle, Washington, United States of America
| | - Wei Sun
- Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, United States of America
- Department of Biostatistics, University of Washington, Seattle, Washington, United States of America
- Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, United States of America
|
9
|
Little P, Hsu L, Sun W. Associating somatic mutation with clinical outcomes through kernel regression and optimal transport. Biometrics 2023; 79:2705-2718. [PMID: 36217816 PMCID: PMC10455040 DOI: 10.1111/biom.13769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2021] [Accepted: 09/16/2022] [Indexed: 11/30/2022]
Abstract
Somatic mutations in cancer patients are inherently sparse and potentially high dimensional. Cancer patients may share the same set of deregulated biological processes perturbed by different sets of somatically mutated genes. Therefore, when assessing the associations between somatic mutations and clinical outcomes, gene-by-gene analysis is often under-powered because it does not capture the complex disease mechanisms shared across cancer patients. Rather than testing genes one by one, an intuitive approach is to aggregate somatic mutation data of multiple genes to assess their joint association with clinical outcomes. The challenge is how to aggregate such information. Building on the optimal transport method, we propose a principled approach to estimate the similarity of somatic mutation profiles of multiple genes between tumor samples, while accounting for gene-gene similarities defined by gene annotations or empirical mutational patterns. Using such similarities, we can assess the associations between somatic mutations and clinical outcomes by kernel regression. We have applied our method to analyze somatic mutation data of 17 cancer types and identified at least five cancer types, where somatic mutations are associated with overall survival, progression-free interval, or cytolytic activity.
Affiliation(s)
- Paul Little
- Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, U.S.A
| | - Li Hsu
- Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, U.S.A
- Department of Biostatistics, University of Washington, Seattle, Washington, U.S.A
| | - Wei Sun
- Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, U.S.A
- Department of Biostatistics, University of Washington, Seattle, Washington, U.S.A
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
|
10
|
Qun T, Zhou T, Hao J, Wang C, Zhang K, Xu J, Wang X, Zhou W. Antibacterial activities of anthraquinones: structure-activity relationships and action mechanisms. RSC Med Chem 2023; 14:1446-1471. [PMID: 37593578 PMCID: PMC10429894 DOI: 10.1039/d3md00116d] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 03/12/2023] [Accepted: 05/24/2023] [Indexed: 08/19/2023] Open
Abstract
With the increasing prevalence of untreatable infections caused by antibiotic-resistant bacteria, the discovery of new drugs from natural products has become a hot research topic. The antibacterial activity of anthraquinones widely distributed in traditional Chinese medicine has attracted much attention. Herein, the structure and activity relationships (SARs) of anthraquinones as bacteriostatic agents are reviewed and elucidated. The substituents of anthraquinone and its derivatives are closely related to their antibacterial activities. The stronger the polarity of anthraquinone substituents is, the more potent the antibacterial effects appear. The presence of hydroxyl groups is not necessary for the antibacterial activity of hydroxyanthraquinone derivatives. Substitution of di-isopentenyl groups can improve the antibacterial activity of anthraquinone derivatives. The rigid plane structure of anthraquinone lowers its water solubility and results in the reduced activity. Meanwhile, the antibacterial mechanisms of anthraquinone and its analogs are explored, mainly including biofilm formation inhibition, destruction of the cell wall, endotoxin inhibition, inhibition of nucleic acid and protein synthesis, and blockage of energy metabolism and other substances.
Affiliation(s)
- Tang Qun
- Shanghai Veterinary Research Institute, Chinese Academy of Agricultural Sciences 200241 Shanghai China
| | - Tiantian Zhou
- School of Chinese Materia Medica, Guangdong Pharmaceutical University 440113 Guangzhou China
| | - Jiongkai Hao
- Shanghai Veterinary Research Institute, Chinese Academy of Agricultural Sciences 200241 Shanghai China
| | - Chunmei Wang
- Shanghai Veterinary Research Institute, Chinese Academy of Agricultural Sciences 200241 Shanghai China
- Key laboratory of Veterinary Chemical Drugs and Pharmaceutics, Ministry of Agriculture and Rural Affairs, Shanghai Research Institute, Chinese Academy of Agricultural Sciences Shanghai 200241 China
| | - Keyu Zhang
- Shanghai Veterinary Research Institute, Chinese Academy of Agricultural Sciences 200241 Shanghai China
- Key laboratory of Veterinary Chemical Drugs and Pharmaceutics, Ministry of Agriculture and Rural Affairs, Shanghai Research Institute, Chinese Academy of Agricultural Sciences Shanghai 200241 China
| | - Jing Xu
- Huanghua Agricultural and Rural Development Bureau Bohai New Area 061100 Hebei China
| | - Xiaoyang Wang
- Shanghai Veterinary Research Institute, Chinese Academy of Agricultural Sciences 200241 Shanghai China
- Key laboratory of Veterinary Chemical Drugs and Pharmaceutics, Ministry of Agriculture and Rural Affairs, Shanghai Research Institute, Chinese Academy of Agricultural Sciences Shanghai 200241 China
| | - Wen Zhou
- Shanghai Veterinary Research Institute, Chinese Academy of Agricultural Sciences 200241 Shanghai China
- Key laboratory of Veterinary Chemical Drugs and Pharmaceutics, Ministry of Agriculture and Rural Affairs, Shanghai Research Institute, Chinese Academy of Agricultural Sciences Shanghai 200241 China
|
11
|
Gu W, Koh H, Jang H, Lee B, Kang B. MiSurv: an Integrative Web Cloud Platform for User-Friendly Microbiome Data Analysis with Survival Responses. Microbiol Spectr 2023; 11:e0505922. [PMID: 37039671 PMCID: PMC10269532 DOI: 10.1128/spectrum.05059-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2022] [Accepted: 03/12/2023] [Indexed: 04/12/2023] Open
Abstract
Investigators have studied the treatment effects on human health or disease, the treatment effects on human microbiome, and the roles of the microbiome on human health or disease. Especially, in a clinical trial, investigators commonly trace disease status over a lengthy period to survey the sequential disease progression for different treatment groups (e.g., treatment versus placebo, new treatment versus old treatment). Hence, disease responses are often available in the form of survival (i.e., time-to-event) responses stratified by treatment groups. While the recent web cloud platforms have enabled user-friendly microbiome data processing and analytics, there is currently no web cloud platform to analyze microbiome data with survival responses. Therefore, we introduce here an integrative web cloud platform, called MiSurv, for comprehensive microbiome data analysis with survival responses. IMPORTANCE MiSurv consists of a data processing module and its following four data analytic modules: (i) Module 1: Comparative survival analysis between treatment groups, (ii) Module 2: Comparative analysis in microbial composition between treatment groups, (iii) Module 3: Association testing between microbial composition and survival responses, (iv) Module 4: Prediction modeling using microbial taxa on survival responses. We demonstrate its use through an example trial on the effects of antibiotic use on the survival rate against type 1 diabetes (T1D) onset and gut microbiome composition, respectively, and the effects of the gut microbiome on the survival rate against T1D onset. MiSurv is freely available on our web server (http://misurv.micloud.kr) or can alternatively run on the user's local computer (https://github.com/wg99526/MiSurvGit).
Affiliation(s)
- Won Gu
- Department of Applied Mathematics and Statistics, The State University of New York, Korea, Incheon, South Korea
| | - Hyunwook Koh
- Department of Applied Mathematics and Statistics, The State University of New York, Korea, Incheon, South Korea
| | - Hyojung Jang
- Department of Applied Mathematics and Statistics, The State University of New York, Korea, Incheon, South Korea
| | - Byungho Lee
- Department of Applied Mathematics and Statistics, The State University of New York, Korea, Incheon, South Korea
| | - Byungkon Kang
- Department of Computer Science, The State University of New York, Korea, Incheon, South Korea
|
12
|
Khomich M, Lin H, Malinovschi A, Brix S, Cestelli L, Peddada S, Johannessen A, Eriksen C, Real FG, Svanes C, Bertelsen RJ. Association between lipid-A-producing oral bacteria of different potency and fractional exhaled nitric oxide in a Norwegian population-based adult cohort. J Transl Med 2023; 21:354. [PMID: 37246224 DOI: 10.1186/s12967-023-04199-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2023] [Accepted: 05/14/2023] [Indexed: 05/30/2023] Open
Abstract
BACKGROUND: Lipid A is the primary immunostimulatory part of the lipopolysaccharide (LPS) molecule. The inflammatory response of LPS varies and depends upon the number of acyl chains and phosphate groups in lipid A which is specific for a bacterial species or strain. Traditional LPS quantification assays cannot distinguish between the acylation degree of lipid A molecules, and therefore little is known about how bacteria with different inflammation-inducing potencies affect fractional exhaled nitric oxide (FeNO). We aimed to explore the association between pro-inflammatory hexa- and less inflammatory penta-acylated LPS-producing oral bacteria and FeNO as a marker of airway inflammation.
METHODS: We used data from a population-based adult cohort from Norway (n = 477), a study center of the RHINESSA multi-center generation study. We applied statistical methods on the bacterial community- (prediction with MiRKAT) and genus-level (differential abundance analysis with ANCOM-BC) to investigate the association between the oral microbiota composition and FeNO.
RESULTS: We found the overall composition to be significantly associated with increasing FeNO levels independent of covariate adjustment, and abundances of 27 bacterial genera to differ in individuals with high FeNO vs. low FeNO levels. Hexa- and penta-acylated LPS producers made up 2.4% and 40.8% of the oral bacterial genera, respectively. The Bray-Curtis dissimilarity within hexa- and penta-acylated LPS-producing oral bacteria was associated with increasing FeNO levels independent of covariate adjustment. A few single penta-acylated LPS producers were more abundant in individuals with low FeNO vs. high FeNO, while hexa-acylated LPS producers were found not to be enriched.
CONCLUSIONS: In a population-based adult cohort, FeNO was observed to be associated with the overall oral bacterial community composition. The effect of hexa- and penta-acylated LPS-producing oral bacteria was overall significant when focusing on Bray-Curtis dissimilarity within each of the two communities and FeNO levels, but only penta-acylated LPS producers appeared to be reduced or absent in individuals with high FeNO. It is likely that the pro-inflammatory effect of hexa-acylated LPS producers is counteracted by the dominance of the more abundant penta-acylated LPS producers in this population-based adult cohort involving mainly healthy individuals.
Affiliation(s)
- Maryia Khomich
- Department of Clinical Science, University of Bergen, Bergen, Norway.
| | - Huang Lin
- Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences (NIEHS), NIH, Research Triangle Park, Durham, NC, USA
| | - Andrei Malinovschi
- Department of Medical Sciences, Clinical Physiology, Uppsala University, Uppsala, Sweden
| | - Susanne Brix
- Department of Biotechnology and Biomedicine, Technical University of Denmark, Kongens Lyngby, Denmark
| | - Lucia Cestelli
- Department of Clinical Science, University of Bergen, Bergen, Norway
| | - Shyamal Peddada
- Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences (NIEHS), NIH, Research Triangle Park, Durham, NC, USA
| | - Ane Johannessen
- Department of Global Public Health and Primary Care, Center for International Health, University of Bergen, Bergen, Norway
| | - Carsten Eriksen
- Department of Biotechnology and Biomedicine, Technical University of Denmark, Kongens Lyngby, Denmark
- Center for Molecular Prediction of Inflammatory Bowel Disease (PREDICT), Department of Clinical Medicine, Aalborg University, Copenhagen, Denmark
| | - Francisco Gomez Real
- Department of Clinical Science, University of Bergen, Bergen, Norway
- Department of Obstetrics and Gynecology, Haukeland University Hospital, Bergen, Norway
| | - Cecilie Svanes
- Department of Global Public Health and Primary Care, Center for International Health, University of Bergen, Bergen, Norway
- Department of Occupational Medicine, Haukeland University Hospital, Bergen, Norway
| | - Randi Jacobsen Bertelsen
- Department of Clinical Science, University of Bergen, Bergen, Norway.
- Oral Health Center of Expertise in Western Norway, Bergen, Norway.
|
13
|
Liu H, Ling W, Hua X, Moon JY, Williams-Nguyen JS, Zhan X, Plantinga AM, Zhao N, Zhang A, Knight R, Qi Q, Burk RD, Kaplan RC, Wu MC. Kernel-based genetic association analysis for microbiome phenotypes identifies host genetic drivers of beta-diversity. MICROBIOME 2023; 11:80. [PMID: 37081571 PMCID: PMC10116795 DOI: 10.1186/s40168-023-01530-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/09/2022] [Accepted: 03/21/2023] [Indexed: 05/03/2023]
Abstract
BACKGROUND: Understanding human genetic influences on the gut microbiota helps elucidate the mechanisms by which genetics may influence health outcomes. Typical microbiome genome-wide association studies (GWAS) marginally assess the association between individual genetic variants and individual microbial taxa. We propose a novel approach, the covariate-adjusted kernel RV (KRV) framework, to map genetic variants associated with microbiome beta-diversity, which focuses on overall shifts in the microbiota. The KRV framework evaluates the association between genetics and microbes by comparing similarity in genetic profiles, based on groups of variants at the gene level, to similarity in microbiome profiles, based on the overall microbiome composition, across all pairs of individuals. By reducing the multiple-testing burden and capturing intrinsic structure within the genetic and microbiome data, the KRV framework has the potential of improving statistical power in microbiome GWAS.
RESULTS: We apply the covariate-adjusted KRV to the Hispanic Community Health Study/Study of Latinos (HCHS/SOL) in a two-stage (first gene-level, then variant-level) genome-wide association analysis for gut microbiome beta-diversity. We have identified an immunity-related gene, IL23R, reported in a previous microbiome genetic association study and discovered 3 other novel genes, 2 of which are involved in immune functions or autoimmune disorders. In addition, simulation studies show that the covariate-adjusted KRV has a greater power than other microbiome GWAS methods that rely on univariate microbiome phenotypes across a range of scenarios.
CONCLUSIONS: Our findings highlight the value of the covariate-adjusted KRV as a powerful microbiome GWAS approach and support an important role of immunity-related genes in shaping the gut microbiome composition.
Affiliation(s)
- Hongjiao Liu
- Department of Biostatistics, University of Washington, Seattle, WA, 98195, USA
- Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA
| | - Wodan Ling
- Division of Biostatistics, Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, 10065, USA
| | - Xing Hua
- Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA
| | - Jee-Young Moon
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
| | - Jessica S Williams-Nguyen
- Institute for Research and Education to Advance Community Health, Washington State University, Seattle, WA, 98101, USA
| | - Xiang Zhan
- Department of Biostatistics and Beijing International Center for Mathematical Research, Peking University, Beijing, 100191, China
| | - Anna M Plantinga
- Department of Mathematics and Statistics, Williams College, Williamstown, MA, 01267, USA
| | - Ni Zhao
- Department of Biostatistics, Johns Hopkins University, Baltimore, MD, 21205, USA
| | - Angela Zhang
- Department of Biostatistics, University of Washington, Seattle, WA, 98195, USA
- Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA
| | - Rob Knight
- Departments of Pediatrics, Computer Science & Engineering, and Bioengineering; Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA, 92093, USA
| | - Qibin Qi
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
| | - Robert D Burk
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
- Departments of Pediatrics; Microbiology & Immunology; and, Obstetrics, Gynecology & Women's Health, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
| | - Robert C Kaplan
- Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
| | - Michael C Wu
- Department of Biostatistics, University of Washington, Seattle, WA, 98195, USA.
- Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA.
|
14
|
Sun H, Wang Y, Xiao Z, Huang X, Wang H, He T, Jiang X. multiMiAT: an optimal microbiome-based association test for multicategory phenotypes. Brief Bioinform 2023; 24:7005163. [PMID: 36702753 DOI: 10.1093/bib/bbad012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2022] [Revised: 12/31/2022] [Accepted: 01/03/2023] [Indexed: 01/28/2023] Open
Abstract
Microbes can affect the metabolism and immunity of human body incessantly, and the dysbiosis of human microbiome drives not only the occurrence but also the progression of disease (i.e. multiple statuses of disease). Recently, microbiome-based association tests have been widely developed to detect the association between the microbiome and host phenotype. However, the existing methods have not achieved satisfactory performance in testing the association between the microbiome and ordinal/nominal multicategory phenotypes (e.g. disease severity and tumor subtype). In this paper, we propose an optimal microbiome-based association test for multicategory phenotypes, namely, multiMiAT. Specifically, under the multinomial logit model framework, we first introduce a microbiome regression-based kernel association test for multicategory phenotypes (multiMiRKAT). As a data-driven optimal test, multiMiAT then integrates multiMiRKAT, score test and MiRKAT-MC to maintain excellent performance in diverse association patterns. Massive simulation experiments prove the success of our method. Furthermore, multiMiAT is also applied to real microbiome data experiments to detect the association between the gut microbiome and clinical statuses of colorectal cancer as well as for diverse statuses of Clostridium difficile infections.
Affiliation(s)
- Han Sun
- Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, China
- School of Computer Science, Central China Normal University, Wuhan 430079, China
- School of Mathematics and Statistics, Central China Normal University, Wuhan 430079, China
| | - Yue Wang
- Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, China
- School of Computer Science, Central China Normal University, Wuhan 430079, China
| | - Zhen Xiao
- Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, China
- School of Computer Science, Central China Normal University, Wuhan 430079, China
- School of Mathematics and Statistics, Central China Normal University, Wuhan 430079, China
| | - Xiaoyun Huang
- Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, China
- School of Computer Science, Central China Normal University, Wuhan 430079, China
- Collaborative & Innovative Center for Educational Technology, Central China Normal University, Wuhan 430079, China
| | - Haodong Wang
- Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, China
- School of Computer Science, Central China Normal University, Wuhan 430079, China
| | - Tingting He
- Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, China
- School of Computer Science, Central China Normal University, Wuhan 430079, China
- National Language Resources Monitoring & Research Center for Network Media, Central China Normal University, Wuhan 430079, China
| | - Xingpeng Jiang
- Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, China
- School of Computer Science, Central China Normal University, Wuhan 430079, China
- National Language Resources Monitoring & Research Center for Network Media, Central China Normal University, Wuhan 430079, China
|
15
|
Abstract
Since advances in next-generation sequencing (NGS) techniques have made it possible to investigate uncultured microbiota and their genomes in an unbiased manner, many microbiome studies have reported strong evidence for close links between the microbiome and human health and disease. Bioinformatic and statistical analyses of NGS-based microbiome data are essential components of these studies, used to explore the complex composition of microbial communities and to understand the functions of community members in relation to host and environment. This chapter introduces bioinformatic analysis methods that generate taxonomy and functional feature count tables along with a phylogenetic tree from raw NGS microbiome data, and then introduces statistical methods and machine learning approaches for analyzing the outputs of the bioinformatic analysis to infer the biodiversity of a microbial community and unravel host-microbiome associations. Understanding the advantages and limitations of the analysis methods will help readers use them correctly in microbiome data analysis and may open opportunities to develop new analytic techniques for microbiome research.
Affiliation(s)
- Youngchul Kim
- Department of Biostatistics and Bioinformatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA.
|
16
|
Ham H, Park T. Combining p-values from various statistical methods for microbiome data. Front Microbiol 2022; 13:990870. [PMID: 36439799 PMCID: PMC9686280 DOI: 10.3389/fmicb.2022.990870] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2022] [Accepted: 10/11/2022] [Indexed: 08/30/2023] Open
Abstract
MOTIVATION: In the field of microbiome analysis, various statistical methods have been developed for identifying differentially expressed features while accounting for the overdispersion and high sparsity of microbiome data. However, because of differences in statistical models or test formulations, significance results are often inconsistent across statistical methods, which makes it difficult to determine the importance of microbiome taxa. Thus, it is practically important to integrate the results from all statistical methods to determine the importance of microbiome taxa. A standard meta-analysis is a powerful tool for integrative analysis, and it provides a summary measure by combining p-values from various statistical methods. While many meta-analysis methods are available, it is not easy to choose the one most suitable for microbiome data.
RESULTS: In this study, we investigated which meta-analysis method most adequately represents the importance of microbiome taxa. We considered Fisher's method, the minimum p-value method, the Simes method, Stouffer's method, the Kost method, and the Cauchy combination test. Through simulation studies, we showed that the Cauchy combination test provides the best combined p-value in the sense that it performed the best among the examined methods while controlling the type 1 error rates. Furthermore, it produced high rank similarity with the true ranks. Through a real data application to colorectal cancer microbiome data, we demonstrated that the taxa ranked most highly by the Cauchy combination test have been reported to be associated with colorectal cancer.
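For readers unfamiliar with the Cauchy combination test named in this abstract, the following is a minimal Python sketch of the standard formula, not the authors' implementation. Each p-value is mapped to a Cauchy variate, the variates are averaged, and the upper tail of the standard Cauchy distribution gives the combined p-value. The function name `cauchy_combination` and the default of equal weights are assumptions made for illustration; the abstract does not state which weights were used.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine p-values with the Cauchy combination test.

    Hypothetical helper for illustration; equal weights are assumed
    when none are supplied.
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0] * n
    total = sum(weights)
    # Map each p-value to a standard Cauchy variate and take the
    # weighted average (weights are normalised to sum to one).
    stat = sum(
        (w / total) * math.tan((0.5 - p) * math.pi)
        for w, p in zip(weights, p_values)
    )
    # Upper-tail probability of the standard Cauchy distribution.
    return 0.5 - math.atan(stat) / math.pi


if __name__ == "__main__":
    # Illustrative p-values for one taxon from three hypothetical methods.
    print(cauchy_combination([0.01, 0.20, 0.80]))
```

A single input is returned unchanged, and small p-values dominate the combined result, which is one reason the test is often described as robust to a few large p-values among the inputs.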
Affiliation(s)
- Hyeonjung Ham
- Interdisciplinary Program of Bioinformatics, Seoul National University, Seoul, South Korea
| | - Taesung Park
- Interdisciplinary Program of Bioinformatics, Seoul National University, Seoul, South Korea
- Department of Statistics, Seoul National University, Seoul, South Korea
|
17
|
Peters BA, Pass HI, Burk RD, Xue X, Goparaju C, Sollecito CC, Grassi E, Segal LN, Tsay JCJ, Hayes RB, Ahn J. The lung microbiome, peripheral gene expression, and recurrence-free survival after resection of stage II non-small cell lung cancer. Genome Med 2022; 14:121. [PMID: 36303210 PMCID: PMC9609265 DOI: 10.1186/s13073-022-01126-7] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 10/26/2021] [Accepted: 10/14/2022] [Indexed: 11/23/2022] Open
Abstract
BACKGROUND: Cancer recurrence after tumor resection in early-stage non-small cell lung cancer (NSCLC) is common, yet difficult to predict. The lung microbiota and systemic immunity may be important modulators of risk for lung cancer recurrence, yet biomarkers from the lung microbiome and peripheral immune environment are understudied. Such markers may hold promise for prediction as well as improved etiologic understanding of lung cancer recurrence.
METHODS: In tumor and distant normal lung samples from 46 stage II NSCLC patients with curative resection (39 tumor samples, 41 normal lung samples), we conducted 16S rRNA gene sequencing. We also measured peripheral blood immune gene expression with nanoString®. We examined associations of lung microbiota and peripheral gene expression with recurrence-free survival (RFS) and disease-free survival (DFS) using 500 × 10-fold cross-validated elastic-net penalized Cox regression, and examined predictive accuracy using time-dependent receiver operating characteristic (ROC) curves.
RESULTS: Over a median of 4.8 years of follow-up (range 0.2-12.2 years), 43% of patients experienced a recurrence, and 50% died. In normal lung tissue, a higher abundance of classes Bacteroidia and Clostridia, and orders Bacteroidales and Clostridiales, were associated with worse RFS, while a higher abundance of classes Alphaproteobacteria and Betaproteobacteria, and orders Burkholderiales and Neisseriales, were associated with better RFS. In tumor tissue, a higher abundance of orders Actinomycetales and Pseudomonadales were associated with worse DFS. Among these taxa, normal lung Clostridiales and Bacteroidales were also related to worse survival in a previous small pilot study and an additional independent validation cohort. In peripheral blood, higher expression of genes TAP1, TAPBP, CSF2RB, and IFITM2 were associated with better DFS. Analysis of ROC curves revealed that lung microbiome and peripheral gene expression biomarkers provided significant additional recurrence risk discrimination over standard demographic and clinical covariates, with microbiome biomarkers contributing more to short-term (1-year) prediction and gene biomarkers contributing to longer-term (2-5-year) prediction.
CONCLUSIONS: We identified compelling biomarkers in under-explored data types, the lung microbiome, and peripheral blood gene expression, which may improve risk prediction of recurrence in early-stage NSCLC patients. These findings will require validation in a larger cohort.
Affiliation(s)
- Brandilyn A Peters
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, 1300 Morris Park Avenue, #1315AB, The Bronx, New York, NY, 10461, USA.
| | - Harvey I Pass
- Department of Cardiothoracic Surgery, NYU Langone Health, New York, NY, USA
- NYU Perlmutter Cancer Center, New York, NY, USA
| | - Robert D Burk
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, 1300 Morris Park Avenue, #1315AB, The Bronx, New York, NY, 10461, USA
- Department of Pediatrics, Albert Einstein College of Medicine, The Bronx, New York, NY, USA
- Department of Microbiology & Immunology, and Obstetrics & Gynecology & Women's Health, Albert Einstein College of Medicine, The Bronx, New York, NY, USA
| | - Xiaonan Xue
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, 1300 Morris Park Avenue, #1315AB, The Bronx, New York, NY, 10461, USA
| | - Chandra Goparaju
- Department of Cardiothoracic Surgery, NYU Langone Health, New York, NY, USA
| | | | - Evan Grassi
- Department of Pediatrics, Albert Einstein College of Medicine, The Bronx, New York, NY, USA
| | | | | | - Richard B Hayes
- NYU Perlmutter Cancer Center, New York, NY, USA
- Department of Population Health, NYU Langone Health, New York, NY, USA
| | - Jiyoung Ahn
- NYU Perlmutter Cancer Center, New York, NY, USA
- Department of Population Health, NYU Langone Health, New York, NY, USA
|
18
|
Wojciechowski S, Majchrzak-Górecka M, Biernat P, Odrzywołek K, Pruss Ł, Zych K, Majta J, Milanowska-Zabel K. Machine learning on the road to unlocking microbiota's potential for boosting immune checkpoint therapy. Int J Med Microbiol 2022; 312:151560. [PMID: 36113358 DOI: 10.1016/j.ijmm.2022.151560] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2021] [Revised: 07/15/2022] [Accepted: 08/31/2022] [Indexed: 10/14/2022] Open
Abstract
The intestinal microbiota is a complex and diverse ecological community that fulfills multiple functions and substantially impacts human health. Despite its plasticity, unfavorable conditions can cause perturbations leading to so-called dysbiosis, which has been connected to multiple diseases. Unfortunately, understanding the mechanisms underlying the crosstalk between these microorganisms and their host is proving difficult. Traditional bioinformatic tools struggle to fully exploit the big data generated for this purpose by modern high-throughput screens. Machine learning (ML) may be a means of solving such problems, but it requires diligent application to allow valid conclusions to be drawn. This is especially crucial because insight into the mechanistic basis of microbial impact on human health is highly anticipated in numerous fields of study, including oncology, where a growing number of studies implicate the gut ecosystem in both carcinogenesis and antineoplastic treatment outcomes. Based on these reports and the first signs of clinical benefit from microbiota modulation in human trials, hopes are rising for the development of microbiome-derived diagnostics and therapeutics. In this mini-review, we examine analytical approaches used to uncover the role of the gut microbiome in immune checkpoint therapy (ICT) using shotgun metagenomic sequencing (SMS) data.
Affiliation(s)
| | | | | | - Krzysztof Odrzywołek
- Ardigen, Podole 76, 30-394 Kraków, Poland; Institute of Computer Science, Faculty of Computer Science, Electronics and Telecommunications, AGH University of Science and Technology, Mickiewicza 30, 30-059 Kraków, Poland
| | - Łukasz Pruss
- Ardigen, Podole 76, 30-394 Kraków, Poland; Department of Biochemistry, Molecular Biology and Biotechnology, Faculty of Chemistry, Wroclaw University of Science and Technology, 50-373 Wroclaw, Poland
| | | | - Jan Majta
- Ardigen, Podole 76, 30-394 Kraków, Poland; Department of Computational Biophysics and Bioinformatics, Faculty of Biochemistry, Biophysics and Biotechnology, Jagiellonian University, Krakow, Poland
|
19
|
Hu Y, Li Y, Satten GA, Hu YJ. Testing microbiome associations with survival times at both the community and individual taxon levels. PLoS Comput Biol 2022; 18:e1010509. [PMID: 36103548 PMCID: PMC9512219 DOI: 10.1371/journal.pcbi.1010509] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2022] [Revised: 09/26/2022] [Accepted: 08/23/2022] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Finding microbiome associations with possibly censored survival times is an important problem, especially as specific taxa could serve as biomarkers for disease prognosis or as targets for therapeutic interventions. The two existing methods for survival outcomes, MiRKAT-S and OMiSA, are restricted to testing associations at the community level and do not provide results at the individual taxon level. An ad hoc approach testing each taxon with a survival outcome using the Cox proportional hazard model may not perform well in the microbiome setting with sparse count data and small sample sizes. METHODS We have previously developed the linear decomposition model (LDM) for testing continuous or discrete outcomes that unifies community-level and taxon-level tests into one framework. Here we extend the LDM to test survival outcomes. We propose to use the Martingale residuals or the deviance residuals obtained from the Cox model as continuous covariates in the LDM. We further construct tests that combine the results of analyzing each set of residuals separately. Finally, we extend PERMANOVA, the most commonly used distance-based method for testing community-level hypotheses, to handle survival outcomes in a similar manner. RESULTS Using simulated data, we showed that the LDM-based tests preserved the false discovery rate for testing individual taxa and had good sensitivity. The LDM-based community-level tests and PERMANOVA-based tests had comparable or better power than MiRKAT-S and OMiSA. An analysis of data on the association of the gut microbiome and the time to acute graft-versus-host disease revealed several dozen associated taxa that would not have been achievable by any community-level test, as well as improved community-level tests by the LDM and PERMANOVA over those obtained using MiRKAT-S and OMiSA. 
CONCLUSIONS Unlike existing methods, our new methods are capable of discovering individual taxa that are associated with survival times, which could be of important use in clinical settings.
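The martingale and deviance residuals mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a Cox model with no covariates, in which case the Breslow cumulative hazard reduces to the Nelson-Aalen estimate; the function names and the toy data are hypothetical, and this is not the LDM implementation. Residuals of this kind could then be passed to a downstream test as a continuous covariate.

```python
import math


def nelson_aalen(times, events):
    """Return H(t), the Nelson-Aalen cumulative hazard estimate.

    Assumption: no covariates, so this equals the Breslow baseline
    hazard of a null Cox model.
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    increments = []
    for tj in event_times:
        d = sum(1 for t, e in zip(times, events) if e and t == tj)  # events at tj
        n = sum(1 for t in times if t >= tj)                        # at risk at tj
        increments.append((tj, d / n))

    def H(t):
        return sum(inc for tj, inc in increments if tj <= t)

    return H


def martingale_residuals(times, events):
    """Martingale residual: event indicator minus cumulative hazard."""
    H = nelson_aalen(times, events)
    return [e - H(t) for t, e in zip(times, events)]


def deviance_residuals(times, events):
    """Deviance residual: a symmetrizing transform of the martingale residual."""
    out = []
    for m, e in zip(martingale_residuals(times, events), events):
        # The term delta * log(delta - m) is zero for censored subjects.
        log_term = math.log(e - m) if e else 0.0
        magnitude = math.sqrt(max(0.0, -2.0 * (m + log_term)))
        out.append(math.copysign(magnitude, m))
    return out


# Hypothetical toy data: follow-up times and event indicators (1 = event, 0 = censored).
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 0, 1, 1, 0]
print(martingale_residuals(toy_times, toy_events))
print(deviance_residuals(toy_times, toy_events))
```

With covariates in the Cox model, the cumulative hazard for each subject would additionally be scaled by that subject's estimated relative risk; the sketch omits this step.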
Affiliation(s)
- Yingtian Hu
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, Georgia, United States of America
| | - Yunxiao Li
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, Georgia, United States of America
| | - Glen A. Satten
- Department of Gynecology and Obstetrics, Emory University School of Medicine, Atlanta, Georgia, United States of America
| | - Yi-Juan Hu
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, Georgia, United States of America
- * E-mail:
|
20
|
Jiang Z, He M, Chen J, Zhao N, Zhan X. MiRKAT-MC: A Distance-Based Microbiome Kernel Association Test With Multi-Categorical Outcomes. Front Genet 2022; 13:841764. [PMID: 35432465 PMCID: PMC9010828 DOI: 10.3389/fgene.2022.841764] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/22/2021] [Accepted: 03/10/2022] [Indexed: 12/14/2022] Open
Abstract
Increasing evidence shows that the microbiome plays a critical role in many human diseases. Apart from continuous and binary traits that measure the extent or presence of a disease, multi-categorical outcomes, including variations/subtypes of a disease or ordinal levels of disease severity, are commonly seen in clinical studies. In addition, studies with clustered designs (i.e., family-based and longitudinal studies) are popular alternatives to population-based ones, as they can identify characteristics at both the individual and population levels and track traits of interest over time. However, existing methods for microbiome association analysis cannot adequately handle multi-categorical outcomes, for either independent or clustered data. We propose a microbiome kernel association test with multi-categorical outcomes (MiRKAT-MC). Our method handles both nominal and ordinal outcomes for independent and clustered data. In addition, it incorporates multiple ecological distances to accommodate different association patterns between outcomes and microbiome compositions. A computationally efficient pseudo-permutation strategy is used to evaluate statistical significance. Comprehensive simulations show that MiRKAT-MC preserves the nominal type I error and increases statistical power under various scenarios and data types. We also apply MiRKAT-MC to real data sets with nominal and ordinal outcomes to gain biological insights. MiRKAT-MC is easy to implement and freely available as an R package at https://github.com/Zhiwen-Owen-Jiang/MiRKATMC, with a graphical user interface through R Shiny also available.
Affiliation(s)
- Zhiwen Jiang
- Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, United States
| | - Mengyu He
- Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA, United States
| | - Jun Chen
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, United States
| | - Ni Zhao
- Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, United States
- *Correspondence: Ni Zhao; Xiang Zhan
| | - Xiang Zhan
- Department of Biostatistics, School of Public Health and Beijing International Center for Mathematical Research, Peking University, Beijing, China
- *Correspondence: Ni Zhao; Xiang Zhan
|
21
|
Du Y, Feng R, Chang ET, Debelius JW, Yin L, Xu M, Huang T, Zhou X, Xiao X, Li Y, Liao J, Zheng Y, Huang G, Adami HO, Zhang Z, Cai Y, Ye W. Influence of Pre-treatment Saliva Microbial Diversity and Composition on Nasopharyngeal Carcinoma Prognosis. Front Cell Infect Microbiol 2022; 12:831409. [PMID: 35392614 PMCID: PMC8981580 DOI: 10.3389/fcimb.2022.831409] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2021] [Accepted: 02/21/2022] [Indexed: 11/13/2022] Open
Abstract
Background: The human microbiome has been reported to mediate the response to anticancer therapies. However, research about the influence of the oral microbiome on nasopharyngeal carcinoma (NPC) survival is lacking. We aimed to explore the effect of oral microbiota on NPC prognosis. Methods: Four hundred eighty-two population-based NPC cases in southern China between 2010 and 2013 were followed for survival, and their saliva samples were profiled using 16S rRNA sequencing. We analyzed associations of the oral microbiome diversity with mortality from all causes and NPC. Results: Within- and between-community diversities of saliva were associated with mortality over an average follow-up of 5.29 years. Lower Faith’s phylogenetic diversity was related to higher all-cause mortality [adjusted hazard ratio (aHR), 1.52 (95% confidence interval (CI), 1.06–2.17)] and NPC-specific mortality [aHR, 1.57 (95% CI, 1.07–2.29)], compared with medium diversity, but higher phylogenetic diversity was not protective. The third principal coordinate (PC3) identified from principal coordinates analysis (PCoA) on Bray–Curtis distance was marginally associated with reduced all-cause mortality [aHR, 0.85 (95% CI, 0.73–1.00)], as was the first principal coordinate (PC1) from PCoA on weighted UniFrac [aHR, 0.86 (95% CI, 0.74–1.00)], but neither was associated with NPC-specific mortality. PC3 from robust principal components analysis was associated with lower all-cause and NPC-specific mortalities, with HRs of 0.72 (95% CI, 0.61–0.85) and 0.71 (95% CI, 0.60–0.85), respectively. Conclusions: Oral microbiome may be an explanatory factor for NPC prognosis. Lower within-community diversity was associated with higher mortality, and certain measures of between-community diversity were related to mortality. Specifically, candidate bacteria were not related to mortality, suggesting that observed associations may be due to global patterns rather than particular pathogens.
Affiliation(s)
- Yun Du
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Ruimei Feng
- Department of Epidemiology and Health Statistics and Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Fujian Medical University, Fuzhou, China
| | - Ellen T. Chang
- Exponent, Inc., Center for Health Sciences, Menlo Park, CA, United States
| | - Justine W. Debelius
- Centre for Translational Microbiome Research, Department of Microbiology, Tumor, and Cell Biology, Karolinska Institutet, Solna, Sweden
- Karolinska Institutet, Science for Life Laboratory, Solna, Sweden
| | - Li Yin
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Miao Xu
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Sun Yat-sen University Cancer Center, Guangzhou, China
| | - Tingting Huang
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Department of Radiation Oncology, The First Affiliated Hospital of Guangxi Medical University, Nanning, China
- Radiation Oncology Clinical Medical Research of Guangxi Medical University, Nanning, China
| | - Xiaoying Zhou
- Life Science Institute, Guangxi Medical University, Nanning, China
- Key Laboratory of High-Incidence-Tumor Prevention & Treatment (Guangxi Medical University), Ministry of Education, Nanning, China
| | - Xue Xiao
- Department of Otolaryngology-Head & Neck Surgery, First Affiliated Hospital of Guangxi Medical University, Nanning, China
| | - Yancheng Li
- Guangxi Health Commission Key Laboratory of Molecular Epidemiology of Nasopharyngeal Carcinoma, Wuzhou Red Cross Hospital, Wuzhou, China
| | - Jian Liao
- Cangwu Institute for Nasopharyngeal Carcinoma Control and Prevention, Wuzhou, China
| | - Yuming Zheng
- Guangxi Health Commission Key Laboratory of Molecular Epidemiology of Nasopharyngeal Carcinoma, Wuzhou Red Cross Hospital, Wuzhou, China
| | - Guangwu Huang
- Department of Otolaryngology-Head & Neck Surgery, First Affiliated Hospital of Guangxi Medical University, Nanning, China
| | - Hans-Olov Adami
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Clinical Effectiveness Research Group, Institute of Health, University of Oslo, Oslo, Norway
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, United States
| | - Zhe Zhang
- Department of Otolaryngology-Head & Neck Surgery, First Affiliated Hospital of Guangxi Medical University, Nanning, China
| | - Yonglin Cai
- Guangxi Health Commission Key Laboratory of Molecular Epidemiology of Nasopharyngeal Carcinoma, Wuzhou Red Cross Hospital, Wuzhou, China
| | - Weimin Ye
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- *Correspondence: Weimin Ye
|
22
|
Immunologic Gene Signature Analysis Correlates Myeloid Cells and M2 Macrophages with Time to Trabectedin Failure in Sarcoma Patients. Cancers (Basel) 2022; 14:cancers14051290. [PMID: 35267598 PMCID: PMC8909887 DOI: 10.3390/cancers14051290] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2022] [Revised: 02/21/2022] [Accepted: 02/26/2022] [Indexed: 01/29/2023] Open
Abstract
Patients with metastatic soft tissue sarcoma (STS) have a poor prognosis and few available systemic treatment options. Trabectedin is currently being investigated as a potential adjunct to immunotherapy as it has been previously shown to kill tumor-associated macrophages. In this retrospective study, we sought to identify biomarkers that would be relevant to trials combining trabectedin with immunotherapy. We performed a single-center retrospective study of sarcoma patients treated with trabectedin with long-term follow-up. Multiplex gene expression analysis using the NanoString platform was assessed, and an exploratory analysis using the lasso-penalized Cox regression and kernel association test for survival (MiRKAT-S) methods investigated tumor-associated immune cells and correlated their gene signatures to patient survival. In total, 147 sarcoma patients treated with trabectedin were analyzed, with a mean follow-up time of 5 years. Patients with fewer prior chemotherapy regimens were more likely to stay on trabectedin longer (pairwise correlation = -0.17, p = 0.04). At 5 years, increased PD-L1 expression corresponded to worse outcomes (HR = 1.87, p = 0.04, q = 0.199). Additionally, six immunologic gene signatures were associated with up to 7-year survival by MiRKAT-S, notably myeloid-derived suppressor cells (p = 0.023, q = 0.058) and M2 macrophages (p = 0.03, q = 0.058). We found that the number of chemotherapy regimens prior to trabectedin negatively correlated with the number of trabectedin cycles received, suggesting that patients may benefit from receiving trabectedin earlier in their therapy course. The correlation of trabectedin outcomes with immune cell infiltrates supports the hypothesis that trabectedin may function as an immune modulator and supports ongoing efforts to study trabectedin in combination with immunotherapy. 
Furthermore, tumors with an immunosuppressive microenvironment characterized by macrophage infiltration and high PD-L1 expression were less likely to benefit from trabectedin, which could guide clinicians in future treatment decisions.
|
23
|
Banerjee K, Chen J, Zhan X. Adaptive and powerful microbiome multivariate association analysis via feature selection. NAR Genom Bioinform 2022; 4:lqab120. [PMID: 35047812 PMCID: PMC8759573 DOI: 10.1093/nargab/lqab120] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2021] [Revised: 11/13/2021] [Accepted: 12/24/2021] [Indexed: 02/06/2023] Open
Abstract
The important role of the human microbiome in health and disease is increasingly recognized. Since microbiome data are typically high dimensional, one popular mode of statistical association analysis for microbiome data is to pool individual microbial features into a group, and then conduct group-based multivariate association analysis. A corresponding challenge within this approach is to achieve adequate power to detect an association signal between a group of microbial features and the outcome of interest across a wide range of scenarios. Recognizing some existing methods' susceptibility to the adverse effects of noise accumulation, we introduce the Adaptive Microbiome Association Test (AMAT), a novel and powerful tool for multivariate microbiome association analysis, which combines the benefits of feature selection in high-dimensional inference with the robustness of adaptive statistical association testing. AMAT first alleviates the burden of noise accumulation via distance correlation learning, and then conducts a data-adaptive association test under the flexible generalized linear model framework. Extensive simulation studies and real data applications demonstrate that AMAT is highly robust and often more powerful than several existing methods, while preserving the correct type I error rate. A free implementation of AMAT in the R computing environment is available at https://github.com/kzb193/AMAT.
Affiliation(s)
| | | | - Xiang Zhan
- To whom correspondence should be addressed. Tel: +86 10 62744132; Fax: +86 10 62744134.
|
24
|
Novel application of survival models for predicting microbial community transitions with variable selection for eDNA. Appl Environ Microbiol 2022; 88:e0214621. [PMID: 35138931 DOI: 10.1128/aem.02146-21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open
Abstract
Survival analysis is a widely used statistical tool in medicine for inferring risk and time to disease-related events. However, it is under-utilized in microbiome research to predict microbial community mediated events, partly due to the sparsity and high dimensional nature of the data. We advance the application of Cox proportional hazards (Cox PH) survival models to environmental DNA (eDNA) data with feature selection suitable for filtering irrelevant and redundant taxonomic variables. Selection methods are compared in terms of false positives, sensitivity, and survival estimation accuracy in simulation and in a real data setting to forecast harmful cyanobacterial blooms. A novel extension of a method for selecting microbial biomarkers with survival data (SuRFCox) reliably outperforms other methods. We determine Cox PH models with SuRFCox selected predictors are more robust to varied signal, noise, and data correlation structure. SuRFCox also yields the most accurate and consistent prediction of blooms according to cross-validated testing by year over eight different bloom seasons. Identification of common biomarkers among validated survival forecasts over changing conditions has clear biological significance. Survival models with such biomarkers inform risk assessment and provide insight into the causes of critical community transitions. Importance: In this paper, we report a novel approach to selecting microorganisms for model-based prediction of the time to critical microbially-modulated events (e.g., harmful algal blooms, clinical outcomes, community shifts). Our novel method for identifying biomarkers from large, dynamic communities of microbes has broad utility to environmental and ecological impact risk assessment and public health. Results will also promote theoretical and practical advancements relevant to the biology of specific organisms. 
To address the unique challenge posed by diverse environmental conditions and sparse microbes, we developed a novel method of selecting predictors for modelling time-to-event data. Competing methods for selecting predictors are rigorously compared to determine which is the most accurate and generalizable. Model forecasts are applied to show suitable predictors can precisely quantify the risk over time of biological events like harmful cyanobacterial blooms.
|
25
|
Mohamed N, Litlekalsøy J, Ahmed IA, Martinsen EMH, Furriol J, Javier-Lopez R, Elsheikh M, Gaafar NM, Morgado L, Mundra S, Johannessen AC, Osman TAH, Nginamau ES, Suleiman A, Costea DE. Analysis of Salivary Mycobiome in a Cohort of Oral Squamous Cell Carcinoma Patients From Sudan Identifies Higher Salivary Carriage of Malassezia as an Independent and Favorable Predictor of Overall Survival. Front Cell Infect Microbiol 2021; 11:673465. [PMID: 34712619 PMCID: PMC8547610 DOI: 10.3389/fcimb.2021.673465] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2021] [Accepted: 08/27/2021] [Indexed: 12/20/2022] Open
Abstract
Background: Microbial dysbiosis and microbiome-induced inflammation have emerged as important factors in oral squamous cell carcinoma (OSCC) tumorigenesis during the last two decades. However, the “rare biosphere” of the oral microbiome, including fungi, has been sparsely investigated. This study aimed to characterize the salivary mycobiome in a prospective Sudanese cohort of OSCC patients and to explore patterns of diversities associated with overall survival (OS). Materials and Methods: Unstimulated saliva samples (n = 72) were collected from patients diagnosed with OSCC (n = 59) and from non-OSCC control volunteers (n = 13). DNA was extracted using a combined enzymatic–mechanical extraction protocol. The salivary mycobiome was assessed using a next-generation sequencing (NGS)-based methodology by amplifying the ITS2 region. The impact of the abundance of different fungal genera on the survival of OSCC patients was analyzed using Kaplan–Meier and Cox regression survival analyses (SPSS). Results: Sixteen genera were identified exclusively in the saliva of OSCC patients. Candida, Malassezia, Saccharomyces, Aspergillus, and Cyberlindnera were the most relatively abundant fungal genera in both groups and showed higher abundance in OSCC patients. Kaplan–Meier survival analysis showed higher salivary carriage of the Candida genus significantly associated with poor OS of OSCC patients (Breslow test: p = 0.043). In contrast, the higher salivary carriage of Malassezia showed a significant association with favorable OS in OSCC patients (Breslow test: p = 0.039). The Cox proportional hazards multiple regression model was applied to adjust the salivary carriage of both Candida and Malassezia according to age (p = 0.029) and identified the genus Malassezia as an independent predictor of OS (hazard ratio = 0.383, 95% CI = 0.16–0.93, p = 0.03). Conclusion: The fungal compositional patterns in saliva from OSCC patients were different from those of individuals without OSCC. 
The fungal genus Malassezia was identified as a putative prognostic biomarker and therapeutic target for OSCC.
Affiliation(s)
- Nazar Mohamed
- Gade Laboratory for Pathology, Department of Clinical Medicine, and Center for Cancer Biomarkers CCBIO, University of Bergen, Bergen, Norway; Department of Oral and Maxillofacial Surgery/Department of Basic Sciences, University of Khartoum, Khartoum, Sudan
| | - Jorunn Litlekalsøy
- Gade Laboratory for Pathology, Department of Clinical Medicine, and Center for Cancer Biomarkers CCBIO, University of Bergen, Bergen, Norway
| | - Israa Abdulrahman Ahmed
- Gade Laboratory for Pathology, Department of Clinical Medicine, and Center for Cancer Biomarkers CCBIO, University of Bergen, Bergen, Norway; Department of Operative Dentistry, University of Science & Technology, Omdurman, Sudan
| | | | - Jessica Furriol
- Department of Nephrology, Haukeland University Hospital, Bergen, Norway
| | - Ruben Javier-Lopez
- Department of Biological Sciences, The Faculty of Mathematics and Natural Sciences, University of Bergen, Bergen, Norway
| | - Mariam Elsheikh
- Department of Oral and Maxillofacial Surgery/Department of Basic Sciences, University of Khartoum, Khartoum, Sudan; Department of Oral & Maxillofacial Surgery, Khartoum Dental Teaching Hospital, Khartoum, Sudan
| | - Nuha Mohamed Gaafar
- Gade Laboratory for Pathology, Department of Clinical Medicine, and Center for Cancer Biomarkers CCBIO, University of Bergen, Bergen, Norway; Department of Oral and Maxillofacial Surgery/Department of Basic Sciences, University of Khartoum, Khartoum, Sudan
| | - Luis Morgado
- Section for Genetics and Evolutionary Biology (EvoGene), Department of Biosciences, The Faculty of Mathematics and Natural Sciences, University of Oslo, Oslo, Norway
| | - Sunil Mundra
- Section for Genetics and Evolutionary Biology (EvoGene), Department of Biosciences, The Faculty of Mathematics and Natural Sciences, University of Oslo, Oslo, Norway; Department of Biology, College of Science, United Arab Emirates University, Al Ain, Abu Dhabi, United Arab Emirates
| | - Anne Christine Johannessen
- Gade Laboratory for Pathology, Department of Clinical Medicine, and Center for Cancer Biomarkers CCBIO, University of Bergen, Bergen, Norway; Department of Pathology, Laboratory Clinic, Haukeland University Hospital, Bergen, Norway
| | - Tarig Al-Hadi Osman
- Gade Laboratory for Pathology, Department of Clinical Medicine, and Center for Cancer Biomarkers CCBIO, University of Bergen, Bergen, Norway
| | - Elisabeth Sivy Nginamau
- Gade Laboratory for Pathology, Department of Clinical Medicine, and Center for Cancer Biomarkers CCBIO, University of Bergen, Bergen, Norway; Department of Pathology, Laboratory Clinic, Haukeland University Hospital, Bergen, Norway
| | - Ahmed Suleiman
- Department of Oral and Maxillofacial Surgery/Department of Basic Sciences, University of Khartoum, Khartoum, Sudan; Department of Oral & Maxillofacial Surgery, Khartoum Dental Teaching Hospital, Khartoum, Sudan
| | - Daniela Elena Costea
- Gade Laboratory for Pathology, Department of Clinical Medicine, and Center for Cancer Biomarkers CCBIO, University of Bergen, Bergen, Norway; Department of Pathology, Laboratory Clinic, Haukeland University Hospital, Bergen, Norway
|
26
|
Sun H, Huang X, Fu L, Huo B, He T, Jiang X. A powerful adaptive microbiome-based association test for microbial association signals with diverse sparsity levels. J Genet Genomics 2021; 48:851-859. [PMID: 34411712 DOI: 10.1016/j.jgg.2021.08.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2021] [Revised: 08/06/2021] [Accepted: 08/06/2021] [Indexed: 01/12/2023]
Abstract
Dysbiosis of the microbiome may have negative effects on a host phenotype. The microbes related to the host phenotype are regarded as microbial association signals. Recently, statistical methods based on microbiome-phenotype association tests have been extensively developed to detect these association signals. However, the currently available methods do not perform well at detecting microbial association signals across diverse sparsity levels (i.e., sparse, low sparse, non-sparse), and the real association patterns related to different host phenotypes vary. Here, we propose a powerful and adaptive microbiome-based association test to detect microbial association signals with diverse sparsity levels, designated as MiATDS. In particular, we define a probability degree to measure the associations between microbes and the host phenotype and introduce the adaptive weighted sum of powered score tests by considering both probability degree and phylogenetic information. We design numerous simulation experiments on detecting association signals with diverse sparsity levels to evaluate the performance of the method. We find that type I error rates are well controlled and that MiATDS shows superior power. Applied to real data, MiATDS also proves practical and reliable. The R package is available at https://github.com/XiaoyunHuang33/MiATDS.
Affiliation(s)
- Han Sun
- Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, China; School of Computer, Central China Normal University, Wuhan 430079, China; School of Mathematics and Statistics, Central China Normal University, Wuhan 430079, China
| | - Xiaoyun Huang
- Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, China; School of Computer, Central China Normal University, Wuhan 430079, China; Collaborative & Innovative Center for Educational Technology, Central China Normal University, Wuhan 430079, China
| | - Lingling Fu
- Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, China; School of Computer, Central China Normal University, Wuhan 430079, China; School of Mathematics and Statistics, Central China Normal University, Wuhan 430079, China
| | - Ban Huo
- Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, China; School of Computer, Central China Normal University, Wuhan 430079, China
| | - Tingting He
- Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, China; School of Computer, Central China Normal University, Wuhan 430079, China; National Language Resources Monitoring & Research Center for Network Media, Central China Normal University, Wuhan 430079, China
| | - Xingpeng Jiang
- Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, China; School of Computer, Central China Normal University, Wuhan 430079, China; National Language Resources Monitoring & Research Center for Network Media, Central China Normal University, Wuhan 430079, China.
|
27
|
Tsay JCJ, Wu BG, Sulaiman I, Gershner K, Schluger R, Li Y, Yie TA, Meyn P, Olsen E, Perez L, Franca B, Carpenito J, Iizumi T, El-Ashmawy M, Badri M, Morton JT, Shen N, He L, Michaud G, Rafeq S, Bessich JL, Smith RL, Sauthoff H, Felner K, Pillai R, Zavitsanou AM, Koralov SB, Mezzano V, Loomis CA, Moreira AL, Moore W, Tsirigos A, Heguy A, Rom WN, Sterman DH, Pass HI, Clemente JC, Li H, Bonneau R, Wong KK, Papagiannakopoulos T, Segal LN. Lower Airway Dysbiosis Affects Lung Cancer Progression. Cancer Discov 2021; 11:293-307. [PMID: 33177060 PMCID: PMC7858243 DOI: 10.1158/2159-8290.cd-20-0263] [Citation(s) in RCA: 160] [Impact Index Per Article: 40.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2020] [Revised: 09/15/2020] [Accepted: 10/27/2020] [Indexed: 11/16/2022]
Abstract
In lung cancer, enrichment of the lower airway microbiota with oral commensals commonly occurs, and ex vivo models support that some of these bacteria can trigger host transcriptomic signatures associated with carcinogenesis. Here, we show that this lower airway dysbiotic signature was more prevalent in the stage IIIB-IV tumor-node-metastasis lung cancer group and is associated with poor prognosis, as shown by decreased survival among subjects with early-stage disease (I-IIIA) and worse tumor progression as measured by RECIST scores among subjects with stage IIIB-IV disease. In addition, this lower airway microbiota signature was associated with upregulation of the IL17, PI3K, MAPK, and ERK pathways in airway transcriptome, and we identified Veillonella parvula as the most abundant taxon driving this association. In a KP lung cancer model, lower airway dysbiosis with V. parvula led to decreased survival, increased tumor burden, IL17 inflammatory phenotype, and activation of checkpoint inhibitor markers. SIGNIFICANCE: Multiple lines of investigation have shown that the gut microbiota affects host immune response to immunotherapy in cancer. Here, we support that the local airway microbiota modulates the host immune tone in lung cancer, affecting tumor progression and prognosis. See related commentary by Zitvogel and Kroemer, p. 224. This article is highlighted in the In This Issue feature, p. 211.
Affiliation(s)
- Jun-Chieh J Tsay
- Division of Pulmonary and Critical Care Medicine, New York University School of Medicine, New York, New York
- Division of Pulmonary and Critical Care Medicine, VA New York Harbor Healthcare System, New York, New York
| | - Benjamin G Wu
- Division of Pulmonary and Critical Care Medicine, New York University School of Medicine, New York, New York
- Division of Pulmonary and Critical Care Medicine, VA New York Harbor Healthcare System, New York, New York
| | - Imran Sulaiman
- Division of Pulmonary and Critical Care Medicine, New York University School of Medicine, New York, New York
| | - Katherine Gershner
- Section of Pulmonary, Critical Care, Allergy and Immunology, Wake Forest School of Medicine, Winston-Salem, North Carolina
| | - Rosemary Schluger
- Division of Pulmonary and Critical Care Medicine, New York University School of Medicine, New York, New York
| | - Yonghua Li
- Division of Pulmonary and Critical Care Medicine, New York University School of Medicine, New York, New York
| | - Ting-An Yie
- Division of Pulmonary and Critical Care Medicine, New York University School of Medicine, New York, New York
| | - Peter Meyn
- NYU Langone Genomic Technology Center, New York University School of Medicine, New York, New York
| | - Evan Olsen
- Division of Pulmonary and Critical Care Medicine, New York University School of Medicine, New York, New York
| | - Luisannay Perez
- Division of Pulmonary and Critical Care Medicine, New York University School of Medicine, New York, New York
| | - Brendan Franca
- Division of Pulmonary and Critical Care Medicine, New York University School of Medicine, New York, New York
| | - Joseph Carpenito
- Division of Pulmonary and Critical Care Medicine, New York University School of Medicine, New York, New York
| | - Tadasu Iizumi
- Division of Pulmonary and Critical Care Medicine, New York University School of Medicine, New York, New York
| | - Mariam El-Ashmawy
- Department of Medicine, New York University School of Medicine, New York, New York
| | - Michelle Badri
- Department of Biology, New York University, New York, New York
| | - James T Morton
- Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, New York
| | - Nan Shen
- Department of Genetics and Genomic Sciences and Immunology Institute, Icahn School of Medicine at Mount Sinai, New York, New York
| | - Linchen He
- Department of Population Health, New York University School of Medicine, New York, New York
| | - Gaetane Michaud
- Division of Pulmonary and Critical Care Medicine, New York University School of Medicine, New York, New York
| | - Samaan Rafeq
- Division of Pulmonary and Critical Care Medicine, New York University School of Medicine, New York, New York
| | - Jamie L Bessich
- Division of Pulmonary and Critical Care Medicine, New York University School of Medicine, New York, New York
| | - Robert L Smith
- Division of Pulmonary and Critical Care Medicine, VA New York Harbor Healthcare System, New York, New York
| | - Harald Sauthoff
- Division of Pulmonary and Critical Care Medicine, VA New York Harbor Healthcare System, New York, New York
| | - Kevin Felner
- Division of Pulmonary and Critical Care Medicine, VA New York Harbor Healthcare System, New York, New York
| | - Ray Pillai
- Division of Pulmonary and Critical Care Medicine, New York University School of Medicine, New York, New York
| | | | - Sergei B Koralov
- Department of Pathology, New York University School of Medicine, New York, New York
| | - Valeria Mezzano
- Department of Pathology, New York University School of Medicine, New York, New York
| | - Cynthia A Loomis
- Department of Pathology, New York University School of Medicine, New York, New York
| | - Andre L Moreira
- Department of Pathology, New York University School of Medicine, New York, New York
| | - William Moore
- Department of Radiology, New York University School of Medicine, New York, New York
| | - Aristotelis Tsirigos
- Department of Pathology, New York University School of Medicine, New York, New York
| | - Adriana Heguy
- NYU Langone Genomic Technology Center, New York University School of Medicine, New York, New York
- Department of Pathology, New York University School of Medicine, New York, New York
| | - William N Rom
- Division of Pulmonary and Critical Care Medicine, New York University School of Medicine, New York, New York
| | - Daniel H Sterman
- Division of Pulmonary and Critical Care Medicine, New York University School of Medicine, New York, New York
| | - Harvey I Pass
- Department of Cardiothoracic Surgery, New York University School of Medicine, New York, New York
| | - Jose C Clemente
- Department of Genetics and Genomic Sciences and Immunology Institute, Icahn School of Medicine at Mount Sinai, New York, New York
| | - Huilin Li
- Department of Population Health, New York University School of Medicine, New York, New York
| | - Richard Bonneau
- Department of Biology, New York University, New York, New York
- Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, New York
- Center for Data Science, New York University School of Medicine, New York, New York
| | - Kwok-Kin Wong
- Division of Hematology and Oncology, New York University School of Medicine, New York, New York
| | | | - Leopoldo N Segal
- Division of Pulmonary and Critical Care Medicine, New York University School of Medicine, New York, New York.
28
|
Luna PN, Mansbach JM, Shaw CA. A joint modeling approach for longitudinal microbiome data improves ability to detect microbiome associations with disease. PLoS Comput Biol 2020; 16:e1008473. [PMID: 33315858 PMCID: PMC7769610 DOI: 10.1371/journal.pcbi.1008473] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 12/20/2019] [Revised: 12/28/2020] [Accepted: 10/27/2020] [Indexed: 02/02/2023] Open
Abstract
Changes in the composition of the microbiome over time are associated with myriad human illnesses. Unfortunately, the lack of analytic techniques has hindered researchers' ability to quantify the association between longitudinal microbial composition and time-to-event outcomes. Prior methodological work developed the joint model for longitudinal and time-to-event data to incorporate time-dependent biomarker covariates into the hazard regression approach to disease outcomes. The original implementation of this joint modeling approach employed a linear mixed effects model to represent the time-dependent covariates. However, when the distribution of the time-dependent covariate is non-Gaussian, as is the case with microbial abundances, researchers require different statistical methodology. We present a joint modeling framework that uses a negative binomial mixed effects model to determine longitudinal taxon abundances. We incorporate these modeled microbial abundances into a hazard function with a parameterization that not only accounts for the proportional nature of microbiome data, but also generates biologically interpretable results. Herein we demonstrate the performance improvements of our approach over existing alternatives via simulation as well as a previously published longitudinal dataset studying the microbiome during pregnancy. The results demonstrate that our joint modeling framework for longitudinal microbiome count data provides a powerful methodology to uncover associations between changes in microbial abundances over time and the onset of disease. This method offers the potential to equip researchers with a deeper understanding of the associations between longitudinal microbial composition changes and disease outcomes. This new approach could potentially lead to new diagnostic biomarkers or inform clinical interventions to help prevent or treat disease.
Affiliation(s)
- Pamela N. Luna
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Statistics, Rice University, Houston, Texas, United States of America
| | - Jonathan M. Mansbach
- Department of Pediatrics, Boston Children’s Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Chad A. Shaw
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Statistics, Rice University, Houston, Texas, United States of America
29
|
Wilson N, Zhao N, Zhan X, Koh H, Fu W, Chen J, Li H, Wu MC, Plantinga AM. MiRKAT: kernel machine regression-based global association tests for the microbiome. Bioinformatics 2020; 37:1595-1597. [PMID: 33225342 PMCID: PMC8495888 DOI: 10.1093/bioinformatics/btaa951] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 08/01/2020] [Revised: 10/13/2020] [Accepted: 10/28/2020] [Indexed: 11/14/2022] Open
Abstract
SUMMARY: Distance-based tests of microbiome beta diversity are an integral part of many microbiome analyses. MiRKAT enables distance-based association testing with a wide variety of outcome types, including continuous, binary, censored time-to-event, multivariate, correlated and high-dimensional outcomes. Omnibus tests allow simultaneous consideration of multiple distance and dissimilarity measures, providing higher power across a range of simulation scenarios. Two measures of effect size, a modified R-squared coefficient and a kernel RV coefficient, are incorporated to allow comparison of effect sizes across multiple kernels. AVAILABILITY AND IMPLEMENTATION: MiRKAT is available on CRAN as an R package. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Nehemiah Wilson
- Department of Mathematics and Statistics, Williams College, Williamstown, MA 01267, USA
| | - Ni Zhao
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA
| | - Xiang Zhan
- Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA 17033, USA
| | - Hyunwook Koh
- Department of Applied Mathematics and Statistics, The State University of New York, Korea (SUNY Korea), Incheon 21985, South Korea
| | - Weijia Fu
- Institute for Health Metrics and Evaluation, University of Washington, Seattle, WA 98121, USA
| | - Jun Chen
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN 55905, USA
| | - Hongzhe Li
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Michael C Wu
- Public Health Sciences Division, Biostatistics and Biomathematics Program, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
| | - Anna M Plantinga
- Department of Mathematics and Statistics, Williams College, Williamstown, MA 01267, USA. To whom correspondence should be addressed.
30
|
Xia Y. Correlation and association analyses in microbiome study integrating multiomics in health and disease. Prog Mol Biol Transl Sci 2020; 171:309-491. [PMID: 32475527 DOI: 10.1016/bs.pmbts.2020.04.003] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Indexed: 02/07/2023]
Abstract
Correlation and association analyses are among the most widely used statistical methods in research fields, including microbiome and integrative multiomics studies. Correlation and association have two implications: dependence and co-occurrence. Microbiome data are structured as a phylogenetic tree and have several unique characteristics, including high dimensionality, compositionality, sparsity with excess zeros, and heterogeneity. These unique characteristics cause several statistical issues when analyzing microbiome data and integrating multiomics data, such as large p and small n, dependency, overdispersion, and zero-inflation. In microbiome research, on the one hand, classic correlation and association methods are still applied in real studies and used for the development of new methods; on the other hand, new methods have been developed to target statistical issues arising from unique characteristics of microbiome data. Here, we first provide a comprehensive view of classic and newly developed univariate correlation and association-based methods. We discuss the appropriateness and limitations of using classic methods and demonstrate how the newly developed methods mitigate the issues of microbiome data. Second, we emphasize that concepts of correlation and association analyses have been shifted by introducing network analysis, microbe-metabolite interactions, functional analysis, etc. Third, we introduce multivariate correlation and association-based methods, which are organized by the categories of exploratory, interpretive, and discriminatory analyses and classification methods. Fourth, we focus on the hypothesis testing of univariate and multivariate regression-based association methods, including alpha and beta diversities-based, count-based, and relative abundance (or compositional)-based association analyses. We demonstrate the characteristics and limitations of each approach.
Fifth, we introduce two specific microbiome-based methods: phylogenetic tree-based association analysis and testing for survival outcomes. Sixth, we provide an overall view of longitudinal methods in analysis of microbiome and omics data, which cover standard, static, regression-based time series methods, principal trend analysis, and newly developed univariate overdispersed and zero-inflated as well as multivariate distance/kernel-based longitudinal models. Finally, we comment on current association analysis and future direction of association analysis in microbiome and multiomics studies.
Affiliation(s)
- Yinglin Xia
- Department of Medicine, University of Illinois at Chicago, Chicago, IL, United States.
31
|
Koh H, Zhao N. A powerful microbial group association test based on the higher criticism analysis for sparse microbial association signals. Microbiome 2020; 8:63. [PMID: 32393397 PMCID: PMC7216722 DOI: 10.1186/s40168-020-00834-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 06/18/2019] [Accepted: 03/23/2020] [Indexed: 05/05/2023]
Abstract
BACKGROUND: In human microbiome studies, it is crucial to evaluate the association between microbial group (e.g., community or clade) composition and a host phenotype of interest. In response, a number of microbial group association tests have been proposed, which account for the unique features of the microbiome data (e.g., high-dimensionality, compositionality, phylogenetic relationship). These tests generally fall in the class of aggregation tests, which amplify the overall group association by combining all the underlying microbial association signals, and, therefore, they are powerful when many microbial species are associated with a given host phenotype (i.e., low sparsity). However, in practice, the microbial association signals can be highly sparse, and this is precisely the situation in which the microbial group association is difficult to discover. METHODS: Here, we introduce a powerful microbial group association test for sparse microbial association signals, namely, microbiome higher criticism analysis (MiHC). MiHC is a data-driven omnibus test taken in a search space spanned by tailoring the higher criticism test to incorporate phylogenetic information and/or modulate sparsity levels and including the Simes test for excessively high sparsity levels. Therefore, MiHC robustly adapts to diverse phylogenetic relevance and sparsity levels. RESULTS: Our simulations show that MiHC maintains a high power at different phylogenetic relevance and sparsity levels with correct type I error control. We also apply MiHC to four real microbiome datasets to test the association between respiratory tract microbiome and smoking status, the association between the infant's gut microbiome and delivery mode, the association between the gut microbiome and type 1 diabetes status, and the association between the gut microbiome and human immunodeficiency virus status.
CONCLUSIONS: In practice, the true underlying association pattern on the extent of phylogenetic relevance and sparsity is usually unknown. Therefore, MiHC can be a useful analytic tool because of its high adaptivity to diverse phylogenetic relevance and sparsity levels. MiHC can be implemented in the R computing environment using our software package freely available at https://github.com/hk1785/MiHC.
Affiliation(s)
- Hyunwook Koh
- Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, 615 North Wolfe Street, Office E3622, Baltimore, MD, 21205, USA
| | - Ni Zhao
- Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, 615 North Wolfe Street, Office E3622, Baltimore, MD, 21205, USA.
32
|
Maziarz M, Pfeiffer RM, Wan Y, Gail MH. Using standard microbiome reference groups to simplify beta-diversity analyses and facilitate independent validation. Bioinformatics 2019; 34:3249-3257. [PMID: 29668831 DOI: 10.1093/bioinformatics/bty297] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 09/29/2017] [Accepted: 04/11/2018] [Indexed: 11/13/2022] Open
Abstract
Motivation: Comparisons of microbiome communities across populations are often based on pairwise distance measures (beta-diversity). Standard analyses (principal coordinate plots, permutation tests, kernel methods) require access to primary data if another investigator wants to add or compare independent data. We propose using standard reference measurements to simplify microbiome beta-diversity analyses, to make them more transparent, and to facilitate independent validation and comparisons across studies. Results: Using stool and nasal reference sets from the Human Microbiome Project (HMP), we computed mean distances (actually Bray-Curtis or Pearson correlation dissimilarities) to each reference set for each new sample. Thus, each new sample has two mean distances that can be plotted and analyzed with classical statistical methods. To test the approach, we studied independent (not reference) HMP subjects. Simple Hotelling tests demonstrated statistically significant differences in mean distances to reference sets between all pairs of body sites (stool, skin, nasal, saliva and vagina) at the phylum, class, order, family and genus levels. Using the distance to a single reference set was usually sufficient, but using both reference sets always worked well. The use of reference sets simplifies standard analyses of beta-diversity and facilitates the independent validation and combining of such data because others can compute distances to the same reference sets. Moreover, standard statistical methods for survival analysis, logistic regression and other procedures can be applied to vectors of mean distances to reference sets, thereby greatly expanding the potential uses of beta-diversity information. More work is needed to identify the best reference sets for particular applications. Availability and implementation: https://github.com/NCI-biostats/microbiome-fixed-reference. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Marlena Maziarz
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, USA
| | - Ruth M Pfeiffer
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, USA
| | - Yunhu Wan
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, USA
| | - Mitchell H Gail
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, USA
33
|
Peters BA, Wilson M, Moran U, Pavlick A, Izsak A, Wechter T, Weber JS, Osman I, Ahn J. Relating the gut metagenome and metatranscriptome to immunotherapy responses in melanoma patients. Genome Med 2019; 11:61. [PMID: 31597568 PMCID: PMC6785875 DOI: 10.1186/s13073-019-0672-4] [Citation(s) in RCA: 148] [Impact Index Per Article: 24.7] [Received: 05/23/2019] [Accepted: 09/12/2019] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND: Recent evidence suggests that immunotherapy efficacy in melanoma is modulated by gut microbiota. Few studies have examined this phenomenon in humans, and none have incorporated metatranscriptomics, important for determining expression of metagenomic functions in the microbial community. METHODS: In melanoma patients undergoing immunotherapy, gut microbiome was characterized in pre-treatment stool using 16S rRNA gene and shotgun metagenome sequencing (n = 27). Transcriptional expression of metagenomic pathways was confirmed with metatranscriptome sequencing in a subset of 17. We examined associations of taxa and metagenomic pathways with progression-free survival (PFS) using 500 × 10-fold cross-validated elastic-net penalized Cox regression. RESULTS: Higher microbial community richness was associated with longer PFS in 16S and shotgun data (p < 0.05). Clustering based on overall microbiome composition divided patients into three groups with differing PFS; the low-risk group had 99% lower risk of progression than the high-risk group at any time during follow-up (p = 0.002). Among the species selected in regression, abundance of Bacteroides ovatus, Bacteroides dorei, Bacteroides massiliensis, Ruminococcus gnavus, and Blautia producta were related to shorter PFS, and Faecalibacterium prausnitzii, Coprococcus eutactus, Prevotella stercorea, Streptococcus sanguinis, Streptococcus anginosus, and Lachnospiraceae bacterium 3_1_46FAA to longer PFS. Metagenomic functions related to PFS that had correlated metatranscriptomic expression included risk-associated pathways of L-rhamnose degradation, guanosine nucleotide biosynthesis, and B vitamin biosynthesis. CONCLUSIONS: This work adds to the growing evidence that gut microbiota are related to immunotherapy outcomes, and identifies, for the first time, transcriptionally expressed metagenomic pathways related to PFS. Further research is warranted on microbial therapeutic targets to improve immunotherapy outcomes.
Affiliation(s)
- Brandilyn A Peters
- Department of Population Health, NYU School of Medicine, New York, NY, 10016, USA
| | - Melissa Wilson
- Department of Medicine, NYU School of Medicine, New York, NY, USA
- NYU Perlmutter Cancer Center, New York, NY, USA
- Present Address: Sidney Kimmel Cancer Center, Thomas Jefferson University, Philadelphia, PA, USA
| | - Una Moran
- NYU Perlmutter Cancer Center, New York, NY, USA
- The Ronald O. Perelman Department of Dermatology, NYU School of Medicine, New York, NY, USA
| | - Anna Pavlick
- Department of Medicine, NYU School of Medicine, New York, NY, USA
- NYU Perlmutter Cancer Center, New York, NY, USA
| | - Allison Izsak
- The Ronald O. Perelman Department of Dermatology, NYU School of Medicine, New York, NY, USA
| | - Todd Wechter
- The Ronald O. Perelman Department of Dermatology, NYU School of Medicine, New York, NY, USA
| | - Jeffrey S Weber
- Department of Medicine, NYU School of Medicine, New York, NY, USA
- NYU Perlmutter Cancer Center, New York, NY, USA
| | - Iman Osman
- Department of Medicine, NYU School of Medicine, New York, NY, USA
- NYU Perlmutter Cancer Center, New York, NY, USA
- The Ronald O. Perelman Department of Dermatology, NYU School of Medicine, New York, NY, USA
| | - Jiyoung Ahn
- Department of Population Health, NYU School of Medicine, New York, NY, 10016, USA.
- NYU Perlmutter Cancer Center, New York, NY, USA.
34
|
Plantinga AM, Chen J, Jenq RR, Wu MC. pldist: ecological dissimilarities for paired and longitudinal microbiome association analysis. Bioinformatics 2019; 35:3567-3575. [PMID: 30863868 PMCID: PMC6761933 DOI: 10.1093/bioinformatics/btz120] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 05/29/2018] [Revised: 01/27/2019] [Accepted: 02/13/2019] [Indexed: 01/12/2023] Open
Abstract
MOTIVATION: The human microbiome is notoriously variable across individuals, with a wide range of 'healthy' microbiomes. Paired and longitudinal studies of the microbiome have become increasingly popular as a way to reduce unmeasured confounding and to increase statistical power by reducing large inter-subject variability. Statistical methods for analyzing such datasets are scarce. RESULTS: We introduce a paired UniFrac dissimilarity that summarizes within-individual (or within-pair) shifts in microbiome composition and then compares these compositional shifts across individuals (or pairs). This dissimilarity depends on a novel transformation of relative abundances, which we then extend to more than two time points and incorporate into several phylogenetic and non-phylogenetic dissimilarities. The data transformation and resulting dissimilarities may be used in a wide variety of downstream analyses, including ordination analysis and distance-based hypothesis testing. Simulations demonstrate that tests based on these dissimilarities retain appropriate type 1 error and high power. We apply the method in two real datasets. AVAILABILITY AND IMPLEMENTATION: The R package pldist is available on GitHub at https://github.com/aplantin/pldist. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Anna M Plantinga
- Department of Mathematics and Statistics, Williams College, Williamstown, MA, USA. To whom correspondence should be addressed.
| | - Jun Chen
- Department of Health Sciences Research, Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, MN, USA; Microbiome Program, Center for Individualized Medicine, Mayo Clinic, Rochester, MN, USA
| | - Robert R Jenq
- Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA; Department of Stem Cell Transplantation, Division of Cancer Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Michael C Wu
- Department of Biostatistics and Biomathematics Program, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA; Department of Biostatistics, University of Washington, Seattle, WA, USA. To whom correspondence should be addressed.
35
|
Koh H, Li Y, Zhan X, Chen J, Zhao N. A Distance-Based Kernel Association Test Based on the Generalized Linear Mixed Model for Correlated Microbiome Studies. Front Genet 2019; 10:458. [PMID: 31156711 PMCID: PMC6532659 DOI: 10.3389/fgene.2019.00458] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 02/08/2019] [Accepted: 04/30/2019] [Indexed: 12/12/2022] Open
Abstract
Researchers have increasingly employed family-based or longitudinal study designs to survey the roles of the human microbiota on diverse host traits of interest (e.g., health/disease status, medical intervention, behavioral/environmental factor). Such study designs are useful to properly control for potential confounders or the sensitive changes in microbial composition and host traits. However, downstream data analysis is challenging because the measurements within clusters (e.g., families, subjects including repeated measures) tend to be correlated so that statistical methods based on the independence assumption cannot be used. For the correlated microbiome studies, a distance-based kernel association test based on the linear mixed model, namely, correlated sequence kernel association test (cSKAT), has recently been introduced. cSKAT models the microbial community using an ecological distance (e.g., Jaccard/Bray-Curtis dissimilarity, unique fraction distance), and then tests its association with a host trait. Similar to prior distance-based kernel association tests (e.g., microbiome regression-based kernel association test), the use of ecological distances gives a high power to cSKAT. However, cSKAT is limited to handling Gaussian traits [e.g., body mass index (BMI)] and a single chosen distance measure at a time. The power of cSKAT differs substantially depending on which distance measure is used. However, choosing an optimal distance measure is challenging because of the unknown nature of the true association. Here, we introduce a distance-based kernel association test based on the generalized linear mixed model (GLMM), namely, GLMM-MiRKAT, to handle diverse types of traits, such as Gaussian (e.g., BMI), Binomial (e.g., disease status, treatment/placebo) or Poisson (e.g., number of tumors/treatments) traits. We further propose a data-driven adaptive test of GLMM-MiRKAT, namely, aGLMM-MiRKAT, so as to avoid the need to choose the optimal distance measure.
Our extensive simulations demonstrate that aGLMM-MiRKAT is robustly powerful while correctly controlling type I error rates. We apply aGLMM-MiRKAT to real familial and longitudinal microbiome data, where we discover significant disparity in microbial community composition by BMI status and the frequency of antibiotic use. In summary, aGLMM-MiRKAT is a useful analytical tool with its broad applicability to diverse types of traits, robust power and valid statistical inference.
Affiliation(s)
- Hyunwook Koh
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, United States
| | - Yutong Li
- School of Physics, Peking University, Beijing, China
| | - Xiang Zhan
- Department of Public Health Sciences, Pennsylvania State University, Hershey, PA, United States
| | - Jun Chen
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
| | - Ni Zhao
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, United States
| |
Collapse
|
36
|
Banerjee K, Zhao N, Srinivasan A, Xue L, Hicks SD, Middleton FA, Wu R, Zhan X. An Adaptive Multivariate Two-Sample Test With Application to Microbiome Differential Abundance Analysis. Front Genet 2019; 10:350. [PMID: 31068967 PMCID: PMC6491633 DOI: 10.3389/fgene.2019.00350] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 01/08/2019] [Accepted: 04/01/2019] [Indexed: 01/21/2023] Open
Abstract
Differential abundance analysis is a crucial task in many microbiome studies, where the central goal is to identify microbiome taxa associated with certain biological or clinical conditions. There are two different modes of microbiome differential abundance analysis: the individual-based univariate differential abundance analysis and the group-based multivariate differential abundance analysis. The univariate analysis identifies differentially abundant microbiome taxa subject to multiple-testing correction under certain statistical error measures such as the false discovery rate, which is typically complicated by the high dimensionality of taxa and the complex correlation structure among taxa. The multivariate analysis evaluates the overall shift in the abundance of microbiome composition between two conditions, which provides useful preliminary information on whether follow-up validation studies are needed. In this paper, we present a novel Adaptive multivariate two-sample test for Microbiome Differential Analysis (AMDA) to examine whether the composition of a taxa-set is different between two conditions. Our simulation studies and real data applications demonstrated that the AMDA test was often more powerful than several competing methods while preserving the correct type I error rate. A free implementation of our AMDA method in R software is available at https://github.com/xyz5074/AMDA.
Affiliation(s)
- Kalins Banerjee: Department of Public Health Sciences, Pennsylvania State University, Hershey, PA, United States
- Ni Zhao: Department of Biostatistics, Johns Hopkins University, Baltimore, MD, United States
- Arun Srinivasan: Department of Statistics, Pennsylvania State University, University Park, PA, United States
- Lingzhou Xue: Department of Statistics, Pennsylvania State University, University Park, PA, United States
- Steven D. Hicks: Department of Pediatrics, Pennsylvania State University, Hershey, PA, United States
- Frank A. Middleton: Department of Neuroscience, State University of New York Upstate Medical University, Syracuse, NY, United States
- Rongling Wu: Department of Public Health Sciences, Pennsylvania State University, Hershey, PA, United States
- Xiang Zhan (correspondence): Department of Public Health Sciences, Pennsylvania State University, Hershey, PA, United States
|
37
|
Peters BA, Hayes RB, Goparaju C, Reid C, Pass HI, Ahn J. The Microbiome in Lung Cancer Tissue and Recurrence-Free Survival. Cancer Epidemiol Biomarkers Prev 2019; 28:731-740. [PMID: 30733306 DOI: 10.1158/1055-9965.epi-18-0966] [Citation(s) in RCA: 115] [Impact Index Per Article: 19.2] [Received: 08/30/2018] [Revised: 11/05/2018] [Accepted: 01/28/2019] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND Human microbiota have many functions that could contribute to cancer initiation and/or progression at local sites, yet the relation of the lung microbiota to lung cancer prognosis has not been studied. METHODS In a pilot study, 16S rRNA gene sequencing was performed on paired lung tumor and remote normal samples from the same lobe/segment in 19 patients with non-small cell lung cancer (NSCLC). We explored associations of tumor or normal tissue microbiome diversity and composition with recurrence-free (RFS) and disease-free survival (DFS), and compared microbiome diversity and composition between paired tumor and normal samples. RESULTS Higher richness and diversity in normal tissue were associated with reduced RFS (richness P = 0.08, Shannon index P = 0.03) and DFS (richness P = 0.03, Shannon index P = 0.02), as was normal tissue overall microbiome composition (Bray-Curtis P = 0.09 for RFS and P = 0.02 for DFS). In normal tissue, greater abundance of family Koribacteraceae was associated with increased RFS and DFS, whereas greater abundance of families Bacteroidaceae, Lachnospiraceae, and Ruminococcaceae were associated with reduced RFS or DFS (P < 0.05). Tumor tissue diversity and overall composition were not associated with RFS or DFS. Tumor tissue had lower richness and diversity (P ≤ 0.0001) than paired normal tissue, though overall microbiome composition did not differ between the paired samples. CONCLUSIONS We demonstrate, for the first time, a potential relationship between the normal lung microbiota and lung cancer prognosis, which requires confirmation in a larger study. IMPACT Definition of bacterial biomarkers of prognosis may lead to improved survival outcomes for patients with lung cancer.
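For readers unfamiliar with the two diversity measures reported in this abstract, the following minimal sketch shows how richness and the Shannon index are conventionally computed from a vector of taxon counts. The function names and the example counts are hypothetical, and the natural logarithm is assumed; the study's own pipeline and any rarefaction steps are not reproduced here.

```python
import math


def richness(counts):
    """Number of taxa observed at least once in a sample."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with p_i > 0."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical counts for one sample across four taxa.
example_counts = [5, 0, 3, 2]
example_richness = richness(example_counts)
example_shannon = shannon_index(example_counts)
```

Richness counts taxa regardless of abundance, whereas the Shannon index also rewards evenness, so the two can rank samples differently.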
Affiliation(s)
- Brandilyn A Peters: Department of Population Health, NYU School of Medicine, New York, New York
- Richard B Hayes: Department of Population Health, NYU School of Medicine, New York, New York; NYU Perlmutter Cancer Center, New York, New York
- Chandra Goparaju: Department of Cardiothoracic Surgery, NYU School of Medicine, New York, New York
- Christopher Reid: Department of Cardiothoracic Surgery, NYU School of Medicine, New York, New York
- Harvey I Pass: NYU Perlmutter Cancer Center, New York, New York; Department of Cardiothoracic Surgery, NYU School of Medicine, New York, New York
- Jiyoung Ahn: Department of Population Health, NYU School of Medicine, New York, New York; NYU Perlmutter Cancer Center, New York, New York
|
38
|
Relationship Between MiRKAT and Coefficient of Determination in Similarity Matrix Regression. Processes (Basel) 2019. [DOI: 10.3390/pr7020079] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 01/27/2023] Open
Abstract
The Microbiome Regression-based Kernel Association Test (MiRKAT) is widely used in testing for the association between microbiome compositions and an outcome of interest. The MiRKAT statistic is derived as a variance-component score test in a kernel machine regression-based generalized linear mixed model. In this brief report, we show that the MiRKAT statistic is proportional to the R² (coefficient of determination) statistic in a similarity matrix regression, which characterizes the fraction of variability in outcome similarity, explained by microbiome similarity (up to a constant).
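The variance-component score statistic referred to here can be sketched for the simplest case: a continuous outcome with an intercept-only null model, where Q = rᵀKr / (2σ̂²) and r is the vector of null-model residuals. The function name is hypothetical, the scaling constant varies between presentations, and the p-value computation is omitted; the sketch does not reproduce the report's similarity matrix regression.

```python
def mirkat_statistic(y, K):
    """Score-type statistic Q = r' K r / (2 * sigma2) for a continuous
    outcome y and an n x n kernel K, under an intercept-only null model."""
    n = len(y)
    mean_y = sum(y) / n
    r = [yi - mean_y for yi in y]                 # null-model residuals
    sigma2 = sum(ri * ri for ri in r) / (n - 1)   # residual variance estimate
    quad = sum(r[i] * K[i][j] * r[j] for i in range(n) for j in range(n))
    return quad / (2.0 * sigma2)


# Hypothetical check with an identity kernel: r'r = (n - 1) * sigma2,
# so Q reduces to (n - 1) / 2.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
q_identity = mirkat_statistic([1.0, 2.0, 3.0, 4.0], identity)
```

Because Q is a quadratic form in the residuals weighted by K, it is large when samples with similar microbiome profiles also have similar residual outcomes.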
|
39
|
Gendo Y, Matsumoto T, Kamiyama N, Saechue B, Fukuda C, Dewayani A, Hidano S, Noguchi K, Sonoda A, Ozaki T, Sachi N, Hirose H, Ozaka S, Eshita Y, Mizukami K, Okimoto T, Kodama M, Yoshimatsu T, Nishida H, Daa T, Yamaoka Y, Murakami K, Kobayashi T. Dysbiosis of the Gut Microbiota on the Inflammatory Background due to Lack of Suppressor of Cytokine Signalling-1 in Mice. Inflamm Intest Dis 2019; 3:145-154. [PMID: 30820436 DOI: 10.1159/000495462] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 04/06/2018] [Accepted: 11/13/2018] [Indexed: 11/19/2022] Open
Abstract
Background Both environmental and genetic factors have been implicated in the induction of autoimmune disease. Therefore, it is important to understand the pathophysiological significance of the gut microbiota and host genetic background that contribute to an autoimmune disease such as inflammatory bowel disease (IBD). We have previously reported that mice deficient for suppressor of cytokine signaling-1 (SOCS1), in which SOCS1 expression was restored in T and B cells on an SOCS1-/- background (SOCS1-/-Tg mice), developed systemic autoimmune diseases accompanied by spontaneous colitis. Methods To investigate whether the proinflammatory genetic background affects the gut microbiota, we used SOCS1-/-Tg mice as a model of spontaneous chronic colitis. Fecal samples were collected from SOCS1-/-Tg mice and SOCS1+/+Tg (control) mice at 1 and 6 months of age, and the fecal bacterial 16S ribosomal RNA genes were sequenced using the Illumina MiSeq platform. Results Gut microbial diversity was significantly reduced and the intestinal bacterial community composition changed in SOCS1-/-Tg mice in comparison with the control mice. Interestingly, the population of Prevotella species, which is known to be elevated in ulcerative colitis and colorectal cancer patients, was significantly increased in SOCS1-/-Tg mice regardless of age. Conclusion Taken together, these results suggest that the proinflammatory genetic background owing to SOCS1 deficiency causes dysbiosis of the gut microbiota, which in turn generates a procolitogenic environment.
Affiliation(s)
- Yoshiko Gendo: Department of Infectious Disease Control, Faculty of Medicine, Oita University, Yufu, Japan; Department of Gastroenterology, Faculty of Medicine, Oita University, Yufu, Japan
- Takashi Matsumoto: Department of Environmental and Preventive Medicine, Faculty of Medicine, Oita University, Yufu, Japan
- Naganori Kamiyama: Department of Infectious Disease Control, Faculty of Medicine, Oita University, Yufu, Japan
- Benjawan Saechue: Department of Infectious Disease Control, Faculty of Medicine, Oita University, Yufu, Japan
- Ciaki Fukuda: Department of Infectious Disease Control, Faculty of Medicine, Oita University, Yufu, Japan
- Astri Dewayani: Department of Infectious Disease Control, Faculty of Medicine, Oita University, Yufu, Japan
- Shinya Hidano: Department of Infectious Disease Control, Faculty of Medicine, Oita University, Yufu, Japan
- Kaori Noguchi: Department of Infectious Disease Control, Faculty of Medicine, Oita University, Yufu, Japan
- Akira Sonoda: Department of Infectious Disease Control, Faculty of Medicine, Oita University, Yufu, Japan; Department of Gastroenterology, Faculty of Medicine, Oita University, Yufu, Japan
- Takashi Ozaki: Department of Infectious Disease Control, Faculty of Medicine, Oita University, Yufu, Japan
- Nozomi Sachi: Department of Infectious Disease Control, Faculty of Medicine, Oita University, Yufu, Japan
- Haruna Hirose: Department of Infectious Disease Control, Faculty of Medicine, Oita University, Yufu, Japan
- Sotaro Ozaka: Department of Infectious Disease Control, Faculty of Medicine, Oita University, Yufu, Japan
- Yuki Eshita: Department of Infectious Disease Control, Faculty of Medicine, Oita University, Yufu, Japan
- Kazuhiro Mizukami: Department of Gastroenterology, Faculty of Medicine, Oita University, Yufu, Japan
- Tadayoshi Okimoto: Department of Gastroenterology, Faculty of Medicine, Oita University, Yufu, Japan
- Masaaki Kodama: Department of Gastroenterology, Faculty of Medicine, Oita University, Yufu, Japan
- Tomoko Yoshimatsu: Department of Diagnostic Pathology, Faculty of Medicine, Oita University, Yufu, Japan
- Haruto Nishida: Department of Diagnostic Pathology, Faculty of Medicine, Oita University, Yufu, Japan
- Tsutomu Daa: Department of Diagnostic Pathology, Faculty of Medicine, Oita University, Yufu, Japan
- Yoshio Yamaoka: Department of Environmental and Preventive Medicine, Faculty of Medicine, Oita University, Yufu, Japan
- Kazunari Murakami: Department of Gastroenterology, Faculty of Medicine, Oita University, Yufu, Japan
- Takashi Kobayashi: Department of Infectious Disease Control, Faculty of Medicine, Oita University, Yufu, Japan
|
40
|
Larson NB, Chen J, Schaid DJ. A review of kernel methods for genetic association studies. Genet Epidemiol 2019; 43:122-136. [PMID: 30604442 DOI: 10.1002/gepi.22180] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 08/27/2018] [Revised: 11/09/2018] [Accepted: 11/26/2018] [Indexed: 12/17/2022]
Abstract
Evaluating the association of multiple genetic variants with a trait of interest by use of kernel-based methods has made a significant impact on how genetic association analyses are conducted. An advantage of kernel methods is that they tend to be robust when the genetic variants have effects that are a mixture of positive and negative effects, as well as when there is a small fraction of causal variants. Another advantage is that kernel methods fit within the framework of mixed models, providing flexible ways to adjust for additional covariates that influence traits. Herein, we review the basic ideas behind the use of kernel methods for genetic association analysis as well as recent methodological advancements for different types of traits, multivariate traits, pedigree data, and longitudinal data. Finally, we discuss opportunities for future research.
Affiliation(s)
- Nicholas B Larson: Department of Health Sciences Research, Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, Minnesota
- Jun Chen: Department of Health Sciences Research, Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, Minnesota
- Daniel J Schaid: Department of Health Sciences Research, Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, Minnesota
|
41
|
An adaptive microbiome α-diversity-based association analysis method. Sci Rep 2018; 8:18026. [PMID: 30575793 PMCID: PMC6303306 DOI: 10.1038/s41598-018-36355-7] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 08/11/2018] [Accepted: 11/19/2018] [Indexed: 12/12/2022] Open
Abstract
To relate microbial diversity with various host traits of interest (e.g., phenotypes, clinical interventions, environmental factors) is a critical step for generic assessments about the disparity in human microbiota among different populations. The performance of the current item-by-item α-diversity-based association tests is sensitive to the choice of α-diversity metric and unpredictable due to the unknown nature of the true association. The approach of cherry-picking a test for the smallest p-value or the largest effect size among multiple item-by-item analyses is not even statistically valid due to the inherent multiplicity issue. Investigators have recently introduced microbial community-level association tests while blustering statistical power increase of their proposed methods. However, they are purely a test for significance which does not provide any estimation facilities on the effect direction and size of a microbial community; hence, they are not in practical use. Here, I introduce a novel microbial diversity association test, namely, adaptive microbiome α-diversity-based association analysis (aMiAD). aMiAD simultaneously tests the significance and estimates the effect score of the microbial diversity on a host trait, while robustly maintaining high statistical power and accurate estimation with no issues in validity.
|
42
|
Zhan X, Xue L, Zheng H, Plantinga A, Wu MC, Schaid DJ, Zhao N, Chen J. A small‐sample kernel association test for correlated data with application to microbiome association studies. Genet Epidemiol 2018; 42:772-782. [DOI: 10.1002/gepi.22160] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 04/30/2018] [Revised: 06/27/2018] [Accepted: 07/15/2018] [Indexed: 01/11/2023]
Affiliation(s)
- Xiang Zhan: Department of Public Health Sciences, Pennsylvania State University, Hershey, Pennsylvania
- Lingzhou Xue: Department of Statistics, Pennsylvania State University, University Park, Pennsylvania
- Haotian Zheng: Department of Mathematical Sciences, Tsinghua University, Beijing, China
- Anna Plantinga: Department of Biostatistics, University of Washington, Seattle, Washington
- Michael C. Wu: Department of Biostatistics, University of Washington, Seattle, Washington; Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington
- Daniel J. Schaid: Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, Minnesota
- Ni Zhao: Department of Biostatistics, Johns Hopkins University, Baltimore, Maryland
- Jun Chen: Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, Minnesota; Center for Individualized Medicine, Mayo Clinic, Rochester, Minnesota
|
43
|
Koh H, Livanos AE, Blaser MJ, Li H. A highly adaptive microbiome-based association test for survival traits. BMC Genomics 2018; 19:210. [PMID: 29558893 PMCID: PMC5859547 DOI: 10.1186/s12864-018-4599-8] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 08/09/2017] [Accepted: 03/13/2018] [Indexed: 01/15/2023] Open
Abstract
BACKGROUND There has been increasing interest in discovering microbial taxa that are associated with human health or disease, gathering momentum through the advances in next-generation sequencing technologies. Investigators have also increasingly employed prospective study designs to survey survival (i.e., time-to-event) outcomes, but current item-by-item statistical methods have limitations due to the unknown true association pattern. Here, we propose a new adaptive microbiome-based association test for survival outcomes, namely, optimal microbiome-based survival analysis (OMiSA). OMiSA approximates to the most powerful association test in two domains: 1) microbiome-based survival analysis using linear and non-linear bases of OTUs (MiSALN) which weighs rare, mid-abundant, and abundant OTUs, respectively, and 2) microbiome regression-based kernel association test for survival traits (MiRKAT-S) which incorporates different distance metrics (e.g., unique fraction (UniFrac) distance and Bray-Curtis dissimilarity), respectively. RESULTS We illustrate that OMiSA powerfully discovers microbial taxa whether their underlying associated lineages are rare or abundant and phylogenetically related or not. OMiSA is a semi-parametric method based on a variance-component score test and a re-sampling method; hence, it is free from any distributional assumption on the effect of microbial composition and advantageous to robustly control type I error rates. Our extensive simulations demonstrate the highly robust performance of OMiSA. We also present the use of OMiSA with real data applications. CONCLUSIONS OMiSA is attractive in practice as the true association pattern is unpredictable in advance and, for survival outcomes, no adaptive microbiome-based association test is currently available.
Affiliation(s)
- Hyunwook Koh: Department of Population Health, New York University School of Medicine, 650 First Avenue, Room 547, New York, NY 10016 USA
- Alexandra E. Livanos: Department of Medicine, Columbia University Medical Center, New York, NY 10032 USA
- Martin J. Blaser: Departments of Medicine and Microbiology, New York University School of Medicine, New York, NY 10016 USA; Medical Service, New York Harbor Department of Veterans Affairs Medical Center, New York, NY 10010 USA
- Huilin Li: Department of Population Health, New York University School of Medicine, 650 First Avenue, Room 547, New York, NY 10016 USA
|
44
|
Zhan X, Plantinga A, Zhao N, Wu MC. A fast small-sample kernel independence test for microbiome community-level association analysis. Biometrics 2017; 73:1453-1463. [PMID: 28295177 DOI: 10.1111/biom.12684] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 04/01/2016] [Revised: 02/01/2017] [Accepted: 02/01/2017] [Indexed: 12/13/2022]
Abstract
To fully understand the role of microbiome in human health and diseases, researchers are increasingly interested in assessing the relationship between microbiome composition and host genomic data. The dimensionality of the data as well as complex relationships between microbiota and host genomics pose considerable challenges for analysis. In this article, we apply a kernel RV coefficient (KRV) test to evaluate the overall association between host gene expression and microbiome composition. The KRV statistic can capture nonlinear correlations and complex relationships among the individual data types and between gene expression and microbiome composition through measuring general dependency. Testing proceeds via a similar route as existing tests of the generalized RV coefficients and allows for rapid p-value calculation. Strategies to allow adjustment for confounding effects, which is crucial for avoiding misleading results, and to alleviate the problem of selecting the most favorable kernel are considered. Simulation studies show that KRV is useful in testing statistical independence with finite samples given the kernels are appropriately chosen, and can powerfully identify existing associations between microbiome composition and host genomic data while protecting type I error. We apply the KRV to a microbiome study examining the relationship between host transcriptome and microbiome composition within the context of inflammatory bowel disease and are able to derive new biological insights and provide formal inference on prior qualitative observations.
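The kernel RV coefficient named in this abstract can be sketched as KRV(K, L) = tr(K̃L̃) / sqrt(tr(K̃²) tr(L̃²)), where K̃ = HKH and L̃ = HLH are the doubly centered kernels and H = I - 11ᵀ/n. The code below is a minimal illustration under that definition; the function names and the toy linear kernels are hypothetical, and the p-value approximation and covariate adjustment described in the abstract are not included.

```python
import math


def center(K):
    """Double centering: returns H K H with H = I - 11'/n."""
    n = len(K)
    row = [sum(K[i]) / n for i in range(n)]
    col = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    tot = sum(row) / n
    return [[K[i][j] - row[i] - col[j] + tot for j in range(n)]
            for i in range(n)]


def trace_prod(A, B):
    """tr(AB) computed without forming the matrix product."""
    n = len(A)
    return sum(A[i][j] * B[j][i] for i in range(n) for j in range(n))


def krv(K, L):
    """Kernel RV coefficient; assumes neither centered kernel is all zero."""
    Kc, Lc = center(K), center(L)
    return trace_prod(Kc, Lc) / math.sqrt(
        trace_prod(Kc, Kc) * trace_prod(Lc, Lc))


def linear_kernel(x):
    """Linear kernel x x' for a single feature vector (toy example)."""
    return [[xi * xj for xj in x] for xi in x]


# Hypothetical toy data: a rescaled feature gives a coefficient of 1,
# an orthogonal feature gives 0.
krv_same = krv(linear_kernel([1.0, 2.0, 3.0]), linear_kernel([2.0, 4.0, 6.0]))
krv_orth = krv(linear_kernel([-1.0, 0.0, 1.0]), linear_kernel([1.0, -2.0, 1.0]))
```

In an application, K would be built from one data type (for example host gene expression) and L from the other (for example a microbiome distance), with the kernel choice affecting power as the abstract notes.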
Affiliation(s)
- Xiang Zhan: Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, U.S.A
- Anna Plantinga: Department of Biostatistics, University of Washington, Seattle, Washington 98195, U.S.A
- Ni Zhao: Department of Biostatistics, Johns Hopkins University, Baltimore, Maryland 21205, U.S.A
- Michael C Wu: Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, U.S.A
|