1
Zhao Y. Mediation Analysis with Multiple Exposures and Multiple Mediators. Stat Med 2024; 43:4887-4898. [PMID: 39250913 DOI: 10.1002/sim.10215] [Received: 01/10/2023] [Revised: 04/25/2024] [Accepted: 08/23/2024] [Indexed: 09/11/2024]
Abstract
A mediation analysis approach is proposed for multiple exposures, multiple mediators, and a continuous scalar outcome under the linear structural equation modeling framework. It assumes that there exist orthogonal components that demonstrate parallel mediation mechanisms on the outcome, and thus is named principal component mediation analysis (PCMA). Likelihood-based estimators are introduced for simultaneous estimation of the component projections and effect parameters. The asymptotic distribution of the estimators is derived for low-dimensional data. A bootstrap procedure is introduced for inference. Simulation studies illustrate the superior performance of the proposed approach. Applied to a proteomics-imaging dataset from the Alzheimer's disease neuroimaging initiative (ADNI), the proposed framework identifies protein deposition - brain atrophy - memory deficit mechanisms consistent with existing knowledge and suggests potential AD pathology by integrating data collected from different modalities.
Affiliation(s)
- Yi Zhao
- Department of Biostatistics and Health Data Science, Indiana University School of Medicine, Indianapolis, Indiana
2
Xu Z, Li C, Chi S, Yang T, Wei P. Speeding up interval estimation for R²-based mediation effect of high-dimensional mediators via cross-fitting. bioRxiv 2024:2023.02.06.527391. [PMID: 36798366 PMCID: PMC9934518 DOI: 10.1101/2023.02.06.527391] [Indexed: 02/10/2023]
Abstract
Mediation analysis is a useful tool in investigating how molecular phenotypes such as gene expression mediate the effect of exposure on health outcomes. However, commonly used mean-based total mediation effect measures may suffer from cancellation of component-wise mediation effects in opposite directions in the presence of high-dimensional omics mediators. To overcome this limitation, we recently proposed a variance-based R-squared total mediation effect measure that relies on the computationally intensive nonparametric bootstrap for confidence interval estimation. In the work described herein, we formulated a more efficient two-stage, cross-fitted estimation procedure for the R² measure. To avoid potential bias, we performed iterative Sure Independence Screening (iSIS) in two subsamples to exclude the non-mediators, followed by ordinary least squares regressions for the variance estimation. We then constructed confidence intervals based on the newly derived closed-form asymptotic distribution of the R² measure. Extensive simulation studies demonstrated that this proposed procedure is much more computationally efficient than the resampling-based method, with comparable coverage probability. Furthermore, when applied to the Framingham Heart Study, the proposed method replicated the established finding of gene expression mediating age-related variation in systolic blood pressure and identified the role of gene expression profiles in the relationship between sex and high-density lipoprotein cholesterol level. The proposed estimation procedure is implemented in R package CFR2M.
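The two-stage, cross-fitted idea in the abstract above can be illustrated with a minimal Python sketch. This is not the CFR2M implementation: simple marginal-correlation screening stands in for iSIS, only a single R² of the outcome on the screened variables is estimated (not the full mediation measure), and the helper names `screen_top_k`, `r_squared`, and `cross_fitted_r2` are hypothetical.

```python
import numpy as np

def screen_top_k(M, y, k):
    """Indices of the k columns of M with the largest absolute correlation with y.
    (Assumption: a simple stand-in for iSIS, not the screening used in the paper.)"""
    Mc = M - M.mean(axis=0)
    yc = y - y.mean()
    corr = (Mc.T @ yc) / (np.linalg.norm(Mc, axis=0) * np.linalg.norm(yc))
    return np.sort(np.argsort(-np.abs(corr))[:k])

def r_squared(M, y):
    """R^2 from an ordinary least squares fit of y on M with an intercept."""
    design = np.column_stack([np.ones(len(y)), M])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - resid.var() / y.var()

def cross_fitted_r2(M, y, k, rng):
    """Screen on one half, estimate R^2 on the other half, swap roles, average."""
    idx = rng.permutation(len(y))
    half_a, half_b = idx[: len(y) // 2], idx[len(y) // 2:]
    estimates = []
    for select, estimate in ((half_a, half_b), (half_b, half_a)):
        keep = screen_top_k(M[select], y[select], k)
        estimates.append(r_squared(M[estimate][:, keep], y[estimate]))
    return float(np.mean(estimates))

# Toy data: only the first 3 of 50 candidate variables affect the outcome.
rng = np.random.default_rng(0)
n, p = 400, 50
M = rng.normal(size=(n, p))
y = M[:, :3].sum(axis=1) + rng.normal(size=n)
r2 = cross_fitted_r2(M, y, k=3, rng=rng)
```

Selecting variables and estimating variance components on disjoint subsamples keeps the selection step from inflating the estimate, which is the motivation the abstract gives for splitting the data.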
Affiliation(s)
- Zhichao Xu
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, U.S.A.
- Chunlin Li
- Department of Statistics, Iowa State University, Ames, Iowa 50011, U.S.A.
- Sunyi Chi
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, U.S.A.
- Tianzhong Yang
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota 55455, U.S.A.
- Peng Wei
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, U.S.A.
3
Goodrich JA, Wang H, Jia Q, Stratakis N, Zhao Y, Maitre L, Bustamante M, Vafeiadi M, Aung M, Andrušaitytė S, Basagana X, Farzan SF, Heude B, Keun H, McConnell R, Yang TC, Siskos AP, Urquiza J, Valvi D, Varo N, Småstuen Haug L, Oftedal BM, Gražulevičienė R, Philippat C, Wright J, Vrijheid M, Chatzi L, Conti DV. Integrating Multi-Omics with environmental data for precision health: A novel analytic framework and case study on prenatal mercury induced childhood fatty liver disease. Environment International 2024; 190:108930. [PMID: 39128376 PMCID: PMC11620538 DOI: 10.1016/j.envint.2024.108930] [Received: 11/16/2023] [Revised: 06/24/2024] [Accepted: 07/31/2024] [Indexed: 08/13/2024]
Abstract
BACKGROUND: Precision Health aims to revolutionize disease prevention by leveraging information across multiple omic datasets (multi-omics). However, existing methods generally do not consider personalized environmental risk factors (e.g., environmental pollutants).
OBJECTIVE: To develop and apply a precision health framework which combines multiomic integration (including early, intermediate, and late integration, representing sequential stages at which omics layers are combined for modeling) with mediation approaches (including high-dimensional mediation to identify biomarkers, mediation with latent factors to identify pathways, and integrated/quasi-mediation to identify high-risk subpopulations) to identify novel biomarkers of prenatal mercury induced metabolic dysfunction-associated fatty liver disease (MAFLD), elucidate molecular pathways linking prenatal mercury with MAFLD in children, and identify high-risk children based on integrated exposure and multiomics data.
METHODS: This prospective cohort study used data from 420 mother-child pairs from the Human Early Life Exposome (HELIX) project. Mercury concentrations were determined in maternal or cord blood from pregnancy. Cytokeratin 18 (CK-18; a MAFLD biomarker) and five omics layers (DNA methylation, gene transcription, microRNA, proteins, and metabolites) were measured in blood in childhood (age 6-10 years).
RESULTS: Each standard deviation increase in prenatal mercury was associated with a 0.11 [95% confidence interval: 0.02-0.21] standard deviation increase in CK-18. High-dimensional mediation analysis identified 10 biomarkers linking prenatal mercury and CK-18, including six CpG sites and four transcripts. Mediation with latent factors identified molecular pathways linking mercury and MAFLD, including altered cytokine signaling and hepatic stellate cell activation. Integrated/quasi-mediation identified high-risk subgroups of children based on unique combinations of exposure levels, omics profiles (driven by epigenetic markers), and MAFLD.
CONCLUSIONS: Prenatal mercury exposure is associated with elevated liver enzymes in childhood, likely through alterations in DNA methylation and gene expression. Our analytic framework can be applied across many different fields and serve as a resource to help guide future precision health investigations.
Affiliation(s)
- Jesse A Goodrich
- Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA, United States
- Hongxu Wang
- Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA, United States
- Qiran Jia
- Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA, United States
- Nikos Stratakis
- Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Spain
- Yinqi Zhao
- Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA, United States
- Léa Maitre
- Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Spain
- Mariona Bustamante
- Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Spain
- Marina Vafeiadi
- Department of Social Medicine, Faculty of Medicine, University of Crete, Heraklion, Greece
- Max Aung
- Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA, United States
- Sandra Andrušaitytė
- Department of Environmental Sciences, Vytauto Didžiojo Universitetas, Kaunas, Lithuania
- Xavier Basagana
- Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Spain
- Shohreh F Farzan
- Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA, United States
- Barbara Heude
- Université de Paris Cité, Institut National de la Santé et de la Recherche Médicale (INSERM), National Research Institute for Agriculture, Food and Environment, Centre of Research in Epidemiology and Statistics, Paris, France
- Hector Keun
- Department of Surgery & Cancer and Department of Metabolism, Digestion & Reproduction, Imperial College London, London, United Kingdom
- Rob McConnell
- Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA, United States
- Tiffany C Yang
- Bradford Institute for Health Research, Bradford Teaching Hospitals NHS Foundation Trust, Bradford, United Kingdom
- Alexandros P Siskos
- Department of Surgery & Cancer and Department of Metabolism, Digestion & Reproduction, Imperial College London, London, United Kingdom
- Jose Urquiza
- Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Spain
- Damaskini Valvi
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Nerea Varo
- Laboratory of Biochemistry, University Clinic of Navarra, Pamplona, Spain
- Regina Gražulevičienė
- Department of Environmental Sciences, Vytauto Didžiojo Universitetas, Kaunas, Lithuania
- Claire Philippat
- University Grenoble Alpes, Institut National de la Santé et de la Recherche Médicale (INSERM) U 1209, CNRS UMR 5309, Team of Environmental Epidemiology Applied to Development and Respiratory Health, Institute for Advanced Biosciences, 38000 Grenoble, France
- John Wright
- Bradford Institute for Health Research, Bradford Teaching Hospitals NHS Foundation Trust, Bradford, United Kingdom
- Martine Vrijheid
- Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Spain
- Leda Chatzi
- Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA, United States
- David V Conti
- Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA, United States
4
Wang S, Huang Y. DP2LM: leveraging deep learning approach for estimation and hypothesis testing on mediation effects with high-dimensional mediators and complex confounders. Biostatistics 2024; 25:818-832. [PMID: 38330064 DOI: 10.1093/biostatistics/kxad037] [Received: 06/09/2023] [Revised: 12/18/2023] [Accepted: 12/21/2023] [Indexed: 02/10/2024]
Abstract
Traditional linear mediation analysis has inherent limitations when it comes to handling high-dimensional mediators. Particularly, accurately estimating and rigorously inferring mediation effects is challenging, primarily due to the intertwined nature of the mediator selection issue. Despite recent developments, the existing methods are inadequate for addressing the complex relationships introduced by confounders. To tackle these challenges, we propose a novel approach called DP2LM (Deep neural network-based Penalized Partially Linear Mediation). This approach incorporates deep neural network techniques to account for nonlinear effects in confounders and utilizes the penalized partially linear model to accommodate high dimensionality. Unlike most existing works that concentrate on mediator selection, our method prioritizes estimation and inference on mediation effects. Specifically, we develop test procedures for testing the direct and indirect mediation effects. Theoretical analysis shows that the tests maintain the Type-I error rate. In simulation studies, DP2LM demonstrates its superior performance as a modeling tool for complex data, outperforming existing approaches in a wide range of settings and providing reliable estimation and inference in scenarios involving a considerable number of mediators. Further, we apply DP2LM to investigate the mediation effect of DNA methylation on cortisol stress reactivity in individuals who experienced childhood trauma, uncovering new insights through a comprehensive analysis.
Affiliation(s)
- Shuoyang Wang
- Department of Biostatistics, Yale University, New Haven, CT 06520, USA
- Yuan Huang
- Department of Biostatistics, Yale University, New Haven, CT 06520, USA
5
Hung-Ching C, Yusi F, Gorczyca MT, Kayhan B, Tseng GC. High-dimensional causal mediation analysis by partial sum statistic and sample splitting strategy in imaging genetics application. medRxiv 2024:2024.06.23.24309362. [PMID: 38978660 PMCID: PMC11230309 DOI: 10.1101/2024.06.23.24309362] [Indexed: 07/10/2024]
Abstract
Causal mediation analysis provides a systematic approach to explore the causal role of one or more mediators in the association between exposure and outcome. In omics or imaging data analysis, mediators are often high-dimensional, which brings new statistical challenges. Existing methods either violate causal assumptions or fail in interpretable variable selection. Additionally, mediators are often highly correlated, presenting difficulties in selecting and prioritizing top mediators. To address these issues, we develop a framework using Partial Sum Statistic and Sample Splitting Strategy, namely PS5, for high-dimensional causal mediation analysis. The method provides a powerful global mediation test satisfying causal assumptions, followed by an algorithm to select and prioritize active mediators with quantification of individual mediation contributions. We demonstrate its accurate type I error control, superior statistical power, reduced bias in mediation effect estimation, and accurate mediator selection using extensive simulations of varying levels of effect size, signal sparsity, and mediator correlations. Finally, we apply PS5 to an imaging genetics dataset of chronic obstructive pulmonary disease (COPD) patients (N = 8,897) in the COPDGene study to examine the causal mediation role of lung images (p = 5,810) in the associations between polygenic risk score and lung function and between smoking exposure and lung function, respectively. Both causal mediation analyses successfully estimate the global indirect effect and detect mediating image regions. Collectively, we find a region in the lower lobe of the right lung with a strong and concordant mediation effect for both genetic and environmental exposures. This suggests that targeted treatment toward this region might mitigate the severity of COPD due to genetic and smoking effects.
6
Domingo-Relloso A, Tellez-Plaza M, Valeri L. Methods for the Analysis of Multiple Epigenomic Mediators in Environmental Epidemiology. Curr Environ Health Rep 2024; 11:109-117. [PMID: 38386268 DOI: 10.1007/s40572-024-00436-9] [Accepted: 02/12/2024] [Indexed: 02/23/2024]
Abstract
PURPOSE OF REVIEW Epigenetic changes can be highly influenced by environmental factors and have in turn been proposed to influence chronic disease. Quantifying the extent to which epigenomic processes mediate the association between environmental exposures and diseases is of interest for epidemiologic research. In this review, we summarize the proposed mediation analysis methods with applications to epigenomic data. RECENT FINDINGS The ultra-high dimensionality and high correlations that characterize omics data have hindered the precise quantification of mediated effects. Several methods have been proposed to deal with mediation in high-dimensional settings, including methods that incorporate dimensionality reduction techniques into the mediation algorithm. Although important methodological advances have been made in recent years, key challenges such as the development of sensitivity analyses, dealing with mediator-mediator interactions, including environmental mixtures as exposures, or the integration of different omic data should be the focus of future methodological developments for epigenomic mediation analysis.
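As a rough illustration of what combining dimensionality reduction with mediation can look like, the following Python sketch projects correlated mediators onto leading principal components and then applies the classical product-of-coefficients estimator to each component. It is a toy under stated assumptions (linear models, no confounders, ordinary PCA), not any specific method from the review; the function names `ols_coef` and `pc_mediation` are hypothetical.

```python
import numpy as np

def ols_coef(design, response):
    """Least-squares coefficients of response on design."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef

def pc_mediation(x, M, y, n_components):
    """Per-component indirect effects a_k * b_k and the direct effect of x on y."""
    Mc = M - M.mean(axis=0)
    # Rows of vt are the principal directions of the centered mediators.
    _, _, vt = np.linalg.svd(Mc, full_matrices=False)
    scores = Mc @ vt[:n_components].T          # n x K component scores
    ones = np.ones(len(x))
    # Mediator models: score_k = a0 + a_k * x
    a = ols_coef(np.column_stack([ones, x]), scores)[1]
    # Outcome model: y = c0 + c * x + sum_k b_k * score_k
    coef = ols_coef(np.column_stack([ones, x, scores]), y)
    direct, b = coef[1], coef[2:]
    return a * b, direct

# Toy data: 10 mediators, exposure acts through the first two with equal weight,
# so the mediation signal lies along the leading principal component.
rng = np.random.default_rng(1)
n, p = 2000, 10
alpha = np.zeros(p)
alpha[:2] = 1.0
beta = 0.75 * alpha                            # true indirect effect = alpha @ beta = 1.5
x = rng.normal(size=n)
M = np.outer(x, alpha) + rng.normal(size=(n, p))
y = 0.5 * x + M @ beta + rng.normal(size=n)    # true direct effect = 0.5
indirect, direct = pc_mediation(x, M, y, n_components=1)
```

The sign of a principal direction is arbitrary, but each product a_k * b_k is unaffected because both coefficients flip together. When the mediation signal is not aligned with the retained components, truncation loses part of the indirect effect, which is one reason the choice of reduction technique matters.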
Affiliation(s)
- Arce Domingo-Relloso
- Department of Biostatistics, Columbia University Mailman School of Public Health, 722 West 168th Street, New York, NY, 10032, USA
- Maria Tellez-Plaza
- Department of Chronic Diseases Epidemiology, National Center for Epidemiology, Carlos III Health Institute, Madrid, Spain
- Linda Valeri
- Department of Biostatistics, Columbia University Mailman School of Public Health, 722 West 168th Street, New York, NY, 10032, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
7
Lopez Naranjo C, Razzaq FA, Li M, Wang Y, Bosch‐Bayard JF, Lindquist MA, Gonzalez Mitjans A, Garcia R, Rabinowitz AG, Anderson SG, Chiarenza GA, Calzada‐Reyes A, Virues‐Alba T, Galler JR, Minati L, Bringas Vega ML, Valdes‐Sosa PA. EEG functional connectivity as a Riemannian mediator: An application to malnutrition and cognition. Hum Brain Mapp 2024; 45:e26698. [PMID: 38726908 PMCID: PMC11082925 DOI: 10.1002/hbm.26698] [Received: 08/09/2023] [Revised: 04/05/2024] [Accepted: 04/12/2024] [Indexed: 05/13/2024]
Abstract
Mediation analysis assesses whether an exposure directly produces changes in cognitive behavior or is influenced by intermediate "mediators". Electroencephalographic (EEG) spectral measurements have been previously used as effective mediators representing diverse aspects of brain function. However, it has been necessary to collapse EEG measures onto a single scalar using standard mediation methods. In this article, we overcome this limitation and examine EEG frequency-resolved functional connectivity measures as a mediator using the full EEG cross-spectral tensor (CST). Since CST samples do not exist in Euclidean space but in the Riemannian manifold of positive-definite tensors, we transform the problem, allowing for the use of classic multivariate statistics. Toward this end, we map the data from the original manifold space to the Euclidean tangent space, eliminating redundant information to conform to a "compressed CST." The resulting object is a matrix with rows corresponding to frequencies and columns to cross spectra between channels. We have developed a novel matrix mediation approach that leverages a nuclear norm regularization to determine the matrix-valued regression parameters. Furthermore, we introduced a global test for the overall CST mediation and a test to determine specific channels and frequencies driving the mediation. We validated the method through simulations and applied it to our well-studied 50+-year Barbados Nutrition Study dataset by comparing EEGs collected in school-age children (5-11 years) who were malnourished in the first year of life with those of healthy classmate controls. We hypothesized that the CST mediates the effect of malnutrition on cognitive performance. We can now explicitly pinpoint the frequencies (delta, theta, alpha, and beta bands) and regions (frontal, central, and occipital) in which functional connectivity was altered in previously malnourished children, an improvement to prior studies. 
Understanding the specific networks impacted by a history of postnatal malnutrition could pave the way for developing more targeted and personalized therapeutic interventions. Our methods offer a versatile framework applicable to mediation studies encompassing matrix and Hermitian 3D tensor mediators alongside scalar exposures and outcomes, facilitating comprehensive analyses across diverse research domains.
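The manifold-to-tangent-space step described in this abstract can be sketched in Python for the simplest case. The sketch assumes a real symmetric positive-definite matrix and uses the identity as the reference point (the log-Euclidean map); the paper works with Hermitian cross-spectral matrices and its own "compressed CST" construction, which may differ. The function names `spd_log` and `tangent_vector` are hypothetical.

```python
import numpy as np

def spd_log(S):
    """Matrix logarithm of a symmetric positive-definite matrix via eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(S)
    return (eigvecs * np.log(eigvals)) @ eigvecs.T

def tangent_vector(S):
    """Map S to the tangent space at the identity and drop the redundant lower triangle."""
    L = spd_log(S)
    rows, cols = np.triu_indices(L.shape[0])
    # Off-diagonal entries appear twice in L, so weight them by sqrt(2)
    # to keep the Euclidean norm of the vector equal to the Frobenius norm of L.
    weights = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return weights * L[rows, cols]

S = np.array([[2.0, 0.5],
              [0.5, 1.0]])
v = tangent_vector(S)   # length 3 for a 2 x 2 matrix
```

After this map, each matrix becomes an ordinary vector, so standard multivariate regression tools can be applied, which is the role the abstract assigns to the Euclidean tangent space.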
Affiliation(s)
- Carlos Lopez Naranjo
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Fuleah Abdul Razzaq
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Min Li
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Hangzhou Dianzi University, Zhejiang, Hangzhou, China
- Ying Wang
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Anisleidy Gonzalez Mitjans
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Montreal Neurological Institute‐Hospital, McGill University, Montreal, Quebec, Canada
- Ronaldo Garcia
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Simon G. Anderson
- The George Alleyne Chronic Disease Research Centre, Caribbean Institute for Health Research, University of the West Indies, Cave Hill, Barbados
- Giuseppe A. Chiarenza
- Centro Internazionale Disturbi di Apprendimento, Attenzione, Iperattività (CIDAAI), Milan, Italy
- Janina R. Galler
- Division of Pediatric Gastroenterology and Nutrition, Massachusetts General Hospital for Children, Boston, Massachusetts, USA
- Ludovico Minati
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Center for Mind/Brain Science (CIMeC), University of Trento, Trento, Italy
- Maria L. Bringas Vega
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Cuban Center for Neuroscience, La Habana, Cuba
- Pedro A. Valdes‐Sosa
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Cuban Center for Neuroscience, La Habana, Cuba
8
Clark-Boucher D, Zhou X, Du J, Liu Y, Needham BL, Smith JA, Mukherjee B. Methods for mediation analysis with high-dimensional DNA methylation data: Possible choices and comparisons. PLoS Genet 2023; 19:e1011022. [PMID: 37934796 PMCID: PMC10655967 DOI: 10.1371/journal.pgen.1011022] [Received: 05/23/2023] [Revised: 11/17/2023] [Accepted: 10/18/2023] [Indexed: 11/09/2023]
Abstract
Epigenetic researchers often evaluate DNA methylation as a potential mediator of the effect of social/environmental exposures on a health outcome. Modern statistical methods for jointly evaluating many mediators have not been widely adopted. We compare seven methods for high-dimensional mediation analysis with continuous outcomes through both diverse simulations and analysis of DNAm data from a large multi-ethnic cohort in the United States, while providing an R package for their seamless implementation and adoption. Among the considered choices, the best-performing methods for detecting active mediators in simulations are the Bayesian sparse linear mixed model (BSLMM) and high-dimensional mediation analysis (HDMA); while the preferred methods for estimating the global mediation effect are high-dimensional linear mediation analysis (HILMA) and principal component mediation analysis (PCMA). We provide guidelines for epigenetic researchers on choosing the best method in practice and offer suggestions for future methodological development.
Affiliation(s)
- Dylan Clark-Boucher
- Department of Biostatistics, Harvard T.H. Chan School of Public Health; Boston, Massachusetts, United States of America
- Xiang Zhou
- Department of Biostatistics, University of Michigan; Ann Arbor, Michigan, United States of America
- Jiacong Du
- Department of Biostatistics, University of Michigan; Ann Arbor, Michigan, United States of America
- Yongmei Liu
- Department of Medicine, Divisions of Cardiology and Neurology, Duke University Medical Center; Durham, North Carolina, United States of America
- Belinda L. Needham
- Department of Epidemiology, University of Michigan; Ann Arbor, Michigan, United States of America
- Jennifer A. Smith
- Department of Epidemiology, University of Michigan; Ann Arbor, Michigan, United States of America
- Survey Research Center, Institute for Social Research, University of Michigan; Ann Arbor, Michigan, United States of America
- Bhramar Mukherjee
- Department of Biostatistics, University of Michigan; Ann Arbor, Michigan, United States of America
- Department of Epidemiology, University of Michigan; Ann Arbor, Michigan, United States of America
9
Zhao Z, Chen C, Mani Adhikari B, Hong LE, Kochunov P, Chen S. Mediation Analysis for High-Dimensional Mediators and Outcomes with an Application to Multimodal Imaging Data. Comput Stat Data Anal 2023; 185:107765. [PMID: 37251499 PMCID: PMC10210585 DOI: 10.1016/j.csda.2023.107765] [Indexed: 05/31/2023]
Abstract
Multimodal neuroimaging data have attracted increasing attention for brain research. An integrated analysis of multimodal neuroimaging data and behavioral or clinical measurements provides a promising approach for comprehensively and systematically investigating the underlying neural mechanisms of different phenotypes. However, such an integrated data analysis is intrinsically challenging due to the complex interactive relationships between the multimodal multivariate imaging variables. To address this challenge, a novel multivariate-mediator and multivariate-outcome mediation model (MMO) is proposed to simultaneously extract the latent systematic mediation patterns and estimate the mediation effects based on a dense bi-cluster graph approach. A computationally efficient algorithm is developed for dense bicluster structure estimation and inference to identify the mediation patterns with multiple testing correction. The performance of the proposed method is evaluated by an extensive simulation analysis with comparison to the existing methods. The results show that MMO performs better in terms of both the false discovery rate and sensitivity compared to existing models. The MMO is applied to a multimodal imaging dataset from the Human Connectome Project to investigate the effect of systolic blood pressure on whole-brain imaging measures for the regional homogeneity of the blood oxygenation level-dependent signal through the cerebral blood flow.
Affiliation(s)
- Zhiwei Zhao
- Department of Mathematics, University of Maryland, 4176 Campus Drive, College Park, 20742, MD, USA
- Chixiang Chen
- Division of Biostatistics and Bioinformatics, Department of Epidemiology and Public Health, University of Maryland School of Medicine, 655 W. Baltimore Street, Baltimore, 21201, MD, USA
- Bhim Mani Adhikari
- Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, 655 W. Baltimore Street, Baltimore, 21201, MD, USA
- L. Elliot Hong
- Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, 655 W. Baltimore Street, Baltimore, 21201, MD, USA
- Peter Kochunov
- Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, 655 W. Baltimore Street, Baltimore, 21201, MD, USA
- Shuo Chen
- Division of Biostatistics and Bioinformatics, Department of Epidemiology and Public Health, University of Maryland School of Medicine, 655 W. Baltimore Street, Baltimore, 21201, MD, USA
- Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, 655 W. Baltimore Street, Baltimore, 21201, MD, USA
10
Wang JX, Li Y, Reddick WE, Conklin HM, Glass JO, Onar-Thomas A, Gajjar A, Cheng C, Lu ZH. A high-dimensional mediation model for a neuroimaging mediator: Integrating clinical, neuroimaging, and neurocognitive data to mitigate late effects in pediatric cancer. Biometrics 2023; 79:2430-2443. [PMID: 35962595 DOI: 10.1111/biom.13729] [Received: 07/07/2021] [Accepted: 07/06/2022] [Indexed: 11/30/2022]
Abstract
Pediatric cancer treatment, especially for brain tumors, can have profound and complicated late effects. With the survival rates increasing because of improved detection and treatment, a more comprehensive understanding of the impact of current treatments on neurocognitive function and brain structure is critically needed. A frontline medulloblastoma clinical trial (SJMB03) has collected data, including treatment, clinical, neuroimaging, and cognitive variables. Advanced methods for modeling and integrating these data are critically needed to understand the mediation pathway from the treatment through brain structure to neurocognitive outcomes. We propose an integrative Bayesian mediation analysis approach to model jointly a treatment exposure, a high-dimensional structural neuroimaging mediator, and a neurocognitive outcome and to uncover the mediation pathway. The high-dimensional imaging-related coefficients are modeled via a binary Ising-Gaussian Markov random field prior (BI-GMRF), addressing the sparsity, spatial dependency, and smoothness and increasing the power to detect brain regions with mediation effects. Numerical simulations demonstrate the estimation accuracy, power, and robustness. For the SJMB03 study, the BI-GMRF method has identified white matter microstructure that is damaged by cancer-directed treatment and impacts late neurocognitive outcomes. The results provide guidance on improving treatment planning to minimize long-term cognitive sequela for pediatric brain tumor patients.
Affiliation(s)
- Jade Xiaoqing Wang
  - Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Yimei Li
  - Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Wilburn E Reddick
  - Department of Diagnostic Imaging, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Heather M Conklin
  - Department of Psychology, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- John O Glass
  - Department of Diagnostic Imaging, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Arzu Onar-Thomas
  - Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Amar Gajjar
  - Department of Pediatric Medicine, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
  - Department of Oncology, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Cheng Cheng
  - Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Zhao-Hua Lu
  - Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
|
11
|
Clark-Boucher D, Zhou X, Du J, Liu Y, Needham BL, Smith JA, Mukherjee B. Methods for Mediation Analysis with High-Dimensional DNA Methylation Data: Possible Choices and Comparison. medRxiv 2023:2023.02.10.23285764. [PMID: 36824903 PMCID: PMC9949196 DOI: 10.1101/2023.02.10.23285764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/16/2023]
Abstract
Epigenetic researchers often evaluate DNA methylation as a mediator between social/environmental exposures and disease, but modern statistical methods for jointly evaluating many mediators have not been widely adopted. We compare seven methods for high-dimensional mediation analysis with continuous outcomes through both diverse simulations and analysis of DNAm data from a large national cohort in the United States, while providing an R package for their implementation. Among the considered choices, the best-performing methods for detecting active mediators in simulations are the Bayesian sparse linear mixed model by Song et al. (2020) and high-dimensional mediation analysis by Gao et al. (2019); while the superior methods for estimating the global mediation effect are high-dimensional linear mediation analysis by Zhou et al. (2021) and principal component mediation analysis by Huang and Pan (2016). We provide guidelines for epigenetic researchers on choosing the best method in practice and offer suggestions for future methodological development.
Affiliation(s)
- Dylan Clark-Boucher
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI
- Jiacong Du
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI
- Yongmei Liu
  - Department of Medicine, Divisions of Cardiology and Neurology, Duke University Medical Center, Durham, NC
- Jennifer A Smith
  - Department of Epidemiology, University of Michigan, Ann Arbor, MI
  - Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI
- Bhramar Mukherjee
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI
  - Department of Epidemiology, University of Michigan, Ann Arbor, MI
|
12
|
Chen F, Hu W, Cai J, Chen S, Si A, Zhang Y, Liu W. Instrumental variable-based high-dimensional mediation analysis with unmeasured confounders for survival data in the observational epigenetic study. Front Genet 2023; 14:1092489. [PMID: 36816039 PMCID: PMC9932046 DOI: 10.3389/fgene.2023.1092489] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Accepted: 01/16/2023] [Indexed: 02/04/2023] Open
Abstract
Background: High-dimensional mediation analysis is frequently conducted to explore the role of epigenetic modifiers between exposure and health outcome. However, the problem of high-dimensional mediation analysis with unmeasured confounders for survival analysis in observational studies has not been well solved. Methods: In this study, we proposed an instrumental variable-based approach for high-dimensional mediation analysis with unmeasured confounders in survival analysis for epigenetic studies. We used Sobel's test, the joint test, and the bootstrap method to test the mediation effect. A comprehensive simulation study was conducted to decide the best test strategy. An empirical study based on DNA methylation data of lung cancer patients was conducted to illustrate the performance of the proposed method. Results: The simulation study suggested that the proposed method performed well in identifying mediating factors. The estimation of the mediation effect by the proposed approach is also reliable, with less bias than the classical approach. In the empirical study, we identified two DNA methylation signatures, cg21926276 and cg26387355, with mediation effects of 0.226 (95% CI: 0.108-0.344) and 0.158 (95% CI: 0.065-0.251), respectively, between smoking and lung cancer using the proposed approach. Conclusion: The proposed method performed well in simulation and empirical studies and could be an effective statistical tool for high-dimensional mediation analysis.
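For orientation, two of the tests named in this abstract (Sobel's test and the joint test) can be sketched in their textbook single-mediator form. This is a minimal illustration under stated assumptions, not the authors' instrumental variable procedure for survival data: it assumes normal-theory estimates `a` (exposure to mediator) and `b` (mediator to outcome) with standard errors `se_a` and `se_b`, and the function names and input values are hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def sobel_test(a, se_a, b, se_b):
    """Sobel test for the indirect effect a*b (first-order delta method).

    Returns (indirect effect, z statistic, two-sided p-value).
    """
    indirect = a * b
    se_indirect = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = indirect / se_indirect
    return indirect, z, two_sided_p(z)


def joint_test(a, se_a, b, se_b):
    """Joint significance test: mediation is declared only if both paths
    are significant, so the reported p-value is the larger of the two."""
    return max(two_sided_p(a / se_a), two_sided_p(b / se_b))


# Illustrative inputs only; these numbers are not taken from the paper.
indirect, z, p_sobel = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
p_joint = joint_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
print(f"indirect={indirect:.3f}, z={z:.3f}, p_sobel={p_sobel:.4g}, p_joint={p_joint:.4g}")
```

The third option mentioned in the abstract, the bootstrap, would instead resample subjects and recompute the product `a * b` on each resample to form a percentile interval.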
Affiliation(s)
- Fangyao Chen
  - Department of Epidemiology and Biostatistics, School of Public Health, Xi’an Jiaotong University Health Science Center, Xi’an, Shaanxi, China
  - Department of Radiology, First Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi, China
- Weiwei Hu
  - Department of Radiology, First Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi, China
- Jiaxin Cai
  - Department of Epidemiology and Biostatistics, School of Public Health, Xi’an Jiaotong University Health Science Center, Xi’an, Shaanxi, China
- Shiyu Chen
  - Department of Epidemiology and Biostatistics, School of Public Health, Xi’an Jiaotong University Health Science Center, Xi’an, Shaanxi, China
- Aima Si
  - Department of Epidemiology and Biostatistics, School of Public Health, Xi’an Jiaotong University Health Science Center, Xi’an, Shaanxi, China
- Yuxiang Zhang
  - Department of Epidemiology and Biostatistics, School of Public Health, Xi’an Jiaotong University Health Science Center, Xi’an, Shaanxi, China
- Wei Liu (*Correspondence)
  - Department of Cell Biology and Genetics, School of Basic Medical Science, Xi’an Jiaotong University Health Science Center, Xi’an, Shaanxi, China
|
13
|
Zhao Y, Li L. Multimodal data integration via mediation analysis with high-dimensional exposures and mediators. Hum Brain Mapp 2022; 43:2519-2533. [PMID: 35129252 PMCID: PMC9057105 DOI: 10.1002/hbm.25800] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/14/2021] [Revised: 01/06/2022] [Accepted: 01/23/2022] [Indexed: 12/28/2022] Open
Abstract
Motivated by an imaging proteomics study for Alzheimer's disease (AD), in this article, we propose a mediation analysis approach with high-dimensional exposures and high-dimensional mediators to integrate data collected from multiple platforms. The proposed method combines principal component analysis with penalized least squares estimation for a set of linear structural equation models. The former reduces the dimensionality and produces uncorrelated linear combinations of the exposure variables, whereas the latter achieves simultaneous path selection and effect estimation while allowing the mediators to be correlated. Applying the method to the AD data identifies numerous interesting protein peptides, brain regions, and protein-structure-memory paths, which are in accordance with and also supplement existing findings of AD research. Additional simulations further demonstrate the effective empirical performance of the method.
Affiliation(s)
- Yi Zhao
  - Department of Biostatistics and Health Data Science, Indiana University School of Medicine, Indianapolis, Indiana, USA
- Lexin Li
  - Department of Biostatistics and Epidemiology, University of California, Berkeley, Berkeley, California, USA
|