1
Rudolph KE, Williams NT, Diaz I. Practical causal mediation analysis: extending nonparametric estimators to accommodate multiple mediators and multiple intermediate confounders. Biostatistics 2024:kxae012. [PMID: 38576206 DOI: 10.1093/biostatistics/kxae012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 01/18/2024] [Accepted: 03/17/2024] [Indexed: 04/06/2024] Open
Abstract
Mediation analysis is appealing for its ability to improve understanding of the mechanistic drivers of causal effects, but real-world data complexities challenge its successful implementation, including (i) the existence of post-exposure variables that also affect mediators and outcomes (thus, confounding the mediator-outcome relationship), that may also be (ii) multivariate, and (iii) the existence of multivariate mediators. All three challenges are present in the mediation analysis we consider here, where our goal is to estimate the indirect effects of receiving a Section 8 housing voucher as a young child on the risk of developing a psychiatric mood disorder in adolescence that operate through mediators related to neighborhood poverty, the school environment, and instability of the neighborhood and school environments, considered together and separately. Interventional direct and indirect effects (IDE/IIE) accommodate post-exposure variables that confound the mediator-outcome relationship, but currently, no readily implementable nonparametric estimator for IDE/IIE exists that allows for both multivariate mediators and multivariate post-exposure intermediate confounders. The absence of such an IDE/IIE estimator that can easily accommodate both multivariate mediators and post-exposure confounders represents a significant limitation for real-world analyses, because when considering each mediator subgroup separately, the remaining mediator subgroups (or a subset of them) become post-exposure intermediate confounders. We address this gap by extending a recently developed nonparametric estimator for the IDE/IIE to allow for easy incorporation of multivariate mediators and multivariate post-exposure confounders simultaneously. We apply the proposed estimation approach to our analysis, including walking through a strategy to account for other, possibly co-occurring intermediate variables when considering each mediator subgroup separately.
Affiliation(s)
- Kara E Rudolph
  - Department of Epidemiology, Mailman School of Public Health, Columbia University, 722 W 168th St, NY, NY 10032, United States
- Nicholas T Williams
  - Department of Epidemiology, Mailman School of Public Health, Columbia University, 722 W 168th St, NY, NY 10032, United States
- Ivan Diaz
  - Division of Biostatistics, Department of Population Health, New York University School of Medicine, 180 Madison Ave, NY, NY 10016, United States
2
Zeng X, Chen T, Cui Y, Zhao J, Chen Q, Yu Z, Zhang Y, Han L, Chen Y, Zhang J. In utero exposure to perfluoroalkyl substances and early childhood BMI trajectories: A mediation analysis with neonatal metabolic profiles. The Science of the Total Environment 2023; 867:161504. [PMID: 36634772 DOI: 10.1016/j.scitotenv.2023.161504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Revised: 12/30/2022] [Accepted: 01/05/2023] [Indexed: 06/17/2023]
Abstract
BACKGROUND In utero perfluoroalkyl substances (PFAS) exposure has been associated with childhood adiposity, but the mechanisms are poorly known. OBJECTIVE To investigate the potential mediating role of neonatal metabolites in the relationship between prenatal PFAS exposure and childhood adiposity trajectories in the first four years of life. METHODS We analyzed the data for 1671 mother-child pairs from the Shanghai Birth Cohort study. We included those with PFAS exposure information in early pregnancy, neonatal metabolites data and at least three child anthropometric measurements at 6, 12, 24 and/or 48 months. Body mass index (BMI) z-score trajectories were identified using latent class growth mixture modeling. The associations between PFAS concentrations and trajectory classes were assessed using multinomial logistic regression. Screening and penalization-based selection was used to identify neonatal amino acids and acylcarnitines with significant mediation effects. RESULTS Three BMI z-score trajectories in early childhood were identified: a persistent increase trajectory (Class 1, 2.2 %), a stable trajectory (Class 2, 66 %), and a transient increase trajectory (Class 3, 32 %). Increased odds of being in Class 1 were observed in association with one log-unit increase in concentrations of perfluorooctane sulfonate (odds ratio [OR], 1.76 [95 % CI, 0.96-3.23], Class 2 as reference; OR, 2.36 [95 % CI, 1.27-4.40], Class 3 as reference), perfluorononanoic acid (OR, 1.90 [95 % CI, 0.97-3.72], Class 2 as reference; OR, 2.23 [95 % CI, 1.12-4.42], Class 3 as reference) and perfluorodecanoic acid (OR, 1.95 [95 % CI, 1.12-3.38], Class 2 as reference; OR, 2.14 [95 % CI, 1.22-3.76], Class 3 as reference). The effect of prenatal PFAS exposure on being in Class 1 was significantly but partly mediated by octanoylcarnitine (2.64 % for perfluorononanoic acid and 3.70 % for sum of 10 PFAS). 
CONCLUSIONS In utero PFAS exposure is a risk factor for persistent growth in BMI z-score in early childhood. The alteration of neonatal acylcarnitines suggests a potential molecular pathway.
Affiliation(s)
- Xiaojing Zeng
  - Ministry of Education-Shanghai Key Laboratory of Children's Environmental Health, Xinhua Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200092, China
- Ting Chen
  - Department of Pediatric Endocrinology and Genetic Metabolism, Shanghai Institute for Pediatric Research, Xinhua Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200092, China
- Yidan Cui
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Jian Zhao
  - Ministry of Education-Shanghai Key Laboratory of Children's Environmental Health, Xinhua Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200092, China
- Qian Chen
  - Ministry of Education-Shanghai Key Laboratory of Children's Environmental Health, Xinhua Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200092, China
- Zhangsheng Yu
  - Clinical Research Institute, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China; Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Yongjun Zhang
  - Department of Neonatology, Xinhua Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200092, China
- Lianshu Han
  - Department of Pediatric Endocrinology and Genetic Metabolism, Shanghai Institute for Pediatric Research, Xinhua Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200092, China
- Yan Chen
  - Department of Neonatology, Xinhua Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200092, China
- Jun Zhang
  - Ministry of Education-Shanghai Key Laboratory of Children's Environmental Health, Xinhua Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200092, China
3
Luo L, Yan Y, Cui Y, Yuan X, Yu Z. Linear high-dimensional mediation models adjusting for confounders using propensity score method. Front Genet 2022; 13:961148. [PMID: 36299590 PMCID: PMC9589256 DOI: 10.3389/fgene.2022.961148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2022] [Accepted: 09/14/2022] [Indexed: 11/13/2022] Open
Abstract
High-dimensional mediation analysis has been developed to study whether epigenetic phenotypes in high-dimensional form mediate the causal pathway from exposure to disease. However, most existing models assume that there are no confounders between the exposure, the mediators, and the outcome. In practice, this assumption may not hold, since high-dimensional mediation analysis (HIMA) tends to be applied in observational settings where a randomized controlled trial (RCT) cannot be conducted for economic or ethical reasons. Thus, to deal with confounders in HIMA, we propose three propensity score-related approaches, named PSR (propensity score regression), PSW (propensity score weighting), and PSU (propensity score union), to adjust for confounder bias, and compare them with the traditional covariate regression method. The procedure includes four parts: calculating the propensity score, sure independence screening, MCP (minimax concave penalty) variable selection, and joint-significance testing. Simulation results show that the PSU model is the most recommended. Applying our models to the TCGA lung cancer dataset, we find that smoking may lead to lung disease through the mediation effect of some specific DNA-methylation sites, including site Cg24480765 in gene RP11-347H15.2 and site Cg22051776 in gene KLF3.
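The joint-significance step named in the abstract above can be illustrated with a minimal sketch. It assumes that each candidate mediator already has two p-values, one for the exposure-to-mediator path and one for the mediator-to-outcome path, and that the mediator is retained only when both are below a threshold (equivalently, when the larger of the two is). The function names, the example p-values, and the 0.05 threshold are hypothetical, and no multiplicity correction is applied here; the paper's exact procedure is not given in the abstract.

```python
def joint_significance_p(p_alpha, p_beta):
    """Joint-significance (max-P) p-value for one mediator.

    p_alpha: p-value for the exposure -> mediator path
    p_beta:  p-value for the mediator -> outcome path
    The indirect effect is flagged only if BOTH paths are significant,
    so the larger of the two p-values is the binding one.
    """
    return max(p_alpha, p_beta)


def select_mediators(path_p_values, threshold=0.05):
    """Return names of mediators whose joint p-value is below threshold.

    path_p_values: dict mapping mediator name -> (p_alpha, p_beta).
    The threshold is an illustrative assumption, not a value from the paper.
    """
    return [
        name
        for name, (p_alpha, p_beta) in path_p_values.items()
        if joint_significance_p(p_alpha, p_beta) < threshold
    ]


# Hypothetical p-values for three candidate mediators.
example = {
    "site_A": (0.001, 0.010),  # both paths significant
    "site_B": (0.001, 0.200),  # mediator -> outcome path not significant
    "site_C": (0.300, 0.002),  # exposure -> mediator path not significant
}
selected = select_mediators(example)
```

With these made-up inputs only `site_A` is retained, because it is the only mediator for which both paths pass the threshold.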
Affiliation(s)
- Linghao Luo
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - SJTU-Yale Joint Center for Biostatistics, Shanghai Jiao Tong University, Shanghai, China
- Yuting Yan
  - Jinmai Community Service Center, Guiyang, China
- Yidan Cui
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - SJTU-Yale Joint Center for Biostatistics, Shanghai Jiao Tong University, Shanghai, China
- Xin Yuan
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - SJTU-Yale Joint Center for Biostatistics, Shanghai Jiao Tong University, Shanghai, China
- Zhangsheng Yu
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - SJTU-Yale Joint Center for Biostatistics, Shanghai Jiao Tong University, Shanghai, China
  - Clinical Research Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  - *Correspondence: Zhangsheng Yu,
4
Perera C, Zhang H, Zheng Y, Hou L, Qu A, Zheng C, Xie K, Liu L. HIMA2: high-dimensional mediation analysis and its application in epigenome-wide DNA methylation data. BMC Bioinformatics 2022; 23:296. [PMID: 35879655 PMCID: PMC9310002 DOI: 10.1186/s12859-022-04748-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2021] [Accepted: 05/23/2022] [Indexed: 11/28/2022] Open
Abstract
Mediation analysis plays a major role in identifying significant mediators in the pathway between environmental exposures and health outcomes. With advanced data collection technology for large-scale studies, there has been growing research interest in developing methodology for high-dimensional mediation analysis. In this paper we present HIMA2, an extension of the HIMA method (Zhang in Bioinformatics 32:3150-3154, 2016). First, the proposed HIMA2 reduces the dimension of mediators to a manageable level based on the sure independence screening (SIS) method (Fan in J R Stat Soc Ser B 70:849-911, 2008). Second, a de-biased Lasso procedure is implemented for estimating regression parameters. Third, we use a multiple-testing procedure to accurately control the false discovery rate (FDR) when testing high-dimensional mediation hypotheses. We demonstrate its practical performance using Monte Carlo simulation studies and apply our method to identify DNA methylation markers which mediate the pathway from smoking to reduced lung function in the Coronary Artery Risk Development in Young Adults (CARDIA) Study.
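The screening step in the abstract above can be sketched as follows. This is a simplified illustration of sure independence screening, not the HIMA2 implementation: it ranks candidate mediators by the absolute marginal Pearson correlation with a single response vector and keeps the top `d`. The function names, the choice of Pearson correlation as the marginal statistic, and the default `d = floor(n / log n)` (a size commonly suggested in the SIS literature) are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def sis_screen(mediators, response, d=None):
    """Keep the d mediators with the largest absolute marginal correlation.

    mediators: list of columns, each a list of n observations.
    response:  list of n observations.
    d:         number of mediators to keep; defaults to floor(n / log n),
               an assumed default rather than the value used in HIMA2.
    Returns the retained column indices in ascending order.
    """
    n = len(response)
    if d is None:
        d = max(1, int(n / math.log(n)))
    scores = [abs(pearson(column, response)) for column in mediators]
    ranked = sorted(range(len(mediators)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:d])


# Toy data: columns 0 and 2 track the response closely, column 1 does not.
response = [1, 2, 3, 4, 5, 6]
mediators = [
    [1, 2, 3, 4, 5, 6],
    [1, -1, 1, -1, 1, -1],
    [2, 4, 6, 8, 10, 13],
]
kept = sis_screen(mediators, response, d=2)
```

In practice the retained set would then be passed to the penalized or de-biased regression step; the screening only reduces the dimension to something those estimators can handle.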
Affiliation(s)
- Chamila Perera
  - Division of Biostatistics, Washington University in St. Louis, St. Louis, MO, 63110, USA
- Haixiang Zhang
  - Center for Applied Mathematics, Tianjin University, Tianjin, 300072, China
- Yinan Zheng
  - Department of Preventive Medicine, Northwestern University, Chicago, IL, 60611, USA
- Lifang Hou
  - Department of Preventive Medicine, Northwestern University, Chicago, IL, 60611, USA
- Annie Qu
  - Department of Statistics, University of California, Irvine, CA, 92697, USA
- Cheng Zheng
  - Department of Biostatistics, University of Nebraska Medical Center, Omaha, NE, 68198, USA
- Ke Xie
  - Division of Biostatistics, Washington University in St. Louis, St. Louis, MO, 63110, USA
- Lei Liu
  - Division of Biostatistics, Washington University in St. Louis, St. Louis, MO, 63110, USA
5
Hou L, Yu Y, Sun X, Liu X, Yu Y, Li H, Xue F. Causal mediation analysis with multiple causally non-ordered and ordered mediators based on summarized genetic data. Stat Methods Med Res 2022; 31:1263-1279. [PMID: 35345945 DOI: 10.1177/09622802221084599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Causal mediation analysis investigates the mechanism linking exposure and outcome. Dealing with the impact of unobserved confounders among exposure, mediator and outcome is an issue of great concern. Moreover, when multiple mediators exist, this causal pathway intertwines with other causal pathways, rendering it difficult to estimate the path-specific effects. In this study, we propose a method (PSE-MR) to identify and estimate path-specific effects of an exposure (e.g. education) on an outcome (e.g. osteoarthritis risk) through multiple causally ordered and non-ordered mediators (e.g. body mass index and pack-years of smoking) using summarized genetic data, when the sequential ignorability assumption is violated. Specifically, PSE-MR requires a specific rank condition in which the number of instrumental variables is larger than the number of mediators. Furthermore, we illustrate the utility of PSE-MR by providing guidance for practitioners and exploring the mediation effects of body mass index and pack-years of smoking in the causal pathways from education to osteoarthritis risk. Additionally, the results of simulation reveal that the causal estimates of path-specific effects are almost unbiased with good coverage and Type I error properties. Also, we summarize the least number of instrumental variables for the specific number of mediators to achieve 80% power.
Affiliation(s)
- Lei Hou
  - Department of Epidemiology and Health Statistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, People's Republic of China; Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, People's Republic of China
- Yuanyuan Yu
  - Department of Epidemiology and Health Statistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, People's Republic of China; Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, People's Republic of China
- Xiaoru Sun
  - Department of Epidemiology and Health Statistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, People's Republic of China; Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, People's Republic of China
- Xinhui Liu
  - Department of Epidemiology and Health Statistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, People's Republic of China; Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, People's Republic of China
- Yifan Yu
  - Department of Epidemiology and Health Statistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, People's Republic of China; Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, People's Republic of China
- Hongkai Li
  - Department of Epidemiology and Health Statistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, People's Republic of China; Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, People's Republic of China
- Fuzhong Xue
  - Department of Epidemiology and Health Statistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, People's Republic of China; Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, People's Republic of China
6
OUP accepted manuscript. Biometrika 2022. [DOI: 10.1093/biomet/asac004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
7
Cui Y, Luo C, Luo L, Yu Z. High-Dimensional Mediation Analysis Based on Additive Hazards Model for Survival Data. Front Genet 2021; 12:771932. [PMID: 35003213 PMCID: PMC8734376 DOI: 10.3389/fgene.2021.771932] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/07/2021] [Accepted: 10/19/2021] [Indexed: 11/13/2022] Open
Abstract
Mediation analysis has been extensively used to identify potential pathways between exposure and outcome. However, analytical methods for high-dimensional mediation analysis of survival data remain underdeveloped, especially for non-Cox model approaches. We propose a procedure comprising "two-step" variable selection and indirect effect estimation for the additive hazards model with high-dimensional mediators. We first apply sure independence screening and smoothly clipped absolute deviation regularization to select mediators. Then we use the Sobel test and the Benjamini-Hochberg (BH) method for indirect effect hypothesis testing. Simulation results demonstrate its good performance, with a higher true-positive rate and accuracy as well as a lower false-positive rate. We apply the proposed procedure to analyze DNA methylation markers mediating the relationship between smoking and survival time of lung cancer patients in a TCGA (The Cancer Genome Atlas) cohort study. The real data application identifies four mediating CpGs, three of which are newly found.
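The testing step named in the abstract above, a Sobel test followed by the BH method, can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: it uses the first-order Sobel standard error for a product of two path coefficients, takes the coefficient estimates and standard errors as given, and applies the standard BH step-up rule. The function names and all numeric inputs are hypothetical.

```python
import math


def sobel_p_value(a, se_a, b, se_b):
    """Two-sided Sobel test p-value for the indirect effect a * b.

    a, se_a: estimate and standard error of the exposure -> mediator path
    b, se_b: estimate and standard error of the mediator -> outcome path
    Uses the first-order standard error sqrt(a^2 * se_b^2 + b^2 * se_a^2).
    """
    se_ab = math.sqrt(a ** 2 * se_b ** 2 + b ** 2 * se_a ** 2)
    z = (a * b) / se_ab
    # Two-sided tail probability of a standard normal variable.
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(p_values, alpha=0.05):
    """Indices rejected by the BH step-up procedure at FDR level alpha.

    Finds the largest rank k with p_(k) <= alpha * k / m and rejects the
    k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= alpha * rank / m:
            largest_rank = rank
    return sorted(order[:largest_rank])


# Hypothetical path estimates for one mediator.
p_example = sobel_p_value(a=0.5, se_a=0.1, b=0.4, se_b=0.1)

# Hypothetical Sobel p-values for four screened mediators.
rejected = benjamini_hochberg([0.001, 0.03, 0.035, 0.5], alpha=0.05)
```

Note the step-up behaviour in the last call: the second-smallest p-value misses its own cutoff, but the third one passes, so all three are rejected together.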
Affiliation(s)
- Yidan Cui
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - SJTU-Yale Joint Center for Biostatistics, Shanghai Jiao Tong University, Shanghai, China
- Chengwen Luo
  - Public Laboratory, Taizhou Hospital of Zhejiang Province, Wenzhou Medical University, Linhai, Zhejiang, China
- Linghao Luo
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - SJTU-Yale Joint Center for Biostatistics, Shanghai Jiao Tong University, Shanghai, China
- Zhangsheng Yu
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - SJTU-Yale Joint Center for Biostatistics, Shanghai Jiao Tong University, Shanghai, China
  - Clinical Research Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, China