1. Hu W, Chen S, Cai J, Yang Y, Yan H, Chen F. High-dimensional mediation analysis for continuous outcome with confounders using overlap weighting method in observational epigenetic study. BMC Med Res Methodol 2024; 24:125. [PMID: 38831262 PMCID: PMC11145821 DOI: 10.1186/s12874-024-02254-x] [Received: 03/15/2024] [Accepted: 05/22/2024] [Indexed: 06/05/2024] [Open Access]
Abstract
BACKGROUND Mediation analysis is a powerful tool for identifying factors that mediate the causal pathway from exposure to health outcomes. It has been extended to study large numbers of potential mediators in high-dimensional data settings. Confounding is inevitable in observational studies, so adjusting for potential confounders is an essential part of high-dimensional mediation analysis (HDMA). Although propensity score (PS) methods such as propensity score regression adjustment (PSR) and inverse probability weighting (IPW) have been proposed to tackle this problem, extreme propensity score distributions can bias the estimates from these PS-based methods.
METHODS We integrated the overlap weighting (OW) technique into the HDMA workflow and proposed a concise and powerful high-dimensional mediation analysis procedure consisting of OW confounding adjustment, sure independence screening (SIS), de-biased Lasso penalization, and joint-significance testing under the mixture null distribution. We compared the proposed method with an existing method consisting of PS-based confounding adjustment, SIS, minimax concave penalty (MCP) variable selection, and classical joint-significance testing.
RESULTS Simulation studies show that the proposed procedure performed best in mediator selection and estimation: it yielded the highest true positive rate, an acceptable false discovery proportion, and lower mean square error. In the empirical study, applying the proposed method to the GSE117859 dataset from the Gene Expression Omnibus database, we found that smoking history may reduce the estimated natural killer (NK) cell level through the mediation effect of several methylation markers, mainly methylation sites cg13917614 in the CNP gene and cg16893868 in the LILRA2 gene.
CONCLUSIONS The proposed method has higher power, sufficient false discovery rate control, and precise mediation effect estimation, and it remains feasible in the presence of confounders. Hence, our method is worth considering in HDMA studies.
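The contrast between overlap weighting and inverse probability weighting mentioned in this abstract can be illustrated with a minimal sketch. It uses the standard definition of overlap weights (1 - e for exposed units, e for unexposed units, where e is the propensity score); the abstract itself gives no formulas, so the function names and the toy propensity scores below are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def overlap_weights(treatment, ps):
    """Overlap weights: 1 - e(x) for exposed units, e(x) for unexposed units."""
    treatment = np.asarray(treatment, dtype=float)
    ps = np.asarray(ps, dtype=float)
    return treatment * (1.0 - ps) + (1.0 - treatment) * ps

def ipw_weights(treatment, ps):
    """Inverse probability weights: 1/e(x) for exposed, 1/(1 - e(x)) for unexposed."""
    treatment = np.asarray(treatment, dtype=float)
    ps = np.asarray(ps, dtype=float)
    return treatment / ps + (1.0 - treatment) / (1.0 - ps)

# Hypothetical exposure indicators and propensity scores (not from the paper).
# Units 2 and 4 have extreme scores: an exposed unit with e = 0.02 and an
# unexposed unit with e = 0.98.
treatment = np.array([1, 1, 0, 0, 1, 0])
ps = np.array([0.50, 0.02, 0.50, 0.98, 0.90, 0.10])

ow = overlap_weights(treatment, ps)
ipw = ipw_weights(treatment, ps)

print("overlap weights:", np.round(ow, 3))
print("IPW weights:    ", np.round(ipw, 3))
```

Overlap weights always lie between 0 and 1, whereas the IPW weights for the two extreme units are far larger than the rest, which is the kind of instability the abstract attributes to extreme propensity score distributions.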
Affiliation(s)
- Weiwei Hu
  - Department of Epidemiology and Biostatistics, School of Public Health, Xi'an Jiaotong University, Xi'an, 710061, Shaanxi, China
- Shiyu Chen
  - Department of Epidemiology and Biostatistics, School of Public Health, Xi'an Jiaotong University, Xi'an, 710061, Shaanxi, China
- Jiaxin Cai
  - Department of Epidemiology and Biostatistics, School of Public Health, Xi'an Jiaotong University, Xi'an, 710061, Shaanxi, China
- Yuhui Yang
  - Department of Epidemiology and Biostatistics, School of Public Health, Xi'an Jiaotong University, Xi'an, 710061, Shaanxi, China
- Hong Yan
  - Department of Epidemiology and Biostatistics, School of Public Health, Xi'an Jiaotong University, Xi'an, 710061, Shaanxi, China
- Fangyao Chen
  - Department of Epidemiology and Biostatistics, School of Public Health, Xi'an Jiaotong University, Xi'an, 710061, Shaanxi, China
  - Department of Radiology, First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710061, Shaanxi, China
2. Chen F, Hu W, Cai J, Chen S, Si A, Zhang Y, Liu W. Instrumental variable-based high-dimensional mediation analysis with unmeasured confounders for survival data in the observational epigenetic study. Front Genet 2023; 14:1092489. [PMID: 36816039 PMCID: PMC9932046 DOI: 10.3389/fgene.2023.1092489] [Received: 11/08/2022] [Accepted: 01/16/2023] [Indexed: 02/04/2023] [Open Access]
Abstract
Background: High-dimensional mediation analysis is frequently conducted to explore the role of epigenetic modifiers between exposure and health outcome. However, high-dimensional mediation analysis with unmeasured confounders for survival outcomes in observational studies has not been well addressed.
Methods: We proposed an instrumental variable-based approach for high-dimensional mediation analysis with unmeasured confounders in survival analysis for epigenetic studies. We used Sobel's test, the joint test, and the bootstrap method to test the mediation effect. A comprehensive simulation study was conducted to determine the best testing strategy. An empirical study based on DNA methylation data from lung cancer patients was conducted to illustrate the performance of the proposed method.
Results: The simulation study suggested that the proposed method performed well in identifying mediating factors. Its estimates of the mediation effect were also reliable, with less bias than the classical approach. In the empirical study, the proposed approach identified two DNA methylation signatures, cg21926276 and cg26387355, with mediation effects of 0.226 (95% CI: 0.108-0.344) and 0.158 (95% CI: 0.065-0.251), respectively, between smoking and lung cancer.
Conclusion: The proposed method performed well in simulation and empirical studies and could be an effective statistical tool for high-dimensional mediation analysis.
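For readers unfamiliar with the tests named in this abstract, the following is a minimal sketch of the classical Sobel test and the joint (max-p) test for a single mediator. It assumes generic path estimates `a` (exposure to mediator) and `b` (mediator to outcome) with standard errors and a normal approximation; the numeric inputs are hypothetical, and the sketch does not reproduce the instrumental variable or survival modelling steps of the paper.

```python
import math

def normal_two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def sobel_test(a, se_a, b, se_b):
    """Sobel test of the indirect effect a*b using the first-order standard error."""
    se_ab = math.sqrt(a ** 2 * se_b ** 2 + b ** 2 * se_a ** 2)
    z = (a * b) / se_ab
    return z, normal_two_sided_p(z)

def joint_test(a, se_a, b, se_b):
    """Joint significance test: both paths must be significant, so take the larger p-value."""
    p_a = normal_two_sided_p(a / se_a)
    p_b = normal_two_sided_p(b / se_b)
    return max(p_a, p_b)

# Hypothetical path estimates and standard errors (not taken from the paper).
a, se_a = 0.5, 0.1
b, se_b = 0.4, 0.1

z_sobel, p_sobel = sobel_test(a, se_a, b, se_b)
p_joint = joint_test(a, se_a, b, se_b)

print(f"Sobel z = {z_sobel:.3f}, p = {p_sobel:.2e}")
print(f"Joint test p = {p_joint:.2e}")
```

With these inputs the joint test gives a smaller p-value than Sobel's test, which is consistent with the Sobel test's known conservativeness; the bootstrap alternative mentioned in the abstract would instead resample the data and recompute a*b.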
Affiliation(s)
- Fangyao Chen
  - Department of Epidemiology and Biostatistics, School of Public Health, Xi’an Jiaotong University Health Science Center, Xi’an, Shaanxi, China
  - Department of Radiology, First Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi, China
- Weiwei Hu
  - Department of Radiology, First Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi, China
- Jiaxin Cai
  - Department of Epidemiology and Biostatistics, School of Public Health, Xi’an Jiaotong University Health Science Center, Xi’an, Shaanxi, China
- Shiyu Chen
  - Department of Epidemiology and Biostatistics, School of Public Health, Xi’an Jiaotong University Health Science Center, Xi’an, Shaanxi, China
- Aima Si
  - Department of Epidemiology and Biostatistics, School of Public Health, Xi’an Jiaotong University Health Science Center, Xi’an, Shaanxi, China
- Yuxiang Zhang
  - Department of Epidemiology and Biostatistics, School of Public Health, Xi’an Jiaotong University Health Science Center, Xi’an, Shaanxi, China
- Wei Liu
  - Department of Cell Biology and Genetics, School of Basic Medical Science, Xi’an Jiaotong University Health Science Center, Xi’an, Shaanxi, China
  - *Correspondence: Wei Liu
3. Han Q, Wang Y, Sun N, Chu J, Hu W, Shen Y. Mediation analysis method review of high throughput data. Stat Appl Genet Mol Biol 2023; 22:sagmb-2023-0031. [PMID: 38015771 DOI: 10.1515/sagmb-2023-0031] [Received: 08/10/2023] [Accepted: 11/11/2023] [Indexed: 11/30/2023]
Abstract
High-throughput technologies have made high-dimensional settings increasingly common, providing opportunities for the development of high-dimensional mediation methods. By summarizing and discussing recent advances in high-dimensional mediation analysis, we aimed to provide useful guidance for researchers who apply these methods and ideas for biostatisticians who develop them. Extending single and multiple mediation analyses to high-dimensional settings still poses many challenges. High-dimensional mediation methods attempt to address them by screening for true mediators, estimating mediation effects through variable selection, reducing the dimension of the mediators to resolve correlations between variables, and testing mediation effects with composite null hypothesis tests. Although these problems have been solved to some extent, some challenges remain. First, correlations between mediators are rarely considered when mediators are selected. Second, dimension reduction that does not incorporate prior biological knowledge makes the results difficult to interpret. In addition, a sensitivity analysis method for the strict sequential ignorability assumption in high-dimensional mediation analysis is still lacking. Analysts need to consider the applicability of each method before using it, while biostatisticians could consider extensions and improvements to the methodology.
Affiliation(s)
- Qiang Han
  - Department of Epidemiology and Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou 215123, China
- Yu Wang
  - Department of Epidemiology and Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou 215123, China
- Na Sun
  - Department of Epidemiology and Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou 215123, China
- Jiadong Chu
  - Department of Epidemiology and Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou 215123, China
- Wei Hu
  - Department of Epidemiology and Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou 215123, China
- Yueping Shen
  - Department of Epidemiology and Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou 215123, China
4. Luo L, Yan Y, Cui Y, Yuan X, Yu Z. Linear high-dimensional mediation models adjusting for confounders using propensity score method. Front Genet 2022; 13:961148. [PMID: 36299590 PMCID: PMC9589256 DOI: 10.3389/fgene.2022.961148] [Received: 06/04/2022] [Accepted: 09/14/2022] [Indexed: 11/13/2022] [Open Access]
Abstract
High-dimensional mediation analysis has been developed to study whether high-dimensional epigenetic phenotypes mediate the causal pathway from exposure to disease. However, most existing models assume that there are no confounders between the exposure, the mediators, and the outcome. In practice, this assumption may not hold, since high-dimensional mediation analysis (HIMA) studies tend to be observational, conducted where a randomized controlled trial (RCT) is not possible for economic or ethical reasons. To deal with confounders in HIMA, we proposed three propensity score-related approaches, PSR (propensity score regression), PSW (propensity score weighting), and PSU (propensity score union), to adjust for confounding bias, and compared them with the traditional covariate regression method. The procedure consists of four main parts: calculating the propensity score, sure independence screening, MCP (minimax concave penalty) variable selection, and joint-significance testing. Simulation results show that the PSU model is the most recommended. Applying our models to the TCGA lung cancer dataset, we find that smoking may lead to lung disease through the mediation effect of specific DNA methylation sites, including site Cg24480765 in gene RP11-347H15.2 and site Cg22051776 in gene KLF3.
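The sure independence screening step listed in this procedure can be sketched as follows. This is an illustrative version only: it assumes screening on the absolute marginal correlation between the exposure and each candidate mediator and uses the commonly cited cutoff d = ceil(n / log(n)); the abstract does not state which marginal statistic or cutoff the authors used, and the function name `sis_screen` and the simulated data are hypothetical.

```python
import numpy as np

def sis_screen(exposure, mediators, d=None):
    """Keep the d mediators with the largest absolute marginal correlation with the exposure."""
    x = np.asarray(exposure, dtype=float)
    m = np.asarray(mediators, dtype=float)
    n, p = m.shape
    if d is None:
        d = int(np.ceil(n / np.log(n)))  # assumed default cutoff
    d = min(d, p)
    # Pearson correlation of the exposure with every mediator column at once.
    xc = x - x.mean()
    mc = m - m.mean(axis=0)
    denom = np.sqrt((xc ** 2).sum()) * np.sqrt((mc ** 2).sum(axis=0))
    corr = np.abs(xc @ mc) / denom
    keep = np.argsort(corr)[::-1][:d]  # indices of the d largest correlations
    return np.sort(keep), corr

# Simulated data: 100 samples, 500 candidate mediators, two of which
# (columns 3 and 42) truly depend on the exposure.
rng = np.random.default_rng(0)
n, p = 100, 500
x = rng.normal(size=n)
M = rng.normal(size=(n, p))
M[:, [3, 42]] += (2.0 * x)[:, None]

kept, corr = sis_screen(x, M)
print("number kept:", len(kept))
print("true mediators retained:", 3 in kept, 42 in kept)
```

Screening reduces the 500 candidates to a set smaller than the sample size, so that a penalized regression such as MCP can then be fitted on the retained mediators.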
Affiliation(s)
- Linghao Luo
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - SJTU-Yale Joint Center for Biostatistics, Shanghai Jiao Tong University, Shanghai, China
- Yuting Yan
  - Jinmai Community Service Center, Guiyang, China
- Yidan Cui
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - SJTU-Yale Joint Center for Biostatistics, Shanghai Jiao Tong University, Shanghai, China
- Xin Yuan
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - SJTU-Yale Joint Center for Biostatistics, Shanghai Jiao Tong University, Shanghai, China
- Zhangsheng Yu
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - SJTU-Yale Joint Center for Biostatistics, Shanghai Jiao Tong University, Shanghai, China
  - Clinical Research Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  - *Correspondence: Zhangsheng Yu