1
|
Mehrbakhsh Z, Tapak L, Behnampour N, Roshanaei G. Identification of Risk Factors for Relapse in Childhood Leukemia Using Penalized Semi-parametric Mixture Cure Competing Risks Model. J Res Health Sci 2024; 24:e00615. [PMID: 39072551 PMCID: PMC11264451 DOI: 10.34172/jrhs.2024.150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2024] [Revised: 02/07/2024] [Accepted: 04/21/2024] [Indexed: 07/30/2024] Open
Abstract
BACKGROUND Leukemia is the most common childhood malignancy. Identifying prognostic factors of patient survival and relapse using more reliable statistical models instead of traditional variable selection methods such as stepwise regression is of great importance. The present study aimed to apply a penalized semi-parametric mixture cure model to identify the prognostic factors affecting short-term and long-term survival of childhood leukemia in the presence of competing risks. The outcome of interest in this study was time to relapse.
STUDY DESIGN A retrospective cohort study.
METHODS A total of 178 patients (0‒15 years old) with leukemia participated in this study (September 1997 to September 2016, followed up to June 2021) at Golestan University of Medical Sciences, Iran. Demographic, clinical, and laboratory data were collected, and then a penalized semi-parametric mixture cure competing risk model with smoothly clipped absolute deviation (SCAD) and least absolute shrinkage and selection operator (LASSO) regularizations was used to analyze the data.
RESULTS Important prognostic factors of relapse selected by the SCAD regularization method were platelets (150000‒400000 vs. >400000; odds ratio=0.31) in the cure part and type of leukemia (ALL vs. AML; hazard ratio (HR)=0.08), mediastinal tumor (yes vs. no; HR=16.28), and splenomegaly (yes vs. no; HR=2.94) in the latency part. In addition, significant prognostic factors of death identified by the SCAD regularization method included white blood cells (<4000 vs. >11000; HR=0.25) and rheumatoid arthritis signs (yes vs. no; HR=5.75) in the latency part.
CONCLUSION Several laboratory factors and clinical side effects were associated with relapse and death, which can be beneficial in treating the disease and predicting relapse and death time.
Affiliation(s)
- Zahra Mehrbakhsh
  - Department of Biostatistics, School of Public Health, Hamadan University of Medical Sciences, Hamadan, Iran
  - Student Research Committee, Hamadan University of Medical Sciences, Hamadan, Iran
- Leili Tapak
  - Department of Biostatistics, School of Public Health, Hamadan University of Medical Sciences, Hamadan, Iran
  - Modeling of Noncommunicable Diseases Research Center, Hamadan University of Medical Sciences, Hamadan, Iran
- Nasser Behnampour
  - Department of Biostatistics and Epidemiology, School of Health, Golestan University of Medical Sciences, Gorgan, Iran
- Ghodratollah Roshanaei
  - Department of Biostatistics, School of Public Health, Hamadan University of Medical Sciences, Hamadan, Iran
  - Modeling of Noncommunicable Diseases Research Center, Hamadan University of Medical Sciences, Hamadan, Iran
|
2
|
Monterrubio-Gómez K, Constantine-Cooke N, Vallejos CA. A review on statistical and machine learning competing risks methods. Biom J 2024; 66:e2300060. [PMID: 38351217 DOI: 10.1002/bimj.202300060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Revised: 08/31/2023] [Accepted: 10/15/2023] [Indexed: 02/16/2024]
Abstract
When modeling competing risks (CR) survival data, several techniques have been proposed in both the statistical and machine learning literature. State-of-the-art methods have extended classical approaches with more flexible assumptions that can improve predictive performance, allow high-dimensional data and missing values, among others. Despite this, modern approaches have not been widely employed in applied settings. This article aims to aid the uptake of such methods by providing a condensed compendium of CR survival methods with a unified notation and interpretation across approaches. We highlight available software and, when possible, demonstrate their usage via reproducible R vignettes. Moreover, we discuss two major concerns that can affect benchmark studies in this context: the choice of performance metrics and reproducibility.
Affiliation(s)
- Nathan Constantine-Cooke
  - MRC Human Genetics Unit, University of Edinburgh, Edinburgh, UK
  - Centre for Genomic and Experimental Medicine, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
- Catalina A Vallejos
  - MRC Human Genetics Unit, University of Edinburgh, Edinburgh, UK
  - The Alan Turing Institute, London, UK
|
3
|
Salerno S, Li Y. High-Dimensional Survival Analysis: Methods and Applications. Annual Review of Statistics and Its Application 2023; 10:25-49. [PMID: 36968638 PMCID: PMC10038209 DOI: 10.1146/annurev-statistics-032921-022127] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/18/2023]
Abstract
In the era of precision medicine, time-to-event outcomes such as time to death or progression are routinely collected, along with high-throughput covariates. These high-dimensional data defy classical survival regression models, which are either infeasible to fit or likely to incur low predictability due to over-fitting. To overcome this, recent emphasis has been placed on developing novel approaches for feature selection and survival prognostication. We will review various cutting-edge methods that handle survival outcome data with high-dimensional predictors, highlighting recent innovations in machine learning approaches for survival prediction. We will cover the statistical intuitions and principles behind these methods and conclude with extensions to more complex settings, where competing events are observed. We exemplify these methods with applications to the Boston Lung Cancer Survival Cohort study, one of the largest cancer epidemiology cohorts investigating the complex mechanisms of lung cancer.
Affiliation(s)
- Stephen Salerno
  - Department of Biostatistics, University of Michigan, Ann Arbor, United States, 48109
- Yi Li
  - Department of Biostatistics, University of Michigan, Ann Arbor, United States, 48109
|
4
|
Sun H, Wang X. High-dimensional feature selection in competing risks modeling: A stable approach using a split-and-merge ensemble algorithm. Biom J 2023; 65:e2100164. [PMID: 35934836 PMCID: PMC10087963 DOI: 10.1002/bimj.202100164] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/25/2021] [Revised: 02/16/2022] [Accepted: 03/04/2022] [Indexed: 11/06/2022]
Abstract
Variable selection is critical in competing risks regression with high-dimensional data. Although penalized variable selection methods and other machine learning-based approaches have been developed, many of these methods often suffer from instability in practice. This paper proposes a novel method named Random Approximate Elastic Net (RAEN). Under the proportional subdistribution hazards model, RAEN provides a stable and generalizable solution to the large-p-small-n variable selection problem for competing risks data. Our general framework allows the proposed algorithm to be applicable to other time-to-event regression models, including competing risks quantile regression and accelerated failure time models. We show that variable selection and parameter estimation improved markedly using the new computationally intensive algorithm through extensive simulations. A user-friendly R package RAEN is developed for public use. We also apply our method to a cancer study to identify influential genes associated with the death or progression from bladder cancer.
Affiliation(s)
- Han Sun
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, Ohio, USA
  - Department of Quantitative Health Sciences, Cleveland Clinic, Cleveland, Ohio, USA
- Xiaofeng Wang
  - Department of Quantitative Health Sciences, Cleveland Clinic, Cleveland, Ohio, USA
|
5
|
Pretransplant survival of patients with end-stage heart failure under competing risks. PLoS One 2022; 17:e0273100. [PMID: 35960742 PMCID: PMC9374238 DOI: 10.1371/journal.pone.0273100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2021] [Accepted: 08/03/2022] [Indexed: 11/19/2022] Open
Abstract
Heart transplantation is the gold standard of care for end-stage heart failure in the United States. Donor hearts are a scarce resource; however, the current allocation policy, proposed in 2016 and implemented in 2018, has not addressed certain disparities. Between 2005 and 2016, the number of active candidates increased 127%, whereas transplant rates decreased 27.8%. Pretransplant mortality rates declined steadily for that period from 14.6 to 9.7, especially for candidates with mechanical circulatory assistive devices (MCSDs). This study reports survival analyses of candidates on the heart transplantation list under the competing events of transplantation and MCSD implantation. We queried the transplant data for a cohort of adult patients (age ≥ 16) without MCSDs prior to listing for transplantation between 2005 and 2014 (n = 23,373). We used cause-specific and subdistribution hazards models as multivariate regressions for all competing events. Patients listed as low priority for transplantation are less likely to require implantation but less likely to survive after 1,000 days of listing than patients listed at higher priorities. The current policy does not address this disparity, as it focuses on stratifying patients with different types of MCSD. Clinical characteristics must be considered in prioritization.
|
6
|
Rakhmawati TW, Ha ID, Lee H, Lee Y. Penalized variable selection for cause-specific hazard frailty models with clustered competing-risks data. Stat Med 2021; 40:6541-6557. [PMID: 34541690 DOI: 10.1002/sim.9197] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/09/2020] [Revised: 08/27/2021] [Accepted: 08/28/2021] [Indexed: 11/08/2022]
Abstract
Competing risks data usually arise when an occurrence of an event precludes other types of events from being observed. Such data are often encountered in a clustered clinical study such as a multi-center clinical trial. For the clustered competing-risks data which are correlated within a cluster, competing-risks models allowing for frailty terms have been recently studied. To the best of our knowledge, however, there is no literature on variable selection methods for cause-specific hazard frailty models. In this article, we propose a variable selection procedure for fixed effects in cause-specific competing risks frailty models using a penalized h-likelihood (HL). Here, we study three penalty functions, LASSO, SCAD, and HL. Simulation studies demonstrate that the proposed procedure using the HL penalty works well, providing a higher probability of choosing the true model than LASSO and SCAD methods without losing prediction accuracy. The proposed method is illustrated by using two kinds of clustered competing-risks cancer data sets.
Affiliation(s)
- Il Do Ha
  - Department of Statistics, Pukyong National University, Busan, South Korea
- Hangbin Lee
  - Department of Statistics, Seoul National University, Seoul, South Korea
- Youngjo Lee
  - Department of Statistics, Seoul National University, Seoul, South Korea
|
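Several abstracts in this list compare the LASSO and SCAD penalties. As a reading aid only, the sketch below evaluates both penalties for a single coefficient, assuming the standard textbook definitions (the SCAD form of Fan and Li with the conventional default a = 3.7). The function names are hypothetical and are not taken from any of the cited papers or their software; the HL penalty is not shown.

```python
def lasso_penalty(beta, lam):
    """L1 (LASSO) penalty for one coefficient: lam * |beta|."""
    return lam * abs(beta)


def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty for one coefficient (textbook form, assumed here).

    Matches the LASSO penalty near zero, bends quadratically for
    lam < |beta| <= a * lam, and is constant beyond a * lam, so large
    coefficients are not penalized further.
    """
    if a <= 2:
        raise ValueError("SCAD requires a > 2")
    b = abs(beta)
    if b <= lam:
        return lam * b
    if b <= a * lam:
        return (2 * a * lam * b - b * b - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2


if __name__ == "__main__":
    # Compare the two penalties over a few coefficient sizes (lam = 1).
    for beta in (0.5, 2.0, 10.0):
        print(beta, lasso_penalty(beta, 1.0), scad_penalty(beta, 1.0))
```

The two penalties agree for small coefficients and diverge for large ones, where SCAD levels off while LASSO keeps growing linearly; this is the usual explanation for SCAD shrinking large effects less.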
7
|
Ha ID, Lee Y. A review of h-likelihood for survival analysis. Japanese Journal of Statistics and Data Science 2021. [DOI: 10.1007/s42081-021-00125-z] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 01/04/2023]
|
8
|
|
9
|
Dang X, Huang S, Qian X. Risk Factor Identification in Heterogeneous Disease Progression with L1-Regularized Multi-state Models. Journal of Healthcare Informatics Research 2021; 5:20-53. [PMID: 35415453 PMCID: PMC8982743 DOI: 10.1007/s41666-020-00085-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2020] [Revised: 10/13/2020] [Accepted: 11/26/2020] [Indexed: 10/22/2022]
Abstract
The multi-state model (MSM) is a useful tool for analyzing longitudinal data and modeling disease progression at multiple time points. While regularization approaches to variable selection have been widely used, extending them to MSM remains largely unexplored. In this paper, we have developed the L1-regularized multi-state model (L1MSTATE) framework, which enables parameter estimation and variable selection simultaneously. The regularized optimization problem was solved by deriving a one-step coordinate descent algorithm with great computational efficiency. The L1MSTATE approach was evaluated using extensive simulation studies, which showed that L1MSTATE outperformed existing regularized multi-state models in terms of the accurate identification of risk factors. It also outperformed the un-regularized multi-state models (MSTATE) in terms of identifying the important risk factors in situations with small sample sizes. The power of L1MSTATE in predicting the transition probabilities compared with MSTATE was demonstrated using the Europe Blood and Marrow Transplantation (EBMT) dataset. The L1MSTATE was implemented in the open-access R package 'L1mstate'.
Affiliation(s)
- Xuan Dang
  - Texas A&M University, College Station, TX 77840 USA
- Shuai Huang
  - University of Washington, Seattle, WA 98195 USA
|
10
|
Chen X, Li C, Zhang T, Gao Z. On correlation rank screening for ultra-high dimensional competing risks data. J Appl Stat 2021; 49:1848-1864. [DOI: 10.1080/02664763.2021.1884209] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/22/2022]
Affiliation(s)
- Xiaolin Chen
  - School of Statistics, Qufu Normal University, Qufu, People's Republic of China
- Chenguang Li
  - School of Statistics, Qufu Normal University, Qufu, People's Republic of China
- Tao Zhang
  - School of Science, Guangxi University of Science and Technology, Liuzhou, People's Republic of China
- Zhenlong Gao
  - School of Statistics, Qufu Normal University, Qufu, People's Republic of China
|
11
|
Kawaguchi ES, Shen JI, Suchard MA, Li G. Scalable Algorithms for Large Competing Risks Data. J Comput Graph Stat 2020; 30:685-693. [DOI: 10.1080/10618600.2020.1841650] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022]
Affiliation(s)
- Eric S. Kawaguchi
  - Department of Preventive Medicine, University of Southern California, Los Angeles, CA
- Jenny I. Shen
  - Division of Nephrology and Hypertension, The Lundquist Institute at Harbor UCLA, Torrance, CA
  - David Geffen School of Medicine at UCLA, Los Angeles, CA
- Marc A. Suchard
  - Department of Biostatistics, University of California, Los Angeles, CA
  - Department of Computational Medicine, University of California, Los Angeles, CA
  - Department of Human Genetics, University of California, Los Angeles, CA
- Gang Li
  - Department of Biostatistics, University of California, Los Angeles, CA
  - Department of Computational Medicine, University of California, Los Angeles, CA
|
12
|
Dai D, Wang Y, Zhu L, Jin H, Wang X. Prognostic value of KRAS mutation status in colorectal cancer patients: a population-based competing risk analysis. PeerJ 2020; 8:e9149. [PMID: 32547859 PMCID: PMC7271887 DOI: 10.7717/peerj.9149] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 03/05/2020] [Accepted: 04/17/2020] [Indexed: 12/12/2022] Open
Abstract
Background To use competing risk analyses to estimate the prognostic value of KRAS mutation status in colorectal cancer (CRC) patients and to build a nomogram for CRC patients who had KRAS testing.
Method The cohort was selected from the Surveillance, Epidemiology, and End Results database. The cumulative incidence function and multivariate Fine-Gray regression for proportional subdistribution hazard (SH) modeling were used to estimate the prognosis. An SH model-based nomogram was built after a variable selection process. The nomogram was validated by discrimination and calibration with 1,000 bootstraps.
Results We included 8,983 CRC patients who had KRAS testing. The SH model found that KRAS mutant patients had worse cancer-specific survival (CSS) than KRAS wild-type patients in the overall cohort (HR = 1.10 (95% CI [1.04–1.17]), p < 0.05), and in subgroups that comprised stage III CRC (HR = 1.28 (95% CI [1.09–1.49]), p < 0.05), stage IV CRC (HR = 1.14 (95% CI [1.06–1.23]), p < 0.05), left side colon cancer (HR = 1.28 (95% CI [1.15–1.42]), p < 0.05) and rectal cancer (HR = 1.23 (95% CI [1.07–1.43]), p < 0.05). We built the SH model-based nomogram, which showed good accuracy by internal validation of discrimination and calibration. Calibration curves showed good agreement between the nomogram-predicted and the observed CRC-caused death. The time-dependent area under the receiver operating characteristic curve (AUC) was over 0.75 for the nomogram.
Conclusion This is the first population-based competing risk study on the association between KRAS mutation status and CRC prognosis. KRAS mutation indicated a poor prognosis in CRC patients. The current competing risk nomogram would help physicians to predict cancer-specific death of CRC patients who had KRAS testing.
Affiliation(s)
- Dongjun Dai
  - Department of Medical Oncology, Sir Run Run Shaw Hospital, Medical School of Zhejiang University, Zhejiang University, Hangzhou, Zhejiang, China
- Yanmei Wang
  - Department of Medical Oncology, Sir Run Run Shaw Hospital, Medical School of Zhejiang University, Zhejiang University, Hangzhou, Zhejiang, China
- Liyuan Zhu
  - Laboratory of Cancer Biology, Key Lab of Biotherapy, Sir Run Run Shaw Hospital, Medical School of Zhejiang University, Zhejiang University, Hangzhou, Zhejiang, China
- Hongchuan Jin
  - Laboratory of Cancer Biology, Key Lab of Biotherapy, Sir Run Run Shaw Hospital, Medical School of Zhejiang University, Zhejiang University, Hangzhou, Zhejiang, China
- Xian Wang
  - Department of Medical Oncology, Sir Run Run Shaw Hospital, Medical School of Zhejiang University, Zhejiang University, Hangzhou, Zhejiang, China
|
13
|
Dai D, Shi R, Wang Z, Zhong Y, Shin VY, Jin H, Wang X. Competing Risk Analyses of Medullary Carcinoma of Breast in Comparison to Infiltrating Ductal Carcinoma. Sci Rep 2020; 10:560. [PMID: 31953417 PMCID: PMC6969020 DOI: 10.1038/s41598-019-57168-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 02/10/2019] [Accepted: 12/19/2019] [Indexed: 12/29/2022] Open
Abstract
The aim of the current study was to use a competing risk model to assess whether medullary carcinoma of the breast (MCB) has a better prognosis than invasive ductal carcinoma of the breast (IDC), and to build a competing risk nomogram for predicting the risk of death in MCB. We included 3,580 MCB patients and 319,566 IDC patients from the Surveillance, Epidemiology, and End Results (SEER) database. IDC was found to have a worse breast cancer-specific survival (BCSS) than MCB (hazard ratio (HR) > 1, p < 0.001). The 5-year cumulative incidence of death (CID) was higher in IDC than in MCB (p < 0.001). Larger tumor size, increasing number of positive lymph nodes and unmarried status were found to worsen the BCSS of MCB (HR > 1, p < 0.001). We found no association between ER, PR, radiotherapy or chemotherapy and MCB prognosis (p > 0.05). After a penalized variable selection process, the SH model-based nomogram showed moderate accuracy of prediction by internal validation of discrimination and calibration with 1,000 bootstraps. In summary, MCB patients had a better prognosis than IDC patients. Interestingly, unmarried status, in addition to expected risk factors such as larger tumor size and increasing number of positive lymph nodes, was found to worsen the BCSS of MCB. We also established a competing risk nomogram as an easy-to-use tool for prognostic estimation of MCB patients.
Affiliation(s)
- Dongjun Dai
  - Department of Medical Oncology, Sir Run Run Shaw Hospital, Medical School of Zhejiang University, Hangzhou, China
- Rongkai Shi
  - Department of Medical Oncology, Sir Run Run Shaw Hospital, Medical School of Zhejiang University, Hangzhou, China
- Zhuo Wang
  - Department of Medical Oncology, Sir Run Run Shaw Hospital, Medical School of Zhejiang University, Hangzhou, China
- Yiming Zhong
  - Department of Medical Oncology, Sir Run Run Shaw Hospital, Medical School of Zhejiang University, Hangzhou, China
- Vivian Y Shin
  - Department of Surgery, Queen Mary Hospital, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
- Hongchuan Jin
  - Laboratory of Cancer Biology, Key Lab of Biotherapy, Sir Run Run Shaw Hospital, Medical School of Zhejiang University, Hangzhou, China
- Xian Wang
  - Department of Medical Oncology, Sir Run Run Shaw Hospital, Medical School of Zhejiang University, Hangzhou, China
|
14
|
Ahn KW, Banerjee A, Sahr N, Kim S. Group and within-group variable selection for competing risks data. Lifetime Data Analysis 2018; 24:407-424. [PMID: 28779228 PMCID: PMC5797529 DOI: 10.1007/s10985-017-9400-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 06/14/2016] [Accepted: 07/23/2017] [Indexed: 06/07/2023]
Abstract
Variable selection in the presence of grouped variables is troublesome for competing risks data: while some recent methods deal with group selection only, simultaneous selection of both groups and within-group variables remains largely unexplored. In this context, we propose an adaptive group bridge method, enabling simultaneous selection both within and between groups, for competing risks data. The adaptive group bridge is applicable to independent and clustered data. It also allows the number of variables to diverge as the sample size increases. We show that our new method possesses excellent asymptotic properties, including variable selection consistency at group and within-group levels. We also show superior performance in simulated and real data sets over several competing approaches, including group bridge, adaptive group lasso, and AIC / BIC-based methods.
Affiliation(s)
- Kwang Woo Ahn
  - Division of Biostatistics, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI, 53226, USA
- Anjishnu Banerjee
  - Division of Biostatistics, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI, 53226, USA
- Natasha Sahr
  - Division of Biostatistics, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI, 53226, USA
- Soyoung Kim
  - Division of Biostatistics, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI, 53226, USA
|
15
|
Ren X, Li S, Shen C, Yu Z. Linear and nonlinear variable selection in competing risks data. Stat Med 2018; 37:2134-2147. [PMID: 29579776 DOI: 10.1002/sim.7637] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 12/28/2016] [Revised: 01/12/2018] [Accepted: 01/24/2018] [Indexed: 11/12/2022]
Abstract
The subdistribution hazard model for competing risks data has been applied extensively in clinical research. Variable selection methods for linear effects in competing risks data have been studied in the past decade. There is no existing work on the selection of potential nonlinear effects for the subdistribution hazard model. We propose a two-stage procedure to select the linear and nonlinear covariate(s) simultaneously and estimate the selected covariate effect(s). We use a spectral decomposition approach to distinguish the linear and nonlinear parts of each covariate and adaptive LASSO to select each of the 2 components. Extensive numerical studies are conducted to demonstrate that the proposed procedure can achieve good selection accuracy in the first stage and small estimation biases in the second stage. The proposed method is applied to analyze a cardiovascular disease data set with competing death causes.
Affiliation(s)
- Xiaowei Ren
  - IUSM-Department of Biostatistics, Indiana University, Indianapolis, IN, USA
- Shanshan Li
  - IUSM-Department of Biostatistics, Indiana University, Indianapolis, IN, USA
- Changyu Shen
  - Beth Israel Deaconess Medical Center, Smith Center, Harvard University, Boston, MA, USA
- Zhangsheng Yu
  - Department of Bioinformatics and Biostatistics, SJTU - Yale Joint Center for Biostatistics, Shanghai Jiao Tong University, Shanghai, China
|
16
|
Hou J, Paravati A, Hou J, Xu R, Murphy J. High-dimensional variable selection and prediction under competing risks with application to SEER-Medicare linked data. Stat Med 2018; 37:3486-3502. [DOI: 10.1002/sim.7822] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 04/11/2017] [Revised: 04/09/2018] [Accepted: 04/26/2018] [Indexed: 11/12/2022]
Affiliation(s)
- Jiayi Hou
  - Altman Clinical and Translational Research Institute; University of California, San Diego; La Jolla CA 92093 U.S.A
- Anthony Paravati
  - Department of Radiation Medicine and Applied Sciences; University of California, San Diego; La Jolla CA 92093 U.S.A
- Jue Hou
  - Department of Mathematics; University of California, San Diego; La Jolla CA 92093 U.S.A
- Ronghui Xu
  - Department of Mathematics; University of California, San Diego; La Jolla CA 92093 U.S.A
  - Department of Family Medicine and Public Health; University of California, San Diego; La Jolla CA 92093 U.S.A
- James Murphy
  - Department of Radiation Medicine and Applied Sciences; University of California, San Diego; La Jolla CA 92093 U.S.A
|
17
|
Abstract
BACKGROUND Epidemiologic studies that aim to estimate a causal effect of an exposure on a particular event of interest may be complicated by the existence of competing events that preclude the occurrence of the primary event. Recently, many articles have been published in the epidemiologic literature demonstrating the need for appropriate models to accommodate competing risks when they are present. However, there has been little attention to variable selection for confounder control in competing risk analyses.
METHODS We employ simulation to demonstrate the bias in two variable selection strategies, which include covariates that are associated with the exposure and (1) which change the cause-specific hazard of any of the outcomes; or (2) which change the cause-specific hazard of the specific event of interest.
RESULTS We demonstrated minimal to no bias in estimators adjusted for confounders of exposure and either the event of interest or the competing event, but bias of varying magnitude in almost all estimators adjusted only for confounders of exposure and the primary outcome.
DISCUSSION When estimating causal effects for which there are competing risks, the analysis should control for confounders of both the exposure-primary outcome effect and the exposure-competing outcome effect.
|
18
|
Fu Z, Parikh CR, Zhou B. Penalized variable selection in competing risks regression. Lifetime Data Analysis 2017; 23:353-376. [PMID: 27016934 DOI: 10.1007/s10985-016-9362-3] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 05/19/2015] [Accepted: 03/12/2016] [Indexed: 06/05/2023]
Abstract
Penalized variable selection methods have been extensively studied for standard time-to-event data. Such methods cannot be directly applied when subjects are at risk of multiple mutually exclusive events, known as competing risks. The proportional subdistribution hazard (PSH) model proposed by Fine and Gray (J Am Stat Assoc 94:496-509, 1999) has become a popular semi-parametric model for time-to-event data with competing risks. It allows for direct assessment of covariate effects on the cumulative incidence function. In this paper, we propose a general penalized variable selection strategy that simultaneously handles variable selection and parameter estimation in the PSH model. We rigorously establish the asymptotic properties of the proposed penalized estimators and modify the coordinate descent algorithm for implementation. Simulation studies are conducted to demonstrate the good performance of the proposed method. Data from deceased donor kidney transplants from the United Network of Organ Sharing illustrate the utility of the proposed method.
Affiliation(s)
- Zhixuan Fu
  - Biostatistics Department, Yale University, 60 College Street, New Haven, CT, 06510, USA
- Chirag R Parikh
  - Section of Nephrology, Department of Internal Medicine, Yale University, 60 Temple Street, Suite 6C, New Haven, CT, 06510, USA
- Bingqing Zhou
  - Biostatistics Department, Yale University, 60 College Street, New Haven, CT, 06510, USA
  - Novartis AG, 1 Health Plaza, East Hanover, NJ, USA
|
19
|
Fu Z, Ma S, Lin H, Parikh CR, Zhou B. Penalized Variable Selection for Multi-center Competing Risks Data. Statistics in Biosciences 2016. [DOI: 10.1007/s12561-016-9181-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/20/2022]
|