1
Hu L. A new method for clustered survival data: Estimation of treatment effect heterogeneity and variable selection. Biom J 2024; 66:e2200178. [PMID: 38072661 PMCID: PMC10953775 DOI: 10.1002/bimj.202200178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Revised: 07/31/2023] [Accepted: 08/11/2023] [Indexed: 01/30/2024]
Abstract
We recently developed a new method, the random-intercept accelerated failure time model with Bayesian additive regression trees (riAFT-BART), to draw causal inferences about population treatment effect on patient survival from clustered and censored survival data while accounting for the multilevel data structure. The practical utility of this method goes beyond the estimation of population average treatment effect. In this work, we exposit how riAFT-BART can be used to solve two important statistical questions with clustered survival data: estimating the treatment effect heterogeneity and variable selection. Leveraging the likelihood-based machine learning, we describe a way in which we can draw posterior samples of the individual survival treatment effect from riAFT-BART model runs, and use the drawn posterior samples to perform an exploratory treatment effect heterogeneity analysis to identify subpopulations who may experience treatment effects that differ from the population average effect. There is sparse literature on methods for variable selection among clustered and censored survival data, particularly ones using flexible modeling techniques. We propose a permutation-based approach using the predictor's variable inclusion proportion supplied by the riAFT-BART model for variable selection. To address the missing data issue frequently encountered in health databases, we propose a strategy to combine bootstrap imputation and riAFT-BART for variable selection among incomplete clustered survival data. We conduct an expansive simulation study to examine the practical operating characteristics of our proposed methods, and provide empirical evidence that our proposed methods perform better than several existing methods across a wide range of data scenarios. Finally, we demonstrate the methods via a case study of predictors for in-hospital mortality among severe COVID-19 patients and estimating the heterogeneous treatment effects of three COVID-specific medications. The methods developed in this work are readily available in the R package riAFTBART.
Affiliation(s)
- Liangyuan Hu
- Department of Biostatistics and Epidemiology, Rutgers University, Piscataway, New Jersey 08854
2
Chauvet J, Rondeau V. A flexible class of generalized joint frailty models for the analysis of survival endpoints. Stat Med 2023; 42:1233-1262. [PMID: 36775273 DOI: 10.1002/sim.9667] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2021] [Revised: 10/12/2021] [Accepted: 11/17/2021] [Indexed: 02/14/2023]
Abstract
This article focuses on shared frailty models for correlated failure times, as well as joint frailty models for the simultaneous analysis of recurrent events (eg, appearance of new cancerous lesions or hospital readmissions) and a major terminal event (typically, death). As extensions of the Cox model, these joint models usually assume a frailty proportional hazards model for each of the recurrent and terminal event processes. In order to extend these models beyond the proportional hazards assumption, our proposal is to replace these proportional hazards models with generalized survival models, for which the survival function is modeled as a linear predictor through a link function. Depending on the link function considered, these can be reduced to proportional hazards, proportional odds, additive hazards, or probit models. We first consider a fully parametric framework for the time and covariate effects. For proportional and additive hazards models, our approach also allows the use of smooth functions for baseline hazard functions and time-varying coefficients. The dependence between recurrent and terminal event processes is modeled by conditioning on a shared frailty acting differently on the two processes. Parameter estimates are provided using the maximum (penalized) likelihood method, implemented in the R package frailtypack (function GenfrailtyPenal). We perform simulation studies to assess the method, which is also illustrated on real datasets.
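The link-function idea in this abstract can be written out as a short sketch. The notation below is an assumption made here for illustration (it is not taken from the article, and the frailty term is omitted): $S(t \mid x)$ is the survival function, $\eta(t, x)$ the linear predictor, and $\Phi$ the standard normal distribution function. These are the link choices commonly associated with the four model types named above.

```latex
g\{S(t \mid x)\} = \eta(t, x), \qquad
g(S) =
\begin{cases}
\log(-\log S)        & \text{proportional hazards} \\
\log\{(1 - S)/S\}    & \text{proportional odds} \\
-\log S              & \text{additive hazards} \\
-\Phi^{-1}(S)        & \text{probit}
\end{cases}
```

For instance, with the first link, $\log(-\log S)$ is the log cumulative hazard, so a predictor that is additive in the covariates shifts the log cumulative hazard and yields proportional hazards.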
Affiliation(s)
- Jocelyn Chauvet
- INSERM U1219, Biostatistics Team, University of Bordeaux, Bordeaux, France
- ICES Research Center, La Roche-sur-Yon, France
- Angevin Research Laboratory in Systems Engineering, Angers, France
- Virginie Rondeau
- INSERM U1219, Biostatistics Team, University of Bordeaux, Bordeaux, France
3
Hu L, Ji J, Ennis RD, Hogan JW. A flexible approach for causal inference with multiple treatments and clustered survival outcomes. Stat Med 2022; 41:4982-4999. [PMID: 35948011 PMCID: PMC9588538 DOI: 10.1002/sim.9548] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/16/2022] [Revised: 07/20/2022] [Accepted: 07/22/2022] [Indexed: 01/07/2023]
Abstract
When drawing causal inferences about the effects of multiple treatments on clustered survival outcomes using observational data, we need to address implications of the multilevel data structure, multiple treatments, censoring, and unmeasured confounding for causal analyses. Few off-the-shelf causal inference tools are available to simultaneously tackle these issues. We develop a flexible random-intercept accelerated failure time model, in which we use Bayesian additive regression trees to capture arbitrarily complex relationships between censored survival times and pre-treatment covariates and use the random intercepts to capture cluster-specific main effects. We develop an efficient Markov chain Monte Carlo algorithm to draw posterior inferences about the population survival effects of multiple treatments and examine the variability in cluster-level effects. We further propose an interpretable sensitivity analysis approach to evaluate the sensitivity of drawn causal inferences about treatment effect to the potential magnitude of departure from the causal assumption of no unmeasured confounding. Expansive simulations empirically validate and demonstrate good practical operating characteristics of our proposed methods. Applying the proposed methods to a dataset on older high-risk localized prostate cancer patients drawn from the National Cancer Database, we evaluate the comparative effects of three treatment approaches on patient survival, and assess the ramifications of potential unmeasured confounding. The methods developed in this work are readily available in the R package riAFTBART.
Affiliation(s)
- Liangyuan Hu
- Department of Biostatistics and Epidemiology, Rutgers University, Piscataway, New Jersey, USA
- Jiayi Ji
- Department of Biostatistics and Epidemiology, Rutgers University, Piscataway, New Jersey, USA
- Ronald D. Ennis
- Department of Radiation Oncology, Cancer Institute of New Jersey of Rutgers University, New Brunswick, New Jersey, USA
- Joseph W. Hogan
- Department of Biostatistics, Brown University, Providence, Rhode Island, USA
4
Blaha O, Esserman D, Li F. Design and analysis of cluster randomized trials with time-to-event outcomes under the additive hazards mixed model. Stat Med 2022; 41:4860-4885. [PMID: 35908796 PMCID: PMC9588628 DOI: 10.1002/sim.9541] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2021] [Revised: 05/04/2022] [Accepted: 07/19/2022] [Indexed: 11/12/2022]
Abstract
A primary focus of current methods for cluster randomized trials (CRTs) has been for continuous, binary, and count outcomes, with relatively less attention given to right-censored, time-to-event outcomes. In this article, we detail considerations for sample size requirement and statistical inference in CRTs with time-to-event outcomes when the intervention effect parameter is specified through the additive hazards mixed model (AHMM), which includes a frailty term to explicitly account for the dependency between the failure times. First, we discuss improved inference for the treatment effect parameter via bias-corrected sandwich variance estimators and randomization-based test under AHMM, addressing potential small-sample biases in CRTs. Next, we derive a new sample size formula for AHMM analysis of CRTs accommodating both equal and unequal cluster sizes. When the cluster sizes vary, our sample size formula depends on the mean and coefficient of variation of cluster sizes, based on which we articulate the impact of cluster size variation in CRTs with time-to-event outcomes. Furthermore, we obtain the insight that the classical variance inflation factor for CRTs with a non-censored outcome can in fact apply to CRTs with a time-to-event outcome, providing that an appropriate definition of the intraclass correlation coefficient is considered under AHMM. Simulation studies are carried out to illustrate key design and analysis considerations in CRTs with a small to moderate number of clusters. The proposed sample size procedure and analytical methods are further illustrated using the context of the STrategies to Reduce Injuries and Develop Confidence in Elders CRT.
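As a rough illustration of the classical variance inflation factor mentioned in this abstract, the sketch below computes the textbook design effect 1 + (m - 1) * ICC, together with a widely used approximation for unequal cluster sizes that depends on the mean and coefficient of variation (CV) of cluster sizes. The function name and arguments are hypothetical, and the unequal-size expression is the standard approximation from the general CRT literature, not necessarily the formula derived in the article; the article's AHMM-specific definition of the intraclass correlation coefficient is not reproduced here.

```python
def design_effect(mean_cluster_size: float, icc: float, cv: float = 0.0) -> float:
    """Classical variance inflation factor for a cluster randomized trial.

    With cv = 0 (equal cluster sizes) this is 1 + (m - 1) * icc.
    With cv > 0 it uses the common approximation
    1 + ((1 + cv**2) * m - 1) * icc, an assumption of this sketch
    rather than the article's own sample size formula.
    """
    if mean_cluster_size <= 0:
        raise ValueError("mean_cluster_size must be positive")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    if cv < 0:
        raise ValueError("cv must be non-negative")
    return 1.0 + ((1.0 + cv ** 2) * mean_cluster_size - 1.0) * icc


if __name__ == "__main__":
    # Equal cluster sizes of 20 with ICC 0.05
    print(design_effect(20, 0.05))
    # Same mean size, but cluster sizes vary with CV 0.5
    print(design_effect(20, 0.05, cv=0.5))
```

Under these assumptions, greater variation in cluster sizes (larger CV) increases the inflation factor for a fixed mean cluster size, which is the qualitative effect the abstract describes.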
Affiliation(s)
- Ondrej Blaha
- Department of Biostatistics, Yale University School of Public Health, New Haven, Connecticut, USA
- Yale Center for Analytical Sciences, Yale University School of Public Health, New Haven, Connecticut, USA
- Denise Esserman
- Department of Biostatistics, Yale University School of Public Health, New Haven, Connecticut, USA
- Yale Center for Analytical Sciences, Yale University School of Public Health, New Haven, Connecticut, USA
- Fan Li
- Department of Biostatistics, Yale University School of Public Health, New Haven, Connecticut, USA
- Yale Center for Analytical Sciences, Yale University School of Public Health, New Haven, Connecticut, USA
- Center for Methods in Implementation and Prevention Science, Yale University School of Public Health, New Haven, Connecticut, USA
5
Zhang Z, Wang X, Peng Y. An additive hazards frailty model with semi-varying coefficients. Lifetime Data Anal 2022; 28:116-138. [PMID: 34820722 DOI: 10.1007/s10985-021-09540-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2021] [Accepted: 11/01/2021] [Indexed: 06/13/2023]
Abstract
Proportional hazards frailty models have been extensively investigated and used to analyze clustered and recurrent failure times data. However, the proportional hazards assumption in the models may not always hold in practice. In this paper, we propose an additive hazards frailty model with semi-varying coefficients, which allows some covariate effects to be time-invariant while other covariate effects to be time-varying. The time-varying and time-invariant regression coefficients are estimated by a set of estimating equations, whereas the frailty parameter is estimated by the moment method. The large sample properties of the proposed estimators are established. The finite sample performance of the estimators is examined by simulation studies. The proposed model and estimation are illustrated with an analysis of data from a rehospitalization study of colorectal cancer patients.
Affiliation(s)
- Zhongwen Zhang
- School of Public Health and Management, Binzhou Medical University, Yantai, 264003, China
- Xiaoguang Wang
- School of Mathematical Sciences, Dalian University of Technology, Dalian, 116024, China
- Yingwei Peng
- Departments of Public Health Sciences and Mathematics and Statistics, Queen's University, Kingston, ON, Canada
6
Fangyuan K, Jiang G. A mixed effect model for clustered recurrent event data. Commun Stat Simul Comput 2021. [DOI: 10.1080/03610918.2021.2012194] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Affiliation(s)
- Kang Fangyuan
- School of Applied Science, Beijing Information Science and Technology University, Beijing, China
- Guo Jiang
- School of Applied Science, Beijing Information Science and Technology University, Beijing, China
7
Liu P, Song S, Zhou Y. Semiparametric additive frailty hazard model for clustered failure time data. Can J Stat 2021. [DOI: 10.1002/cjs.11647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Affiliation(s)
- Peng Liu
- School of Mathematics, Statistics and Actuarial Science, University of Kent, Canterbury, UK
- Shanshan Song
- School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China
- Yong Zhou
- Academy of Statistics and Interdisciplinary Sciences, East China Normal University, Shanghai, China
8
Chernoukhov A, Hussein A, Nkurunziza S, Bandyopadhyay D. Bayesian inference in time-varying additive hazards models with applications to disease mapping. Environmetrics 2018; 29:e2478. [PMID: 30510463 PMCID: PMC6268206 DOI: 10.1002/env.2478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Environmental health and disease mapping studies are often concerned with the evaluation of the combined effect of various socio-demographic and behavioral factors, and environmental exposures on time-to-events of interest, such as death of individuals, organisms or plants. In such studies, estimation of the hazard function is often of interest. In addition to known explanatory variables, the hazard function may be subject to spatial/geographical variations, such that proximally located regions may experience more similar hazards than regions that are distantly located. A popular approach for handling this type of spatially-correlated time-to-event data is Cox's Proportional Hazards (PH) regression model with spatial frailties. However, the PH assumption poses a major practical challenge, as it entails that the effects of the various explanatory variables remain constant over time. This assumption is often unrealistic, for instance, in studies with long follow-ups where the effects of some exposures on the hazard may vary drastically over time. Our goal in this paper is to offer a flexible semiparametric additive hazards model (AH) with spatial frailties. Our proposed model allows both the frailties as well as the regression coefficients to be time-varying, thus relaxing the proportionality assumption. Our estimation framework is Bayesian, powered by carefully tailored posterior sampling strategies via Markov chain Monte Carlo techniques. We apply the model to a dataset on prostate cancer survival from the US state of Louisiana to illustrate its advantages.
Affiliation(s)
- A. Hussein
- Department of Mathematics and Statistics, University of Windsor, Canada
- S. Nkurunziza
- Department of Mathematics and Statistics, University of Windsor, Canada
- D. Bandyopadhyay
- Department of Biostatistics, Virginia Commonwealth University, Richmond, VA, USA
9
Zeng D, Hyun N, Cai J. Semiparametric Additive Model for Estimating Risk Difference in Multicenter Studies. Biostat Epidemiol 2018; 2:84-98. [PMID: 30631827 PMCID: PMC6322696 DOI: 10.1080/24709360.2018.1445430] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2018] [Accepted: 02/14/2018] [Indexed: 10/17/2022]
Abstract
Many cancer studies are conducted in multiple centers. While they have the advantage of more patients and larger population, center-to-center heterogeneity could be significant such that it cannot be ignored in analysis. In this paper, we propose semiparametric additive risk models with a general link function to estimate risk effects while accounting for center-specific baseline function. We propose an estimating equation for inference and show that the derived estimators are consistent and asymptotically normal. Simulation studies demonstrate good small-sample performance of the proposed method. We apply the method to analyze data from the Study of Left Ventricular Dysfunction (SOLVD) in 1990 and discuss application to one-to-one matched design.
Affiliation(s)
- Donglin Zeng
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599
- Noorie Hyun
- Division of Biostatistics, Institute for Health and Society, Medical College of Wisconsin, Milwaukee, WI, 53226
- Jianwen Cai
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599
10
FangYuan K. An additive marginal regression model for clustered recurrent event in the presence of a terminal event. Commun Stat Theory Methods 2018. [DOI: 10.1080/03610926.2017.1332221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Affiliation(s)
- Kang FangYuan
- School of Applied Science, Beijing Information Science and Technology University, Beijing, China
11
Ding J, Sun L. Additive mixed effect model for recurrent gap time data. Lifetime Data Anal 2017; 23:223-253. [PMID: 26296808 DOI: 10.1007/s10985-015-9341-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/25/2014] [Accepted: 08/17/2015] [Indexed: 06/04/2023]
Abstract
Gap times between recurrent events are often of primary interest in medical and observational studies. The additive hazards model, focusing on risk differences rather than risk ratios, has been widely used in practice. However, the marginal additive hazards model does not take the dependence among gap times into account. In this paper, we propose an additive mixed effect model to analyze gap time data, and the proposed model includes a subject-specific random effect to account for the dependence among the gap times. Estimating equation approaches are developed for parameter estimation, and the asymptotic properties of the resulting estimators are established. In addition, some graphical and numerical procedures are presented for model checking. The finite sample behavior of the proposed methods is evaluated through simulation studies, and an application to a data set from a clinic study on chronic granulomatous disease is provided.
Affiliation(s)
- Jieli Ding
- School of Mathematics and Statistics, Wuhan University, Wuhan, 430072, Hubei, People's Republic of China
- Liuquan Sun
- Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, People's Republic of China
12
Wan F. Simulating survival data with predefined censoring rates for proportional hazards models. Stat Med 2016; 36:838-854. [DOI: 10.1002/sim.7178] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 01/10/2016] [Revised: 06/28/2016] [Accepted: 10/28/2016] [Indexed: 11/09/2022]
Affiliation(s)
- Fei Wan
- Department of Biostatistics, University of Arkansas for Medical Sciences, 4301 W. Markham St., #781, Little Rock, AR 72205, USA
13
Pan D, He H, Song X, Sun L. Regression Analysis of Additive Hazards Model With Latent Variables. J Am Stat Assoc 2015. [DOI: 10.1080/01621459.2014.950083] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Indexed: 10/24/2022]
14
15
Shen PS. Additive Mixed Effect Model for Clustered Doubly Censored Data. Commun Stat Simul Comput 2013. [DOI: 10.1080/03610918.2012.697241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
16
Schneider S, Schmidli H, Friede T. Blinded sample size re-estimation for recurrent event data with time trends. Stat Med 2013; 32:5448-57. [DOI: 10.1002/sim.5977] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 11/30/2012] [Accepted: 08/27/2013] [Indexed: 11/09/2022]
Affiliation(s)
- S. Schneider
- Department of Medical Statistics, University Medical Center Göttingen, Göttingen, Germany
- H. Schmidli
- Statistical Methodology, Novartis Pharma AG, Basel, Switzerland
- T. Friede
- Department of Medical Statistics, University Medical Center Göttingen, Göttingen, Germany
17
Li J, Wang C, Sun J. Regression analysis of clustered interval-censored failure time data with the additive hazards model. J Nonparametr Stat 2012; 24:1041-1050. [PMID: 25914511 DOI: 10.1080/10485252.2012.720256] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 10/27/2022]
Abstract
This paper discusses regression analysis of clustered failure time data, which means that the failure times of interest are clustered into small groups instead of being independent. Clustering occurs in many fields such as medical studies. For the problem, a number of methods have been proposed, but most of them apply only to clustered right-censored data. In reality, the failure time data is often interval-censored. That is, the failure times of interest are known only to lie in certain intervals. We propose an estimating equation-based approach for regression analysis of clustered interval-censored failure time data generated from the additive hazards model. A major advantage of the proposed method is that it does not involve the estimation of any baseline hazard function. Both asymptotic and finite sample properties of the proposed estimates of regression parameters are established and the method is illustrated by the data arising from a lymphatic filariasis study.
Affiliation(s)
- Junlong Li
- Department of Statistics, University of Missouri, Columbia, MO 65211, USA
- Chunjie Wang
- Mathematics School and Institute, Jilin University, Changchun 130012, People's Republic of China
- Jianguo Sun
- Department of Statistics, University of Missouri, Columbia, MO 65211, USA