1. Sun D, Guo Y, Li Y, Sun J, Tu W. A flexible time-varying coefficient rate model for panel count data. Lifetime Data Anal 2024:10.1007/s10985-024-09630-1. [PMID: 38805094 DOI: 10.1007/s10985-024-09630-1] [Received: 10/03/2023] [Accepted: 04/18/2024] [Indexed: 05/29/2024]
Abstract
Panel count regression is often required in recurrent event studies, where the interest is to model the event rate. Existing rate models are unable to handle time-varying covariate effects due to theoretical and computational difficulties. Mean models provide a viable alternative but are subject to the constraints of the monotonicity assumption, which tends to be violated when covariates fluctuate over time. In this paper, we present a new semiparametric rate model for panel count data along with related theoretical results. For model fitting, we present an efficient EM algorithm with three different methods for variance estimation. The algorithm allows us to sidestep the challenges of numerical integration and difficulties with the iterative convex minorant algorithm. We showed that the estimators are consistent and asymptotically normally distributed. Simulation studies confirmed an excellent finite sample performance. To illustrate, we analyzed data from a real clinical study of behavioral risk factors for sexually transmitted infections.
Affiliation(s)
- Dayu Sun
  - Department of Biostatistics and Health Data Science, Indiana University School of Medicine and Richard M. Fairbanks School of Public Health, Indianapolis, IN, 46202, USA
- Yang Li
  - Department of Biostatistics and Health Data Science, Indiana University School of Medicine and Richard M. Fairbanks School of Public Health, Indianapolis, IN, 46202, USA
- Jianguo Sun
  - Department of Statistics, University of Missouri, Columbia, MO, 65211, USA
- Wanzhu Tu
  - Department of Biostatistics and Health Data Science, Indiana University School of Medicine and Richard M. Fairbanks School of Public Health, Indianapolis, IN, 46202, USA

2. Gawrieh S, Vilar-Gomez E, Wilson LA, Pike F, Kleiner DE, Neuschwander-Tetri BA, Diehl AM, Dasarathy S, Kowdley KV, Hameed B, Tonascia J, Loomba R, Sanyal AJ, Chalasani N. Increases and decreases in liver stiffness measurement are independently associated with the risk of liver-related events in NAFLD. J Hepatol 2024:S0168-8278(24)00343-X. [PMID: 38762169 DOI: 10.1016/j.jhep.2024.05.008] [Received: 07/05/2023] [Revised: 04/29/2024] [Accepted: 05/06/2024] [Indexed: 05/20/2024]
Abstract
BACKGROUND & AIMS: The clinical significance of change in liver stiffness measurement (LSM) by vibration-controlled transient elastography (VCTE) in patients with non-alcoholic fatty liver disease (NAFLD) is not well understood. We prospectively defined rates of progression to and regression from LSM-defined compensated advanced chronic liver disease (cACLD) and their associations with liver-related events (LREs).
METHODS: Participants in the NASH Clinical Research Network-led NAFLD Database 2 and 3 studies were included. Progression to cACLD was defined as reaching LSM ≥10 kPa in participants with LSM <10 kPa on initial VCTE; regression from cACLD was defined as reaching LSM <10 kPa in participants with baseline LSM ≥10 kPa. LREs were defined as liver-related death, liver transplant, hepatocellular carcinoma, MELD >15, development of varices, or hepatic decompensation. Univariate and multivariable interval-censored Cox regression analyses were used to compare the cumulative LRE probability by LSM progression and regression status.
RESULTS: In 1,403 participants, 89 LREs developed over a mean follow-up of 4.4 years, with an annual incidence rate for LREs of 1.5 (95% CI 1.2-1.8). In participants at risk, progression to LSM ≥10 or ≥15 kPa occurred in 29% and 17%, respectively, whereas regression to LSM <10 or <15 kPa occurred in 44% and 49%, respectively. Progressors to cACLD (≥10 kPa) experienced a higher cumulative LRE rate vs. non-progressors (16% vs. 4%, adjusted hazard ratio 4.0; 95% CI 1.8-8.9; p <0.01). Regressors from cACLD (to LSM <10 kPa) experienced a lower LRE rate than non-regressors (7% vs. 32%, adjusted hazard ratio 0.25; 95% CI 0.10-0.61; p <0.01).
CONCLUSIONS: Change in LSM over time is independently and bi-directionally associated with risk of LRE and is a non-invasive surrogate for clinical outcomes in patients with NAFLD.
IMPACT AND IMPLICATIONS: The prognostic value of change in LSM in patients with NAFLD is not well understood. In this large prospective study of patients with NAFLD and serial vibration-controlled transient elastography exams, baseline and dynamic changes in LSM were associated with the risk of developing liver-related events. LSM is a useful non-invasive surrogate of clinical outcomes in patients with NAFLD.
Affiliation(s)
- Samer Gawrieh
  - Division of Gastroenterology and Hepatology, Indiana University, Indianapolis, IN, United States
- Eduardo Vilar-Gomez
  - Division of Gastroenterology and Hepatology, Indiana University, Indianapolis, IN, United States
- Laura A Wilson
  - Department of Epidemiology, Johns Hopkins University, Baltimore, MD, United States
- Francis Pike
  - Department of Biostatistics and Health Data Science, Indiana University, Indianapolis, IN, United States
- David E Kleiner
  - Laboratory of Pathology, National Cancer Institute, Bethesda, MD, United States
- Anna Mae Diehl
  - Division of Gastroenterology and Hepatology, Duke University, Durham, NC, United States
- Srinivasan Dasarathy
  - Division of Gastroenterology and Hepatology, Cleveland Clinic Foundation, Cleveland, OH, United States
- Bilal Hameed
  - Division of Gastroenterology and Hepatology, University of California, San Francisco, CA, United States
- James Tonascia
  - Department of Epidemiology, Johns Hopkins University, Baltimore, MD, United States
- Rohit Loomba
  - Division of Gastroenterology and Hepatology, University of California, San Diego, CA, United States
- Arun J Sanyal
  - Division of Gastroenterology, Hepatology and Nutrition, Virginia Commonwealth University, Richmond, VA, United States
- Naga Chalasani
  - Division of Gastroenterology and Hepatology, Indiana University, Indianapolis, IN, United States

3. Barbanti L, Hothorn T. A transformation perspective on marginal and conditional models. Biostatistics 2024; 25:402-428. [PMID: 36534895 PMCID: PMC11212492 DOI: 10.1093/biostatistics/kxac048] [Received: 05/24/2022] [Revised: 11/02/2022] [Accepted: 11/28/2022] [Indexed: 08/04/2023] Open
Abstract
Clustered observations are ubiquitous in controlled and observational studies and arise naturally in multicenter trials or longitudinal surveys. We present a novel model for the analysis of clustered observations where the marginal distributions are described by a linear transformation model and the correlations by a joint multivariate normal distribution. The joint model provides an analytic formula for the marginal distribution. Owing to the richness of transformation models, the techniques are applicable to any type of response variable, including bounded, skewed, binary, ordinal, or survival responses. We demonstrate how the common normal assumption for reaction times can be relaxed in the sleep deprivation benchmark data set and report marginal odds ratios for the notoriously difficult toe nail data. We furthermore discuss the analysis of two clinical trials aiming at the estimation of marginal treatment effects. In the first trial, pain was repeatedly assessed on a bounded visual analog scale and marginal proportional-odds models are presented. The second trial reported disease-free survival in rectal cancer patients, where the marginal hazard ratio from Weibull and Cox models is of special interest. An empirical evaluation compares the performance of the novel approach to generalized estimating equations for binary responses and to conditional mixed-effects models for continuous responses. An implementation is available in the tram add-on package to the R system and was benchmarked against established models in the literature.
Affiliation(s)
- Luisa Barbanti
  - Institut für Epidemiologie, Biostatistik und Prävention, Universität Zürich, Hirschengraben 84, CH-8001 Zürich, Switzerland
- Torsten Hothorn
  - Institut für Epidemiologie, Biostatistik und Prävention, Universität Zürich, Hirschengraben 84, CH-8001 Zürich, Switzerland

4. Lee CY, Wong KY, Bandyopadhyay D. Partly linear single-index cure models with a nonparametric incidence link function. Stat Methods Med Res 2024; 33:498-514. [PMID: 38400526 DOI: 10.1177/09622802241227960] [Indexed: 02/25/2024]
Abstract
In cancer studies, it is commonplace that a fraction of patients participating in the study are cured, such that not all of them will experience a recurrence, or death due to cancer. Also, it is plausible that some covariates, such as the treatment assigned to the patients or demographic characteristics, could affect both the patients' survival rates and cure/incidence rates. A common approach to accommodate these features in survival analysis is to consider a mixture cure survival model with the incidence rate modeled by a logistic regression model and latency part modeled by the Cox proportional hazards model. These modeling assumptions, though typical, restrict the structure of covariate effects on both the incidence and latency components. As a plausible recourse to attain flexibility, we study a class of semiparametric mixture cure models in this article, which incorporates two single-index functions for modeling the two regression components. A hybrid nonparametric maximum likelihood estimation method is proposed, where the cumulative baseline hazard function for uncured subjects is estimated nonparametrically, and the two single-index functions are estimated via Bernstein polynomials. Parameter estimation is carried out via a curated expectation-maximization algorithm. We also conducted a large-scale simulation study to assess the finite-sample performance of the estimator. The proposed methodology is illustrated via application to two cancer datasets.
Affiliation(s)
- Chun Yin Lee
  - Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong
- Kin Yau Wong
  - Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong
  - Hong Kong Polytechnic University Shenzhen Research Institute, Shenzhen, China

5. Liu L, Su W, Zhao X. Semiparametric estimation and testing for panel count data with informative interval-censored failure event. Stat Med 2023; 42:5596-5615. [PMID: 37867199 DOI: 10.1002/sim.9927] [Received: 08/22/2022] [Revised: 07/26/2023] [Accepted: 09/19/2023] [Indexed: 10/24/2023]
Abstract
Panel count data and interval-censored data are two types of incomplete data that often occur in event history studies. Almost all existing statistical methods are developed for their separate analysis. In this paper, we investigate a more general situation where a recurrent event process and an interval-censored failure event occur together. To intuitively and clearly explain the relationship between the recurrent event process and the failure event, we propose a failure time-dependent mean model through a completely unspecified link function. To overcome the challenges arising from the blending of nonparametric components and parametric regression coefficients, we develop a two-stage conditional expected likelihood-based estimation procedure. We establish the consistency, the convergence rate and the asymptotic normality of the proposed two-stage estimator. Furthermore, we construct a class of two-sample tests for comparison of mean functions from different groups. The proposed methods are evaluated by extensive simulation studies and are illustrated with the skin cancer data that motivated this study.
Affiliation(s)
- Li Liu
  - School of Mathematics and Statistics, Wuhan University, Wuhan, China
- Wen Su
  - Department of Biostatistics, City University of Hong Kong, Hong Kong, China
- Xingqiu Zhao
  - Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong, China

6. Rosner B, Bay C, Glynn RJ, Ying GS, Maguire MG, Lee MLT. Estimation and testing for clustered interval-censored bivariate survival data with application using the semi-parametric version of the Clayton-Oakes model. Lifetime Data Anal 2023; 29:854-887. [PMID: 36670299 PMCID: PMC10614833 DOI: 10.1007/s10985-022-09588-y] [Received: 06/21/2022] [Accepted: 12/22/2022] [Indexed: 06/17/2023]
Abstract
The Kaplan-Meier estimator is ubiquitously used to estimate survival probabilities for time-to-event data. It is nonparametric, and thus does not require specification of a survival distribution, but it does assume that the risk set at any time t consists of independent observations. This assumption does not hold for data from paired organ systems such as occur in ophthalmology (eyes) or otolaryngology (ears), or for other types of clustered data. In this article, we estimate marginal survival probabilities in the setting of clustered data, and provide confidence limits for these estimates with intra-cluster correlation accounted for by an interval-censored version of the Clayton-Oakes model. We develop a goodness-of-fit test for general bivariate interval-censored data and apply it to the proposed interval-censored version of the Clayton-Oakes model. We also propose a likelihood ratio test for the comparison of survival distributions between two groups in the setting of clustered data under the assumption of a constant between-group hazard ratio. This methodology can be used both for balanced and unbalanced cluster sizes, and also when the cluster size is informative. We compare our test to the ordinary log rank test and the Lin-Wei (LW) test based on the marginal Cox proportional hazards model with robust standard errors obtained from the sandwich estimator. Simulation results indicate that the ordinary log rank test inflates type I error, while the proposed unconditional likelihood ratio test has appropriate type I error and higher power than the LW test. The method is demonstrated in real examples from the Sorbinil Retinopathy Trial, and the Age-Related Macular Degeneration Study. Raw data from these two trials are provided.
Affiliation(s)
- Bernard Rosner
  - Channing Division of Network Medicine, Department of Medicine, Harvard Medical School, Boston, MA, USA
- Camden Bay
  - Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Robert J Glynn
  - Division of Preventive Medicine, Department of Medicine, Harvard Medical School, Boston, MA, USA
- Gui-Shuang Ying
  - Center for Preventive Ophthalmology and Biostatistics, Department of Ophthalmology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Maureen G Maguire
  - Center for Preventive Ophthalmology and Biostatistics, Department of Ophthalmology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Mei-Ling Ting Lee
  - Department of Epidemiology and Biostatistics, University of Maryland at College Park, College Park, MD, USA

7. Lee CY, Wong KY, Lam KF, Bandyopadhyay D. A semiparametric joint model for cluster size and subunit-specific interval-censored outcomes. Biometrics 2023; 79:2010-2022. [PMID: 36377514 PMCID: PMC10183480 DOI: 10.1111/biom.13795] [Received: 12/02/2021] [Accepted: 11/04/2022] [Indexed: 11/16/2022]
Abstract
Clustered data frequently arise in biomedical studies, where observations, or subunits, measured within a cluster are associated. The cluster size is said to be informative, if the outcome variable is associated with the number of subunits in a cluster. In most existing work, the informative cluster size issue is handled by marginal approaches based on within-cluster resampling, or cluster-weighted generalized estimating equations. Although these approaches yield consistent estimation of the marginal models, they do not allow estimation of within-cluster associations and are generally inefficient. In this paper, we propose a semiparametric joint model for clustered interval-censored event time data with informative cluster size. We use a random effect to account for the association among event times of the same cluster as well as the association between event times and the cluster size. For estimation, we propose a sieve maximum likelihood approach and devise a computationally-efficient expectation-maximization algorithm for implementation. The estimators are shown to be strongly consistent, with the Euclidean components being asymptotically normal and achieving semiparametric efficiency. Extensive simulation studies are conducted to evaluate the finite-sample performance, efficiency and robustness of the proposed method. We also illustrate our method via application to a motivating periodontal disease dataset.
Affiliation(s)
- Chun Yin Lee
  - Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong
- Kin Yau Wong
  - Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong
- K. F. Lam
  - Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore

8. Xu Y, Zeng D, Lin DY. Marginal proportional hazards models for multivariate interval-censored data. Biometrika 2023; 110:815-830. [PMID: 37601305 PMCID: PMC10434824 DOI: 10.1093/biomet/asac059] [Indexed: 08/22/2023] Open
Abstract
Multivariate interval-censored data arise when there are multiple types of events or clusters of study subjects, such that the event times are potentially correlated and when each event is only known to occur over a particular time interval. We formulate the effects of potentially time-varying covariates on the multivariate event times through marginal proportional hazards models while leaving the dependence structures of the related event times unspecified. We construct the nonparametric pseudolikelihood under the working assumption that all event times are independent, and we provide a simple and stable EM-type algorithm. The resulting nonparametric maximum pseudolikelihood estimators for the regression parameters are shown to be consistent and asymptotically normal, with a limiting covariance matrix that can be consistently estimated by a sandwich estimator under arbitrary dependence structures for the related event times. We evaluate the performance of the proposed methods through extensive simulation studies and present an application to data from the Atherosclerosis Risk in Communities Study.
Affiliation(s)
- Yangjianchen Xu
  - Department of Biostatistics, University of North Carolina, 3101E McGavran-Greenberg Hall, Chapel Hill, North Carolina 27599, U.S.A
- Donglin Zeng
  - Department of Biostatistics, University of North Carolina, 3101E McGavran-Greenberg Hall, Chapel Hill, North Carolina 27599, U.S.A
- D Y Lin
  - Department of Biostatistics, University of North Carolina, 3101E McGavran-Greenberg Hall, Chapel Hill, North Carolina 27599, U.S.A

9. Cook K, Lu W, Wang R. Marginal proportional hazards models for clustered interval-censored data with time-dependent covariates. Biometrics 2023; 79:1670-1685. [PMID: 36314377 DOI: 10.1111/biom.13787] [Received: 02/24/2021] [Accepted: 10/18/2022] [Indexed: 11/29/2022]
Abstract
The Botswana Combination Prevention Project was a cluster-randomized HIV prevention trial whose follow-up period coincided with Botswana's national adoption of a universal test and treat strategy for HIV management. Of interest is whether, and to what extent, this change in policy modified the preventative effects of the study intervention. To address such questions, we adopt a stratified proportional hazards model for clustered interval-censored data with time-dependent covariates and develop a composite expectation maximization algorithm that facilitates estimation of model parameters without placing parametric assumptions on either the baseline hazard functions or the within-cluster dependence structure. We show that the resulting estimators for the regression parameters are consistent and asymptotically normal. We also propose and provide theoretical justification for the use of the profile composite likelihood function to construct a robust sandwich estimator for the variance. We characterize the finite-sample performance and robustness of these estimators through extensive simulation studies. Finally, we conclude by applying this stratified proportional hazards model to a re-analysis of the Botswana Combination Prevention Project, with the national adoption of a universal test and treat strategy now modeled as a time-dependent covariate.
Affiliation(s)
- Kaitlyn Cook
  - Program in Statistical and Data Sciences, Smith College, Northampton, Massachusetts, USA
  - Department of Population Medicine, Harvard Pilgrim Health Care Institute and Harvard Medical School, Boston, Massachusetts, USA
- Wenbin Lu
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, USA
- Rui Wang
  - Department of Population Medicine, Harvard Pilgrim Health Care Institute and Harvard Medical School, Boston, Massachusetts, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA

10. Li S, Peng L. Instrumental variable estimation of complier causal treatment effect with interval-censored data. Biometrics 2023; 79:253-263. [PMID: 34528243 PMCID: PMC8924024 DOI: 10.1111/biom.13565] [Received: 10/29/2020] [Revised: 07/29/2021] [Accepted: 09/08/2021] [Indexed: 11/29/2022]
Abstract
Assessing the causal treatment effect on a time-to-event outcome is of key interest in many scientific investigations. An instrumental variable (IV) is a useful tool to mitigate the impact of endogenous treatment selection and to attain unbiased estimation of the causal treatment effect. Existing development of IV methodology, however, has not attended to outcomes subject to interval censoring, which are ubiquitously present in studies with intermittent follow-up but are challenging to handle in terms of both theory and computation. In this work, we fill in this important gap by studying a general class of causal semiparametric transformation models with interval-censored data. We propose a nonparametric maximum likelihood estimator of the complier causal treatment effect. Moreover, we design a reliable and computationally stable expectation-maximization (EM) algorithm, which has a tractable objective function in the maximization step via the use of Poisson latent variables. The asymptotic properties of the proposed estimators, including the consistency, asymptotic normality, and semiparametric efficiency, are established with empirical process techniques. We conduct extensive simulation studies and an application to a colorectal cancer screening data set, showing satisfactory finite-sample performance of the proposed method as well as its prominent advantages over naive methods.
Affiliation(s)
- Shuwei Li
  - School of Economics and Statistics, Guangzhou University, Guangzhou, Guangdong 510006, China
- Limin Peng
  - Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, 30322, U.S.A

11. Gao F, Zeng D, Wang Y. Semiparametric regression analysis of bivariate censored events in a family study of Alzheimer's disease. Biostatistics 2022; 24:32-51. [PMID: 33948627 DOI: 10.1093/biostatistics/kxab014] [Received: 06/19/2020] [Revised: 03/21/2021] [Accepted: 03/25/2021] [Indexed: 12/16/2022] Open
Abstract
Assessing disease comorbidity patterns in families represents the first step in gene mapping for diseases and is central to the practice of precision medicine. One way to evaluate the relative contributions of genetic risk factors and environmental determinants of a complex trait (e.g., Alzheimer's disease [AD]) and its comorbidities (e.g., cardiovascular diseases [CVD]) is through familial studies, where an initial cohort of subjects is recruited, genotyped for specific loci, and interviewed to provide extensive disease history in family members. Because of the retrospective nature of obtaining disease phenotypes in family members, the exact time of disease onset may not be available such that current status data or interval-censored data are observed. All existing methods for analyzing these family study data assume a single event subject to right-censoring and so are not applicable. In this article, we propose a semiparametric regression model for the family history data that assumes a family-specific random effect and individual random effects to account for the dependence due to shared environmental exposures and unobserved genetic relatedness, respectively. To incorporate multiple events, we jointly model the onset of the primary disease of interest and a secondary disease outcome that is subject to interval-censoring. We propose nonparametric maximum likelihood estimation and develop a stable Expectation-Maximization (EM) algorithm for computation. We establish the asymptotic properties of the resulting estimators and examine the performance of the proposed methods through simulation studies. Our application to a real-world study reveals that the main contribution of comorbidity between AD and CVD is due to genetic factors instead of environmental factors.
Affiliation(s)
- Fei Gao
  - Division of Vaccine and Infectious Disease, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Donglin Zeng
  - Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, USA
- Yuanjia Wang
  - Department of Biostatistics, Columbia University, New York, NY 10032, USA

12. Sun L, Li S, Wang L, Song X, Sui X. Simultaneous variable selection in regression analysis of multivariate interval-censored data. Biometrics 2022; 78:1402-1413. [PMID: 34407218 DOI: 10.1111/biom.13548] [Received: 11/10/2020] [Revised: 05/13/2021] [Accepted: 08/03/2021] [Indexed: 12/30/2022]
Abstract
Multivariate interval-censored data arise when each subject under study can potentially experience multiple events and the onset time of each event is not observed exactly but is known to lie in a certain time interval formed by adjacent examination times with changed statuses of the event. This type of incomplete and complex data structure poses a substantial challenge in practical data analysis. In addition, many potential risk factors exist in numerous studies. Thus, conducting variable selection for event-specific covariates simultaneously becomes useful in identifying important variables and assessing their effects on the events of interest. In this paper, we develop a variable selection technique for multivariate interval-censored data under a general class of semiparametric transformation frailty models. The minimum information criterion (MIC) method is embedded in the optimization step of the proposed expectation-maximization (EM) algorithm to obtain the parameter estimator. The proposed EM algorithm greatly reduces the computational burden in maximizing the observed likelihood function, and the MIC naturally avoids selecting the optimal tuning parameter as needed in many other popular penalties, making the proposed algorithm promising and reliable. The proposed method is evaluated through extensive simulation studies and illustrated by an analysis of patient data from the Aerobics Center Longitudinal Study.
Affiliation(s)
- Liuquan Sun
  - School of Economics and Statistics, Guangzhou University, Guangzhou, China
  - Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- Shuwei Li
  - School of Economics and Statistics, Guangzhou University, Guangzhou, China
- Lianming Wang
  - Department of Statistics, University of South Carolina, Columbia, South Carolina, USA
- Xinyuan Song
  - Department of Statistics, Chinese University of Hong Kong, Hong Kong
- Xuemei Sui
  - Department of Exercise Science, Arnold School of Public Health, University of South Carolina, Columbia, South Carolina, USA

13. Liu R, Du M, Sun J. Variable selection for bivariate interval-censored failure time data under linear transformation models. Int J Biostat 2022:ijb-2021-0031. [PMID: 35654407 DOI: 10.1515/ijb-2021-0031] [Received: 04/03/2021] [Accepted: 04/20/2022] [Indexed: 11/15/2022]
Abstract
Variable selection is needed and performed in almost every field and a large literature on it has been established, especially in the context of linear models or for complete data. Many authors have also investigated the variable selection problem for incomplete data such as right-censored failure time data. In this paper, we discuss variable selection when one faces bivariate interval-censored failure time data arising from a linear transformation model, for which there does not seem to exist an established procedure. For the problem, a penalized maximum likelihood approach is proposed and in particular, a novel Poisson-based EM algorithm is developed for the implementation. The oracle property of the proposed method is established, and the numerical studies suggest that the method works well for practical situations.
Affiliation(s)
- Rong Liu
- Center for Applied Statistical Research, School of Mathematics, Jilin University, Changchun 130012, China
| | - Mingyue Du
- Center for Applied Statistical Research, School of Mathematics, Jilin University, Changchun 130012, China
| | - Jianguo Sun
- Department of Statistics, University of Missouri, Columbia, MO, 65211, USA
|
14
|
Petti D, Eletti A, Marra G, Radice R. Copula link-based additive models for bivariate time-to-event outcomes with general censoring scheme. Comput Stat Data Anal 2022. [DOI: 10.1016/j.csda.2022.107550] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
|
15
|
Zhao W, Peng L, Hanfelt J. Semiparametric latent class analysis of recurrent event data. J R Stat Soc Series B Stat Methodol 2022; 84:1175-1197. [DOI: 10.1111/rssb.12499] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Wei Zhao
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, USA
- Zhongtai Securities Institute for Financial Studies, Shandong University, Jinan, China
| | - Limin Peng
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, USA
| | - John Hanfelt
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, USA
|
16
|
Zeng D, Lin DY. Maximum Likelihood Estimation for Semiparametric Regression Models With Panel Count Data. Biometrika 2021; 108:947-963. [PMID: 34949875 DOI: 10.1093/biomet/asaa091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Abstract
Panel count data, in which the observation for each study subject consists of the number of recurrent events between successive examinations, are commonly encountered in industrial reliability testing, medical research, and various other scientific investigations. We formulate the effects of potentially time-dependent covariates on one or more types of recurrent events through non-homogeneous Poisson processes with random effects. We adopt nonparametric maximum likelihood estimation under arbitrary examination schemes and develop a simple and stable EM algorithm. We show that the resulting estimators of the regression parameters are consistent and asymptotically normal, with a covariance matrix that achieves the semiparametric efficiency bound and can be estimated through profile likelihood. We evaluate the performance of the proposed methods through extensive simulation studies and present a skin cancer clinical trial.
Affiliation(s)
- Donglin Zeng
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599-7420, USA
| | - D Y Lin
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599-7420, USA
|
17
|
Yang D, Du M, Sun J. Semiparametric regression analysis of clustered interval-censored failure time data with a cured subgroup. Stat Med 2021; 40:6918-6930. [PMID: 34634837 DOI: 10.1002/sim.9218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2021] [Revised: 08/21/2021] [Accepted: 09/21/2021] [Indexed: 11/06/2022]
Abstract
This article discusses regression analysis of clustered interval-censored failure time data in the presence of a cured fraction or subgroup. Such data occur in many areas, including epidemiological studies, medical studies, and social sciences. For this problem, a class of semiparametric transformation nonmixture cure models is presented and, for estimation, the maximum likelihood estimation procedure is derived. For the implementation of the proposed method, we develop a novel EM algorithm based on a Poisson variable-based augmentation. An extensive simulation study is conducted and suggests that the proposed approach works well in practical situations. Finally, the method is applied to an example that motivated this study.
Affiliation(s)
- Dian Yang
- Department of Statistics, University of Missouri, Columbia, Missouri, USA
| | - Mingyue Du
- Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong, China
| | - Jianguo Sun
- Department of Statistics, University of Missouri, Columbia, Missouri, USA
|
18
|
Lin DY, Gu Y, Zeng D, Janes HE, Gilbert PB. Evaluating Vaccine Efficacy Against SARS-CoV-2 Infection. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2021:2021.04.16.21255614. [PMID: 33880481 PMCID: PMC8057249 DOI: 10.1101/2021.04.16.21255614] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Although interim results from several large placebo-controlled phase 3 trials demonstrated high vaccine efficacy (VE) against symptomatic COVID-19, it is unknown how effective the vaccines are in preventing people from becoming asymptomatically infected and potentially spreading the virus unwittingly. It is more difficult to evaluate VE against SARS-CoV-2 infection than against symptomatic COVID-19 because infection is not observed directly but rather is known to occur between two antibody or RT-PCR tests. Additional challenges arise as community transmission changes over time and as participants are vaccinated on different dates because of staggered enrollment or crossover before the end of the study. Here, we provide valid and efficient statistical methods for estimating potentially waning VE against SARS-CoV-2 infection with blood or nasal samples under time-varying community transmission, staggered enrollment, and blinded or unblinded crossover. We demonstrate the usefulness of the proposed methods through numerical studies mimicking the BNT162b2 phase 3 trial and the Prevent COVID U study. In addition, we assess how crossover and the frequency of diagnostic tests affect the precision of VE estimates. Summary: We show how to estimate potentially waning efficacy of COVID-19 vaccines against SARS-CoV-2 infection using blood or nasal samples collected periodically from clinical trials with staggered enrollment of participants and crossover of placebo recipients.
|
19
|
Zhou Q, Cai J, Zhou H. Semiparametric regression analysis of case-cohort studies with multiple interval-censored disease outcomes. Stat Med 2021; 40:3106-3123. [PMID: 33783001 DOI: 10.1002/sim.8962] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2020] [Revised: 03/01/2021] [Accepted: 03/10/2021] [Indexed: 11/05/2022]
Abstract
Interval-censored failure time data commonly arise in epidemiological and biomedical studies where the occurrence of an event or a disease is determined via periodic examinations. Subject to interval-censoring, available information on the failure time can be quite limited. Cost-effective sampling designs are desirable to enhance the study power, especially when the disease rate is low and the covariates are expensive to obtain. In this work, we formulate the case-cohort design with multiple interval-censored disease outcomes and also generalize it to nonrare diseases where only a portion of diseased subjects are sampled. We develop a marginal sieve weighted likelihood approach, which assumes that the failure times marginally follow the proportional hazards model. We consider two types of weights to account for the sampling bias, and adopt a sieve method with Bernstein polynomials to handle the unknown baseline functions. We employ a weighted bootstrap procedure to obtain a variance estimate that is robust to the dependence structure between failure times. The proposed method is examined via simulation studies and illustrated with a dataset on incident diabetes and hypertension from the Atherosclerosis Risk in Communities study.
Affiliation(s)
- Qingning Zhou
- Department of Mathematics and Statistics, University of North Carolina at Charlotte, Charlotte, North Carolina, USA
| | - Jianwen Cai
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Haibo Zhou
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
|
20
|
Wu D, Li C. Joint analysis of multivariate interval-censored survival data and a time-dependent covariate. Stat Methods Med Res 2020; 30:769-784. [PMID: 33256555 DOI: 10.1177/0962280220975064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
We develop a joint modeling method for multivariate interval-censored survival data and a time-dependent covariate that is intermittently measured with error. The joint model is estimated using nonparametric maximum likelihood estimation, which is carried out via an expectation-maximization algorithm, and the inference for finite-dimensional parameters is performed using the bootstrap. We also develop a similar joint modeling method for univariate interval-censored survival data and a time-dependent covariate, which surpasses existing methods in terms of model flexibility and interpretation. Simulation studies show that the model fitting and inference approaches perform very well under realistic sample sizes. We apply the method to a longitudinal study of dental caries in African-American children from low-income families in the city of Detroit, Michigan.
Affiliation(s)
- Di Wu
- Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, MI, USA
| | - Chenxi Li
- Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, MI, USA
|
21
|
Lee CY, Wong KY, Lam KF, Xu J. Analysis of clustered interval‐censored data using a class of semiparametric partly linear frailty transformation models. Biometrics 2020; 78:165-178. [DOI: 10.1111/biom.13399] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/28/2020] [Revised: 10/19/2020] [Accepted: 10/22/2020] [Indexed: 11/30/2022]
Affiliation(s)
- Chun Yin Lee
- Department of Applied Mathematics, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong, People's Republic of China
| | - Kin Yau Wong
- Department of Applied Mathematics, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong, People's Republic of China
| | - K. F. Lam
- Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong, People's Republic of China
| | - Jinfeng Xu
- Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong, People's Republic of China
|
22
|
Li C, Pak D, Todem D. Adaptive lasso for the Cox regression with interval censored and possibly left truncated data. Stat Methods Med Res 2020; 29:1243-1255. [PMID: 31203741 PMCID: PMC9969839 DOI: 10.1177/0962280219856238] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 02/05/2023]
Abstract
We propose a penalized variable selection method for the Cox proportional hazards model with interval censored data. It conducts a penalized nonparametric maximum likelihood estimation with an adaptive lasso penalty, which can be implemented through a penalized EM algorithm. The method is proven to enjoy the desirable oracle property. We also extend the method to left truncated and interval censored data. Our simulation studies show that the method possesses the oracle property in samples of modest sizes and outperforms available existing approaches in many of the operating characteristics. An application to a dental caries data set illustrates the method's utility.
Affiliation(s)
- Chenxi Li
- Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, MI, USA
| | - Daewoo Pak
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - David Todem
- Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, MI, USA
|
23
|
Yi F, Tang N, Sun J. Regression analysis of interval-censored failure time data with time-dependent covariates. Comput Stat Data Anal 2020. [DOI: 10.1016/j.csda.2019.106848] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/25/2022]
|
24
|
Sun T, Ding Y. Copula-based semiparametric regression method for bivariate data under general interval censoring. Biostatistics 2019; 22:315-330. [PMID: 31506682 DOI: 10.1093/biostatistics/kxz032] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 05/24/2019] [Revised: 08/09/2019] [Accepted: 08/11/2019] [Indexed: 11/12/2022]
Abstract
This research is motivated by discovering and underpinning genetic causes for the progression of a bilateral eye disease, age-related macular degeneration (AMD), of which the primary outcomes, progression times to late-AMD, are bivariate and interval-censored due to intermittent assessment times. We propose a novel class of copula-based semiparametric transformation models for bivariate data under general interval censoring, which includes the case 1 interval censoring (current status data) and case 2 interval censoring. Specifically, the joint likelihood is modeled through a two-parameter Archimedean copula, which can flexibly characterize the dependence between the two margins in both tails. The marginal distributions are modeled through semiparametric transformation models using sieves, with the proportional hazards or odds model being a special case. We develop a computationally efficient sieve maximum likelihood estimation procedure for the unknown parameters, together with a generalized score test for the regression parameter(s). For the proposed sieve estimators of finite-dimensional parameters, we establish their asymptotic normality and efficiency. Extensive simulations are conducted to evaluate the performance of the proposed method in finite samples. Finally, we apply our method to a genome-wide analysis of AMD progression using the Age-Related Eye Disease Study data, to successfully identify novel risk variants associated with the disease progression. We also produce predicted joint and conditional progression-free probabilities, for patients with different genetic characteristics.
Affiliation(s)
- Tao Sun
- Department of Biostatistics, University of Pittsburgh, 130 DeSoto St, Pittsburgh, PA 15261, USA
| | - Ying Ding
- Department of Biostatistics, University of Pittsburgh, 130 DeSoto St, Pittsburgh, PA 15261, USA
|
25
|
Gao F, Chan KCG. Semiparametric regression analysis of length‐biased interval‐censored data. Biometrics 2019; 75:121-132. [DOI: 10.1111/biom.12970] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 03/06/2018] [Accepted: 09/12/2018] [Indexed: 11/28/2022]
Affiliation(s)
- Fei Gao
- Department of Biostatistics, University of Washington, Seattle, Washington
|
26
|
Gao F, Zeng D, Couper D, Lin DY. Semiparametric Regression Analysis of Multiple Right- and Interval-Censored Events. J Am Stat Assoc 2018; 114:1232-1240. [PMID: 31588157 DOI: 10.1080/01621459.2018.1482756] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 10/14/2022]
Abstract
Health sciences research often involves both right- and interval-censored events because the occurrence of a symptomatic disease can only be observed up to the end of follow-up, while the occurrence of an asymptomatic disease can only be detected through periodic examinations. We formulate the effects of potentially time-dependent covariates on the joint distribution of multiple right- and interval-censored events through semiparametric proportional hazards models with random effects that capture the dependence both within and between the two types of events. We consider nonparametric maximum likelihood estimation and develop a simple and stable EM algorithm for computation. We show that the resulting estimators are consistent and the parametric components are asymptotically normal and efficient with a covariance matrix that can be consistently estimated by profile likelihood or nonparametric bootstrap. In addition, we leverage the joint modelling to provide dynamic prediction of disease incidence based on the evolving event history. Furthermore, we assess the performance of the proposed methods through extensive simulation studies. Finally, we provide an application to a major epidemiological cohort study. Supplementary materials for this article are available online.
Affiliation(s)
- Fei Gao
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC
| | - Donglin Zeng
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC
| | - David Couper
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC
| | - D Y Lin
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC
|