1. Lee J, Ogino S, Wang M. Weighting estimation in the cause-specific Cox regression with partially missing causes of failure. Stat Med 2024; 43:2575-2591. [PMID: 38659326 DOI: 10.1002/sim.10084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2023] [Revised: 02/25/2024] [Accepted: 04/05/2024] [Indexed: 04/26/2024]
Abstract
Complex diseases are often analyzed using disease subtypes classified by multiple biomarkers to study pathogenic heterogeneity. In such molecular pathological epidemiology research, we consider a weighted Cox proportional hazard model to evaluate the effect of exposures on various disease subtypes under competing-risk settings in the presence of partially or completely missing biomarkers. The asymptotic properties of the inverse and augmented inverse probability-weighted estimating equation methods are studied with a general pattern of missing data. Simulation studies have been conducted to demonstrate the double robustness of the estimators. For illustration, we applied this method to examine the association between pack-years of smoking before the age of 30 and the incidence of colorectal cancer subtypes defined by a combination of four tumor molecular biomarkers (statuses of microsatellite instability, CpG island methylator phenotype, BRAF mutation, and KRAS mutation) in the Nurses' Health Study cohort.
Affiliation(s)
- Jooyoung Lee
  - Department of Applied Statistics, Chung-Ang University, Seoul, Korea
- Shuji Ogino
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
  - Cancer Immunology and Cancer Epidemiology Programs, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
  - Program in Molecular Pathological Epidemiology, Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA
  - Eli and Edythe L Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Molin Wang
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA

2. Li W, Ma H, Faraggi D, Dinse GE. Generalized mean residual life models for survival data with missing censoring indicators. Stat Med 2023; 42:264-280. [PMID: 36437483 DOI: 10.1002/sim.9615] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2021] [Revised: 10/23/2022] [Accepted: 11/15/2022] [Indexed: 11/29/2022]
Abstract
The mean residual life (MRL) function is an important and attractive alternative to the hazard function for characterizing the distribution of a time-to-event variable. In this article, we study the modeling and inference of a family of generalized MRL models for right-censored survival data with censoring indicators missing at random. To estimate the model parameters, augmented inverse probability weighted estimating equation approaches are developed, in which the non-missingness probability and the conditional probability of an uncensored observation are estimated by parametric methods or nonparametric kernel smoothing techniques. Asymptotic properties of the proposed estimators are established and finite sample performance is evaluated by extensive simulation studies. An application to brain cancer data is presented to illustrate the proposed methods.
Affiliation(s)
- Wenwen Li
  - KLATASDS-MOE, School of Statistics and Academy of Statistics and Interdisciplinary Sciences, East China Normal University, Shanghai, China
- Huijuan Ma
  - KLATASDS-MOE, School of Statistics and Academy of Statistics and Interdisciplinary Sciences, East China Normal University, Shanghai, China
- David Faraggi
  - KLATASDS-MOE, School of Statistics and Academy of Statistics and Interdisciplinary Sciences, East China Normal University, Shanghai, China
  - Department of Statistics, University of Haifa, Haifa, Israel
- Gregg E Dinse
  - Public Health & Scientific Research, Social and Scientific Systems, Durham, North Carolina, USA

3. Ben Elouefi R, Saâdaoui F. Inverse-Probability-Weighted Logrank Test for Stratified Survival Data with Missing Measurements. Stat Neerl 2022. [DOI: 10.1111/stan.12276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
Affiliation(s)
- Rim Ben Elouefi
  - École Supérieure Privée d'Ingénierie et de Technologie, Pôle Technologique El Ghazala, Tunis, Tunisia
- Foued Saâdaoui
  - Department of Statistics, Faculty of Sciences, King Abdulaziz University, P.O. Box 80203, Jeddah, Saudi Arabia
  - University of Sousse, Institut des Hautes Etudes Commerciales (IHEC), Route Hzamia, Sahloul 3, B.P. 40, Sousse, Tunisia
  - Lab: LR18ES15 Algèbre, Théorie de Nombres et Analyse Non-linéaire, Faculté des Sciences, Monastir, Tunisia

4. Zhou W, Bakoyannis G, Zhang Y, Yiannoutsos CT. Semiparametric marginal regression for clustered competing risks data with missing cause of failure. Biostatistics 2022:6567216. [PMID: 35411923 PMCID: PMC10345995 DOI: 10.1093/biostatistics/kxac012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2021] [Revised: 03/22/2022] [Accepted: 03/23/2022] [Indexed: 11/12/2022]
Abstract
Clustered competing risks data are commonly encountered in multicenter studies. The analysis of such data is often complicated due to informative cluster size (ICS), a situation where the outcomes under study are associated with the size of the cluster. In addition, the cause of failure is frequently incompletely observed in real-world settings. To the best of our knowledge, there is no methodology for population-averaged analysis with clustered competing risks data with an ICS and missing causes of failure. To address this problem, we consider the semiparametric marginal proportional cause-specific hazards model and propose a maximum partial pseudolikelihood estimator under a missing at random assumption. To make the latter assumption more plausible in practice, we allow for auxiliary variables that may be related to the probability of missingness. The proposed method does not impose assumptions regarding the within-cluster dependence and allows for ICS. The asymptotic properties of the proposed estimators for both regression coefficients and infinite-dimensional parameters, such as the marginal cumulative incidence functions, are rigorously established. Simulation studies show that the proposed method performs well and that methods that ignore the within-cluster dependence and the ICS lead to invalid inferences. The proposed method is applied to competing risks data from a large multicenter HIV study in sub-Saharan Africa where a significant portion of causes of failure is missing.
Affiliation(s)
- Wenxian Zhou
  - Department of Biostatistics and Health Data Science, Indiana University, 410 West 10th Street, Suite 3000, Indianapolis, IN 46202, USA
- Giorgos Bakoyannis
  - Department of Biostatistics and Health Data Science, Indiana University, 410 West 10th Street, Suite 3000, Indianapolis, IN 46202, USA
- Ying Zhang
  - Department of Biostatistics, University of Nebraska Medical Center, 42nd and Emile, Omaha, NE 68198, USA
- Constantin T Yiannoutsos
  - Department of Biostatistics and Health Data Science, Indiana University, 410 West 10th Street, Suite 3000, Indianapolis, IN 46202, USA

5. Bakoyannis G, Zhang Y, Yiannoutsos CT. Semiparametric regression and risk prediction with competing risks data under missing cause of failure. Lifetime Data Analysis 2020; 26:659-684. [PMID: 31982977 PMCID: PMC7381366 DOI: 10.1007/s10985-020-09494-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 12/17/2018] [Accepted: 01/16/2020] [Indexed: 06/10/2023]
Abstract
The cause of failure in cohort studies that involve competing risks is frequently incompletely observed. To address this, several methods have been proposed for the semiparametric proportional cause-specific hazards model under a missing at random assumption. However, these proposals provide inference for the regression coefficients only, and do not consider the infinite dimensional parameters, such as the covariate-specific cumulative incidence function. Nevertheless, the latter quantity is essential for risk prediction in modern medicine. In this paper we propose a unified framework for inference about both the regression coefficients of the proportional cause-specific hazards model and the covariate-specific cumulative incidence functions under missing at random cause of failure. Our approach is based on a novel computationally efficient maximum pseudo-partial-likelihood estimation method for the semiparametric proportional cause-specific hazards model. Using modern empirical process theory we derive the asymptotic properties of the proposed estimators for the regression coefficients and the covariate-specific cumulative incidence functions, and provide methodology for constructing simultaneous confidence bands for the latter. Simulation studies show that our estimators perform well even in the presence of a large fraction of missing cause of failures, and that the regression coefficient estimator can be substantially more efficient compared to the previously proposed augmented inverse probability weighting estimator. The method is applied using data from an HIV cohort study and a bladder cancer clinical trial.
Affiliation(s)
- Giorgos Bakoyannis
  - Department of Biostatistics, Indiana University Fairbanks School of Public Health and School of Medicine, 410 West 10th Street, Suite 3000, Indianapolis, IN, 46202, USA
- Ying Zhang
  - Department of Biostatistics, University of Nebraska Medical Center, Omaha, USA
- Constantin T Yiannoutsos
  - Department of Biostatistics, Indiana University Fairbanks School of Public Health and School of Medicine, 410 West 10th Street, Suite 3000, Indianapolis, IN, 46202, USA

6. Heng F, Sun Y, Hyun S, Gilbert PB. Analysis of the time-varying Cox model for the cause-specific hazard functions with missing causes. Lifetime Data Analysis 2020; 26:731-760. [PMID: 32274677 PMCID: PMC7487047 DOI: 10.1007/s10985-020-09497-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2018] [Accepted: 03/27/2020] [Indexed: 06/11/2023]
Abstract
This paper studies the Cox model with time-varying coefficients for cause-specific hazard functions when the causes of failure are subject to missingness. Inverse probability weighted and augmented inverse probability weighted estimators are investigated. The latter is considered as a two-stage estimator by directly utilizing the inverse probability weighted estimator and through modeling available auxiliary variables to improve efficiency. The asymptotic properties of the two estimators are investigated. Hypothesis testing procedures are developed to test the null hypotheses that the covariate effects are zero and that the covariate effects are constant. We conduct simulation studies to examine the finite sample properties of the proposed estimation and hypothesis testing procedures under various settings of the auxiliary variables and the percentages of the failure causes that are missing. These simulation results demonstrate that the augmented inverse probability weighted estimators are more efficient than the inverse probability weighted estimators and that the proposed testing procedures have the expected satisfactory results in sizes and powers. The proposed methods are illustrated using the Mashi clinical trial data for investigating the effect of randomization to formula-feeding versus breastfeeding plus extended infant zidovudine prophylaxis on death due to mother-to-child HIV transmission in Botswana.
Affiliation(s)
- Fei Heng
  - Department of Mathematics and Statistics, University of North Florida, Jacksonville, FL, 32224, USA
- Yanqing Sun
  - Department of Mathematics and Statistics, University of North Carolina at Charlotte, Charlotte, NC, 28223, USA
- Seunggeun Hyun
  - Division of Mathematics and Computer Science, University of South Carolina Upstate, Spartanburg, SC, 29303, USA
- Peter B Gilbert
  - University of Washington and Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA

7. Nevo D, Nishihara R, Ogino S, Wang M. The competing risks Cox model with auxiliary case covariates under weaker missing-at-random cause of failure. Lifetime Data Analysis 2018; 24:425-442. [PMID: 28779227 PMCID: PMC5797530 DOI: 10.1007/s10985-017-9401-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 09/13/2016] [Accepted: 07/23/2017] [Indexed: 05/08/2023]
Abstract
In the analysis of time-to-event data with multiple causes using a competing risks Cox model, often the cause of failure is unknown for some of the cases. The probability of a missing cause is typically assumed to be independent of the cause given the time of the event and covariates measured before the event occurred. In practice, however, the underlying missing-at-random assumption does not necessarily hold. Motivated by colorectal cancer molecular pathological epidemiology analysis, we develop a method to conduct valid analysis when additional auxiliary variables are available for cases only. We consider a weaker missing-at-random assumption, with missing pattern depending on the observed quantities, which include the auxiliary covariates. We use an informative likelihood approach that will yield consistent estimates even when the underlying model for missing cause of failure is misspecified. The superiority of our method over naive methods in finite samples is demonstrated by simulation study results. We illustrate the use of our method in an analysis of colorectal cancer data from the Nurses' Health Study cohort, where, apparently, the traditional missing-at-random assumption fails to hold.
Affiliation(s)
- Daniel Nevo
  - Departments of Biostatistics and Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Reiko Nishihara
  - Departments of Epidemiology and Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Shuji Ogino
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
  - Division of MPE Molecular Pathological Epidemiology, Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Molin Wang
  - Departments of Biostatistics and Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA

8. Chen X, Cai J. Reweighted estimators for additive hazard model with censoring indicators missing at random. Lifetime Data Analysis 2018; 24:224-249. [PMID: 28766089 PMCID: PMC5794663 DOI: 10.1007/s10985-017-9398-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 06/04/2014] [Accepted: 07/14/2017] [Indexed: 06/07/2023]
Abstract
Survival data with missing censoring indicators are frequently encountered in biomedical studies. In this paper, we consider statistical inference for this type of data under the additive hazard model. Reweighting methods based on simple and augmented inverse probability are proposed. The asymptotic properties of the proposed estimators are established. Furthermore, we provide a numerical technique for checking adequacy of the fitted model with missing censoring indicators. Our simulation results show that the proposed estimators outperform the simple and augmented inverse probability weighted estimators without reweighting. The proposed methods are illustrated by analyzing a dataset from a breast cancer study.
Affiliation(s)
- Xiaolin Chen
  - School of Statistics, Qufu Normal University, Qufu, 273165, China
- Jianwen Cai
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599-7420, USA

9. Qiu Z, Wan ATK, Zhou Y, Gilbert PB. Smoothed Rank Regression for the Accelerated Failure Time Competing Risks Model with Missing Cause of Failure. Stat Sin 2018; 29:23-46. [PMID: 30740005 DOI: 10.5705/ss.202016.0231] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/06/2022]
Abstract
This paper examines the accelerated failure time competing risks model with missing cause of failure using the monotone class rank-based estimating equations approach. We handle the non-smoothness of the rank-based estimating equations using a kernel smoothed estimation method, and estimate the unknown selection probability and the conditional expectation by non-parametric techniques. Under this setup, we propose three methods for estimating the unknown regression parameters based on 1) inverse probability weighting, 2) estimating equations imputation and 3) augmented inverse probability weighting. We also obtain the associated asymptotic theories of the proposed estimators and investigate the estimators' small sample behaviour in a simulation study. A direct plug-in method is suggested for estimating the asymptotic variances of the proposed estimators. A real data application based on a HIV vaccine efficacy trial study is considered.
Affiliation(s)
- Zhiping Qiu
  - School of Mathematical Sciences, Huaqiao University, Quanzhou 362021, China
  - Research Center for Applied Statistics and Big Data, Huaqiao University, Xiamen 361021, China
- Alan T K Wan
  - City University of Hong Kong, Kowloon, Hong Kong
- Yong Zhou
  - School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai 200433, China
  - Institute of Applied Mathematics, Chinese Academy of Science, Beijing 100190, China
- Peter B Gilbert
  - Department of Biostatistics, University of Washington and Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA

10. Competing risks data analysis under the accelerated failure time model with missing cause of failure. Ann Inst Stat Math 2015. [DOI: 10.1007/s10463-015-0516-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/27/2022]