1
|
A Bayesian semi-parametric model for learning biomarker trajectories and changepoints in the preclinical phase of Alzheimer's disease. Biometrics 2024; 80:ujae048. [PMID: 38775703 PMCID: PMC11110494 DOI: 10.1093/biomtc/ujae048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 04/26/2024] [Accepted: 05/07/2024] [Indexed: 05/25/2024]
Abstract
It is now widely accepted that mild cognitive impairment (MCI), one of the early symptoms of Alzheimer's disease (AD), may appear 10 or more years after the emergence of neuropathological abnormalities. Therefore, understanding the progression of AD biomarkers and uncovering when brain alterations begin in the preclinical stage, while patients are still cognitively normal, are crucial for effective early detection and therapeutic development. In this paper, we develop a Bayesian semiparametric framework that jointly models the longitudinal trajectory of the AD biomarker with a changepoint relative to the time of symptom onset, which is subject to left truncation and right censoring, in a heterogeneous population. Furthermore, unlike most existing methods, which assume that everyone in the considered population will eventually develop the disease, our approach accounts for the possibility that some individuals may never experience MCI or AD, even after a long follow-up time. We evaluate the proposed model through simulation studies and demonstrate its clinical utility by examining an important AD biomarker, ptau181, using a dataset from the Biomarkers of Cognitive Decline Among Normal Individuals (BIOCARD) study.
|
2
|
A flexible model based on piecewise linear approximation for the analysis of left truncated right censored data with covariates, and applications to Worcester Heart Attack Study data and Channing House data. Stat Med 2024; 43:233-255. [PMID: 37933206 DOI: 10.1002/sim.9954] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2023] [Revised: 10/14/2023] [Accepted: 10/23/2023] [Indexed: 11/08/2023]
Abstract
Left truncated right censored (LTRC) data arise quite commonly from survival studies. In this article, a model based on piecewise linear approximation is proposed for the analysis of LTRC data with covariates. Specifically, the model involves a piecewise linear approximation for the cumulative baseline hazard function of the proportional hazards model. The principal advantage of the proposed model is that it does not depend on restrictive parametric assumptions while being flexible and data-driven. Likelihood inference for the model is developed. Through detailed simulation studies, the robustness property of the model is studied by fitting it to LTRC data generated from different processes covering a wide range of lifetime distributions. A sensitivity analysis is also carried out by fitting the model to LTRC data generated from a process with a piecewise constant baseline hazard. It is observed that the performance of the model is quite satisfactory in all those cases. Analyses of two real LTRC datasets by using the model are provided as illustrative examples. Applications of the model in some practical prediction issues are discussed. In summary, the proposed model provides a comprehensive and flexible approach to model a general structure for LTRC lifetime data.
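As a rough illustration of the idea in this abstract, the sketch below interpolates a cumulative baseline hazard linearly between knots and evaluates a proportional hazards log-likelihood for LTRC data, where each subject contributes conditionally on having survived to the entry time. The knot positions, the single-covariate setup, and all function names are hypothetical choices for illustration; the paper's actual parameterization and estimation procedure may differ.

```python
import math
from bisect import bisect_right

def segment(t, knots):
    """Index k of the knot interval [knots[k], knots[k+1]] used for time t."""
    k = bisect_right(knots, t) - 1
    return min(max(k, 0), len(knots) - 2)

def cum_hazard(t, knots, values):
    """Piecewise linear cumulative baseline hazard H0(t) through (knots[k], values[k])."""
    k = segment(t, knots)
    slope = (values[k + 1] - values[k]) / (knots[k + 1] - knots[k])
    return values[k] + slope * (t - knots[k])

def hazard(t, knots, values):
    """Baseline hazard h0(t): the slope of H0 on the segment containing t."""
    k = segment(t, knots)
    return (values[k + 1] - values[k]) / (knots[k + 1] - knots[k])

def ltrc_loglik(entry, exit_, event, x, beta, knots, values):
    """Log-likelihood for left-truncated, right-censored data under a
    proportional hazards model with one covariate (illustrative assumption).
    Each subject contributes conditionally on survival past its entry time."""
    total = 0.0
    for left, t, d, xi in zip(entry, exit_, event, x):
        risk = math.exp(beta * xi)
        if d:
            total += math.log(hazard(t, knots, values)) + beta * xi
        total -= risk * (cum_hazard(t, knots, values) - cum_hazard(left, knots, values))
    return total

# Hypothetical knots and cumulative hazard values (values must be increasing).
knots = [0.0, 2.0, 4.0]
values = [0.0, 1.0, 3.0]
print(ltrc_loglik([1.0], [3.0], [1], [0.0], 0.0, knots, values))
```

In practice the values at the knots (and the regression coefficient) would be estimated by maximizing this function, subject to the cumulative hazard being non-decreasing.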
|
3
|
How to account for early overly small risk sets in the analysis of pregnancy outcome data? Comparison of different methods for stabilizing the Aalen-Johansen estimator. Pharmacoepidemiol Drug Saf 2024; 33:e5718. [PMID: 37850535 DOI: 10.1002/pds.5718] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Revised: 09/28/2023] [Accepted: 10/06/2023] [Indexed: 10/19/2023]
Abstract
PURPOSE: In analyzing pregnancy data concerning drug exposure in the first trimester, the risk of spontaneous abortions is of primary interest. For estimating the cumulative incidence function, the Aalen-Johansen estimator is typically used, and competing risks such as induced abortion and livebirth are considered. However, the delayed study entry can lead to overly small risk sets for the first events. This results in large jumps in the estimated cumulative incidence function of spontaneous abortions or induced abortions using the Aalen-Johansen estimator, and consequently in an overestimation of the probability.
METHODS: Several approaches account for early overly small risk sets. The first approach is conditioning on the event time being greater than the event time causing the large jump. Second, the events can be ignored by censoring them. Third, the events can be postponed until a large enough number is at risk. These three approaches are compared.
RESULTS: All approaches are applied using data of 54 lacosamide-exposed pregnancies. The Aalen-Johansen estimate of the probability of spontaneous abortion is 22.64%, which is relatively large for only three spontaneous abortions in the dataset. The conditional approach and the ignore approach have an estimated probability of 7.17%. In contrast, the estimate of the postpone approach is 16.45%. In this small sample, bootstrapped confidence intervals seem more accurate.
CONCLUSIONS: In the analyses of pregnancy data with rare events, the postpone approach is favorable as no events are excluded. However, the approach that ignores early events has the narrowest confidence interval.
|
4
|
Systematic exclusion at study commencement masks earlier menopause for Black women in the Study of Women's Health Across the Nation (SWAN). Int J Epidemiol 2023; 52:1612-1623. [PMID: 37382579 PMCID: PMC10555828 DOI: 10.1093/ije/dyad085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2022] [Accepted: 05/30/2023] [Indexed: 06/30/2023]
Abstract
BACKGROUND: Shorter average lifespans for minoritized populations are hypothesized to stem from 'weathering' or accelerated health declines among minoritized individuals due to systemic marginalization. However, evidence is mixed on whether racial/ethnic differences exist in reproductive ageing, potentially due to selection biases in cohort studies that may systematically exclude 'weathered' participants. This study examines racial/ethnic disparities in the age of menopause after accounting for differential selection 'into' (left truncation) and 'out of' (right censoring) a cohort of midlife women.
METHODS: Using data from the Study of Women's Health Across the Nation (SWAN) cross-sectional screener (N = 15 695) and accompanying ∼20-year longitudinal cohort (N = 3302) (1995-2016), we adjusted for potential selection bias using inverse probability weighting (left truncation) to account for socio-demographic/health differences between the screening and cohort study, and multiple imputation (right censoring) to estimate racial/ethnic differences in age at menopause (natural and surgical).
RESULTS: Unadjusted for selection, no Black/White differences in menopausal timing [hazard ratio (HR) = 0.98 (0.86, 1.11)] were observed. After adjustment, Black women had an earlier natural [HR = 1.13 (1.00, 1.26)] and surgical [HR = 3.21 (2.80, 3.62)] menopause than White women with natural menopause, corresponding to a 1.2-year Black/White difference in menopause timing overall.
CONCLUSIONS: Failure to account for multiple forms of selection bias masked racial/ethnic disparities in the timing of menopause in SWAN. Results suggest that there may be racial differences in age at menopause and that selection particularly affected the estimated menopausal age for women who experienced earlier menopause. Cohorts should consider incorporating methods to account for all selection biases, including left truncation, as they impact our understanding of health in 'weathered' populations.
|
5
|
Neural network on interval-censored data with application to the prediction of Alzheimer's disease. Biometrics 2023; 79:2677-2690. [PMID: 35960189 PMCID: PMC10177011 DOI: 10.1111/biom.13734] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2021] [Accepted: 08/01/2022] [Indexed: 11/28/2022]
Abstract
Alzheimer's disease (AD) is a progressive and polygenic disorder that affects millions of individuals each year. Given that there have been few effective treatments yet for AD, it is highly desirable to develop an accurate model to predict the full disease progression profile based on an individual's genetic characteristics for early prevention and clinical management. This work uses data composed of all four phases of the Alzheimer's Disease Neuroimaging Initiative (ADNI) study, including 1740 individuals with 8 million genetic variants. We tackle several challenges in this data, characterized by large-scale genetic data, interval-censored outcome due to intermittent assessments, and left truncation in one study phase (ADNIGO). Specifically, we first develop a semiparametric transformation model on interval-censored and left-truncated data and estimate parameters through a sieve approach. Then we propose a computationally efficient generalized score test to identify variants associated with AD progression. Next, we implement a novel neural network on interval-censored data (NN-IC) to construct a prediction model using top variants identified from the genome-wide test. Comprehensive simulation studies show that the NN-IC outperforms several existing methods in terms of prediction accuracy. Finally, we apply the NN-IC to the full ADNI data and successfully identify subgroups with differential progression risk profiles. Data used in the preparation of this article were obtained from the ADNI database.
|
6
|
Cancer survival: left truncation and comparison of results from hospital-based cancer registry and population-based cancer registry. Front Oncol 2023; 13:1173828. [PMID: 37350938 PMCID: PMC10284078 DOI: 10.3389/fonc.2023.1173828] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2023] [Accepted: 05/16/2023] [Indexed: 06/24/2023]
Abstract
Background: Cancer survival is an important indicator for evaluating cancer prognosis and cancer care outcomes. The incidence dates used in calculating survival differ between population-based registries and hospital-based registries. Studies examining the effects of the left truncation of incidence dates and delayed reporting on survival estimates are scarce in real-world applications.
Methods: Cancer cases hospitalized at Nantong Tumor Hospital during the years 2002-2017 were traced with their records registered in the Qidong Cancer Registry. Survival was calculated using the life table method for cancer patients with the first visit dates recorded in the hospital-based cancer registry (HBR) as the diagnosis date (OSH), those with the dates registered in the population-based cancer registry (PBR) as the incidence date (OSP), and those with corrected dates when the delayed report dates were calibrated (OSC).
Results: Among 2,636 cases, 1,307 had incidence dates registered in PBR prior to the diagnosis dates of the first hospitalization registered in HBR, while 667 cases had incidence dates registered in PBR that were later than the diagnosis dates registered in HBR. The 5-year OSH, OSP, and OSC were 36.1%, 37.4%, and 39.0%, respectively. The "lost" proportion of 5-year survival due to the left truncation for HBR data was estimated to be between 3.5% and 7.4%, and the "delayed-report" proportion of 5-year survival for PBR data was found to be 4.1%.
Conclusion: Left truncation of survival in HBR cases was demonstrated. The pseudo-left truncation in PBR should be reduced by controlling delayed reporting and maximizing completeness. Our study provides practical references and suggestions for evaluating the survival of cancer patients with HBR and PBR.
|
7
|
Female "Paradox" in Atrial Fibrillation: Role of Left Truncation Due to Competing Risks. Life (Basel) 2023; 13:life13051132. [PMID: 37240777 DOI: 10.3390/life13051132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Revised: 04/08/2023] [Accepted: 05/02/2023] [Indexed: 05/28/2023]
Abstract
Female sex in patients with atrial fibrillation (AF) is a controversial and paradoxical risk factor for stroke: controversial because it increases the risk of stroke only among older women of some ethnicities, and paradoxical because it appears to contradict male predominance in cardiovascular diseases. However, the underlying mechanism remains unclear. We conducted simulations to examine the hypothesis that this sex difference is generated non-causally through left truncation due to competing risks (CR) such as coronary artery diseases, which occur more frequently among men than among women and share common unobserved causes with stroke. We modeled the hazards of stroke and CR with correlated heterogeneous risk. We assumed that some people died of CR before AF diagnosis and calculated the hazard ratio of female sex in the left-truncated AF population. In this situation, female sex became a risk factor for stroke in the absence of causal roles. The hazard ratio was attenuated in young populations without left truncation and in populations with low CR and high stroke incidence, which is consistent with real-world observations. This study demonstrated that spurious risk factors can be identified through left truncation due to correlated CR. Female sex in patients with AF may be a paradoxical risk factor for stroke.
|
8
|
Semiparametric copula method for semi-competing risks data subject to interval censoring and left truncation: Application to disability in elderly. Stat Methods Med Res 2023; 32:656-670. [PMID: 36735020 PMCID: PMC11070129 DOI: 10.1177/09622802221133552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
Abstract
We aim to evaluate the marginal effects of covariates on time-to-disability in the elderly under the semi-competing risks framework, as death dependently censors disability, not vice versa. It becomes particularly challenging when time-to-disability is subject to interval censoring due to intermittent assessments. A left truncation issue arises when the age time scale is applied. We develop a flexible two-parameter copula-based semiparametric transformation model for semi-competing risks data subject to interval censoring and left truncation. The two-parameter copula quantifies both upper and lower tail dependence between two margins. The semiparametric transformation models incorporate proportional hazards and proportional odds models in both margins. We propose a two-step sieve maximum likelihood estimation procedure and study the sieve estimators' asymptotic properties. Simulations show that the proposed method corrects biases in the marginal method. We demonstrate the proposed method in a large-scale Chinese Longitudinal Healthy Longevity Study and provide new insights into preventing disability in the elderly. The proposed method could be applied to the general semi-competing risks data with intermittently assessed disease status.
|
9
|
Modeling unmeasured baseline information in observational time-to-event data subject to delayed study entry. Stat Methods Med Res 2023:9622802231163334. [PMID: 36924264 DOI: 10.1177/09622802231163334] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/18/2023]
Abstract
Unmeasured baseline information in left-truncated data situations frequently occurs in observational time-to-event analyses. For instance, a typical timescale in trials of antidiabetic treatment is "time since treatment initiation", but individuals may have initiated treatment before the start of longitudinal data collection. When the focus is on baseline effects, one widespread approach is to fit a Cox proportional hazards model incorporating the measurements at delayed study entry. This has been criticized because of the potential time dependency of covariates. We tackle this problem by using a Bayesian joint model that combines a mixed-effects model for the longitudinal trajectory with a proportional hazards model for the event of interest incorporating the baseline covariate, possibly unmeasured in the presence of left truncation. The novelty is that our procedure is not used to account for non-continuously monitored longitudinal covariates in right-censored time-to-event studies, but to utilize these trajectories to make inferences about missing baseline measurements in left-truncated data. Simulating times-to-event depending on baseline covariates we also compared our proposal to a simpler two-stage approach which performed favorably. Our approach is illustrated by investigating the impact of baseline blood glucose levels on antidiabetic treatment failure using data from a German diabetes register.
|
10
|
Elucidating Analytic Bias Due to Informative Cohort Entry in Cancer Clinico-genomic Datasets. Cancer Epidemiol Biomarkers Prev 2023; 32:344-352. [PMID: 36626408 PMCID: PMC9992002 DOI: 10.1158/1055-9965.epi-22-0875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2022] [Revised: 11/12/2022] [Accepted: 01/04/2023] [Indexed: 01/11/2023]
Abstract
BACKGROUND: Oncologists often order genomic testing to inform treatment for worsening cancer. The resulting correlation between genomic testing timing and prognosis, or "informative entry," can bias observational clinico-genomic research. The efficacy of existing approaches to this problem in clinico-genomic cohorts is poorly understood.
METHODS: We simulated clinico-genomic cohorts followed from an index date to death. Subgroups in each cohort who underwent genomic testing before death were "observed." We varied data generation parameters under four scenarios: (i) independent testing and survival times; (ii) correlated testing and survival times for all patients; (iii) correlated testing and survival times for a subset of patients; and (iv) testing and mortality exclusively following progression events. We examined the behavior of conditional Kendall tau (Tc) statistics, Cox entry time coefficients, and biases in overall survival (OS) estimation and biomarker inference across scenarios.
RESULTS: Scenario #1 yielded null Tc and Cox entry time coefficients and unbiased OS inference. Scenario #2 yielded positive Tc, negative Cox entry time coefficients, underestimated OS, and biomarker associations biased toward the null. Scenario #3 yielded negative Tc, positive Cox entry time coefficients, and underestimated OS, but biomarker estimates were less biased. Scenario #4 yielded null Tc and Cox entry time coefficients, underestimated OS, and biased biomarker estimates. Transformation and copula modeling did not provide unbiased results.
CONCLUSIONS: Approaches to informative clinico-genomic cohort entry, including Tc and Cox entry time statistics, are sensitive to heterogeneity in genotyping and survival time distributions.
IMPACT: Novel methods are needed for unbiased inference using observational clinico-genomic data.
|
11
|
Left truncation in linked data: A practical guide to understanding left truncation and applying it using SAS and R. Pharm Stat 2023; 22:194-204. [PMID: 35843723 DOI: 10.1002/pst.2257] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/03/2021] [Revised: 05/02/2022] [Accepted: 07/06/2022] [Indexed: 02/01/2023]
Abstract
Time-to-event data such as time to death are broadly used in medical research and drug development to understand the efficacy of a therapeutic. For time-to-event data, right censoring (data only observed up to a certain point of time) is common and easy to recognize. Methods that use right censored data, such as the Kaplan-Meier estimator and the Cox proportional hazard model, are well established. Time-to-event data can also be left truncated, which arises when patients are excluded from the sample because their events occur before a specific milestone, potentially resulting in an immortal time bias. For example, in a study evaluating the association between biomarker status and overall survival, patients who did not live long enough to receive a genomic test were not observed in the study. Left truncation causes selection bias and often leads to an overestimate of survival time. In this tutorial, we used a nationwide electronic health record-derived de-identified database to demonstrate how to analyze left truncated and right censored data without bias using example code from SAS and R.
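To make the bias described in this abstract concrete, here is a minimal Python sketch (not the tutorial's SAS or R code) of a product-limit survival estimate with delayed entry: a subject counts as at risk at time t only after entering the study. The function name and the toy data are hypothetical.

```python
def km_left_truncated(entry, exit_, event):
    """Product-limit survival estimate with delayed entry.

    Subject i is in the risk set at time t only if entry[i] < t <= exit_[i].
    Setting every entry time to 0 recovers the ordinary Kaplan-Meier estimate.
    Assumes entry[i] < exit_[i] for every subject.
    """
    event_times = sorted({t for t, d in zip(exit_, event) if d})
    surv = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for e, x in zip(entry, exit_) if e < t <= x)
        deaths = sum(1 for x, d in zip(exit_, event) if d and x == t)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve

# Toy data (hypothetical): two subjects enter late, at times 2 and 3.
entry = [0, 0, 2, 3]
exit_ = [1, 4, 5, 6]
event = [1, 1, 1, 0]  # 1 = death observed, 0 = right censored

adjusted = km_left_truncated(entry, exit_, event)
naive = km_left_truncated([0] * len(entry), exit_, event)
print(adjusted)
print(naive)
```

On this toy data the adjusted estimate at the last event time is 1/6, while the naive estimate that ignores delayed entry is 1/4, which matches the direction of bias the abstract describes: late entrants are wrongly counted as at risk before they were observed, so survival is overestimated.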
|
12
|
Estimating survival parameters under conditionally independent left truncation. Pharm Stat 2022; 21:895-906. [PMID: 35262259 PMCID: PMC9545094 DOI: 10.1002/pst.2202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2021] [Revised: 02/22/2022] [Accepted: 02/22/2022] [Indexed: 11/24/2022]
Abstract
Databases derived from electronic health records (EHRs) are commonly subject to left truncation, a type of selection bias that occurs when patients need to survive long enough to satisfy certain entry criteria. Standard methods to adjust for left truncation bias rely on an assumption of marginal independence between entry and survival times, which may not always be satisfied in practice. In this work, we examine how a weaker assumption of conditional independence can result in unbiased estimation of common statistical parameters. In particular, we show the estimability of conditional parameters in a truncated dataset, and of marginal parameters that leverage reference data containing non‐truncated data on confounders. The latter is complementary to observational causal inference methodology applied to real‐world external comparators, which is a common use case for real‐world databases. We implement our proposed methods in simulation studies, demonstrating unbiased estimation and valid statistical inference. We also illustrate estimation of a survival distribution under conditionally independent left truncation in a real‐world clinico‐genomic database.
|
13
|
A SAS macro for estimating direct adjusted survival functions for time-to-event data with or without left truncation. Bone Marrow Transplant 2022; 57:6-10. [PMID: 34413470 PMCID: PMC9396933 DOI: 10.1038/s41409-021-01435-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/18/2021] [Revised: 07/26/2021] [Accepted: 08/03/2021] [Indexed: 02/08/2023]
Abstract
There are several statistical programmes to compute direct adjusted survival estimates from results of the Cox proportional hazards model. However, when used to analyze observational databases with large sample sizes or highly stratified treatment groups such as in registry-related datasets, these programmes are inefficient or unable to generate confidence bands and simultaneous p values. Also, these programmes do not consider potential left-truncation in retrospectively collected data. To address these deficiencies we developed a new SAS macro %adjsurvlt() able to produce direct adjusted survival estimates based on a stratified Cox model. The macro has improved computational performance and is able to handle left-truncated and right-censored time-to-event data. Several mechanisms were implemented to improve computational efficiency including choosing matrix operations over do-loops and reducing dimensions of co-variate matrices. Compared to the latest SAS macro, %adjsurvlt() used < 0.1% computational time to process a dataset with 100 treatment cohorts and a sample size of 20,000 and showed similar computational efficiency when analyzing left-truncated and right-censored data. We illustrate use of %adjsurvlt() to compare retrospectively collected survival data of 2 transplant cohorts.
|
14
|
Incorporating survival data into case-control studies with incident and prevalent cases. Stat Med 2021; 40:6295-6308. [PMID: 34510499 DOI: 10.1002/sim.9183] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2020] [Revised: 08/03/2021] [Accepted: 08/17/2021] [Indexed: 11/09/2022]
Abstract
Typically, case-control studies to estimate odds-ratios associating risk factors with disease incidence only include newly diagnosed cases. Recently proposed methods allow incorporating information on prevalent cases, individuals who survived from disease diagnosis to sampling, into cross-sectionally sampled case-control studies under parametric assumptions for the survival time after diagnosis. Here we propose and study methods to additionally use prospectively observed survival times from prevalent and incident cases to adjust logistic models for the time between diagnosis and sampling, the backward time, for prevalent cases. This adjustment yields unbiased odds-ratio estimates from case-control studies that include prevalent cases. We propose a computationally simple two-step generalized method-of-moments estimation procedure. First, we estimate the survival distribution assuming a semiparametric Cox model using an expectation-maximization algorithm that yields fully efficient estimates and accommodates left truncation for prevalent cases and right censoring. Then, we use the estimated survival distribution in an extension of the logistic model to three groups (controls, incident, and prevalent cases), to adjust for the survival bias in prevalent cases. In simulations, under modest amounts of censoring, odds-ratios from the two-step procedure were equally efficient as those estimated from a joint logistic and survival data likelihood under parametric assumptions. This indicates that utilizing the cases' prospective survival data lessens model dependencies and improves precision of association estimates for case-control studies with prevalent cases. We illustrate the methods by estimating associations between single nucleotide polymorphisms and breast cancer risk using controls, and incident and prevalent cases sampled from the US Radiologic Technologists Study cohort.
|
15
|
Multi-marker genetic association and interaction tests with interval-censored survival outcomes. Genet Epidemiol 2021; 45:860-873. [PMID: 34472134 DOI: 10.1002/gepi.22429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2021] [Revised: 07/13/2021] [Accepted: 08/12/2021] [Indexed: 11/06/2022]
Abstract
The development of set-based genetic-survival association tests has been focusing on right-censored survival outcomes. However, interval-censored failure time data arise widely from health science studies, especially those on the development of chronic diseases. In this paper, we proposed a suite of set-based genetic association and interaction tests for interval-censored survival outcomes under a unified weighted-V-statistic framework. Besides dealing with interval censoring, the new tests can account for genetic effect heterogeneity and accommodate left truncation of survival outcomes. Simulation studies showed that the new tests perform well in terms of size and power under various scenarios and that the new interaction test is more powerful than the standard likelihood ratio test for testing gene-gene/gene-environment interactions. The practical utility of the developed tests was illustrated by a genome-wide association study of age to early childhood caries.
|
16
|
Penalized regression for left-truncated and right-censored survival data. Stat Med 2021; 40:5487-5500. [PMID: 34302373 PMCID: PMC9290657 DOI: 10.1002/sim.9136] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/09/2021] [Revised: 06/25/2021] [Accepted: 06/28/2021] [Indexed: 01/14/2023]
Abstract
High‐dimensional data are becoming increasingly common in the medical field as large volumes of patient information are collected and processed by high‐throughput screening, electronic health records, and comprehensive genomic testing. Statistical models that attempt to study the effects of many predictors on survival typically implement feature selection or penalized methods to mitigate the undesirable consequences of overfitting. In some cases survival data are also left‐truncated which can give rise to an immortal time bias, but penalized survival methods that adjust for left truncation are not commonly implemented. To address these challenges, we apply a penalized Cox proportional hazards model for left‐truncated and right‐censored survival data and assess implications of left truncation adjustment on bias and interpretation. We use simulation studies and a high‐dimensional, real‐world clinico‐genomic database to highlight the pitfalls of failing to account for left truncation in survival modeling.
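A minimal sketch of what a left-truncation adjustment looks like in a penalized Cox objective: the risk set at each event time is restricted to subjects who have already entered the study. The single covariate, the L1 (lasso-style) penalty, and the function name are assumptions made for illustration and are not taken from the paper.

```python
import math

def penalized_cox_loglik(beta, lam, entry, exit_, event, x):
    """Cox partial log-likelihood with delayed-entry risk sets,
    minus an L1 penalty lam * |beta| (one covariate, illustrative only).

    Subject j is in the risk set at event time t only if
    entry[j] < t <= exit_[j].
    """
    loglik = 0.0
    for t_i, d_i, x_i in zip(exit_, event, x):
        if not d_i:
            continue  # censored subjects contribute only through risk sets
        denom = sum(
            math.exp(beta * x_j)
            for e_j, t_j, x_j in zip(entry, exit_, x)
            if e_j < t_i <= t_j
        )
        loglik += beta * x_i - math.log(denom)
    return loglik - lam * abs(beta)

# Toy data (hypothetical): the third subject enters at time 2.
entry = [0, 0, 2]
exit_ = [1, 3, 4]
event = [1, 1, 0]
x = [1.0, 0.0, 1.0]

print(penalized_cox_loglik(0.5, 0.1, entry, exit_, event, x))
# Ignoring delayed entry puts the late entrant in the first risk set too:
print(penalized_cox_loglik(0.5, 0.1, [0, 0, 0], exit_, event, x))
```

The two printed values differ because the unadjusted version treats the late entrant as at risk for the first event, which is the source of the immortal time bias mentioned in the abstract.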
|
17
|
Cox regression model under dependent truncation. Biometrics 2021; 78:460-473. [PMID: 33687064 DOI: 10.1111/biom.13451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2020] [Revised: 02/07/2021] [Accepted: 02/24/2021] [Indexed: 11/28/2022]
Abstract
Truncation is a statistical phenomenon that occurs in many time-to-event studies. For example, autopsy-confirmed studies of neurodegenerative diseases are subject to an inherent left and right truncation, also known as double truncation. When the goal is to study the effect of risk factors on survival, the standard Cox regression model cannot be used when the survival time is subject to truncation. Existing methods that adjust for both left and right truncation in the Cox regression model require independence between the survival times and truncation times, which may not be a reasonable assumption in practice. We propose an expectation-maximization algorithm to relax the independence assumption in the Cox regression model under left, right, or double truncation to an assumption of conditional independence on the observed covariates. The resulting regression coefficient estimators are consistent and asymptotically normal. We demonstrate through extensive simulations that the proposed estimator has little bias and has a similar or lower mean-squared error compared to existing estimators. We implement our approach to assess the effect of occupation on survival in subjects with autopsy-confirmed Alzheimer's disease.
|
18
|
A Bayesian joint model for zero-inflated integers and left-truncated event times with a time-varying association: Applications to senior health care. Stat Med 2020; 40:147-166. [PMID: 33104241 DOI: 10.1002/sim.8767] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/10/2019] [Revised: 09/10/2020] [Accepted: 09/17/2020] [Indexed: 11/09/2022]
Abstract
Population aging in most industrialized societies has led to a dramatic increase in emergency medical demand among the elderly. In the context of private health care, an optimal allocation of the medical resources for seniors is commonly done by forecasting their life spans. Accounting for each subject's particularities is therefore indispensable, so the available data must be processed at an individual level. We use a large and unique dataset of insured parties aged 65 and older to appropriately relate the emergency care usage with mortality risk. Longitudinal and time-to-event processes are jointly modeled, and their underlying relationship can therefore be assessed. Such an application, however, requires some special features to also be considered. First, longitudinal demand for emergency services exhibits a nonnegative integer response with an excess of zeros due to the very nature of the data. These subject-specific responses are handled by a zero-inflated version of the hierarchical negative binomial model. Second, event times must account for the left truncation derived from the fact that policyholders must reach the age of 65 before they may begin to be observed. Consequently, a delayed entry bias arises for those individuals entering the study after this age threshold. Third, and as the main challenge of our analysis, the association parameter between both processes is expected to be age-dependent, with an unspecified association structure. This is well-approximated through a flexible functional specification provided by penalized B-splines. The parameter estimation of the joint model is derived under a Bayesian scheme.
|
19
|
Analyzing left-truncated and right-censored infectious disease cohort data with interval-censored infection onset. Stat Med 2020; 40:287-298. [PMID: 33086432 DOI: 10.1002/sim.8774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2020] [Revised: 08/18/2020] [Accepted: 09/26/2020] [Indexed: 11/10/2022]
Abstract
In an infectious disease cohort study, individuals who have been infected with a pathogen are often recruited for follow up. The period between infection and the onset of symptomatic disease, referred to as the incubation period, is of interest because of its importance on disease surveillance and control. However, the incubation period is often difficult to ascertain due to the uncertainty associated with asymptomatic infection onset time. An additional complication is that the observed infected subjects are likely to have longer incubation periods due to the prevalent sampling. In this article, we demonstrate how to estimate the distribution of the incubation period with the uncertain infection onset, subject to left-truncation and right-censoring. We employ a family of sufficiently general parametric models, the generalized odds-rate class of regression models, for the underlying incubation period and its correlation with covariates. In simulation studies, we assess the finite sample performance of the model fitting and hazard function estimation. The proposed method is illustrated on data from the HIV/AIDS study on injection drug users admitted to a detoxification program in Badalona, Spain.
|
20
|
A pairwise pseudo-likelihood approach for left-truncated and interval-censored data under the Cox model. Biometrics 2020; 77:1303-1314. [PMID: 33058180 DOI: 10.1111/biom.13394] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/11/2019] [Revised: 08/05/2020] [Accepted: 09/29/2020] [Indexed: 11/29/2022]
Abstract
Left truncation commonly occurs in many areas, and many methods have been proposed in the literature for the analysis of various types of left-truncated failure time data. In this situation, a common approach is to conduct the analysis conditional on truncation times; the method is relatively simple but may not be efficient. In this paper, we discuss regression analysis of such data arising from the proportional hazards model that are also subject to interval censoring. For this problem, a pairwise pseudo-likelihood approach is proposed that aims to recover some missing information in the conditional methods. The resulting estimator is shown to be consistent and asymptotically normal. A simulation study is conducted to assess the performance of the proposed method and suggests that it works well in practical situations and is indeed more efficient than the existing method. The approach is also applied to a set of real data arising from an AIDS cohort study.
|
21
|
Set-based genetic association and interaction tests for survival outcomes based on weighted V statistics. Genet Epidemiol 2020; 45:46-63. [PMID: 32896012 DOI: 10.1002/gepi.22353] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/10/2020] [Revised: 08/03/2020] [Accepted: 08/03/2020] [Indexed: 01/07/2023]
Abstract
With advancements in high-throughput technologies, studies have been conducted to investigate the role of massive genetic variants in human diseases. While set-based tests have been developed for binary and continuous disease outcomes, there are few computationally efficient set-based tests available for time-to-event outcomes. To facilitate the genetic association and interaction analyses of time-to-event outcomes, we develop a suite of multivariant tests based on weighted V statistics with or without considering potential genetic heterogeneity. In addition to their computational efficiency and nice asymptotic properties, all the new tests can deal with left truncation and competing risks in the survival data, and adjust for covariates. Simulation studies show that the new tests run faster, are more accurate in small samples, and account for confounding effects better than the existing multivariant survival tests. When the genetic effect is heterogeneous across individuals/subpopulations, the association test considering genetic heterogeneity is more powerful than the existing tests that do not account for genetic heterogeneity. Using the new methods, we perform a genome-wide association analysis of the genotype and age-to-Alzheimer's data from the Rush Memory and Aging Project and the Religious Orders Study. The analysis identifies two genes, APOE and APOC1, associated with age to Alzheimer's disease onset.
|
22
|
Adaptive lasso for the Cox regression with interval censored and possibly left truncated data. Stat Methods Med Res 2020; 29:1243-1255. [PMID: 31203741 PMCID: PMC9969839 DOI: 10.1177/0962280219856238] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 02/05/2023]
Abstract
We propose a penalized variable selection method for the Cox proportional hazards model with interval censored data. It conducts a penalized nonparametric maximum likelihood estimation with an adaptive lasso penalty, which can be implemented through a penalized EM algorithm. The method is proven to enjoy the desirable oracle property. We also extend the method to left truncated and interval censored data. Our simulation studies show that the method possesses the oracle property in samples of modest sizes and outperforms available existing approaches in many of the operating characteristics. An application to a dental caries data set illustrates the method's utility.
|
23
|
Transformation model estimation of survival under dependent truncation and independent censoring. Stat Methods Med Res 2019; 28:3785-3798. [PMID: 30543153 PMCID: PMC6565507 DOI: 10.1177/0962280218817573] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 11/15/2022]
Abstract
Truncation is a mechanism that permits observation of selected subjects from a source population; subjects are excluded if their event times are not contained within subject-specific intervals. Standard survival analysis methods for estimation of the distribution of the event time require quasi-independence of failure and truncation. When quasi-independence does not hold, alternative estimation procedures are required; currently, there is a copula model approach that makes strong modeling assumptions, and a transformation model approach that does not allow for right censoring. We extend the transformation model approach to accommodate right censoring. We propose a regression diagnostic for assessment of model fit. We evaluate the proposed transformation model in simulations and apply it to the National Alzheimer's Coordinating Centers autopsy cohort study, and an AIDS incubation study. Our methods are publicly available in an R package, tranSurv.
|
24
|
Benefits of combining prevalent and incident cohorts: An application to myotonic dystrophy. Stat Methods Med Res 2018; 28:3333-3345. [PMID: 30293502 DOI: 10.1177/0962280218804275] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/16/2022]
Abstract
It is frequently of interest to estimate the time that individuals survive with a disease, that is, to estimate the time between disease onset and occurrence of a clinical endpoint such as death. Epidemiologic survival data are commonly collected from either an incident cohort, whose members' disease onset occurs after the study baseline date, or from a cohort with prevalent disease that is followed forward in time. Incident cohort survival data are limited by study termination, while prevalent cohort data provide biased (left-truncated) survival data. In this article, we investigate the advantages of a study design featuring simultaneous follow-up of prevalent and incident cohorts to the estimation of the survivor function. Our analyses are supported by simulations and illustrated using data on survival after myotonic dystrophy diagnosis from the United Kingdom Clinical Practice Research Datalink (CPRD). We demonstrate that the NPMLE using combined incident and prevalent cohort data estimates the true survivor function very well, even for moderate sample sizes, and ameliorates the disadvantages of using a purely incident or prevalent cohort.
|
25
|
Stabilizing cumulative incidence estimation of pregnancy outcome with delayed entries. Biom J 2018; 61:1290-1302. [PMID: 29984423 DOI: 10.1002/bimj.201700237] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 11/10/2017] [Revised: 05/15/2018] [Accepted: 06/01/2018] [Indexed: 11/08/2022]
Abstract
A pregnancy may end in (at least) three possible events: live birth, spontaneous abortion, or elective termination, yielding a competing risks issue when studying an association between a risk factor and a pregnancy outcome. Cumulative incidences (probabilities of ending with the different outcomes depending on gestational age) can be estimated via the Aalen-Johansen estimate. Another issue is that women usually do not enter such an observational study from the first day of pregnancy, resulting in delayed entries. As in traditional survival analysis, this can be solved by considering "at risk" at a given gestational age only those women who entered the study before that age. However, the number of women at risk at an early gestational age might be extremely low, such that the estimates of cumulative incidence may increase exaggeratedly at that age because of a single event. One solution to reduce the problem has been recently proposed in the literature, which is to simply ignore those early events, creating a small mean bias but enhancing stability of estimates. In the present paper, we propose an alternative, computationally simple approach to tackle this problem that consists of postponing those early events to later gestational ages (rather than ignoring them). The two approaches are compared with respect to bias, stability, and sensitivity to the smoothing parameter via simulations reproducing realistic pregnancy scenarios, and are illustrated with data from a study on the effects of statins on pregnancy outcomes. We also outline that all three approaches are asymptotically equivalent.
|
26
|
Time-to-event data with time-varying biomarkers measured only at study entry, with applications to Alzheimer's disease. Stat Med 2018; 37:914-932. [PMID: 29266591 PMCID: PMC5801265 DOI: 10.1002/sim.7547] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/04/2017] [Revised: 09/01/2017] [Accepted: 10/06/2017] [Indexed: 11/09/2022]
Abstract
Relating time-varying biomarkers of Alzheimer's disease to time-to-event using a Cox model is complicated by the fact that Alzheimer's disease biomarkers are sparsely collected, typically only at study entry; this is problematic since Cox regression with time-varying covariates requires observation of the covariate process at all failure times. The analysis might be simplified by using study entry as the time origin and treating the time-varying covariate measured at study entry as a fixed baseline covariate. In this paper, we first derive conditions under which using an incorrect time origin of study entry results in consistent estimation of regression parameters when the time-varying covariate is continuous and fully observed. We then derive conditions under which treating the time-varying covariate as fixed at study entry results in consistent estimation. We provide methods for estimating the regression parameter when a functional form can be assumed for the time-varying biomarker, which is measured only at study entry. We demonstrate our analytical results in a simulation study and apply our methods to data from the Rush Religious Orders Study and Memory and Aging Project and data from the Alzheimer's Disease Neuroimaging Initiative.
|
27
|
Estimating time-dependent ROC curves using data under prevalent sampling. Stat Med 2017; 36:1285-1301. [PMID: 27891650 DOI: 10.1002/sim.7184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2016] [Revised: 10/09/2016] [Accepted: 11/03/2016] [Indexed: 11/10/2022]
Abstract
Prevalent sampling is frequently a convenient and economical sampling technique for the collection of time-to-event data and thus is commonly used in studies of the natural history of a disease. However, it is biased by design because it tends to recruit individuals with longer survival times. This paper considers estimation of time-dependent receiver operating characteristic curves when data are collected under prevalent sampling. To correct the sampling bias, we develop both nonparametric and semiparametric estimators using extended risk sets and the inverse probability weighting techniques. The proposed estimators are consistent and converge to Gaussian processes, while substantial bias may arise if standard estimators for right-censored data are used. To illustrate our method, we analyze data from an ovarian cancer study and estimate receiver operating characteristic curves that assess the accuracy of the composite markers in distinguishing subjects who died within 3-5 years from subjects who remained alive. Copyright © 2016 John Wiley & Sons, Ltd.
|
28
|
Regression models for the restricted residual mean life for right-censored and left-truncated data. Stat Med 2017; 36:1803-1822. [PMID: 28106926 DOI: 10.1002/sim.7222] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 08/16/2016] [Revised: 11/04/2016] [Accepted: 12/16/2016] [Indexed: 11/07/2022]
Abstract
The hazard ratios resulting from a Cox regression model are hard to interpret and to convert into prolonged survival time. As the main goal is often to study survival functions, there is increasing interest in summary measures based on the survival function that are easier to interpret than the hazard ratio; the residual mean time is an important example of those measures. However, because of the presence of right censoring, the tail of the survival distribution is often difficult to estimate correctly. Therefore, we consider the restricted residual mean time, which represents a partial area under the survival function, given any time horizon τ, and is interpreted as the residual life expectancy up to τ of a subject surviving up to time t. We present a class of regression models for this measure, based on weighted estimating equations and inverse probability of censoring weighted estimators to model potential right censoring. Furthermore, we show how to extend the models and the estimators to deal with delayed entries. We demonstrate that the restricted residual mean life estimator is equivalent to integrals of Kaplan-Meier estimates in the case of simple factor variables. Estimation performance is investigated by simulation studies. Using real data from Danish Monitoring Cardiovascular Risk Factor Surveys, we illustrate an application to additive regression models and discuss the general assumption of right censoring and left truncation being dependent on covariates. Copyright © 2017 John Wiley & Sons, Ltd.
|
29
|
Ages at Onset of 5 Cardiometabolic Diseases Adjusting for Nonsusceptibility: Implications for the Pathogenesis of Metabolic Syndrome. Am J Epidemiol 2016; 184:366-77. [PMID: 27543092 DOI: 10.1093/aje/kwv449] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 06/12/2015] [Accepted: 12/21/2015] [Indexed: 12/28/2022] Open
Abstract
To shed light on the etiology of metabolic syndrome development, it is important to understand whether its 5 component disorders follow certain onset sequences. To explore disease progression of the syndrome, we studied the ages at onset of 5 cardiometabolic diseases: abdominal obesity, diabetes, hypertension, hypertriglyceridemia, and hypo-α-lipoproteinemia. In analyzing longitudinal data from the Cardiovascular Disease Risk Factors Two-Township Study (1989-2002) in Taiwan, we adjusted for nonsusceptibility, utilizing the logistic-accelerated failure time location-scale mixture regression models for left-truncated and interval-censored data to simultaneously estimate the associations of township and sex with the susceptibility probability and the age-at-onset distribution of susceptible individuals for each disease. We then validated the onset sequences of 5 cardiometabolic diseases by comparing the overall probability density curves across township-sex strata. Visualization of these curves indicates that women tended to have onsets of abdominal obesity and hypo-α-lipoproteinemia in young adulthood, hypertension and hypertriglyceridemia in middle age, and diabetes later; men tended to have onsets of abdominal obesity, hypo-α-lipoproteinemia, and hypertriglyceridemia in young adulthood, hypertension in middle age, and diabetes later. Different onset patterns of abdominal obesity, hypo-α-lipoproteinemia, and male hypertension were identified between townships. Our proposed method provides a novel strategy for investigating both pathogenesis and preventive measures of complex syndromes.
|
30
|
Joint modelling of longitudinal and survival data: incorporating delayed entry and an assessment of model misspecification. Stat Med 2016; 35:1193-209. [PMID: 26514596 PMCID: PMC5019272 DOI: 10.1002/sim.6779] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 11/14/2014] [Revised: 09/28/2015] [Accepted: 10/05/2015] [Indexed: 11/10/2022]
Abstract
A now common goal in medical research is to investigate the inter-relationships between a repeatedly measured biomarker, measured with error, and the time to an event of interest. This form of question can be tackled with a joint longitudinal-survival model, with the most common approach combining a longitudinal mixed effects model with a proportional hazards survival model, where the models are linked through shared random effects. In this article, we look at incorporating delayed entry (left truncation), which has received relatively little attention. The extension to delayed entry requires a second set of numerical integration, beyond that required in a standard joint model. We therefore implement two sets of fully adaptive Gauss-Hermite quadrature with nested Gauss-Kronrod quadrature (to allow time-dependent association structures), conducted simultaneously, to evaluate the likelihood. We evaluate fully adaptive quadrature compared with previously proposed non-adaptive quadrature through a simulation study, showing substantial improvements, both in terms of minimising bias and reducing computation time. We further investigate, through simulation, the consequences of misspecifying the longitudinal trajectory and its impact on estimates of association. Our scenarios showed the current value association structure to be very robust, compared with the rate of change that we found to be highly sensitive showing that assuming a simpler trend when the truth is more complex can lead to substantial bias. With emphasis on flexible parametric approaches, we generalise previous models by proposing the use of polynomials or splines to capture the longitudinal trend and restricted cubic splines to model the baseline log hazard function. 
The methods are illustrated on a dataset of breast cancer patients, modelling mammographic density jointly with survival, where we show how to incorporate density measurements prior to the at-risk period, to make use of all the available information. User-friendly Stata software is provided.
|
31
|
Bayesian joint modeling for assessing the progression of chronic kidney disease in children. Stat Methods Med Res 2016; 27:298-311. [PMID: 26988933 DOI: 10.1177/0962280216628560] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 11/15/2022]
Abstract
Joint models are rich and flexible models for analyzing longitudinal data with nonignorable missing data mechanisms. This article proposes a Bayesian random-effects joint model to assess the evolution of a longitudinal process in terms of a linear mixed-effects model that accounts for heterogeneity between the subjects, serial correlation, and measurement error. Dropout is modeled in terms of a survival model with competing risks and left truncation. The model is applied to data coming from ReVaPIR, a project involving children with chronic kidney disease whose evolution is mainly assessed through longitudinal measurements of glomerular filtration rate.
|
32
|
Methods for testing the Markov condition in the illness-death model: a comparative study. Stat Med 2016; 35:3549-62. [PMID: 26990971 DOI: 10.1002/sim.6940] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 05/31/2015] [Revised: 02/09/2016] [Accepted: 02/22/2016] [Indexed: 11/09/2022]
Abstract
Markov three-state progressive and illness-death models are often used in biomedicine for describing survival data when an intermediate event of interest may be observed during the follow-up. However, the usual estimators for Markov models (e.g., Aalen-Johansen transition probabilities) may be systematically biased in non-Markovian situations. On the other hand, although non-Markovian estimators for transition probabilities and related curves are available, including the Markov information in the construction of the estimators allows for variance reduction. Therefore, testing for the Markov condition is a relevant issue in practice. In this paper, we discuss several characterizations of the Markov condition, with special focus on its equivalence with the quasi-independence between left truncation and survival times in standard survival analysis. New methods for testing the Markovianity of an illness-death model are proposed and compared with existing ones by means of an intensive simulation study. We illustrate our findings through the analysis of a data set from stem cell transplant in leukemia. Copyright © 2016 John Wiley & Sons, Ltd.
|
33
|
Counterpoint: epidemiology to guide decision-making: moving away from practice-free research. Am J Epidemiol 2015; 182:834-9. [PMID: 26507306 DOI: 10.1093/aje/kwv215] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 04/01/2015] [Accepted: 07/02/2015] [Indexed: 11/13/2022] Open
Abstract
Analyses of observational data aimed at supporting decision-making are ideally framed as a contrast between well-defined treatment strategies. These analyses compare individuals' outcomes from the start of the treatment strategies under consideration. Exceptions to this synchronizing of the start of follow-up and the treatment strategies may be justified on a case-by-case basis.
|
34
|
Point: incident exposures, prevalent exposures, and causal inference: does limiting studies to persons who are followed from first exposure onward damage epidemiology? Am J Epidemiol 2015; 182:826-33. [PMID: 26507305 PMCID: PMC4634310 DOI: 10.1093/aje/kwv225] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Received: 09/06/2014] [Accepted: 08/03/2015] [Indexed: 12/01/2022] Open
Abstract
The idea that epidemiologic studies should start from first exposure onward has been advocated in the past few years. The study of incident exposures is contrasted with studies of prevalent exposures in which follow-up may commence after first exposure. The former approach is seen as a hallmark of a good study and necessary for causal inference. We argue that studying incident exposures may be necessary in some situations, but it is not always necessary and is not the preferred option in many instances. Conducting a study involves decisions as to which person-time experience should be included. Although studies of prevalent exposures involve left truncation (missingness on the left), studies of incident exposures may involve right censoring (missingness on the right) and therefore may not be able to assess the long-term effects of exposure. These considerations have consequences for studies of dynamic (open) populations that involve a mixture of prevalent and incident exposures. We argue that studies with prevalent exposures will remain a necessity for epidemiology. The purpose of this paper is to restore the balance between the emphasis on first exposure cohorts and the richness of epidemiologic information obtained when studying prevalent exposures.
|
35
|
Effects of gestational age at enrollment in pregnancy exposure registries. Pharmacoepidemiol Drug Saf 2015; 24:343-52. [PMID: 25702683 DOI: 10.1002/pds.3731] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 04/16/2014] [Revised: 09/11/2014] [Accepted: 10/02/2014] [Indexed: 11/08/2022]
Abstract
PURPOSE This study aims to explore the influence of gestational age at enrollment, and enrollment before or after prenatal screening, on the estimation of drug effects in pregnancy exposure registries. METHODS We assessed the associations between first trimester antiepileptic drug (AED) exposure and risk of spontaneous abortion and major congenital malformations in the North American AED Registry (1996-2013). We performed logistic regression analyses, conditional or unconditional on gestational age at enrollment, to estimate relative risk (RR) for first trimester AED users compared with non-users. We also compared first trimester users of valproic acid and lamotrigine. Analyses were repeated in women who enrolled before prenatal screening. RESULTS Enrollment occurred earlier among 7029 AED users than among 581 non-users; it was similar among AEDs. Comparing AED users with non-users, RR (95% confidence interval) of spontaneous abortion (n = 359) decreased from 5.1 (2.3-14.1) to 2.0 (0.9-5.6) after conditioning on gestational week at enrollment and to 1.9 (0.8-5.4) upon further restriction to before-screening enrollees. RR of congenital malformations (n = 216) changed from 3.1 (1.4-8.5) to 3.2 (1.4-9.0) after conditioning on gestational week at enrollment and to 2.0 (0.7-10.1) upon further restriction to before-screening enrollees. When comparing valproic acid users and lamotrigine users, the RR of congenital malformations was not substantially changed by conditioning or restricting. CONCLUSIONS Spontaneous abortion rates were sensitive to gestational age at enrollment. Estimates of congenital malformation risks for AED users relative to non-users were sensitive to before/after-screening enrollment. This difference was not apparent between active drugs, likely due to similar gestational age at enrollment.
|
36
|
Estimating the lifetime risk of dementia in the Canadian elderly population using cross-sectional cohort survival data. J Am Stat Assoc 2014; 109:24-35. [PMID: 26139951 DOI: 10.1080/01621459.2013.859076] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Indexed: 10/26/2022]
Abstract
Dementia is one of the world's major public health challenges. The lifetime risk of dementia is the proportion of individuals who ever develop dementia during their lifetime. Despite its importance to epidemiologists and policy-makers, this measure does not seem to have been estimated in the Canadian population. Data from a birth cohort study of dementia are not available. Instead, we must rely on data from the Canadian Study of Health and Aging, a large cross-sectional study of dementia with follow-up for survival. These data present challenges because they include substantial loss to follow-up and are not representatively drawn from the target population because of structural sampling biases. A first bias is imparted by the cross-sectional sampling scheme, while a second bias is a result of stratified sampling. Estimation of the lifetime risk and related quantities in the presence of these biases has not been previously addressed in the literature. We develop and study nonparametric estimators of the lifetime risk, the remaining lifetime risk and cumulative risk at specific ages, accounting for these complexities. In particular, we reveal the fact that estimation of the lifetime risk is invariant to stratification by current age at sampling. We present simulation results validating our methodology, and provide novel facts about the epidemiology of dementia in Canada using data from the Canadian Study of Health and Aging.
|
37
|
Cross-ratio estimation for bivariate failure times with left truncation. Lifetime Data Analysis 2014; 20:23-37. [PMID: 23700275 PMCID: PMC3815963 DOI: 10.1007/s10985-013-9263-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 11/02/2012] [Accepted: 05/13/2013] [Indexed: 06/02/2023]
Abstract
The cross-ratio is an important local measure that characterizes the dependence between bivariate failure times. To estimate the cross-ratio in follow-up studies where delayed entry is present, estimation procedures need to account for left truncation. Ignoring left truncation yields biased estimates of the cross-ratio. We extend the method of Hu et al., Biometrika 98:341-354 (2011) by modifying the risk sets and relevant indicators to handle left-truncated bivariate failure times. The resulting cross-ratio estimator has desirable asymptotic properties, which can be shown by the same techniques used in Hu et al. (2011). Numerical studies are conducted.
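For readers unfamiliar with the measure, the cross-ratio is usually defined as below. This is the standard textbook definition, not a formula taken from the abstract; the symbols (T_1, T_2 for the two failure times, S for their joint survival function, \lambda_1 for the conditional hazard of T_1) are notation assumed here for illustration.

```latex
\theta(t_1, t_2)
  = \frac{\lambda_1(t_1 \mid T_2 = t_2)}{\lambda_1(t_1 \mid T_2 > t_2)}
  = \frac{S(t_1, t_2)\,\dfrac{\partial^2 S(t_1, t_2)}{\partial t_1\,\partial t_2}}
         {\dfrac{\partial S(t_1, t_2)}{\partial t_1}\,
          \dfrac{\partial S(t_1, t_2)}{\partial t_2}},
\qquad S(t_1, t_2) = \Pr(T_1 > t_1,\; T_2 > t_2).
```

Under independence of T_1 and T_2 the ratio equals 1 everywhere, and values above 1 indicate positive local association at (t_1, t_2).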
|
38
|
Jointly modeling the relationship between longitudinal and survival data subject to left truncation with applications to cystic fibrosis. Stat Med 2012; 31:3931-45. [PMID: 22786556 PMCID: PMC5551379 DOI: 10.1002/sim.5469] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 11/18/2010] [Accepted: 05/10/2012] [Indexed: 11/05/2022]
Abstract
Numerous methods for joint analysis of longitudinal measures of a continuous outcome y and a time-to-event outcome T have recently been developed. Their aims are to focus on the longitudinal data y while correcting for nonignorable dropout, to predict the survival outcome T using the longitudinal data y, or to examine the relationship between y and T. The motivating problem for our work is the joint modeling of serial measurements of pulmonary function (FEV1% predicted) and survival in cystic fibrosis (CF) patients using registry data. Within the CF registry data, an additional complexity is that not all patients have been followed from birth; therefore, some patients have delayed entry into the study while others may have been missed completely, giving rise to a left truncated distribution. This paper shows that, in joint modeling situations where y and T are not independent, it is necessary to account for this left truncation to obtain valid parameter estimates related to both survival and the longitudinal marker. We assume a linear random effects model for FEV1% predicted, where the random intercept and slope of FEV1% predicted, along with a specified transformation of the age at death, follow a trivariate normal distribution. We develop an expectation-maximization algorithm for maximum likelihood estimation of parameters, which takes left truncation and right censoring of survival times into account. The methods are illustrated using simulation studies and using data from CF patients in a registry followed at Rainbow Babies and Children's Hospital, Cleveland, OH.
|
39
|
Prevalent cases in observational studies of cancer survival: do they bias hazard ratio estimates? Br J Cancer 2009; 100:1806-11. [PMID: 19401693 PMCID: PMC2695697 DOI: 10.1038/sj.bjc.6605062] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.9] [Received: 01/22/2009] [Revised: 03/24/2009] [Accepted: 03/30/2009] [Indexed: 11/16/2022]
Abstract
Observational epidemiological studies often include prevalent cases recruited at various times after diagnosis. This left truncation can be dealt with in non-parametric (Kaplan-Meier) and semi-parametric (Cox) time-to-event analyses, theoretically generating an unbiased hazard ratio (HR) when the proportional hazards (PH) assumption holds. However, concern remains that inclusion of prevalent cases in survival analysis inevitably results in HR bias. We used data on three well-established breast cancer prognosticators - clinical stage, histopathological grade and oestrogen receptor (ER) status - from the SEARCH study, a population-based study including 4470 invasive breast cancer cases (incident and prevalent), to evaluate empirically the effectiveness of allowing for left truncation in limiting HR bias. We found that HRs of prognostic factors changed over time and used extended Cox models incorporating time-dependent covariates. When comparing Cox models restricted to subjects ascertained within six months of diagnosis (incident cases) to models based on the full data set allowing for left truncation, we found no difference in parameter estimates (P=0.90, 0.32 and 0.95, for stage, grade and ER status respectively). Our results show that use of prevalent cases in an observational epidemiological study of breast cancer does not bias the HR in a left truncation Cox survival analysis, provided the PH assumption holds true.
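The left-truncation adjustment mentioned for Kaplan-Meier analysis amounts to a change in the risk set: a subject counts as at risk at time t only if they have already entered the study and have not yet left it. The following is a minimal sketch of that idea, not the SEARCH study's code; the function name `km_left_truncated` and the toy data are hypothetical, and tied times are handled in the simplest possible way.

```python
from typing import List, Sequence, Tuple


def km_left_truncated(
    entry: Sequence[float],
    exit_time: Sequence[float],
    event: Sequence[int],
) -> List[Tuple[float, float]]:
    """Product-limit survival estimate with delayed entry (left truncation).

    entry[i]     : time since diagnosis at which subject i was recruited
    exit_time[i] : time since diagnosis of death or censoring
    event[i]     : 1 if exit_time[i] is a death, 0 if censored
    Returns (event time, estimated survival) pairs at each distinct event time.
    """
    if not (len(entry) == len(exit_time) == len(event)):
        raise ValueError("entry, exit_time and event must have equal length")
    if any(a >= x for a, x in zip(entry, exit_time)):
        raise ValueError("each subject must enter before they exit")

    # Distinct times at which at least one event was observed.
    event_times = sorted({x for x, d in zip(exit_time, event) if d})

    surv = 1.0
    curve = []
    for t in event_times:
        # Risk set under left truncation: entered before t, still followed at t.
        at_risk = sum(1 for a, x in zip(entry, exit_time) if a < t <= x)
        deaths = sum(1 for x, d in zip(exit_time, event) if d and x == t)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve


# Toy data: two subjects followed from diagnosis, two recruited later.
entry = [0.0, 0.0, 2.0, 3.0]
exit_time = [1.0, 4.0, 5.0, 6.0]
event = [1, 1, 0, 1]

adjusted = km_left_truncated(entry, exit_time, event)
# Ignoring truncation means pretending everyone was at risk from time 0.
naive = km_left_truncated([0.0] * 4, exit_time, event)
```

In this toy example the naive curve is higher at the first event time because the two late entrants are wrongly counted as at risk before they were recruited, which is the mechanism by which ignoring left truncation overstates survival.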
|