1. Li W, Wang Q, Ning J, Zhang J, Li Z, Savitz SI, Tahanan A, Rahbar MH. Enhancing long-term survival prediction with two short-term events: Landmarking with a flexible varying coefficient model. Stat Med 2024; 43:2607-2621. [PMID: 38664221] [DOI: 10.1002/sim.10086] [Received: 06/21/2023] [Revised: 03/08/2024] [Accepted: 04/11/2024]
Abstract
Patients with cardiovascular diseases who experience disease-related short-term events, such as hospitalizations, often exhibit long-term survival outcomes that differ markedly from those of other patients. In this study, we aim to improve the prediction of long-term survival probability by incorporating two short-term events into a flexible varying coefficient landmark model. Our objective is to predict long-term survival among patients who have survived up to a pre-specified landmark time since the initial admission. Inverse probability weighted estimating equations are constructed using the information on the short-term outcomes observed before the landmark time. The time-varying coefficients are estimated by kernel smoothing, with the bandwidth selected by cross-validation. The predictive performance of the proposed model is evaluated and compared using two predictive measures: the area under the receiver operating characteristic curve and the Brier score. Simulation studies confirm that the parameters of the landmark models can be estimated accurately and that the predictive performance of the proposed method consistently outperforms existing methods that do not incorporate, or only partially incorporate, information from the two short-term events. We demonstrate the practical application of our model using a community-based cohort from the Atherosclerosis Risk in Communities (ARIC) study.
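To make the landmarking idea concrete, here is a minimal sketch (on synthetic data, not the ARIC cohort) of the data-preparation step the abstract describes: restrict to patients still under observation at the landmark time, turn the pre-landmark short-term event history into a predictor, and reset the outcome clock. All column names are illustrative assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "time": rng.exponential(10.0, n),       # observed follow-up time (years)
    "event": rng.integers(0, 2, n),         # 1 = death observed
    "hosp_time": rng.exponential(4.0, n),   # time of first hospitalization
    "hosp": rng.integers(0, 2, n),          # 1 = a hospitalization occurred
})

t0 = 2.0  # pre-specified landmark time

# Landmarking step: keep only patients still under observation at t0.
at_risk = df[df["time"] >= t0].copy()

# Pre-landmark short-term event history becomes a predictor.
at_risk["hosp_before_t0"] = ((at_risk["hosp"] == 1)
                             & (at_risk["hosp_time"] < t0)).astype(int)

# Residual time since the landmark is the new outcome clock.
at_risk["resid_time"] = at_risk["time"] - t0
```

A model for long-term survival would then be fit to `resid_time` within this landmark cohort, with `hosp_before_t0` (and a second short-term event indicator, analogously defined) among the covariates.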
Affiliation(s)
- Wen Li
  - Division of Clinical and Translational Sciences, Department of Internal Medicine, The University of Texas McGovern Medical School at Houston, Houston, Texas, USA
  - Biostatistics/Epidemiology/Research Design (BERD) Component, Center for Clinical and Translational Sciences (CCTS), University of Texas Health Science Center at Houston, Houston, Texas, USA
- Qian Wang
  - Biostatistics/Epidemiology/Research Design (BERD) Component, Center for Clinical and Translational Sciences (CCTS), University of Texas Health Science Center at Houston, Houston, Texas, USA
  - Department of Biostatistics and Data Science, The University of Texas School of Public Health, Houston, Texas, USA
- Jing Ning
  - Department of Biostatistics, University of Texas MD Anderson Cancer Center at Houston, Houston, Texas, USA
- Jing Zhang
  - Biostatistics/Epidemiology/Research Design (BERD) Component, Center for Clinical and Translational Sciences (CCTS), University of Texas Health Science Center at Houston, Houston, Texas, USA
  - Department of Biostatistics and Data Science, The University of Texas School of Public Health, Houston, Texas, USA
- Zhouxuan Li
  - Biostatistics/Epidemiology/Research Design (BERD) Component, Center for Clinical and Translational Sciences (CCTS), University of Texas Health Science Center at Houston, Houston, Texas, USA
  - Department of Biostatistics and Data Science, The University of Texas School of Public Health, Houston, Texas, USA
- Sean I Savitz
  - Department of Neurology and Institute for Stroke and Cerebrovascular Disease, The University of Texas Health Science Center, Houston, Texas, USA
- Amirali Tahanan
  - Biostatistics/Epidemiology/Research Design (BERD) Component, Center for Clinical and Translational Sciences (CCTS), University of Texas Health Science Center at Houston, Houston, Texas, USA
- Mohammad H Rahbar
  - Division of Clinical and Translational Sciences, Department of Internal Medicine, The University of Texas McGovern Medical School at Houston, Houston, Texas, USA
  - Biostatistics/Epidemiology/Research Design (BERD) Component, Center for Clinical and Translational Sciences (CCTS), University of Texas Health Science Center at Houston, Houston, Texas, USA
  - Division of Epidemiology, Human Genetics and Environmental Sciences (EHGES), University of Texas School of Public Health at Houston, Houston, Texas, USA
2. Parast L, Tian L, Cai T. Assessing heterogeneity in surrogacy using censored data. Stat Med 2024. [PMID: 38812276] [DOI: 10.1002/sim.10122] [Received: 12/11/2023] [Revised: 04/22/2024] [Accepted: 05/10/2024]
Abstract
Determining whether a surrogate marker can be used to replace a primary outcome in a clinical study is complex. While many statistical methods have been developed to formally evaluate a surrogate marker, they generally do not provide a way to examine heterogeneity in the utility of a surrogate marker. Similar to treatment effect heterogeneity, where the effect of a treatment varies based on a patient characteristic, heterogeneity in surrogacy means that the strength or utility of the surrogate marker varies based on a patient characteristic. The few methods that have been recently developed to examine such heterogeneity cannot accommodate censored data. Studies with a censored outcome are typically the studies that could most benefit from a surrogate because the follow-up time is often long. In this paper, we develop a robust nonparametric approach to assess heterogeneity in the utility of a surrogate marker with respect to a baseline variable in a censored time-to-event outcome setting. In addition, we propose and evaluate a testing procedure to formally test for heterogeneity at a single time point or across multiple time points simultaneously. Finite-sample performance of our estimation and testing procedures is examined in a simulation study. We use our proposed method to investigate the complex relationship between change in fasting plasma glucose, diabetes, and sex hormones using data from the Diabetes Prevention Program study.
Affiliation(s)
- Layla Parast
  - Department of Statistics and Data Sciences, The University of Texas at Austin, Austin, Texas
- Lu Tian
  - Department of Biomedical Data Science, Stanford University, Stanford, California
- Tianxi Cai
  - Department of Biostatistics, Harvard University, Cambridge, Massachusetts
3. Zhou W, Zhu R, Zeng D. A parsimonious personalized dose-finding model via dimension reduction. Biometrika 2021; 108:643-659. [PMID: 34658383] [PMCID: PMC8514170] [DOI: 10.1093/biomet/asaa087]
Abstract
Learning an individualized dose rule in personalized medicine is a challenging statistical problem. Existing methods often suffer from the curse of dimensionality, especially when the decision function is estimated nonparametrically. To tackle this problem, we propose a dimension reduction framework that effectively reduces the estimation to a lower-dimensional subspace of the covariates. We exploit the fact that the individualized dose rule can be defined in a subspace spanned by a few linear combinations of the covariates, leading to a more parsimonious model. Moreover, because it directly maximizes the value function, our framework does not require inverse weighting by the propensity score in observational studies. This distinguishes it from the outcome weighted learning framework, which also estimates decision rules directly. Under the same framework, we further propose a pseudo-direct learning approach that focuses more on estimating the dimensionality-reduced subspace of the treatment outcome. Parameters in both approaches can be estimated efficiently using an orthogonality constrained optimization algorithm on the Stiefel manifold. Under mild regularity assumptions, asymptotic normality results for the proposed estimators are established. We also derive the consistency and convergence rate of the value function under the estimated optimal dose rule. We evaluate the performance of the proposed approaches through extensive simulation studies and a warfarin pharmacogenetic dataset.
Affiliation(s)
- Wenzhuo Zhou
  - Department of Statistics, University of Illinois at Urbana-Champaign, Champaign, Illinois 61820, U.S.A
- Ruoqing Zhu
  - Department of Statistics, University of Illinois at Urbana-Champaign, Champaign, Illinois 61820, U.S.A
- Donglin Zeng
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, U.S.A
4. Wang X, Zheng Y, Jensen MK, He Z, Cai T. Biomarker evaluation under imperfect nested case-control design. Stat Med 2021; 40:4035-4052. [PMID: 33915597] [PMCID: PMC8286316] [DOI: 10.1002/sim.9012] [Received: 11/05/2019] [Revised: 04/06/2021] [Accepted: 04/12/2021]
Abstract
The nested case-control (NCC) design has been widely adopted as a cost-effective sampling design for biomarker research. Under the NCC design, markers are only measured for the NCC subcohort, consisting of all cases and a fraction of the controls selected randomly from the matched risk sets of the cases. Robust methods for evaluating the prediction performance of risk models have been derived under the inverse probability weighting framework. The probabilities of samples being included in the NCC subcohort can be calculated based on the study design or estimated nonparametrically, as in previous work. Neither strategy works well in practical settings where the sampling does not entirely follow the study design or depends on many factors, owing to model mis-specification and the curse of dimensionality, respectively. In this paper, we propose an alternative strategy to estimate the sampling probabilities based on a varying coefficient model, which strikes a balance between robustness and the curse of dimensionality. The complex correlation structure induced by repeated finite risk set sampling makes the standard resampling procedure for variance estimation fail. We propose a perturbation resampling procedure that provides valid interval estimation for the proposed estimators. Simulation studies show that the proposed method performs well in finite samples. We apply the proposed method to the Nurses' Health Study II to develop and evaluate prediction models using clinical biomarkers for cardiovascular risk.
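For the design-based strategy mentioned above, the probability that a control is ever sampled into the NCC subcohort can be computed from the risk sets (a Samuelsen-type calculation). The sketch below assumes simple random sampling of m controls from each case's risk set with no additional matching; `ncc_inclusion_prob` is a hypothetical helper name, not the paper's implementation.

```python
import numpy as np

def ncc_inclusion_prob(times, is_case, m):
    """Design-based probability that each subject ever enters the NCC
    subcohort: for a control, p_i = 1 - prod over case times t_j <= T_i of
    (1 - m / (n(t_j) - 1)), where n(t_j) is the risk-set size at t_j and
    m controls are drawn per case; cases are included with probability 1."""
    times = np.asarray(times, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)
    probs = np.ones(len(times))
    case_times = np.sort(times[is_case])
    for i in range(len(times)):
        if is_case[i]:
            continue  # cases enter the subcohort by design
        # sampling draws for which subject i was in the risk set
        tj = case_times[case_times <= times[i]]
        if len(tj) == 0:
            probs[i] = 0.0  # never at risk at any case time
            continue
        # risk-set size at each case time, excluding the case itself
        n_risk = np.array([(times >= t).sum() - 1 for t in tj])
        keep_out = np.clip(1.0 - m / np.maximum(n_risk, 1), 0.0, 1.0)
        probs[i] = 1.0 - np.prod(keep_out)
    return probs
```

The reciprocals of these probabilities are the inverse probability weights used in the evaluation framework the abstract refers to.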
Affiliation(s)
- Xuan Wang
  - Department of Biostatistics, Harvard University, Boston, MA, USA
- Yingye Zheng
  - Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Zeling He
  - Department of Biostatistics, Harvard University, Boston, MA, USA
- Tianxi Cai
  - Department of Biostatistics, Harvard University, Boston, MA, USA
  - Department of Biomedical Informatics, Harvard University, Boston, MA, USA
5. Watson JA, Holmes CC. Machine learning analysis plans for randomised controlled trials: detecting treatment effect heterogeneity with strict control of type I error. Trials 2020; 21:156. [PMID: 32041653] [PMCID: PMC7011561] [DOI: 10.1186/s13063-020-4076-y] [Received: 03/22/2019] [Accepted: 01/15/2020]
Abstract
BACKGROUND Retrospective exploratory analyses of randomised controlled trials (RCTs) seeking to identify treatment effect heterogeneity (TEH) are prone to bias and false positives. Yet the desire to learn all we can from exhaustive data measurements on trial participants motivates the inclusion of such analyses within RCTs. Moreover, widespread advances in machine learning (ML) methods hold potential to utilise such data to identify subjects exhibiting heterogeneous treatment response. METHODS We present a novel analysis strategy for detecting TEH in randomised data using ML methods, whilst ensuring proper control of the false positive discovery rate. Our approach uses random data partitioning with statistical or ML-based prediction on held-out data. This method can test for both crossover TEH (switch in optimal treatment) and non-crossover TEH (systematic variation in benefit across patients). The former is done via a two-sample hypothesis test measuring overall predictive performance. The latter is done via 'stacking' the ML predictors alongside a classical statistical model to formally test the added benefit of the ML algorithm. An adaptation of recent statistical theory allows for the construction of a valid aggregate p value. This testing strategy is independent of the choice of ML method. RESULTS We demonstrate our approach with a re-analysis of the SEAQUAMAT trial, which compared quinine to artesunate for the treatment of severe malaria in Asian adults. We find no evidence for any subgroup who would benefit from a change in treatment from the current standard of care, artesunate, but strong evidence for significant TEH within the artesunate treatment group. In particular, we find that artesunate provides a differential benefit to patients with high numbers of circulating ring stage parasites. 
CONCLUSIONS ML analysis plans using computational notebooks (documents linked to a programming language that capture the model parameter settings, data processing choices, and evaluation criteria) along with version control can improve the robustness and transparency of RCT exploratory analyses. A data-partitioning algorithm allows researchers to apply the latest ML techniques safe in the knowledge that any declared associations are statistically significant at a user-defined level.
Affiliation(s)
- James A Watson
  - Mahidol Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Rajvithi Road, Bangkok, 10400, Thailand
  - Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7LF, UK
- Chris C Holmes
  - Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7LF, UK
  - Department of Statistics, University of Oxford, 29 Saint Giles', Oxford, OX1 3LB, UK
6. Garcia TP, Parast L. Dynamic landmark prediction for mixture data. Biostatistics 2019; 22:558-574. [PMID: 31758793] [PMCID: PMC8286554] [DOI: 10.1093/biostatistics/kxz052] [Received: 01/29/2019] [Revised: 10/27/2019] [Accepted: 10/30/2019]
Abstract
In kin-cohort studies, clinicians want to provide their patients with the most current cumulative risk of death arising from a rare deleterious mutation. Estimating the cumulative risk is difficult when the genetic mutation status is unknown and only estimated probabilities of a patient having the mutation are available. We estimate the cumulative risk for this scenario using a novel nonparametric estimator that incorporates covariate information and dynamic landmark prediction. Our estimator has improved prediction accuracy over existing estimators that ignore covariate information. It is built within a dynamic landmark prediction framework whereby we can obtain personalized dynamic predictions over time. Compared to current standards, a simple transformation of our estimator provides more efficient estimates of marginal distribution functions in settings where patient-specific predictions are not the main goal. We show our estimator is unbiased and has more predictive accuracy compared to methods that ignore covariate information and landmarking. Applying our method to a Huntington disease study of mortality, we develop dynamic survival prediction curves incorporating gender and familial genetic information.
Affiliation(s)
- Tanya P Garcia
  - Department of Statistics, Texas A&M University, 3143 TAMU, College Station, TX 77843-3143, USA
  - RAND Corporation, 1776 Main Street, Santa Monica, CA 90401, USA
- Layla Parast
  - Department of Statistics, Texas A&M University, 3143 TAMU, College Station, TX 77843-3143, USA
  - RAND Corporation, 1776 Main Street, Santa Monica, CA 90401, USA
7. Claggett B, Tian L, Fu H, Solomon SD, Wei LJ. Quantifying the totality of treatment effect with multiple event-time observations in the presence of a terminal event from a comparative clinical study. Stat Med 2018; 37:3589-3598. [PMID: 30047148] [DOI: 10.1002/sim.7907] [Received: 08/22/2017] [Revised: 06/14/2018] [Accepted: 06/14/2018]
Abstract
To evaluate the totality of one treatment's benefit/risk profile relative to an alternative treatment via a longitudinal comparative clinical study, the timing and occurrence of multiple clinical events are typically collected during the patient's follow-up. These multiple observations reflect the patient's disease progression/burden over time. The standard practice is to create a composite endpoint from the multiple outcomes, namely the time to the first clinical event, and to evaluate the treatment via standard survival analysis techniques. Because it ignores all events occurring after the first, this type of assessment may not be ideal. Various parametric or semiparametric procedures have been discussed extensively in the literature for analyzing multiple event-time data. Many existing methods were developed under extensive model assumptions; when those assumptions are not plausible, the resulting inferences for the treatment effect may be misleading. In this article, we propose a simple, nonparametric inference procedure to quantify the treatment effect that has an intuitive, clinically meaningful interpretation. We use data from a cardiovascular clinical trial for heart failure to illustrate the procedure. A simulation study is also conducted to evaluate the performance of the new proposal.
Affiliation(s)
- Lu Tian
  - Stanford University School of Medicine, Stanford, California
- Haoda Fu
  - Lilly Research Laboratories, Indianapolis, Indiana
- Lee-Jen Wei
  - Harvard University, Cambridge, Massachusetts
8. Zhou QM, Dai W, Zheng Y, Cai T. Robust Dynamic Risk Prediction with Longitudinal Studies. Stat Theory Relat Fields 2017; 1:159-170. [PMID: 29335682] [DOI: 10.1080/24754269.2017.1400418]
Abstract
Providing accurate and dynamic age-specific risk prediction is a crucial step in precision medicine. In this manuscript, we introduce an approach for estimating the τ-year age-specific absolute risk directly via a flexible varying coefficient model. The approach facilitates the utilization of predictors varying over an individual's lifetime. By using a nonparametric inverse probability weighted kernel estimating equation, the age-specific effects of risk factors are estimated without requiring the specification of the functional form. The approach allows borrowing information across individuals of similar ages, and therefore provides a practical solution for situations where the longitudinal information is only measured sparsely. We evaluate the performance of the proposed estimation and inference procedures with numerical studies, and make comparisons with existing methods in the literature. We illustrate the performance of our proposed approach by developing a dynamic prediction model using data from the Framingham Study.
Affiliation(s)
- Qian M Zhou
  - Department of Mathematics and Statistics, Mississippi State University, Mississippi State, Mississippi, USA, 39762
- Wei Dai
  - Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA, 02115
- Yingye Zheng
  - Department of Biostatistics and Biomathematics, Fred Hutchinson Cancer Research Center, Seattle, WA, USA, 98109
- Tianxi Cai
  - Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA, 02115
9. Skrivankova V, Heagerty PJ. Single index methods for evaluation of marker-guided treatment rules based on multivariate marker panels. Biometrics 2017; 74:663-672. [PMID: 28783868] [DOI: 10.1111/biom.12752] [Received: 10/01/2015] [Revised: 06/01/2017] [Accepted: 06/01/2017]
Abstract
Clinical practice may be enhanced by use of person-level information that could guide treatment choice and lead to better outcomes for both treated individuals and for the population. The scientific challenge is to identify and validate those factors that can reliably be used to target treatment, and to accurately quantify the expected treatment benefit as a function of candidate markers. Our proposal is to explicitly focus on smooth non-parametric evaluation of a canonical single index score that estimates the expected treatment benefit associated with patient characteristics. Our methods intentionally decouple the model used to generate the treatment benefit score from the methods that are adopted to evaluate the performance of the resulting single index score. We are motivated by the practical issue that model performance cannot realistically be evaluated for every specific covariate value due to intrinsic sparseness. However, direct validation of a scalar treatment benefit score obtained through model-based dimension reduction is feasible, and we believe should be the focus of validation efforts. We also show that the canonical single index treatment benefit score can be used for selecting subsets of patients with enriched expected treatment response since patients can be easily ordered and grouped based on the scalar score. Our biomedical motivation comes from a recent randomized trial of steroid injections for low back pain where baseline clinical and imaging data are candidate measures for guiding therapeutic choice.
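To illustrate the single-index idea on synthetic data, the sketch below uses a simple linear working model (standing in for whatever model generates the benefit score): fit treatment-covariate interactions, collapse them into one scalar benefit score per patient, then order and group patients by that score, for example to enrich with the top quartile of predicted responders.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 600
x = rng.normal(size=(n, 4))          # candidate marker panel
trt = rng.integers(0, 2, n)          # randomized treatment indicator
# synthetic outcome whose treatment benefit varies with x[:, 0] and x[:, 2]
y = (x @ np.array([1.0, -0.5, 0.0, 0.0])
     + trt * (x @ np.array([0.7, 0.0, -0.3, 0.0]))
     + rng.normal(0.0, 1.0, n))

# Working model with treatment-covariate interactions; the fitted
# interaction coefficients define the single-index benefit score.
X = np.column_stack([np.ones(n), x,
                     trt[:, None] * np.column_stack([np.ones(n), x])])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
gamma = beta[5:]  # coefficients of [1, x] * trt

# Scalar benefit score: estimated treated-minus-control outcome difference.
score = np.column_stack([np.ones(n), x]) @ gamma

# Patients are easily ordered and grouped by the scalar score.
top = score >= np.quantile(score, 0.75)
```

The paper's point is that validating `score` against held-out outcomes is feasible even when validating the full covariate-specific model is not.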
Affiliation(s)
- Patrick J Heagerty
  - Department of Biostatistics, University of Washington, Seattle, Washington, U.S.A
10. Dobler D, Pauly M. Approximate tests for the equality of two cumulative incidence functions of a competing risk. Statistics 2017. [DOI: 10.1080/02331888.2017.1336171]
Affiliation(s)
- Dennis Dobler
  - Institute of Statistics, Ulm University, Ulm, Germany
- Markus Pauly
  - Institute of Statistics, Ulm University, Ulm, Germany
11. Agniel D, Cai T. Analysis of multiple diverse phenotypes via semiparametric canonical correlation analysis. Biometrics 2017; 73:1254-1265. [PMID: 28407213] [DOI: 10.1111/biom.12690] [Received: 11/01/2015] [Revised: 02/01/2017] [Accepted: 02/01/2017]
Abstract
Studying multiple outcomes simultaneously allows researchers to begin to identify underlying factors that affect all of a set of diseases (i.e., shared etiology) and what may give rise to differences in disorders between patients (i.e., disease subtypes). In this work, our goal is to build risk scores that are predictive of multiple phenotypes simultaneously and identify subpopulations at high risk of multiple phenotypes. Such analyses could yield insight into etiology or point to treatment and prevention strategies. The standard canonical correlation analysis (CCA) can be used to relate multiple continuous outcomes to multiple predictors. However, in order to capture the full complexity of a disorder, phenotypes may include a diverse range of data types, including binary, continuous, ordinal, and censored variables. When phenotypes are diverse in this way, standard CCA is not possible and no methods currently exist to model them jointly. In the presence of such complications, we propose a semi-parametric CCA method to develop risk scores that are predictive of multiple phenotypes. To guard against potential model mis-specification, we also propose a nonparametric calibration method to identify subgroups that are at high risk of multiple disorders. A resampling procedure is also developed to account for the variability in these estimates. Our method opens the door to synthesizing a wide array of data sources for the purposes of joint prediction.
Affiliation(s)
- Denis Agniel
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts
- Tianxi Cai
  - Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts
12. Parast L, Griffin BA. Landmark estimation of survival and treatment effects in observational studies. Lifetime Data Anal 2017; 23:161-182. [PMID: 26880366] [PMCID: PMC4985509] [DOI: 10.1007/s10985-016-9358-z] [Received: 05/18/2015] [Accepted: 01/12/2016]
Abstract
Clinical studies aimed at identifying effective treatments to reduce the risk of disease or death often require long term follow-up of participants in order to observe a sufficient number of events to precisely estimate the treatment effect. In such studies, observing the outcome of interest during follow-up may be difficult and high rates of censoring may be observed which often leads to reduced power when applying straightforward statistical methods developed for time-to-event data. Alternative methods have been proposed to take advantage of auxiliary information that may potentially improve efficiency when estimating marginal survival and improve power when testing for a treatment effect. Recently, Parast et al. (J Am Stat Assoc 109(505):384-394, 2014) proposed a landmark estimation procedure for the estimation of survival and treatment effects in a randomized clinical trial setting and demonstrated that significant gains in efficiency and power could be obtained by incorporating intermediate event information as well as baseline covariates. However, the procedure requires the assumption that the potential outcomes for each individual under treatment and control are independent of treatment group assignment which is unlikely to hold in an observational study setting. In this paper we develop the landmark estimation procedure for use in an observational setting. In particular, we incorporate inverse probability of treatment weights (IPTW) in the landmark estimation procedure to account for selection bias on observed baseline (pretreatment) covariates. We demonstrate that consistent estimates of survival and treatment effects can be obtained by using IPTW and that there is improved efficiency by using auxiliary intermediate event and baseline information. We compare our proposed estimates to those obtained using the Kaplan-Meier estimator, the original landmark estimation procedure, and the IPTW Kaplan-Meier estimator. We illustrate our resulting reduction in bias and gains in efficiency through a simulation study and apply our procedure to an AIDS dataset to examine the effect of previous antiretroviral therapy on survival.
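The IPTW Kaplan-Meier comparator mentioned above can be sketched as follows: each subject is weighted by the inverse of their propensity score (here assumed already estimated), and the product-limit formula is applied to weighted event and at-risk counts. `iptw_km` is an illustrative helper, not the paper's implementation.

```python
import numpy as np

def iptw_km(time, event, treated, ps, group=1, t_grid=None):
    """IPTW-weighted Kaplan-Meier curve for one arm: subjects in the chosen
    arm get weight 1/ps (treated) or 1/(1-ps) (control), and the usual
    product-limit formula uses weighted counts."""
    mask = treated == group
    w = np.where(treated == 1, 1.0 / ps, 1.0 / (1.0 - ps))[mask]
    t, d = np.asarray(time)[mask], np.asarray(event)[mask]
    if t_grid is None:
        t_grid = np.sort(np.unique(t[d == 1]))  # distinct event times
    surv, s = [], 1.0
    for tk in t_grid:
        at_risk = w[t >= tk].sum()              # weighted number at risk
        events = w[(t == tk) & (d == 1)].sum()  # weighted events at tk
        if at_risk > 0:
            s *= 1.0 - events / at_risk
        surv.append(s)
    return np.asarray(t_grid), np.asarray(surv)
```

With a constant propensity score (equal weights) this reduces to the ordinary Kaplan-Meier estimator, which is a useful sanity check.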
Affiliation(s)
- Layla Parast
  - RAND Corporation, 1776 Main Street, Santa Monica, CA, 90403, USA
- Beth Ann Griffin
  - RAND Corporation, 1776 Main Street, Santa Monica, CA, 90403, USA
13. Chen G, Zeng D, Kosorok MR. Personalized Dose Finding Using Outcome Weighted Learning. J Am Stat Assoc 2017; 111:1509-1521. [PMID: 28255189] [PMCID: PMC5327863] [DOI: 10.1080/01621459.2016.1148611] [Received: 12/01/2014] [Revised: 12/01/2015]
Abstract
In dose-finding clinical trials, it is becoming increasingly important to account for individual level heterogeneity while searching for optimal doses to ensure an optimal individualized dose rule (IDR) maximizes the expected beneficial clinical outcome for each individual. In this paper, we advocate a randomized trial design where candidate dose levels assigned to study subjects are randomly chosen from a continuous distribution within a safe range. To estimate the optimal IDR using such data, we propose an outcome weighted learning method based on a nonconvex loss function, which can be solved efficiently using a difference of convex functions algorithm. The consistency and convergence rate for the estimated IDR are derived, and its small-sample performance is evaluated via simulation studies. We demonstrate that the proposed method outperforms competing approaches. Finally, we illustrate this method using data from a cohort study for Warfarin (an anti-thrombotic drug) dosing.
Affiliation(s)
- Guanhua Chen
  - Assistant Professor, Department of Biostatistics, Vanderbilt University, Nashville, TN 37203
- Donglin Zeng
  - Professor, Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599
- Michael R Kosorok
  - W. R. Kenan, Jr. Distinguished Professor and Chair, Department of Biostatistics, and Professor, Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC 27599
14. Parast L, McDermott MM, Tian L. Robust estimation of the proportion of treatment effect explained by surrogate marker information. Stat Med 2015; 35:1637-1653. [PMID: 26631934] [DOI: 10.1002/sim.6820] [Received: 04/02/2015] [Revised: 10/28/2015] [Accepted: 11/02/2015]
Abstract
In randomized treatment studies where the primary outcome requires long follow-up of patients and/or expensive or invasive measurement procedures, a surrogate marker that can be used to estimate the treatment effect and that is potentially observable earlier than the primary outcome would allow researchers to draw conclusions about the treatment effect with less follow-up time and fewer resources. The Prentice criterion for a valid surrogate marker requires that a test for treatment effect on the surrogate marker also be a valid test for treatment effect on the primary outcome of interest. Based on this criterion, methods have been developed to define and estimate the proportion of the treatment effect on the primary outcome that is explained by the treatment effect on the surrogate marker. These methods aim to identify useful statistical surrogates that capture a large proportion of the treatment effect. However, current methods to estimate this proportion usually require restrictive model assumptions that may not hold in practice and thus may lead to biased estimates of this quantity. In this paper, we propose a nonparametric procedure to estimate the proportion of the treatment effect on the primary outcome that is explained by the treatment effect on a potential surrogate marker, and we extend this procedure to a setting with multiple surrogate markers. We compare our approach with previously proposed model-based approaches and propose a variance estimation procedure based on a perturbation-resampling method. Simulation studies demonstrate that the procedure performs well in finite samples and outperforms model-based procedures when the specified models are not correct. We illustrate our proposed procedure using a data set from a randomized study investigating a group-mediated cognitive behavioral intervention for participants with peripheral artery disease.
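For contrast with the nonparametric proposal, the classical model-based (Freedman-style) proportion-explained calculation, which the paper argues can be biased under model mis-specification, looks like this on synthetic continuous data: compare the regression coefficient of treatment before and after adjusting for the surrogate.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
a = rng.integers(0, 2, n)                      # randomized treatment arm
s = 1.0 * a + rng.normal(0.0, 1.0, n)          # surrogate responds to treatment
y = 2.0 * s + 0.5 * a + rng.normal(0.0, 1.0, n)  # outcome acts mostly through s

# Unadjusted treatment effect on the primary outcome.
X1 = np.column_stack([np.ones(n), a])
b1 = np.linalg.lstsq(X1, y, rcond=None)[0]

# Treatment effect after adjusting for the surrogate.
X2 = np.column_stack([np.ones(n), a, s])
b2 = np.linalg.lstsq(X2, y, rcond=None)[0]

# Freedman-style proportion of treatment effect explained.
pte = 1.0 - b2[1] / b1[1]
```

Under this generating model the true proportion is about 0.8; when the linear models are wrong, this estimate can be badly biased, which is the motivation for the robust nonparametric estimator in the paper.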
Affiliation(s)
- Layla Parast
- RAND Corporation, 1776 Main Street, Santa Monica, 90401, CA, U.S.A
- Mary M McDermott
- Department of Medicine and Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, 60611, IL, U.S.A
- Lu Tian
- Department of Health Research and Policy, Stanford University, Stanford, 94305, CA, U.S.A
15
Zhao L, Claggett B, Tian L, Uno H, Pfeffer MA, Solomon SD, Trippa L, Wei LJ. On the restricted mean survival time curve in survival analysis. Biometrics 2015; 72:215-21. [PMID: 26302239 DOI: 10.1111/biom.12384] [Citation(s) in RCA: 149] [Impact Index Per Article: 16.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2015] [Revised: 06/01/2015] [Accepted: 07/01/2015] [Indexed: 12/29/2022]
Abstract
For a study with an event time as the endpoint, its survival function contains all the information regarding the temporal, stochastic profile of this outcome variable. The survival probability at a specific time point, say t, however, does not transparently capture the temporal profile of this endpoint up to t. An alternative is to use the restricted mean survival time (RMST) at time t to summarize the profile. The RMST is the mean survival time of all subjects in the study population followed up to t, and is simply the area under the survival curve up to t. The advantages of using such a quantification over the survival rate have been discussed in the setting of a fixed-time analysis. In this article, we generalize this approach by considering a curve based on the RMST over time as an alternative summary to the survival function. Inference procedures, for instance simultaneous confidence bands for a single RMST curve and for the difference between two RMST curves, are proposed. The latter is informative for evaluating two groups under an equivalence or noninferiority setting, and quantifies the difference of two groups in a time scale. The proposal is illustrated with the data from two clinical trials, one from oncology and the other from cardiology.
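Since the RMST at t is just the area under the Kaplan-Meier curve on [0, t], it can be computed in a few lines. The sketch below is illustrative (hand-rolled names, right-censored data assumed); a real analysis would normally use an established survival package:

```python
import numpy as np

def km_survival(time, event):
    """Kaplan-Meier survival estimates at the distinct observed event times."""
    uniq = np.unique(time[event == 1])
    surv, s = [], 1.0
    for t in uniq:
        at_risk = np.sum(time >= t)                   # still under observation at t-
        deaths = np.sum((time == t) & (event == 1))
        s *= 1.0 - deaths / at_risk
        surv.append(s)
    return uniq, np.asarray(surv)

def rmst(time, event, tau):
    """Restricted mean survival time: area under the KM step function on [0, tau]."""
    t, s = km_survival(time, event)
    t_grid = np.concatenate(([0.0], t[t < tau], [tau]))
    s_grid = np.concatenate(([1.0], s[t < tau]))      # survival is 1 before first event
    return float(np.sum(np.diff(t_grid) * s_grid))

# with no censoring, the RMST reduces to the sample mean of min(T, tau)
time = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
event = np.ones(5, dtype=int)
print(round(rmst(time, event, tau=4.0), 6))  # 2.8 = mean of [1, 2, 3, 4, 4]
```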
Affiliation(s)
- Lihui Zhao
- Department of Preventive Medicine, Northwestern University, Chicago, Illinois 60611, U.S.A
- Brian Claggett
- Division of Cardiovascular Medicine, Brigham & Women's Hospital, Harvard Medical School, Boston, Massachusetts 02115, U.S.A
- Lu Tian
- Department of Health Research and Policy, Stanford University, Stanford, California 94305, U.S.A
- Hajime Uno
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts 02115, U.S.A
- Marc A Pfeffer
- Division of Cardiovascular Medicine, Brigham & Women's Hospital, Harvard Medical School, Boston, Massachusetts 02115, U.S.A
- Scott D Solomon
- Division of Cardiovascular Medicine, Brigham & Women's Hospital, Harvard Medical School, Boston, Massachusetts 02115, U.S.A
- Lorenzo Trippa
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, Massachusetts 02115, U.S.A.; Department of Biostatistics, Harvard University, Boston, Massachusetts 02115, U.S.A
- L J Wei
- Department of Biostatistics, Harvard University, Boston, Massachusetts 02115, U.S.A
16
Claggett B, Tian L, Castagno D, Wei LJ. Treatment selections using risk-benefit profiles based on data from comparative randomized clinical trials with multiple endpoints. Biostatistics 2014; 16:60-72. [PMID: 25122189 DOI: 10.1093/biostatistics/kxu037] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
In a typical randomized clinical study to compare a new treatment with a control, oftentimes each study subject may experience any of several distinct outcomes during the study period, which collectively define the "risk-benefit" profile. To assess the effect of treatment, it is desirable to utilize the entirety of such outcome information. The times to these events, however, may not be observed completely due to, for example, competing risks or administrative censoring. The standard analyses based on the time to the first event, or individual component analyses with respect to each event time, are not ideal. In this paper, we classify each patient's risk-benefit profile, by considering all event times during follow-up, into several clinically meaningful ordinal categories. We first show how to make inferences for the treatment difference in a two-sample setting where categorical data are incomplete due to censoring. We then present a systematic procedure to identify patients who would benefit from a specific treatment using baseline covariate information. To obtain a valid and efficient system for personalized medicine, we utilize a cross-validation method for model building and evaluation and then make inferences using the final selected prediction procedure with an independent data set. The proposal is illustrated with the data from a clinical trial to evaluate a beta-blocker for treating chronic heart failure patients.
Affiliation(s)
- Brian Claggett
- Division of Cardiovascular Medicine, Harvard Medical School, Boston, MA 02115, USA
- Lu Tian
- Department of Health Research and Policy, Stanford University School of Medicine, Stanford, CA 94305, USA
- Davide Castagno
- Division of Cardiology, Department of Medical Sciences, University of Turin, Turin 10124, Italy
- Lee-Jen Wei
- Department of Biostatistics, Harvard University, Boston, MA 02115, USA
17
Abstract
Biosignatures such as brain scans, mass spectrometry, or gene expression profiles might one day be used to guide treatment selection and improve outcomes. This article develops a way of estimating optimal treatment policies based on data from randomized clinical trials by interpreting patient biosignatures as functional predictors. A flexible functional regression model is used to represent the treatment effect and construct the estimated policy. The effectiveness of the estimated policy is assessed by furnishing prediction intervals for the mean outcome when all patients follow the policy. The validity of these prediction intervals is established under mild regularity conditions on the functional regression model. The performance of the proposed approach is evaluated in numerical studies.
Affiliation(s)
- Ian W McKeague
- Department of Biostatistics, Columbia University, 722 West 168th Street, New York, NY 10032, USA
- Min Qian
- Department of Biostatistics, Columbia University, 722 West 168th Street, New York, NY 10032, USA
18
Parast L, Tian L, Cai T. Landmark Estimation of Survival and Treatment Effect in a Randomized Clinical Trial. J Am Stat Assoc 2014; 109:384-394. [PMID: 24659838 DOI: 10.1080/01621459.2013.842488] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
Abstract
In many studies with a survival outcome, it is often not feasible to fully observe the primary event of interest. This often leads to heavy censoring and thus difficulty in efficiently estimating survival or comparing survival rates between two groups. In certain diseases, baseline covariates and the event time of non-fatal intermediate events may be associated with overall survival. In these settings, incorporating such additional information may lead to gains in efficiency in estimation of survival and testing for a difference in survival between two treatment groups. If gains in efficiency can be achieved, it may then be possible to decrease the sample size of patients required for a study to achieve a particular power level or decrease the duration of the study. Most existing methods for incorporating intermediate events and covariates to predict survival focus on estimation of relative risk parameters and/or the joint distribution of events under semiparametric models. However, in practice, these model assumptions may not hold and hence may lead to biased estimates of the marginal survival. In this paper, we propose a semi-nonparametric two-stage procedure to estimate and compare t-year survival rates by incorporating intermediate event information observed before some landmark time, which serves as a useful approach to overcome semi-competing risks issues. In a randomized clinical trial setting, we further improve efficiency through an additional calibration step. Simulation studies demonstrate substantial potential gains in efficiency in terms of estimation and power. We illustrate our proposed procedures using an AIDS Clinical Trial Protocol 175 dataset by estimating survival and examining the difference in survival between two treatment groups: zidovudine and zidovudine plus zalcitabine.
Affiliation(s)
- Lu Tian
- Stanford University, Department of Health Research and Policy, Stanford, CA 94305
- Tianxi Cai
- Harvard University, Department of Biostatistics, Boston, MA 02115
19
Dobler D, Pauly M. Bootstrapping Aalen-Johansen processes for competing risks: Handicaps, solutions, and limitations. Electron J Stat 2014. [DOI: 10.1214/14-ejs972] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]
20
Gorfine M, Hsu L, Parmigiani G. Frailty Models for Familial Risk with Application to Breast Cancer. J Am Stat Assoc 2013; 108:1205-1215. [PMID: 24678132 PMCID: PMC3963469 DOI: 10.1080/01621459.2013.818001] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022]
Abstract
In evaluating familial risk for disease we have two main statistical tasks: assessing the probability of carrying an inherited genetic mutation conferring higher risk; and predicting the absolute risk of developing diseases over time, for those individuals whose mutation status is known. Despite substantial progress, much remains unknown about the role of genetic and environmental risk factors, about the sources of variation in risk among families that carry high-risk mutations, and about the sources of familial aggregation beyond major Mendelian effects. These sources of heterogeneity contribute substantial variation in risk across families. In this paper we present simple and efficient methods for accounting for this variation in familial risk assessment. Our methods are based on frailty models. We implemented them in the context of generalizing Mendelian models of cancer risk, and compared our approaches to others that do not consider heterogeneity across families. Our extensive simulation study demonstrates that when predicting the risk of developing a disease over time conditional on carrier status, accounting for heterogeneity results in a substantial improvement in the area under the curve of the receiver operating characteristic. On the other hand, the improvement for carriership probability estimation is more limited. We illustrate the utility of the proposed approach through the analysis of BRCA1 and BRCA2 mutation carriers in the Washington Ashkenazi Kin-Cohort Study of Breast Cancer.
Affiliation(s)
- Malka Gorfine
- Faculty of Industrial Engineering and Management, Technion - Israel Institute of Technology, Technion City, Haifa 32000, Israel
- Li Hsu
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA 98109-1024, U.S.A
- Giovanni Parmigiani
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute; Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, U.S.A
21
Tian L, Zhao L, Wei LJ. Predicting the restricted mean event time with the subject's baseline covariates in survival analysis. Biostatistics 2013; 15:222-33. [PMID: 24292992 DOI: 10.1093/biostatistics/kxt050] [Citation(s) in RCA: 124] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
For designing, monitoring, and analyzing a longitudinal study with an event time as the outcome variable, the restricted mean event time (RMET) is an easily interpretable, clinically meaningful summary of the survival function in the presence of censoring. The RMET is the average of all potential event times measured up to a time point τ and can be estimated consistently by the area under the Kaplan-Meier curve over [0, τ]. In this paper, we study a class of regression models, which directly relates the RMET to its "baseline" covariates for predicting the future subjects' RMETs. Since the standard Cox and the accelerated failure time models can also be used for estimating such RMETs, we utilize a cross-validation procedure to select the "best" among all the working models considered in the model building and evaluation process. Lastly, we draw inferences for the predicted RMETs to assess the performance of the final selected model using an independent data set or a "hold-out" sample from the original data set. All the proposals are illustrated with the data from an HIV clinical trial conducted by the AIDS Clinical Trials Group and the primary biliary cirrhosis study conducted by the Mayo Clinic.
Affiliation(s)
- Lu Tian
- Department of Health Research and Policy, Stanford University, Stanford, CA 94305, USA
22
Estimating Subject-Specific Treatment Differences for Risk-Benefit Assessment with Applications to Beta-Blocker Effectiveness Trials. ACTA ACUST UNITED AC 2013. [DOI: 10.1007/978-1-4614-7846-1_7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register]
23
Zheng Y, Parast L, Cai T, Brown M. Evaluating incremental values from new predictors with net reclassification improvement in survival analysis. Lifetime Data Anal 2013; 19:350-370. [PMID: 23254468 PMCID: PMC3686882 DOI: 10.1007/s10985-012-9239-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/22/2012] [Accepted: 11/27/2012] [Indexed: 06/01/2023]
Abstract
Developing individualized prediction rules for disease risk and prognosis has played a key role in modern medicine. When new genomic or biological markers become available to assist in risk prediction, it is essential to assess the improvement in clinical usefulness of the new markers over existing routine variables. Net reclassification improvement (NRI) has been proposed to assess improvement in risk reclassification in the context of comparing two risk models and the concept has been quickly adopted in medical journals (Pencina et al., Stat Med 27:157-172, 2008). We propose both nonparametric and semiparametric procedures for calculating NRI as a function of a future prediction time t with a censored failure time outcome. The proposed methods accommodate covariate-dependent censoring, therefore providing more robust and sometimes more efficient procedures compared with the existing nonparametric-based estimators (Pencina et al., Stat Med 30:11-21, 2011; Uno et al., Comparing risk scoring systems beyond the ROC paradigm in survival analysis, 2009). Simulation results indicate that the proposed procedures perform well in finite samples. We illustrate these procedures by evaluating a new risk model for predicting the onset of cardiovascular disease.
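For intuition only, in the uncensored binary-outcome case the category-free NRI reduces to proportions of upward and downward reclassification among events and non-events; the censored versions proposed in the paper replace these simple means with IPW-type adjustments. Names below are illustrative:

```python
import numpy as np

def nri(risk_old, risk_new, outcome):
    """Category-free NRI for a binary outcome (uncensored sketch)."""
    up = risk_new > risk_old        # new model raises the risk estimate
    down = risk_new < risk_old      # new model lowers it
    events = outcome == 1
    nonevents = outcome == 0
    nri_events = up[events].mean() - down[events].mean()           # events should move up
    nri_nonevents = down[nonevents].mean() - up[nonevents].mean()  # non-events down
    return nri_events + nri_nonevents

risk_old = np.array([0.2, 0.3, 0.6, 0.7])
risk_new = np.array([0.4, 0.1, 0.8, 0.5])
outcome = np.array([1, 0, 1, 0])
print(nri(risk_old, risk_new, outcome))  # 2.0: every subject moved in the right direction
```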
Affiliation(s)
- Yingye Zheng
- Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, Seattle, WA 98109, USA.
24
Uno H, Tian L, Cai T, Kohane IS, Wei LJ. A unified inference procedure for a class of measures to assess improvement in risk prediction systems with survival data. Stat Med 2013; 32:2430-42. [PMID: 23037800 PMCID: PMC3734387 DOI: 10.1002/sim.5647] [Citation(s) in RCA: 282] [Impact Index Per Article: 25.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2011] [Accepted: 09/14/2012] [Indexed: 01/18/2023]
Abstract
Risk prediction procedures can be quite useful for the patient's treatment selection, prevention strategy, or disease management in evidence-based medicine. Often, potentially important new predictors are available in addition to the conventional markers. The question is how to quantify the improvement from the new markers for prediction of the patient's risk in order to aid cost-benefit decisions. The standard approach of using the area under the receiver operating characteristic curve to measure the added value may not be sensitive enough to capture incremental improvements from the new markers. Recently, some novel alternatives to area under the receiver operating characteristic curve, such as integrated discrimination improvement and net reclassification improvement, were proposed. In this paper, we consider a class of measures for evaluating the incremental values of new markers, which includes the preceding two as special cases. We present a unified procedure for making inferences about measures in the class with censored event time data. The large sample properties of our procedures are theoretically justified. We illustrate the new proposal with data from a cancer study to evaluate a new gene score for prediction of the patient's survival.
Affiliation(s)
- Hajime Uno
- Department of Biostatistics and Computational Biology, Dana Farber Cancer Institute, Boston, MA, USA.
25
Zhou QM, Zheng Y, Cai T. Subgroup specific incremental value of new markers for risk prediction. Lifetime Data Anal 2013; 19:142-169. [PMID: 23263882 PMCID: PMC3633735 DOI: 10.1007/s10985-012-9235-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/13/2011] [Accepted: 11/06/2012] [Indexed: 06/01/2023]
Abstract
In many clinical applications, understanding when measurement of new markers is necessary to provide added accuracy to existing prediction tools could lead to more cost effective disease management. Many statistical tools for evaluating the incremental value (IncV) of the novel markers over the routine clinical risk factors have been developed in recent years. However, most existing literature focuses primarily on global assessment. Since the IncVs of new markers often vary across subgroups, it would be of great interest to identify subgroups for which the new markers are most/least useful in improving risk prediction. In this paper we provide novel statistical procedures for systematically identifying potential traditional-marker based subgroups in whom it might be beneficial to apply a new model with measurements of both the novel and traditional markers. We consider various conditional time-dependent accuracy parameters for censored failure time outcome to assess the subgroup-specific IncVs. We provide non-parametric kernel-based estimation procedures to calculate the proposed parameters. Simultaneous interval estimation procedures are provided to account for sampling variation and adjust for multiple testing. Simulation studies suggest that our proposed procedures work well in finite samples. The proposed procedures are applied to the Framingham Offspring Study to examine the added value of an inflammation marker, C-reactive protein, on top of the traditional Framingham risk score for predicting 10-year risk of cardiovascular disease.
Affiliation(s)
- Qian M. Zhou
- Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
- Yingye Zheng
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Tianxi Cai
- Department of Biostatistics, Harvard University, Boston, MA 02115, USA
26
Zhao Y, Zeng D. Recent development on statistical methods for personalized medicine discovery. Front Med 2013; 7:102-10. [PMID: 23377890 DOI: 10.1007/s11684-013-0245-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2012] [Accepted: 12/06/2012] [Indexed: 01/01/2023]
Abstract
It is well documented that patients can show significant heterogeneous responses to treatments so the best treatment strategies may require adaptation over individuals and time. Recently, a number of new statistical methods have been developed to tackle the important problem of estimating personalized treatment rules using single-stage or multiple-stage clinical data. In this paper, we provide an overview of these methods and list a number of challenges.
Affiliation(s)
- Yingqi Zhao
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, 600 Highland Ave., Madison, WI 53792, USA
27
Beyersmann J, Di Termini S, Pauly M. Weak Convergence of the Wild Bootstrap for the Aalen-Johansen Estimator of the Cumulative Incidence Function of a Competing Risk. Scand Stat Theory Appl 2012. [DOI: 10.1111/j.1467-9469.2012.00817.x] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/18/2022]
28
Parast L, Cheng SC, Cai T. Landmark Prediction of Long Term Survival Incorporating Short Term Event Time Information. J Am Stat Assoc 2012; 107:1492-1501. [PMID: 23293405 DOI: 10.1080/01621459.2012.721281] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/28/2022]
Abstract
In recent years, a wide range of markers have become available as potential tools to predict risk or progression of disease. In addition to such biological and genetic markers, short term outcome information may be useful in predicting long term disease outcomes. When such information is available, it would be desirable to combine this along with predictive markers to improve the prediction of long term survival. Most existing methods for incorporating censored short term event information in predicting long term survival focus on modeling the disease process and are derived under restrictive parametric models in a multi-state survival setting. When such model assumptions fail to hold, the resulting prediction of long term outcomes may be invalid or inaccurate. When there is only a single discrete baseline covariate, a fully non-parametric estimation procedure to incorporate short term event time information has been previously proposed. However, such an approach is not feasible for settings with one or more continuous covariates due to the curse of dimensionality. In this paper, we propose to incorporate short term event time information along with multiple covariates collected up to a landmark point via a flexible varying-coefficient model. To evaluate and compare the prediction performance of the resulting landmark prediction rule, we use robust non-parametric procedures which do not require the correct specification of the proposed varying coefficient model. Simulation studies suggest that the proposed procedures perform well in finite samples. We illustrate them here using a dataset of post-dialysis patients with end-stage renal disease.
Affiliation(s)
- Layla Parast
- Department of Biostatistics, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA 02115
29
Zhao Y, Zeng D, Rush AJ, Kosorok MR. Estimating Individualized Treatment Rules Using Outcome Weighted Learning. J Am Stat Assoc 2012; 107:1106-1118. [PMID: 23630406 DOI: 10.1080/01621459.2012.695674] [Citation(s) in RCA: 357] [Impact Index Per Article: 29.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
Abstract
There is increasing interest in discovering individualized treatment rules for patients who have heterogeneous responses to treatment. In particular, one aims to find an optimal individualized treatment rule which is a deterministic function of patient specific characteristics maximizing expected clinical outcome. In this paper, we first show that estimating such an optimal treatment rule is equivalent to a classification problem where each subject is weighted proportional to his or her clinical outcome. We then propose an outcome weighted learning approach based on the support vector machine framework. We show that the resulting estimator of the treatment rule is consistent. We further obtain a finite sample bound for the difference between the expected outcome using the estimated individualized treatment rule and that of the optimal treatment rule. The performance of the proposed approach is demonstrated via simulation studies and an analysis of chronic depression data.
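The central reduction in this abstract, treating rule estimation as outcome-weighted classification, can be sketched in a few lines. This toy version assumes a randomized trial with propensity 0.5 and substitutes a logistic surrogate for the paper's hinge (SVM) loss; the simulated reward model and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 1000, 2
X = rng.normal(size=(n, p))
A = rng.choice([-1, 1], size=n)                    # randomized treatment, propensity 0.5
optimal = np.where(X[:, 0] > 0, 1, -1)             # true best rule: treat iff x1 > 0
Y = 1.0 + (A == optimal) + rng.normal(0, 0.1, n)   # reward is higher when rule followed

w = Y / 0.5                                        # outcome / propensity weights (Y > 0 here)

# minimize the weighted logistic surrogate of the misclassification objective
Z = np.hstack([np.ones((n, 1)), X])
beta = np.zeros(p + 1)
for _ in range(500):
    margin = A * (Z @ beta)
    grad = -(Z * (w * A / (1 + np.exp(margin)))[:, None]).mean(axis=0)
    beta -= 0.5 * grad

rule = np.sign(Z @ beta)                           # estimated ITR: sign(b0 + b'x)
agree = np.mean(rule == optimal)
print(round(agree, 2))                             # high agreement with the true sign rule
```

The weight Y/π(A) is what makes an off-the-shelf classifier target the value of the rule rather than plain label accuracy; the paper's consistency and finite-sample results are for the hinge-loss (SVM) version of this objective.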
Affiliation(s)
- Yingqi Zhao
- Department of Biostatistics, University of North Carolina at Chapel Hill, NC 27599
30
Abstract
Because many illnesses show heterogeneous response to treatment, there is increasing interest in individualizing treatment to patients [11]. An individualized treatment rule is a decision rule that recommends treatment according to patient characteristics. We consider the use of clinical trial data in the construction of an individualized treatment rule leading to highest mean response. This is a difficult computational problem because the objective function is the expectation of a weighted indicator function that is non-concave in the parameters. Furthermore there are frequently many pretreatment variables that may or may not be useful in constructing an optimal individualized treatment rule yet cost and interpretability considerations imply that only a few variables should be used by the individualized treatment rule. To address these challenges we consider estimation based on ℓ1-penalized least squares. This approach is justified via a finite sample upper bound on the difference between the mean response due to the estimated individualized treatment rule and the mean response due to the optimal individualized treatment rule.
31
Parast L, Cheng SC, Cai T. Incorporating short-term outcome information to predict long-term survival with discrete markers. Biom J 2011; 53:294-307. [PMID: 21337601 DOI: 10.1002/bimj.201000150] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2010] [Revised: 12/22/2010] [Accepted: 01/04/2011] [Indexed: 11/11/2022]
Abstract
In disease screening and prognosis studies, an important task is to determine useful markers for identifying high-risk subgroups. Once such markers are established, they can be incorporated into public health practice to provide appropriate strategies for treatment or disease monitoring based on each individual's predicted risk. In the recent years, genetic and biological markers have been examined extensively for their potential to signal progression or risk of disease. In addition to these markers, it has often been argued that short-term outcomes may be helpful in making a better prediction of disease outcomes in clinical practice. In this paper we propose model-free non-parametric procedures to incorporate short-term event information to improve the prediction of a long-term terminal event. We include the optional availability of a single discrete marker measurement and assess the additional information gained by including the short-term outcome. We focus on the semi-competing risk setting where the short-term event is an intermediate event that may be censored by the terminal event while the terminal event is only subject to administrative censoring. Simulation studies suggest that the proposed procedures perform well in finite samples. Our procedures are illustrated using a data set of post-dialysis patients with end-stage renal disease.
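As a toy version of this idea, take the single binary "marker" to be whether the short-term event occurred before the landmark time; the conditional survival probabilities then reduce to empirical proportions when follow-up is complete through the prediction time. The simulation below, with a shared frailty inducing semi-competing-risk-style dependence, is an illustrative assumption, not the paper's data or estimator:

```python
import numpy as np

def landmark_survival(t_long, t_short, t0, t):
    """P(long-term survival past t | alive at t0), stratified by whether the
    short-term (intermediate) event occurred before the landmark time t0.

    Assumes complete follow-up through t (administrative censoring only
    afterwards), so empirical proportions suffice; the paper's model-free
    procedures handle censoring of the intermediate event properly."""
    alive = t_long > t0
    had_short = (t_short <= t0) & alive
    return {
        "short-term event by t0": np.mean(t_long[had_short] > t),
        "event-free at t0": np.mean(t_long[alive & ~had_short] > t),
    }

rng = np.random.default_rng(2)
n = 5000
frail = rng.gamma(2.0, 0.5, n)               # shared frailty links the two event times
t_short = rng.exponential(1.0 / frail)       # higher frailty: earlier intermediate event
t_long = 2.0 + rng.exponential(2.0 / frail)  # ...and shorter long-term survival
est = landmark_survival(t_long, t_short, t0=3.0, t=6.0)
print(est["event-free at t0"] > est["short-term event by t0"])
```

The gap between the two stratum-specific estimates is exactly the extra prognostic information the short-term outcome carries beyond survival to the landmark time.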
Affiliation(s)
- Layla Parast
- Department of Biostatistics, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA 02115, USA.
32
Li Y, Tian L, Wei LJ. Estimating subject-specific dependent competing risk profile with censored event time observations. Biometrics 2010; 67:427-35. [PMID: 20618311 DOI: 10.1111/j.1541-0420.2010.01456.x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
In a longitudinal study, suppose that the primary endpoint is the time to a specific event. This response variable, however, may be censored by an independent censoring variable or by the occurrence of one of several dependent competing events. For each study subject, a set of baseline covariates is collected. The question is how to construct a reliable prediction rule for the future subject's profile of all competing risks of interest at a specific time point for risk-benefit decision making. In this article, we propose a two-stage procedure to make inferences about such subject-specific profiles. For the first step, we use a parametric model to obtain a univariate risk index score system. We then estimate consistently the average competing risks for subjects who have the same parametric index score via a nonparametric function estimation procedure. We illustrate this new proposal with the data from a randomized clinical trial for evaluating the efficacy of a treatment for prostate cancer. The primary endpoint for this study was the time to prostate cancer death, but had two types of dependent competing events, one from cardiovascular death and the other from death of other causes.
Affiliation(s)
- Yi Li
- Department of Biostatistics, Harvard University, Boston, Massachusetts 02115, USA.