1
Model selection for survival individualized treatment rules using the jackknife estimator. BMC Med Res Methodol 2022; 22:328. [PMID: 36550398 PMCID: PMC9773469 DOI: 10.1186/s12874-022-01811-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2022] [Accepted: 12/01/2022] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND Precision medicine is an emerging field that involves the selection of treatments based on patients' individual prognostic data. It is formalized through the identification of individualized treatment rules (ITRs) that maximize a clinical outcome. When the type of outcome is time-to-event, the correct handling of censoring is crucial for estimating reliable optimal ITRs.
METHODS We propose a jackknife estimator of the value function to allow for right-censored data for a binary treatment. The jackknife estimator or leave-one-out-cross-validation approach can be used to estimate the value function and select optimal ITRs using existing machine learning methods. We address the issue of censoring in survival data by introducing an inverse probability of censoring weighted (IPCW) adjustment in the expression of the jackknife estimator of the value function. In this paper, we estimate the optimal ITR by using random survival forest (RSF) and Cox proportional hazards model (COX). We use a Z-test to compare the optimal ITRs learned by RSF and COX with the zero-order model (or one-size-fits-all). Through simulation studies, we investigate the asymptotic properties and the performance of our proposed estimator under different censoring rates. We illustrate our proposed method on a phase III clinical trial of non-small cell lung cancer data.
RESULTS Our simulations show that COX outperforms RSF for small sample sizes. As sample sizes increase, the performance of RSF improves, in particular when the expected log failure time is not linear in the covariates. The estimator is fairly normally distributed across different combinations of simulation scenarios and censoring rates. When applied to a non-small-cell lung cancer data set, our method determines the zero-order model (ZOM) as the best performing model. This finding highlights the possibility that tailoring may not be needed for this cancer data set.
CONCLUSION The jackknife approach for estimating the value function in the presence of right-censored data shows satisfactory performance when there is small to moderate censoring. Winsorizing the upper and lower percentiles of the estimated survival weights for computing the IPCWs stabilizes the estimator.
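The IPCW adjustment described in this abstract can be illustrated with a minimal sketch. It is not the paper's estimator: the function names, the known randomization probability `propensity`, and the fixed floor `g_floor` (a simple stand-in for the percentile winsorizing mentioned in the conclusion) are all assumptions made for illustration. The list `recommended` is assumed to hold leave-one-out recommendations, that is, each subject's treatment as suggested by a rule fitted without that subject.

```python
def censoring_survival_before(times, events):
    """Kaplan-Meier estimate of the censoring survival function.

    Returns g, where g(t) estimates P(C >= t). Censoring plays the role of
    the 'event' here, so a record with events[i] == 0 is a censoring time.
    Ties between failures and censorings get no special treatment.
    """
    steps = []  # (time, estimated P(C > time)), in increasing time order
    surv = 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for y in times if y >= t)
        censored = sum(1 for y, e in zip(times, events) if y == t and e == 0)
        surv *= 1.0 - censored / at_risk
        steps.append((t, surv))

    def g(t):
        value = 1.0
        for s, surv_after in steps:
            if s >= t:
                break
            value = surv_after
        return value

    return g


def ipcw_value(times, events, treatments, recommended,
               propensity=0.5, g_floor=0.05):
    """IPCW estimate of the mean survival time under the recommended rule.

    Only uncensored subjects whose received treatment matches the
    recommendation contribute; each is weighted by
    1 / (propensity * G(Y)), with G bounded below by g_floor
    (hypothetical stabilization, not the paper's winsorizing scheme).
    """
    g = censoring_survival_before(times, events)
    total = 0.0
    for y, e, a, d in zip(times, events, treatments, recommended):
        if e == 1 and a == d:
            total += y / (propensity * max(g(y), g_floor))
    return total / len(times)
```

Comparing `ipcw_value` for a learned rule against the same quantity for a constant recommendation gives a rough analogue of the comparison with the zero-order model described above.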
2
Wu L, Yang S. Transfer learning of individualized treatment rules from experimental to real-world data. J Comput Graph Stat 2022; 32:1036-1045. [PMID: 37997592 PMCID: PMC10664843 DOI: 10.1080/10618600.2022.2141752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2021] [Accepted: 10/04/2022] [Indexed: 11/06/2022]
Abstract
Individualized treatment effect lies at the heart of precision medicine. Interpretable individualized treatment rules (ITRs) are desirable for clinicians or policymakers due to their intuitive appeal and transparency. The gold-standard approach to estimating the ITRs is randomized experiments, where subjects are randomized to different treatment groups and the confounding bias is minimized to the extent possible. However, experimental studies are limited in external validity because of their selection restrictions, and therefore the underlying study population is not representative of the target real-world population. Conventional learning methods of optimal interpretable ITRs for a target population based only on experimental data are biased. On the other hand, real-world data (RWD) are becoming popular and provide a representative sample of the target population. To learn the generalizable optimal interpretable ITRs, we propose an integrative transfer learning method based on weighting schemes to calibrate the covariate distribution of the experiment to that of the RWD. Theoretically, we establish the risk consistency for the proposed ITR estimator. Empirically, we evaluate the finite-sample performance of the transfer learner through simulations and apply it to a real data application of a job training program.
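The idea of calibrating the trial's covariate distribution to the target population can be sketched in its simplest form, with a single discrete covariate. This is post-stratification weighting, used here only as an illustration; it is not the weighting scheme of the paper, and the function name and stratum labels are hypothetical.

```python
from collections import Counter


def calibration_weights(trial_strata, target_strata):
    """Weight each trial subject by (target share) / (trial share) of its stratum.

    After weighting, the trial's stratum distribution matches the target
    sample's. Strata that never occur in the trial cannot be represented,
    and trial strata absent from the target receive weight zero.
    """
    trial_counts = Counter(trial_strata)
    target_counts = Counter(target_strata)
    n_trial, n_target = len(trial_strata), len(target_strata)
    return [
        (target_counts[s] / n_target) / (trial_counts[s] / n_trial)
        for s in trial_strata
    ]


# Toy example: the trial over-represents "young" relative to the target.
weights = calibration_weights(
    ["young", "young", "young", "old"],
    ["young", "old", "old", "old"],
)
```

A weighted value estimate or weighted learning step would then use these weights so that the fitted rule reflects the target population rather than the trial sample.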
Affiliation(s)
- Lili Wu
  - Department of Statistics, North Carolina State University
- Shu Yang
  - Department of Statistics, North Carolina State University
3
Xie S, Tarpey T, Petkova E, Ogden RT. Multiple domain and multiple kernel outcome-weighted learning for estimating individualized treatment regimes. J Comput Graph Stat 2022; 31:1375-1383. [PMID: 36970034 PMCID: PMC10035569 DOI: 10.1080/10618600.2022.2067552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2021] [Revised: 03/23/2022] [Accepted: 04/11/2022] [Indexed: 10/18/2022]
Abstract
Individualized treatment rules (ITRs) recommend treatments that are tailored specifically according to each patient's own characteristics. It can be challenging to estimate optimal ITRs when there are many features, especially when these features have arisen from multiple data domains (e.g., demographics, clinical measurements, neuroimaging modalities). Considering data from complementary domains and using multiple similarity measures to capture the potential complex relationship between features and treatment can potentially improve the accuracy of assigning treatments. Outcome weighted learning (OWL) methods that are based on support vector machines using a predetermined single kernel function have previously been developed to estimate optimal ITRs. In this paper, we propose an approach to estimate optimal ITRs by exploiting multiple kernel functions to describe the similarity of features between subjects both within and across data domains within the OWL framework, as opposed to preselecting a single kernel function to be used for all features for all domains. Our method takes into account the heterogeneity of each data domain and combines multiple data domains optimally. Our learning process estimates optimal ITRs and also identifies the data domains that are most important for determining ITRs. This approach can thus be used to prioritize the collection of data from multiple domains, potentially reducing cost without sacrificing accuracy. The comparative advantage of our method is demonstrated by simulation studies and by an application to a randomized clinical trial for major depressive disorder that collected features from multiple data domains. Supplemental materials for this article are available online.
Affiliation(s)
- Shanghong Xie
  - School of Statistics, Southwestern University of Finance and Economics
  - Department of Biostatistics, Mailman School of Public Health, Columbia University
- Thaddeus Tarpey
  - Division of Biostatistics, Department of Population Health, New York University
- Eva Petkova
  - Division of Biostatistics, Department of Population Health, New York University
- R. Todd Ogden
  - Department of Biostatistics, Mailman School of Public Health, Columbia University
4
Park H, Petkova E, Tarpey T, Ogden RT. A sparse additive model for treatment effect-modifier selection. Biostatistics 2022; 23:412-429. [PMID: 32808656 PMCID: PMC9308457 DOI: 10.1093/biostatistics/kxaa032] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/17/2020] [Revised: 07/05/2020] [Accepted: 07/10/2020] [Indexed: 11/26/2023] Open
Abstract
Sparse additive modeling is a class of effective methods for performing high-dimensional nonparametric regression. This article develops a sparse additive model focused on estimation of treatment effect modification with simultaneous treatment effect-modifier selection. We propose a version of the sparse additive model uniquely constrained to estimate the interaction effects between treatment and pretreatment covariates, while leaving the main effects of the pretreatment covariates unspecified. The proposed regression model can effectively identify treatment effect-modifiers that exhibit possibly nonlinear interactions with the treatment variable that are relevant for making optimal treatment decisions. A set of simulation experiments and an application to a dataset from a randomized clinical trial are presented to demonstrate the method.
Affiliation(s)
- Hyung Park
  - Division of Biostatistics, Department of Population Health, New York University, New York, NY, USA and Department of Biostatistics, Columbia University, New York, NY, USA
- Eva Petkova
  - Division of Biostatistics, Department of Population Health, New York University, New York, NY, USA and Department of Biostatistics, Columbia University, New York, NY, USA
- Thaddeus Tarpey
  - Division of Biostatistics, Department of Population Health, New York University, New York, NY, USA and Department of Biostatistics, Columbia University, New York, NY, USA
- R Todd Ogden
  - Division of Biostatistics, Department of Population Health, New York University, New York, NY, USA and Department of Biostatistics, Columbia University, New York, NY, USA
5
Feng Q, Li J, Ping X, Van Calster B. Hypervolume under ROC manifold for discrete biomarkers with ties. J Stat Comput Sim 2021. [DOI: 10.1080/00949655.2021.1954184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Affiliation(s)
- Qunqiang Feng
  - Department of Statistics and Finance, School of Management, University of Science and Technology of China, Hefei, People's Republic of China
- Jialiang Li
  - National University of Singapore, Singapore, Singapore
  - Duke-NUS Graduate Medical School, Singapore, Singapore
  - Singapore Eye Research Institute, Singapore, Singapore
- Xingrun Ping
  - Shanghai Jiaotong University, Shanghai, People's Republic of China
6
Hoogland J, IntHout J, Belias M, Rovers MM, Riley RD, Harrell FE Jr, Moons KGM, Debray TPA, Reitsma JB. A tutorial on individualized treatment effect prediction from randomized trials with a binary endpoint. Stat Med 2021; 40:5961-5981. [PMID: 34402094 PMCID: PMC9291969 DOI: 10.1002/sim.9154] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 11/05/2019] [Revised: 06/08/2021] [Accepted: 07/19/2021] [Indexed: 12/23/2022]
Abstract
Randomized trials typically estimate average relative treatment effects, but decisions on the benefit of a treatment are possibly better informed by more individualized predictions of the absolute treatment effect. In case of a binary outcome, these predictions of absolute individualized treatment effect require knowledge of the individual's risk without treatment and incorporation of a possibly differential treatment effect (ie, varying with patient characteristics). In this article, we lay out the causal structure of individualized treatment effect in terms of potential outcomes and describe the required assumptions that underlie a causal interpretation of its prediction. Subsequently, we describe regression models and model estimation techniques that can be used to move from average to more individualized treatment effect predictions. We focus mainly on logistic regression-based methods that are both well-known and naturally provide the required probabilistic estimates. We incorporate key components from both causal inference and prediction research to arrive at individualized treatment effect predictions. While the separate components are well known, their successful amalgamation is very much an ongoing field of research. We cut the problem down to its essentials in the setting of a randomized trial, discuss the importance of a clear definition of the estimand of interest, provide insight into the required assumptions, and give guidance with respect to modeling and estimation options. Simulated data illustrate the potential of different modeling options across scenarios that vary both average treatment effect and treatment effect heterogeneity. Two applied examples illustrate individualized treatment effect prediction in randomized trial data.
Affiliation(s)
- Jeroen Hoogland
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
- Joanna IntHout
  - Radboud Institute for Health Sciences (RIHS), Radboud University Medical Center, Nijmegen, the Netherlands
- Michail Belias
  - Radboud Institute for Health Sciences (RIHS), Radboud University Medical Center, Nijmegen, the Netherlands
- Maroeska M. Rovers
  - Radboud Institute for Health Sciences (RIHS), Radboud University Medical Center, Nijmegen, the Netherlands
- Frank E. Harrell Jr
  - Department of Biostatistics, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
- Karel G. M. Moons
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
  - Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
- Thomas P. A. Debray
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
  - Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
- Johannes B. Reitsma
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
  - Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
7
Zhou W, Zhu R, Zeng D. A parsimonious personalized dose-finding model via dimension reduction. Biometrika 2021; 108:643-659. [PMID: 34658383 PMCID: PMC8514170 DOI: 10.1093/biomet/asaa087] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/15/2022] Open
Abstract
Learning an individualized dose rule in personalized medicine is a challenging statistical problem. Existing methods often suffer from the curse of dimensionality, especially when the decision function is estimated nonparametrically. To tackle this problem, we propose a dimension reduction framework that effectively reduces the estimation to a lower-dimensional subspace of the covariates. We exploit the fact that the individualized dose rule can be defined in a subspace spanned by a few linear combinations of the covariates, leading to a more parsimonious model. Also, our framework does not require the inverse probability of the propensity score under observational studies due to a direct maximization of the value function. This distinguishes us from the outcome weighted learning framework, which also solves decision rules directly. Under the same framework, we further propose a pseudo-direct learning approach that focuses more on estimating the dimensionality-reduced subspace of the treatment outcome. Parameters in both approaches can be estimated efficiently using an orthogonality constrained optimization algorithm on the Stiefel manifold. Under mild regularity assumptions, the asymptotic normality results of the proposed estimators are established. We also derive the consistency and convergence rate for the value function under the estimated optimal dose rule. We evaluate the performance of the proposed approaches through extensive simulation studies and a warfarin pharmacogenetic dataset.
Affiliation(s)
- Wenzhuo Zhou
  - Department of Statistics, University of Illinois at Urbana-Champaign, Champaign, Illinois 61820, U.S.A
- Ruoqing Zhu
  - Department of Statistics, University of Illinois at Urbana-Champaign, Champaign, Illinois 61820, U.S.A
- Donglin Zeng
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, U.S.A
8
Kapelner A, Bleich J, Levine A, Cohen ZD, DeRubeis RJ, Berk R. Evaluating the Effectiveness of Personalized Medicine With Software. Front Big Data 2021; 4:572532. [PMID: 34085036 PMCID: PMC8167073 DOI: 10.3389/fdata.2021.572532] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/14/2020] [Accepted: 02/03/2021] [Indexed: 11/13/2022] Open
Abstract
We present methodological advances in understanding the effectiveness of personalized medicine models and supply easy-to-use open-source software. Personalized medicine involves the systematic use of individual patient characteristics to determine which treatment option is most likely to result in a better average outcome for the patient. Why is personalized medicine not done more in practice? One of many reasons is that practitioners have no easy way to holistically evaluate whether their personalization procedure does better than the standard of care, termed improvement. Our software, "Personalized Treatment Evaluator" (the R package PTE), provides inference for improvement out-of-sample in many clinical scenarios. We also extend current methodology by allowing evaluation of improvement in the case where the endpoint is binary or survival. In the software, the practitioner inputs 1) data from a single-stage randomized trial with one continuous, incidence or survival endpoint and 2) an educated guess of a functional form of a model for the endpoint constructed from domain knowledge. The bootstrap is then employed on data unseen during model fitting to provide confidence intervals for the improvement for the average future patient (assuming future patients are similar to the patients in the trial). One may also test against a null scenario where the hypothesized personalization is no more useful than a standard of care. We demonstrate our method's promise on simulated data as well as on data from a randomized comparative trial investigating two treatments for depression.
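The bootstrap interval for improvement can be illustrated with a small sketch. It does not reproduce the PTE package or its interface: the function names are hypothetical, the comparator is assumed to be the trial's own randomized allocation, and `recommended` is assumed to come from a model fitted on other data, so that the outcomes used here were unseen during fitting.

```python
import random


def improvement(outcomes, received, recommended):
    """Mean outcome among subjects who received the recommended treatment,
    minus the mean outcome of all subjects (random allocation comparator)."""
    matched = [y for y, a, d in zip(outcomes, received, recommended) if a == d]
    return sum(matched) / len(matched) - sum(outcomes) / len(outcomes)


def bootstrap_ci(outcomes, received, recommended,
                 n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the improvement."""
    rng = random.Random(seed)
    n = len(outcomes)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [outcomes[i] for i in idx]
        rec = [received[i] for i in idx]
        rcm = [recommended[i] for i in idx]
        # Skip resamples in which no subject received the recommended arm.
        if not any(a == d for a, d in zip(rec, rcm)):
            continue
        stats.append(improvement(ys, rec, rcm))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

An interval lying entirely above zero would suggest, under these assumptions, that following the recommendations beats random allocation for patients like those in the trial.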
Affiliation(s)
- Adam Kapelner
  - Department of Mathematics, Queens College, CUNY, Queens, NY, United States
- Justin Bleich
  - Department of Statistics, The Wharton School of the University of Pennsylvania, Philadelphia, PA, United States
- Alina Levine
  - Department of Mathematics, Queens College, CUNY, Queens, NY, United States
- Zachary D. Cohen
  - Department of Psychology, University of Pennsylvania, Philadelphia, PA, United States
- Robert J. DeRubeis
  - Department of Psychology, University of Pennsylvania, Philadelphia, PA, United States
- Richard Berk
  - Department of Statistics, The Wharton School of the University of Pennsylvania, Philadelphia, PA, United States
9
Guo W, Zhou XH, Ma S. Estimation of Optimal Individualized Treatment Rules Using a Covariate-Specific Treatment Effect Curve With High-Dimensional Covariates. J Am Stat Assoc 2021. [DOI: 10.1080/01621459.2020.1865167] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/16/2022]
Affiliation(s)
- Wenchuan Guo
  - Department of Statistics, University of California Riverside, Riverside, CA
  - Global Biometric Sciences, Bristol-Myers Squibb, Pennington, NJ
- Xiao-Hua Zhou
  - Beijing International Center for Mathematical Research, and Department of Biostatistics, Peking University, Beijing, China
- Shujie Ma
  - Department of Statistics, University of California Riverside, Riverside, CA
10
Nguyen TL, Collins GS, Landais P, Le Manach Y. Counterfactual clinical prediction models could help to infer individualized treatment effects in randomized controlled trials - An illustration with the International Stroke Trial. J Clin Epidemiol 2020; 125:47-56. [PMID: 32464321 DOI: 10.1016/j.jclinepi.2020.05.022] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/17/2020] [Revised: 04/17/2020] [Accepted: 05/20/2020] [Indexed: 12/22/2022]
Abstract
OBJECTIVE Causal treatment effects are estimated at the population level in randomized controlled trials, while clinical decisions are often made at the individual level in practice. We aim to show how clinical prediction models used under a counterfactual framework may help to infer individualized treatment effects.
STUDY DESIGN AND SETTING As an illustrative example, we reanalyze the International Stroke Trial. This large, multicenter trial enrolled 19,435 adult patients with suspected acute ischemic stroke from 36 countries, and reported a modest average benefit of aspirin (vs. no aspirin) on a composite outcome of death or dependency at 6 months. We derive and validate multivariable logistic regression models that predict the patient counterfactual risks of outcome with and without aspirin, conditionally on 23 predictors.
RESULTS The counterfactual prediction models display good performance in terms of calibration and discrimination (validation c-statistics: 0.798 and 0.794). Comparing the counterfactual predicted risks on an absolute difference scale, we show that aspirin, despite an average benefit, may increase the risk of death or dependency at 6 months (compared with the control) in a quarter of stroke patients.
CONCLUSIONS Counterfactual prediction models could help researchers and clinicians (i) infer individualized treatment effects and (ii) better target patients who may benefit from treatments.
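The comparison of counterfactual predicted risks on an absolute difference scale can be sketched as follows. The two logistic models, their coefficients, and the single predictor are hypothetical placeholders chosen for illustration; they are not the fitted models from the International Stroke Trial reanalysis.

```python
import math


def predicted_risk(intercept, coefs, x):
    """Risk of the outcome from a logistic model: 1 / (1 + exp(-linear predictor))."""
    linear = intercept + sum(b * v for b, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-linear))


def individual_effect(model_treated, model_control, x):
    """Predicted risk with treatment minus predicted risk without it.

    Each model is an (intercept, coefficients) pair. A negative value
    suggests predicted benefit, a positive value predicted harm.
    """
    return predicted_risk(*model_treated, x) - predicted_risk(*model_control, x)


# Hypothetical one-predictor models, one per trial arm.
model_treated = (-0.5, [0.8])
model_control = (0.0, [0.2])

effect_low = individual_effect(model_treated, model_control, [0.0])   # negative
effect_high = individual_effect(model_treated, model_control, [2.0])  # positive
```

With these made-up coefficients the predicted effect changes sign across the predictor range, which is the kind of pattern that can coexist with an average benefit at the population level.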
Affiliation(s)
- Tri-Long Nguyen
  - Section of Epidemiology, Department of Public Health, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen K, Denmark
  - Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Botnar Research Centre, University of Oxford, Windmill Road, Oxford, UK
  - Laboratory of Biostatistics, Epidemiology, Clinical Research and Health Economics, EA2415, Montpellier University, Montpellier, France
  - Departments of Anesthesia & Health Research Methods, Evidence, and Impact, Michael DeGroote School of Medicine, Faculty of Health Sciences, McMaster University and the Perioperative Research Group, Population Health Research Institute, Hamilton, Canada
  - Department of Pharmacy, Nîmes University Hospital, University of Montpellier, Nîmes, France
- Gary S Collins
  - Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Botnar Research Centre, University of Oxford, Windmill Road, Oxford, UK
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford, UK
- Paul Landais
  - Laboratory of Biostatistics, Epidemiology, Clinical Research and Health Economics, EA2415, Montpellier University, Montpellier, France
- Yannick Le Manach
  - Departments of Anesthesia & Health Research Methods, Evidence, and Impact, Michael DeGroote School of Medicine, Faculty of Health Sciences, McMaster University and the Perioperative Research Group, Population Health Research Institute, Hamilton, Canada
11
Huang Y, Zhou XH. Identification of the optimal treatment regimen in the presence of missing covariates. Stat Med 2020; 39:353-368. [PMID: 31774192 PMCID: PMC6954309 DOI: 10.1002/sim.8407] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2018] [Revised: 09/25/2019] [Accepted: 09/27/2019] [Indexed: 12/25/2022]
Abstract
Covariates associated with treatment-effect heterogeneity can potentially be used to make personalized treatment recommendations towards best clinical outcomes. Methods for treatment-selection rule development that directly maximize treatment-selection benefits have attracted much interest in recent years, due to the robustness of these methods to outcome modeling. In practice, the task of treatment-selection rule development can be further complicated by missingness in data. Here, we consider the identification of optimal treatment-selection rules for a binary disease outcome when measurements of an important covariate from study participants are partly missing. Under the missing at random assumption, we develop a robust estimator of treatment-selection rules under the direct-optimization paradigm. This estimator targets the maximum selection benefits to the population under correct specification of at least one mechanism from each of two sets: (i) the missing data mechanism or the conditional covariate distribution, and (ii) the treatment assignment mechanism or the disease outcome model. We evaluate and compare performance of the proposed estimator with alternative direct-optimization estimators through extensive simulation studies. We demonstrate the application of the proposed method through a real data example from an Alzheimer's disease study for developing covariate combinations to guide the treatment of Alzheimer's disease.
Affiliation(s)
- Ying Huang
  - Vaccine & Infectious Diseases Division, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
- Xiao-Hua Zhou
  - Department of Biostatistics, Peking University, Beijing, China
12
Simon R. Review of Statistical Methods for Biomarker-Driven Clinical Trials. JCO Precis Oncol 2019; 3:1-9. [PMID: 35100721 DOI: 10.1200/po.18.00407] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/20/2022] Open
Abstract
The discovery of somatic driver mutations in kinases and receptors has stimulated the development of molecularly targeted treatments that require companion diagnostics and new approaches to clinical development. This article reviews some of the clinical trial designs that have been developed to address these opportunities, including phase II basket and platform trials as well as phase III enrichment and biomarker adaptive designs. It also re-examines some of the conventional wisdom that previously dominated clinical trial design and discusses development and internal validation of a predictive biomarker as a new paradigm for optimizing the intended-use subset for a treatment. Statistical methods now being used in adaptive biomarker-driven clinical trials are reviewed. Some previous paradigms for clinical trial design can limit the development of more effective methods on the basis of prospectively planned adaptive methods, but useful new methods have been developed for analysis of genome-wide data and for the design of adaptively enriched studies. In many cases, the heterogeneity of populations eligible for clinical trials as traditionally defined makes it unlikely that molecularly targeted treatments will be effective for a majority of the eligible patients. New methods for dealing with patient heterogeneity in therapeutic response should be used in the design of phase III clinical trials.
13
Sugasawa S, Noma H. Estimating individual treatment effects by gradient boosting trees. Stat Med 2019; 38:5146-5159. [DOI: 10.1002/sim.8357] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 06/20/2018] [Revised: 07/13/2019] [Accepted: 08/02/2019] [Indexed: 11/08/2022]
Affiliation(s)
- Shonosuke Sugasawa
  - Center for Spatial Information Science, The University of Tokyo, Kashiwa, Japan
  - Research Center for Medical and Health Data Science, The Institute of Statistical Mathematics, Tokyo, Japan
- Hisashi Noma
  - Research Center for Medical and Health Data Science, The Institute of Statistical Mathematics, Tokyo, Japan
  - Department of Data Science, The Institute of Statistical Mathematics, Tokyo, Japan
14
Ma J, Stingo FC, Hobbs BP. Bayesian personalized treatment selection strategies that integrate predictive with prognostic determinants. Biom J 2019; 61:902-917. [PMID: 30786040 PMCID: PMC7341533 DOI: 10.1002/bimj.201700323] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 12/20/2017] [Revised: 09/28/2018] [Accepted: 12/04/2018] [Indexed: 01/13/2023]
Abstract
The evolution of "informatics" technologies has the potential to generate massive databases, but the extent to which personalized medicine may be effectuated depends on the extent to which these rich databases may be utilized to advance understanding of the disease molecular profiles and ultimately integrated for treatment selection, necessitating robust methodology for dimension reduction. Yet, statistical methods proposed to address challenges arising with the high-dimensionality of omics-type data predominately rely on linear models and emphasize associations deriving from prognostic biomarkers. Existing methods are often limited for discovering predictive biomarkers that interact with treatment and fail to elucidate the predictive power of their resultant selection rules. In this article, we present a Bayesian predictive method for personalized treatment selection that is devised to integrate both the treatment predictive and disease prognostic characteristics of a particular patient's disease. The method appropriately characterizes the structural constraints inherent to prognostic and predictive biomarkers, and hence properly utilizes these complementary sources of information for treatment selection. The methodology is illustrated through a case study of lower grade glioma. Theoretical considerations are explored to demonstrate the manner in which treatment selection is impacted by prognostic features. Additionally, simulations based on an actual leukemia study are provided to ascertain the method's performance with respect to selection rules derived from competing methods.
Affiliation(s)
- Junsheng Ma
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030
- Francesco C. Stingo
  - Department of Statistica, Informatica, Applicazioni “G.Parenti”, University of Florence, Florence, 50134, Italy
- Brian P. Hobbs
  - Quantitative Health Sciences and The Taussig Cancer Institute, Cleveland Clinic, Cleveland, Ohio 44195
15
Huang X, Goldberg Y, Xu J. Multicategory individualized treatment regime using outcome weighted learning. Biometrics 2019; 75:1216-1227. [PMID: 31095722 DOI: 10.1111/biom.13084] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 08/30/2018] [Accepted: 05/09/2019] [Indexed: 12/01/2022]
Abstract
Individualized treatment regimes (ITRs) aim to recommend treatments based on patient-specific characteristics in order to maximize the expected clinical outcome. Outcome weighted learning approaches have been proposed for this optimization problem with primary focus on the binary treatment case. Many require assumptions about the outcome value or the randomization mechanism. In this paper, we propose a general framework for multicategory ITRs using generic surrogate risk. The proposed method accommodates situations in which the outcome takes negative values and/or the propensity score is unknown. Theoretical results about Fisher consistency, excess risk, and risk consistency are established. In practice, we recommend using a differentiable convex loss for computational optimization. We demonstrate the superiority of the proposed method under multinomial deviance risk to some existing methods by simulation and application on data from a clinical trial.
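For orientation, the optimization problem behind outcome weighted learning can be written out in its standard form. The notation is an assumption introduced here and does not appear in the abstract: $R$ is the outcome (larger is better), $A \in \{1, \dots, K\}$ the received treatment, $X$ the covariates, $\pi(a \mid x)$ the propensity score, and $d$ a treatment rule.

```latex
% Value of a rule d, identified from trial data by inverse propensity weighting
V(d) = \mathbb{E}\!\left[ \frac{R \,\mathbf{1}\{A = d(X)\}}{\pi(A \mid X)} \right]

% Maximizing the value is equivalent to minimizing a weighted misclassification risk
d^{\mathrm{opt}}
  = \arg\max_{d} V(d)
  = \arg\min_{d} \mathbb{E}\!\left[ \frac{R \,\mathbf{1}\{A \neq d(X)\}}{\pi(A \mid X)} \right]
```

In the weighted classification form, $R / \pi(A \mid X)$ acts as a misclassification weight, which is why classical formulations ask for a nonnegative outcome and a known propensity score; these are the two assumptions the abstract says the proposed framework relaxes.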
Affiliation(s)
- Xinyang Huang
- Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, and School of Statistics, East China Normal University, Shanghai, China
- Yair Goldberg
- The Faculty of Industrial Engineering and Management, Technion-Israel Institute of Technology, Haifa, Israel
- Jin Xu
- Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, and School of Statistics, East China Normal University, Shanghai, China

16
Zhao YQ, Zeng D, Tangen CM, LeBlanc ML. Robustifying Trial-Derived Optimal Treatment Rules for A Target Population. Electron J Stat 2019; 13:1717-1743. [PMID: 31440323 DOI: 10.1214/19-ejs1540] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/19/2022]
Abstract
Treatment rules based on individual patient characteristics that are easy to interpret and disseminate are important in clinical practice. Properly planned and conducted randomized clinical trials are used to construct individualized treatment rules. However, it is often a concern that trial participants lack representativeness, which limits the applicability of the derived rules to a target population. In this work, we use data from a single trial study to propose a two-stage procedure to derive a robust and parsimonious rule to maximize the benefit in the target population. The procedure allows a wide range of possible covariate distributions in the target population, with minimal assumptions on the first two moments of the covariate distribution. The practical utility and favorable performance of the methodology are demonstrated using extensive simulations and a real data application.
Affiliation(s)
- Ying-Qi Zhao
- Associate Member, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109
- Donglin Zeng
- Professor, Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599
- Catherine M Tangen
- Member, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109
- Michael L LeBlanc
- Member, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109

17
Wu P, Zeng D, Wang Y. Matched Learning for Optimizing Individualized Treatment Strategies Using Electronic Health Records. J Am Stat Assoc 2019; 115:380-392. [PMID: 33041401 DOI: 10.1080/01621459.2018.1549050] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 12/31/2022]
Abstract
Current guidelines for treatment decision making largely rely on data from randomized controlled trials (RCTs) studying average treatment effects. They may be inadequate to make individualized treatment decisions in real-world settings. Large-scale electronic health records (EHR) provide opportunities to fulfill the goals of personalized medicine and learn individualized treatment rules (ITRs) depending on patient-specific characteristics from real-world patient data. In this work, we tackle challenges with EHRs and propose a machine learning approach based on matching (M-learning) to estimate optimal ITRs from EHRs. This new learning method performs matching instead of inverse probability weighting as commonly used in many existing methods for estimating ITRs to more accurately assess individuals' treatment responses to alternative treatments and alleviate confounding. Matching-based value functions are proposed to compare matched pairs under a unified framework, where various types of outcomes for measuring treatment response (including continuous, ordinal, and discrete outcomes) can easily be accommodated. We establish the Fisher consistency and convergence rate of M-learning. Through extensive simulation studies, we show that M-learning outperforms existing methods when propensity scores are misspecified or when unmeasured confounders are present in certain scenarios. Lastly, we apply M-learning to estimate optimal personalized second-line treatments for type 2 diabetes patients to achieve better glycemic control or reduce major complications using EHRs from New York Presbyterian Hospital.
Affiliation(s)
- Peng Wu
- Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY 10032
- Donglin Zeng
- Department of Biostatistics, University of North Carolina at Chapel Hill
- Yuanjia Wang
- Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY 10032

18
Ritz C, Astrup A, Larsen TM, Hjorth MF. Weight loss at your fingertips: personalized nutrition with fasting glucose and insulin using a novel statistical approach. Eur J Clin Nutr 2019; 73:1529-1535. [DOI: 10.1038/s41430-019-0423-z] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 02/28/2019] [Revised: 03/26/2019] [Accepted: 03/26/2019] [Indexed: 01/09/2023]

19
Abstract
Precision medicine seeks to maximize the quality of healthcare by individualizing the healthcare process to the uniquely evolving health status of each patient. This endeavor spans a broad range of scientific areas including drug discovery, genetics/genomics, health communication, and causal inference all in support of evidence-based, i.e., data-driven, decision making. Precision medicine is formalized as a treatment regime which comprises a sequence of decision rules, one per decision point, which map up-to-date patient information to a recommended action. The potential actions could be the selection of which drug to use, the selection of dose, timing of administration, specific diet or exercise recommendation, or other aspects of treatment or care. Statistics research in precision medicine is broadly focused on methodological development for estimation of and inference for treatment regimes which maximize some cumulative clinical outcome. In this review, we provide an overview of this vibrant area of research and present important and emerging challenges.
Affiliation(s)
- Michael R Kosorok
- Department of Biostatistics and Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, U.S.A.
- Eric B Laber
- Department of Statistics, North Carolina State University, Raleigh, North Carolina, 27695, U.S.A.

20
Zhang B, Zhang M. Variable selection for estimating the optimal treatment regimes in the presence of a large number of covariates. Ann Appl Stat 2018. [DOI: 10.1214/18-aoas1154] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/19/2022]

21
Abstract
Precision medicine is currently a topic of great interest in clinical and intervention science. A key component of precision medicine is that it is evidence-based, i.e., data-driven, and consequently there has been tremendous interest in estimation of precision medicine strategies using observational or randomized study data. One way to formalize precision medicine is through a treatment regime, which is a sequence of decision rules, one per stage of clinical intervention, that map up-to-date patient information to a recommended treatment. An optimal treatment regime is defined as maximizing the mean of some cumulative clinical outcome if applied to a population of interest. It is well-known that even under simple generative models an optimal treatment regime can be a highly nonlinear function of patient information. Consequently, a focal point of recent methodological research has been the development of flexible models for estimating optimal treatment regimes. However, in many settings, estimation of an optimal treatment regime is an exploratory analysis intended to generate new hypotheses for subsequent research and not to directly dictate treatment to new patients. In such settings, an estimated treatment regime that is interpretable in a domain context may be of greater value than an unintelligible treatment regime built using 'black-box' estimation methods. We propose an estimator of an optimal treatment regime composed of a sequence of decision rules, each expressible as a list of "if-then" statements that can be presented as either a paragraph or as a simple flowchart that is immediately interpretable to domain experts. The discreteness of these lists precludes smooth, i.e., gradient-based, methods of estimation and leads to non-standard asymptotics. Nevertheless, we provide a computationally efficient estimation algorithm, prove consistency of the proposed estimator, and derive rates of convergence. 
We illustrate the proposed methods using a series of simulation examples and application to data from a sequential clinical trial on bipolar disorder.
Affiliation(s)
- Yichi Zhang
- Department of Biostatistics, Harvard University

22
Laber EB, Meyer NJ, Reich BJ, Pacifici K, Collazo JA, Drake JM. Optimal treatment allocations in space and time for on-line control of an emerging infectious disease. J R Stat Soc Ser C Appl Stat 2018; 67:743-770. [PMID: 30662097 PMCID: PMC6334759] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
A key component in controlling the spread of an epidemic is deciding where, when and to whom to apply an intervention. We develop a framework for using data to inform these decisions in real time. We formalize a treatment allocation strategy as a sequence of functions, one per treatment period, that map up-to-date information on the spread of an infectious disease to a subset of locations where treatment should be allocated. An optimal allocation strategy optimizes some cumulative outcome, e.g. the number of uninfected locations, the geographic footprint of the disease or the cost of the epidemic. Estimation of an optimal allocation strategy for an emerging infectious disease is challenging because spatial proximity induces interference between locations, the number of possible allocations is exponential in the number of locations, and disease dynamics and intervention effectiveness are unknown at outbreak. We derive a Bayesian on-line estimator of the optimal allocation strategy that combines simulation-optimization with Thompson sampling. The estimator proposed performs favourably in simulation experiments. This work is motivated by and illustrated using data on the spread of white nose syndrome, which is a highly fatal infectious disease devastating bat populations in North America.
Affiliation(s)
- Jaime A Collazo
- US Geological Survey North Carolina Cooperative Fish and Wildlife Research Unit, and North Carolina State University, Raleigh, USA

23
Huang M, Hobbs BP. Estimating mean local posterior predictive benefit for biomarker-guided treatment strategies. Stat Methods Med Res 2018; 28:2820-2833. [PMID: 30037304 DOI: 10.1177/0962280218788099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Precision medicine has emerged from the awareness that many human diseases are intrinsically heterogeneous with respect to their pathogenesis and composition among patients as well as dynamic over the course of therapy. Its successful application relies on our understanding of distinct molecular profiles and their biomarkers, which can be used as targets to devise treatment strategies that exploit current understanding of the biological mechanisms of the disease. Precision medicine presents challenges, however, to traditional paradigms of clinical translation, for which estimates of population-averaged effects from large randomized trials are used as the basis for demonstrating improvements in comparative effectiveness. A general approach for estimating the relative effectiveness of biomarker-guided therapeutic strategies is presented herein. The statistical procedure attempts to define the local benefit of a given biomarker-guided therapeutic strategy in consideration of the treatment response surfaces, selection rule, and inter-cohort balance of prognostic determinants. Theoretical and simulation results are provided. Additionally, the methodology is demonstrated through a proteomic study of lower grade glioma.
Affiliation(s)
- Meilin Huang
- Regeneron Pharmaceuticals, Inc., Tarrytown, NY, USA
- Brian P Hobbs
- Taussig Cancer Institute and Department of Quantitative Health Sciences, Cleveland Clinic, Cleveland, OH, USA

24
Kim S, Wong WK. Discussion on Optimal treatment allocations in space and time for on-line control of an emerging infectious disease. J R Stat Soc Ser C Appl Stat 2018. [PMID: 30270943 DOI: 10.1111/rssc.12266] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 11/30/2022]
Affiliation(s)
- Seongho Kim
- Biostatistics Core, Karmanos Cancer Institute, Department of Oncology, School of Medicine, Wayne State University, Detroit, MI 48201
- Weng Kee Wong
- Department of Biostatistics, UCLA School of Public Health, Los Angeles, CA 90095

25
Roth J, Simon N. A framework for estimating and testing qualitative interactions with applications to predictive biomarkers. Biostatistics 2018; 19:263-280. [PMID: 28968765 DOI: 10.1093/biostatistics/kxx038] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 12/19/2016] [Accepted: 07/21/2017] [Indexed: 11/13/2022] Open
Abstract
An effective treatment may only benefit a subset of patients enrolled in a clinical trial. We translate the search for patient characteristics that predict treatment benefit to a search for qualitative interactions, which occur when the estimated response-curve under treatment crosses the estimated response-curve under control. We propose a regression-based framework that tests for qualitative interactions without assuming linearity or requiring pre-specified risk strata; this flexibility is useful in settings where there is limited a priori scientific knowledge about the relationship between features and the response. Simulations suggest that our method controls Type I error while offering an improvement in power over a procedure based on linear regression or a procedure that pre-specifies evenly spaced risk strata. We apply our method to a publicly available dataset to search for a subset of HER2+ breast cancer patients who benefit from adjuvant chemotherapy. We implement our method in Python and share the code/data used to produce our results on GitHub (https://github.com/jhroth/data-example).
Affiliation(s)
- Jeremy Roth
- Department of Biostatistics, University of Washington, 1705 NE Pacific St, Seattle, WA 98195, USA
- Noah Simon
- Department of Biostatistics, University of Washington, 1705 NE Pacific St, Seattle, WA 98195, USA

26
Tajik P, Zafarmand MH, Zwinderman AH, Mol BW, Bossuyt PM. Development and evaluating multimarker models for guiding treatment decisions. BMC Med Inform Decis Mak 2018; 18:52. [PMID: 29954372 PMCID: PMC6022448 DOI: 10.1186/s12911-018-0619-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 03/17/2017] [Accepted: 05/30/2018] [Indexed: 01/19/2023] Open
Abstract
Background Despite the growing interest in developing markers for predicting treatment response and optimizing treatment decisions, an appropriate methodology to identify, combine and evaluate such markers has been slow to develop. We propose a step-by-step strategy for analysing data from existing randomised trials with the aim of identifying a multi-marker model for guiding decisions about treatment. Methods We start with formulating the treatment selection problem, continue with defining the treatment threshold, prepare a list of candidate markers, develop the model, apply the model to estimate individual treatment effects, and evaluate model performance in the study group of patients who meet the trial eligibility criteria. In this process, we rely on some well-known techniques for multivariable prediction modelling, but focus on predicting benefit from treatment, rather than outcome itself. We present our approach using data from a randomised trial in which 808 women with multiple pregnancy were assigned to cervical pessary or control, to prevent adverse perinatal outcomes. Overall, cervical pessary did not reduce the risk of adverse perinatal outcomes. Results The treatment threshold was zero. We had a preselected list of 5 potential markers and developed a logistic model including the markers, treatment and all marker-by-treatment interaction terms. The model was well calibrated and identified 35% (95% confidence interval (CI) 32 to 39%) of the trial participants as benefitting from pessary insertion. We estimated that the risk of adverse outcome could be reduced from 13.5 to 8.1% (5.4% risk reduction; 95% CI 2.1 to 8.6%) through model-based selective pessary insertion. The next step is external validation upon existence of independent trial data. Conclusions We suggest revisiting existing trials data to explore whether differences in treatment benefit can be explained by differences in baseline characteristics of patients. 
This could lead to treatment selection tools which, after validation in comparable existing trials, can be introduced into clinical practice for guiding treatment decisions in future patients.
Affiliation(s)
- Parvin Tajik
- Department of Pathology, Department of Clinical Epidemiology, Biostatistics & Bioinformatics, Department of Obstetrics & Gynaecology, Academic Medical Centre - University of Amsterdam, Room J1b-210, PO Box 22700, 1100, DE, Amsterdam, the Netherlands
- Mohammad Hadi Zafarmand
- Department of Clinical Epidemiology, Biostatistics & Bioinformatics, Department of Obstetrics & Gynaecology, Academic Medical Centre, Amsterdam, the Netherlands
- Aeilko H Zwinderman
- Department of Clinical Epidemiology, Biostatistics & Bioinformatics, Academic Medical Centre, Amsterdam, the Netherlands
- Ben W Mol
- Department of Obstetrics and Gynaecology, Monash University, Clayton, VIC, Australia
- Patrick M Bossuyt
- Department of Clinical Epidemiology, Biostatistics & Bioinformatics, Academic Medical Centre, Amsterdam, the Netherlands

27
Liu Y, Wang Y, Kosorok MR, Zhao Y, Zeng D. Augmented outcome-weighted learning for estimating optimal dynamic treatment regimens. Stat Med 2018; 37:3776-3788. [PMID: 29873099 DOI: 10.1002/sim.7844] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 12/22/2017] [Revised: 03/30/2018] [Accepted: 05/12/2018] [Indexed: 11/08/2022]
Abstract
Dynamic treatment regimens (DTRs) are sequential treatment decisions tailored by patient's evolving features and intermediate outcomes at each treatment stage. Patient heterogeneity and the complexity and chronicity of many diseases call for learning optimal DTRs that can best tailor treatment according to each individual's time-varying characteristics (eg, intermediate response over time). In this paper, we propose a robust and efficient approach referred to as Augmented Outcome-weighted Learning (AOL) to identify optimal DTRs from sequential multiple assignment randomized trials. We improve previously proposed outcome-weighted learning to allow for negative weights. Furthermore, to reduce the variability of weights for numeric stability and improve estimation accuracy, in AOL, we propose a robust augmentation to the weights by making use of predicted pseudooutcomes from regression models for Q-functions. We show that AOL still yields Fisher-consistent DTRs even if the regression models are misspecified and that an appropriate choice of the augmentation guarantees smaller stochastic errors in value function estimation for AOL than the previous outcome-weighted learning. Finally, we establish the convergence rates for AOL. The comparative advantage of AOL over existing methods is demonstrated through extensive simulation studies and an application to a sequential multiple assignment randomized trial for major depressive disorder.
Affiliation(s)
- Ying Liu
- Division of Biostatistics, Medical College of Wisconsin, Milwaukee, WI, USA
- Yuanjia Wang
- Department of Biostatistics, Columbia University, New York City, NY, USA
- Michael R Kosorok
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Yingqi Zhao
- Public Health Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Donglin Zeng
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA

28
Dai JY, Liang J, LeBlanc M, Prentice RL, Janes H. Case-only approach to identifying markers predicting treatment effects on the relative risk scale. Biometrics 2018; 74:753-763. [PMID: 28960244 PMCID: PMC5874156 DOI: 10.1111/biom.12789] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 09/01/2016] [Revised: 06/01/2017] [Accepted: 08/01/2017] [Indexed: 11/29/2022]
Abstract
Retrospectively measuring markers on stored baseline samples from participants in a randomized controlled trial (RCT) may provide high quality evidence as to the value of the markers for treatment selection. Originally developed for approximating gene-environment interactions on the odds ratio scale, the case-only method has recently been advocated for assessing gene-treatment interactions on rare disease endpoints in randomized clinical trials. In this article, the case-only approach is shown to provide a consistent and efficient estimator of marker by treatment interactions and marker-specific treatment effects on the relative risk scale. The prohibitive rare-disease assumption is no longer needed, broadening the utility of the case-only approach. The case-only method is resource-efficient as markers need to be measured in cases only. It eliminates the need to model the marker's main effect, and can be used with any parametric or nonparametric learning method. The utility of this approach is illustrated by an application to genetic data in the Women's Health Initiative (WHI) hormone therapy trial.
Affiliation(s)
- James Y. Dai
- Public Health Sciences Division and Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, U.S.A
- Jason Liang
- Public Health Sciences Division and Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, U.S.A
- Michael LeBlanc
- Public Health Sciences Division and Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, U.S.A
- Ross L. Prentice
- Public Health Sciences Division and Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, U.S.A
- Holly Janes
- Public Health Sciences Division and Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, U.S.A

29
Abstract
There is a growing interest in the development of statistical methods for personalized medicine or precision medicine, especially for deriving optimal individualized treatment rules (ITRs). An ITR recommends a treatment for a patient based on the patient's characteristics. The common parametric methods for deriving an optimal ITR, which model the clinical endpoint as a function of the patient's characteristics, can have suboptimal performance when the conditional mean model is misspecified. Recent methodology development has cast the problem of deriving an optimal ITR under a weighted classification framework. Under this weighted classification framework, we develop a weighted random forests (W-RF) algorithm that derives an optimal ITR nonparametrically. In addition, with the W-RF algorithm, we propose variable importance measures for quantifying the relative relevance of the patient's characteristics to treatment selection, and an out-of-bag estimator for the population average outcome under the estimated optimal ITR. Our proposed methods are evaluated through intensive simulation studies. We illustrate the application of our methods using data from the Clinical Antipsychotic Trials of Intervention Effectiveness Alzheimer's Disease Study (CATIE-AD).
Affiliation(s)
- Kehao Zhu
- Department of Biostatistics, University of Washington, Seattle, WA, USA
- Ying Huang
- Department of Biostatistics, University of Washington, Seattle, WA, USA; Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Xiao-Hua Zhou
- Department of Biostatistics, University of Washington, Seattle, WA, USA

30
Logan BR, Sparapani R, McCulloch RE, Laud PW. Decision making and uncertainty quantification for individualized treatments using Bayesian Additive Regression Trees. Stat Methods Med Res 2017; 28:1079-1093. [PMID: 29254443 DOI: 10.1177/0962280217746191] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Indexed: 11/17/2022]
Abstract
Individualized treatment rules can improve health outcomes by recognizing that patients may respond differently to treatment and assigning therapy with the most desirable predicted outcome for each individual. Flexible and efficient prediction models are desired as a basis for such individualized treatment rules to handle potentially complex interactions between patient factors and treatment. Modern Bayesian semiparametric and nonparametric regression models provide an attractive avenue in this regard as these allow natural posterior uncertainty quantification of patient specific treatment decisions as well as the population wide value of the prediction-based individualized treatment rule. In addition, via the use of such models, inference is also available for the value of the optimal individualized treatment rules. We propose such an approach and implement it using Bayesian Additive Regression Trees as this model has been shown to perform well in fitting nonparametric regression functions to continuous and binary responses, even with many covariates. It is also computationally efficient for use in practice. With Bayesian Additive Regression Trees, we investigate a treatment strategy which utilizes individualized predictions of patient outcomes from Bayesian Additive Regression Trees models. Posterior distributions of patient outcomes under each treatment are used to assign the treatment that maximizes the expected posterior utility. We also describe how to approximate such a treatment policy with a clinically interpretable individualized treatment rule, and quantify its expected outcome. The proposed method performs very well in extensive simulation studies in comparison with several existing methods. 
We illustrate the usage of the proposed method to identify an individualized choice of conditioning regimen for patients undergoing hematopoietic cell transplantation and quantify the value of this method of choice in relation to the optimal individualized treatment rule as well as non-individualized treatment strategies.
Affiliation(s)
- Brent R Logan
- Division of Biostatistics, Medical College of Wisconsin, Milwaukee, WI, USA
- Rodney Sparapani
- Division of Biostatistics, Medical College of Wisconsin, Milwaukee, WI, USA
- Robert E McCulloch
- School of Mathematical and Statistical Sciences, Arizona State University, Tempe, AZ, USA
- Purushottam W Laud
- Division of Biostatistics, Medical College of Wisconsin, Milwaukee, WI, USA

31
Zhang B, Zhang M. C-learning: A new classification framework to estimate optimal dynamic treatment regimes. Biometrics 2017; 74:891-899. [PMID: 29228509 DOI: 10.1111/biom.12836] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 08/01/2016] [Revised: 09/01/2017] [Accepted: 10/01/2017] [Indexed: 11/27/2022]
Abstract
A dynamic treatment regime is a sequence of decision rules, each corresponding to a decision point, that determine the next treatment based on each individual's own available characteristics and treatment history up to that point. We show that identifying the optimal dynamic treatment regime can be recast as a sequential optimization problem and propose a direct sequential optimization method to estimate the optimal treatment regimes. In particular, at each decision point, the optimization is equivalent to sequentially minimizing a weighted expected misclassification error. Based on this classification perspective, we propose a powerful and flexible C-learning algorithm to learn the optimal dynamic treatment regimes backward sequentially from the last stage until the first stage. C-learning is a direct optimization method that directly targets optimizing decision rules by exploiting powerful optimization/classification techniques, and it allows incorporation of the patient's characteristics and treatment history to improve performance, hence enjoying advantages of both the traditional outcome regression-based methods (Q- and A-learning) and the more recent direct optimization methods. The superior performance and flexibility of the proposed methods are illustrated through extensive simulation studies.
Affiliation(s)
- Baqun Zhang
- School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, P.R. China
- Min Zhang
- Department of Biostatistics, University of Michigan, Ann Arbor, U.S.A

32
A prognostic index (PI) as a moderator of outcomes in the treatment of depression: A proof of concept combining multiple variables to inform risk-stratified stepped care models. J Affect Disord 2017; 213:78-85. [PMID: 28199892 DOI: 10.1016/j.jad.2017.02.010] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 07/17/2016] [Revised: 01/29/2017] [Accepted: 02/06/2017] [Indexed: 11/20/2022]
Abstract
BACKGROUND Prognostic indices (PIs) combining variables to predict future depression risk may help guide the selection of treatments that differ in intensity. We develop a PI and show its promise in guiding treatment decisions between treatment as usual (TAU), treatment starting with a low-intensity treatment (brief therapy (BT)), or treatment starting with a high-intensity treatment intervention (cognitive-behavioral therapy (CBT)). METHODS We utilized data from depressed patients (N=622) who participated in a randomized comparison of TAU, BT, and CBT in which no statistically significant differences in the primary outcomes emerged between the three treatments. We developed a PI by predicting depression risk at follow-up using a LASSO-style bootstrap variable selection procedure. We then examined between-treatment differences in outcome as a function of the PI. RESULTS Unemployment, depression severity, hostility, sleep problems, and lower positive emotionality at baseline predicted a lower likelihood of recovery across treatments. The PI incorporating these variables produced a fair classification accuracy (c=0.73). Among patients with a high PI (75% of the sample), recovery rates were high and did not differ between treatments (79-86%). Among the patients with the poorest prognosis, recovery rates were substantially higher in the CBT condition (60%) than in TAU (39%) or BT (44%). LIMITATIONS No information on additional treatment sought. Prospective tests needed. CONCLUSION Replicable PIs may aid treatment selection and help streamline stepped models of care. Differences between treatments for depression that differ in intensity may only emerge for patients with the poorest prognosis.
|
33
|
Linn KA, Laber EB, Stefanski LA. Interactive Q-learning for Quantiles. J Am Stat Assoc 2017; 112:638-649. [PMID: 28890584 PMCID: PMC5586239 DOI: 10.1080/01621459.2016.1155993] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 07/01/2014] [Revised: 01/01/2016] [Indexed: 12/18/2022]
Abstract
A dynamic treatment regime is a sequence of decision rules, each of which recommends treatment based on features of patient medical history such as past treatments and outcomes. Existing methods for estimating optimal dynamic treatment regimes from data optimize the mean of a response variable. However, the mean may not always be the most appropriate summary of performance. We derive estimators of decision rules for optimizing probabilities and quantiles computed with respect to the response distribution for two-stage, binary treatment settings. This enables estimation of dynamic treatment regimes that optimize the cumulative distribution function of the response at a prespecified point or a prespecified quantile of the response distribution such as the median. The proposed methods perform favorably in simulation experiments. We illustrate our approach with data from a sequentially randomized trial where the primary outcome is remission of depression symptoms.
Affiliation(s)
- Kristin A Linn
  - Department of Biostatistics and Epidemiology, University of Pennsylvania, Philadelphia, PA 19104
- Eric B Laber
  - Department of Statistics, North Carolina State University, Raleigh, NC 27695
- Leonard A Stefanski
  - Department of Statistics, North Carolina State University, Raleigh, NC 27695
|
34
|
Zhang Z, Ma S, Nie L, Soon G. A Quantitative Concordance Measure for Comparing and Combining Treatment Selection Markers. Int J Biostat 2017; 13:/j/ijb.ahead-of-print/ijb-2016-0064/ijb-2016-0064.xml. [PMID: 28343164 DOI: 10.1515/ijb-2016-0064] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/15/2022]
Abstract
Motivated by an HIV example, we consider how to compare and combine treatment selection markers, which are essential to the notion of precision medicine. The current literature on precision medicine is focused on evaluating and optimizing treatment regimes, which can be obtained by dichotomizing treatment selection markers. In practice, treatment decisions are based not only on efficacy but also on safety, cost and individual preference, making it difficult to choose a single cutoff value for all patients in all settings. It is therefore desirable to have a statistical framework for comparing and combining treatment selection markers without dichotomization. We provide such a framework based on a quantitative concordance measure, which quantifies the extent to which higher marker values are predictive of larger treatment effects. For a given marker, the proposed concordance measure can be estimated from clinical trial data using a U-statistic, which can incorporate auxiliary covariate information through an augmentation term. For combining multiple markers, we propose to maximize the estimated concordance measure among a specified family of combination markers. A cross-validation procedure can be used to remove any re-substitution bias in assessing the quality of an optimized combination marker. The proposed methodology is applied to the HIV example and evaluated in simulation studies.
|
35
|
Delmar P, Irl C, Tian L. Innovative methods for the identification of predictive biomarker signatures in oncology: Application to bevacizumab. Contemp Clin Trials Commun 2017; 5:107-115. [PMID: 29740627 PMCID: PMC5936698 DOI: 10.1016/j.conctc.2017.01.007] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 12/18/2015] [Revised: 12/06/2016] [Accepted: 01/17/2017] [Indexed: 11/26/2022] Open
Abstract
Current methods for subgroup analyses of data collected from randomized clinical trials (RCTs) may lead to false-positives from multiple testing, lack power to detect moderate but clinically meaningful differences, or be too simplistic in characterizing patients who may benefit from treatment. Herein, we present a general procedure based on a set of newly developed statistical methods for the identification and evaluation of complex multivariate predictors of treatment effect. Furthermore, we implemented this procedure to identify a subgroup of patients who may receive the largest benefit from bevacizumab treatment using a panel of 10 biomarkers measured at baseline in patients enrolled on two RCTs investigating bevacizumab in metastatic breast cancer. Data were collected from patients with human epidermal growth factor receptor 2 (HER2)-negative (AVADO) and HER2-positive (AVEREL) metastatic breast cancer. We first developed a classification rule based on an estimated individual scoring system, using data from the AVADO study only. The classification rule takes into consideration a panel of biomarkers, including vascular endothelial growth factor (VEGF)-A. We then classified the patients in the independent AVEREL study into patient groups according to “promising” or “not-promising” treatment benefit based on this rule and conducted a statistical analysis within these subgroups to compute point estimates, confidence intervals, and p-values for treatment effect and its interaction. In the group with promising treatment benefit in the AVEREL study, the estimated hazard ratio of bevacizumab versus placebo for progression-free survival was 0.687 (95% confidence interval [CI]: 0.462–1.024, p = 0.065), while in the not-promising group the hazard ratio (HR) was 1.152 (95% CI: 0.526–2.524, p = 0.723). 
When the median level of VEGF-A from the AVEREL study was used to divide the study population instead, the HR was 0.711 (95% CI: 0.435–1.163, p = 0.174) in the promising group and 0.828 (95% CI: 0.496–1.380, p = 0.468) in the not-promising group. Similar results were obtained with the median VEGF-A levels from the AVADO study (“promising” group: HR = 0.709, 95% CI: 0.444–1.133, p = 0.151; “not-promising” group: HR = 0.851, 95% CI: 0.497–1.458, p = 0.556). Our analysis shows it is feasible to employ statistical methods for empirically constructing and validating a scoring system based on a panel of biomarkers. This scoring system can be used to estimate the treatment effect for individual patients and identify a subgroup of patients who may benefit from treatment. The proposed procedure can provide a general framework to organize many statistical methods (existing or to be developed) into a coherent set of analyses for the development of personalized medicines and has the potential of broad applications.
Affiliation(s)
- Paul Delmar
  - Department of Biostatistics, F. Hoffmann-La Roche Ltd., Basel, Switzerland
- Cornelia Irl
  - Department of Biostatistics, Genentech Inc., South San Francisco, CA, USA
- Lu Tian
  - Department of Biomedical Data Science, Stanford University School of Medicine, Palo Alto, CA, USA
|
36
|
Ma J, Hobbs BP, Stingo FC. Integrating genomic signatures for treatment selection with Bayesian predictive failure time models. Stat Methods Med Res 2016; 27:2093-2113. [PMID: 27807177 DOI: 10.1177/0962280216675373] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 12/11/2022]
Abstract
Over the past decade, a tremendous amount of resources have been dedicated to the pursuit of developing genomic signatures that effectively match patients with targeted therapies. Although dozens of therapies that target DNA mutations have been developed, the practice of studying single candidate genes has limited our understanding of cancer. Moreover, many studies of multiple-gene signatures have been conducted for the purpose of identifying prognostic risk cohorts, and thus are limited for selecting personalized treatments. Existing statistical methods for treatment selection often model treatment-by-covariate interactions that are difficult to specify, and require prohibitively large patient cohorts. In this article, we describe a Bayesian predictive failure time model for treatment selection that integrates multiple-gene signatures. Our approach relies on a heuristic measure of similarity that determines the extent to which historically treated patients contribute to the outcome prediction of new patients. The similarity measure, which can be obtained from existing clustering methods, imparts robustness to the underlying stochastic data structure, which enhances feasibility in the presence of small samples. Performance of the proposed method is evaluated in simulation studies, and its application is demonstrated through a study of lung squamous cell carcinoma. Our Bayesian predictive failure time approach is shown to effectively leverage genomic signatures to match patients to the therapies that are most beneficial for prolonging their survival.
Affiliation(s)
- Junsheng Ma
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, USA
- Brian P Hobbs
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, USA
- Francesco C Stingo
  - Dipartimento di Statistica, Informatica, Applicazioni, University of Florence, Italy
|
37
|
Affiliation(s)
- Qian Guan
  - Department of Statistics, North Carolina State University, Raleigh, NC, USA
- Eric B Laber
  - Department of Statistics, North Carolina State University, Raleigh, NC, USA
- Brian J Reich
  - Department of Statistics, North Carolina State University, Raleigh, NC, USA
|
38
|
Zhu R, Zhao YQ, Chen G, Ma S, Zhao H. Greedy outcome weighted tree learning of optimal personalized treatment rules. Biometrics 2016; 73:391-400. [PMID: 27704531 DOI: 10.1111/biom.12593] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 08/01/2015] [Revised: 07/01/2016] [Accepted: 08/01/2016] [Indexed: 11/27/2022]
Abstract
We propose a subgroup identification approach for inferring optimal and interpretable personalized treatment rules with high-dimensional covariates. Our approach is based on a two-step greedy tree algorithm to pursue signals in a high-dimensional space. In the first step, we transform the treatment selection problem into a weighted classification problem that can utilize tree-based methods. In the second step, we adopt a newly proposed tree-based method, known as reinforcement learning trees, to detect features involved in the optimal treatment rules and to construct binary splitting rules. The method is further extended to right censored survival data by using the accelerated failure time model and introducing double weighting to the classification trees. The performance of the proposed method is demonstrated via simulation studies, as well as analyses of the Cancer Cell Line Encyclopedia (CCLE) data and the Tamoxifen breast cancer data.
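The first step described in this abstract, recasting treatment selection as weighted classification, can be illustrated with a generic outcome-weighted 0-1 loss. The sketch below is an assumption-laden illustration and not the authors' algorithm: it assumes a nonnegative outcome where larger is better, known treatment propensities, and treatments coded as 1 and -1. The function name `outcome_weighted_loss` and the toy data are hypothetical.

```python
def outcome_weighted_loss(outcomes, received, recommended, propensities):
    """Average of Y_i / pi_i over patients whose received treatment
    disagrees with the rule's recommendation (a weighted 0-1 loss)."""
    if not (len(outcomes) == len(received) == len(recommended) == len(propensities)):
        raise ValueError("all inputs must have the same length")
    total = 0.0
    for y, a, d, p in zip(outcomes, received, recommended, propensities):
        if a != d:  # "misclassified": the rule disagrees with the treatment given
            total += y / p
    return total / len(outcomes)


# Hypothetical toy data: larger outcome is better, 1:1 randomization.
outcomes = [3.0, 1.0, 2.0, 4.0]
received = [1, -1, 1, -1]
propensities = [0.5, 0.5, 0.5, 0.5]

rule_a = [1, -1, -1, -1]  # disagrees only with the third patient
rule_b = [-1, 1, 1, -1]   # disagrees with the first and second patients

loss_a = outcome_weighted_loss(outcomes, received, rule_a, propensities)
loss_b = outcome_weighted_loss(outcomes, received, rule_b, propensities)
```

Under these assumptions, a rule with a smaller weighted loss disagrees less often with treatments that were followed by good outcomes, which is why any classifier that accepts observation weights (trees included) can be used to search for a rule.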
Affiliation(s)
- Ruoqing Zhu
  - University of Illinois at Urbana-Champaign, Champaign, Illinois, 61820, U.S.A
- Ying-Qi Zhao
  - Fred Hutchinson Cancer Research Center, Seattle, Washington, 98109, U.S.A
- Guanhua Chen
  - Vanderbilt University, Nashville, Tennessee, 37240, U.S.A
- Shuangge Ma
  - Yale University, New Haven, Connecticut, 06520, U.S.A
- Hongyu Zhao
  - Yale University, New Haven, Connecticut, 06520, U.S.A
|
39
|
Lipkovich I, Dmitrienko A, D'Agostino RB Sr. Tutorial in biostatistics: data-driven subgroup identification and analysis in clinical trials. Stat Med 2016; 36:136-196. [PMID: 27488683 DOI: 10.1002/sim.7064] [Citation(s) in RCA: 159] [Impact Index Per Article: 19.9] [Received: 08/04/2015] [Revised: 06/23/2016] [Accepted: 07/05/2016] [Indexed: 02/05/2023]
Abstract
It is well known that both the direction and magnitude of the treatment effect in clinical trials are often affected by baseline patient characteristics (generally referred to as biomarkers). Characterization of treatment effect heterogeneity plays a central role in the field of personalized medicine and facilitates the development of tailored therapies. This tutorial focuses on a general class of problems arising in data-driven subgroup analysis, namely, identification of biomarkers with strong predictive properties and patient subgroups with desirable characteristics such as improved benefit and/or safety. Limitations of ad-hoc approaches to biomarker exploration and subgroup identification in clinical trials are discussed, and the ad-hoc approaches are contrasted with principled approaches to exploratory subgroup analysis based on recent advances in machine learning and data mining. A general framework for evaluating predictive biomarkers and identification of associated subgroups is introduced. The tutorial provides a review of a broad class of statistical methods used in subgroup discovery, including global outcome modeling methods, global treatment effect modeling methods, optimal treatment regimes, and local modeling methods. Commonly used subgroup identification methods are illustrated using two case studies based on clinical trials with binary and survival endpoints. Copyright © 2016 John Wiley & Sons, Ltd.
Affiliation(s)
- Ralph B
  - Boston University, Boston, MA, U.S.A
|
40
|
Petkova E, Tarpey T, Su Z, Ogden RT. Generated effect modifiers (GEM's) in randomized clinical trials. Biostatistics 2016; 18:105-118. [PMID: 27465235 DOI: 10.1093/biostatistics/kxw035] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 12/04/2015] [Revised: 06/09/2016] [Accepted: 06/12/2016] [Indexed: 01/07/2023] Open
Abstract
In a randomized clinical trial (RCT), it is often of interest not only to estimate the effect of various treatments on the outcome, but also to determine whether any patient characteristic has a different relationship with the outcome, depending on treatment. In regression models for the outcome, if there is a non-zero interaction between treatment and a predictor, that predictor is called an "effect modifier". Identification of such effect modifiers is crucial as we move towards precision medicine, that is, optimizing individual treatment assignment based on patient measurements assessed when presenting for treatment. In most settings, there will be several baseline predictor variables that could potentially modify the treatment effects. This article proposes optimal methods of constructing a composite variable (defined as a linear combination of pre-treatment patient characteristics) in order to generate an effect modifier in an RCT setting. Several criteria are considered for generating effect modifiers and their performance is studied via simulations. An example from an RCT is provided for illustration.
Affiliation(s)
- Eva Petkova
  - Department of Child and Adolescent Psychiatry, New York University, 1 Park Ave., New York, NY 10016, USA and Nathan Kline Institute for Psychiatric Research, 140 Old Orangeburg Road, Orangeburg, NY 10962, USA
- Thaddeus Tarpey
  - Department of Mathematics and Statistics, Wright State University, 3640 Colonel Glenn Hwy, Dayton, OH 45435, USA and Department of Child and Adolescent Psychiatry, New York University, 1 Park Ave., New York, NY 10016, USA
- Zhe Su
  - Department of Child and Adolescent Psychiatry, New York University, 1 Park Ave., New York, NY 10016, USA
- R Todd Ogden
  - Department of Biostatistics, Mailman School of Public Health, Columbia University, 722 West 168th St., New York, NY 10032, USA
|
41
|
Tajik P, Monfrance M, van 't Hooft J, Liem SMS, Schuit E, Bloemenkamp KWM, Duvekot JJ, Nij Bijvank B, Franssen MTM, Oudijk MA, Scheepers HCJ, Sikkema JM, Woiski M, Mol BWJ, Bekedam DJ, Bossuyt PM, Zafarmand MH. A multivariable model to guide the decision for pessary placement to prevent preterm birth in women with a multiple pregnancy: a secondary analysis of the ProTWIN trial. Ultrasound Obstet Gynecol 2016; 48:48-55. [PMID: 26748537 DOI: 10.1002/uog.15855] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 05/25/2015] [Revised: 12/16/2015] [Accepted: 12/23/2015] [Indexed: 06/05/2023]
Abstract
OBJECTIVE The ProTWIN Trial (NTR1858) showed that, in women with a multiple pregnancy and a cervical length < 25th percentile (38 mm), prophylactic use of a cervical pessary reduced the risk of adverse perinatal outcome. We investigated whether other maternal or pregnancy characteristics collected at baseline can improve identification of women most likely to benefit from pessary placement. METHODS ProTWIN is a multicenter randomized trial in which 808 women with a multiple pregnancy were assigned to pessary or control. Using these data we developed a multivariable logistic model comprising treatment, cervical length, chorionicity, pregnancy history and number of fetuses, and the interaction of these variables with treatment as predictors of adverse perinatal outcome. RESULTS Short cervix, monochorionicity and nulliparity were predictive factors for a benefit from pessary insertion. History of previous preterm birth and triplet pregnancy were predictive factors of possible harm from pessary. The model identified 35% of women as benefiting (95% CI, 32-39%), which is 10% more than using cervical length only (25%) for pessary decisions. The model had acceptable calibration. We estimated that using the model to guide the choice of pessary placement would reduce the risk of adverse perinatal outcome significantly from 13.5% when no pessary is inserted to 8.1% (absolute risk reduction, 5.4% (95% CI, 2.1-8.6%)). CONCLUSIONS We developed and internally validated a multivariable treatment selection model, with cervical length, chorionicity, pregnancy history and number of fetuses. If externally validated, it could be used to identify women with a twin pregnancy who would benefit from a pessary, and lead to a reduction in adverse perinatal outcomes in these women. Copyright © 2016 ISUOG. Published by John Wiley & Sons Ltd.
Affiliation(s)
- P Tajik
  - Department of Obstetrics and Gynaecology, Academic Medical Centre, Amsterdam, The Netherlands
  - Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Centre, Amsterdam, The Netherlands
- M Monfrance
  - Department of Obstetrics and Gynaecology, Atrium Medical Centre, Heerlen, The Netherlands
- J van 't Hooft
  - Department of Obstetrics and Gynaecology, Academic Medical Centre, Amsterdam, The Netherlands
- S M S Liem
  - Department of Obstetrics and Gynaecology, Academic Medical Centre, Amsterdam, The Netherlands
- E Schuit
  - Department of Obstetrics and Gynaecology, Academic Medical Centre, Amsterdam, The Netherlands
  - Julius Centre for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht, The Netherlands
- K W M Bloemenkamp
  - Department of Obstetrics and Gynaecology, Leiden University Medical Centre, Leiden, The Netherlands
- J J Duvekot
  - Department of Obstetrics and Gynaecology, Erasmus Medical Centre, Rotterdam, The Netherlands
- B Nij Bijvank
  - Department of Obstetrics and Gynaecology, Isala Clinics, Zwolle, The Netherlands
- M T M Franssen
  - Department of Obstetrics and Gynaecology, University Medical Centre Groningen, Groningen, The Netherlands
- M A Oudijk
  - Department of Obstetrics and Gynaecology, University Medical Center Utrecht, Utrecht, The Netherlands
- H C J Scheepers
  - Department of Obstetrics and Gynaecology, Maastricht University Medical Center, Maastricht, The Netherlands
- J M Sikkema
  - Department of Obstetrics and Gynaecology, ZGT, Almelo, The Netherlands
- M Woiski
  - Department of Obstetrics and Gynaecology, Radboud University Nijmegen, Nijmegen, The Netherlands
- B W J Mol
  - The Robinson Institute, School of Paediatrics and Reproductive Health, University of Adelaide, Adelaide, Australia
- D J Bekedam
  - Department of Obstetrics and Gynaecology, Onze Lieve Vrouwe Gasthuis, Amsterdam, The Netherlands
- P M Bossuyt
  - Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Centre, Amsterdam, The Netherlands
- M H Zafarmand
  - Department of Obstetrics and Gynaecology, Academic Medical Centre, Amsterdam, The Netherlands
  - Department of Public Health, Academic Medical Centre, Amsterdam, The Netherlands
|
42
|
Shen C, Hu Y, Li X, Wang Y, Chen PS, Buxton AE. Identification of subpopulations with distinct treatment benefit rate using the Bayesian tree. Biom J 2016; 58:1357-1375. [DOI: 10.1002/bimj.201500180] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 09/03/2015] [Revised: 03/06/2016] [Accepted: 04/04/2016] [Indexed: 11/10/2022]
Affiliation(s)
- Changyu Shen
  - Department of Biostatistics; School of Medicine; Richard M. Fairbanks School of Public Health; Indiana University; Indianapolis IN 46202 USA
- Yang Hu
  - School of Life Science and Technology; Harbin Institute of Technology; Harbin HeiLongJiang 150001 China
- Xiaochun Li
  - Department of Biostatistics; School of Medicine; Richard M. Fairbanks School of Public Health; Indiana University; Indianapolis IN 46202 USA
- Yadong Wang
  - School of Life Science and Technology; Harbin Institute of Technology; Harbin HeiLongJiang 150001 China
- Peng-Sheng Chen
  - Division of Cardiology; Krannert Institute of Cardiology, Department of Medicine; School of Medicine; Indiana University; Indianapolis IN 46202 USA
- Alfred E. Buxton
  - Clinical Electrophysiology Laboratory; Beth Israel Deaconess Medical Center; Harvard Medical School; Boston MA 02215 USA
|
43
|
Tsai WM, Zhang H, Buta E, O'Malley S, Gueorguieva R. A modified classification tree method for personalized medicine decisions. Stat Interface 2016; 9:239-253. [PMID: 26770292 PMCID: PMC4707681 DOI: 10.4310/sii.2016.v9.n2.a11] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 06/05/2023]
Abstract
The tree-based methodology has been widely applied to identify predictors of health outcomes in medical studies. However, the classical tree-based approaches do not pay particular attention to treatment assignment and thus do not consider prediction in the context of treatment received. In recent years, attention has been shifting from average treatment effects to identifying moderators of treatment response, and tree-based approaches to identify subgroups of subjects with enhanced treatment responses are emerging. In this study, we extend and present modifications to one of these approaches (Zhang et al., 2010 [29]) to efficiently identify subgroups of subjects who respond more favorably to one treatment than another based on their baseline characteristics. We extend the algorithm by incorporating an automatic pruning step and propose a measure for assessment of the predictive performance of the constructed tree. We evaluate the proposed method through a simulation study and illustrate the approach using a data set from a clinical trial of treatments for alcohol dependence. This simple and efficient statistical tool can be used for developing algorithms for clinical decision making and personalized treatment for patients based on their characteristics.
Affiliation(s)
- Wan-Min Tsai
  - Department of Biostatistics, Yale University School of Public Health, New Haven, CT 06520, USA
- Heping Zhang
  - Department of Biostatistics, Yale University School of Public Health, New Haven, CT 06520, USA
- Eugenia Buta
  - Department of Biostatistics, Yale University School of Public Health, New Haven, CT 06520, USA
- Stephanie O'Malley
  - Department of Psychiatry, Yale University School of Medicine, New Haven, CT 06511, USA
- Ralitza Gueorguieva
  - Department of Biostatistics, Yale University School of Public Health, 60 College Street, New Haven, CT 06520, USA
  - Department of Psychiatry, Yale University School of Medicine, New Haven, CT 06511, USA
|
44
|
Ma J, Stingo FC, Hobbs BP. Bayesian predictive modeling for genomic based personalized treatment selection. Biometrics 2015; 72:575-83. [PMID: 26575856 DOI: 10.1111/biom.12448] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 02/01/2015] [Revised: 08/01/2015] [Accepted: 10/01/2015] [Indexed: 01/15/2023]
Abstract
Efforts to personalize medicine in oncology have been limited by reductive characterizations of the intrinsically complex underlying biological phenomena. Future advances in personalized medicine will rely on molecular signatures that derive from synthesis of multifarious interdependent molecular quantities requiring robust quantitative methods. However, highly parameterized statistical models when applied in these settings often require a prohibitively large database and are sensitive to proper characterizations of the treatment-by-covariate interactions, which in practice are difficult to specify and may be limited by generalized linear models. In this article, we present a Bayesian predictive framework that enables the integration of a high-dimensional set of genomic features with clinical responses and treatment histories of historical patients, providing a probabilistic basis for using the clinical and molecular information to personalize therapy for future patients. Our work represents one of the first attempts to define personalized treatment assignment rules based on large-scale genomic data. We use actual gene expression data acquired from The Cancer Genome Atlas in the settings of leukemia and glioma to explore the statistical properties of our proposed Bayesian approach for personalizing treatment selection. The method is shown to yield considerable improvements in predictive accuracy when compared to penalized regression approaches.
Affiliation(s)
- Junsheng Ma
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, U.S.A
- Francesco C Stingo
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, U.S.A
- Brian P Hobbs
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, U.S.A
|
45
|
Statistical Methods for Establishing Personalized Treatment Rules in Oncology. Biomed Res Int 2015; 2015:670691. [PMID: 26446492 PMCID: PMC4584067 DOI: 10.1155/2015/670691] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 11/25/2014] [Accepted: 02/09/2015] [Indexed: 12/23/2022]
Abstract
The process for using statistical inference to establish personalized treatment strategies requires specific techniques for data analysis that optimize the combination of competing therapies with candidate genetic features and characteristics of the patient and disease. A wide variety of methods have been developed. However, heretofore the usefulness of these recent advances has not been fully recognized by the oncology community, and the scope of their applications has not been summarized. In this paper, we provide an overview of statistical methods for establishing optimal treatment rules for personalized medicine and discuss specific examples in various medical contexts with oncology as an emphasis. We also point the reader to statistical software for implementation of the methods when available.
|
46
|
Huang Y. Identifying optimal biomarker combinations for treatment selection through randomized controlled trials. Clin Trials 2015; 12:348-56. [PMID: 25948620 PMCID: PMC4506270 DOI: 10.1177/1740774515580126] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 01/05/2023]
Abstract
BACKGROUND/AIMS Biomarkers associated with treatment-effect heterogeneity can be used to make treatment recommendations that optimize individual clinical outcomes. To accomplish this, statistical methods are needed to generate marker-based treatment-selection rules that can most effectively reduce the population burden due to disease and treatment. Compared to the standard approach of risk modeling to derive treatment-selection rules, a more robust approach is to directly minimize an unbiased estimate of total disease and treatment burden among a pre-specified class of rules. This problem is one of minimizing a weighted sum of 0-1 loss function, which is computationally challenging to solve due to the nonsmoothness of 0-1 loss. Huang and Fong, among others, proposed a method that uses the Ramp loss to approximate the 0-1 loss and solves the minimization problem through repetitive constrained optimizations. The algorithm was shown to have comparable or better performance than other comparative estimators in various settings. Our aim in this article is to further extend the algorithm to allow for variable selection in the presence of a large number of candidate markers. METHODS We develop an alternative method to derive marker combinations to minimize the weighted sum of Ramp loss in Huang and Fong, based on data from randomized trials. The new algorithm estimates treatment-selection rules by repetitively minimizing a smooth and differentiable objective function. Through the use of an L1 penalty, we expand the method to allow for feature selection and develop an algorithm based on the coordinate descent method to build the treatment-selection rule. 
RESULTS Through extensive simulation studies, we compared performance of the proposed estimator to four existing approaches: (1) a logistic regression risk modeling approach, and three other "direct optimizing" approaches including (2) the estimator in Huang and Fong, (3) the weighted support vector machine, and (4) the weighted logistic regression. The proposed estimator performs comparably to that of Huang and Fong, and comparably or better than other estimators. Allowing for variable selection using the proposed estimator in the presence of a large number of markers further improves treatment-selection performance. The proposed estimator is also advantageous for selecting variables relevant to treatment selection compared to L1 penalized logistic regression and weighted logistic regression. We illustrate the application of the proposed methods in host-genetics data from an HIV vaccine trial. CONCLUSION The proposed estimator is appealing considering its effectiveness and conceptual simplicity. It has significant potential to contribute to the selection and combination of biomarkers for treatment selection in clinical practice.
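The abstract's point about the nonsmooth 0-1 loss and its Ramp-loss approximation can be made concrete with a small sketch. The abstract does not give the exact form used by Huang and Fong, so the parameterization below (a clipped linear ramp with a hypothetical width parameter `s`) is one common choice and should be read as an assumption, not as the authors' definition.

```python
def zero_one_loss(u):
    """0-1 loss on a margin u: 1 when the margin is non-positive, else 0."""
    return 1.0 if u <= 0 else 0.0


def ramp_loss(u, s=1.0):
    """Clipped linear ramp (assumed form): 1 for u <= 0, 0 for u >= s,
    and linear in between. Continuous and bounded, unlike the 0-1 loss."""
    if s <= 0:
        raise ValueError("s must be positive")
    return min(1.0, max(0.0, 1.0 - u / s))


# Margins on which to compare the two losses.
margins = [-1.0, 0.0, 0.5, 2.0]
wide = [ramp_loss(u, s=1.0) for u in margins]    # visible linear transition
narrow = [ramp_loss(u, s=0.1) for u in margins]  # close to the 0-1 loss
exact = [zero_one_loss(u) for u in margins]
```

Shrinking `s` moves the ramp toward the 0-1 loss on any fixed set of margins, while keeping the objective continuous, which is what makes repeated smooth or constrained optimization steps feasible.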
Affiliation(s)
- Ying Huang
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA; Department of Biostatistics, University of Washington, Seattle, WA, USA
47
|
Baker SG, Kramer BS. Evaluating surrogate endpoints, prognostic markers, and predictive markers: Some simple themes. Clin Trials 2015; 12:299-308. [PMID: 25385934 PMCID: PMC4451440 DOI: 10.1177/1740774514557725] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 12/13/2022]
Abstract
BACKGROUND A surrogate endpoint is an endpoint observed earlier than the true endpoint (a health outcome) that is used to draw conclusions about the effect of treatment on the unobserved true endpoint. A prognostic marker is a marker for predicting the risk of an event given a control treatment; it informs treatment decisions when there is information on anticipated benefits and harms of a new treatment applied to persons at high risk. A predictive marker is a marker for predicting the effect of treatment on outcome in a subgroup of patients or study participants; it provides more rigorous information for treatment selection than a prognostic marker when it is based on estimated treatment effects in a randomized trial. METHODS We organized our discussion around a different theme for each topic. RESULTS "Fundamentally an extrapolation" refers to the non-statistical considerations and assumptions needed when using surrogate endpoints to evaluate a new treatment. "Decision analysis to the rescue" refers to the use of decision analysis to evaluate an additional prognostic marker because it is not possible to choose between purely statistical measures of marker performance. "The appeal of simplicity" refers to a straightforward and efficient use of a single randomized trial to evaluate overall treatment effect and treatment effect within subgroups using predictive markers. CONCLUSION The simple themes provide a general guideline for evaluation of surrogate endpoints, prognostic markers, and predictive markers.
Affiliation(s)
- Stuart G Baker
- Division of Cancer Prevention, National Cancer Institute, Bethesda MD, USA
- Barnett S Kramer
- Division of Cancer Prevention, National Cancer Institute, Bethesda MD, USA
48
|
Zhao YQ, Zeng D, Laber EB, Song R, Yuan M, Kosorok MR. Doubly Robust Learning for Estimating Individualized Treatment with Censored Data. Biometrika 2015; 102:151-168. [PMID: 25937641 PMCID: PMC4414056 DOI: 10.1093/biomet/asu050] [Citation(s) in RCA: 78] [Impact Index Per Article: 8.7] [Indexed: 12/24/2022] Open
Abstract
Individualized treatment rules recommend treatments based on individual patient characteristics in order to maximize clinical benefit. When the clinical outcome of interest is survival time, estimation is often complicated by censoring. We develop nonparametric methods for estimating an optimal individualized treatment rule in the presence of censored data. To adjust for censoring, we propose a doubly robust estimator which requires correct specification of either the censoring model or survival model, but not both; the method is shown to be Fisher consistent when either model is correct. Furthermore, we establish the convergence rate of the expected survival under the estimated optimal individualized treatment rule to the expected survival under the optimal individualized treatment rule. We illustrate the proposed methods using a simulation study and data from a Phase III clinical trial on non-small cell lung cancer.
Affiliation(s)
- Y. Q. Zhao
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin, 53792, U.S.A
- D. Zeng
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, U.S.A
- E. B. Laber
- Department of Statistics, North Carolina State University, Raleigh, North Carolina, 27695, U.S.A
- R. Song
- Department of Statistics, North Carolina State University, Raleigh, North Carolina, 27695, U.S.A
- M. Yuan
- Department of Statistics, University of Wisconsin-Madison, Madison, Wisconsin, 53792, U.S.A
- M. R. Kosorok
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, U.S.A
49
|
Zhao YQ, Kosorok MR. Discussion of combining biomarkers to optimize patient treatment recommendations. Biometrics 2014; 70:713-6. [PMID: 24889265 DOI: 10.1111/biom.12189] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 01/01/2014] [Revised: 02/01/2014] [Accepted: 02/01/2014] [Indexed: 12/01/2022]
Abstract
Kang, Janes and Huang propose an interesting boosting method to combine biomarkers for treatment selection. The method requires modeling the treatment effects using markers. We discuss an alternative method, outcome weighted learning. This method sidesteps the need for modeling the outcomes, and thus can be more robust to model misspecification.
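The idea behind outcome weighted learning can be illustrated with the inverse-probability-weighted value of a treatment rule, which the method seeks to maximize directly rather than through an outcome model. The sketch below is a minimal illustration assuming a randomized trial with known treatment probability 0.5, treatments coded -1/+1, and a larger-is-better outcome; the data, the rule `rule_by_marker`, and all function names are hypothetical and not taken from the cited discussion.

```python
def ipw_value(records, rule, propensity=0.5):
    """Inverse-probability-weighted estimate of the mean outcome under `rule`.

    records: list of (marker, treatment, outcome), with treatment in {-1, +1}.
    Only subjects whose received treatment agrees with the rule contribute.
    """
    total = 0.0
    for marker, treatment, outcome in records:
        if rule(marker) == treatment:
            total += outcome / propensity
    return total / len(records)


def rule_by_marker(marker, cutoff=0.0):
    """Hypothetical rule: treat (+1) when the marker exceeds the cutoff."""
    return 1 if marker > cutoff else -1


def treat_all(marker):
    """One-size-fits-all rule: everyone receives treatment +1."""
    return 1


# Toy trial data (made up for illustration): (marker, treatment, outcome).
records = [
    (1.2, 1, 8.0),
    (0.7, -1, 3.0),
    (-0.5, -1, 6.0),
    (-1.1, 1, 2.0),
]

value_tailored = ipw_value(records, rule_by_marker)
value_treat_all = ipw_value(records, treat_all)
```

Maximizing this quantity over rules is equivalent to a weighted classification problem in which each subject's weight is the outcome divided by the treatment probability, which is why no model for the outcome itself is required.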
Affiliation(s)
- Ying-Qi Zhao
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin 53792, U.S.A
- Michael R Kosorok
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, U.S.A.; Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, U.S.A