1
Zhang Z, Yi D, Fan Y. Doubly robust estimation of optimal dynamic treatment regimes with multicategory treatments and survival outcomes. Stat Med 2022; 41:4903-4923. [PMID: 35948279 DOI: 10.1002/sim.9543] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/20/2021] [Revised: 05/31/2022] [Accepted: 07/21/2022] [Indexed: 11/06/2022]
Abstract
Patients with chronic diseases, such as cancer or epilepsy, are often followed through multiple stages of clinical interventions. Dynamic treatment regimes (DTRs) are sequences of decision rules that assign treatments at each stage based on measured covariates for each patient. A DTR is said to be optimal if the expectation of the desirable clinical benefit reaches a maximum when applied to a population. When there are three or more options for treatments at each decision point and the clinical outcome of interest is a time-to-event variable, estimating an optimal DTR can be complicated. We propose a doubly robust method to estimate optimal DTRs with multicategory treatments and survival outcomes. A novel blip function is defined to measure the difference in expected outcomes among treatments, and a doubly robust weighted least squares algorithm is designed for parameter estimation. Simulations using various weight functions and scenarios support the advantages of the proposed method in estimating optimal DTRs over existing approaches. We further illustrate the practical value of our method by applying it to data from the Standard and New Antiepileptic Drugs study. In this analysis, the proposed method supports the use of the new drug lamotrigine over the standard option carbamazepine. When the actual treatments match the estimated optimal treatments, survival outcomes tend to be better. The newly developed method provides a practical approach for clinicians that is not limited to cases of binary treatment options.
Affiliation(s)
- Zhang Zhang
  - Center for Applied Statistics, Renmin University of China, Beijing, China; School of Statistics, Renmin University of China, Beijing, China
- Danhui Yi
  - Center for Applied Statistics, Renmin University of China, Beijing, China; School of Statistics, Renmin University of China, Beijing, China
- Yiwei Fan
  - School of Mathematics and Statistics, Beijing Institute of Technology, Beijing, China
2
Survival Augmented Patient Preference Incorporated Reinforcement Learning to Evaluate Tailoring Variables for Personalized Healthcare. STATS 2021. [DOI: 10.3390/stats4040046] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022] Open
Abstract
In this paper, we consider personalized treatment decision strategies in the management of chronic diseases, such as chronic kidney disease, which typically consists of sequential and adaptive treatment decision making. We investigate a two-stage treatment setting with a survival outcome that could be right censored. This can be formulated through a dynamic treatment regime (DTR) framework, where the goal is to tailor treatment to each individual based on their own medical history in order to maximize a desirable health outcome. We develop a new method, Survival Augmented Patient Preference incorporated reinforcement Q-Learning (SAPP-Q-Learning) to decide between quality of life and survival restricted at maximal follow-up. Our method incorporates the latent patient preference into a weighted utility function that balances between quality of life and survival time, in a Q-learning model framework. We further propose a corresponding m-out-of-n Bootstrap procedure to accurately make statistical inferences and construct confidence intervals on the effects of tailoring variables, whose values can guide personalized treatment strategies.
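The m-out-of-n bootstrap mentioned in this abstract can be sketched roughly as follows. This is a generic illustration, not the SAPP-Q-Learning procedure itself: the function name `m_out_of_n_ci`, the choice of resample size `m`, the non-smooth statistic `max(mean, 0)`, and the assumed root-n convergence rate are all assumptions made for the example.

```python
import numpy as np

def m_out_of_n_ci(data, statistic, m, n_boot=2000, alpha=0.05, seed=0):
    """Basic m-out-of-n bootstrap interval (hypothetical helper).

    Resamples m < n observations with replacement, forms the root
    sqrt(m) * (theta*_m - theta_hat), and rescales its quantiles by sqrt(n).
    Assumes the estimator converges at the usual root-n rate.
    """
    rng = np.random.default_rng(seed)
    n = len(data)
    theta_hat = statistic(data)
    roots = np.empty(n_boot)
    for b in range(n_boot):
        resample = data[rng.integers(0, n, size=m)]
        roots[b] = np.sqrt(m) * (statistic(resample) - theta_hat)
    lo_q, hi_q = np.quantile(roots, [alpha / 2, 1 - alpha / 2])
    return theta_hat - hi_q / np.sqrt(n), theta_hat - lo_q / np.sqrt(n)

# Illustrative use with a non-smooth statistic, the kind of setting in which
# the standard n-out-of-n bootstrap can behave poorly.
rng = np.random.default_rng(42)
sample = rng.normal(loc=0.1, scale=1.0, size=400)
interval = m_out_of_n_ci(sample, lambda d: max(d.mean(), 0.0), m=120)
print(interval)
```

How to choose `m` is the delicate part in practice; the fixed value used here is only a placeholder.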
3
Mahar RK, McGuinness MB, Chakraborty B, Carlin JB, IJzerman MJ, Simpson JA. A scoping review of studies using observational data to optimise dynamic treatment regimens. BMC Med Res Methodol 2021; 21:39. [PMID: 33618655 PMCID: PMC7898728 DOI: 10.1186/s12874-021-01211-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 09/15/2020] [Accepted: 01/19/2021] [Indexed: 11/16/2022] Open
Abstract
BACKGROUND Dynamic treatment regimens (DTRs) formalise the multi-stage and dynamic decision problems that clinicians often face when treating chronic or progressive medical conditions. Compared to randomised controlled trials, using observational data to optimise DTRs may allow a wider range of treatments to be evaluated at a lower cost. This review aimed to provide an overview of how DTRs are optimised with observational data in practice.
METHODS Using the PubMed database, a scoping review of studies in which DTRs were optimised using observational data was performed in October 2020. Data extracted from eligible articles included target medical condition, source and type of data, statistical methods, and translational relevance of the included studies.
RESULTS From 209 PubMed abstracts, 37 full-text articles were identified, and a further 26 were screened from the reference lists, totalling 63 articles for inclusion in a narrative data synthesis. Observational DTR models are a recent development and their application has been concentrated in a few medical areas, primarily HIV/AIDS (27, 43%), followed by cancer (8, 13%), and diabetes (6, 10%). There was substantial variation in the scope, intent, complexity, and quality between the included studies. Statistical methods that were used included inverse-probability weighting (26, 41%), the parametric G-formula (16, 25%), Q-learning (10, 16%), G-estimation (4, 6%), targeted maximum likelihood/minimum loss-based estimation (4, 6%), regret regression (3, 5%), and other less common approaches (10, 16%). Notably, studies that were primarily intended to address real-world clinical questions (18, 29%) tended to use inverse-probability weighting and the parametric G-formula, relatively well-established methods, along with a large amount of data. Studies focused on methodological developments (45, 71%) tended to be more complicated and included a demonstrative real-world application only.
CONCLUSIONS As chronic and progressive conditions become more common, the need will grow for personalised treatments and methods to estimate the effects of DTRs. Observational DTR studies will be necessary, but so far their use to inform clinical practice has been limited. Focusing on simple DTRs, collecting large and rich clinical datasets, and fostering tight partnerships between content experts and data analysts may result in more clinically relevant observational DTR studies.
Affiliation(s)
- Robert K Mahar
  - Biostatistics Unit, Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, University of Melbourne, Parkville, Victoria, Australia
  - Cancer Health Services Research Unit, University of Melbourne Centre for Cancer Research and Centre for Health Policy, Melbourne School of Population and Global Health, University of Melbourne, Parkville, Victoria, Australia
  - Victorian Comprehensive Cancer Centre, Parkville, Victoria, Australia
- Myra B McGuinness
  - Biostatistics Unit, Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, University of Melbourne, Parkville, Victoria, Australia
  - Centre for Eye Research Australia, Royal Victorian Eye and Ear Hospital, Melbourne, Victoria, Australia
- Bibhas Chakraborty
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore
  - Department of Statistics and Applied Probability, Faculty of Science, National University of Singapore, Singapore, Singapore
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina, USA
- John B Carlin
  - Biostatistics Unit, Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, University of Melbourne, Parkville, Victoria, Australia
  - Clinical Epidemiology and Biostatistics Unit, Murdoch Children's Research Institute, Parkville, Victoria, Australia
- Maarten J IJzerman
  - Cancer Health Services Research Unit, University of Melbourne Centre for Cancer Research and Centre for Health Policy, Melbourne School of Population and Global Health, University of Melbourne, Parkville, Victoria, Australia
  - Victorian Comprehensive Cancer Centre, Parkville, Victoria, Australia
  - Peter MacCallum Cancer Centre, Parkville, Victoria, Australia
- Julie A Simpson
  - Biostatistics Unit, Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, University of Melbourne, Parkville, Victoria, Australia
4
Moodie EEM, Stephens DA, Alam S, Zhang MJ, Logan B, Arora M, Spellman S, Krakow EF. A cure-rate model for Q-learning: Estimating an adaptive immunosuppressant treatment strategy for allogeneic hematopoietic cell transplant patients. Biom J 2018; 61:442-453. [PMID: 29766558 DOI: 10.1002/bimj.201700181] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 09/26/2017] [Revised: 02/26/2018] [Accepted: 03/23/2018] [Indexed: 11/11/2022]
Abstract
Cancers treated by transplantation are often curative, but immunosuppressive drugs are required to prevent and (if needed) to treat graft-versus-host disease. Estimation of an optimal adaptive treatment strategy when treatment at either one of two stages of treatment may lead to a cure has not yet been considered. Using a sample of 9563 patients treated for blood and bone cancers by allogeneic hematopoietic cell transplantation drawn from the Center for Blood and Marrow Transplant Research database, we provide a case study of a novel approach to Q-learning for survival data in the presence of a potentially curative treatment, and demonstrate the results differ substantially from an implementation of Q-learning that fails to account for the cure-rate.
Affiliation(s)
- Erica E M Moodie
  - Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, QC, H3A 1A2, Canada
- David A Stephens
  - Department of Mathematics and Statistics, McGill University, Montreal, QC, H3A 1A2, Canada
- Shomoita Alam
  - Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, QC, H3A 1A2, Canada
- Mei-Jie Zhang
  - Medical College of Wisconsin, Milwaukee, WI, 53226, USA
- Brent Logan
  - Medical College of Wisconsin, Milwaukee, WI, 53226, USA
- Mukta Arora
  - Department of Medicine, University of Minnesota, Minneapolis, MN, 55455, USA
- Stephen Spellman
  - Center for International Blood and Marrow Transplant Research, Minneapolis, MN, 55401, USA
5
Krakow EF, Hemmer M, Wang T, Logan B, Arora M, Spellman S, Couriel D, Alousi A, Pidala J, Last M, Lachance S, Moodie EEM. Tools for the Precision Medicine Era: How to Develop Highly Personalized Treatment Recommendations From Cohort and Registry Data Using Q-Learning. Am J Epidemiol 2017; 186:160-172. [PMID: 28472335 DOI: 10.1093/aje/kwx027] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 12/09/2015] [Accepted: 08/02/2017] [Indexed: 01/01/2023] Open
Abstract
Q-learning is a method of reinforcement learning that employs backwards stagewise estimation to identify sequences of actions that maximize some long-term reward. The method can be applied to sequential multiple-assignment randomized trials to develop personalized adaptive treatment strategies (ATSs)-longitudinal practice guidelines highly tailored to time-varying attributes of individual patients. Sometimes, the basis for choosing which ATSs to include in a sequential multiple-assignment randomized trial (or randomized controlled trial) may be inadequate. Nonrandomized data sources may inform the initial design of ATSs, which could later be prospectively validated. In this paper, we illustrate challenges involved in using nonrandomized data for this purpose with a case study from the Center for International Blood and Marrow Transplant Research registry (1995-2007) aimed at 1) determining whether the sequence of therapeutic classes used in graft-versus-host disease prophylaxis and in refractory graft-versus-host disease is associated with improved survival and 2) identifying donor and patient factors with which to guide individualized immunosuppressant selections over time. We discuss how to communicate the potential benefit derived from following an ATS at the population and subgroup levels and how to evaluate its robustness to modeling assumptions. This worked example may serve as a model for developing ATSs from registries and cohorts in oncology and other fields requiring sequential treatment decisions.
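The backwards stagewise estimation described in this abstract can be sketched for a two-stage setting as follows. This is a minimal illustration on simulated, randomized data with linear Q-functions and binary treatments; the function names, the data-generating model, and the model specification are assumptions for the example and are not taken from the registry case study.

```python
import numpy as np

def fit_ols(design, outcome):
    """Least-squares coefficients (the design already contains an intercept)."""
    beta, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return beta

def q_learning_two_stage(x1, a1, x2, a2, y):
    """Backward stagewise Q-learning with linear Q-functions, treatments coded 0/1."""
    ones = np.ones_like(y)
    # Stage 2: regress the outcome on the history and the stage-2 treatment terms.
    history2 = np.column_stack([ones, x1, a1, a1 * x1, x2])
    design2 = np.column_stack([history2, a2, a2 * x2])
    b2 = fit_ols(design2, y)
    blip2 = b2[5] + b2[6] * x2            # estimated stage-2 treatment effect
    # Pseudo-outcome: predicted outcome if the best stage-2 treatment were given.
    pseudo = history2 @ b2[:5] + np.maximum(blip2, 0.0)
    # Stage 1: regress the pseudo-outcome on stage-1 covariate and treatment.
    design1 = np.column_stack([ones, x1, a1, a1 * x1])
    b1 = fit_ols(design1, pseudo)
    return b1, b2

def simulate(n=5000, seed=0):
    """Hypothetical two-stage data with randomized binary treatments."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    a1 = rng.integers(0, 2, size=n).astype(float)
    x2 = 0.5 * x1 + rng.normal(size=n)
    a2 = rng.integers(0, 2, size=n).astype(float)
    y = x1 + x2 + a1 * (0.5 + x1) + a2 * (1.0 - 2.0 * x2) + rng.normal(size=n)
    return x1, a1, x2, a2, y

b1, b2 = q_learning_two_stage(*simulate())
# Estimated rules: treat at stage 2 when b2[5] + b2[6] * x2 > 0,
# and at stage 1 when b1[2] + b1[3] * x1 > 0.
print("stage-2 blip coefficients:", b2[5:], "stage-1 blip coefficients:", b1[2:])
```

With nonrandomized data, as in the abstract, the regressions would also need to adjust for confounders of treatment assignment, which this sketch omits.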
6
Linn KA, Laber EB, Stefanski LA. Interactive Q-learning for Quantiles. J Am Stat Assoc 2017; 112:638-649. [PMID: 28890584 PMCID: PMC5586239 DOI: 10.1080/01621459.2016.1155993] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 07/01/2014] [Revised: 01/01/2016] [Indexed: 12/18/2022]
Abstract
A dynamic treatment regime is a sequence of decision rules, each of which recommends treatment based on features of patient medical history such as past treatments and outcomes. Existing methods for estimating optimal dynamic treatment regimes from data optimize the mean of a response variable. However, the mean may not always be the most appropriate summary of performance. We derive estimators of decision rules for optimizing probabilities and quantiles computed with respect to the response distribution for two-stage, binary treatment settings. This enables estimation of dynamic treatment regimes that optimize the cumulative distribution function of the response at a prespecified point or a prespecified quantile of the response distribution such as the median. The proposed methods perform favorably in simulation experiments. We illustrate our approach with data from a sequentially randomized trial where the primary outcome is remission of depression symptoms.
Affiliation(s)
- Kristin A Linn
  - Department of Biostatistics and Epidemiology, University of Pennsylvania, Philadelphia, PA 19104
- Eric B Laber
  - Department of Statistics, North Carolina State University, Raleigh, NC 27695
- Leonard A Stefanski
  - Department of Statistics, North Carolina State University, Raleigh, NC 27695
7
Wallace MP, Moodie EEM. Doubly-robust dynamic treatment regimen estimation via weighted least squares. Biometrics 2015; 71:636-44. [PMID: 25854539 DOI: 10.1111/biom.12306] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.2] [Received: 08/01/2014] [Revised: 02/01/2015] [Accepted: 02/01/2015] [Indexed: 11/28/2022]
Abstract
Personalized medicine is a rapidly expanding area of health research wherein patient level information is used to inform their treatment. Dynamic treatment regimens (DTRs) are a means of formalizing the sequence of treatment decisions that characterize personalized management plans. Identifying the DTR which optimizes expected patient outcome is of obvious interest and numerous methods have been proposed for this purpose. We present a new approach which builds on two established methods: Q-learning and G-estimation, offering the doubly robust property of the latter but with ease of implementation much more akin to the former. We outline the underlying theory, provide simulation studies that demonstrate the double-robustness and efficiency properties of our approach, and illustrate its use on data from the Promotion of Breastfeeding Intervention Trial.
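A single-stage sketch of the weighted least squares idea may help make the double-robustness claim concrete. It assumes a binary treatment, the weight |A - P(A = 1 | X)| (one choice that balances the covariate across treatment groups), and a simulated data-generating model in which the treatment-free part of the outcome is deliberately misspecified as linear while the propensity model is correct. All function names and simulation settings are hypothetical and are not taken from the paper.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(design, a, n_iter=25):
    """Logistic regression by Newton-Raphson (plain NumPy, no regularization)."""
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = expit(design @ beta)
        gradient = design.T @ (a - p)
        hessian = (design * (p * (1 - p))[:, None]).T @ design
        beta += np.linalg.solve(hessian, gradient)
    return beta

def dwols(x, a, y):
    """Weighted least squares with weights |a - pi_hat(x)|; returns blip coefficients."""
    base = np.column_stack([np.ones_like(x), x])
    pi_hat = expit(base @ fit_logistic(base, a))
    weights = np.abs(a - pi_hat)
    # Outcome model: linear treatment-free part plus blip a * (psi0 + psi1 * x).
    design = np.column_stack([base, a[:, None] * base])
    root_w = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(design * root_w[:, None], y * root_w, rcond=None)
    return coef[2:]

def simulate_confounded(n=20000, seed=1):
    """Hypothetical data: treatment depends on x; treatment-free effect is exp(x)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.5, 1.5, size=n)
    a = rng.binomial(1, expit(x)).astype(float)
    y = np.exp(x) + a * (1.0 - 1.5 * x) + rng.normal(size=n)
    return x, a, y

psi = dwols(*simulate_confounded())
# The true blip coefficients in this simulation are (1.0, -1.5).
print("estimated blip coefficients:", psi)
```

The estimated rule would recommend treatment when `psi[0] + psi[1] * x > 0`. Extending this to several stages requires the pseudo-outcome construction used in the paper, which is not shown here.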
Affiliation(s)
- Michael P Wallace
  - Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, Canada
- Erica E M Moodie
  - Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, Canada
8
Huang X, Ning J, Wahed AS. Optimization of individualized dynamic treatment regimes for recurrent diseases. Stat Med 2014; 33:2363-78. [PMID: 24510534 PMCID: PMC4043865 DOI: 10.1002/sim.6104] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 04/25/2013] [Revised: 01/14/2014] [Accepted: 01/15/2014] [Indexed: 11/10/2022]
Abstract
Patients with cancer or other recurrent diseases may undergo a long process of initial treatment, disease recurrences, and salvage treatments. It is important to optimize the multi-stage treatment sequence in this process to maximally prolong patients' survival. Comparing disease-free survival for each treatment stage overpenalizes disease recurrences but underpenalizes treatment-related mortalities. Moreover, treatment regimes used in practice are dynamic; that is, the choice of next treatment depends on a patient's responses to previous therapies. In this article, using accelerated failure time models, we develop a method to optimize such dynamic treatment regimes. This method utilizes all the longitudinal data collected during the multi-stage process of disease recurrences and treatments, and identifies the optimal dynamic treatment regime for each individual patient by maximizing his or her expected overall survival. We illustrate the application of this method using data from a study of acute myeloid leukemia, for which the optimal treatment strategies for different patient subgroups are identified.
Affiliation(s)
- Xuelin Huang
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77230
- Jing Ning
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77230
- Abdus S. Wahed
  - Department of Biostatistics, The University of Pittsburgh, Pittsburgh, PA 15260