1
Lipkovich I, Svensson D, Ratitch B, Dmitrienko A. Modern approaches for evaluating treatment effect heterogeneity from clinical trials and observational data. Stat Med 2024. [PMID: 39054669 DOI: 10.1002/sim.10167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 05/28/2024] [Accepted: 06/21/2024] [Indexed: 07/27/2024]
Abstract
In this paper, we review recent advances in statistical methods for the evaluation of the heterogeneity of treatment effects (HTE), including subgroup identification and estimation of individualized treatment regimens, from randomized clinical trials and observational studies. We identify several types of approaches using the features introduced in Lipkovich et al (Stat Med 2017;36: 136-196) that distinguish the recommended principled methods from basic methods for HTE evaluation that typically rely on rules of thumb and general guidelines (the methods are often referred to as common practices). We discuss the advantages and disadvantages of various principled methods as well as common measures for evaluating their performance. We use simulated data and a case study based on a historical clinical trial to illustrate several new approaches to HTE evaluation.
Affiliation(s)
- Ilya Lipkovich
  - Advanced Analytics and Access Capabilities, Eli Lilly and Company, Indianapolis, Indiana, USA
- David Svensson
  - Statistical Innovation, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
- Bohdana Ratitch
  - Clinical Statistics and Analytics, Research & Development, Pharmaceuticals, Bayer Inc., Mississauga, Ontario, Canada
- Alex Dmitrienko
  - Department of Biostatistics, Mediana, San Juan, Puerto Rico, USA
2
Guo W, Zhou XH, Ma S. Estimation of Optimal Individualized Treatment Rules Using a Covariate-Specific Treatment Effect Curve With High-Dimensional Covariates. J Am Stat Assoc 2021. [DOI: 10.1080/01621459.2020.1865167] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/16/2022]
Affiliation(s)
- Wenchuan Guo
  - Department of Statistics, University of California Riverside, Riverside, CA
  - Global Biometric Sciences, Bristol-Myers Squibb, Pennington, NJ
- Xiao-Hua Zhou
  - Beijing International Center for Mathematical Research, and Department of Biostatistics, Peking University, Beijing, China
- Shujie Ma
  - Department of Statistics, University of California Riverside, Riverside, CA
3
Huang Y, Cho J, Fong Y. Threshold-based subgroup testing in logistic regression models in two-phase sampling designs. J R Stat Soc Ser C Appl Stat 2021; 70:291-311. [PMID: 33840863 PMCID: PMC8032557 DOI: 10.1111/rssc.12459] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
Abstract
The effect of treatment on a binary disease outcome can differ across subgroups characterized by other covariates. Testing for the existence of subgroups that are associated with heterogeneous treatment effects can provide valuable insight regarding the optimal treatment recommendation in practice. Our research in this paper is motivated by the question of whether host genetics could modify a vaccine's effect on HIV acquisition risk. To answer this question, we used data from an HIV vaccine trial with a two-phase sampling design and developed a general threshold-based model framework to test for the existence of subgroups associated with the heterogeneity in disease risks, allowing for subgroups based on multivariate covariates. We developed a testing procedure based on the maximum of likelihood-ratio statistics over change planes and demonstrated its advantage over alternative methods. We further developed the testing procedure to account for biased sampling of expensive (i.e. resource-intensive to measure) covariates through the incorporation of inverse probability weighting techniques. We used the proposed method to analyze the motivating HIV vaccine trial data. Our proposed testing procedure also has broad applications in epidemiological studies for assessing heterogeneity in disease risk with respect to univariate or multivariate predictors.
Affiliation(s)
- Ying Huang
  - Biostatistics, Bioinformatics, & Epidemiology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109
- Juhee Cho
  - Biostatistics, Bioinformatics, & Epidemiology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109
- Youyi Fong
  - Biostatistics, Bioinformatics, & Epidemiology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109
4
Huang Y, Zhou XH. Identification of the optimal treatment regimen in the presence of missing covariates. Stat Med 2020; 39:353-368. [PMID: 31774192 PMCID: PMC6954309 DOI: 10.1002/sim.8407] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2018] [Revised: 09/25/2019] [Accepted: 09/27/2019] [Indexed: 12/25/2022]
Abstract
Covariates associated with treatment-effect heterogeneity can potentially be used to make personalized treatment recommendations towards best clinical outcomes. Methods for treatment-selection rule development that directly maximize treatment-selection benefits have attracted much interest in recent years, due to the robustness of these methods to outcome modeling. In practice, the task of treatment-selection rule development can be further complicated by missingness in data. Here, we consider the identification of optimal treatment-selection rules for a binary disease outcome when measurements of an important covariate from study participants are partly missing. Under the missing at random assumption, we develop a robust estimator of treatment-selection rules under the direct-optimization paradigm. This estimator targets the maximum selection benefits to the population under correct specification of at least one mechanism from each of two sets: (1) the missing-data mechanism or the conditional covariate distribution, and (2) the treatment assignment or the disease outcome model. We evaluate and compare the performance of the proposed estimator with alternative direct-optimization estimators through extensive simulation studies. We demonstrate the application of the proposed method through a real data example from an Alzheimer's disease study for developing covariate combinations to guide the treatment of Alzheimer's disease.
Affiliation(s)
- Ying Huang
  - Vaccine & Infectious Diseases Division, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
- Xiao-Hua Zhou
  - Department of Biostatistics, Peking University, Beijing, China
5
Dasgupta S, Huang Y. Selecting Biomarkers for building optimal treatment selection rules using Kernel Machines. J R Stat Soc Ser C Appl Stat 2020; 69:69-88. [PMID: 32921837 PMCID: PMC7485396 DOI: 10.1111/rssc.12379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
Optimal biomarker combinations for treatment-selection can be derived by minimizing total burden to the population caused by the targeted disease and its treatment. However, when multiple biomarkers are present, including all in the model can be expensive and hurt model performance. To remedy this, we consider feature selection in optimization by minimizing an extended total burden that additionally incorporates biomarker costs. Formulating it as a 0-norm penalized weighted-classification, we develop various procedures for estimating linear and nonlinear combinations. Through simulations and a real data example, we demonstrate the importance of incorporating feature-selection and marker cost when deriving treatment-selection rules.
Affiliation(s)
- Sayan Dasgupta
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, USA
- Ying Huang
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, USA
6
Huang X, Goldberg Y, Xu J. Multicategory individualized treatment regime using outcome weighted learning. Biometrics 2019; 75:1216-1227. [PMID: 31095722 DOI: 10.1111/biom.13084] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 08/30/2018] [Accepted: 05/09/2019] [Indexed: 12/01/2022]
Abstract
Individualized treatment regimes (ITRs) aim to recommend treatments based on patient-specific characteristics in order to maximize the expected clinical outcome. Outcome weighted learning approaches have been proposed for this optimization problem with primary focus on the binary treatment case. Many require assumptions of the outcome value or the randomization mechanism. In this paper, we propose a general framework for multicategory ITRs using generic surrogate risk. The proposed method accommodates the situations when the outcome takes negative value and/or when the propensity score is unknown. Theoretical results about Fisher consistency, excess risk, and risk consistency are established. In practice, we recommend using differentiable convex loss for computational optimization. We demonstrate the superiority of the proposed method under multinomial deviance risk to some existing methods by simulation and application on data from a clinical trial.
Affiliation(s)
- Xinyang Huang
  - Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, and School of Statistics, East China Normal University, Shanghai, China
- Yair Goldberg
  - The Faculty of Industrial Engineering and Management, Technion-Israel Institute of Technology, Haifa, Israel
- Jin Xu
  - Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, and School of Statistics, East China Normal University, Shanghai, China
7
Litman T. Personalized medicine: concepts, technologies, and applications in inflammatory skin diseases. APMIS 2019; 127:386-424. [PMID: 31124204 PMCID: PMC6851586 DOI: 10.1111/apm.12934] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 01/11/2019] [Accepted: 01/31/2019] [Indexed: 12/19/2022]
Abstract
The current state, tools, and applications of personalized medicine with special emphasis on inflammatory skin diseases like psoriasis and atopic dermatitis are discussed. Inflammatory pathways are outlined as well as potential targets for monoclonal antibodies and small-molecule inhibitors.
Affiliation(s)
- Thomas Litman
  - Department of Immunology and Microbiology, University of Copenhagen, Copenhagen, Denmark
  - Explorative Biology, Skin Research, LEO Pharma A/S, Ballerup, Denmark
8
Sies A, Demyttenaere K, Van Mechelen I. Studying treatment-effect heterogeneity in precision medicine through induced subgroups. J Biopharm Stat 2019; 29:491-507. [PMID: 30794033 DOI: 10.1080/10543406.2019.1579220] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 02/05/2023]
Abstract
Precision medicine, in the sense of tailoring the choice of medical treatment to patients' pretreatment characteristics, is nowadays gaining a lot of attention. Preferably, this tailoring should be realized in an evidence-based way, with key evidence in this regard pertaining to subgroups of patients that respond differentially to treatment (i.e., to subgroups involved in treatment-subgroup interactions). Often a-priori hypotheses on subgroups involved in treatment-subgroup interactions are lacking or are incomplete at best. Therefore, methods are needed that can induce such subgroups from empirical data on treatment effectiveness in a post hoc manner. Recently, quite a few such methods have been developed. So far, however, there is little empirical experience in their usage. This may be problematic for medical statisticians and statistically minded medical researchers, as many (nontrivial) choices have to be made during the data-analytic process. The main purpose of this paper is to discuss the major concepts and considerations when using these methods. This discussion will be based on a systematic, conceptual, and technical analysis of the type of research questions at play, and of the type of data that the methods can handle along with the available software, and a review of available empirical evidence. We will illustrate all this with the analysis of a dataset comparing several anti-depressant treatments.
Affiliation(s)
- Aniek Sies
  - Faculty of Psychology and Educational Sciences, KU Leuven, Leuven, Belgium
- Iven Van Mechelen
  - Faculty of Psychology and Educational Sciences, KU Leuven, Leuven, Belgium
9
Abstract
There is a growing interest in the development of statistical methods for personalized medicine or precision medicine, especially for deriving optimal individualized treatment rules (ITRs). An ITR recommends a treatment for a patient based on the patient's characteristics. The common parametric methods for deriving an optimal ITR, which model the clinical endpoint as a function of the patient's characteristics, can have suboptimal performance when the conditional mean model is misspecified. Recent methodology development has cast the problem of deriving an optimal ITR under a weighted classification framework. Under this weighted classification framework, we develop a weighted random forests (W-RF) algorithm that derives an optimal ITR nonparametrically. In addition, with the W-RF algorithm, we propose variable importance measures for quantifying the relative relevance of the patient's characteristics to treatment selection, and an out-of-bag estimator for the population average outcome under the estimated optimal ITR. Our proposed methods are evaluated through intensive simulation studies. We illustrate the application of our methods using data from the Clinical Antipsychotic Trials of Intervention Effectiveness Alzheimer's Disease Study (CATIE-AD).
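The "population average outcome under an ITR" mentioned in this abstract is often estimated by inverse probability weighting. The sketch below is only an illustration of that general idea, not the W-RF algorithm or the authors' out-of-bag estimator; the function name `ipw_value`, the toy data, and the assumed 1:1 randomization probability are all hypothetical.

```python
def ipw_value(rule, covariates, treatments, outcomes, propensity=0.5):
    """Inverse-probability-weighted estimate of the mean outcome under `rule`.

    rule:        function mapping a covariate value to a treatment (0 or 1)
    propensity:  assumed known probability of receiving treatment 1
    Only subjects whose observed treatment agrees with the rule contribute,
    each weighted by the inverse probability of the treatment they received.
    """
    n = len(outcomes)
    total = 0.0
    for x, a, y in zip(covariates, treatments, outcomes):
        if rule(x) == a:
            p = propensity if a == 1 else 1.0 - propensity
            total += y / p
    return total / n


# Hypothetical toy data from a 1:1 randomized trial.
covariates = [0.2, 0.8, 0.5, 0.9]
treatments = [1, 0, 0, 1]
outcomes = [1.0, 2.0, 3.0, 4.0]

# Candidate rules: treat only when the covariate exceeds 0.6, or treat everyone.
value_threshold = ipw_value(lambda x: 1 if x > 0.6 else 0,
                            covariates, treatments, outcomes)
value_treat_all = ipw_value(lambda x: 1, covariates, treatments, outcomes)
```

On this toy data the threshold rule has the larger estimated value, which is the sense in which one rule is preferred to another under this criterion.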
Affiliation(s)
- Kehao Zhu
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- Ying Huang
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
  - Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Xiao-Hua Zhou
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
10
Lipkovich I, Dmitrienko A, Muysers C, Ratitch B. Multiplicity issues in exploratory subgroup analysis. J Biopharm Stat 2017; 28:63-81. [DOI: 10.1080/10543406.2017.1397009] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 10/18/2022]
11
Qiu X, Zeng D, Wang Y. Estimation and evaluation of linear individualized treatment rules to guarantee performance. Biometrics 2017; 74:517-528. [PMID: 28960239 DOI: 10.1111/biom.12773] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 02/01/2017] [Revised: 08/01/2017] [Accepted: 08/01/2017] [Indexed: 11/30/2022]
Abstract
In clinical practice, an informative and practically useful treatment rule should be simple and transparent. However, because simple rules are likely to be far from optimal, effective methods to construct such rules must guarantee performance, in terms of yielding the best clinical outcome (highest reward) among the class of simple rules under consideration. Furthermore, it is important to evaluate the benefit of the derived rules on the whole sample and in pre-specified subgroups (e.g., vulnerable patients). To achieve both goals, we propose a robust machine learning method to estimate a linear treatment rule that is guaranteed to achieve optimal reward among the class of all linear rules. We then develop a diagnostic measure and inference procedure to evaluate the benefit of the obtained rule and compare it with the rules estimated by other methods. We provide theoretical justification for the proposed method and its inference procedure, and we demonstrate via simulations its superior performance when compared to existing methods. Lastly, we apply the method to the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) trial on major depressive disorder and show that the estimated optimal linear rule provides a large benefit for mildly depressed and severely depressed patients but manifests a lack-of-fit for moderately depressed patients.
Affiliation(s)
- Xin Qiu
  - Department of Biostatistics, Columbia University, New York, NY, U.S.A
- Donglin Zeng
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- Yuanjia Wang
  - Department of Biostatistics, Columbia University, New York, NY, U.S.A
12
Abstract
In this paper, we consider the generalized linear models (GLMs) [Formula: see text], where [Formula: see text] is a continuously differentiable function and [Formula: see text] are dependent errors. We obtain the M-estimator [Formula: see text] of [Formula: see text] from the following equation: [Formula: see text], where [Formula: see text] is assumed to be a convex function. We also show the linear representation and asymptotic normality of the estimator, which extend the corresponding results of Wu et al. (M-estimation of linear models with dependent errors, Ann. Statist. 2007) to GLMs.
Affiliation(s)
- Zhen Zeng
  - College of Mathematics and Statistics, Hubei Normal University, Hubei Huangshi, Hubei, 0086/435000, P. R. China
- Hongchang Hu
  - College of Mathematics and Statistics, Hubei Normal University, Hubei Huangshi, Hubei, 0086/435000, P. R. China
13
Lipkovich I, Dmitrienko A, D'Agostino RB Sr. Tutorial in biostatistics: data-driven subgroup identification and analysis in clinical trials. Stat Med 2016; 36:136-196. [PMID: 27488683 DOI: 10.1002/sim.7064] [Citation(s) in RCA: 165] [Impact Index Per Article: 20.6] [Received: 08/04/2015] [Revised: 06/23/2016] [Accepted: 07/05/2016] [Indexed: 02/05/2023]
Abstract
It is well known that both the direction and magnitude of the treatment effect in clinical trials are often affected by baseline patient characteristics (generally referred to as biomarkers). Characterization of treatment effect heterogeneity plays a central role in the field of personalized medicine and facilitates the development of tailored therapies. This tutorial focuses on a general class of problems arising in data-driven subgroup analysis, namely, identification of biomarkers with strong predictive properties and patient subgroups with desirable characteristics such as improved benefit and/or safety. Limitations of ad-hoc approaches to biomarker exploration and subgroup identification in clinical trials are discussed, and the ad-hoc approaches are contrasted with principled approaches to exploratory subgroup analysis based on recent advances in machine learning and data mining. A general framework for evaluating predictive biomarkers and identification of associated subgroups is introduced. The tutorial provides a review of a broad class of statistical methods used in subgroup discovery, including global outcome modeling methods, global treatment effect modeling methods, optimal treatment regimes, and local modeling methods. Commonly used subgroup identification methods are illustrated using two case studies based on clinical trials with binary and survival endpoints. Copyright © 2016 John Wiley & Sons, Ltd.
Affiliation(s)
- Ralph B. D'Agostino Sr.
  - Boston University, Boston, MA, U.S.A
14
Xu T, Fang Y, Rong A, Wang J. Flexible combination of multiple diagnostic biomarkers to improve diagnostic accuracy. BMC Med Res Methodol 2015; 15:94. [PMID: 26521228 PMCID: PMC4628350 DOI: 10.1186/s12874-015-0085-z] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 07/11/2015] [Accepted: 10/17/2015] [Indexed: 12/24/2022]
Abstract
BACKGROUND In medical research, it is common to collect information on multiple continuous biomarkers to improve the accuracy of diagnostic tests. Combining the measurements of these biomarkers into one single score is a popular practice to integrate the collected information, where the accuracy of the resultant diagnostic test is usually improved. To measure the accuracy of a diagnostic test, the Youden index has been widely used in the literature. Various parametric and nonparametric methods have been proposed to linearly combine biomarkers so that the corresponding Youden index can be optimized. Yet there seems to be little justification for enforcing such a linear combination. METHODS This paper proposes a flexible approach that allows both linear and nonlinear combinations of biomarkers. The proposed approach formulates the problem in a large margin classification framework, where the combination function is embedded in a flexible reproducing kernel Hilbert space. RESULTS Advantages of the proposed approach are demonstrated in a variety of simulated experiments as well as a real application to a liver disorder study. CONCLUSION Linear combinations of multiple diagnostic biomarkers are widely used without proper justification. Additional research on flexible frameworks allowing both linear and nonlinear combinations is needed.
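For readers unfamiliar with the criterion named in this abstract, the Youden index of a score at a given cutoff is sensitivity plus specificity minus one. The following is a minimal sketch of that definition applied to a fixed linear combination of two biomarkers; it is not the kernel-based method of the paper, and the function names, weights, and data are illustrative assumptions.

```python
def youden_index(scores, labels, cutoff):
    """Youden index J = sensitivity + specificity - 1 at a given cutoff.

    A subject is called test-positive when score > cutoff.
    labels: 1 = diseased, 0 = healthy (both classes must be present).
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s <= cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s <= cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s > cutoff)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


def best_cutoff(scores, labels):
    """Search the observed scores for the cutoff that maximizes J."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda c: youden_index(scores, labels, c))


# Hypothetical data: two biomarkers per subject and a disease label.
markers = [(0.2, 1.0), (0.4, 0.8), (0.9, 0.3), (1.1, 0.2), (1.5, 0.1), (0.3, 0.9)]
labels = [0, 0, 1, 1, 1, 0]
weights = (1.0, -0.5)  # assumed linear combination, chosen by hand

scores = [weights[0] * a + weights[1] * b for a, b in markers]
cutoff = best_cutoff(scores, labels)
j = youden_index(scores, labels, cutoff)
```

Optimizing the weights themselves (rather than fixing them by hand as here) is the harder problem that the linear and nonlinear combination methods in the abstract address.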
Affiliation(s)
- Tu Xu
  - Gilead Sciences Inc., 333 Lakeside Dr, Foster City, 94404, CA, USA
- Yixin Fang
  - Division of Biostatistics, Department of Population Health, New York University, New York, USA
- Alan Rong
  - Astellas Pharma Inc., Northbrook, USA
- Junhui Wang
  - Department of Mathematics, City University of Hong Kong, Kowloon Tong, Hong Kong
15
Huang Y. Identifying optimal biomarker combinations for treatment selection through randomized controlled trials. Clin Trials 2015; 12:348-56. [PMID: 25948620 PMCID: PMC4506270 DOI: 10.1177/1740774515580126] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 01/05/2023]
Abstract
BACKGROUND/AIMS Biomarkers associated with treatment-effect heterogeneity can be used to make treatment recommendations that optimize individual clinical outcomes. To accomplish this, statistical methods are needed to generate marker-based treatment-selection rules that can most effectively reduce the population burden due to disease and treatment. Compared to the standard approach of risk modeling to derive treatment-selection rules, a more robust approach is to directly minimize an unbiased estimate of total disease and treatment burden among a pre-specified class of rules. This problem is one of minimizing a weighted sum of 0-1 loss function, which is computationally challenging to solve due to the nonsmoothness of 0-1 loss. Huang and Fong, among others, proposed a method that uses the Ramp loss to approximate the 0-1 loss and solves the minimization problem through repetitive constrained optimizations. The algorithm was shown to have comparable or better performance than other comparative estimators in various settings. Our aim in this article is to further extend the algorithm to allow for variable selection in the presence of a large number of candidate markers. METHODS We develop an alternative method to derive marker combinations to minimize the weighted sum of Ramp loss in Huang and Fong, based on data from randomized trials. The new algorithm estimates treatment-selection rules by repetitively minimizing a smooth and differentiable objective function. Through the use of an L1 penalty, we expand the method to allow for feature selection and develop an algorithm based on the coordinate descent method to build the treatment-selection rule. 
RESULTS Through extensive simulation studies, we compared performance of the proposed estimator to four existing approaches: (1) a logistic regression risk modeling approach, and three other "direct optimizing" approaches including (2) the estimator in Huang and Fong, (3) the weighted support vector machine, and (4) the weighted logistic regression. The proposed estimator performs comparably to that of Huang and Fong, and comparably or better than other estimators. Allowing for variable selection using the proposed estimator in the presence of a large number of markers further improves treatment-selection performance. The proposed estimator is also advantageous for selecting variables relevant to treatment selection compared to L1 penalized logistic regression and weighted logistic regression. We illustrate the application of the proposed methods in host-genetics data from an HIV vaccine trial. CONCLUSION The proposed estimator is appealing considering its effectiveness and conceptual simplicity. It has significant potential to contribute to the selection and combination of biomarkers for treatment selection in clinical practice.
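The abstract above describes replacing the nonsmooth 0-1 loss with a ramp loss inside a weighted sum. The sketch below shows one common parameterization of the ramp loss (a hinge loss truncated at 1) next to the 0-1 loss; this particular form, the function names, and the toy margins and weights are assumptions for illustration and may differ from the exact loss used by Huang and Fong.

```python
def zero_one_loss(margin):
    """0-1 loss: 1 when the margin is non-positive (misclassified), else 0."""
    return 1.0 if margin <= 0 else 0.0


def ramp_loss(margin):
    """Ramp loss (assumed form): 1 for margin <= 0, falling linearly to 0 at margin = 1."""
    return min(1.0, max(0.0, 1.0 - margin))


def weighted_risk(loss, weights, margins):
    """Weighted sum of a loss over subjects, as in weighted classification."""
    return sum(w * loss(m) for w, m in zip(weights, margins))


# Hypothetical margins y_i * f(x_i) and non-negative subject weights.
margins = [-2.0, -0.5, 0.0, 0.25, 1.0, 3.0]
weights = [1.0, 2.0, 1.0, 4.0, 1.0, 1.0]

risk_01 = weighted_risk(zero_one_loss, weights, margins)
risk_ramp = weighted_risk(ramp_loss, weights, margins)
```

In this parameterization the ramp loss is continuous, bounded by 1, and never smaller than the 0-1 loss, so it penalizes correctly classified points that lie close to the decision boundary while capping the influence of badly misclassified ones.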
Affiliation(s)
- Ying Huang
  - Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
  - Department of Biostatistics, University of Washington, Seattle, WA, USA