1
Lipkovich I, Svensson D, Ratitch B, Dmitrienko A. Modern approaches for evaluating treatment effect heterogeneity from clinical trials and observational data. Stat Med 2024. [PMID: 39054669 DOI: 10.1002/sim.10167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 05/28/2024] [Accepted: 06/21/2024] [Indexed: 07/27/2024]
Abstract
In this paper, we review recent advances in statistical methods for the evaluation of the heterogeneity of treatment effects (HTE), including subgroup identification and estimation of individualized treatment regimens, from randomized clinical trials and observational studies. We identify several types of approaches using the features introduced in Lipkovich et al. (Stat Med 2017;36:136-196) that distinguish the recommended principled methods from basic methods for HTE evaluation, which typically rely on rules of thumb and general guidelines (and are often referred to as common practices). We discuss the advantages and disadvantages of various principled methods as well as common measures for evaluating their performance. We use simulated data and a case study based on a historical clinical trial to illustrate several new approaches to HTE evaluation.
Affiliation(s)
- Ilya Lipkovich
  - Advanced Analytics and Access Capabilities, Eli Lilly and Company, Indianapolis, Indiana, USA
- David Svensson
  - Statistical Innovation, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
- Bohdana Ratitch
  - Clinical Statistics and Analytics, Research & Development, Pharmaceuticals, Bayer Inc., Mississauga, Ontario, Canada
- Alex Dmitrienko
  - Department of Biostatistics, Mediana, San Juan, Puerto Rico, USA
2
MacNell N, Feinstein L, Wilkerson J, Salo PM, Molsberry SA, Fessler MB, Thorne PS, Motsinger-Reif AA, Zeldin DC. Implementing machine learning methods with complex survey data: Lessons learned on the impacts of accounting sampling weights in gradient boosting. PLoS One 2023; 18:e0280387. [PMID: 36638125 PMCID: PMC9838837 DOI: 10.1371/journal.pone.0280387] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 09/28/2022] [Accepted: 12/28/2022] [Indexed: 01/14/2023] [Open access]
Abstract
Despite the prominent use of complex survey data and the growing popularity of machine learning methods in epidemiologic research, few machine learning software implementations offer options for handling complex samples. A major challenge impeding the broader incorporation of machine learning into epidemiologic research is incomplete guidance for analyzing complex survey data, including the importance of sampling weights for valid prediction in target populations. Using data from 15,820 participants in the 1988-1994 National Health and Nutrition Examination Survey cohort, we determined whether ignoring weights in gradient boosting models of all-cause mortality affected prediction, as measured by the F1 score and corresponding 95% confidence intervals. In simulations, we additionally assessed the impact of sample size, weight variability, predictor strength, and model dimensionality. In the National Health and Nutrition Examination Survey data, unweighted model performance was inflated compared to the weighted model (F1 score 81.9% [95% confidence interval: 81.2%, 82.7%] vs 77.4% [95% confidence interval: 76.1%, 78.6%]). However, the error was mitigated if the F1 score was subsequently recalculated with observed outcomes from the weighted dataset (F1: 77.0%; 95% confidence interval: 75.7%, 78.4%). In simulations, this finding held in the largest sample size (N = 10,000) under all analytic conditions assessed. For sample sizes <5,000, sampling weights had little impact in simulations that more closely resembled a simple random sample (low weight variability) or in models with strong predictors, but findings were inconsistent under other analytic scenarios. Failing to account for sampling weights in gradient boosting models may limit generalizability for data from complex surveys, depending on sample size and other analytic properties. In the absence of software for configuring weighted algorithms, post-hoc recalculations of unweighted model performance using weighted observed outcomes may more accurately reflect model prediction in target populations than ignoring weights entirely.
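The post-hoc recalculation described in this abstract can be illustrated with a small sketch. It assumes binary outcomes coded 0/1 and one sampling weight per participant, and it tallies the confusion matrix with those weights before computing F1. The function name `weighted_f1` and the toy data are hypothetical; this is one plausible reading of "recalculating with weighted observed outcomes", not the authors' code.

```python
from typing import Optional, Sequence


def weighted_f1(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """F1 score for binary labels, with each observation counted by its weight.

    With weights=None every observation counts once (the unweighted F1).
    """
    if weights is None:
        weights = [1.0] * len(y_true)
    if not (len(y_true) == len(y_pred) == len(weights)):
        raise ValueError("y_true, y_pred and weights must have equal length")

    tp = fp = fn = 0.0
    for truth, pred, w in zip(y_true, y_pred, weights):
        if pred == 1 and truth == 1:
            tp += w  # weighted true positives
        elif pred == 1 and truth == 0:
            fp += w  # weighted false positives
        elif pred == 0 and truth == 1:
            fn += w  # weighted false negatives

    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom > 0 else 0.0


# Toy data (hypothetical): predictions from a model fitted without weights.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1]
sampling_weights = [1.0, 4.0, 1.0, 1.0, 1.0]  # second person represents more of the population

unweighted = weighted_f1(y_true, y_pred)
reweighted = weighted_f1(y_true, y_pred, sampling_weights)
```

In this toy case the one missed positive carries a large sampling weight, so the reweighted F1 is lower than the unweighted one, which mirrors the direction of the inflation reported above.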
Affiliation(s)
- Nathaniel MacNell
  - Social & Scientific Systems, a DLH Holdings Company, Durham, North Carolina, United States of America
- Lydia Feinstein
  - Social & Scientific Systems, a DLH Holdings Company, Durham, North Carolina, United States of America
  - Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, North Carolina, United States of America
- Jesse Wilkerson
  - Social & Scientific Systems, a DLH Holdings Company, Durham, North Carolina, United States of America
- Päivi M. Salo
  - Division of Intramural Research, National Institute of Environmental Health Sciences, National Institutes of Health, Durham, North Carolina, United States of America
- Samantha A. Molsberry
  - Social & Scientific Systems, a DLH Holdings Company, Durham, North Carolina, United States of America
- Michael B. Fessler
  - Division of Intramural Research, National Institute of Environmental Health Sciences, National Institutes of Health, Durham, North Carolina, United States of America
- Peter S. Thorne
  - Department of Occupational and Environmental Health, University of Iowa, College of Public Health, Iowa City, Iowa, United States of America
- Alison A. Motsinger-Reif
  - Division of Intramural Research, National Institute of Environmental Health Sciences, National Institutes of Health, Durham, North Carolina, United States of America
- Darryl C. Zeldin
  - Division of Intramural Research, National Institute of Environmental Health Sciences, National Institutes of Health, Durham, North Carolina, United States of America
3
Hu B, Wang C, Jiang K, Shen Z, Yang X, Yin M, Liang B, Xie Q, Ye Y, Gao Z. Development and validation of a novel diagnostic model for initially clinical diagnosed gastrointestinal stromal tumors using an extreme gradient-boosting machine. BMC Gastroenterol 2021; 21:481. [PMID: 34922474 PMCID: PMC8684147 DOI: 10.1186/s12876-021-02048-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/05/2021] [Accepted: 12/01/2021] [Indexed: 12/17/2022] [Open access]
Abstract
Introduction: Gastrointestinal stromal tumor (GIST) is the most common gastrointestinal soft tissue tumor. Clinical diagnosis mainly relies on enhanced CT, endoscopy and endoscopic ultrasound (EUS), but the misdiagnosis rate is still high without fine needle aspiration biopsy. We aim to develop a novel diagnostic model by analyzing the preoperative data of the patients.
Methods: We used the data of patients who were initially diagnosed with gastric GIST and underwent partial gastrectomy. The patients were randomly divided into a training dataset and a test dataset at a ratio of 3 to 1. After pre-experimental screening, max depth = 2, eta = 0.1, gamma = 0.5, and nrounds = 200 were defined as the best parameters, and in this way we developed the initial extreme gradient-boosting (XGBoost) model. Based on the importance of the features in the initial model, we improved the model by excluding the hematological features. In this way we obtained the final XGBoost model, which was validated using the test dataset.
Results: In the initial XGBoost model, we found that the hematological indicators (including inflammation and nutritional indicators) examined before the surgery had little effect on the outcome, so we subsequently excluded them. Similarly, we also screened the features from enhanced CT and ultrasound gastroscopy, and finally determined the 6 most important predictors for GIST diagnosis: the ratio of long to short diameter under CT, the CT value of the tumor, the enhancement of the tumor in the arterial period and the venous period, and the existence of a liquid area and a calcific area inside the tumor under EUS. Round or round-like tumors with a CT value of around 30 (25–37) and delayed enhancement, as well as a liquid but not calcific area inside the tumor, best indicate the diagnosis of GIST.
Conclusions: We developed a model to further differentiate GIST from other tumors in patients with an initial clinical diagnosis of gastric GIST by analyzing the results of clinical examinations that most patients should have completed before surgical resection.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12876-021-02048-1.
Affiliation(s)
- Bozhi Hu
  - Department of Gastrointestinal Surgery, Peking University People's Hospital, No.11 Xizhimen South Street, Xicheng District, Beijing, 100044, China
- Chao Wang
  - Department of Gastrointestinal Surgery, Peking University People's Hospital, No.11 Xizhimen South Street, Xicheng District, Beijing, 100044, China
- Kewei Jiang
  - Department of Gastrointestinal Surgery, Peking University People's Hospital, No.11 Xizhimen South Street, Xicheng District, Beijing, 100044, China
- Zhanlong Shen
  - Department of Gastrointestinal Surgery, Peking University People's Hospital, No.11 Xizhimen South Street, Xicheng District, Beijing, 100044, China
- Xiaodong Yang
  - Department of Gastrointestinal Surgery, Peking University People's Hospital, No.11 Xizhimen South Street, Xicheng District, Beijing, 100044, China
- Mujun Yin
  - Department of Gastrointestinal Surgery, Peking University People's Hospital, No.11 Xizhimen South Street, Xicheng District, Beijing, 100044, China
- Bin Liang
  - Department of Gastrointestinal Surgery, Peking University People's Hospital, No.11 Xizhimen South Street, Xicheng District, Beijing, 100044, China
- Qiwei Xie
  - Department of Gastrointestinal Surgery, Peking University People's Hospital, No.11 Xizhimen South Street, Xicheng District, Beijing, 100044, China
- Yingjiang Ye
  - Department of Gastrointestinal Surgery, Peking University People's Hospital, No.11 Xizhimen South Street, Xicheng District, Beijing, 100044, China
- Zhidong Gao
  - Department of Gastrointestinal Surgery, Peking University People's Hospital, No.11 Xizhimen South Street, Xicheng District, Beijing, 100044, China
4
Cunningham L, Ganier C, Ferguson F, White IR, Watt FM, McFadden J, Lynch MD. Gradient boosting approaches can outperform logistic regression for risk prediction in cutaneous allergy. Contact Dermatitis 2021; 86:165-174. [PMID: 34812539 DOI: 10.1111/cod.14011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2021] [Revised: 09/27/2021] [Accepted: 10/13/2021] [Indexed: 11/28/2022]
Abstract
Background: Contact allergy is a major clinical and public health challenge. It is important to identify individuals who are at risk and perform patch testing to identify relevant allergens. Predicting clinical risk on the basis of input parameters is common in clinical medicine and traditionally has been achieved with linear models.
Objectives: We hypothesized that the risk of a clinically relevant positive patch test could be predicted according to clinical and demographic parameters.
Methods: We compared the predictive accuracy of logistic regression with more sophisticated machine learning approaches such as gradient boosting, in the prediction of patch testing results.
Results: We found that both logistic regression and more sophisticated machine learning approaches were able to predict the risk of positive patch tests. For certain predictions, including the overall risk of a clinically relevant positive patch test, gradient boosting approaches can outperform logistic regression.
Conclusions: These findings suggest that complex nonlinear interactions between input variables are relevant in risk prediction. While a risk prediction model cannot replace the judgment of an experienced clinician, quantifying the risk of a clinically relevant positive patch test result has the potential to assist in decision making and to inform discussions with patients.
Affiliation(s)
- Louise Cunningham
  - Department of Cutaneous Allergy, St John's Institute of Dermatology, Guy's Hospital, London, UK
- Clarisse Ganier
  - Centre for Stem Cells and Regenerative Medicine, King's College London, Guy's Hospital, London, UK
- Felicity Ferguson
  - Department of Cutaneous Allergy, St John's Institute of Dermatology, Guy's Hospital, London, UK
- Ian R White
  - Department of Cutaneous Allergy, St John's Institute of Dermatology, Guy's Hospital, London, UK
- Fiona M Watt
  - Centre for Stem Cells and Regenerative Medicine, King's College London, Guy's Hospital, London, UK
- John McFadden
  - Department of Cutaneous Allergy, St John's Institute of Dermatology, Guy's Hospital, London, UK
- Magnus D Lynch
  - Department of Cutaneous Allergy, St John's Institute of Dermatology, Guy's Hospital, London, UK
  - Centre for Stem Cells and Regenerative Medicine, King's College London, Guy's Hospital, London, UK
5
Okagbue HI, Adamu PI, Oguntunde PE, Obasi ECM, Odetunmibi OA. Machine learning prediction of breast cancer survival using age, sex, length of stay, mode of diagnosis and location of cancer. Health and Technology 2021. [DOI: 10.1007/s12553-021-00572-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/15/2022]
6
Zhang P, Ma J, Chen X, Shentu Y. A nonparametric method for value function guided subgroup identification via gradient tree boosting for censored survival data. Stat Med 2020; 39:4133-4146. [PMID: 32786155 DOI: 10.1002/sim.8714] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/09/2020] [Revised: 06/08/2020] [Accepted: 07/09/2020] [Indexed: 11/07/2022]
Abstract
In randomized clinical trials with survival outcome, there has been an increasing interest in subgroup identification based on baseline genomic, proteomic markers, or clinical characteristics. Some of the existing methods identify subgroups that benefit substantially from the experimental treatment by directly modeling outcomes or treatment effect. When the goal is to find an optimal treatment for a given patient rather than finding the right patient for a given treatment, methods under the individualized treatment regime framework estimate an individualized treatment rule that would lead to the best expected clinical outcome as measured by a value function. Connecting the concept of value function to subgroup identification, we propose a nonparametric method that searches for subgroup membership scores by maximizing a value function that directly reflects the subgroup-treatment interaction effect based on restricted mean survival time. A gradient tree boosting algorithm is proposed to search for the individual subgroup membership scores. We conduct simulation studies to evaluate the performance of the proposed method and an application to an AIDS clinical trial is performed for illustration.
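The restricted mean survival time (RMST) that this abstract builds its value function on can be sketched as the area under a Kaplan-Meier curve up to a truncation time. The code below is a minimal illustration under that standard definition; the names `km_rmst`, `rmst_difference`, and `tau` are hypothetical, and it does not reproduce the authors' value function or their boosting algorithm.

```python
from typing import Sequence


def km_rmst(times: Sequence[float], events: Sequence[int], tau: float) -> float:
    """Area under the Kaplan-Meier survival curve on [0, tau].

    times  : follow-up time for each patient
    events : 1 if the event was observed, 0 if the patient was censored
    tau    : truncation time for the restricted mean
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0      # current Kaplan-Meier estimate S(t)
    area = 0.0      # accumulated integral of S(t)
    prev_t = 0.0
    i = 0
    while i < len(data) and data[i][0] <= tau:
        t = data[i][0]
        deaths = censored = 0
        # Group ties at the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            else:
                censored += 1
            i += 1
        area += surv * (t - prev_t)          # rectangle up to this time
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk  # Kaplan-Meier step
        n_at_risk -= deaths + censored
        prev_t = t
    area += surv * (tau - prev_t)            # last rectangle up to tau
    return area


def rmst_difference(times, events, treated, tau: float) -> float:
    """RMST in the treated arm minus RMST in the control arm."""
    arm1 = [(t, e) for t, e, z in zip(times, events, treated) if z == 1]
    arm0 = [(t, e) for t, e, z in zip(times, events, treated) if z == 0]
    rmst1 = km_rmst([t for t, _ in arm1], [e for _, e in arm1], tau)
    rmst0 = km_rmst([t for t, _ in arm0], [e for _, e in arm0], tau)
    return rmst1 - rmst0
```

Applying `rmst_difference` inside and outside a candidate subgroup and contrasting the two results gives a simple view of a subgroup-treatment interaction on the RMST scale.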
Affiliation(s)
- Pingye Zhang
  - Biostatistics and Research Decision Sciences, MRL, Merck & Co., Inc., Rahway, New Jersey, USA
- Junshui Ma
  - Biostatistics and Research Decision Sciences, MRL, Merck & Co., Inc., Rahway, New Jersey, USA
- Xinqun Chen
  - Biostatistics and Research Decision Sciences, MRL, Merck & Co., Inc., Rahway, New Jersey, USA
- Yue Shentu
  - Biostatistics and Research Decision Sciences, MRL, Merck & Co., Inc., Rahway, New Jersey, USA
7
Sugasawa S, Noma H. Efficient screening of predictive biomarkers for individual treatment selection. Biometrics 2020; 77:249-257. [PMID: 32294246 DOI: 10.1111/biom.13279] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/07/2019] [Revised: 03/27/2020] [Accepted: 03/30/2020] [Indexed: 01/18/2023]
Abstract
The development of molecular diagnostic tools to achieve individualized medicine requires identifying predictive biomarkers associated with subgroups of individuals who might receive beneficial or harmful effects from different available treatments. However, because of the large number of candidate biomarkers in large-scale genetic and molecular studies, and the complex relationships among clinical outcome, biomarkers, and treatments, ordinary statistical tests for interactions between treatments and covariates suffer from limited statistical power. In this paper, we propose an efficient method for detecting predictive biomarkers. We employ the weighted loss functions of Chen et al. to directly estimate individual treatment scores and propose synthetic posterior inference for effect sizes of biomarkers. We develop an empirical Bayes approach; that is, we estimate unknown hyperparameters in the prior distribution from the data. We then provide efficient screening methods for the candidate biomarkers via the optimal discovery procedure with adequate control of the false discovery rate. The proposed method is demonstrated in simulation studies and in an application to a breast cancer clinical study, in which it detected a much larger number of significant biomarkers than existing standard methods.
Affiliation(s)
- Shonosuke Sugasawa
  - Center for Spatial Information Science, The University of Tokyo, Kashiwa, Chiba, Japan
- Hisashi Noma
  - Department of Data Science, The Institute of Statistical Mathematics, Tachikawa, Tokyo, Japan