1
A predictive model for depression in Chinese middle-aged and elderly people with physical disabilities. BMC Psychiatry 2024; 24:305. [PMID: 38654170 DOI: 10.1186/s12888-024-05766-4] [Received: 10/18/2023] [Accepted: 04/15/2024] [Indexed: 04/25/2024] Open
Abstract
BACKGROUND Middle-aged and older adults with physical disabilities exhibit more common and severe depressive symptoms than those without physical disabilities. Such symptoms can greatly affect the physical and mental health and life expectancy of middle-aged and older persons with disabilities. METHOD This study used 2015 and 2018 data from the China Longitudinal Study of Health and Retirement. After analyzing the effect of age on depression, we used depression status in middle-aged and older adults with physical disabilities as the dependent variable and included 24 predictor variables, covering demographic factors, health behaviors, physical functioning, and socialization, as independent variables. The data were randomly divided into training and validation sets on a 7:3 basis. LASSO regression combined with binary logistic regression was performed in the training set to screen the predictor variables of the model. The model was constructed in the training set, where model evaluation, visualization, and internal validation were performed; external validation was performed in the validation set. RESULTS A total of 1052 middle-aged and elderly persons with physical disabilities were included in this study, and the prevalence of depression was higher in the elderly group than in the middle-aged group. Restricted cubic splines indicated that age had different effects on depression in the middle-aged and elderly groups. LASSO regression combined with binary logistic regression selected Gender, Location of Residential Address, Shortsightedness, Hearing, Any Possible Helper in the Future, Alcoholic in the Past Year, Difficulty with Using the Toilet, Difficulty with Preparing Hot Meals, and Unable to Work Due to Disability, which were used to construct the Chinese Depression Prediction Model for Middle-aged and Older People with Physical Disabilities.
The nomogram shows that living in a rural area, lack of assistance, difficulties with activities of daily living, alcohol abuse, visual and hearing impairments, unemployment, and being female are risk factors for depression in middle-aged and older persons with physical disabilities. The areas under the ROC curve for the model, its internal validation, and its external validation were all greater than 0.70, the mean absolute error was less than 0.02, and recall and precision were both greater than 0.65, indicating that the model performs well in terms of discrimination, accuracy, and generalisation. The DCA and net gain curves of the model indicate that it yields high net benefit in predicting depression. CONCLUSION In this study, we showed that being female, living in rural areas, having poor vision and/or hearing, lacking assistance from others, drinking alcohol, having difficulty using the toilet and preparing food, and being unable to work due to a disability were risk factors for depression among middle-aged and older adults with physical disabilities. Based on these risk factors, we developed a prediction model to assess the likelihood of depression in Chinese middle-aged and older adults with physical disabilities, enabling early identification, intervention, and treatment for those at high risk of developing depression.
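The screening approach this abstract describes (an L1 penalty to drop weak predictors, then a logistic model on the survivors) can be sketched on synthetic data. All variable indices, coefficients, and tuning values below are illustrative, not taken from the study:

```python
import math
import random

random.seed(0)

# Synthetic cohort: 3 informative predictors and 3 pure-noise predictors.
n, p = 400, 6
true_beta = [1.2, -0.8, 0.9, 0.0, 0.0, 0.0]
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = []
for row in X:
    logit = -0.3 + sum(b * x for b, x in zip(true_beta, row))
    y.append(1 if random.random() < 1 / (1 + math.exp(-logit)) else 0)

def lasso_logistic(X, y, lam=0.05, lr=0.5, iters=500):
    """L1-penalised logistic regression via proximal gradient descent."""
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(iters):
        g0, g = 0.0, [0.0] * p
        for row, yi in zip(X, y):
            pred = 1 / (1 + math.exp(-(b0 + sum(bj * xj for bj, xj in zip(beta, row)))))
            err = pred - yi
            g0 += err
            for j in range(p):
                g[j] += err * row[j]
        b0 -= lr * g0 / n
        for j in range(p):
            bj = beta[j] - lr * g[j] / n
            # Soft-thresholding step: exactly zeroes out weak predictors.
            beta[j] = math.copysign(max(abs(bj) - lr * lam, 0.0), bj)
    return b0, beta

b0, beta = lasso_logistic(X, y)
selected = [j for j, bj in enumerate(beta) if abs(bj) > 1e-8]
print("predictors retained by LASSO:", selected)
```

The predictors surviving the soft-thresholding step are then the candidates for the final (unpenalised) binary logistic model and the nomogram.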
2
Machine Learning-Based Clinical Prediction Models for Acute Ischemic Stroke Based on Serum Xanthine Oxidase Levels. World Neurosurg 2024; 184:e695-e707. [PMID: 38340801 DOI: 10.1016/j.wneu.2024.02.014] [Received: 10/24/2023] [Accepted: 02/04/2024] [Indexed: 02/12/2024]
Abstract
OBJECTIVE Early prediction of the onset, progression, and prognosis of acute ischemic stroke (AIS) is helpful for treatment decision-making and proactive management. Although several biomarkers have been found to predict the progression and prognosis of AIS, these biomarkers have not been widely used in routine clinical practice. Xanthine oxidase (XO) is a form of xanthine oxidoreductase (XOR), which is widespread in various organs of the human body and plays an important role in redox reactions and ischemia-reperfusion injury. Our previous studies have shown that serum XO levels on admission have some clinical predictive value for AIS. The purpose of this study was to use serum XO levels and clinical data to establish machine learning models for predicting the onset, progression, and prognosis of AIS. METHODS We enrolled 328 consecutive patients with AIS and 107 healthy controls from October 2020 to September 2021. Serum XO levels and stroke-related clinical data were collected. We established 5 machine learning models (logistic regression (LR), support vector machine (SVM), decision tree, random forest, and K-nearest neighbor (KNN)) to predict the onset, progression, and prognosis of AIS. The area under the receiver operating characteristic curve (AUROC), accuracy, sensitivity, specificity, negative predictive value, and positive predictive value were used to evaluate the predictive performance of each model. RESULTS Among the 5 machine learning models predicting AIS onset, 4 achieved AUROC values over 0.7, while the KNN model performed worse (AUROC = 0.6708, 95% CI 0.576-0.765). The LR model showed the best AUROC (AUROC = 0.9586, 95% CI 0.927-0.991). Although all 5 machine learning models showed relatively poor predictive value for the progression of AIS (all AUROCs <0.7), the LR model still had the highest AUROC (AUROC = 0.6543, 95% CI 0.453-0.856).
We compared the value of 5 machine learning models in predicting the prognosis of AIS, and the LR model showed the best predictive value (AUROC = 0.8124, 95% CI 0.715-0.910). CONCLUSIONS The tested machine learning models based on serum levels of XO could predict the onset and prognosis of AIS. Among the 5 machine learning models, we found that the LR model showed the best predictive performance. Machine learning algorithms improve accuracy in the early diagnosis of AIS and can be used to make treatment decisions.
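The AUROC used throughout these model comparisons has a direct rank interpretation: the probability that a randomly chosen case receives a higher score than a randomly chosen control. It can be computed without any curve plotting; the scores below are made up for illustration:

```python
def auroc(y_true, scores):
    """Mann-Whitney form of the AUROC: fraction of (positive, negative)
    pairs in which the positive gets the higher score; ties count 1/2."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum(1.0 if sp > sn else 0.5 if sp == sn else 0.0
               for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))

# Illustrative predicted risks from a hypothetical model
y = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(y, scores))  # 0.75: the model ranks 3 of 4 case-control pairs correctly
```

An AUROC of 0.5 corresponds to random ranking, which is why values below ~0.7, as reported for AIS progression, are considered poor.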
3
Comparing Predictive Performance of Time Invariant and Time Variant Clinical Prediction Models in Cardiac Surgery. Stud Health Technol Inform 2024; 310:1026-1030. [PMID: 38269970 DOI: 10.3233/shti231120] [Indexed: 01/26/2024]
Abstract
Clinical prediction models are increasingly used across healthcare to support clinical decision making. Existing methods and models are typically time-invariant and thus ignore the changes in populations and healthcare practice that occur over time. We aimed to compare the performance of time-invariant and time-variant models using UK National Adult Cardiac Surgery Audit data from Manchester University NHS Foundation Trust between 2009 and 2019. Data from 2009-2011 were used for initial model fitting, and data from 2012-2019 for validation and updating. We fitted four models to the data: a time-invariant logistic regression model (not updated); a logistic regression model that was updated every year and validated in each subsequent year; a logistic regression model in which the intercept is a function of calendar time (not updated); and a continually updating Bayesian logistic regression model that was updated with each new observation and continuously validated. We report predictive performance over the complete validation cohort and for each year in the validation data. Over the complete validation data, the Bayesian model had the best predictive performance.
4
Dynamic updating of clinical survival prediction models in a changing environment. Diagn Progn Res 2023; 7:24. [PMID: 38082429 PMCID: PMC10714456 DOI: 10.1186/s41512-023-00163-z] [Received: 06/16/2023] [Accepted: 10/17/2023] [Indexed: 01/31/2024] Open
Abstract
BACKGROUND Over time, the performance of clinical prediction models may deteriorate due to changes in clinical management, data quality, disease risk and/or patient mix. Such prediction models must be updated in order to remain useful. In this study, we investigate dynamic model updating of clinical survival prediction models. In contrast to discrete or one-time updating, dynamic updating refers to a repeated process for updating a prediction model with new data. We aim to extend previous research which focused largely on binary outcome prediction models by concentrating on time-to-event outcomes. We were motivated by the rapidly changing environment seen during the COVID-19 pandemic where mortality rates changed over time and new treatments and vaccines were introduced. METHODS We illustrate three methods for dynamic model updating: Bayesian dynamic updating, recalibration, and full refitting. We use a simulation study to compare performance in a range of scenarios including changing mortality rates, predictors with low prevalence and the introduction of a new treatment. Next, the updating strategies were applied to a model for predicting 70-day COVID-19-related mortality using patient data from QResearch, an electronic health records database from general practices in the UK. RESULTS In simulated scenarios with mortality rates changing over time, all updating methods resulted in better calibration than not updating. Moreover, dynamic updating outperformed ad hoc updating. In the simulation scenario with a new predictor and a small updating dataset, Bayesian updating improved the C-index over not updating and refitting. In the motivating example with a rare outcome, no single updating method offered the best performance. CONCLUSIONS We found that a dynamic updating process outperformed one-time discrete updating in the simulations. Bayesian updating offered good performance overall, even in scenarios with new predictors and few events. 
Intercept recalibration was effective in scenarios with smaller sample size and changing baseline hazard. Refitting performance depended on sample size and produced abrupt changes in hazard ratio estimates between periods.
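Of the three updating methods compared here, intercept recalibration is the lightest: the original linear predictor is kept and only a new intercept is refit on recent data (calibration-in-the-large). A minimal sketch, with simulated data rather than anything from QResearch:

```python
import math

def recalibrate_intercept(lp, y, lr=0.5, iters=200):
    """Refit only an intercept shift `a` so that the risks
    sigmoid(a + lp) match the event rate in the new data."""
    a = 0.0
    for _ in range(iters):
        # Gradient of the 1-parameter logistic log-likelihood in a.
        grad = sum(1 / (1 + math.exp(-(a + l))) - yi for l, yi in zip(lp, y))
        a -= lr * grad / len(y)
    return a

# The old model predicts ~27% risk for everyone (linear predictor -1.0),
# but in the new period only 12% of patients have the event.
lp = [-1.0] * 100
y = [1] * 12 + [0] * 88
a = recalibrate_intercept(lp, y)
# a converges to logit(0.12) - (-1.0), roughly -0.99, pulling every
# predicted risk down from ~27% to the observed 12%.
```

Because only one parameter is estimated, this works with far fewer events than refitting, which matches the abstract's finding that recalibration suited smaller samples with a shifting baseline hazard.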
5
Attention-based neural networks for clinical prediction modelling on electronic health records. BMC Med Res Methodol 2023; 23:285. [PMID: 38062352 PMCID: PMC10701944 DOI: 10.1186/s12874-023-02112-2] [Received: 07/27/2023] [Accepted: 11/27/2023] [Indexed: 12/18/2023] Open
Abstract
BACKGROUND Deep learning models have achieved considerable success in many fields but have struggled on structured data. Here we apply four state-of-the-art supervised deep learning models based on the attention mechanism and compare them against logistic regression and XGBoost in terms of discrimination, calibration, and clinical utility. METHODS We develop the models using a general practitioners database. We implement a recurrent neural network, a transformer with and without reverse distillation, and a graph neural network. We measure discrimination using the area under the receiver operating characteristic curve (AUC) and the area under the precision-recall curve (AUPRC). We assess smooth calibration using restricted cubic splines and clinical utility with decision curve analysis. RESULTS Our results show that deep learning approaches can improve discrimination by up to 2.5 percentage points of AUC and 7.4 percentage points of AUPRC. However, on average the baselines are competitive. Most models are calibrated similarly to the baselines, except for the graph neural network. The transformer using reverse distillation shows the best clinical utility on two out of three prediction problems over most of the prediction thresholds. CONCLUSION In this study, we evaluated various supervised neural network approaches using attention, with a rigorous comparison covering not only discrimination but also calibration and clinical utility. There is value in using deep learning models on electronic health record data, since they can improve discrimination and clinical utility while providing good calibration. However, good baseline methods remain competitive.
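Decision curve analysis, the clinical-utility measure used in this abstract, reduces at each threshold probability pt to a simple quantity: net benefit = TP/N - (FP/N) * pt/(1 - pt). A sketch with made-up risks:

```python
def net_benefit(y_true, risks, pt):
    """Net benefit at threshold probability pt: true-positive rate minus
    the false-positive rate weighted by the odds of the threshold."""
    n = len(y_true)
    tp = sum(1 for yi, r in zip(y_true, risks) if r >= pt and yi == 1)
    fp = sum(1 for yi, r in zip(y_true, risks) if r >= pt and yi == 0)
    return tp / n - (fp / n) * pt / (1 - pt)

# 20% prevalence; 'treat all' corresponds to giving everyone risk 1.0
y = [1] * 20 + [0] * 80
treat_all = net_benefit(y, [1.0] * 100, pt=0.1)  # 0.2 - 0.8 * (1/9)
```

A model is clinically useful at pt only if its net benefit exceeds both the treat-all and treat-none (net benefit 0) strategies, which is what a decision curve plots across thresholds.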
6
A clinical model to predict the progression of knee osteoarthritis: data from Dryad. J Orthop Surg Res 2023; 18:628. [PMID: 37635226 PMCID: PMC10464113 DOI: 10.1186/s13018-023-04118-4] [Received: 07/09/2023] [Accepted: 08/21/2023] [Indexed: 08/29/2023] Open
Abstract
BACKGROUND Knee osteoarthritis (KOA) is a multifactorial, slow-progressing, non-inflammatory degenerative disease primarily affecting synovial joints. It is usually induced by advanced age and/or trauma and eventually leads to irreversible destruction of articular cartilage and other tissues of the joint. Current research on KOA progression has limited clinical application significance. In this study, we constructed a prediction model for KOA progression based on multiple clinically relevant factors to provide clinicians with an effective tool to intervene in KOA progression. METHOD This study used a data set from the Dryad database that included patients with Kellgren-Lawrence (KL) grades 2 and 3. KL grade was the dependent variable, and 15 potential predictors were the independent variables. Patients were randomly split into a training set and a validation set by 8:2. The training set underwent LASSO analysis, model creation, visualization, decision curve analysis, and internal validation using R. External validation was performed on the validation set, with computation of the F1-score, precision, and recall. RESULTS A total of 101 patients with KL2 and 94 patients with KL3 were selected. LASSO selected "BMI", "TC", "Hypertension treatment", and "JBS3 (%)" as predictors for the KOA progression model. A nomogram was used to visualize the model in R. The area under the ROC curve was 0.896 (95% CI 0.847-0.945), indicating high discrimination. The mean absolute error (MAE) of the calibration curve was 0.041, and the MAE at internal validation was 0.043, indicating high calibration. Decision curve analysis showed high net benefit. External validation of the nomogram prediction model was performed on the validation set.
The area under the ROC curve was 0.876 (95% CI 0.767-0.984), indicating that the model had a high degree of discrimination, and the calibration-curve mean absolute error was 0.113, indicating a high degree of calibration. The F1-score was 0.690, the precision 0.667, and the recall 0.714, representing good model performance. CONCLUSION We found that KOA progression was associated with four predictors and constructed a predictive model for KOA progression based on them. Clinicians can intervene based on the nomogram of our prediction model. KEY INFORMATION This study presents a clinical predictive model of KOA progression. The model has good credibility and clinical value in the prevention of KOA progression.
7
Clinical prediction models for serious infections in children: external validation in ambulatory care. BMC Med 2023; 21:151. [PMID: 37072778 PMCID: PMC10114467 DOI: 10.1186/s12916-023-02860-4] [Received: 11/15/2022] [Accepted: 04/03/2023] [Indexed: 04/20/2023] Open
Abstract
BACKGROUND Early distinction between mild and serious infections (SI) is challenging in children in ambulatory care. Clinical prediction models (CPMs), developed to aid physicians in clinical decision-making, require broad external validation before clinical use. We aimed to externally validate four CPMs, developed in emergency departments, in ambulatory care. METHODS We applied the CPMs in a prospective cohort of acutely ill children presenting to general practices, outpatient paediatric practices, or emergency departments in Flanders, Belgium. For two multinomial regression models, Feverkidstool and the Craig model, discriminative ability and calibration were assessed, and a model update was performed by re-estimating coefficients with correction for overfitting. For two risk scores, the SBI score and PAWS, diagnostic test accuracy was assessed. RESULTS A total of 8211 children were included, comprising 498 SI and 276 serious bacterial infections (SBI). Feverkidstool had a C-statistic of 0.80 (95% confidence interval 0.77-0.84) with good calibration for pneumonia and 0.74 (0.70-0.79) with poor calibration for other SBI. The Craig model had a C-statistic of 0.80 (0.77-0.83) for pneumonia, 0.75 (0.70-0.80) for complicated urinary tract infections, and 0.63 (0.39-0.88) for bacteraemia, with poor calibration. The model update resulted in improved C-statistics for all outcomes and good overall calibration for both Feverkidstool and the Craig model. The SBI score and PAWS performed very poorly, with sensitivities of 0.12 (0.09-0.15) and 0.32 (0.28-0.37). CONCLUSIONS Feverkidstool and the Craig model show good discriminative ability for predicting SBI and potential for early recognition of SBI, confirming good external validity in a setting with low SBI prevalence. The SBI score and PAWS showed poor diagnostic performance. TRIAL REGISTRATION ClinicalTrials.gov, NCT02024282. Registered on 31 December 2013.
8
Obstacles to effective model deployment in healthcare. J Bioinform Comput Biol 2023; 21:2371001. [PMID: 36938598 DOI: 10.1142/s0219720023710014] [Indexed: 03/05/2023]
Abstract
Despite an exponential increase in publications on clinical prediction models over recent years, the number of models deployed in clinical practice remains fairly limited. In this paper, we identify common obstacles that impede effective deployment of prediction models in healthcare, and investigate their underlying causes. We observe a key underlying cause behind most obstacles: the improper development and evaluation of prediction models. Inherent heterogeneities in clinical data complicate the development and evaluation of clinical prediction models. Many of these heterogeneities in clinical data are unreported because they are deemed to be irrelevant, or due to privacy concerns. We provide real-life examples where failure to handle heterogeneities in clinical data, or sources of biases, led to the development of erroneous models. The purpose of this paper is to familiarize modeling practitioners with common sources of biases and heterogeneities in clinical data, both of which have to be dealt with to ensure proper development and evaluation of clinical prediction models. Proper model development and evaluation, together with complete and thorough reporting, are important prerequisites for a prediction model to be effectively deployed in healthcare.
9
Minimum sample size for developing a multivariable prediction model using multinomial logistic regression. Stat Methods Med Res 2023; 32:555-571. [PMID: 36660777 PMCID: PMC10012398 DOI: 10.1177/09622802231151220] [Indexed: 01/21/2023]
Abstract
AIMS Multinomial logistic regression models allow one to predict the risk of a categorical outcome with more than two categories. When developing such a model, researchers should ensure the number of participants (n) is appropriate relative to the number of events (E_k) and the number of predictor parameters (p_k) for each category k. We propose three criteria to determine the minimum n required, in light of existing criteria developed for binary outcomes. PROPOSED CRITERIA The first criterion aims to minimise model overfitting. The second aims to minimise the difference between the observed and adjusted Nagelkerke R². The third aims to ensure the overall risk is estimated precisely. For criterion (i), we show the sample size must be based on the anticipated Cox-Snell R² of the distinct 'one-to-one' logistic regression models corresponding to the sub-models of the multinomial logistic regression, rather than on the overall Cox-Snell R² of the multinomial logistic regression. EVALUATION OF CRITERIA We tested the performance of criterion (i) through a simulation study and found that it resulted in the desired level of overfitting. Criteria (ii) and (iii) were natural extensions of previously proposed criteria for binary outcomes and did not require evaluation through simulation. SUMMARY We illustrated how to implement the sample size criteria through a worked example considering the development of a multinomial risk prediction model for tumour type when presented with an ovarian mass. Code is provided for the simulation and the worked example. We will embed our proposed criteria within the pmsampsize R library and Stata modules.
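Criterion (i) extends a shrinkage-based sample size criterion previously established for binary outcomes, which targets an expected uniform shrinkage factor S (typically 0.9). A sketch of that base calculation, which this paper applies to each 'one-to-one' sub-model, with illustrative inputs:

```python
import math

def min_n_for_shrinkage(p, r2_cs, S=0.9):
    """Smallest n for which a logistic model with p predictor parameters
    and anticipated Cox-Snell R^2 `r2_cs` has an expected uniform
    shrinkage factor of at least S (the binary-outcome criterion
    this paper extends): n = p / ((S - 1) * ln(1 - r2_cs / S))."""
    return math.ceil(p / ((S - 1) * math.log(1 - r2_cs / S)))

# e.g. a sub-model with 10 parameters and an anticipated Cox-Snell R^2 of 0.2
n_min = min_n_for_shrinkage(10, 0.2)
print(n_min)  # 398
```

The required n grows linearly in the number of parameters and shrinks as the anticipated R² rises, which is why the paper stresses using the sub-model R² values rather than the overall multinomial R².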
10
IMplementing Predictive Analytics towards efficient COPD Treatments (IMPACT): protocol for a stepped-wedge cluster randomized impact study. Diagn Progn Res 2023; 7:3. [PMID: 36782301 PMCID: PMC9926816 DOI: 10.1186/s41512-023-00140-6] [Received: 06/07/2022] [Accepted: 01/09/2023] [Indexed: 02/15/2023] Open
Abstract
INTRODUCTION Personalized disease management informed by quantitative risk prediction has the potential to improve patient care and outcomes. The integration of risk prediction into clinical workflow should be informed by the experiences and preferences of stakeholders, and the impact of such integration should be evaluated in prospective comparative studies. The objectives of the IMplementing Predictive Analytics towards efficient chronic obstructive pulmonary disease (COPD) treatments (IMPACT) study are to integrate an exacerbation risk prediction tool into routine care and to determine its impact on prescription appropriateness (primary outcome), medication adherence, quality of life, exacerbation rates, and sex and gender disparities in COPD care (secondary outcomes). METHODS IMPACT will be conducted in two phases. Phase 1 will include the systematic and user-centered development of two decision support tools: (1) a decision tool for pulmonologists called the ACCEPT decision intervention (ADI), which combines risk prediction from the previously developed Acute COPD Exacerbation Prediction Tool with treatment algorithms recommended by the Canadian Thoracic Society's COPD pharmacotherapy guidelines, and (2) an information pamphlet for COPD patients (patient tool), tailored to their prescribed medication, clinical needs, and lung function. In phase 2, we will conduct a stepped-wedge cluster randomized controlled trial in two outpatient respiratory clinics to evaluate the impact of the decision support tools on quality of care and patient outcomes. Clusters will be practicing pulmonologists (n ≥ 24), who will progressively switch to the intervention over 18 months. At the end of the study, a qualitative process evaluation will be carried out to determine the barriers and enablers of uptake of the tools. DISCUSSION The IMPACT study coincides with a planned harmonization of electronic health record systems across tertiary care centers in British Columbia, Canada. 
The harmonization of these systems, combined with IMPACT's implementation-oriented design and partnership with stakeholders, will facilitate integration of the tools into routine care if the results of the proposed study show improvements in the process and outcomes of clinical care. The process evaluation at the end of the trial will inform subsequent design iterations before large-scale implementation. TRIAL REGISTRATION NCT05309356.
11
Quality of clinical prediction models in in vitro fertilisation: Which covariates are really important to predict cumulative live birth and which models are best? Best Pract Res Clin Obstet Gynaecol 2023; 86:102309. [PMID: 36641248 DOI: 10.1016/j.bpobgyn.2022.102309] [Received: 09/08/2022] [Revised: 11/29/2022] [Accepted: 12/19/2022] [Indexed: 12/29/2022]
Abstract
The improvement in IVF cryopreservation techniques over the last 20 years has led to an increase in elective single embryo transfer, thus reducing multiple pregnancy rates. This strategy of successive transfers of fresh followed by frozen embryos has resulted in the acceptance of using cumulative live birth over complete cycles of IVF as a critical measure of success. Clinical prediction models are a useful way of estimating the cumulative chances of success for couples tailored to their individual clinical factors, which help them prepare for and plan future treatment. In this review, we describe several models that predict cumulative live birth and recommend which should be used by couples and/or their clinicians and when they should be used. We also discuss the most relevant predictors to consider when either developing new IVF prediction models or updating existing models.
12
Personalized medicine in rheumatoid arthritis: Combining biomarkers and patient preferences to guide therapeutic decisions. Best Pract Res Clin Rheumatol 2023; 36:101812. [PMID: 36653230 DOI: 10.1016/j.berh.2022.101812] [Indexed: 01/18/2023]
Abstract
The last few decades have seen major advancements in rheumatoid arthritis (RA) therapeutics. New disease-modifying antirheumatic drugs (DMARDs) have continued to emerge, creating more choices for patients. However, no single therapeutic works for all patients, and each has its own inherent benefits, risks, costs, dosing, and monitoring considerations. In parallel, there has been a focus on personalized medicine initiatives that tailor therapeutic decisions to patients based on their unique characteristics or biomarkers. Personalized effect estimates require an understanding of a patient's baseline probability of response to treatment and data on the comparative effectiveness of the available treatments. However, even when accurate risk prediction models are available, trade-offs often still need to be made between treatments. In this paper, we review the history of RA therapeutics and the progress that has been made toward personalized risk prediction models for DMARDs, outlining where knowledge gaps remain. We further review why patient preferences play a key role in a holistic view of personalized medicine and how this links with shared decision-making. We argue that a "preference misdiagnosis" may be as important as a medical misdiagnosis but is often overlooked.
13
Inpatient Fall Prediction Models: A Scoping Review. Gerontology 2023; 69:14-29. [PMID: 35977533 DOI: 10.1159/000525727] [Received: 01/05/2022] [Accepted: 05/07/2022] [Indexed: 01/06/2023] Open
Abstract
INTRODUCTION The digitization of hospital systems, including integrated electronic medical records, has provided opportunities to improve the prediction performance of inpatient fall risk models and their application to computerized clinical decision support systems. This review describes the data sources and scope of methods reported in studies that developed inpatient fall prediction models, including machine learning and more traditional approaches to inpatient fall risk prediction. METHODS This scoping review used methods recommended by the Arksey and O'Malley framework and its recent advances. PubMed, CINAHL, IEEE Xplore, and EMBASE databases were systematically searched. Studies reporting the development of inpatient fall risk prediction approaches were included, with no restriction on language or recency. Reference lists and manual searches were also completed. Reporting quality was assessed using adherence to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement, where appropriate. RESULTS Database searches identified 1,396 studies, of which 63 were included for scoping assessment and 45 for reporting quality assessment. There was considerable overlap in the data sources and methods used for model development. Fall prediction models typically relied on features from patient assessments, including indicators of physical function or impairment, or cognitive function or impairment. All but two studies used patient information at or soon after admission and predicted fall risk over the entire admission, without considering post-admission interventions, acuity changes, or length of stay. Overall, reporting quality was poor, but it has improved in the past decade. CONCLUSION There was substantial homogeneity in data sources and prediction model development methods. Use of artificial intelligence, including machine learning with high-dimensional data, remains underexplored in the context of hospital falls.
Future research should consider approaches with the potential to utilize high-dimensional data from digital hospital systems, which may contribute to greater performance and clinical usefulness.
14
The majority of 922 prediction models supporting breast cancer decision-making are at high risk of bias. J Clin Epidemiol 2022; 152:238-247. [PMID: 36633901 DOI: 10.1016/j.jclinepi.2022.10.016] [Received: 03/21/2022] [Revised: 09/25/2022] [Accepted: 10/20/2022] [Indexed: 11/23/2022]
Abstract
OBJECTIVES To systematically review the currently available prediction models that may support treatment decision-making in breast cancer. STUDY DESIGN AND SETTING Literature was systematically searched to identify studies reporting the development of prediction models aiming to support breast cancer treatment decision-making, published between January 2010 and December 2020. Quality and risk of bias were assessed using the Prediction model Risk Of Bias Assessment Tool (PROBAST). RESULTS After screening 20,460 studies, 534 were included, reporting on 922 models. The 922 models predicted: mortality (n = 417, 45%), recurrence (n = 217, 24%), lymph node involvement (n = 141, 15%), adverse events (n = 58, 6%), treatment response (n = 56, 6%), or other outcomes (n = 33, 4%). In total, 285 models (31%) lacked a complete description of the final model and could not be applied to new patients. Most models (n = 878, 95%) were considered to be at high risk of bias. CONCLUSION There was substantial overlap in predictor variables and outcomes between the models. Most models were not reported according to established reporting guidelines or showed methodological flaws in their development and/or validation. Further development of prediction models with thorough quality and validity assessment is an essential first step toward future clinical application.
|
15
|
Prognostic models for COVID-19 needed updating to warrant transportability over time and space. BMC Med 2022; 20:456. PMID: 36424619. PMCID: PMC9686462. DOI: 10.1186/s12916-022-02651-3.
Abstract
BACKGROUND Supporting decisions for patients who present to the emergency department (ED) with COVID-19 requires accurate prognostication. We aimed to evaluate prognostic models for predicting outcomes in hospitalized patients with COVID-19, in different locations and across time. METHODS We included patients who presented to the ED with suspected COVID-19 and were admitted to 12 hospitals in the New York City (NYC) area and 4 large Dutch hospitals. We used second-wave patients who presented between September and December 2020 (2137 and 3252 in NYC and the Netherlands, respectively) to evaluate models that were developed on first-wave patients who presented between March and August 2020 (12,163 and 5831). We evaluated two prognostic models for in-hospital death: The Northwell COVID-19 Survival (NOCOS) model was developed on NYC data and the COVID Outcome Prediction in the Emergency Department (COPE) model was developed on Dutch data. These models were validated on subsequent second-wave data at the same site (temporal validation) and at the other site (geographic validation). We assessed model performance by the Area Under the receiver operating characteristic Curve (AUC), by the E-statistic, and by net benefit. RESULTS Twenty-eight-day mortality was considerably higher in the NYC first-wave data (21.0%), compared to the second-wave (10.1%) and the Dutch data (first wave 10.8%; second wave 10.0%). COPE discriminated well at temporal validation (AUC 0.82), with excellent calibration (E-statistic 0.8%). At geographic validation, discrimination was satisfactory (AUC 0.78), but with moderate over-prediction of mortality risk, particularly in higher-risk patients (E-statistic 2.9%). While discrimination was adequate when NOCOS was tested on second-wave NYC data (AUC 0.77), NOCOS systematically overestimated the mortality risk (E-statistic 5.1%). 
Discrimination in the Dutch data was good (AUC 0.81), but with over-prediction of risk, particularly in lower-risk patients (E-statistic 4.0%). Recalibration of COPE and NOCOS led to limited net benefit improvement in Dutch data, but to substantial net benefit improvement in NYC data. CONCLUSIONS NOCOS performed moderately worse than COPE, probably reflecting unique aspects of the early pandemic in NYC. Frequent updating of prognostic models is likely to be required for transportability over time and space during a dynamic pandemic.
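The E-statistic reported above summarizes calibration as the average absolute difference between predicted and observed risk, alongside AUC for discrimination. As a rough pure-Python illustration (not the study authors' code; the grouped E-statistic here is a crude stand-in for the smoothed calibration curve usually used):

```python
def auc(y_true, y_prob):
    """Concordance (AUC) via the Mann-Whitney statistic: the probability
    that a randomly chosen event receives a higher predicted risk than a
    randomly chosen non-event (ties count 0.5)."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    nonevents = [p for y, p in zip(y_true, y_prob) if y == 0]
    pairs = concordant = 0.0
    for e in events:
        for ne in nonevents:
            pairs += 1
            if e > ne:
                concordant += 1
            elif e == ne:
                concordant += 0.5
    return concordant / pairs

def e_statistic(y_true, y_prob, n_groups=10):
    """Weighted mean absolute difference between predicted and observed
    risk, estimated over quantile groups of predicted risk."""
    order = sorted(range(len(y_prob)), key=lambda i: y_prob[i])
    size = max(1, len(order) // n_groups)
    diffs, weights = [], []
    for start in range(0, len(order), size):
        idx = order[start:start + size]
        pred = sum(y_prob[i] for i in idx) / len(idx)
        obs = sum(y_true[i] for i in idx) / len(idx)
        diffs.append(abs(pred - obs))
        weights.append(len(idx))
    return sum(d * w for d, w in zip(diffs, weights)) / sum(weights)
```

Temporal and geographic validation then amount to computing these two quantities on second-wave data from the same site and on data from the other site.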
|
16
|
ACCEPT 2·0: Recalibrating and externally validating the Acute COPD exacerbation prediction tool (ACCEPT). EClinicalMedicine 2022; 51:101574. PMID: 35898315. PMCID: PMC9309408. DOI: 10.1016/j.eclinm.2022.101574.
Abstract
BACKGROUND The Acute Chronic Obstructive Pulmonary Disease (COPD) Exacerbation Prediction Tool (ACCEPT) was developed for individualised prediction of COPD exacerbations. ACCEPT was well calibrated overall and had a high discriminatory power, but overestimated risk among individuals without recent exacerbations. The objectives of this study were to 1) fine-tune ACCEPT to make better predictions for individuals with a negative exacerbation history, 2) develop more parsimonious models, and 3) externally validate the models in a new dataset. METHODS We recalibrated ACCEPT using data from the Evaluation of COPD Longitudinally to Identify Predictive Surrogate End-points (ECLIPSE, a three-year observational study, 1,803 patients, 2,117 exacerbations) study by applying non-parametric regression splines to the predicted rates. We developed three reduced versions of ACCEPT by removing symptom score and/or baseline medications as predictors. We examined the discrimination, calibration, and net benefit of ACCEPT 2·0 in the placebo arm of the Towards a Revolution in COPD Health (TORCH, a three-year randomised clinical trial of inhaled therapies in COPD, 1,091 patients, 1,064 exacerbations) study. The primary outcome for prediction was the occurrence of ≥2 moderate or ≥1 severe exacerbation in the next 12 months; the secondary outcomes were prediction of the occurrence of any moderate/severe exacerbation or any severe exacerbation. FINDINGS ACCEPT 2·0 had an area-under-the-curve (AUC) of 0·76 for predicting the primary outcome. Exacerbation history alone (current standard of care) had an AUC of 0·68. The model was well calibrated in patients with positive or negative exacerbation histories. Changes in AUC in reduced versions were minimal for the primary outcome as well as for predicting the occurrence of any moderate/severe exacerbations (ΔAUC<0·011), but more substantial for predicting the occurrence of any severe exacerbations (ΔAUC<0·020). 
All versions of ACCEPT 2·0 provided positive net benefit over the use of exacerbation history alone for some range of thresholds. INTERPRETATION ACCEPT 2·0 showed good calibration regardless of exacerbation history, and predicts exacerbation risk better than current standard of care for a range of thresholds. Future studies need to investigate the utility of exacerbation prediction in various subgroups of patients. FUNDING This study was funded by a team grant from the Canadian Institutes of Health Research (PHT 178432).
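Net benefit, the quantity behind the decision-curve comparison above, weighs true positives against false positives at a chosen risk threshold, with false positives discounted by the odds of that threshold. A minimal sketch (illustrative function names, not from the ACCEPT authors):

```python
def net_benefit(y_true, y_prob, threshold):
    """Decision-curve net benefit of treating patients whose predicted
    risk is at or above `threshold`:
        NB = TP/n - FP/n * pt / (1 - pt)
    where pt/(1 - pt) is the odds implied by the threshold."""
    n = len(y_true)
    treat = [p >= threshold for p in y_prob]
    tp = sum(1 for y, t in zip(y_true, treat) if t and y == 1)
    fp = sum(1 for y, t in zip(y_true, treat) if t and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat everyone regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)
```

Comparing a model's net benefit against exacerbation history alone across a range of thresholds is what "positive net benefit for some range of thresholds" refers to.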
|
17
|
Clinical Prediction Models for Hepatitis B Virus-related Acute-on-chronic Liver Failure: A Technical Report. J Clin Transl Hepatol 2021; 9:838-849. PMID: 34966647. PMCID: PMC8666376. DOI: 10.14218/jcth.2021.00005.
Abstract
BACKGROUND AND AIMS It is critical but challenging to predict the prognosis of hepatitis B virus-related acute-on-chronic liver failure (HBV-ACLF). This study systematically summarized and evaluated the quality and performance of available clinical prediction models (CPMs). METHODS A keyword search of articles on HBV-ACLF CPMs published in PubMed from January 1995 to April 2020 was performed. Both the quality and performance of the CPMs were assessed. RESULTS Fifty-two CPMs were identified, of which 31 were HBV-ACLF specific. The modeling data were mostly derived from retrospective (83.87%) and single-center (96.77%) cohorts, with sample sizes ranging from 46 to 1,202. Three-month mortality was the most common endpoint. The Asian Pacific Association for the Study of the Liver consensus (51.92%) and Chinese Medical Association liver failure guidelines (40.38%) were commonly used for HBV-ACLF diagnosis. Serum bilirubin (67.74%), the international normalized ratio (54.84%), and hepatic encephalopathy (51.61%) were the most frequent variables used in models. Model discrimination was commonly evaluated (88.46%), but model calibration was seldom performed. The model for end-stage liver disease score was the most widely used (84.62%); however, varying performance was reported among the studies. CONCLUSIONS Substantial limitations lie in the quality of HBV-ACLF-specific CPMs. Disease severity of study populations may impact model performance. The clinical utility of CPMs in predicting short-term prognosis of HBV-ACLF remains to be defined.
|
18
|
Multicenter Validation of Individual Preoperative Motor Outcome Prediction for Deep Brain Stimulation in Parkinson's Disease. Stereotact Funct Neurosurg 2021; 100:121-129. PMID: 34823246. DOI: 10.1159/000519960.
Abstract
BACKGROUND Subthalamic nucleus deep brain stimulation (STN DBS) is an established therapy for Parkinson's disease (PD) patients suffering from motor response fluctuations despite optimal medical treatment, or severe dopaminergic side effects. Despite careful clinical selection and surgical procedures, some patients do not benefit from STN DBS. Preoperative prediction models are suggested to better predict individual motor response after STN DBS. We validate a preregistered model, DBS-PREDICT, in an external multicenter validation cohort. METHODS DBS-PREDICT considered eleven, solely preoperative, clinical characteristics and applied a logistic regression to differentiate between weak and strong motor responders. Weak motor response was defined as no clinically relevant improvement on the Unified Parkinson's Disease Rating Scale (UPDRS) II, III, or IV, 1 year after surgery, defined as, respectively, 3, 5, and 3 points or more. Lower UPDRS III and IV scores and higher age at disease onset contributed most to weak response predictions. Individual predictions were compared with actual clinical outcomes. RESULTS 322 PD patients treated with STN DBS from 6 different centers were included. DBS-PREDICT differentiated between weak and strong motor responders with an area under the receiver operating characteristic curve of 0.76 and an accuracy up to 77%. CONCLUSION Proving generalizability and feasibility of preoperative STN DBS outcome prediction in an external multicenter cohort is an important step in creating clinical impact in DBS with data-driven tools. Future prospective studies are required to overcome several inherent practical and statistical limitations of including clinical decision support systems in DBS care.
|
19
|
Adaptive sample size determination for the development of clinical prediction models. Diagn Progn Res 2021; 5:6. PMID: 33745449. PMCID: PMC7983402. DOI: 10.1186/s41512-021-00096-5.
Abstract
BACKGROUND We suggest an adaptive sample size calculation method for developing clinical prediction models, in which model performance is monitored sequentially as new data come in. METHODS We illustrate the approach using data for the diagnosis of ovarian cancer (n = 5914, 33% event fraction) and obstructive coronary artery disease (CAD; n = 4888, 44% event fraction). We used logistic regression to develop a prediction model consisting only of a priori selected predictors and assumed linear relations for continuous predictors. We mimicked prospective patient recruitment by developing the model on 100 randomly selected patients, and we used bootstrapping to internally validate the model. We sequentially added 50 random new patients until we reached a sample size of 3000 and re-estimated model performance at each step. We examined the required sample size for satisfying the following stopping rule: obtaining a calibration slope ≥ 0.9 and optimism in the c-statistic (or AUC) ≤ 0.02 at two consecutive sample sizes. This procedure was repeated 500 times. We also investigated the impact of alternative modeling strategies: modeling nonlinear relations for continuous predictors and correcting for bias on the model estimates (Firth's correction). RESULTS Better discrimination was achieved in the ovarian cancer data (c-statistic 0.9 with 7 predictors) than in the CAD data (c-statistic 0.7 with 11 predictors). Adequate calibration and limited optimism in discrimination were achieved after a median of 450 patients (interquartile range 450-500) for the ovarian cancer data (22 events per parameter (EPP), 20-24) and 850 patients (750-900) for the CAD data (33 EPP, 30-35). A stricter criterion, requiring AUC optimism ≤ 0.01, was met with a median of 500 (23 EPP) and 1500 (59 EPP) patients, respectively. These sample sizes were much higher than the well-known 10 EPP rule of thumb and slightly higher than a recently published fixed sample size calculation method by Riley et al.
Higher sample sizes were required when nonlinear relationships were modeled, and lower sample sizes when Firth's correction was used. CONCLUSIONS Adaptive sample size determination can be a useful supplement to fixed a priori sample size calculations, because it allows the sample size to be tailored to the specific prediction modeling context in a dynamic fashion.
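The stopping rule described above can be made concrete with a small sketch. The (sample size, calibration slope, AUC optimism) triples are assumed inputs that would in practice come from refitting the model and running bootstrap internal validation after each batch of newly recruited patients:

```python
def required_sample_size(steps, slope_min=0.9, optimism_max=0.02):
    """Return the first sample size at which the stopping rule holds at
    two consecutive monitoring steps, or None if it never does.

    `steps` is a list of (n, calibration_slope, auc_optimism) tuples in
    increasing order of n, e.g. re-estimated after each batch of 50 new
    patients."""
    met_previous = False
    for n, slope, optimism in steps:
        met = slope >= slope_min and optimism <= optimism_max
        if met and met_previous:
            return n
        met_previous = met
    return None
```

The stricter criterion in the abstract corresponds to calling the same function with `optimism_max=0.01`.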
|
20
|
A scoping review of causal methods enabling predictions under hypothetical interventions. Diagn Progn Res 2021; 5:3. PMID: 33536082. PMCID: PMC7860039. DOI: 10.1186/s41512-021-00092-9.
Abstract
BACKGROUND The methods with which prediction models are usually developed mean that neither the parameters nor the predictions should be interpreted causally. For many applications, this is perfectly acceptable. However, when prediction models are used to support decision making, there is often a need for predicting outcomes under hypothetical interventions. AIMS We aimed to identify published methods for developing and validating prediction models that enable risk estimation of outcomes under hypothetical interventions, utilizing causal inference. We aimed to identify the main methodological approaches, their underlying assumptions, targeted estimands, and potential pitfalls and challenges with using the method. Finally, we aimed to highlight unresolved methodological challenges. METHODS We systematically reviewed literature published by December 2019, considering papers in the health domain that used causal considerations to enable prediction models to be used for predictions under hypothetical interventions. We included both methodologies proposed in statistical/machine learning literature and methodologies used in applied studies. RESULTS We identified 4919 papers through database searches and a further 115 papers through manual searches. Of these, 87 papers were retained for full-text screening, of which 13 were selected for inclusion. We found papers from both the statistical and the machine learning literature. Most of the identified methods for causal inference from observational data were based on marginal structural models and g-estimation. CONCLUSIONS There exist two broad methodological approaches for allowing prediction under hypothetical intervention into clinical prediction models: (1) enriching prediction models derived from observational studies with estimated causal effects from clinical trials and meta-analyses and (2) estimating prediction models and causal effects directly from observational data. 
These methods need to be extended to dynamic treatment regimes, and to consider multiple interventions, in order to operationalise a clinical decision support system. Techniques for validating 'causal prediction models' are still in their infancy.
|
21
|
Challenges and solutions in prognostic prediction models in spinal disorders. J Clin Epidemiol 2021; 132:125-130. PMID: 33359321. DOI: 10.1016/j.jclinepi.2020.12.017.
Abstract
Methodological shortcomings in prognostic modeling for patients with spinal disorders are highly common. This general commentary discusses methodological challenges related to the specific nature of this field. Five specific methodological challenges in prognostic modeling for patients with spinal disorders are presented with their potential solutions, as related to the choice of study participants, purpose of studies, limitations in measurements of outcomes and predictors, complexity of recovery predictions, and confusion of prognosis and treatment response. Large studies specifically designed for prognostic model research are needed, using standard baseline measurement sets, clearly describing participants' recruitment and accounting and correcting for measurement limitations.
|
22
|
Continual updating and monitoring of clinical prediction models: time for dynamic prediction systems? Diagn Progn Res 2021; 5:1. PMID: 33431065. PMCID: PMC7797885. DOI: 10.1186/s41512-020-00090-3.
Abstract
Clinical prediction models (CPMs) have become fundamental for risk stratification across healthcare. The CPM pipeline (development, validation, deployment, and impact assessment) is commonly viewed as a one-time activity, with model updating rarely considered and done in a somewhat ad hoc manner. This fails to address the fact that the performance of a CPM worsens over time as natural changes in populations and care pathways occur. CPMs need constant surveillance to maintain adequate predictive performance. Rather than reactively updating a developed CPM once evidence of deteriorated performance accumulates, it is possible to proactively adapt CPMs whenever new data becomes available. Approaches for validation then need to be changed accordingly, making validation a continuous rather than a discrete effort. As such, "living" (dynamic) CPMs represent a paradigm shift, where the analytical methods dynamically generate updated versions of a model through time; one then needs to validate the system rather than each subsequent model revision.
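As one purely illustrative member of the family of proactive updating schemes alluded to here (the article argues for dynamic systems in general and does not prescribe this method), a model's intercept can be re-anchored to each new batch of data via recalibration-in-the-large, optionally damped so older data are not forgotten all at once:

```python
import math

def updated_intercept(intercept, y_true, y_prob, forgetting=1.0):
    """Shift the model's intercept by the log odds ratio of observed
    vs. mean predicted risk in the newest batch, damped by a
    `forgetting` factor in (0, 1]. A sketch of recalibration-in-the-
    large, not a method from the article."""
    observed = sum(y_true) / len(y_true)
    expected = sum(y_prob) / len(y_prob)
    logit = lambda p: math.log(p / (1 - p))
    return intercept + forgetting * (logit(observed) - logit(expected))
```

In a "living" CPM this update would run whenever a new batch arrives, with validation applied to the updating system as a whole rather than to each model revision.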
|
23
|
A Note on Calibration of Clinical Prediction Models with Copas Statistics. Journal of Biostatistics and Epidemiology 2020; 6:305-319. PMID: 37664642. PMCID: PMC10474819. DOI: 10.18502/jbe.v6i4.5687.
Abstract
Background Calibration of clinical prediction models often entails assessing goodness of fit with independent, non-identically distributed Bernoulli random variables. We here investigate two statistics studied by Copas in this setting. Materials and Methods We present distribution theory and a simulation study to compare the operating characteristics of the Copas statistics. Results In our simulation study with relatively small sample sizes, we found a simple Cornish-Fisher approximation to tail quantiles of the distributions of the Copas statistics to perform adequately. Upon illustrating their use in a calibration study relating to prediction of atherosclerotic cardiovascular disease risk, power properties appear to reflect the differential weighting accorded to observations, as evinced with other goodness-of-fit statistics. Conclusion The Copas statistics are easily implemented, have proven value in other contexts, and appear to be underutilized in calibration studies. They ought to be part of the armamentarium of calibration tools for all researchers.
|
24
|
Prognostic Model and Nomogram for Estimating Survival of Small Breast Cancer: A SEER-based Analysis. Clin Breast Cancer 2020; 21:e497-e505. PMID: 33277191. DOI: 10.1016/j.clbc.2020.11.006.
Abstract
BACKGROUND Different clinicopathologic characteristics could contribute to inconsistent prognoses of small breast neoplasms (T1a/T1b). This study conducted a retrospective analysis and established a clinical prediction model to predict individual survival outcomes of patients with small carcinomas of the breast. MATERIALS AND METHODS Based on the Surveillance, Epidemiology, and End Results (SEER) database, eligible patients with small breast carcinomas were analyzed. Univariate analysis and multivariate analysis were performed to clarify the indicators of overall survival. Pooled risk factors were used to construct nomograms predicting 3-year, 5-year, and 10-year survival of patients with small breast cancer. The model was internally validated for discrimination and calibration. RESULTS A total of 17,543 patients with small breast neoplasms diagnosed between 2013 and 2016 were enrolled. Histologic grade, lymph node stage, estrogen receptor or progesterone receptor status, and molecular subtype of breast cancer were identified as risk factors of prognosis in a Cox proportional hazards model (P < .05). A nomogram was constructed to predict the individual survival rate of patients with small breast neoplasms. CONCLUSIONS This prognostic model provided a robust and effective method to predict the prognosis of patients with small breast cancer.
|
25
|
Predicting Outcomes After Surgical Decompression for Mild Degenerative Cervical Myelopathy: Moving Beyond the mJOA to Identify Surgical Candidates. Neurosurgery 2020; 86:565-573. PMID: 31225604. DOI: 10.1093/neuros/nyz160.
Abstract
BACKGROUND Patients with mild degenerative cervical myelopathy (DCM) represent a heterogeneous population, and indications for surgical decompression remain controversial. OBJECTIVE To dissociate patient phenotypes within the broader population of mild DCM associated with degree of impairment in baseline quality of life (QOL) and surgical outcomes. METHODS This was a post hoc analysis of patients with mild DCM (modified Japanese Orthopedic Association [mJOA] 15-17) enrolled in the AOSpine CSM-NA/CSM-I studies. A k-means clustering algorithm was applied to baseline QOL (Short Form-36 [SF-36]) scores to separate patients into 2 clusters. Baseline variables and surgical outcomes (change in SF-36 scores at 1 yr) were compared between clusters. A k-nearest neighbors (kNN) algorithm was used to evaluate the ability to classify patients into the 2 clusters by significant baseline clinical variables. RESULTS One hundred eighty-five patients were eligible. Two groups were generated by k-means clustering. Cluster 1 had a greater proportion of females (44% vs 28%, P = .029) and symptoms of neck pain (32% vs 11%, P = .001), gait difficulty (57% vs 40%, P = .025), or weakness (75% vs 59%, P = .041). Although baseline mJOA correlated with neither baseline QOL nor outcomes, cluster 1 was associated with significantly greater improvement in disability (P = .003) and QOL (P < .001) scores following surgery. A kNN algorithm could predict cluster classification with 71% accuracy by neck pain, motor symptoms, and gender alone. CONCLUSION We have dissociated a distinct patient phenotype of mild DCM, characterized by neck pain, motor symptoms, and female gender associated with greater impairment in QOL and greater response to surgery.
|
26
|
Counterfactual clinical prediction models could help to infer individualized treatment effects in randomized controlled trials-An illustration with the International Stroke Trial. J Clin Epidemiol 2020; 125:47-56. PMID: 32464321. DOI: 10.1016/j.jclinepi.2020.05.022.
Abstract
OBJECTIVE Causal treatment effects are estimated at the population level in randomized controlled trials, while clinical decisions are often made at the individual level in practice. We aim to show how clinical prediction models used under a counterfactual framework may help to infer individualized treatment effects. STUDY DESIGN AND SETTING As an illustrative example, we reanalyze the International Stroke Trial. This large, multicenter trial enrolled 19,435 adult patients with suspected acute ischemic stroke from 36 countries, and reported a modest average benefit of aspirin (vs. no aspirin) on a composite outcome of death or dependency at 6 months. We derive and validate multivariable logistic regression models that predict the patient counterfactual risks of outcome with and without aspirin, conditionally on 23 predictors. RESULTS The counterfactual prediction models display good performance in terms of calibration and discrimination (validation c-statistics: 0.798 and 0.794). Comparing the counterfactual predicted risks on an absolute difference scale, we show that aspirin, despite an average benefit, may increase the risk of death or dependency at 6 months (compared with the control) in a quarter of stroke patients. CONCLUSIONS Counterfactual prediction models could help researchers and clinicians (i) infer individualized treatment effects and (ii) better target patients who may benefit from treatments.
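Under this counterfactual framework, a patient's individualized effect is simply the difference between two predicted risks. A minimal sketch with hypothetical coefficient vectors (the trial models conditioned on 23 predictors; all names and numbers below are illustrative):

```python
import math

def risk(coefs, intercept, x):
    """Predicted probability from a logistic model with coefficient
    vector `coefs`, intercept, and covariate vector `x`."""
    lp = intercept + sum(b * v for b, v in zip(coefs, x))
    return 1 / (1 + math.exp(-lp))

def individualized_effect(model_treated, model_control, x):
    """Counterfactual risk difference for one patient: predicted risk
    under treatment minus predicted risk under control. Negative values
    favour treatment when the outcome is harmful (e.g. death or
    dependency). Each model is a (coefs, intercept) pair assumed to have
    been fitted in the corresponding trial arm."""
    coefs_t, int_t = model_treated
    coefs_c, int_c = model_control
    return risk(coefs_t, int_t, x) - risk(coefs_c, int_c, x)
```

Computing this difference for every trial patient is what allows the authors to identify the subgroup in which aspirin may increase risk despite the average benefit.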
|
27
|
Missing data should be handled differently for prediction than for description or causal explanation. J Clin Epidemiol 2020; 125:183-187. PMID: 32540389. DOI: 10.1016/j.jclinepi.2020.03.028.
Abstract
Missing data are much studied in epidemiology and statistics. Theoretical development and application of methods for handling missing data have mostly been conducted in the context of prospective research data and with a goal of description or causal explanation. However, it is now common to build predictive models using routinely collected data, where missing patterns may convey important information, and one might take a pragmatic approach to optimizing prediction. Therefore, different methods to handle missing data may be preferred. Furthermore, an underappreciated issue in prediction modeling is that the missing data method used in model development may not match the method used when a model is deployed. This may lead to overoptimistic assessments of model performance. For prediction, particularly with routinely collected data, methods for handling missing data that incorporate information within the missingness pattern should be explored and further developed. Where missing data methods differ between model development and model deployment, the implications of this must be explicitly evaluated. The trade-off between building a prediction model that is causally principled, and building a prediction model that maximizes the use of all available information, should be carefully considered and will depend on the intended use of the model.
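One simple way to let a prediction model exploit the missingness pattern itself, as the commentary suggests exploring, is the missing-indicator approach, where each feature is paired with a flag recording whether it was observed. A sketch (illustrative, not a method the article endorses for every setting; the same encoding must then be used at deployment):

```python
def add_missing_indicators(rows, fill_value=0.0):
    """Expand each feature into (filled value, missing flag) so a model
    can learn from the missingness pattern itself. `rows` is a list of
    feature lists, with None marking a missing entry."""
    expanded = []
    for row in rows:
        new_row = []
        for value in row:
            missing = value is None
            new_row.append(fill_value if missing else value)
            new_row.append(1.0 if missing else 0.0)
        expanded.append(new_row)
    return expanded
```

The article's warning applies directly: if development data are encoded this way but deployment data are imputed differently, reported performance will be overoptimistic.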
|
28
|
When predictions are used to allocate scarce health care resources: three considerations for models in the era of Covid-19. Diagn Progn Res 2020; 4:11. PMID: 32455168. PMCID: PMC7238723. DOI: 10.1186/s41512-020-00079-y.
Abstract
BACKGROUND The need for life-saving interventions such as mechanical ventilation may threaten to outstrip resources during the Covid-19 pandemic. Allocation of these resources to those most likely to benefit can be supported by clinical prediction models. The ethical and practical considerations relevant to predictions supporting decisions about microallocation are distinct from those that inform shared decision-making in ways important for model design. MAIN BODY We review three issues of importance for microallocation: (1) Prediction of benefit (or of medical futility) may be technically very challenging; (2) When resources are scarce, calibration is less important for microallocation than is ranking to prioritize patients, since capacity determines thresholds for resource utilization; (3) The concept of group fairness, which is not germane in shared decision-making, is of central importance in microallocation. Therefore, model transparency is important. CONCLUSION Prediction supporting allocation of life-saving interventions should be explicit, data-driven, frequently updated and open to public scrutiny. This implies a preference for simple, easily understood and easily applied prognostic models.
|
29
|
In-depth mining of clinical data: the construction of clinical prediction model with R. Annals of Translational Medicine 2019; 7:796. PMID: 32042812. DOI: 10.21037/atm.2019.08.63.
Abstract
This article is the first in a 16-section methodology series on the construction of clinical prediction models. The first section introduces the concept, current applications, construction methods and processes, and classification of clinical prediction models, together with the prerequisites for conducting such research and the problems currently faced. The second section concentrates on variable-screening methods in multivariate regression analysis. The third introduces the construction of prediction models based on logistic regression and nomogram drawing, and the fourth covers the Cox proportional hazards regression model and its nomogram. The fifth introduces the calculation of the C-statistic for logistic regression models, and the sixth introduces two common methods of calculating the C-index for Cox regression in R. The seventh and eighth focus on the principles and calculation, in R, of the net reclassification index (NRI) and the integrated discrimination index (IDI), respectively. The ninth explores a method for evaluating clinical utility after model construction, decision curve analysis, and the tenth supplements it with decision curve analysis for survival outcome data. The eleventh discusses external validation of logistic regression models. The twelfth discusses in-depth evaluation of Cox regression models in R, including calculation of the concordance index (C-index) in the validation data set and drawing of the calibration curve. The thirteenth introduces how to handle survival outcomes using the competing-risks model in R, and the fourteenth how to draw the nomogram of the competing-risks model in R. The fifteenth discusses the identification of outliers and the imputation of missing values. The sixteenth introduces advanced variable-selection methods for linear models, such as ridge regression and LASSO regression.
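The LASSO variable selection the series closes with can be illustrated outside R. Below is a minimal, hypothetical Python sketch of LASSO estimation by coordinate descent with soft-thresholding, not the glmnet-based workflow the series itself presents; `soft_threshold` and `lasso_cd` are illustrative names, and predictors are assumed to be pre-standardized (mean 0, variance 1) with a centred outcome.

```python
def soft_threshold(rho, lam):
    """Soft-thresholding operator: shrinks rho towards zero by lam."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate-descent LASSO for a linear model.
    Assumes each column of X is standardized and y is centred."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # correlation of predictor j with the partial residual (excluding j)
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            beta[j] = soft_threshold(rho, lam)
    return beta
```

Predictors whose coefficients shrink exactly to zero at a given `lam` are dropped; the depression-model study above then refits the retained variables with ordinary binary logistic regression before building the final model.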
|
30
|
Reply: Prediction of the Left Ventricular Functional Outcome by Myocardial Extracellular Volume Fraction Measured Using Magnetic Resonance Imaging; Methodological Issue. Korean J Radiol 2019; 20:1311-1312. [PMID: 31339019 PMCID: PMC6658882 DOI: 10.3348/kjr.2019.0115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2019] [Accepted: 03/05/2019] [Indexed: 11/15/2022]
|
31
|
A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J Clin Epidemiol 2019; 110:12-22. [PMID: 30763612 DOI: 10.1016/j.jclinepi.2019.02.004] [Citation(s) in RCA: 750] [Impact Index Per Article: 150.0] [Received: 12/05/2018] [Revised: 01/18/2019] [Accepted: 02/05/2019] [Indexed: 02/06/2023]
Abstract
OBJECTIVES The objective of this study was to compare performance of logistic regression (LR) with machine learning (ML) for clinical prediction modeling in the literature. STUDY DESIGN AND SETTING We conducted a Medline literature search (1/2016 to 8/2017) and extracted comparisons between LR and ML models for binary outcomes. RESULTS We included 71 of 927 studies. The median sample size was 1,250 (range 72-3,994,872), with 19 predictors considered (range 5-563) and eight events per predictor (range 0.3-6,697). The most common ML methods were classification trees, random forests, artificial neural networks, and support vector machines. In 48 (68%) studies, we observed potential bias in the validation procedures. Sixty-four (90%) studies used the area under the receiver operating characteristic curve (AUC) to assess discrimination. Calibration was not addressed in 56 (79%) studies. We identified 282 comparisons between an LR and ML model (AUC range, 0.52-0.99). For 145 comparisons at low risk of bias, the difference in logit(AUC) between LR and ML was 0.00 (95% confidence interval, -0.18 to 0.18). For 137 comparisons at high risk of bias, logit(AUC) was 0.34 (0.20-0.47) higher for ML. CONCLUSION We found no evidence of superior performance of ML over LR. Improvements in methodology and reporting are needed for studies that compare modeling algorithms.
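The review's comparisons on the logit(AUC) scale can be reproduced from first principles. The sketch below is a hypothetical Python illustration (function names are mine, not the review's): AUC computed as the Mann–Whitney probability that a positive case outranks a negative one, then two models compared by the difference of their logit-transformed AUCs.

```python
import math

def auc_mann_whitney(scores_pos, scores_neg):
    """AUC as the probability that a positive case scores above a
    negative case, with ties counted as one half."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in scores_pos for n in scores_neg
    )
    return wins / (len(scores_pos) * len(scores_neg))

def logit(p):
    """Log-odds transform used by the review to compare AUCs."""
    return math.log(p / (1 - p))

# Two hypothetical model AUCs, compared on the logit scale
auc_lr, auc_ml = 0.75, 0.80
delta = logit(auc_ml) - logit(auc_lr)
```

A `delta` of 0.00 with a confidence interval spanning zero, as the review found for low-risk-of-bias comparisons, means no systematic discrimination advantage for either modelling family.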
|
32
|
Development of diagnostic prediction tools for bacteraemia caused by third-generation cephalosporin-resistant enterobacteria in suspected bacterial infections: a nested case-control study. Clin Microbiol Infect 2018; 24:1315-1321. [PMID: 29581056 DOI: 10.1016/j.cmi.2018.03.023] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 08/17/2017] [Revised: 02/24/2018] [Accepted: 03/13/2018] [Indexed: 01/23/2023]
Abstract
OBJECTIVES Current guidelines for empirical antibiotic treatment predict the presence of third-generation cephalosporin-resistant enterobacterial bacteraemia (3GCR-E-Bac) only poorly in cases of infection, thereby increasing unnecessary carbapenem use. We aimed to develop diagnostic scoring systems that better predict the presence of 3GCR-E-Bac. METHODS A retrospective nested case-control study was performed that included patients ≥18 years of age from eight Dutch hospitals in whom blood cultures were obtained and intravenous antibiotics were initiated. Each patient with 3GCR-E-Bac was matched to four control infection episodes within the same hospital, based on blood-culture date and onset location (community or hospital). Starting from 32 commonly described clinical risk factors at infection onset, selection strategies were used to derive scoring systems for the probability of community- and hospital-onset 3GCR-E-Bac. RESULTS 3GCR-E-Bac occurred in 90 of 22,506 (0.4%) community-onset infections and in 82 of 8,110 (1.0%) hospital-onset infections, and these cases were matched to 360 community-onset and 328 hospital-onset control episodes. The derived community-onset and hospital-onset scoring systems consisted of six and nine predictors, respectively. With the selected score cut-offs, the models identified 3GCR-E-Bac with sensitivity equal to that of existing guidelines (community-onset: 54.3%; hospital-onset: 81.5%), but they reduced the proportion of patients classified as at risk of 3GCR-E-Bac (i.e. eligible for empirical carbapenem therapy) by 40% (95% CI 21-56%) in community-onset infections and by 49% (95% CI 39-58%) in hospital-onset infections. CONCLUSIONS These prediction scores for 3GCR-E-Bac, specifically geared towards the initiation of empirical antibiotic treatment, may improve the balance between inappropriate antibiotic use and carbapenem overuse.
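The abstract does not give the point values of the derived scores, but the trade-off a cut-off makes between sensitivity and the proportion of patients flagged for empirical carbapenem therapy can be sketched. The Python below is a hypothetical illustration with made-up scores and labels, not the published scoring system.

```python
def evaluate_cutoff(scores, labels, cutoff):
    """Sensitivity and flagged proportion for a clinical risk-score cutoff.
    labels: 1 = case (e.g. 3GCR-E-Bac), 0 = control."""
    flagged = [s >= cutoff for s in scores]
    positives = sum(labels)
    # sensitivity: fraction of true cases the cutoff catches
    sens = sum(1 for f, l in zip(flagged, labels) if f and l) / positives
    # proportion of all patients the cutoff marks as at risk
    prop_flagged = sum(flagged) / len(scores)
    return sens, prop_flagged
```

Raising the cutoff lowers the flagged proportion (fewer empirical carbapenems) at the cost of sensitivity; the study's contribution was finding cut-offs that kept sensitivity at the guideline level while shrinking the flagged group.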
|
33
|
Clinical prediction in defined populations: a simulation study investigating when and how to aggregate existing models. BMC Med Res Methodol 2017; 17:1. [PMID: 28056835 PMCID: PMC5217317 DOI: 10.1186/s12874-016-0277-1] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 08/17/2016] [Accepted: 12/15/2016] [Indexed: 12/23/2022]
Abstract
Background Clinical prediction models (CPMs) are increasingly deployed to support healthcare decisions, but they are derived inconsistently, in part due to limited data. An emerging alternative is to aggregate existing CPMs developed for similar settings and outcomes. This simulation study aimed to investigate the impact of between-population heterogeneity and sample size on aggregating existing CPMs in a defined population, compared with developing a model de novo. Methods Simulations were designed to mimic a scenario in which multiple CPMs for a binary outcome had been derived in distinct, heterogeneous populations, with potentially different predictors available in each. We then generated a new 'local' population and compared the performance of CPMs developed for this population by aggregation, using stacked regression, principal component analysis or partial least squares, with redevelopment from scratch using backwards selection and penalised regression. Results While the redevelopment approaches resulted in models that were miscalibrated for local datasets of fewer than 500 observations, the model aggregation methods were well calibrated across all simulation scenarios. When the local dataset contained fewer than 1000 observations and between-population heterogeneity was small, aggregating existing CPMs gave better discrimination and the lowest mean squared error in the predicted risks compared with deriving a new model. Conversely, with more than 1000 observations and substantial between-population heterogeneity, redevelopment outperformed the aggregation approaches. In all other scenarios, aggregation and de novo derivation gave similar predictive performance. Conclusion This study demonstrates a pragmatic approach to contextualising CPMs to defined populations. When aiming to develop models in defined populations, modellers should consider existing CPMs, with aggregation being a suitable modelling strategy particularly when local data are sparse.
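Of the aggregation approaches compared, stacked regression is the simplest to sketch: fit a local logistic regression whose inputs are the linear predictors (logits) of the existing CPMs, so the local data decide how much weight each donor model deserves. The Python below is a hypothetical illustration using plain gradient ascent on the log-likelihood, not the authors' implementation; `stack_models` and `predict_stacked` are illustrative names.

```python
import math

def expit(z):
    """Inverse logit."""
    return 1 / (1 + math.exp(-z))

def stack_models(model_logits, y, lr=0.1, n_iter=2000):
    """Stacked regression: fit local logistic weights over the logits of
    existing CPMs (one column per donor model). Returns [intercept, w1, ...]."""
    n, m = len(model_logits), len(model_logits[0])
    w = [0.0] * (m + 1)  # w[0] is the intercept
    for _ in range(n_iter):
        for i in range(n):
            z = w[0] + sum(w[j + 1] * model_logits[i][j] for j in range(m))
            err = y[i] - expit(z)  # gradient of the Bernoulli log-likelihood
            w[0] += lr * err
            for j in range(m):
                w[j + 1] += lr * err * model_logits[i][j]
    return w

def predict_stacked(w, logits_row):
    """Local risk prediction from the donor models' logits."""
    return expit(w[0] + sum(w[j + 1] * logits_row[j] for j in range(len(logits_row))))
```

With sparse local data the stacked weights stay near a simple average of the donor models, which is exactly the regime in which the simulation found aggregation to beat redevelopment.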
|
34
|
Development and validation of a prediction model for long-term sickness absence based on occupational health survey variables. Disabil Rehabil 2016; 40:168-175. [PMID: 27830962 DOI: 10.1080/09638288.2016.1247471] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 10/20/2022]
Abstract
PURPOSE The purpose of this study is to develop and validate a prediction model for identifying employees at increased risk of long-term sickness absence (LTSA), using variables commonly measured in occupational health surveys. MATERIALS AND METHODS Based on the literature, 15 predictor variables were retrieved from the Danish National Working Environment Survey (DANES) and included in a model predicting incident LTSA (≥4 consecutive weeks) during 1-year follow-up in a sample of 4000 DANES participants. The 15-predictor model was reduced by backward stepwise statistical techniques and then validated in a sample of 2524 DANES participants not included in the development sample. Identification of employees at increased LTSA risk was investigated by receiver operating characteristic (ROC) analysis; the area under the ROC curve (AUC) reflected discrimination between employees with and without LTSA during follow-up. RESULTS The 15-predictor model was reduced to a 9-predictor model including age, gender, education, self-rated health, mental health, prior LTSA, work ability, emotional job demands, and recognition by the management. Discrimination by the 9-predictor model was significant (AUC = 0.68; 95% CI 0.61-0.76), but not practically useful. CONCLUSIONS A prediction model based on occupational health survey variables identified employees at increased LTSA risk, but should be developed further into a practically useful tool for predicting LTSA risk in the general working population. Implications for rehabilitation Long-term sickness absence risk predictions would enable healthcare providers to refer high-risk employees to rehabilitation programs aimed at preventing or reducing work disability. A prediction model based on health survey variables discriminates between employees at high and low risk of long-term sickness absence, but the discrimination was not practically useful.
Health survey variables provide insufficient information to determine long-term sickness absence risk profiles. There is a need for new variables, based on the knowledge and experience of rehabilitation professionals, to improve long-term sickness absence risk profiles.
|
35
|
Understanding clinical prediction models as 'innovations': a mixed methods study in UK family practice. BMC Med Inform Decis Mak 2016; 16:106. [PMID: 27506547 PMCID: PMC4977891 DOI: 10.1186/s12911-016-0343-y] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 04/13/2016] [Accepted: 07/30/2016] [Indexed: 11/12/2022]
Abstract
Background Well-designed clinical prediction models (CPMs) often outperform clinicians at estimating probabilities of clinical outcomes, yet their adoption by family physicians is variable. How family physicians interact with CPMs is poorly understood; a better understanding, framed within a context-sensitive theoretical framework, may improve CPM development and implementation. The aim of this study was to investigate why family physicians do or do not use CPMs, interpreting the findings within a theoretical framework to provide recommendations for the development and implementation of future CPMs. Methods Mixed methods study in North West England comprising an online survey and focus groups. Results One hundred and thirty-eight respondents completed the survey, which found that the main perceived advantages of CPMs were that they guided appropriate treatment (weighted rank [r] = 299; maximum r = 414 throughout), justified treatment decisions (r = 217), and incorporated a large body of evidence (r = 156). The most commonly reported barriers to using CPMs were lack of time (r = 163), irrelevance to some patients (r = 161), and poor integration with electronic health records (r = 147). Eighteen clinicians participated in two focus groups (nine in each), which revealed 13 interdependent themes affecting CPM use under three overarching domains: clinician factors, CPM factors and contextual factors. The interdependence of the themes indicates the tensions family physicians experience in providing evidence-based care for individual patients. Conclusions The survey and focus groups showed that CPMs were valued when they were robust and supported clinical decision making. Barriers to their use related to their being time-consuming, difficult to use and not always adding value. Therefore, to be successful, CPMs should offer a relative advantage over current practice, be easy to implement, be supported by training, policy and guidelines, and fit within the organisational culture.
|
36
|
An external validation of models to predict the onset of chronic kidney disease using population-based electronic health records from Salford, UK. BMC Med 2016; 14:104. [PMID: 27401013 PMCID: PMC4940699 DOI: 10.1186/s12916-016-0650-2] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 04/08/2016] [Accepted: 06/27/2016] [Indexed: 12/31/2022]
Abstract
BACKGROUND Chronic kidney disease (CKD) is a major and increasing constituent of disease burdens worldwide. Early identification of patients at increased risk of developing CKD can guide interventions to slow disease progression, initiate timely referral to appropriate kidney care services, and support targeting of care resources. Risk prediction models can extend laboratory-based CKD screening to earlier stages of disease; however, to date, only a few of them have been externally validated or directly compared outside their development populations. Our objective was to validate published CKD prediction models applicable in primary care. METHODS We synthesised two recent systematic reviews of CKD risk prediction models and externally validated selected models for a 5-year horizon of disease onset. We used linked, anonymised, structured (coded) primary and secondary care data from patients resident in Salford (population ~234,000), UK. All adult patients with at least one record in 2009 were followed up until the end of 2014, death, or CKD onset (n = 178,399). CKD onset was defined as repeated impaired eGFR measures over a period of at least 3 months, or physician diagnosis of CKD Stage 3-5. For each model, we assessed discrimination, calibration, and decision curve analysis. RESULTS Seven relevant CKD risk prediction models were identified. Five models also had an associated simplified scoring system. All models discriminated well between patients developing CKD or not, with c-statistics around 0.90. Most of the models were poorly calibrated to our population, substantially over-predicting risk. The two models that did not require recalibration were also the ones that had the best performance in the decision curve analysis. CONCLUSIONS The included CKD prediction models showed good discriminative ability but over-predicted the actual 5-year CKD risk in English primary care patients. QKidney, the only UK-developed model, outperformed the others.
Clinical prediction models should be (re)calibrated for their intended uses.
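Recalibration of an over-predicting model, as the conclusion recommends, can be as simple as calibration-in-the-large: shift every predicted risk by a constant on the logit scale until the mean recalibrated risk equals the observed event rate. The Python below is a hypothetical sketch of that intercept update (a full recalibration would also refit the calibration slope); the function name is illustrative.

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def expit(z):
    return 1 / (1 + math.exp(-z))

def recalibrate_intercept(pred_risks, observed_rate, lo=-10.0, hi=10.0, tol=1e-10):
    """Calibration-in-the-large: find the shift 'a' on the logit scale so
    that the mean of expit(logit(p) + a) matches the observed event rate.
    The mean is monotone increasing in 'a', so bisection suffices."""
    def mean_risk(a):
        return sum(expit(logit(p) + a) for p in pred_risks) / len(pred_risks)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean_risk(mid) < observed_rate:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

A model that over-predicts, as most did in this validation, yields a negative shift; applying it preserves the model's discrimination (the ranking of patients is unchanged) while fixing average calibration.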
|