1
Yang X, Luo M, Jiang Y. The regulatory effect of zinc on the association between periodontitis and atherosclerotic cardiovascular disease: a cross-sectional study based on the National Health and Nutrition Examination Survey. BMC Oral Health 2024; 24:703. [PMID: 38890599 PMCID: PMC11184828 DOI: 10.1186/s12903-024-04473-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2024] [Accepted: 06/11/2024] [Indexed: 06/20/2024] Open
Abstract
BACKGROUND: Zinc has been shown to be effective against periodontitis and has also been reported to reduce the risk of cardiovascular disease (CVD). This study aimed to explore the regulatory effect of zinc intake on the association between periodontitis and atherosclerotic cardiovascular disease (ASCVD). METHODS: This was a cross-sectional study based on the National Health and Nutrition Examination Survey (NHANES). Logistic regression models were used to explore the association of zinc-RDA and of periodontitis with 10-year ASCVD risk ≥ 20%, with results reported as odds ratios (OR) and 95% confidence intervals (95% CI). The regulatory effect of zinc intake on the association between periodontitis and 10-year ASCVD risk ≥ 20% was also assessed using logistic regression. Subgroup analyses were performed by age, gender, obesity, education level, lipid-lowering therapy, and dental floss use. RESULTS: A total of 6,075 patients were included in the final analysis. Zinc intake reaching the recommended level (OR = 0.82, 95% CI: 0.69-0.98) and periodontitis (OR = 2.47, 95% CI: 2.04-3.00) were associated with 0.82-fold and 2.47-fold odds of 10-year ASCVD risk ≥ 20%, respectively. In addition, the odds of 10-year ASCVD risk ≥ 20% were lower in patients with zinc intake reaching the recommended level than in those without [OR (95% CI): 2.25 (1.81-2.80) vs. 2.72 (2.05-3.62)]. A similar regulatory effect was found in patients aged ≥ 60 years and < 60 years, in males and females, with or without obesity, across education levels, with or without lipid-lowering therapy, and with or without use of dental floss (all P < 0.05). CONCLUSIONS: This study found a regulatory effect of adequate zinc intake on the association between periodontitis and ASCVD, providing guidance for periodontitis patients on decreasing the risk of ASCVD.
Affiliation(s)
- Xiuxiu Yang
- Department of Stomatology, Ziyang Central Hospital, No.66 Rende Western Road, Yanjiang District, Ziyang, 641300, P.R. China
- Maoyu Luo
- Department of Stomatology, Ziyang Central Hospital, No.66 Rende Western Road, Yanjiang District, Ziyang, 641300, P.R. China
- Yao Jiang
- Department of Stomatology, Ziyang Central Hospital, No.66 Rende Western Road, Yanjiang District, Ziyang, 641300, P.R. China
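As a rough illustration of the odds ratios and 95% confidence intervals reported in the zinc and periodontitis abstract above, the following sketch computes an unadjusted OR with a Wald interval from a 2×2 table. This is a simplification: the study used logistic regression models, and the function name `odds_ratio_ci` and any counts passed to it are hypothetical, not taken from the paper.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed with outcome,   b: exposed without outcome
    c: unexposed with outcome, d: unexposed without outcome
    (hypothetical layout, chosen for illustration only)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

With made-up counts, `odds_ratio_ci(20, 80, 10, 90)` returns an OR of 2.25 together with its interval bounds. An adjusted OR from a logistic regression model would generally differ from this unadjusted value.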
2
Planterose Jiménez B, Kayser M, Vidaki A, Caliebe A. Adaptive predictor-set linear model: An imputation-free method for linear regression prediction on data sets with missing values. Biom J 2024; 66:e2300090. [PMID: 38813859 DOI: 10.1002/bimj.202300090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Revised: 03/25/2024] [Accepted: 04/01/2024] [Indexed: 05/31/2024]
Abstract
Linear regression (LR) is widely used in data analysis for continuous outcomes in biomedicine and epidemiology. Despite its popularity, LR is incompatible with missing data, which frequently occur in health sciences. For parameter estimation, this shortcoming is usually resolved by complete-case analysis or imputation. Both work-arounds, however, are inadequate for prediction, since they either fail to predict on incomplete records or ignore missingness-induced reduction in prediction accuracy and rely on (unrealistic) assumptions about the missingness mechanism. Here, we derive the adaptive predictor-set linear model (aps-lm), capable of making predictions for incomplete data without the need for imputation. It is derived by using a predictor-selection operation, the Moore-Penrose pseudoinverse, and the reduced QR decomposition. aps-lm is an LR generalization that inherently handles missing values. It is applied to a reference data set, where complete predictors and outcome are available, and yields a set of privacy-preserving parameters. In a second stage, these are shared for making predictions of the outcome on external data sets with missing entries for predictors without imputation. Moreover, aps-lm computes prediction errors that account for the pattern of missing values even under extreme missingness. We benchmark aps-lm in a simulation study. aps-lm showed greater prediction accuracy and reduced bias compared to popular imputation strategies under a wide range of scenarios including variation of sample size, goodness of fit, missing value type, and covariance structure. Finally, as a proof-of-principle, we apply aps-lm in the context of epigenetic aging clocks, linear models that predict a person's biological age from epigenetic data with promising clinical applications.
Affiliation(s)
- Benjamin Planterose Jiménez
- Department of Genetic Identification, Erasmus MC, University Medical Center Rotterdam, Rotterdam, the Netherlands
- Manfred Kayser
- Department of Genetic Identification, Erasmus MC, University Medical Center Rotterdam, Rotterdam, the Netherlands
- Athina Vidaki
- Department of Genetic Identification, Erasmus MC, University Medical Center Rotterdam, Rotterdam, the Netherlands
- Amke Caliebe
- Institute of Medical Informatics and Statistics, Kiel University, Kiel, Germany
- University Medical Centre Schleswig-Holstein, Kiel, Germany
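The two-stage idea in the aps-lm abstract above (summarise a complete reference data set once, then predict on records with missing predictors without imputing them) can be illustrated with a much simpler construction. The sketch below is not the authors' aps-lm derivation: it assumes the shared parameters are just the sample mean vector and covariance matrix of the reference data, and for each new record it forms the least-squares linear predictor restricted to the observed entries, using the Moore-Penrose pseudoinverse. The function names `fit_reference` and `predict_observed` are hypothetical.

```python
import numpy as np

def fit_reference(X, y):
    """Summarise complete reference data by the mean and covariance of [X, y]."""
    Z = np.column_stack([X, y])
    return Z.mean(axis=0), np.cov(Z, rowvar=False)

def predict_observed(x, mu, cov):
    """Predict y from the non-missing entries of x (NaN marks a missing value).

    Uses only the sub-blocks of the reference covariance that involve the
    observed predictors, so no imputation of the missing entries is needed.
    """
    p = len(mu) - 1                      # index of the outcome in mu / cov
    obs = np.flatnonzero(~np.isnan(x))   # indices of observed predictors
    if obs.size == 0:
        return mu[p]                     # nothing observed: fall back to the mean outcome
    cov_oo = cov[np.ix_(obs, obs)]       # covariance among observed predictors
    cov_oy = cov[obs, p]                 # covariance of observed predictors with y
    beta = np.linalg.pinv(cov_oo) @ cov_oy
    return mu[p] + (x[obs] - mu[obs]) @ beta
```

For a record with no missing values this reproduces the ordinary least-squares prediction with an intercept; for an incomplete record it matches the prediction of a model refitted on the observed predictors only. It does not compute the missingness-aware prediction errors described in the abstract.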
3
Wiens MO, Nguyen V, Bone JN, Kumbakumba E, Businge S, Tagoola A, Sherine SO, Byaruhanga E, Ssemwanga E, Barigye C, Nsungwa J, Olaro C, Ansermino JM, Kissoon N, Singer J, Larson CP, Lavoie PM, Dunsmuir D, Moschovis PP, Novakowski S, Komugisha C, Tayebwa M, Mwesigwa D, Knappett M, West N, Mugisha NK, Kabakyenga J. Prediction models for post-discharge mortality among under-five children with suspected sepsis in Uganda: A multicohort analysis. PLOS Global Public Health 2024; 4:e0003050. [PMID: 38683787 PMCID: PMC11057737 DOI: 10.1371/journal.pgph.0003050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 04/04/2024] [Indexed: 05/02/2024]
Abstract
In many low-income countries, over five percent of hospitalized children die following hospital discharge. The lack of available tools to identify those at risk of post-discharge mortality has limited the ability to make progress towards improving outcomes. We aimed to develop algorithms designed to predict post-discharge mortality among children admitted with suspected sepsis. Four prospective cohort studies of children in two age groups (0-6 and 6-60 months) were conducted between 2012 and 2021 in six Ugandan hospitals. Prediction models were derived for six-month post-discharge mortality, based on candidate predictors collected at admission, each with a maximum of eight variables, and internally validated using 10-fold cross-validation. A total of 8,810 children were enrolled: 470 (5.3%) died in hospital; 257 (7.7%) and 233 (4.8%) post-discharge deaths occurred in the 0-6-month and 6-60-month age groups, respectively. The primary models had an area under the receiver operating characteristic curve (AUROC) of 0.77 (95% CI 0.74-0.80) for 0-6-month-olds and 0.75 (95% CI 0.72-0.79) for 6-60-month-olds; mean AUROCs among the 10 cross-validation folds were 0.75 and 0.73, respectively. Calibration across risk strata was good: Brier scores were 0.07 and 0.04, respectively. The most important variables included anthropometry and oxygen saturation. Additional variables included: illness duration, jaundice-age interaction, and a bulging fontanelle among 0-6-month-olds; and prior admissions, coma score, temperature, age-respiratory rate interaction, and HIV status among 6-60-month-olds. Simple prediction models at admission with suspected sepsis can identify children at risk of post-discharge mortality. Further external validation is recommended for different contexts. Models can be digitally integrated into existing processes to improve peri-discharge care as children transition from the hospital to the community.
Affiliation(s)
- Matthew O. Wiens
- Institute for Global Health at BC Children’s and Women’s Hospital, Vancouver, Canada
- Department of Anesthesiology, Pharmacology & Therapeutics, University of British Columbia, Vancouver, Canada
- BC Children’s Hospital Research Institute, Vancouver, Canada
- Walimu, Kampala, Uganda
- Vuong Nguyen
- Institute for Global Health at BC Children’s and Women’s Hospital, Vancouver, Canada
- Jeffrey N. Bone
- BC Children’s Hospital Research Institute, Vancouver, Canada
- Elias Kumbakumba
- Department of Paediatrics and Child Health, Mbarara University of Science and Technology, Mbarara, Uganda
- Abner Tagoola
- Jinja Regional Referral Hospital, Jinja City, Uganda
- Jesca Nsungwa
- Ministry of Health for the Republic of Uganda, Kampala, Uganda
- Charles Olaro
- Ministry of Health for the Republic of Uganda, Kampala, Uganda
- J. Mark Ansermino
- Institute for Global Health at BC Children’s and Women’s Hospital, Vancouver, Canada
- Department of Anesthesiology, Pharmacology & Therapeutics, University of British Columbia, Vancouver, Canada
- BC Children’s Hospital Research Institute, Vancouver, Canada
- Niranjan Kissoon
- BC Children’s Hospital Research Institute, Vancouver, Canada
- Department of Pediatrics, University of British Columbia, Vancouver, Canada
- Joel Singer
- School of Population and Public Health, University of British Columbia, Vancouver, Canada
- Charles P. Larson
- School of Population and Global Health, McGill University, Montréal, Canada
- Pascal M. Lavoie
- BC Children’s Hospital Research Institute, Vancouver, Canada
- Department of Pediatrics, University of British Columbia, Vancouver, Canada
- Dustin Dunsmuir
- Institute for Global Health at BC Children’s and Women’s Hospital, Vancouver, Canada
- BC Children’s Hospital Research Institute, Vancouver, Canada
- Peter P. Moschovis
- Division of Global Health, Massachusetts General Hospital, Boston, MA, United States of America
- Stefanie Novakowski
- Institute for Global Health at BC Children’s and Women’s Hospital, Vancouver, Canada
- Department of Anesthesiology, Pharmacology & Therapeutics, University of British Columbia, Vancouver, Canada
- Martina Knappett
- Institute for Global Health at BC Children’s and Women’s Hospital, Vancouver, Canada
- Nicholas West
- BC Children’s Hospital Research Institute, Vancouver, Canada
- Jerome Kabakyenga
- Maternal Newborn & Child Health Institute, Mbarara University of Science and Technology, Mbarara, Uganda
- Faculty of Medicine, Department of Community Health, Mbarara University of Science and Technology, Mbarara, Uganda
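The post-discharge mortality abstract above reports model performance as AUROC and Brier scores. As a minimal sketch of what these two metrics measure, the functions below compute them directly from binary outcomes and predicted probabilities. The function names are hypothetical, the pairwise AUROC loop is written for clarity rather than speed, and nothing here reproduces the study's data or its cross-validation procedure.

```python
def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)

def auroc(y_true, p_pred):
    """Probability that a random positive case is scored above a random negative one.

    Ties between a positive and a negative score count as half a win.
    Requires at least one positive and one negative case.
    """
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    wins = sum((pp > pn) + 0.5 * (pp == pn) for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))
```

With the toy inputs `y = [0, 0, 1, 1]` and `p = [0.1, 0.4, 0.35, 0.8]`, three of the four positive-negative pairs are ordered correctly, so `auroc(y, p)` is 0.75. Lower Brier scores indicate predictions closer to the observed outcomes, and the attainable range depends on how common the outcome is.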
4
Wang B, Cheng Y, Gail MH, Fine J, Pfeiffer RM. Predicting absolute risk for a person with missing risk factors. Stat Methods Med Res 2024; 33:557-573. [PMID: 38426821 DOI: 10.1177/09622802241227945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/02/2024]
Abstract
We compared methods to project absolute risk, the probability of experiencing the outcome of interest in a given projection interval accommodating competing risks, for a person from the target population with missing predictors. Without missing data, a perfectly calibrated model gives unbiased absolute risk estimates in a new target population, even if the predictor distribution differs from the training data. However, if predictors are missing in target population members, a reference dataset with complete data is needed to impute them and to estimate absolute risk, conditional only on the observed predictors. If the predictor distributions of the reference data and the target population differ, this approach yields biased estimates. We compared the bias and mean squared error of absolute risk predictions for seven methods that assume predictors are missing at random (MAR). Some methods imputed individual missing predictors, others imputed linear predictor combinations (risk scores). Simulations were based on real breast cancer predictor distributions and outcome data. We also analyzed a real breast cancer dataset. The largest bias for all methods resulted from different predictor distributions of the reference and target populations. No method was unbiased in this situation. Surprisingly, violating the MAR assumption did not induce severe biases. Most multiple imputation methods performed similarly and were less biased (but more variable) than a method that used a single expected risk score. Our work shows the importance of selecting predictor reference datasets similar to the target population to reduce bias of absolute risk predictions with missing risk factors.
Affiliation(s)
- Bang Wang
- Department of Statistics, University of Pittsburgh, Pittsburgh, PA, USA
- Yu Cheng
- Department of Statistics, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA
- Mitchell H Gail
- Biostatistics Branch, National Cancer Institute, Rockville, MD, USA
- Jason Fine
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Ruth M Pfeiffer
- Biostatistics Branch, National Cancer Institute, Rockville, MD, USA
5
Hayakawa K, Uchino S, Endo H, Hasegawa K, Kiyota K. Impact of missing values on the ability of the acute physiology and chronic health evaluation III and Japan risk of death models to predict mortality. J Crit Care 2024; 79:154432. [PMID: 37742518 DOI: 10.1016/j.jcrc.2023.154432] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2023] [Revised: 09/10/2023] [Accepted: 09/11/2023] [Indexed: 09/26/2023]
Abstract
PURPOSE: This study assessed how the performance of the Acute Physiology and Chronic Health Evaluation (APACHE) III and Japan Risk of Death (JROD) models degrades with the number and category of missing variables. We also examined the impact of missing data on predicted mortality for facilities with missing physiological variables. METHODS: We obtained data from the Japanese Intensive care PAtient Database (JIPAD). We calculated observed and predicted mortality rates using APACHE III and JROD, and the standardized mortality ratio (SMR), by the number and category of missing variables. Smoothed spline curves were fitted for the SMR against each facility's proportion of missing values. RESULTS: A total of 61,357 patients from 57 ICUs were included between April 2015 and March 2019. The APACHE III and JROD SMRs increased as the number of missing values increased. The SMR in the APACHE III model was elevated in facilities with a larger proportion of missing values in each of the APS categories: arterial blood gas, albumin, glucose, and bilirubin. Facilities with a high proportion of missing albumin data preserved their SMRs only in the JROD model. CONCLUSION: An increased number of missing physiological variables resulted in falsely low predicted mortality rates and high SMRs.
Affiliation(s)
- Katsura Hayakawa
- Department of Intensive Care Medicine, Toranomon Hospital, 2-2-2 Toranomon, Minato-ku, Tokyo 105-8470, Japan; Department of Emergency and Critical Care Medicine, Saitama Red Cross Hospital, 1-5 Shintoshin, Chu-o-ku, Saitama 330-8553, Japan
- Shigehiko Uchino
- Department of Anesthesiology and Intensive Care, Saitama Medical Center, Jichi Medical University, 1-847 Amanuma-cho, Omiya-ku, Saitama 330-0834, Japan
- Hideki Endo
- Department of Healthcare Quality Assessment, Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan
- Kazuki Hasegawa
- Department of Emergency and Critical Care Medicine, Saitama Red Cross Hospital, 1-5 Shintoshin, Chu-o-ku, Saitama 330-8553, Japan
- Kazuya Kiyota
- Department of Emergency and Critical Care Medicine, Saitama Red Cross Hospital, 1-5 Shintoshin, Chu-o-ku, Saitama 330-8553, Japan
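The conclusion of the APACHE III / JROD abstract above (falsely low predicted mortality leading to high SMRs) follows from how the standardized mortality ratio is formed. The sketch below assumes the usual definition, observed deaths divided by expected deaths, with expected deaths taken as the sum of model-predicted probabilities. The function name `smr` is hypothetical and the numbers in the usage note are made up for illustration.

```python
def smr(died, predicted_risk):
    """Standardized mortality ratio: observed deaths / expected deaths.

    died:           iterable of 0/1 outcomes, one per patient
    predicted_risk: iterable of model-predicted death probabilities
    """
    observed = sum(died)
    expected = sum(predicted_risk)  # expected deaths under the model
    return observed / expected
```

For four hypothetical patients with outcomes `[1, 0, 0, 1]` and predicted risks `[0.5, 0.25, 0.25, 1.0]`, expected deaths equal 2.0 and the SMR is 1.0. If the same patients receive halved risks `[0.25, 0.125, 0.125, 0.5]`, for example because missing physiological values lowered their scores, expected deaths fall to 1.0 and the SMR doubles to 2.0 although the observed outcomes are unchanged.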
6
Mertens E, Keuchkarian M, Vasquez MS, Vandevijvere S, Peñalvo JL. Lifestyle predictors of colorectal cancer in European populations: a systematic review. BMJ Nutr Prev Health 2024; 7:183-190. [PMID: 38966096 PMCID: PMC11221299 DOI: 10.1136/bmjnph-2022-000554] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Accepted: 10/10/2023] [Indexed: 07/06/2024] Open
Abstract
Background: Colorectal cancer (CRC) is the second most prevalent cancer in Europe, with one-fifth of cases attributable to unhealthy lifestyles. Risk prediction models for quantifying CRC risk and identifying high-risk groups have been developed or validated across European populations, some considering lifestyle as a predictor. Purpose: To identify lifestyle predictors considered in existing risk prediction models applicable to European populations and to characterise their corresponding parameter values, for an improved understanding of their relative contribution to prediction across different models. Methods: A systematic review was conducted in PubMed and Web of Science from January 2000 to August 2021. Risk prediction models were included if they (1) were developed and/or validated in an adult asymptomatic European population, (2) were based on non-invasively measured predictors and (3) reported mean estimates and uncertainty for the predictors included. To facilitate comparison, model-specific lifestyle predictors were visualised using forest plots. Results: A total of 21 risk prediction models for CRC (reported in 16 studies) were eligible, of which 11 were validated in a European adult population but developed elsewhere, mostly in the USA. All models but two reported at least one lifestyle factor as a predictor. Of the lifestyle factors, the most common predictors were body mass index (BMI) and smoking (each present in 13 models), followed by alcohol (11) and physical activity (7), while diet-related factors were considered less often, the most commonly included being meat (9), vegetables (5) and dairy (2). The independent predictive contribution was generally greater when predictors were collected in greater detail, although effect size estimates for BMI, smoking and alcohol varied noticeably. Conclusions: Early identification of high-risk groups based on lifestyle data offers the potential to encourage participation in lifestyle change and screening programmes, and hence to reduce CRC burden. We propose that the commonly shared lifestyle predictors be used further in public health prediction modelling to improve uptake of the model.
Affiliation(s)
- Elly Mertens
- Unit of Non-Communicable Diseases, Department of Public Health, Institute of Tropical Medicine, Antwerp, Belgium
- Maria Keuchkarian
- Unit of Non-Communicable Diseases, Department of Public Health, Institute of Tropical Medicine, Antwerp, Belgium
- Faculty of Bioscience Engineering, Ghent University, Gent, Belgium
- José L Peñalvo
- Unit of Non-Communicable Diseases, Department of Public Health, Institute of Tropical Medicine, Antwerp, Belgium
- Global Health Institute, University of Antwerp, Wilrijk, Belgium
7
Kang Y, Yan J. Exploring the connection between caffeine intake and constipation: a cross-sectional study using national health and nutrition examination survey data. BMC Public Health 2024; 24:3. [PMID: 38167025 PMCID: PMC10759350 DOI: 10.1186/s12889-023-17502-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2023] [Accepted: 12/15/2023] [Indexed: 01/05/2024] Open
Abstract
BACKGROUND: Caffeine has been reported to increase gastrointestinal motility and change intestinal microbiota. Constipation may be caused by colonic motor dysfunction and colonic microbiome disturbances. In this study, we aimed to explore the association between caffeine intake and constipation. METHODS: This was a cross-sectional study based on the National Health and Nutrition Examination Survey (NHANES). Caffeine intake was assessed using the 24-h dietary recall method, and constipation was defined based on stool consistency or stool frequency. Logistic regression analysis was used to assess the association between caffeine intake and constipation, and results were expressed as odds ratios (OR) with 95% confidence intervals (95% CI). Subgroup analysis was performed based on age. RESULTS: A total of 13,816 participants were included in the final analysis. After adjusting for potential confounders, high caffeine intake was associated with lower odds of constipation (Q3: OR = 0.60, 95% CI: 0.49-0.74; Q4: OR = 0.77, 95% CI: 0.59-0.99; Q5: OR = 0.72, 95% CI: 0.56-0.92). A similar association was found in young and middle-aged people (P < 0.05). CONCLUSION: High caffeine intake was associated with lower odds of constipation. Our findings indicate that individuals should develop an awareness and habit of consuming caffeinated foods and drinks to prevent and relieve constipation.
Affiliation(s)
- Yulong Kang
- Department of Proctology, Northern Jiangsu People's Hospital, No.98 Nantong Western Road, Guangling District, Yangzhou, 225001, P.R. China
- Jin Yan
- Department of Proctology, Northern Jiangsu People's Hospital, No.98 Nantong Western Road, Guangling District, Yangzhou, 225001, P.R. China
8
Tsiampalis T, Panagiotakos D. Methodological issues of the electronic health records' use in the context of epidemiological investigations, in light of missing data: a review of the recent literature. BMC Med Res Methodol 2023; 23:180. [PMID: 37559072 PMCID: PMC10410989 DOI: 10.1186/s12874-023-02004-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2022] [Accepted: 07/27/2023] [Indexed: 08/11/2023] Open
Abstract
BACKGROUND: Electronic health records (EHRs) are widely accepted to enhance health care quality, patient monitoring, and early prevention of various diseases, even when the information in them is incomplete or missing. AIM: The present review sought to investigate the impact of EHR implementation on healthcare quality and medical decisions in the context of epidemiological investigations, considering missing or incomplete data. METHODS: The Google Scholar, Medline (via PubMed) and Scopus databases were searched for studies investigating the impact of EHR implementation on healthcare quality and medical decisions, as well as for studies investigating ways of dealing with missing data and their impact on medical decisions and on the development of prediction models. Electronic searches were carried out up to 2022. RESULTS: EHRs were shown to be an increasingly important tool for physicians, decision makers and patients; they can improve national healthcare systems for the convenience of both patients and doctors, improve the quality of health care, and save money. As far as missing data handling techniques are concerned, several investigators have tried to propose the best possible methodology, yet there is no wide consensus or acceptance in the scientific community, and crucial gaps remain to be addressed. CONCLUSIONS: The present investigation established the importance of implementing EHRs in clinical practice and, at the same time, pointed out the knowledge gap regarding missing data handling techniques.
Affiliation(s)
- Thomas Tsiampalis
- Department of Nutrition and Dietetics, School of Health Sciences and Education, Harokopio University, Athens, Greece
- Demosthenes Panagiotakos
- Department of Nutrition and Dietetics, School of Health Sciences and Education, Harokopio University, Athens, Greece
- Faculty of Health, University of Canberra, Canberra, Australia
9
Lee J, Westphal M, Vali Y, Boursier J, Petta S, Ostroff R, Alexander L, Chen Y, Fournier C, Geier A, Francque S, Wonders K, Tiniakos D, Bedossa P, Allison M, Papatheodoridis G, Cortez-Pinto H, Pais R, Dufour JF, Leeming DJ, Harrison S, Cobbold J, Holleboom AG, Yki-Järvinen H, Crespo J, Ekstedt M, Aithal GP, Bugianesi E, Romero-Gomez M, Torstenson R, Karsdal M, Yunis C, Schattenberg JM, Schuppan D, Ratziu V, Brass C, Duffin K, Zwinderman K, Pavlides M, Anstee QM, Bossuyt PM. Machine learning algorithm improves the detection of NASH (NAS-based) and at-risk NASH: A development and validation study. Hepatology 2023; 78:258-271. [PMID: 36994719 DOI: 10.1097/hep.0000000000000364] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/16/2022] [Accepted: 12/22/2022] [Indexed: 03/31/2023]
Abstract
BACKGROUND AND AIMS: Detecting NASH remains challenging, while at-risk NASH (steatohepatitis and F ≥ 2) tends to progress and is of interest for drug development and clinical application. We developed prediction models using supervised machine learning techniques, with clinical data and biomarkers, to stage and grade patients with NAFLD. APPROACH AND RESULTS: Learning data were collected in the Liver Investigation: Testing Marker Utility in Steatohepatitis metacohort (966 biopsy-proven NAFLD adults), staged and graded according to NASH CRN. Conditions of interest were the clinical trial definition of NASH (NAS ≥ 4; 53%), at-risk NASH (NASH with F ≥ 2; 35%), significant fibrosis (F ≥ 2; 47%), and advanced fibrosis (F ≥ 3; 28%). Thirty-five predictors were included. Missing data were handled by multiple imputation. Data were randomly split into training/validation (75/25) sets. A gradient boosting machine was applied to develop 2 models for each condition: clinical versus extended (clinical and biomarkers). Two variants of the NASH and at-risk NASH models were constructed: direct and composite models. Clinical gradient boosting machine models for steatosis/inflammation/ballooning had AUCs of 0.94/0.79/0.72. There were no improvements when biomarkers were included. The direct NASH model produced AUCs (clinical/extended) of 0.61/0.65. The composite NASH model performed significantly better (0.71) for both variants. The composite at-risk NASH model had an AUC of 0.83 (clinical and extended), an improvement over the direct model. Significant fibrosis models had AUCs (clinical/extended) of 0.76/0.78. The extended advanced fibrosis model (0.86) performed significantly better than the clinical version (0.82). CONCLUSIONS: Detection of NASH and at-risk NASH can be improved by constructing independent machine learning models for each component, using only clinical predictors. Adding biomarkers only improved the accuracy of fibrosis.
Collapse
Affiliation(s)
- Jenny Lee
- Department of Epidemiology and Data Science, Amsterdam UMC, Amsterdam, the Netherlands
| | - Max Westphal
- Fraunhofer Institute for Digital Medicine MEVIS, Bremen, Germany
| | - Yasaman Vali
- Department of Epidemiology and Data Science, Amsterdam UMC, Amsterdam, the Netherlands
| | - Jerome Boursier
- Department of Hepatology, Angers University Hospital, Angers, France
| | - Salvatorre Petta
- Section of Gastroenterology and Hepatology, Promozione della Salute, Materno-Infantile, di Medicina Interna e Specialistica di Eccellenza, Department, University of Palermo, Palermo, Italy
| | | | | | - Yu Chen
- Lilly Research Laboratories, Eli Lilly and Company Ltd (LLY), Indianapolis, Indiana, USA
| | | | - Andreas Geier
- Division of Hepatology, Department of Medicine II, Wurzburg University Hospital, Wurzburg, Germany
| | - Sven Francque
- Department of Gastroenterology Hepatology, Antwerp University Hospital, and Laboratory of Experimental Medicine and Paediatrics, University of Antwerp, Belgium
| | - Kristy Wonders
- Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
| | - Dina Tiniakos
- Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
- Department of Pathology, Aretaieion Hospital, national and Kapodistrian University of Athens, Athens, Greece
| | - Pierre Bedossa
- Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
| | - Mike Allison
- Liver Unit, Department of Medicine, Cambridge NIHR Biomedical Research Centre, Cambridge University NHS Foundation Trust, CB2 0QQ, Cambridge, UK
| | - Georgios Papatheodoridis
- Gastroenterology Department, National and Kapodistrian University of Athens, General Hospital of Athens "Laiko", Athens, Greece
| | - Helena Cortez-Pinto
- Clínica Universitária de Gastrenterologia, Faculdade de Medicina, Universidade de Lisboa, Portugal
| | - Raluca Pais
- Assistance Publique-Hôpitaux de Paris, hôpital Pitié Salpêtrière, Sorbonne University, ICAN (Institute of Cardiometabolism and Nutrition), Paris, France
| | - Jean-Francois Dufour
- Hepatology, Department of Biomedical Research, University of Bern, Bern, Switzerland
| | | | - Stephen Harrison
- Department of Gastroenterology and Hepatology, Oxford NIHR Biomedical Research Centre, John Radcliffe Hospital, Oxford, UK
| | - Jeremy Cobbold
- Department of Gastroenterology and Hepatology, Oxford NIHR Biomedical Research Centre, John Radcliffe Hospital, Oxford, UK
| | - Adriaan G Holleboom
- Department of Internal and Vascular Medicine, Amsterdam University Medical Centres, location AMC, Amsterdam, the Netherlands
| | - Hannele Yki-Järvinen
- Department of Medicine, University of Helsinki and Helsinki University Hospital, Finland; Minerva Foundation Institute for Medical Research, Helsinki, Finland
| | - Javier Crespo
- Department of Gastroenterology and Hepatology, University Hospital Marques de Valdecilla. Research Institute Valdecilla-IDIVAL, Santander, Spain
| | - Mattias Ekstedt
- Department of Health, Medicine and Caring Sciences, Linköping University, Linköping, Sweden
| | - Guruprasad P Aithal
- Nottingham Digestive Diseases Centre, School of Medicine, NIHR Nottingham Biomedical Research Centre, Nottingham University Hospitals NHS Trust and The University of Nottingham, Nottingham, UK
| | - Elisabetta Bugianesi
- Department of Medical Sciences, Division of Gastro-Hepatology, A.O. Città della Salute e della Scienza di Torino, University of Turin, Turin, Italy
| | - Manuel Romero-Gomez
- UCM Digestive Diseases, ciberehd, Virgen del Rocio University Hospital. Institute of Biomedicine of Seville (CSIC/HUVR/US), Department of Medicine, University of Seville, Seville, Spain
| | - Richard Torstenson
- Cardiovascular, Renal and Metabolism Regulatory Affairs, AstraZeneca, Mölndal, Sweden
| | | | - Carla Yunis
- Internal Medicine and Hospital, Global Product Development, Pfizer, Inc, New York, New York, USA
| | - Jörn M Schattenberg
- Metabolic Liver Research Program, I. Department of Medicine, University Medical Center Mainz, Mainz, Germany
| | - Detlef Schuppan
- Institute of Translational Immunology and Research Center for Immune Therapy, University Medical Center Mainz, Mainz, Germany
- Division of Gastroenterology, Beth Israel Medical Center, Harvard Medical School, Boston, Massachusetts, USA
| | - Vlad Ratziu
- Assistance Publique-Hôpitaux de Paris, hôpital Pitié Salpêtrière, Sorbonne University, ICAN (Institute of Cardiometabolism and Nutrition), Paris, France
| | - Clifford Brass
- Novartis Pharmaceuticals Corporation, East Hanover, New Jersey
| | - Kevin Duffin
- Lilly Research Laboratories, Eli Lilly and Company Ltd (LLY), Indianapolis, Indiana, USA
| | - Koos Zwinderman
- Department of Epidemiology and Data Science, Amsterdam UMC, Amsterdam, the Netherlands
| | | | - Quentin M Anstee
- Department of Gastroenterology Hepatology, Antwerp University Hospital, and Laboratory of Experimental Medicine and Paediatrics, University of Antwerp, Belgium
- Newcastle NIHR Biomedical Research Centre, Newcastle upon Tyne Hospitals NHS Trust, Newcastle upon Tyne, UK
| | - Patrick M Bossuyt
- Department of Epidemiology and Data Science, Amsterdam UMC, Amsterdam, the Netherlands
| |
|
10
|
Kokkinakis S, Kritsotakis EI, Paterakis K, Karali GA, Malikides V, Kyprianou A, Papalexandraki M, Anastasiadis CS, Zoras O, Drakos N, Kehagias I, Kehagias D, Gouvas N, Kokkinos G, Pozotou I, Papatheodorou P, Frantzeskou K, Schizas D, Syllaios A, Palios IM, Nastos K, Perdikaris M, Michalopoulos NV, Margaris I, Lolis E, Dimopoulou G, Panagiotou D, Nikolaou V, Glantzounis GK, Pappas-Gogos G, Tepelenis K, Zacharioudakis G, Tsaramanidis S, Patsarikas I, Stylianidis G, Giannos G, Karanikas M, Kofina K, Markou M, Chrysos E, Lasithiotakis K. Prospective multicenter external validation of postoperative mortality prediction tools in patients undergoing emergency laparotomy. J Trauma Acute Care Surg 2023; 94:847-856. [PMID: 36726191 DOI: 10.1097/ta.0000000000003904] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 02/03/2023]
Abstract
BACKGROUND Accurate preoperative risk assessment in emergency laparotomy (EL) is valuable for informed decision making and rational use of resources. Available risk prediction tools have not been validated adequately across diverse health care settings. Herein, we report a comparative external validation of four widely cited prognostic models. METHODS A multicenter cohort was prospectively assembled from consecutive patients undergoing EL in 11 Greek hospitals from January 2020 to May 2021 using the National Emergency Laparotomy Audit (NELA) inclusion criteria. Thirty-day mortality risk predictions were calculated using the American College of Surgeons National Surgical Quality Improvement Program (ACS-NSQIP), NELA, Portsmouth Physiological and Operative Severity Score for the Enumeration of Mortality and Morbidity (P-POSSUM), and Predictive Optimal Trees in Emergency Surgery Risk tools. Surgeons' assessment of postoperative mortality using predefined cutoffs was recorded, and a surgeon-adjusted ACS-NSQIP prediction was calculated when the original model's prediction was relatively low. Predictive performances were compared using scaled Brier scores, discrimination and calibration measures and plots, and decision curve analysis. Heterogeneity across hospitals was assessed by random-effects meta-analysis. RESULTS A total of 631 patients were included, and 30-day mortality was 16.3%. The ACS-NSQIP and its surgeon-adjusted version had the highest scaled Brier scores. All models presented high discriminative ability, with concordance statistics ranging from 0.79 for P-POSSUM to 0.85 for NELA. However, except for the surgeon-adjusted ACS-NSQIP (Hosmer-Lemeshow test, p = 0.742), all models were poorly calibrated (p < 0.001). Decision curve analysis revealed superior clinical utility of the ACS-NSQIP. Following recalibrations, predictive accuracy improved for all models, but ACS-NSQIP retained the lead. 
Between-hospital heterogeneity was lowest for the ACS-NSQIP model and highest for P-POSSUM. CONCLUSION The ACS-NSQIP tool was most accurate for mortality predictions after EL in a broad external validation cohort, demonstrating utility for facilitating preoperative risk management in the Greek health care system. Subjective surgeon assessments of patient prognosis may optimize ACS-NSQIP predictions. LEVEL OF EVIDENCE Diagnostic Test/Criteria; Level II.
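The scaled Brier score mentioned in the abstract above can be illustrated with a short sketch. The abstract does not state the formula, so this assumes the commonly used definition, 1 - Brier / Brier_max, where Brier_max is the Brier score of a non-informative model that assigns the observed event rate to every patient. The function names are hypothetical and are not taken from the study.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def scaled_brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Assumed definition: 1 - Brier / Brier_max.

    A value of 1 is a perfect model; 0 is no better than predicting the
    observed event rate for everyone; negative values are worse than that.
    """
    event_rate = sum(y_true) / len(y_true)
    # Brier score of the non-informative model that always predicts event_rate.
    brier_max = event_rate * (1.0 - event_rate)
    if brier_max == 0.0:
        raise ValueError("undefined when all outcomes are identical")
    return 1.0 - brier_score(y_true, y_prob) / brier_max
```

Scaling by Brier_max makes scores comparable across cohorts with different event rates, which is presumably why a scaled version is reported when several tools are compared in one cohort.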
Affiliation(s)
- Stamatios Kokkinakis
- From the Department of General Surgery (S.K., K.P., G.-A.K., V.M., A.K., M.P., E.C., K.L.), University Hospital of Heraklion, University of Crete, School of Medicine; Laboratory of Biostatistics, University of Crete, School of Medicine (E.I.K.); Department of Surgical Oncology, University Hospital of Heraklion, University of Crete, School of Medicine (C.S.A., O.Z.), Heraklion; Department of Surgery, University General Hospital of Patras, School of Medicine (N.D., I.K., D.K.), University of Patras, Patras, Greece; Department of Surgery, General Hospital of Nicosia, School of Medicine (N.G., G.K., I.P., P.P., K.F.), University of Cyprus, Nicosia, Cyprus; First Department of Surgery (D.S., A.S.) and Second Propaedeutic Department of Surgery (I.M.P.), Laikon General Hospital, National and Kapodistrian University of Athens; Department of Surgery, University General Hospital Attikon, School of Medicine (K.N., M.P., N.V.M., I.M.), University of Athens, Athens; Department of Surgery (E.L., G.D.), General Hospital of Volos, Volos, Greece; Department of Surgery (D.P., V.N.), General Hospital of Trikala, Trikala; Department of Surgery (G.K.G., G.P.-G., K.T.), University Hospital of Ioannina, Ioannina, Greece; Department of Surgery, Ippokrateion General Hospital of Thessaloniki, School of Medicine (G.Z., S.T., I.P.), Aristotle University of Thessaloniki, Thessaloniki; Second Department of Surgery (G.S., G.G.), Evangelismos General Hospital, Athens; and Department of Surgery, University General Hospital of Alexandroupolis, School of Medicine (M.K., K.K., M.M.), University of Thrace, Alexandroupolis, Greece
|
11
|
Cai M, van Buuren S, Vink G. Graphical and numerical diagnostic tools to assess multiple imputation models by posterior predictive checking. Heliyon 2023; 9:e17077. [PMID: 37360073 PMCID: PMC10285146 DOI: 10.1016/j.heliyon.2023.e17077] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/24/2022] [Revised: 06/03/2023] [Accepted: 06/06/2023] [Indexed: 06/28/2023] Open
Abstract
Problem The congeniality of the imputation model is crucial for valid statistical inferences. Hence, it is important to develop methodologies for diagnosing imputation models. Aim We propose and evaluate a new diagnostic method based on posterior predictive checking to diagnose the congeniality of fully conditional imputation models. Our method applies to multiple imputation by chained equations, which is widely used in statistical software. Methods The proposed method compares the observed data with their replicates generated under the corresponding posterior predictive distributions to diagnose the performance of imputation models. The method applies to various imputation models, including parametric and semi-parametric approaches and continuous and discrete incomplete variables. We studied the validity of the method through simulation and application. Results The proposed diagnostic method based on posterior predictive checking demonstrates its validity in assessing the performance of imputation models. The method can diagnose the consistency of imputation models with the substantive model and can be applied to a broad range of research contexts. Conclusion The diagnostic method based on posterior predictive checking provides a valuable tool for researchers who use fully conditional specification to handle missing data. By assessing the performance of imputation models, our method can help researchers improve the accuracy and reliability of their analyses. Furthermore, our method applies to different imputation models. Hence, it is a versatile and valuable tool for researchers identifying plausible imputation models.
Affiliation(s)
- Mingyang Cai
- Corresponding author at: Sjoerd Groenman building, Padualaan 14, 3584 CH, Utrecht, the Netherlands.
|
12
|
Debray TPA, Collins GS, Riley RD, Snell KIE, Van Calster B, Reitsma JB, Moons KGM. Transparent reporting of multivariable prediction models developed or validated using clustered data (TRIPOD-Cluster): explanation and elaboration. BMJ 2023; 380:e071058. [PMID: 36750236 PMCID: PMC9903176 DOI: 10.1136/bmj-2022-071058] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Accepted: 10/07/2022] [Indexed: 02/09/2023]
Affiliation(s)
- Thomas P A Debray
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
- Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
| | - Gary S Collins
- Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Botnar Research Centre, University of Oxford, Oxford, UK
- National Institute for Health and Care Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford, UK
| | - Richard D Riley
- Centre for Prognosis Research, School of Medicine, Keele University, Keele, UK
| | - Kym I E Snell
- Centre for Prognosis Research, School of Medicine, Keele University, Keele, UK
| | - Ben Van Calster
- Department of Development and Regeneration, KU Leuven, Leuven, Belgium
- EPI-centre, KU Leuven, Leuven, Belgium
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, Netherlands
| | - Johannes B Reitsma
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
- Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
| | - Karel G M Moons
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
- Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
| |
|
13
|
Yu J, Liu X, Zhu Z, Yang Z, He J, Zhang L, Lu H. Prediction models for cardiovascular disease risk among people living with HIV: A systematic review and meta-analysis. Front Cardiovasc Med 2023; 10:1138234. [PMID: 37034346 PMCID: PMC10077152 DOI: 10.3389/fcvm.2023.1138234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2023] [Accepted: 03/08/2023] [Indexed: 04/11/2023] Open
Abstract
Background HIV continues to be a major global health issue. The relative risk of cardiovascular disease (CVD) among people living with HIV (PLWH) was 2.16 compared with people without HIV. The prediction of CVD is becoming an important issue in current HIV management. However, there is no consensus on optimal CVD risk models for PLWH. Therefore, we aimed to systematically summarize and compare prediction models for CVD risk among PLWH. Methods Longitudinal studies that developed or validated prediction models for CVD risk among PLWH were systematically searched. Five databases were searched up to January 2022. The quality of the included articles was evaluated by using the Prediction model Risk Of Bias ASsessment Tool (PROBAST). We applied meta-analysis to pool the logit-transformed C-statistics for discrimination performance. Results Thirteen articles describing 17 models were included. All the included studies had a high risk of bias. In the meta-analysis, the pooled estimated C-statistic was 0.76 (95% CI: 0.72-0.81, I² = 84.8%) for the Data collection on Adverse Effects of Anti-HIV Drugs Study risk equation (D:A:D) (2010), 0.75 (95% CI: 0.70-0.79, I² = 82.4%) for the D:A:D (2010) 10-year risk version, 0.77 (95% CI: 0.74-0.80, I² = 82.2%) for the full D:A:D (2016) model, 0.74 (95% CI: 0.68-0.79, I² = 86.2%) for the reduced D:A:D (2016) model, 0.71 (95% CI: 0.61-0.79, I² = 87.9%) for the Framingham Risk Score (FRS) for coronary heart disease (CHD) (1998), 0.74 (95% CI: 0.70-0.78, I² = 87.8%) for the FRS CVD model (2008), 0.72 (95% CI: 0.67-0.76, I² = 75.0%) for the pooled cohort equations (PCE) of the American College of Cardiology/American Heart Association, and 0.67 (95% CI: 0.56-0.77, I² = 51.3%) for the Systematic COronary Risk Evaluation (SCORE). In the subgroup analysis, the discrimination of PCE was significantly better in the group aged ≤40 years than in the group aged 40-45 years (P = 0.024) and the group aged ≥45 years (P = 0.010). 
No models were developed or validated in Sub-Saharan Africa and the Asia region. Conclusions The full D:A:D (2016) model performed the best in terms of discrimination, followed by the D:A:D (2010) and PCE. However, there were no significant differences between any of the model pairings. Specific CVD risk models for older PLWH and for PLWH in Sub-Saharan Africa and the Asia region should be established. Systematic Review Registration: PROSPERO CRD42022322024.
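Pooling C-statistics on the logit scale, as described in the abstract above, can be sketched as follows. The abstract does not name the estimator, so this sketch makes two assumptions that are not taken from the review: each study's standard error is recovered from its reported 95% CI on the logit scale, and between-study variance is estimated with the DerSimonian-Laird method. The function names and input layout are hypothetical.

```python
import math
from typing import List, Tuple


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def pool_c_statistics(
    studies: List[Tuple[float, float, float]]
) -> Tuple[float, float, float]:
    """Pool (c, ci_low, ci_high) triples; return (pooled_c, ci_low, ci_high).

    Assumption: SE on the logit scale = (logit(high) - logit(low)) / (2 * 1.96).
    """
    z = 1.96
    theta = [logit(c) for c, _, _ in studies]
    var = [((logit(hi) - logit(lo)) / (2.0 * z)) ** 2 for _, lo, hi in studies]

    # Fixed-effect weights and Cochran's Q, needed for the tau^2 estimate.
    w = [1.0 / v for v in var]
    sum_w = sum(w)
    theta_fixed = sum(wi * ti for wi, ti in zip(w, theta)) / sum_w
    q = sum(wi * (ti - theta_fixed) ** 2 for wi, ti in zip(w, theta))
    k = len(studies)
    c_term = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # DerSimonian-Laird between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c_term) if k > 1 and c_term > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in var]
    theta_re = sum(wi * ti for wi, ti in zip(w_re, theta)) / sum(w_re)
    se_re = math.sqrt(1.0 / sum(w_re))
    return (
        inv_logit(theta_re),
        inv_logit(theta_re - z * se_re),
        inv_logit(theta_re + z * se_re),
    )
```

Working on the logit scale keeps the pooled estimate and its interval inside (0, 1), which a direct average of C-statistics does not guarantee when values are close to 1.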
Affiliation(s)
- Junwen Yu
- School of Nursing, Fudan University, Shanghai, China
| | - Xiaoning Liu
- Department of Infectious Diseases, National Clinical Research Center for Infectious Diseases, Shenzhen Third People's Hospital, Guangdong, China
- National Heart & Lung Institute, Faculty of Medicine, Imperial College London, London, United Kingdom
| | - Zheng Zhu
- School of Nursing, Fudan University, Shanghai, China
- Fudan University Centre for Evidence-Based Nursing: A Joanna Briggs Institute Centre of Excellence, Shanghai, China
- NYU Rory Meyers College of Nursing, New York University, New York City, NY, United States
- Correspondence: Zheng Zhu; Hongzhou Lu
| | - Zhongfang Yang
- School of Nursing, Fudan University, Shanghai, China
- Fudan University Centre for Evidence-Based Nursing: A Joanna Briggs Institute Centre of Excellence, Shanghai, China
- Shanghai Institute of Infectious Disease and Biosecurity, Fudan University, Shanghai, China
| | - Jiamin He
- School of Nursing, Fudan University, Shanghai, China
| | - Lin Zhang
- Shanghai Public Health Clinical Center, Fudan University, Shanghai, China
| | - Hongzhou Lu
- Department of Infectious Diseases, National Clinical Research Center for Infectious Diseases, Shenzhen Third People's Hospital, Guangdong, China
- Correspondence: Zheng Zhu; Hongzhou Lu
| |
|
14
|
Charumporn T, Jarupanich N, Rinthapon C, Meetham K, Pattayakornkul N, Taerujjirakul T, Tanasombatkul K, Ditsatham C, Chongruksut W, Phanphaisarn A, Pongnikorn D, Phinyo P. External Validation of the Individualized Prediction of Breast Cancer Survival (IPBS) Model for Estimating Survival after Surgery for Patients with Breast Cancer in Northern Thailand. Cancers (Basel) 2022; 14:cancers14235726. [PMID: 36497208 PMCID: PMC9737252 DOI: 10.3390/cancers14235726] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2022] [Revised: 11/10/2022] [Accepted: 11/19/2022] [Indexed: 11/24/2022] Open
Abstract
The individualized prediction of breast cancer survival (IPBS) model was recently developed. Although the model showed acceptable performance during derivation, its external performance remained unknown. This study aimed to validate the IPBS model using the data of breast cancer patients in Northern Thailand. An external validation study was conducted based on female patients with breast cancer who underwent surgery at Maharaj Nakorn Chiang Mai hospital from 2005 to 2015. Data on IPBS predictors were collected. The endpoints were 5-year overall survival (OS) and disease-free survival (DFS). The model performance was evaluated in terms of discrimination and calibration. Missing data were handled with multiple imputation. Of all 3581 eligible patients, 1868 were included. The 5-year OS and DFS were 85.2% and 81.9%. The IPBS model showed acceptable discrimination: C-statistics 0.706 to 0.728 for OS and 0.675 to 0.689 for DFS at 5 years. However, the IPBS model slightly overestimated both OS and DFS. These overestimations were corrected after model recalibration. In this external validation study, the IPBS model exhibited good discriminative ability. Although it may slightly overestimate survival, recalibrating the model to the local context is a practical way to improve its calibration.
Affiliation(s)
- Thanapat Charumporn
- Department of Family Medicine, Faculty of Medicine, Chiang Mai University, Chiang Mai 50200, Thailand
| | - Nutcha Jarupanich
- Center for Clinical Epidemiology and Clinical Statistics, Faculty of Medicine, Chiang Mai University, Chiang Mai 50200, Thailand
| | - Chanawin Rinthapon
- Department of Family Medicine, Faculty of Medicine, Chiang Mai University, Chiang Mai 50200, Thailand
| | - Kantapit Meetham
- Department of Family Medicine, Faculty of Medicine, Chiang Mai University, Chiang Mai 50200, Thailand
| | - Napat Pattayakornkul
- Department of Family Medicine, Faculty of Medicine, Chiang Mai University, Chiang Mai 50200, Thailand
| | - Teerapant Taerujjirakul
- Department of Family Medicine, Faculty of Medicine, Chiang Mai University, Chiang Mai 50200, Thailand
| | - Krittai Tanasombatkul
- Department of Family Medicine, Faculty of Medicine, Chiang Mai University, Chiang Mai 50200, Thailand
- Center for Clinical Epidemiology and Clinical Statistics, Faculty of Medicine, Chiang Mai University, Chiang Mai 50200, Thailand
| | - Chagkrit Ditsatham
- Division of Head, Neck, and Breast Surgery, Department of Surgery, Clinical Surgical Research Center, Faculty of Medicine, Chiang Mai University, Chiang Mai 50200, Thailand
| | - Wilaiwan Chongruksut
- Center for Clinical Epidemiology and Clinical Statistics, Faculty of Medicine, Chiang Mai University, Chiang Mai 50200, Thailand
- Division of Research, Department of Surgery and Clinical Surgical Research Center, Faculty of Medicine, Chiang Mai University, Chiang Mai 50200, Thailand
| | - Areerak Phanphaisarn
- Department of Orthopaedics, Faculty of Medicine, Chiang Mai University, Chiang Mai 50200, Thailand
| | - Donsuk Pongnikorn
- Vejjarak Lampang Hospital, Department of Medical Services, Ministry of Public Health, Lampang 52130, Thailand
| | - Phichayut Phinyo
- Department of Family Medicine, Faculty of Medicine, Chiang Mai University, Chiang Mai 50200, Thailand
- Center for Clinical Epidemiology and Clinical Statistics, Faculty of Medicine, Chiang Mai University, Chiang Mai 50200, Thailand
- Musculoskeletal Science and Translational Research Cluster, Chiang Mai University, Chiang Mai 50200, Thailand
| |
|
15
|
Neumair M, Kattan MW, Freedland SJ, Haese A, Guerrios-Rivera L, De Hoedt AM, Liss MA, Leach RJ, Boorjian SA, Cooperberg MR, Poyet C, Saba K, Herkommer K, Meissner VH, Vickers AJ, Ankerst DP. Accommodating heterogeneous missing data patterns for prostate cancer risk prediction. BMC Med Res Methodol 2022; 22:200. [PMID: 35864460 PMCID: PMC9306143 DOI: 10.1186/s12874-022-01674-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/03/2021] [Accepted: 07/04/2022] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND We compared six commonly used logistic regression methods for accommodating missing risk factor data from multiple heterogeneous cohorts, in which some cohorts do not collect some risk factors at all, and developed an online risk prediction tool that accommodates missing risk factors from the end-user. METHODS Ten North American and European cohorts from the Prostate Biopsy Collaborative Group (PBCG) were used for fitting a risk prediction tool for clinically significant prostate cancer, defined as Gleason grade group ≥ 2 on standard TRUS prostate biopsy. One large European PBCG cohort was withheld for external validation, where calibration-in-the-large (CIL), calibration curves, and area-underneath-the-receiver-operating characteristic curve (AUC) were evaluated. Ten-fold leave-one-cohort-internal validation further validated the optimal missing data approach. RESULTS Among 12,703 biopsies from 10 training cohorts, 3,597 (28%) had clinically significant prostate cancer, compared to 1,757 of 5,540 (32%) in the external validation cohort. In external validation, the available cases method that pooled individual patient data containing all risk factors input by an end-user had best CIL, under-predicting risks as percentages by 2.9% on average, and obtained an AUC of 75.7%. Imputation had the worst CIL (-13.3%). The available cases method was further validated as optimal in internal cross-validation and thus used for development of an online risk tool. For end-users of the risk tool, two risk factors were mandatory: serum prostate-specific antigen (PSA) and age, and ten were optional: digital rectal exam, prostate volume, prior negative biopsy, 5-alpha-reductase-inhibitor use, prior PSA screen, African ancestry, Hispanic ethnicity, first-degree prostate-, breast-, and second-degree prostate-cancer family history. 
CONCLUSION Developers of clinical risk prediction tools should optimize use of available data and sources even in the presence of high amounts of missing data and offer options for users with missing risk factors.
Affiliation(s)
- Matthias Neumair
- Department of Life Sciences, Technical University of Munich, Freising, Germany
| | - Michael W. Kattan
- Department of Quantitative Health Sciences, Cleveland Clinic Foundation, Cleveland, OH, USA
| | - Stephen J. Freedland
- Section of Urology, Durham Veterans Administration Health Care System, Durham, NC, USA; Department of Surgery, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Alexander Haese
- Martini-Clinic Prostate Cancer Center, University Clinic Eppendorf, Hamburg, Germany
| | - Lourdes Guerrios-Rivera
- Department of Surgery, Urology Section, Veterans Affairs Caribbean Healthcare System, San Juan, Puerto Rico
| | - Amanda M. De Hoedt
- Section of Urology, Durham Veterans Administration Health Care System, Durham, NC, USA
| | - Michael A. Liss
- Department of Urology, University of Texas Health at San Antonio, San Antonio, TX, USA
| | - Robin J. Leach
- Department of Cell Systems and Anatomy, University of Texas Health at San Antonio, San Antonio, TX, USA
| | - Stephen A. Boorjian
- Department of Urology, Mayo Clinic, Rochester, MN, USA
| | - Matthew R. Cooperberg
- Departments of Urology and Epidemiology & Biostatistics, University of California San Francisco, San Francisco, CA, USA
| | - Cedric Poyet
- Department of Urology, University Hospital of Zurich, University of Zurich, Zurich, Switzerland
| | - Karim Saba
- Department of Urology, University Hospital of Zurich, University of Zurich, Zurich, Switzerland; Urology Centre, Hirslanden Klinik Aarau, Aarau, Switzerland
| | - Kathleen Herkommer
- Department of Urology, University Hospital, Technical University of Munich, Munich, Germany
| | - Valentin H. Meissner
- Department of Urology, University Hospital, Technical University of Munich, Munich, Germany
| | - Andrew J. Vickers
- Department of Epidemiology & Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Donna P. Ankerst
- Department of Life Sciences, Technical University of Munich, Freising, Germany; Department of Mathematics, Technical University of Munich, Boltzmannstrasse 3, Garching, Germany
| |
|
16
|
Funada S, Luo Y, Yoshioka T, Setoh K, Tabara Y, Negoro H, Yoshimura K, Matsuda F, Efthimiou O, Ogawa O, Furukawa TA, Kobayashi T, Akamatsu S. Development and validation of prediction model for incident overactive bladder: The Nagahama study. Int J Urol 2022; 29:748-756. [PMID: 35393696 PMCID: PMC9546153 DOI: 10.1111/iju.14887] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2022] [Accepted: 03/21/2022] [Indexed: 11/10/2022]
Abstract
Objectives We aimed to develop models to predict new‐onset overactive bladder in 5 years using a large prospective cohort of the general population. Methods This is a secondary analysis of a longitudinal cohort study in Japan. The baseline characteristics were measured between 2008 and 2010, with follow‐ups every 5 years. We included subjects without overactive bladder at baseline and with follow‐up data 5 years later. Overactive bladder was assessed using the overactive bladder symptom score. Baseline characteristics (demographics, health behaviors, comorbidities, and overactive bladder symptom scores) and blood test data were included as predictors. We developed two competing prediction models for each sex based on logistic regression with penalized likelihood (LASSO). We chose the best model separately for men and women after evaluating models' performance in terms of discrimination and calibration using an internal validation via 200 bootstrap resamples and a temporal validation. Results We analyzed 7218 participants (male: 2238, female: 4980). The median age was 60 and 55 years, and the number of new‐onset overactive bladder was 223 (10.0%) and 288 (5.8%) per 5 years in males and females, respectively. The in‐sample estimates for C‐statistic, calibration intercept, and slope for the best performing models were 0.77 (95% confidence interval 0.74–0.80), 0.28 and 1.15 for males, and 0.77 (95% confidence interval 0.74–0.80), 0.20 and 1.08 for females. Internal and temporal validation gave broadly similar estimates of performance, indicating low optimism. Conclusion We developed risk prediction models for new‐onset overactive bladder among men and women with good predictive ability.
Affiliation(s)
- Satoshi Funada
- Department of Urology, Kyoto University Graduate School of Medicine Faculty of Medicine, Kyoto, Japan; Department of Health Promotion and Human Behavior, Kyoto University School of Public Health, Kyoto, Japan
| | - Yan Luo
- Department of Health Promotion and Human Behavior, Kyoto University School of Public Health, Kyoto, Japan
| | - Takashi Yoshioka
- Center for Innovative Research for Communities and Clinical Excellence (CiRC2LE), Fukushima Medical University, Fukushima City, Fukushima, Japan
| | - Kazuya Setoh
- Center for Genomic Medicine, Kyoto University Graduate School of Medicine Faculty of Medicine, Kyoto, Japan; Graduate School of Public Health, Shizuoka Graduate University of Public Health, Shizuoka, Japan
| | - Yasuharu Tabara
- Center for Genomic Medicine, Kyoto University Graduate School of Medicine Faculty of Medicine, Kyoto, Japan; Graduate School of Public Health, Shizuoka Graduate University of Public Health, Shizuoka, Japan
| | | | - Koji Yoshimura
- Department of Urology, Shizuoka General Hospital, Shizuoka, Japan
| | - Fumihiko Matsuda
- Center for Genomic Medicine, Kyoto University Graduate School of Medicine Faculty of Medicine, Kyoto, Japan
| | - Orestis Efthimiou
- Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland; Department of Psychiatry, University of Oxford, Oxford, UK
| | - Osamu Ogawa
- Department of Urology, Kyoto University Graduate School of Medicine Faculty of Medicine, Kyoto, Japan
| | - Toshi A Furukawa
- Department of Health Promotion and Human Behavior, Kyoto University School of Public Health, Kyoto, Japan
| | - Takashi Kobayashi
- Department of Urology, Kyoto University Graduate School of Medicine Faculty of Medicine, Kyoto, Japan
| | - Shusuke Akamatsu
- Department of Urology, Kyoto University Graduate School of Medicine Faculty of Medicine, Kyoto, Japan
| |
|
17
|
Validation and recalibration of OxMIV in predicting violent behaviour in patients with schizophrenia spectrum disorders. Sci Rep 2022; 12:461. [PMID: 35013451 PMCID: PMC8748785 DOI: 10.1038/s41598-021-04266-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/02/2021] [Accepted: 12/16/2021] [Indexed: 12/23/2022] Open
Abstract
Oxford Mental Illness and Violence (OxMIV) addresses the need in mental health services for a scalable, transparent and valid tool to predict violent behaviour in patients with severe mental illness. However, external validations are lacking. Therefore, we have used a Dutch sample of general psychiatric patients with schizophrenia spectrum disorders (N = 637) to evaluate the performance of OxMIV in predicting interpersonal violence over 3 years. The predictors and outcome were measured with standardized instruments and multiple sources of information. Patients were mostly male (n = 493, 77%) and, on average, 27 (SD = 7) years old. The outcome rate was 9% (n = 59). Discrimination, as measured by the area under the curve, was moderate at 0.67 (95% confidence interval 0.61–0.73). Calibration-in-the-large was adequate, with a ratio between predicted and observed events of 1.2 and a Brier score of 0.09. At the individual level, risks were systematically underestimated in the original model, which was remedied by recalibrating the intercept and slope of the model. Probability scores generated by the recalibrated model can be used as an adjunct to clinical decision-making in Dutch mental health services.
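Recalibrating the intercept and slope, as mentioned in the abstract above, is commonly done by refitting a two-parameter logistic model on the original model's linear predictor. The sketch below assumes that approach; the abstract does not describe the study's actual procedure, and the function names are hypothetical. Outcomes are normally coded 0/1, and the fit uses plain Newton-Raphson so that no external library is needed.

```python
import math
from typing import List, Sequence, Tuple


def _sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def fit_recalibration(y: Sequence[float], p_orig: Sequence[float],
                      n_iter: int = 50, tol: float = 1e-10) -> Tuple[float, float]:
    """Fit logit(P(y = 1)) = a + b * logit(p_orig) by Newton-Raphson.

    Returns (a, b); a = 0 and b = 1 mean the original risks need no change.
    """
    lp = [math.log(p / (1.0 - p)) for p in p_orig]
    a, b = 0.0, 1.0  # start from the unmodified original model
    for _ in range(n_iter):
        mu = [_sigmoid(a + b * x) for x in lp]
        # Gradient of the log-likelihood with respect to (a, b).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, lp))
        # Information matrix (negative Hessian), symmetric 2x2.
        wts = [mi * (1.0 - mi) for mi in mu]
        h00 = sum(wts)
        h01 = sum(wi * xi for wi, xi in zip(wts, lp))
        h11 = sum(wi * xi * xi for wi, xi in zip(wts, lp))
        det = h00 * h11 - h01 * h01
        if det <= 0.0:
            raise ValueError("singular information matrix; predictions lack spread")
        # Newton step: solve the 2x2 system with the explicit inverse.
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) < tol and abs(db) < tol:
            break
    return a, b


def recalibrate(p_orig: Sequence[float], a: float, b: float) -> List[float]:
    """Apply a fitted intercept and slope to the original predicted risks."""
    return [_sigmoid(a + b * math.log(p / (1.0 - p))) for p in p_orig]
```

Under this formulation a fitted slope below 1 suggests the original predictions were too extreme, and a non-zero intercept suggests risks were systematically too high or too low in the new population.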
|
18
|
Berkelmans G, Read S, Gudbjörnsdottir S, Wild S, Franzen S, van der Graaf Y, Eliasson B, Visseren F, Paynter N, Dorresteijn J. Population median imputation was noninferior to complex approaches for imputing missing values in cardiovascular prediction models in clinical practice. J Clin Epidemiol 2022; 145:70-80. [DOI: 10.1016/j.jclinepi.2022.01.011] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/10/2021] [Revised: 12/05/2021] [Accepted: 01/17/2022] [Indexed: 02/06/2023]
|
19
|
Amaador K, Vos JMI, Pals ST, Kraan W, Dobber JA, Minnema MC, Koene HR, de Bruin PC, Zwinderman AH, Kersten MJ. Discriminating between Waldenström macroglobulinemia and marginal zone lymphoma using logistic LASSO regression. Leuk Lymphoma 2021; 63:1070-1079. [DOI: 10.1080/10428194.2021.2018584] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Affiliation(s)
- Karima Amaador
  - Department of Hematology, Amsterdam UMC, University of Amsterdam, Cancer Center Amsterdam and LYMMCARE (Lymphoma and Myeloma Center Amsterdam), Amsterdam, The Netherlands
- Josephine M. I. Vos
  - Department of Hematology, Amsterdam UMC, University of Amsterdam, Cancer Center Amsterdam and LYMMCARE (Lymphoma and Myeloma Center Amsterdam), Amsterdam, The Netherlands
- Steven T. Pals
  - Department of Pathology, Amsterdam UMC, University of Amsterdam, Cancer Center Amsterdam and LYMMCARE (Lymphoma and Myeloma Center Amsterdam), Amsterdam, The Netherlands
- Willem Kraan
  - Department of Pathology, Amsterdam UMC, University of Amsterdam, Cancer Center Amsterdam and LYMMCARE (Lymphoma and Myeloma Center Amsterdam), Amsterdam, The Netherlands
- Johan A. Dobber
  - Laboratory of Hematology, Amsterdam UMC, Amsterdam, The Netherlands
- Monique C. Minnema
  - Department of Hematology, University Medical Center, University Utrecht, Utrecht, The Netherlands
- Harry R. Koene
  - Department of Internal Medicine, St. Antonius Hospital, Nieuwegein, The Netherlands
- Peter C. de Bruin
  - Department of Pathology, St. Antonius Hospital, Nieuwegein, The Netherlands
- Aiko H. Zwinderman
  - Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Amsterdam UMC, University of Amsterdam, The Netherlands
- Marie José Kersten
  - Department of Hematology, Amsterdam UMC, University of Amsterdam, Cancer Center Amsterdam and LYMMCARE (Lymphoma and Myeloma Center Amsterdam), Amsterdam, The Netherlands
|
20
|
Nijman S, Leeuwenberg AM, Beekers I, Verkouter I, Jacobs J, Bots ML, Asselbergs FW, Moons K, Debray T. Missing data is poorly handled and reported in prediction model studies using machine learning: a literature review. J Clin Epidemiol 2021; 142:218-229. [PMID: 34798287 DOI: 10.1016/j.jclinepi.2021.11.023] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 10/01/2021] [Revised: 11/01/2021] [Accepted: 11/10/2021] [Indexed: 12/23/2022]
Abstract
OBJECTIVES Missing data is a common problem during the development, evaluation, and implementation of prediction models. Although machine learning (ML) methods are often said to be capable of circumventing missing data, it is unclear how these methods are used in medical research. We aim to find out if and how well prediction model studies using machine learning report on their handling of missing data. STUDY DESIGN AND SETTING We systematically searched the literature for papers published between 2018 and 2019 reporting primary studies that developed and/or validated clinical prediction models using any supervised ML methodology across medical fields. From the retrieved studies, information about the amount and nature (e.g., missing completely at random, potential reasons for missingness) of missing data and the way it was handled was extracted. RESULTS We identified 152 machine learning-based clinical prediction model studies. A substantial number of these 152 papers did not report anything on missing data (n = 56/152). A majority (n = 96/152) reported details on the handling of missing data (e.g., methods used), though many of these (n = 46/96) did not report the amount of missingness in the data. In these 96 papers the authors only sometimes reported possible reasons for missingness (n = 7/96) and information about missing data mechanisms (n = 8/96). The most common approach for handling missing data was deletion (n = 65/96), mostly via complete-case analysis (CCA) (n = 43/96). Very few studies used multiple imputation (n = 8/96) or built-in mechanisms such as surrogate splits (n = 7/96) that directly address missing data during the development, validation, or implementation of the prediction model.
CONCLUSION Though missing values are highly common in any type of medical research, and certainly in research based on routine healthcare data, a majority of the prediction model studies using machine learning do not report sufficient information on the presence and handling of missing data. Strategies in which patient data are simply omitted are unfortunately the most often used, even though omission is generally advised against and is well known to be likely to cause bias and loss of analytical power in prediction model development and in the predictive accuracy estimates. Prediction model researchers should be much more aware of alternative methodologies to address missing data.
Affiliation(s)
- S W J Nijman
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Heidelberglaan 100, Utrecht, 3584 CX, The Netherlands
- A M Leeuwenberg
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Heidelberglaan 100, Utrecht, 3584 CX, The Netherlands
- I Beekers
  - Department of Health, Ortec B.V., Zoetermeer, The Netherlands
- I Verkouter
  - Department of Health, Ortec B.V., Zoetermeer, The Netherlands
- J J L Jacobs
  - Department of Health, Ortec B.V., Zoetermeer, The Netherlands
- M L Bots
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Heidelberglaan 100, Utrecht, 3584 CX, The Netherlands
- F W Asselbergs
  - Department of Cardiology, University Medical Center Utrecht, Utrecht University, The Netherlands; Institute of Cardiovascular Science, Population Health Sciences, University College London, London, UK; Health Data Research UK, Institute of Health Informatics, University College London, London, UK
- K G M Moons
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Heidelberglaan 100, Utrecht, 3584 CX, The Netherlands
- T P A Debray
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Heidelberglaan 100, Utrecht, 3584 CX, The Netherlands; Health Data Research UK, Institute of Health Informatics, University College London, London, UK
|
21
|
Tsvetanova A, Sperrin M, Peek N, Buchan I, Hyland S, Martin GP. Missing data was handled inconsistently in UK prediction models: a review of method used. J Clin Epidemiol 2021; 140:149-158. [PMID: 34520847 DOI: 10.1016/j.jclinepi.2021.09.008] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/15/2021] [Revised: 08/17/2021] [Accepted: 09/07/2021] [Indexed: 10/20/2022]
Abstract
OBJECTIVES No clear guidance exists on handling missing data at each stage of developing, validating and implementing a clinical prediction model (CPM). We aimed to review the approaches to handling missing data that underlie the CPMs currently recommended for use in UK healthcare. STUDY DESIGN AND SETTING A descriptive cross-sectional meta-epidemiological study aiming to identify CPMs recommended by the National Institute for Health and Care Excellence (NICE), which summarized how missing data is handled across their pipelines. RESULTS A total of 23 CPMs were included through "sampling strategy." Six missing data strategies were identified: complete case analysis (CCA), multiple imputation, imputation of mean values, k-nearest neighbours imputation, using an additional category for missingness, and considering missing values as risk-factor-absent. 52% of the development articles and 48% of the validation articles did not report how missing data were handled. CCA was the most common approach used for development (40%) and validation (44%). At implementation, 57% of the CPMs required complete data entry, whilst 43% allowed missing values. Three CPMs had consistent paths in their pipelines. CONCLUSION A broad variety of methods for handling missing data underlie the CPMs currently recommended for use in UK healthcare. Missing data handling strategies were generally inconsistent. Better quality assurance of CPMs needs greater clarity and consistency in handling of missing data.
Affiliation(s)
- Antonia Tsvetanova
  - Centre for Health Informatics, Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- Matthew Sperrin
  - Centre for Health Informatics, Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- Niels Peek
  - Centre for Health Informatics, Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK; NIHR Manchester Biomedical Research Centre, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- Iain Buchan
  - Centre for Health Informatics, Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK; Institute of Population Health, The University of Liverpool, Liverpool, UK
- Glen P Martin
  - Centre for Health Informatics, Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
|
22
|
Li P, Taylor JMG, Spratt DE, Karnes RJ, Schipper MJ. Evaluation of predictive model performance of an existing model in the presence of missing data. Stat Med 2021; 40:3477-3498. [PMID: 33843085 DOI: 10.1002/sim.8978] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2020] [Revised: 02/13/2021] [Accepted: 03/24/2021] [Indexed: 11/11/2022]
Abstract
In medical research, the Brier score (BS) and the area under the receiver operating characteristic (ROC) curve (AUC) are two common metrics used to evaluate prediction models of a binary outcome, such as using biomarkers to predict the risk of developing a disease in the future. The assessment of an existing prediction model using data with missing covariate values is challenging. In this article, we propose inverse probability weighted (IPW) and augmented inverse probability weighted (AIPW) estimates of AUC and BS to handle the missing data. An alternative approach uses multiple imputation (MI), which requires a model for the distribution of the missing variable. We evaluated the performance of IPW and AIPW in comparison with MI in simulation studies under missing completely at random, missing at random, and missing not at random scenarios. When there are missing observations in the data, MI and IPW can be used to obtain unbiased estimates of BS and AUC if the imputation model for the missing variable or the model for the missingness is correctly specified. MI is more efficient than IPW. Our simulation results suggest that AIPW can be more efficient than IPW, and also achieves double robustness against misspecification of either the missingness model or the imputation model. The outcome variable should be included in the model for the missing variable under all scenarios, while it only needs to be included in the missingness model if the missingness depends on the outcome. We illustrate these methods using an example from prostate cancer.
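The idea of an inverse probability weighted Brier score can be sketched as follows. This is one common normalized form of an IPW estimator, written under the assumption that each record's probability of being a complete case has already been estimated; the exact estimator proposed in the article may differ, and the function name and toy data are hypothetical.

```python
def ipw_brier_score(preds, outcomes, observed, obs_prob):
    """Normalized inverse-probability-weighted Brier score over complete cases.

    preds    : predicted risks (None where covariates are missing)
    outcomes : 0/1 outcomes
    observed : True if every covariate needed for the prediction was observed
    obs_prob : estimated probability that each record is a complete case
    """
    weighted_error = 0.0
    weight_total = 0.0
    for p, y, r, pi in zip(preds, outcomes, observed, obs_prob):
        if not r:
            continue  # incomplete records are represented by up-weighted complete ones
        w = 1.0 / pi
        weighted_error += w * (p - y) ** 2
        weight_total += w
    return weighted_error / weight_total

# Hypothetical toy data, not taken from the article.
preds = [0.2, None, 0.7, 0.1]
outcomes = [0, 1, 1, 0]
observed = [True, False, True, True]
obs_prob = [0.5, 0.5, 1.0, 1.0]

ipw_bs = ipw_brier_score(preds, outcomes, observed, obs_prob)
```

When every record has the same probability of being complete, the weights cancel and the estimate reduces to the ordinary complete-case Brier score.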
Affiliation(s)
- Pin Li
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
- Jeremy M G Taylor
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA; Department of Radiation Oncology, University of Michigan, Ann Arbor, Michigan, USA
- Daniel E Spratt
  - Department of Radiation Oncology, University of Michigan, Ann Arbor, Michigan, USA
- Matthew J Schipper
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA; Department of Radiation Oncology, University of Michigan, Ann Arbor, Michigan, USA
|
23
|
Nijman SWJ, Groenhof TKJ, Hoogland J, Bots ML, Brandjes M, Jacobs JJL, Asselbergs FW, Moons KGM, Debray TPA. Real-time imputation of missing predictor values improved the application of prediction models in daily practice. J Clin Epidemiol 2021; 134:22-34. [PMID: 33482294 DOI: 10.1016/j.jclinepi.2021.01.003] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 04/29/2020] [Revised: 12/24/2020] [Accepted: 01/12/2021] [Indexed: 12/20/2022]
Abstract
OBJECTIVES In clinical practice, many prediction models cannot be used when predictor values are missing. We, therefore, propose and evaluate methods for real-time imputation. STUDY DESIGN AND SETTING We describe (i) mean imputation (where missing values are replaced by the sample mean), (ii) joint modeling imputation (JMI, where we use a multivariate normal approximation to generate patient-specific imputations), and (iii) conditional modeling imputation (CMI, where a multivariable imputation model is derived for each predictor from a population). We compared these methods in a case study evaluating the root mean squared error (RMSE) and coverage of the 95% confidence intervals (i.e., the proportion of confidence intervals that contain the true predictor value) of imputed predictor values. RESULTS RMSE was lowest when adopting JMI or CMI, although imputation of individual predictors did not always lead to substantial improvements as compared to mean imputation. JMI and CMI appeared particularly useful when the values of multiple predictors of the model were missing. Coverage reached the nominal level (i.e., 95%) for both CMI and JMI. CONCLUSION Multiple imputation using either CMI or JMI is recommended when dealing with missing predictor values in real-time settings.
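The contrast between mean imputation and a patient-specific imputation from a multivariate normal approximation can be illustrated with the two-variable special case. The sketch below is not the authors' implementation: the function names, the choice of age and systolic blood pressure, and all population values are hypothetical assumptions.

```python
import random

def conditional_normal(mu_missing, mu_obs, var_missing, var_obs, cov, x_obs):
    """Mean and variance of the missing variable given the observed one,
    under a bivariate normal approximation of the population."""
    slope = cov / var_obs
    mean = mu_missing + slope * (x_obs - mu_obs)
    variance = var_missing - slope * cov
    return mean, variance

def draw_imputations(mean, variance, n_draws, seed=0):
    """Draw several imputations so that imputation uncertainty is carried forward."""
    rng = random.Random(seed)
    sd = variance ** 0.5
    return [rng.gauss(mean, sd) for _ in range(n_draws)]

# Hypothetical population summaries: systolic blood pressure (missing) and age (observed).
mu_sbp, mu_age = 130.0, 60.0
var_sbp, var_age, cov_sbp_age = 400.0, 100.0, 80.0

mean_imputation = mu_sbp  # mean imputation ignores what is known about the patient
cond_mean, cond_var = conditional_normal(
    mu_sbp, mu_age, var_sbp, var_age, cov_sbp_age, x_obs=70.0
)
draws = draw_imputations(cond_mean, cond_var, n_draws=5)
```

For this hypothetical 70-year-old, the conditional mean shifts away from the population mean and the conditional variance is smaller than the marginal variance, which is the sense in which the imputation is patient-specific.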
Affiliation(s)
- Steven Willem Joost Nijman
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
- T Katrien J Groenhof
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
- Jeroen Hoogland
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
- Michiel L Bots
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
- Folkert W Asselbergs
  - Division Heart & Lungs, Department of Cardiology, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands; Institute of Cardiovascular Science, Faculty of Population Health Sciences, University College London, London, UK; Health Data Research UK, Institute of Health Informatics, University College London, London, UK
- Karel G M Moons
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
- Thomas P A Debray
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands; Institute of Cardiovascular Science, Faculty of Population Health Sciences, University College London, London, UK
|
24
|
Nijman SWJ, Hoogland J, Groenhof TKJ, Brandjes M, Jacobs JJL, Bots ML, Asselbergs FW, Moons KGM, Debray TPA. Real-time imputation of missing predictor values in clinical practice. Eur Heart J Digit Health 2020; 2:154-164. [PMID: 36711167 PMCID: PMC9707891 DOI: 10.1093/ehjdh/ztaa016] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/16/2020] [Revised: 11/02/2020] [Accepted: 11/30/2020] [Indexed: 02/01/2023]
Abstract
Aims Use of prediction models is widely recommended by clinical guidelines, but usually requires complete information on all predictors, which is not always available in daily practice. We aim to describe two methods for real-time handling of missing predictor values when using prediction models in practice. Methods and results We compare the widely used method of mean imputation (M-imp) to a method that personalizes the imputations by taking advantage of the observed patient characteristics. These characteristics may include both prediction model variables and other characteristics (auxiliary variables). The method was implemented using imputation from a joint multivariate normal model of the patient characteristics (joint modelling imputation; JMI). Data from two different cardiovascular cohorts with cardiovascular predictors and outcome were used to evaluate the real-time imputation methods. We quantified the prediction model's overall performance [mean squared error (MSE) of linear predictor], discrimination (c-index), calibration (intercept and slope), and net benefit (decision curve analysis). When compared with mean imputation, JMI substantially improved the MSE (0.10 vs. 0.13), c-index (0.70 vs. 0.68), and calibration (calibration-in-the-large: 0.04 vs. 0.06; calibration slope: 1.01 vs. 0.92), especially when incorporating auxiliary variables. When the imputation method was based on an external cohort, calibration deteriorated, but discrimination remained similar. Conclusions We recommend JMI with auxiliary variables for real-time imputation of missing values, and we recommend updating imputation models when implementing them in new settings or (sub)populations.
Affiliation(s)
- Steven W J Nijman
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Heidelberglaan 100, 3584 CX Utrecht, The Netherlands. Corresponding author. Tel: +31 88 75 680 12
- Jeroen Hoogland
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Heidelberglaan 100, 3584 CX Utrecht, The Netherlands
- T Katrien J Groenhof
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Heidelberglaan 100, 3584 CX Utrecht, The Netherlands
- Menno Brandjes
  - Department of Health, Ortec B.V., Zoetermeer, Houtsingel 5, 2719 EA Zoetermeer, The Netherlands
- John J L Jacobs
  - Department of Health, Ortec B.V., Zoetermeer, Houtsingel 5, 2719 EA Zoetermeer, The Netherlands
- Michiel L Bots
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Heidelberglaan 100, 3584 CX Utrecht, The Netherlands
- Folkert W Asselbergs
  - Department of Cardiology, University Medical Center Utrecht, Utrecht University, Heidelberglaan 100, 3584 CX Utrecht, The Netherlands; Institute of Cardiovascular Science, Faculty of Population Health Sciences, University College London, 62 Huntley St, Fitzrovia, London WC1E 6DD, UK; Health Data Research UK, Institute of Health Informatics, University College London, Gibbs Building, 215 Euston Rd, London NW1 2BE, UK
- Karel G M Moons
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Heidelberglaan 100, 3584 CX Utrecht, The Netherlands
- Thomas P A Debray
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Heidelberglaan 100, 3584 CX Utrecht, The Netherlands; Health Data Research UK, Institute of Health Informatics, University College London, Gibbs Building, 215 Euston Rd, London NW1 2BE, UK
|
25
|
Adibi A, Sadatsafavi M, Ioannidis JPA. Testing Clinical Prediction Models-Reply. JAMA 2020; 324:2000. [PMID: 33201201 DOI: 10.1001/jama.2020.19413] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Affiliation(s)
- Amin Adibi
  - Faculty of Pharmaceutical Sciences, University of British Columbia, Vancouver, British Columbia, Canada
- Mohsen Sadatsafavi
  - Faculty of Pharmaceutical Sciences, University of British Columbia, Vancouver, British Columbia, Canada
- John P A Ioannidis
  - Meta-Research Innovation Center at Stanford (METRICS), Stanford University, Stanford, California
|
26
|
Hoogland J, van Barreveld M, Debray TPA, Reitsma JB, Verstraelen TE, Dijkgraaf MGW, Zwinderman AH. Handling missing predictor values when validating and applying a prediction model to new patients. Stat Med 2020; 39:3591-3607. [PMID: 32687233 PMCID: PMC7586995 DOI: 10.1002/sim.8682] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 02/01/2019] [Revised: 05/10/2020] [Accepted: 06/10/2020] [Indexed: 12/23/2022]
Abstract
Missing data present challenges for development and real‐world application of clinical prediction models. While these challenges have received considerable attention in the development setting, there is only sparse research on the handling of missing data in applied settings. The main unique feature of handling missing data in these settings is that missing data methods have to be performed for a single new individual, precluding direct application of mainstay methods used during model development. Correspondingly, we propose that it is desirable to perform model validation using missing data methods that transfer to practice in single new patients. This article compares existing and new methods to account for missing data for a new individual in the context of prediction. These methods are based on (i) submodels based on observed data only, (ii) marginalization over the missing variables, or (iii) imputation based on fully conditional specification (also known as chained equations). They were compared in an internal validation setting to highlight the use of missing data methods that transfer to practice while validating a model. As a reference, they were compared to the use of multiple imputation by chained equations in a set of test patients, because this has been used in validation studies in the past. The methods were evaluated in a simulation study where performance was measured by means of optimism corrected C‐statistic and mean squared prediction error. Furthermore, they were applied in data from a large Dutch cohort of prophylactic implantable cardioverter defibrillator patients.
Affiliation(s)
- Jeroen Hoogland
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Marit van Barreveld
  - Department of Clinical Epidemiology, Biostatistics, & Bioinformatics, Academic Medical Center, Amsterdam University Medical Centers, Amsterdam, The Netherlands; Heart Center, Department of Cardiology, Amsterdam University Medical Centers, University of Amsterdam, Amsterdam, The Netherlands
- Thomas P A Debray
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands; Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Johannes B Reitsma
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands; Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Tom E Verstraelen
  - Heart Center, Department of Cardiology, Amsterdam University Medical Centers, University of Amsterdam, Amsterdam, The Netherlands
- Marcel G W Dijkgraaf
  - Department of Clinical Epidemiology, Biostatistics, & Bioinformatics, Academic Medical Center, Amsterdam University Medical Centers, Amsterdam, The Netherlands
- Aeilko H Zwinderman
  - Department of Clinical Epidemiology, Biostatistics, & Bioinformatics, Academic Medical Center, Amsterdam University Medical Centers, Amsterdam, The Netherlands
|