1
Rockenschaub P, Hilbert A, Kossen T, Elbers P, von Dincklage F, Madai VI, Frey D. The Impact of Multi-Institution Datasets on the Generalizability of Machine Learning Prediction Models in the ICU. Crit Care Med 2024. PMID: 38958568. DOI: 10.1097/ccm.0000000000006359.
Abstract
OBJECTIVES: To evaluate the transferability of deep learning (DL) models for the early detection of adverse events to previously unseen hospitals.
DESIGN: Retrospective observational cohort study utilizing harmonized intensive care data from four public datasets.
SETTING: ICUs across Europe and the United States.
PATIENTS: Adult patients admitted to the ICU for at least 6 hours who had good data quality.
INTERVENTIONS: None.
MEASUREMENTS AND MAIN RESULTS: Using carefully harmonized data from a total of 334,812 ICU stays, we systematically assessed the transferability of DL models for three common adverse events: death, acute kidney injury (AKI), and sepsis. We tested whether using more than one data source and/or algorithmically optimizing for generalizability during training improves model performance at new hospitals. We found that models achieved a high area under the receiver operating characteristic curve (AUROC) for mortality (0.838-0.869), AKI (0.823-0.866), and sepsis (0.749-0.824) at the training hospital. As expected, AUROC dropped when models were applied at other hospitals, sometimes by as much as 0.200. Using more than one dataset for training mitigated the performance drop, with multicenter models performing roughly on par with the best single-center model. Dedicated methods promoting generalizability did not noticeably improve performance in our experiments.
CONCLUSIONS: Our results emphasize the importance of diverse training data for DL-based risk prediction. They suggest that as data from more hospitals become available for training, models may become increasingly generalizable. Even so, good performance at a new hospital still depended on the inclusion of compatible hospitals during training.
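The cross-hospital comparison this abstract describes can be sketched in a few lines: compute AUROC on the training site and on an unseen site whose score distribution has shifted. Everything below is synthetic and purely illustrative; the event rate, the size of the shift, and the two "hospitals" are assumptions, not values from the study.

```python
import random

def auroc(y_true, y_score):
    """Area under the ROC curve via the Mann-Whitney U (rank) identity:
    the probability that a random event outranks a random non-event."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def simulate(shift, n=2000):
    """Synthetic risk scores: events score higher on average; `shift`
    inflates non-event scores, mimicking distribution shift at a new site."""
    y, s = [], []
    for _ in range(n):
        event = random.random() < 0.10  # ~10% event rate (assumed)
        score = random.gauss(1.5, 1.0) if event else random.gauss(shift, 1.0)
        y.append(int(event))
        s.append(score)
    return y, s

random.seed(0)
y_a, s_a = simulate(shift=0.0)   # "training" hospital A
y_b, s_b = simulate(shift=0.8)   # unseen hospital B: non-event scores shifted up
print(f"AUROC at hospital A: {auroc(y_a, s_a):.3f}")
print(f"AUROC at hospital B: {auroc(y_b, s_b):.3f}")
```

Discrimination degrades at hospital B even though the model (here, the raw score) is unchanged, which is the kind of transfer drop the study quantifies.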
Affiliation(s)
- Patrick Rockenschaub
  - Charité Lab for Artificial Intelligence in Medicine (CLAIM), Charité – Universitätsmedizin Berlin, Berlin, Germany
  - QUEST Center for Responsible Research, Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
  - Institute of Clinical Epidemiology, Public Health, Health Economics, Medical Statistics and Informatics, Medical University of Innsbruck, Innsbruck, Austria
- Adam Hilbert
  - Charité Lab for Artificial Intelligence in Medicine (CLAIM), Charité – Universitätsmedizin Berlin, Berlin, Germany
- Tabea Kossen
  - Charité Lab for Artificial Intelligence in Medicine (CLAIM), Charité – Universitätsmedizin Berlin, Berlin, Germany
- Paul Elbers
  - Department of Intensive Care Medicine, Amsterdam UMC, Vrije Universiteit, Amsterdam, The Netherlands
- Falk von Dincklage
  - Department of Anesthesia, Intensive Care, Emergency and Pain Medicine, Universitätsmedizin Greifswald, Greifswald, Germany
- Vince Istvan Madai
  - QUEST Center for Responsible Research, Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
  - Faculty of Computing, Engineering and the Built Environment, School of Computing and Digital Technology, Birmingham City University, Birmingham, United Kingdom
- Dietmar Frey
  - Charité Lab for Artificial Intelligence in Medicine (CLAIM), Charité – Universitätsmedizin Berlin, Berlin, Germany
2
Gisbert JP, Chaparro M. Tips and tricks for successfully conducting a multicenter study. Gastroenterol Hepatol 2024; 47:649-660. PMID: 38072361. DOI: 10.1016/j.gastrohep.2023.12.005.
Abstract
Multicenter studies play a crucial role in medical research and advancement, facilitating the application of new knowledge to clinical practice. These studies are associated with multiple benefits but are more complex than those involving a single center. With the philosophy that most of the qualities required to lead a multicenter study depend on attitude and can be learned, developed, and improved, in this manuscript, we share with the reader a series of recommendations that we consider important for successfully conducting such studies. The tips and tricks that will be discussed in detail are as follows: effectively leading the project; clearly defining viable and relevant objectives; designing a clear and detailed protocol; carefully selecting centers and collaborating investigators; meticulously designing the case report form; centrally managing the project efficiently; maintaining fluent communication with investigators; and, finally, designing a clear authorship policy and ensuring the appropriate publication of the study results. We hope that these suggestions encourage potential researchers to conduct multicenter studies, thereby collectively enhancing the quality of research and its application to clinical practice.
Affiliation(s)
- Javier P Gisbert
  - Servicio de Aparato Digestivo, Hospital Universitario de La Princesa, Instituto de Investigación Sanitaria Princesa (IIS-Princesa), Universidad Autónoma de Madrid (UAM), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Madrid, España
- María Chaparro
  - Servicio de Aparato Digestivo, Hospital Universitario de La Princesa, Instituto de Investigación Sanitaria Princesa (IIS-Princesa), Universidad Autónoma de Madrid (UAM), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Madrid, España
3
Abstract
BACKGROUND: Clinical prediction models should be validated before implementation in clinical practice. But is favorable performance at internal validation or one external validation sufficient to claim that a prediction model works well in the intended clinical context?
MAIN BODY: We argue to the contrary because (1) patient populations vary, (2) measurement procedures vary, and (3) populations and measurements change over time. Hence, we have to expect heterogeneity in model performance between locations and settings, and across time. It follows that prediction models are never truly validated. This does not imply that validation is not important. Rather, the current focus on developing new models should shift to a focus on more extensive, well-conducted, and well-reported validation studies of promising models.
CONCLUSION: Principled validation strategies are needed to understand and quantify heterogeneity, monitor performance over time, and update prediction models when appropriate. Such strategies will help to ensure that prediction models stay up-to-date and safe to support clinical decision-making.
4
Debray TPA, Collins GS, Riley RD, Snell KIE, Van Calster B, Reitsma JB, Moons KGM. Transparent reporting of multivariable prediction models developed or validated using clustered data (TRIPOD-Cluster): explanation and elaboration. BMJ 2023; 380:e071058. PMID: 36750236. PMCID: PMC9903176. DOI: 10.1136/bmj-2022-071058.
Affiliation(s)
- Thomas P A Debray
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
  - Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
- Gary S Collins
  - Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Botnar Research Centre, University of Oxford, Oxford, UK
  - National Institute for Health and Care Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford, UK
- Richard D Riley
  - Centre for Prognosis Research, School of Medicine, Keele University, Keele, UK
- Kym I E Snell
  - Centre for Prognosis Research, School of Medicine, Keele University, Keele, UK
- Ben Van Calster
  - Department of Development and Regeneration, KU Leuven, Leuven, Belgium
  - EPI-centre, KU Leuven, Leuven, Belgium
  - Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, Netherlands
- Johannes B Reitsma
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
  - Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
- Karel G M Moons
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
  - Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
5
Das MK. Multicenter Studies: Relevance, Design and Implementation. Indian Pediatr 2022. DOI: 10.1007/s13312-022-2561-y.
6
Weaver CGW, Basmadjian RB, Williamson T, McBrien K, Sajobi T, Boyne D, Yusuf M, Ronksley PE. Reporting of Model Performance and Statistical Methods in Studies That Use Machine Learning to Develop Clinical Prediction Models: Protocol for a Systematic Review. JMIR Res Protoc 2022; 11:e30956. PMID: 35238322. PMCID: PMC8931652. DOI: 10.2196/30956.
Abstract
BACKGROUND: With the growing excitement about the potential benefits of using machine learning and artificial intelligence in medicine, the number of published clinical prediction models that use these approaches has increased. However, there is evidence (albeit limited) that suggests that the reporting of machine learning-specific aspects in these studies is poor. Further, there are no reviews assessing the reporting quality or broadly accepted reporting guidelines for these aspects.
OBJECTIVE: This paper presents the protocol for a systematic review that will assess the reporting quality of machine learning-specific aspects in studies that use machine learning to develop clinical prediction models.
METHODS: We will include studies that use a supervised machine learning algorithm to develop a prediction model for use in clinical practice (ie, for diagnosis or prognosis of a condition or identification of candidates for health care interventions). We will search MEDLINE for studies published in 2019, pseudorandomly sort the records, and screen until we obtain 100 studies that meet our inclusion criteria. We will assess reporting quality with a novel checklist developed in parallel with this review, which includes content derived from existing reporting guidelines, textbooks, and consultations with experts. The checklist will cover 4 key areas where the reporting of machine learning studies is unique: modelling steps (order and data used for each step), model performance (eg, reporting the performance of each model compared), statistical methods (eg, describing the tuning approach), and presentation of models (eg, specifying the predictors that contributed to the final model).
RESULTS: We completed data analysis in August 2021 and are writing the manuscript. We expect to submit the results to a peer-reviewed journal in early 2022.
CONCLUSIONS: This review will contribute to more standardized and complete reporting in the field by identifying areas where reporting is poor and can be improved.
TRIAL REGISTRATION: PROSPERO International Prospective Register of Systematic Reviews CRD42020206167; https://www.crd.york.ac.uk/PROSPERO/display_record.php?RecordID=206167.
INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): RR1-10.2196/30956.
Affiliation(s)
- Colin George Wyllie Weaver
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Robert B Basmadjian
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Tyler Williamson
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Kerry McBrien
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
  - Department of Family Medicine, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Tolu Sajobi
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Devon Boyne
  - Department of Oncology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Mohamed Yusuf
  - Faculty of Science & Engineering, Manchester Metropolitan University, Manchester, United Kingdom
- Paul Everett Ronksley
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
7
Sanchez-Martinez S, Camara O, Piella G, Cikes M, González-Ballester MÁ, Miron M, Vellido A, Gómez E, Fraser AG, Bijnens B. Machine Learning for Clinical Decision-Making: Challenges and Opportunities in Cardiovascular Imaging. Front Cardiovasc Med 2022; 8:765693. PMID: 35059445. PMCID: PMC8764455. DOI: 10.3389/fcvm.2021.765693.
Abstract
The use of machine learning (ML) approaches to target clinical problems is poised to revolutionize clinical decision-making in cardiology. The success of these tools depends on understanding the intrinsic processes used in the conventional pathway by which clinicians make decisions. In parallel with this pathway, ML can have an impact at four levels: in data acquisition, predominantly by extracting standardized, high-quality information with the smallest possible learning curve; in feature extraction, by relieving healthcare practitioners of tedious measurements on raw data; in interpretation, by digesting complex, heterogeneous data in order to augment the understanding of the patient status; and in decision support, by leveraging the previous steps to predict clinical outcomes or response to treatment, or to recommend a specific intervention. This paper discusses the state of the art, the current clinical status, and the challenges associated with the two latter tasks of interpretation and decision support, together with the challenges related to the learning process, auditability/traceability, the system infrastructure, and the integration within clinical processes in cardiovascular imaging.
Affiliation(s)
- Oscar Camara
  - Department of Information and Communication Technologies, University Pompeu Fabra, Barcelona, Spain
- Gemma Piella
  - Department of Information and Communication Technologies, University Pompeu Fabra, Barcelona, Spain
- Maja Cikes
  - Department of Cardiovascular Diseases, University of Zagreb School of Medicine, University Hospital Centre Zagreb, Zagreb, Croatia
- Marius Miron
  - Joint Research Centre, European Commission, Seville, Spain
- Alfredo Vellido
  - Computer Science Department, Intelligent Data Science and Artificial Intelligence (IDEAI-UPC) Research Center, Universitat Politècnica de Catalunya, Barcelona, Spain
- Emilia Gómez
  - Department of Information and Communication Technologies, University Pompeu Fabra, Barcelona, Spain
  - Joint Research Centre, European Commission, Seville, Spain
- Alan G. Fraser
  - School of Medicine, Cardiff University, Cardiff, United Kingdom
- Bart Bijnens
  - August Pi i Sunyer Biomedical Research Institute (IDIBAPS), Barcelona, Spain
  - ICREA, Barcelona, Spain
  - Department of Cardiovascular Sciences, KU Leuven, Leuven, Belgium
8
Pavlou M, Ambler G, Omar RZ. Risk prediction in multicentre studies when there is confounding by cluster or informative cluster size. BMC Med Res Methodol 2021; 21:135. PMID: 34218793. PMCID: PMC8254921. DOI: 10.1186/s12874-021-01321-x.
Abstract
BACKGROUND: Clustered data arise in research when patients are clustered within larger units. Generalised Estimating Equations (GEE) and Generalised Linear Mixed Models (GLMM) can be used to provide marginal and cluster-specific inference and predictions, respectively.
METHODS: Confounding by Cluster (CBC) and Informative Cluster Size (ICS) are two complications that may arise when modelling clustered data. CBC can arise when the distribution of a predictor variable (termed 'exposure') varies between clusters, causing confounding of the exposure-outcome relationship. ICS means that the cluster size conditional on covariates is not independent of the outcome. In both situations, standard GEE and GLMM may provide biased or misleading inference, and modifications have been proposed. However, both CBC and ICS are routinely overlooked in the context of risk prediction, and their impact on the predictive ability of the models has been little explored. We study the effect of CBC and ICS on the predictive ability of risk models for binary outcomes when GEE and GLMM are used. We examine whether two simple approaches to handle CBC and ICS, which involve adjusting for the cluster mean of the exposure and the cluster size, respectively, can improve the accuracy of predictions.
RESULTS: Both CBC and ICS can be viewed as violations of the assumptions in the standard GLMM; the random effects are correlated with the exposure for CBC and with the cluster size for ICS. Based on these principles, we simulated data subject to CBC/ICS. The simulation studies suggested that the predictive ability of models derived using standard GLMM and GEE ignoring CBC/ICS was affected. Marginal predictions were found to be mis-calibrated. Adjusting for the cluster mean of the exposure or the cluster size improved calibration, discrimination and the overall predictive accuracy of marginal predictions, by explaining part of the between-cluster variability. The presence of CBC/ICS did not affect the accuracy of conditional predictions. We illustrate these concepts using real data from a multicentre study with potential CBC.
CONCLUSION: Ignoring CBC and ICS when developing prediction models for clustered data can affect the accuracy of marginal predictions. Adjusting for the cluster mean of the exposure or the cluster size can improve the predictive accuracy of marginal predictions.
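The adjustment the authors examine (adding the cluster mean of the exposure, and analogously the cluster size, as predictors) amounts to simple feature construction. A minimal sketch follows; the record layout and field names (`centre`, `exposure`) are hypothetical, not from the paper.

```python
from collections import defaultdict

def add_cluster_covariates(records, cluster_key="centre", exposure_key="exposure"):
    """Attach to each record the cluster mean of the exposure, the
    within-cluster deviation from that mean, and the cluster size.
    Including these as extra predictors is the adjustment discussed above."""
    totals = defaultdict(float)
    sizes = defaultdict(int)
    for r in records:
        totals[r[cluster_key]] += r[exposure_key]
        sizes[r[cluster_key]] += 1
    augmented = []
    for r in records:
        c = r[cluster_key]
        mean = totals[c] / sizes[c]
        augmented.append({**r,
                          "exposure_cluster_mean": mean,
                          "exposure_within": r[exposure_key] - mean,
                          "cluster_size": sizes[c]})
    return augmented

# Toy data: two centres with different exposure distributions (potential CBC).
patients = [
    {"centre": "A", "exposure": 2.0},
    {"centre": "A", "exposure": 4.0},
    {"centre": "B", "exposure": 10.0},
]
aug = add_cluster_covariates(patients)
```

The between-cluster component (`exposure_cluster_mean`) and the within-cluster component (`exposure_within`) can then enter the risk model separately, so marginal predictions absorb part of the between-cluster variability.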
9
Al Turkestani N, Bianchi J, Deleat-Besson R, Le C, Tengfei L, Prieto JC, Gurgel M, Ruellas ACO, Massaro C, Aliaga Del Castillo A, Evangelista K, Yatabe M, Benavides E, Soki F, Zhang W, Najarian K, Gryak J, Styner M, Fillion-Robin JC, Paniagua B, Soroushmehr R, Cevidanes LHS. Clinical decision support systems in orthodontics: A narrative review of data science approaches. Orthod Craniofac Res 2021; 24 Suppl 2:26-36. PMID: 33973362. DOI: 10.1111/ocr.12492.
Abstract
Advancements in technology and data collection have generated immense amounts of information from various sources such as health records, clinical examinations, imaging, medical devices, and experimental and biological data. Proper management and analysis of these data via high-end computing solutions, artificial intelligence and machine learning approaches can assist in extracting meaningful information that enhances population health and well-being. Furthermore, the extracted knowledge can provide new avenues for modern healthcare delivery via clinical decision support systems. This manuscript presents a narrative review of data science approaches for clinical decision support systems in orthodontics. We describe the fundamental components of data science approaches, including (a) data collection, storage and management; (b) data processing; (c) in-depth data analysis; and (d) data communication. Then, we introduce a web-based data management platform, the Data Storage for Computation and Integration, for temporomandibular joint and dental clinical decision support systems.
Affiliation(s)
- Najla Al Turkestani
  - Department of Orthodontics and Pediatric Dentistry, University of Michigan School of Dentistry, Ann Arbor, MI, USA
  - Department of Restorative and Aesthetic Dentistry, Faculty of Dentistry, King Abdulaziz University, Jeddah, Saudi Arabia
- Jonas Bianchi
  - Department of Orthodontics and Pediatric Dentistry, University of Michigan School of Dentistry, Ann Arbor, MI, USA
  - Department of Orthodontics, Arthur A. Dugoni School of Dentistry, University of the Pacific, San Francisco, CA, USA
- Romain Deleat-Besson
  - Department of Orthodontics and Pediatric Dentistry, University of Michigan School of Dentistry, Ann Arbor, MI, USA
- Celia Le
  - Department of Orthodontics and Pediatric Dentistry, University of Michigan School of Dentistry, Ann Arbor, MI, USA
- Li Tengfei
  - Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA
- Juan Carlos Prieto
  - Department of Psychiatry, University of North Carolina, Chapel Hill, NC, USA
- Marcela Gurgel
  - Department of Orthodontics and Pediatric Dentistry, University of Michigan School of Dentistry, Ann Arbor, MI, USA
- Antonio C O Ruellas
  - Department of Orthodontics and Pediatric Dentistry, University of Michigan School of Dentistry, Ann Arbor, MI, USA
  - Department of Orthodontics, School of Dentistry, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Camila Massaro
  - Department of Orthodontics and Pediatric Dentistry, University of Michigan School of Dentistry, Ann Arbor, MI, USA
  - Department of Orthodontics, Bauru Dental School, University of São Paulo, São Paulo, Brazil
- Aron Aliaga Del Castillo
  - Department of Orthodontics and Pediatric Dentistry, University of Michigan School of Dentistry, Ann Arbor, MI, USA
  - Department of Orthodontics, Bauru Dental School, University of São Paulo, São Paulo, Brazil
- Karine Evangelista
  - Department of Orthodontics and Pediatric Dentistry, University of Michigan School of Dentistry, Ann Arbor, MI, USA
  - Department of Orthodontics, School of Dentistry, University of Goias, Goiania, Brazil
- Marilia Yatabe
  - Department of Orthodontics and Pediatric Dentistry, University of Michigan School of Dentistry, Ann Arbor, MI, USA
- Erika Benavides
  - Department of Periodontics and Oral Medicine, University of Michigan School of Dentistry, Ann Arbor, MI, USA
- Fabiana Soki
  - Department of Periodontics and Oral Medicine, University of Michigan School of Dentistry, Ann Arbor, MI, USA
- Winston Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Kayvan Najarian
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Jonathan Gryak
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Martin Styner
  - Departments of Psychiatry and Computer Science, University of North Carolina, Chapel Hill, NC, USA
- Reza Soroushmehr
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Lucia H S Cevidanes
  - Department of Orthodontics and Pediatric Dentistry, University of Michigan School of Dentistry, Ann Arbor, MI, USA
10
Chappell FM, Crawford F, Horne M, Leese GP, Martin A, Weller D, Boulton AJM, Abbott C, Monteiro-Soares M, Veves A, Riley RD. Development and validation of a clinical prediction rule for development of diabetic foot ulceration: an analysis of data from five cohort studies. BMJ Open Diabetes Res Care 2021; 9:e002150. PMID: 34035053. PMCID: PMC8154962. DOI: 10.1136/bmjdrc-2021-002150.
Abstract
INTRODUCTION: The aim of the study was to develop and validate a clinical prediction rule (CPR) for foot ulceration in people with diabetes.
RESEARCH DESIGN AND METHODS: Development of a CPR using individual participant data from four international cohort studies identified by systematic review, with validation in a fifth study. Development cohorts were from primary and secondary care foot clinics in Europe and the USA (n=8255, adults over 18 years old, with diabetes, ulcer free at recruitment). Using data from monofilament testing, presence/absence of pulses, and participant history of previous ulcer and/or amputation, we developed a simple CPR to predict who will develop a foot ulcer within 2 years of initial assessment and validated it in a fifth study (n=3324). The CPR's performance was assessed with C-statistics, calibration slopes, calibration-in-the-large, and a net benefit analysis.
RESULTS: CPR scores of 0, 1, 2, 3, and 4 had a risk of ulcer within 2 years of 2.4% (95% CI 1.5% to 3.9%), 6.0% (95% CI 3.5% to 9.5%), 14.0% (95% CI 8.5% to 21.3%), 29.2% (95% CI 19.2% to 41.0%), and 51.1% (95% CI 37.9% to 64.1%), respectively. In the validation dataset, calibration-in-the-large was -0.374 (95% CI -0.561 to -0.187) and the calibration slope 1.139 (95% CI 0.994 to 1.283). The C-statistic was 0.829 (95% CI 0.790 to 0.868). The net benefit analysis suggested that people with a CPR score of 1 or more (risk of ulceration 6.0% or more) should be referred for treatment.
CONCLUSION: The clinical prediction rule is simple, uses routinely obtained data, and could help prevent foot ulcers by redirecting care to patients with scores of 1 or above. It has been validated in a community setting, and requires further validation in secondary care settings.
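The simplicity of such a rule can be shown as a score-and-lookup: one point per risk factor, with the 2-year risks taken from the point estimates reported in the abstract (confidence intervals omitted). The exact coding of each predictor below is an assumption for illustration, not the published scoring sheet.

```python
# 2-year ulceration risk by CPR score (point estimates from the abstract above).
RISK_BY_SCORE = {0: 0.024, 1: 0.060, 2: 0.140, 3: 0.292, 4: 0.511}

def cpr_score(insensate_to_monofilament, pulses_absent,
              previous_ulcer, previous_amputation):
    """One point per positive risk factor (item coding is illustrative)."""
    return sum(bool(x) for x in (insensate_to_monofilament, pulses_absent,
                                 previous_ulcer, previous_amputation))

def refer(score):
    # The net benefit analysis suggested referral at a score of 1 or more.
    return score >= 1

s = cpr_score(insensate_to_monofilament=True, pulses_absent=False,
              previous_ulcer=True, previous_amputation=False)
print(f"score={s}, 2-year risk={RISK_BY_SCORE[s]:.1%}, refer={refer(s)}")
```

A rule of this shape can be applied at the bedside without a calculator, which is part of the authors' argument for its use with routinely obtained data.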
Affiliation(s)
- Fay Crawford
  - The School of Medicine, University of St Andrews, St Andrews, Fife, UK
  - Research, Development and Innovation, NHS Fife, Dunfermline, Fife, UK
- Margaret Horne
  - Centre for Population Health Sciences, University of Edinburgh, Edinburgh, UK
- David Weller
  - Usher Institute, University of Edinburgh, Edinburgh, UK
- Andrew J M Boulton
  - Division of Diabetes, Endocrinology and Gastroenterology, University of Manchester & Manchester Royal Infirmary, Manchester, UK
  - University of Miami, Miami, Florida, USA
- Caroline Abbott
  - Manchester Metropolitan University, Manchester, Greater Manchester, UK
- Aristidis Veves
  - Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- Richard D Riley
  - School of Primary, Community and Social Care, Keele University, Keele, UK
11
Futoma J, Simons M, Panch T, Doshi-Velez F, Celi LA. The myth of generalisability in clinical research and machine learning in health care. Lancet Digit Health 2020; 2:e489-e492. PMID: 32864600. PMCID: PMC7444947. DOI: 10.1016/s2589-7500(20)30186-2.
Abstract
An emphasis on overly broad notions of generalisability, as it pertains to applications of machine learning in health care, can overlook situations in which machine learning might provide clinical utility. We believe that this narrow focus on generalisability should be replaced with wider considerations for the ultimate goal of building machine learning systems that are useful at the bedside.
Affiliation(s)
- Joseph Futoma
  - School of Engineering & Applied Sciences, Harvard University, Cambridge, MA, USA
- Morgan Simons
  - Department of Medicine, NYU Langone Health, New York, NY, USA
- Trishan Panch
  - Department of Health Policy and Management, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Wellframe, Boston, MA, USA
- Finale Doshi-Velez
  - School of Engineering & Applied Sciences, Harvard University, Cambridge, MA, USA
- Leo Anthony Celi
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Division of Pulmonary, Critical Care, and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
  - Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
12
Gerry S, Bonnici T, Birks J, Kirtley S, Virdee PS, Watkinson PJ, Collins GS. Early warning scores for detecting deterioration in adult hospital patients: systematic review and critical appraisal of methodology. BMJ 2020; 369:m1501. PMID: 32434791. PMCID: PMC7238890. DOI: 10.1136/bmj.m1501.
Abstract
OBJECTIVE To provide an overview and critical appraisal of early warning scores for adult hospital patients. DESIGN Systematic review. DATA SOURCES Medline, CINAHL, PsycInfo, and Embase until June 2019. ELIGIBILITY CRITERIA FOR STUDY SELECTION Studies describing the development or external validation of an early warning score for adult hospital inpatients. RESULTS 13 171 references were screened and 95 articles were included in the review. 11 studies were development only, 23 were development and external validation, and 61 were external validation only. Most early warning scores were developed for use in the United States (n=13/34, 38%) and the United Kingdom (n=10/34, 29%). Death was the most frequent prediction outcome for development studies (n=10/23, 44%) and validation studies (n=66/84, 79%), with different time horizons (the most frequent was 24 hours). The most common predictors were respiratory rate (n=30/34, 88%), heart rate (n=28/34, 83%), oxygen saturation, temperature, and systolic blood pressure (all n=24/34, 71%). Age (n=13/34, 38%) and sex (n=3/34, 9%) were less frequently included. Key details of the analysis populations were often not reported in development studies (n=12/29, 41%) or validation studies (n=33/84, 39%). Small sample sizes and insufficient numbers of event patients were common in model development and external validation studies. Missing data were often discarded, with just one study using multiple imputation. Only nine of the early warning scores that were developed were presented in sufficient detail to allow individualised risk prediction. Internal validation was carried out in 19 studies, but recommended approaches such as bootstrapping or cross validation were rarely used (n=4/19, 22%). Model performance was frequently assessed using discrimination (development n=18/22, 82%; validation n=69/84, 82%), while calibration was seldom assessed (validation n=13/84, 15%). All included studies were rated at high risk of bias. 
CONCLUSIONS Early warning scores are widely used prediction models that are often mandated in daily clinical practice to identify early clinical deterioration in hospital patients. However, many early warning scores in clinical use were found to have methodological weaknesses. Early warning scores might not perform as well as expected and therefore they could have a detrimental effect on patient care. Future work should focus on following recommended approaches for developing and evaluating early warning scores, and investigating the impact and safety of using these scores in clinical practice. SYSTEMATIC REVIEW REGISTRATION PROSPERO CRD42017053324.
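The review above notes that model performance was usually assessed with discrimination while calibration was seldom examined. As a minimal sketch of what both checks involve, here are plain-Python implementations of the AUROC (concordance) and calibration-in-the-large on invented toy data; all numbers are illustrative only, not taken from any study in this list.

```python
def auroc(y_true, y_score):
    """Discrimination: probability that a randomly chosen event patient
    is scored higher than a randomly chosen non-event patient
    (ties count as half a win)."""
    events = [s for y, s in zip(y_true, y_score) if y == 1]
    non_events = [s for y, s in zip(y_true, y_score) if y == 0]
    pairs = [(e, n) for e in events for n in non_events]
    wins = sum(1.0 if e > n else 0.5 if e == n else 0.0 for e, n in pairs)
    return wins / len(pairs)

def calibration_in_the_large(y_true, y_prob):
    """Calibration-in-the-large: observed event rate minus mean predicted
    risk; 0 means the predictions are correct on average."""
    return sum(y_true) / len(y_true) - sum(y_prob) / len(y_prob)

# Toy outcomes and predicted risks (invented for illustration).
y = [0, 0, 0, 1, 0, 1, 1, 0, 1, 0]
p = [0.1, 0.2, 0.15, 0.8, 0.3, 0.6, 0.9, 0.05, 0.7, 0.25]

print(round(auroc(y, p), 3))                      # ranking is perfect here -> 1.0
print(round(calibration_in_the_large(y, p), 3))
```

A score can rank patients perfectly (AUROC of 1.0) and still be miscalibrated on average, which is why the review treats the two measures as complementary.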
Affiliation(s)
- Stephen Gerry: Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
- Timothy Bonnici: Critical Care Division, University College London Hospitals NHS Trust, London, UK
- Jacqueline Birks: Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK; Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Shona Kirtley: Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
- Pradeep S Virdee: Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
- Peter J Watkinson: Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK
- Gary S Collins: Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK; Oxford University Hospitals NHS Foundation Trust, Oxford, UK

13
Rysavy MA, Horbar JD, Bell EF, Li L, Greenberg LT, Tyson JE, Patel RM, Carlo WA, Younge NE, Green CE, Edwards EM, Hintz SR, Walsh MC, Buzas JS, Das A, Higgins RD. Assessment of an Updated Neonatal Research Network Extremely Preterm Birth Outcome Model in the Vermont Oxford Network. JAMA Pediatr 2020; 174:e196294. PMID: 32119065. PMCID: PMC7052789. DOI: 10.1001/jamapediatrics.2019.6294.
Abstract
IMPORTANCE The Eunice Kennedy Shriver National Institute of Child Health and Human Development Neonatal Research Network (NRN) extremely preterm birth outcome model is widely used for prognostication by practitioners caring for families expecting extremely preterm birth. The model provides information on mean outcomes from 1998 to 2003 and does not account for substantial variation in outcomes among US hospitals. OBJECTIVE To update and validate the NRN extremely preterm birth outcome model for most extremely preterm infants in the United States. DESIGN, SETTING, AND PARTICIPANTS This prognostic study included 3 observational cohorts from January 1, 2006, to December 31, 2016, at 19 US centers in the NRN (derivation cohort) and 637 US centers in Vermont Oxford Network (VON) (validation cohorts). Actively treated infants born at 22 weeks' 0 days' to 25 weeks' 6 days' gestation and weighing 401 to 1000 g, including 4176 in the NRN for 2006 to 2012, 45 179 in VON for 2006 to 2012, and 25 969 in VON for 2013 to 2016, were studied. VON cohorts comprised more than 85% of eligible US births. Data analysis was performed from May 1, 2017, to March 31, 2019. EXPOSURES Predictive variables used in the original model, including infant sex, birth weight, plurality, gestational age at birth, and exposure to antenatal corticosteroids. MAIN OUTCOMES AND MEASURES The main outcome was death before discharge. Secondary outcomes included neurodevelopmental impairment at 18 to 26 months' corrected age and measures of hospital resource use (days of hospitalization and ventilator use). RESULTS Among 4176 actively treated infants in the NRN cohort (48% female; mean [SD] gestational age, 24.2 [0.8] weeks), survival was 63% vs 62% among 3702 infants in the era of the original model (47% female; mean [SD] gestational age, 24.2 [0.8] weeks). 
In the concurrent (2006-2012) VON cohort, survival was 66% among 45 179 actively treated infants (47% female; mean [SD] gestational age, 24.1 [0.8] weeks) and 70% among 25 969 infants from 2013 to 2016 (48% female; mean [SD] gestational age, 24.1 [0.8] weeks). Model C statistics were 0.74 in the 2006-2012 validation cohort and 0.73 in the 2013-2016 validation cohort. With the use of decision curve analysis to compare the model with a gestational age-only approach to prognostication, the updated model showed a predictive advantage. The birth hospital contributed as much to the prediction of survival as gestational age (20%) but less than the other factors combined (60%). CONCLUSIONS AND RELEVANCE An updated model using well-known factors to predict survival for extremely preterm infants performed moderately well when applied to large US cohorts. Because survival rates change over time, the model requires periodic updating. The hospital of birth contributed substantially to outcome prediction.
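The decision curve analysis mentioned above compares models by net benefit across risk thresholds. A hedged sketch of the standard net-benefit formula, NB = TP/N - (FP/N) * pt/(1 - pt), on invented toy data (not the NRN/VON data) looks like this:

```python
def net_benefit(y_true, y_prob, pt):
    """Net benefit of the policy 'treat if predicted risk >= pt' at
    threshold probability pt: NB = TP/N - (FP/N) * pt / (1 - pt)."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 0)
    return tp / n - (fp / n) * pt / (1 - pt)

def net_benefit_treat_all(y_true, pt):
    """Reference strategy that treats everyone regardless of risk."""
    prev = sum(y_true) / len(y_true)
    return prev - (1 - prev) * pt / (1 - pt)

# Invented outcomes and predicted risks, for illustration only.
y = [1, 1, 1, 0, 0, 0, 0, 0]
p = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1, 0.3, 0.2]

for pt in (0.3, 0.5):
    print(pt, round(net_benefit(y, p, pt), 3),
          round(net_benefit_treat_all(y, pt), 3))
```

A model shows a "predictive advantage" in this framework when its net benefit exceeds that of the comparator strategy (here treat-all; in the study, a gestational age-only approach) over the clinically relevant threshold range.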
Affiliation(s)
- Matthew A. Rysavy: Stead Family Department of Pediatrics, University of Iowa, Iowa City
- Jeffrey D. Horbar: Vermont Oxford Network, Burlington; Department of Pediatrics, University of Vermont College of Medicine, Burlington
- Edward F. Bell: Stead Family Department of Pediatrics, University of Iowa, Iowa City
- Lei Li: Biostatistics and Epidemiology Division, RTI International, Research Triangle Park, North Carolina
- Jon E. Tyson: Center for Clinical Research & Evidence-Based Medicine, University of Texas McGovern Medical School, Houston
- Ravi M. Patel: Department of Pediatrics, Emory University School of Medicine, Atlanta, Georgia
- Noelle E. Younge: Department of Pediatrics, Duke University School of Medicine, Durham, North Carolina
- Charles E. Green: Center for Clinical Research & Evidence-Based Medicine, University of Texas McGovern Medical School, Houston
- Erika M. Edwards: Vermont Oxford Network, Burlington; Department of Mathematics and Statistics, College of Engineering and Mathematical Sciences, University of Vermont, Burlington
- Susan R. Hintz: Department of Pediatrics, Stanford University School of Medicine, Palo Alto, California
- Michele C. Walsh: Department of Pediatrics, Case Western Reserve University, Cleveland, Ohio
- Jeffrey S. Buzas: Department of Mathematics and Statistics, College of Engineering and Mathematical Sciences, University of Vermont, Burlington
- Abhik Das: Biostatistics and Epidemiology Division, RTI International, Rockville, Maryland
- Rosemary D. Higgins: Office of Research, George Mason University College of Health and Human Services, Fairfax, Virginia

14
Wynants L, Van Calster B, Collins GS, Riley RD, Heinze G, Schuit E, Bonten MMJ, Dahly DL, Damen JAA, Debray TPA, de Jong VMT, De Vos M, Dhiman P, Haller MC, Harhay MO, Henckaerts L, Heus P, Kammer M, Kreuzberger N, Lohmann A, Luijken K, Ma J, Martin GP, McLernon DJ, Andaur Navarro CL, Reitsma JB, Sergeant JC, Shi C, Skoetz N, Smits LJM, Snell KIE, Sperrin M, Spijker R, Steyerberg EW, Takada T, Tzoulaki I, van Kuijk SMJ, van Bussel B, van der Horst ICC, van Royen FS, Verbakel JY, Wallisch C, Wilkinson J, Wolff R, Hooft L, Moons KGM, van Smeden M. Prediction models for diagnosis and prognosis of covid-19: systematic review and critical appraisal. BMJ 2020; 369:m1328. PMID: 32265220. PMCID: PMC7222643. DOI: 10.1136/bmj.m1328.
Abstract
OBJECTIVE To review and appraise the validity and usefulness of published and preprint reports of prediction models for diagnosing coronavirus disease 2019 (covid-19) in patients with suspected infection, for prognosis of patients with covid-19, and for detecting people in the general population at increased risk of covid-19 infection or being admitted to hospital with the disease. DESIGN Living systematic review and critical appraisal by the COVID-PRECISE (Precise Risk Estimation to optimise covid-19 Care for Infected or Suspected patients in diverse sEttings) group. DATA SOURCES PubMed and Embase through Ovid, up to 1 July 2020, supplemented with arXiv, medRxiv, and bioRxiv up to 5 May 2020. STUDY SELECTION Studies that developed or validated a multivariable covid-19 related prediction model. DATA EXTRACTION At least two authors independently extracted data using the CHARMS (critical appraisal and data extraction for systematic reviews of prediction modelling studies) checklist; risk of bias was assessed using PROBAST (prediction model risk of bias assessment tool). RESULTS 37 421 titles were screened, and 169 studies describing 232 prediction models were included. The review identified seven models for identifying people at risk in the general population; 118 diagnostic models for detecting covid-19 (75 based on medical imaging, 10 for diagnosing disease severity); and 107 prognostic models for predicting mortality risk, progression to severe disease, intensive care unit admission, ventilation, intubation, or length of hospital stay. The most frequent types of predictors included in the covid-19 prediction models were vital signs, age, comorbidities, and image features. Flu-like symptoms were frequently predictive in diagnostic models, while sex, C reactive protein, and lymphocyte counts were frequent prognostic factors.
Reported C index estimates from the strongest form of validation available per model ranged from 0.71 to 0.99 in prediction models for the general population, from 0.65 to more than 0.99 in diagnostic models, and from 0.54 to 0.99 in prognostic models. All models were rated at high or unclear risk of bias, mostly because of non-representative selection of control patients, exclusion of patients who had not experienced the event of interest by the end of the study, high risk of model overfitting, and unclear reporting. Many models did not include a description of the target population (n=27, 12%) or care setting (n=75, 32%), and only 11 (5%) were externally validated with a calibration plot. The Jehi diagnostic model and the 4C mortality score were identified as promising models. CONCLUSION Prediction models for covid-19 are quickly entering the academic literature to support medical decision making at a time when they are urgently needed. This review indicates that almost all published prediction models are poorly reported and at high risk of bias, such that their reported predictive performance is probably optimistic. However, we have identified two (one diagnostic and one prognostic) promising models that should soon be validated in multiple cohorts, preferably through collaborative efforts and data sharing to also allow an investigation of the stability and heterogeneity in their performance across populations and settings. Details on all reviewed models are publicly available at https://www.covprecise.org/. Methodological guidance as provided in this paper should be followed because unreliable predictions could cause more harm than benefit in guiding clinical decisions. Finally, prediction model authors should adhere to the TRIPOD (transparent reporting of a multivariable prediction model for individual prognosis or diagnosis) reporting guideline. SYSTEMATIC REVIEW REGISTRATION Protocol https://osf.io/ehc47/, registration https://osf.io/wy245.
READERS' NOTE This article is a living systematic review that will be updated to reflect emerging evidence. Updates may occur for up to two years from the date of original publication. This version is update 3 of the original article published on 7 April 2020 (BMJ 2020;369:m1328). Previous updates can be found as data supplements (https://www.bmj.com/content/369/bmj.m1328/related#datasupp). When citing this paper please consider adding the update number and date of access for clarity.
Affiliation(s)
- Laure Wynants: Department of Epidemiology, CAPHRI Care and Public Health Research Institute, Maastricht University, Peter Debyeplein 1, 6229 HA Maastricht, Netherlands; Department of Development and Regeneration, KU Leuven, Leuven, Belgium
- Ben Van Calster: Department of Development and Regeneration, KU Leuven, Leuven, Belgium; Department of Biomedical Data Sciences, Leiden University Medical Centre, Leiden, Netherlands
- Gary S Collins: Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Musculoskeletal Sciences, University of Oxford, Oxford, UK; NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford, UK
- Richard D Riley: Centre for Prognosis Research, School of Primary, Community and Social Care, Keele University, Keele, UK
- Georg Heinze: Section for Clinical Biometrics, Centre for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria
- Ewoud Schuit: Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands; Cochrane Netherlands, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
- Marc M J Bonten: Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands; Department of Medical Microbiology, University Medical Centre Utrecht, Utrecht, Netherlands
- Darren L Dahly: HRB Clinical Research Facility, Cork, Ireland; School of Public Health, University College Cork, Cork, Ireland
- Johanna A A Damen: Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands; Cochrane Netherlands, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
- Thomas P A Debray: Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands; Cochrane Netherlands, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
- Valentijn M T de Jong: Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands; Cochrane Netherlands, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
- Maarten De Vos: Department of Development and Regeneration, KU Leuven, Leuven, Belgium; Department of Electrical Engineering, ESAT Stadius, KU Leuven, Leuven, Belgium
- Paul Dhiman: Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Musculoskeletal Sciences, University of Oxford, Oxford, UK; NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford, UK
- Maria C Haller: Section for Clinical Biometrics, Centre for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria; Ordensklinikum Linz, Hospital Elisabethinen, Department of Nephrology, Linz, Austria
- Michael O Harhay: Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Palliative and Advanced Illness Research Center and Division of Pulmonary and Critical Care Medicine, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Liesbet Henckaerts: Department of Microbiology, Immunology and Transplantation, KU Leuven-University of Leuven, Leuven, Belgium; Department of General Internal Medicine, KU Leuven-University Hospitals Leuven, Leuven, Belgium
- Pauline Heus: Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands; Cochrane Netherlands, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
- Michael Kammer: Section for Clinical Biometrics, Centre for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria; Department of Nephrology, Medical University of Vienna, Vienna, Austria
- Nina Kreuzberger: Evidence-Based Oncology, Department I of Internal Medicine and Centre for Integrated Oncology Aachen Bonn Cologne Dusseldorf, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Anna Lohmann: Department of Clinical Epidemiology, Leiden University Medical Centre, Leiden, Netherlands
- Kim Luijken: Department of Clinical Epidemiology, Leiden University Medical Centre, Leiden, Netherlands
- Jie Ma: NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford, UK
- Glen P Martin: Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, University of Manchester, Manchester, UK
- David J McLernon: Institute of Applied Health Sciences, University of Aberdeen, Aberdeen, UK
- Constanza L Andaur Navarro: Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands; Cochrane Netherlands, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
- Johannes B Reitsma: Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands; Cochrane Netherlands, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
- Jamie C Sergeant: Centre for Biostatistics, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK; Centre for Epidemiology Versus Arthritis, Centre for Musculoskeletal Research, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- Chunhu Shi: Division of Nursing, Midwifery and Social Work, School of Health Sciences, University of Manchester, Manchester, UK
- Nicole Skoetz: Department of Nephrology, Medical University of Vienna, Vienna, Austria
- Luc J M Smits: Department of Epidemiology, CAPHRI Care and Public Health Research Institute, Maastricht University, Peter Debyeplein 1, 6229 HA Maastricht, Netherlands
- Kym I E Snell: Centre for Prognosis Research, School of Primary, Community and Social Care, Keele University, Keele, UK
- Matthew Sperrin: Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
- René Spijker: Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands; Cochrane Netherlands, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands; Amsterdam UMC, University of Amsterdam, Amsterdam Public Health, Medical Library, Netherlands
- Ewout W Steyerberg: Department of Biomedical Data Sciences, Leiden University Medical Centre, Leiden, Netherlands
- Toshihiko Takada: Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
- Ioanna Tzoulaki: Department of Epidemiology and Biostatistics, Imperial College London School of Public Health, London, UK; Department of Hygiene and Epidemiology, University of Ioannina Medical School, Ioannina, Greece
- Sander M J van Kuijk: Department of Clinical Epidemiology and Medical Technology Assessment, Maastricht University Medical Centre+, Maastricht, Netherlands
- Bas van Bussel: Department of Epidemiology, CAPHRI Care and Public Health Research Institute, Maastricht University, Peter Debyeplein 1, 6229 HA Maastricht, Netherlands; Department of Intensive Care, Maastricht University Medical Centre+, Maastricht University, Maastricht, Netherlands
- Iwan C C van der Horst: Department of Intensive Care, Maastricht University Medical Centre+, Maastricht University, Maastricht, Netherlands
- Florien S van Royen: Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
- Jan Y Verbakel: EPI-Centre, Department of Public Health and Primary Care, KU Leuven, Leuven, Belgium; Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK
- Christine Wallisch: Section for Clinical Biometrics, Centre for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria; Charité Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany; Berlin Institute of Health, Berlin, Germany
- Jack Wilkinson: Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, University of Manchester, Manchester, UK
- Lotty Hooft: Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands; Cochrane Netherlands, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
- Karel G M Moons: Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands; Cochrane Netherlands, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
- Maarten van Smeden: Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands

15
Falconieri N, Van Calster B, Timmerman D, Wynants L. Developing risk models for multicenter data using standard logistic regression produced suboptimal predictions: A simulation study. Biom J 2020; 62:932-944. PMID: 31957077. PMCID: PMC7383814. DOI: 10.1002/bimj.201900075.
Abstract
Although multicenter data are common, many prediction model studies ignore this during model development. The objective of this study is to evaluate the predictive performance of regression methods for developing clinical risk prediction models using multicenter data, and provide guidelines for practice. We compared the predictive performance of standard logistic regression, generalized estimating equations, random intercept logistic regression, and fixed effects logistic regression. First, we presented a case study on the diagnosis of ovarian cancer. Subsequently, a simulation study investigated the performance of the different models as a function of the amount of clustering, development sample size, distribution of center-specific intercepts, the presence of a center-predictor interaction, and the presence of a dependency between center effects and predictors. The results showed that when sample sizes were sufficiently large, conditional models yielded calibrated predictions, whereas marginal models yielded miscalibrated predictions. Small sample sizes led to overfitting and unreliable predictions. This miscalibration was worse with more heavily clustered data. Calibration of random intercept logistic regression was better than that of standard logistic regression even when center-specific intercepts were not normally distributed, a center-predictor interaction was present, center effects and predictors were dependent, or when the model was applied in a new center. Therefore, to make reliable predictions in a specific center, we recommend random intercept logistic regression.
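The core mechanism behind this abstract's finding, that a marginal (pooled) model miscalibrates within individual centers when center-specific intercepts vary, can be shown with a deterministic toy calculation. The intercepts below are invented for illustration and are not from the study; the example uses an intercept-only model so no fitting is needed.

```python
import math

def sigmoid(x):
    """Inverse logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Invented center-specific intercepts on the log-odds scale.
centers = {"A": -2.0, "B": 0.0, "C": 2.0}

# A marginal intercept-only model ignores the clustering and predicts the
# overall mean risk for every patient, regardless of center.
marginal_pred = sum(sigmoid(a) for a in centers.values()) / len(centers)

# Within each center, calibration-in-the-large = observed risk - predicted.
# Nonzero values mean the pooled model is miscalibrated in that center.
for name, a in centers.items():
    print(name, round(sigmoid(a) - marginal_pred, 3))
```

Center B happens to sit at the pooled average and is well calibrated, while centers A and C are over- and under-predicted by roughly 0.38. A random intercept model instead estimates a per-center offset (shrunken toward zero), which is why the authors recommend it for predictions within a specific center.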
Affiliation(s)
- Nora Falconieri: Department of Development and Regeneration, KU Leuven, Leuven, Belgium
- Ben Van Calster: Department of Development and Regeneration, KU Leuven, Leuven, Belgium; Department of Biomedical Data Sciences, Leiden University Medical Center (LUMC), Leiden, The Netherlands
- Dirk Timmerman: Department of Development and Regeneration, KU Leuven, Leuven, Belgium; Department of Obstetrics and Gynecology, University Hospitals Leuven, Leuven, Belgium
- Laure Wynants: Department of Development and Regeneration, KU Leuven, Leuven, Belgium; Department of Epidemiology, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, The Netherlands

16
Debray TP, de Jong VM, Moons KG, Riley RD. Evidence synthesis in prognosis research. Diagn Progn Res 2019; 3:13. PMID: 31338426. PMCID: PMC6621956. DOI: 10.1186/s41512-019-0059-4.
Abstract
Over the past few years, evidence synthesis has become essential to investigate and improve the generalizability of medical research findings. This strategy often involves a meta-analysis to formally summarize quantities of interest, such as relative treatment effect estimates. The use of meta-analysis methods is, however, less straightforward in prognosis research because substantial variation exists in research objectives, analysis methods and the level of reported evidence. We present a gentle overview of statistical methods that can be used to summarize data of prognostic factor and prognostic model studies. We discuss how aggregate data, individual participant data, or a combination thereof can be combined through meta-analysis methods. Recent examples are provided throughout to illustrate the various methods.
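One standard way to combine aggregate data from prognostic model studies, as discussed in this overview, is a random-effects meta-analysis. Below is a minimal sketch of the DerSimonian-Laird moment estimator applied to invented logit-transformed C statistics; the input numbers are illustrative only and do not come from the paper.

```python
import math

def dersimonian_laird(estimates, variances):
    """Random-effects pooling of study-level estimates (e.g., logit C
    statistics) using the DerSimonian-Laird moment estimator of the
    between-study variance tau^2."""
    w = [1.0 / v for v in variances]                       # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = len(estimates) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                          # truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]         # random-effect weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, tau2, se

# Invented aggregate data: logit C statistics and their sampling variances.
logit_c = [1.2, 0.9, 1.5, 1.1]
var_c = [0.04, 0.06, 0.05, 0.03]
pooled, tau2, se = dersimonian_laird(logit_c, var_c)
print(round(pooled, 3), round(tau2, 4), round(se, 3))
```

Pooling on the logit scale (and back-transforming the pooled value with the inverse logit) is a common choice for C statistics because it respects the 0-1 bounds; individual participant data, when available, allow richer models than this aggregate-data sketch.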
Affiliation(s)
- Thomas P.A. Debray: Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Universiteitsweg 100, Utrecht, 3584 CG, The Netherlands; Cochrane Netherlands, University Medical Center Utrecht, Universiteitsweg 100, Utrecht, 3584 CG, The Netherlands
- Valentijn M.T. de Jong: Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Universiteitsweg 100, Utrecht, 3584 CG, The Netherlands
- Karel G.M. Moons: Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Universiteitsweg 100, Utrecht, 3584 CG, The Netherlands; Cochrane Netherlands, University Medical Center Utrecht, Universiteitsweg 100, Utrecht, 3584 CG, The Netherlands
- Richard D. Riley: Research Institute for Primary Care & Health Sciences, Keele University, Staffordshire, ST5 5BG, UK