101
Youssef A, Pencina M, Thakur A, Zhu T, Clifton D, Shah NH. External validation of AI models in health should be replaced with recurring local validation. Nat Med 2023; 29:2686-2687. PMID: 37853136. DOI: 10.1038/s41591-023-02540-z.
Affiliation(s)
- Alexey Youssef
- Stanford Bioengineering Department, Stanford University, Stanford, CA, USA
- Department of Engineering Science, University of Oxford, Oxford, UK
- Anshul Thakur
- Department of Engineering Science, University of Oxford, Oxford, UK
- Tingting Zhu
- Department of Engineering Science, University of Oxford, Oxford, UK
- David Clifton
- Department of Engineering Science, University of Oxford, Oxford, UK
- Oxford-Suzhou Centre for Advanced Research, Suzhou, China
- Nigam H Shah
- Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, CA, USA
- Technology and Digital Solutions, Stanford Medicine, Stanford, CA, USA
- Clinical Excellence Research Center, Stanford Medicine, Stanford, CA, USA
102
Ratna MB, Bhattacharya S, McLernon DJ. External validation of models for predicting cumulative live birth over multiple complete cycles of IVF treatment. Hum Reprod 2023; 38:1998-2010. PMID: 37632223. PMCID: PMC10546080. DOI: 10.1093/humrep/dead165.
Abstract
STUDY QUESTION Can two prediction models developed using data from 1999 to 2009 accurately predict the cumulative probability of live birth per woman over multiple complete cycles of IVF in an updated UK cohort? SUMMARY ANSWER After being updated, the models were able to estimate individualized chances of cumulative live birth over multiple complete cycles of IVF with greater accuracy. WHAT IS KNOWN ALREADY The McLernon models were the first to predict cumulative live birth over multiple complete cycles of IVF. They were converted into an online calculator called OPIS (Outcome Prediction In Subfertility) which has 3000 users per month on average. A previous study externally validated the McLernon models using a Dutch prospective cohort containing data from 2011 to 2014. With changes in IVF practice over time, it is important that the McLernon models are externally validated on a more recent cohort of patients to ensure that predictions remain accurate. STUDY DESIGN, SIZE, DURATION A population-based cohort of 91 035 women undergoing IVF in the UK between January 2010 and December 2016 was used for external validation. Data on frozen embryo transfers associated with these complete IVF cycles conducted from 1 January 2017 to 31 December 2017 were also collected. PARTICIPANTS/MATERIALS, SETTING, METHODS Data on IVF treatments were obtained from the Human Fertilisation and Embryology Authority (HFEA). The predictive performances of the McLernon models were evaluated in terms of discrimination and calibration. Discrimination was assessed using the c-statistic and calibration was assessed using calibration-in-the-large, calibration slope, and calibration plots. Where any model demonstrated poor calibration in the validation cohort, the models were updated using intercept recalibration, logistic recalibration, or model revision to improve model performance. 
MAIN RESULTS AND THE ROLE OF CHANCE Following exclusions, 91 035 women who underwent 144 734 complete cycles were included. The age distribution of the validation cohort was similar to that of the development cohort. Live birth rates over all complete cycles of IVF per woman were higher in the validation cohort. After calibration assessment, both models required updating. The coefficients of the pre-treatment model were revised, and the updated model showed reasonable discrimination (c-statistic: 0.67, 95% CI: 0.66 to 0.68). After logistic recalibration, the post-treatment model showed good discrimination (c-statistic: 0.75, 95% CI: 0.74 to 0.76). As an example, in the updated pre-treatment model, a 32-year-old woman with 2 years of primary infertility has a 42% chance of having a live birth in the first complete ICSI cycle and a 77% chance over three complete cycles. In a couple with 2 years of primary male factor infertility where a 30-year-old woman has 15 oocytes collected in the first cycle, a single fresh blastocyst embryo transferred in the first cycle and spare embryos cryopreserved, the estimated chance of live birth provided by the post-treatment model is 46% in the first complete ICSI cycle and 81% over three complete cycles. LIMITATIONS, REASONS FOR CAUTION Two predictors from the original models, duration of infertility and previous pregnancy, which were not available in the recent HFEA dataset, were imputed using data from the older cohort used to develop the models. The HFEA dataset does not contain some other potentially important predictors, e.g. BMI, ethnicity, race, smoking and alcohol intake in women, as well as measures of ovarian reserve such as antral follicle count. WIDER IMPLICATIONS OF THE FINDINGS Both updated models show improved predictive ability and provide estimates which are more reflective of current practice and patient case mix.
The updated OPIS tool can be used by clinicians to help shape couples' expectations by informing them of their individualized chances of live birth over a sequence of multiple complete cycles of IVF. STUDY FUNDING/COMPETING INTEREST(S) This study was supported by an Elphinstone scholarship scheme at the University of Aberdeen and Aberdeen Fertility Centre, University of Aberdeen. S.B. has a commitment of research funding from Merck. D.J.M. and M.B.R. declare support for the present manuscript from Elphinstone scholarship scheme at the University of Aberdeen and Assisted Reproduction Unit at Aberdeen Fertility Centre, University of Aberdeen. D.J.M. declares grants received by University of Aberdeen from NHS Grampian, The Meikle Foundation, and Chief Scientist Office in the past 3 years. D.J.M. declares receiving an honorarium for lectures from Merck. D.J.M. is Associate Editor of Human Reproduction Open and Statistical Advisor for Reproductive BioMed Online. S.B. declares royalties from Cambridge University Press for a book. S.B. declares receiving an honorarium for lectures from Merck, Organon, Ferring, Obstetric and Gynaecological Society of Singapore, and Taiwanese Society for Reproductive Medicine. S.B. has received support from Merck, ESHRE, and Ferring for attending meetings as speaker and is on the METAFOR and CAPRE Trials Data Monitoring Committee. TRIAL REGISTRATION NUMBER N/A.
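The two validation steps this abstract relies on, the c-statistic for discrimination and logistic recalibration for model updating, can be sketched in plain Python. This is an illustrative sketch only, not the authors' code: the gradient-descent fitting and all function names are our assumptions.

```python
import math

def c_statistic(y, p):
    # Probability that a randomly chosen event receives a higher
    # predicted risk than a randomly chosen non-event (ties count 0.5).
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    non_events = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if e > ne else 0.5 if e == ne else 0.0
               for e in events for ne in non_events)
    return wins / (len(events) * len(non_events))

def logistic_recalibration(y, p, lr=0.5, steps=5000):
    # Refit intercept a and slope b of logit(risk) = a + b * logit(p)
    # by gradient descent on the log-loss. A fitted slope b < 1 suggests
    # the original predictions are too extreme for the new cohort.
    lp = [math.log(pi / (1 - pi)) for pi in p]
    a, b, n = 0.0, 1.0, len(y)
    for _ in range(steps):
        preds = [1 / (1 + math.exp(-(a + b * x))) for x in lp]
        grad_a = sum(pr - yi for pr, yi in zip(preds, y)) / n
        grad_b = sum((pr - yi) * x for pr, yi, x in zip(preds, y, lp)) / n
        a, b = a - lr * grad_a, b - lr * grad_b
    return a, b
```

A c-statistic of 0.5 is chance-level discrimination; the updated pre- and post-treatment models above reached 0.67 and 0.75, respectively.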
Affiliation(s)
- Mariam B Ratna
- Institute of Applied Health Sciences, School of Medicine, Medical Sciences & Nutrition, University of Aberdeen, Aberdeen, UK
- Clinical Trials Unit, Warwick Medical School, University of Warwick, Warwick, UK
- David J McLernon
- Institute of Applied Health Sciences, School of Medicine, Medical Sciences & Nutrition, University of Aberdeen, Aberdeen, UK
103
Vaid A, Sawant A, Suarez-Farinas M, Lee J, Kaul S, Kovatch P, Freeman R, Jiang J, Jayaraman P, Fayad Z, Argulian E, Lerakis S, Charney AW, Wang F, Levin M, Glicksberg B, Narula J, Hofer I, Singh K, Nadkarni GN. Implications of the Use of Artificial Intelligence Predictive Models in Health Care Settings: A Simulation Study. Ann Intern Med 2023; 176:1358-1369. PMID: 37812781. DOI: 10.7326/m23-0949.
Abstract
BACKGROUND Substantial effort has been directed toward demonstrating uses of predictive models in health care. However, implementation of these models into clinical practice may influence patient outcomes, which in turn are captured in electronic health record data. As a result, deployed models may affect the predictive ability of current and future models. OBJECTIVE To estimate changes in predictive model performance with use through 3 common scenarios: model retraining, sequentially implementing 1 model after another, and intervening in response to a model when 2 are simultaneously implemented. DESIGN Simulation of model implementation and use in critical care settings at various levels of intervention effectiveness and clinician adherence. Models were either trained or retrained after simulated implementation. SETTING Admissions to the intensive care unit (ICU) at Mount Sinai Health System (New York, New York) and Beth Israel Deaconess Medical Center (Boston, Massachusetts). PATIENTS 130 000 critical care admissions across both health systems. INTERVENTION Across 3 scenarios, interventions were simulated at varying levels of clinician adherence and effectiveness. MEASUREMENTS Statistical measures of performance, including threshold-independent (area under the curve) and threshold-dependent measures. RESULTS At fixed 90% sensitivity, in scenario 1 a mortality prediction model lost 9% to 39% specificity after retraining once and in scenario 2 a mortality prediction model lost 8% to 15% specificity when created after the implementation of an acute kidney injury (AKI) prediction model; in scenario 3, models for AKI and mortality prediction implemented simultaneously, each led to reduced effective accuracy of the other by 1% to 28%. LIMITATIONS In real-world practice, the effectiveness of and adherence to model-based recommendations are rarely known in advance. Only binary classifiers for tabular ICU admissions data were simulated. 
CONCLUSION In simulated ICU settings, a universally effective model-updating approach for maintaining model performance does not seem to exist. Model use may have to be recorded to maintain viability of predictive modeling. PRIMARY FUNDING SOURCE National Center for Advancing Translational Sciences.
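The threshold-dependent comparison reported here, specificity measured at a fixed 90% sensitivity, can be reproduced with a small helper. This is a hedged sketch; the function name and the convention that a score at or above the threshold flags a positive are our assumptions, not the authors' simulation code.

```python
import math

def specificity_at_sensitivity(y, scores, target_sens=0.90):
    # Choose the highest decision threshold that still achieves the
    # target sensitivity, then report specificity at that threshold.
    pos = sorted((s for s, yi in zip(scores, y) if yi == 1), reverse=True)
    neg = [s for s, yi in zip(scores, y) if yi == 0]
    k = math.ceil(target_sens * len(pos))  # events that must be flagged
    threshold = pos[k - 1]                 # flag scores >= threshold
    true_neg = sum(1 for s in neg if s < threshold)
    return true_neg / len(neg), threshold
```

Comparing this specificity before and after a simulated retraining is, in essence, the drop the study quantifies.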
Affiliation(s)
- Akhil Vaid
- Division of Data-Driven and Digital Medicine, Department of Medicine, and The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, New York (A.V., P.J.)
- Ashwin Sawant
- Division of Data-Driven and Digital Medicine, Department of Medicine; The Charles Bronfman Institute of Personalized Medicine; and Division of Hospital Medicine, Icahn School of Medicine at Mount Sinai, New York, New York (A.S.)
- Mayte Suarez-Farinas
- Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, New York (M.S., J.L.)
- Juhee Lee
- Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, New York (M.S., J.L.)
- Sanjeev Kaul
- Department of Surgery, Hackensack Meridian School of Medicine, Nutley, New Jersey (S.K.)
- Patricia Kovatch
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York (P.K., B.G.)
- Robert Freeman
- Division of Data-Driven and Digital Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York (R.F.)
- Joy Jiang
- The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, New York (J.J.)
- Pushkala Jayaraman
- Division of Data-Driven and Digital Medicine, Department of Medicine, and The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, New York (A.V., P.J.)
- Zahi Fayad
- BioMedical Engineering and Imaging Institute and Mount Sinai Heart, Icahn School of Medicine at Mount Sinai, New York, New York (Z.F.)
- Edgar Argulian
- Mount Sinai Heart, Icahn School of Medicine at Mount Sinai, New York, New York (E.A., S.L., J.N.)
- Stamatios Lerakis
- Mount Sinai Heart, Icahn School of Medicine at Mount Sinai, New York, New York (E.A., S.L., J.N.)
- Alexander W Charney
- The Charles Bronfman Institute of Personalized Medicine and Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, and Department of Surgery, Hackensack Meridian School of Medicine, Nutley, New Jersey (A.W.C.)
- Fei Wang
- Department of Population Health Sciences, Weill Cornell Medicine, New York, New York (F.W.)
- Matthew Levin
- Department of Anesthesiology, Perioperative and Pain Medicine, Icahn School of Medicine at Mount Sinai, New York, New York (M.L.)
- Benjamin Glicksberg
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York (P.K., B.G.)
- Jagat Narula
- Mount Sinai Heart, Icahn School of Medicine at Mount Sinai, New York, New York (E.A., S.L., J.N.)
- Ira Hofer
- Division of Data-Driven and Digital Medicine, Department of Medicine; The Charles Bronfman Institute of Personalized Medicine; and Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York (I.H.)
- Karandeep Singh
- Department of Learning Health Sciences, University of Michigan Medical School, Ann Arbor, Michigan (K.S.)
- Girish N Nadkarni
- Division of Data-Driven and Digital Medicine, Department of Medicine; The Charles Bronfman Institute of Personalized Medicine; and Division of Nephrology, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York (G.N.N.)
104
Kwong JCC, Khondker A, Lajkosz K, McDermott MBA, Frigola XB, McCradden MD, Mamdani M, Kulkarni GS, Johnson AEW. APPRAISE-AI Tool for Quantitative Evaluation of AI Studies for Clinical Decision Support. JAMA Netw Open 2023; 6:e2335377. PMID: 37747733. PMCID: PMC10520738. DOI: 10.1001/jamanetworkopen.2023.35377.
Abstract
Importance Artificial intelligence (AI) has gained considerable attention in health care, yet concerns have been raised around appropriate methods and fairness. Current AI reporting guidelines do not provide a means of quantifying overall quality of AI research, limiting their ability to compare models addressing the same clinical question. Objective To develop a tool (APPRAISE-AI) to evaluate the methodological and reporting quality of AI prediction models for clinical decision support. Design, Setting, and Participants This quality improvement study evaluated AI studies in the model development, silent, and clinical trial phases using the APPRAISE-AI tool, a quantitative method for evaluating quality of AI studies across 6 domains: clinical relevance, data quality, methodological conduct, robustness of results, reporting quality, and reproducibility. These domains included 24 items with a maximum overall score of 100 points. Points were assigned to each item, with higher points indicating stronger methodological or reporting quality. The tool was applied to a systematic review on machine learning to estimate sepsis that included articles published until September 13, 2019. Data analysis was performed from September to December 2022. Main Outcomes and Measures The primary outcomes were interrater and intrarater reliability and the correlation between APPRAISE-AI scores and expert scores, 3-year citation rate, number of Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) low risk-of-bias domains, and overall adherence to the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) statement. Results A total of 28 studies were included. Overall APPRAISE-AI scores ranged from 33 (low quality) to 67 (high quality). Most studies were moderate quality. The 5 lowest scoring items included source of data, sample size calculation, bias assessment, error analysis, and transparency. 
Overall APPRAISE-AI scores were associated with expert scores (Spearman ρ, 0.82; 95% CI, 0.64-0.91; P < .001), 3-year citation rate (Spearman ρ, 0.69; 95% CI, 0.43-0.85; P < .001), number of QUADAS-2 low risk-of-bias domains (Spearman ρ, 0.56; 95% CI, 0.24-0.77; P = .002), and adherence to the TRIPOD statement (Spearman ρ, 0.87; 95% CI, 0.73-0.94; P < .001). Intraclass correlation coefficient ranges for interrater and intrarater reliability were 0.74 to 1.00 for individual items, 0.81 to 0.99 for individual domains, and 0.91 to 0.98 for overall scores. Conclusions and Relevance In this quality improvement study, APPRAISE-AI demonstrated strong interrater and intrarater reliability and correlated well with several study quality measures. This tool may provide a quantitative approach for investigators, reviewers, editors, and funding organizations to compare the research quality across AI studies for clinical decision support.
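The agreement statistics above are rank-based. A tie-aware Spearman ρ, the measure used for all four score correlations, can be computed without external libraries; this is an illustrative sketch under our own naming assumptions, not the study's analysis code.

```python
import math

def average_ranks(xs):
    # Ranks 1..n, with tied values given their average rank.
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman_rho(x, y):
    # Spearman's rho is the Pearson correlation of the rank vectors.
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = math.sqrt(sum((a - mx) ** 2 for a in rx) *
                    sum((b - my) ** 2 for b in ry))
    return num / den
```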
Affiliation(s)
- Jethro C. C. Kwong
- Division of Urology, Department of Surgery, University of Toronto, Toronto, Ontario, Canada
- Temerty Centre for AI Research and Education in Medicine, University of Toronto, Toronto, Ontario, Canada
- Adree Khondker
- Division of Urology, Department of Surgery, University of Toronto, Toronto, Ontario, Canada
- Katherine Lajkosz
- Division of Urology, Department of Surgery, University of Toronto, Toronto, Ontario, Canada
- Department of Biostatistics, University Health Network, University of Toronto, Toronto, Ontario, Canada
- Xavier Borrat Frigola
- Laboratory for Computational Physiology, Harvard–Massachusetts Institute of Technology Division of Health Sciences and Technology, Cambridge
- Anesthesiology and Critical Care Department, Hospital Clinic de Barcelona, Barcelona, Spain
- Melissa D. McCradden
- Department of Bioethics, The Hospital for Sick Children, Toronto, Ontario, Canada
- Genetics & Genome Biology Research Program, Peter Gilgan Centre for Research and Learning, Toronto, Ontario, Canada
- Division of Clinical and Public Health, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
- Muhammad Mamdani
- Temerty Centre for AI Research and Education in Medicine, University of Toronto, Toronto, Ontario, Canada
- Data Science and Advanced Analytics, Unity Health Toronto, Toronto, Ontario, Canada
- Girish S. Kulkarni
- Division of Urology, Department of Surgery, University of Toronto, Toronto, Ontario, Canada
- Princess Margaret Cancer Centre, University Health Network, University of Toronto, Toronto, Ontario, Canada
- Alistair E. W. Johnson
- Temerty Centre for AI Research and Education in Medicine, University of Toronto, Toronto, Ontario, Canada
- Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
- Child Health Evaluative Sciences, The Hospital for Sick Children, University of Toronto, Toronto, Ontario, Canada
105
White N, Parsons R, Collins G, Barnett A. Evidence of questionable research practices in clinical prediction models. BMC Med 2023; 21:339. PMID: 37667344. PMCID: PMC10478406. DOI: 10.1186/s12916-023-03048-6.
Abstract
BACKGROUND Clinical prediction models are widely used in health and medical research. The area under the receiver operating characteristic curve (AUC) is a frequently used estimate to describe the discriminatory ability of a clinical prediction model. The AUC is often interpreted relative to thresholds, with "good" or "excellent" models defined at 0.7, 0.8 or 0.9. These thresholds may create targets that result in "hacking", where researchers are motivated to re-analyse their data until they achieve a "good" result. METHODS We extracted AUC values from PubMed abstracts to look for evidence of hacking. We used histograms of the AUC values in bins of size 0.01 and compared the observed distribution to a smooth distribution from a spline. RESULTS The distribution of 306,888 AUC values showed clear excesses above the thresholds of 0.7, 0.8 and 0.9 and shortfalls below the thresholds. CONCLUSIONS The AUCs for some models are over-inflated, which risks exposing patients to sub-optimal clinical decision-making. Greater modelling transparency is needed, including published protocols, and data and code sharing.
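The detection strategy, binning reported AUC values in 0.01-wide bins and looking for an excess just above each threshold, can be illustrated directly. This is a minimal sketch under our own naming assumptions; the authors' smooth spline comparison is not reproduced here.

```python
def threshold_excess(aucs, threshold, width=0.01):
    # Count AUC values in the bin at/just above the threshold versus the
    # bin just below it; a marked excess above is consistent with the
    # "hacking" pattern the study reports at 0.7, 0.8 and 0.9.
    above = sum(1 for a in aucs if threshold <= a < threshold + width)
    below = sum(1 for a in aucs if threshold - width <= a < threshold)
    return above, below
```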
Affiliation(s)
- Nicole White
- Australian Centre for Health Services Innovation and Centre for Healthcare Transformation, School of Public Health and Social Work, Faculty of Health, Queensland University of Technology, Kelvin Grove, Queensland, Australia
- Rex Parsons
- Australian Centre for Health Services Innovation and Centre for Healthcare Transformation, School of Public Health and Social Work, Faculty of Health, Queensland University of Technology, Kelvin Grove, Queensland, Australia
- Gary Collins
- Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology & Musculoskeletal Sciences, University of Oxford, Oxford, UK
- Adrian Barnett
- Australian Centre for Health Services Innovation and Centre for Healthcare Transformation, School of Public Health and Social Work, Faculty of Health, Queensland University of Technology, Kelvin Grove, Queensland, Australia
106
Okada Y, Mertens M, Liu N, Lam SSW, Ong MEH. AI and machine learning in resuscitation: Ongoing research, new concepts, and key challenges. Resusc Plus 2023; 15:100435. PMID: 37547540. PMCID: PMC10400904. DOI: 10.1016/j.resplu.2023.100435.
Abstract
Aim Artificial intelligence (AI) and machine learning (ML) are important areas of computer science that have recently attracted attention for their application to medicine. However, as techniques continue to advance and become more complex, it is increasingly challenging for clinicians to stay abreast of the latest research. This overview aims to translate research concepts and potential concerns to healthcare professionals interested in applying AI and ML to resuscitation research but who are not experts in the field. Main text We present various strands of research, including prediction models using structured and unstructured data, exploration of treatment heterogeneity, reinforcement learning, language processing, and large-scale language models. These studies potentially offer valuable insights for optimizing treatment strategies and clinical workflows. However, implementing AI and ML in clinical settings presents its own set of challenges. The availability of high-quality and reliable data is crucial for developing accurate ML models. A rigorous validation process and the integration of ML into clinical practice are essential for practical implementation. We further highlight the potential risks associated with self-fulfilling prophecies and feedback loops, emphasizing the importance of transparency, interpretability, and trustworthiness in AI and ML models. These issues need to be addressed in order to establish reliable and trustworthy AI and ML models. Conclusion In this article, we overview concepts and examples of AI and ML research in the resuscitation field. Moving forward, appropriate understanding of ML and collaboration with relevant experts will be essential for researchers and clinicians to overcome the challenges and harness the full potential of AI and ML in resuscitation.
Affiliation(s)
- Yohei Okada
- Duke-NUS Medical School, National University of Singapore, Singapore
- Preventive Services, Graduate School of Medicine, Kyoto University, Kyoto, Japan
- Mayli Mertens
- Antwerp Center for Responsible AI, Antwerp University, Belgium
- Centre for Ethics, Department of Philosophy, Antwerp University, Belgium
- Nan Liu
- Duke-NUS Medical School, National University of Singapore, Singapore
- Sean Shao Wei Lam
- Duke-NUS Medical School, National University of Singapore, Singapore
- Marcus Eng Hock Ong
- Duke-NUS Medical School, National University of Singapore, Singapore
- Department of Emergency Medicine, Singapore General Hospital
107
Sherminie LPG, Jayatilake ML, Hewavithana B, Weerakoon BS, Vijithananda SM. Morphometry-based radiomics for predicting therapeutic response in patients with gliomas following radiotherapy. Front Oncol 2023; 13:1139902. PMID: 37664038. PMCID: PMC10470056. DOI: 10.3389/fonc.2023.1139902.
Abstract
Introduction Gliomas remain challenging in oncologic management despite developments in treatment approaches. Complete elimination of a glioma may not be possible even after treatment, and assessment of therapeutic response is important for determining the future course of action for patients with such cancers. In recent years, radiomics has emerged as a promising solution, with potential applications including prediction of therapeutic response. Hence, this study investigated whether a morphometry-based radiomics signature could be used to predict therapeutic response in patients with gliomas following radiotherapy. Methods 105 magnetic resonance (MR) images, including segmented and non-segmented images, were used to extract morphometric features and develop a morphometry-based radiomics signature. After determining the appropriate machine learning algorithm, a prediction model was developed both with and without elimination of highly correlated features, and its performance was evaluated. Results Tumor grade had the highest contribution to the morphometry-based signature. Random forest provided the highest accuracy for training the prediction model derived from the morphometry-based radiomics signature. An accuracy of 86% and an area under the curve (AUC) value of 0.91 were achieved for the model evaluated without eliminating highly correlated features, whereas the accuracy and AUC value were 84% and 0.92, respectively, for the model evaluated after eliminating them. Discussion The developed morphometry-based radiomics signature could be utilized as a noninvasive biomarker of therapeutic response in patients with gliomas following radiotherapy.
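The "eliminating highly correlated features" step contrasted in the results can be illustrated with a simple greedy Pearson filter. Everything below, including the 0.9 cutoff and function names, is our assumption rather than the authors' pipeline; the random forest itself would typically come from a library such as scikit-learn.

```python
import math

def pearson(x, y):
    # Pearson correlation coefficient of two equal-length value lists.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) *
                    sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0

def drop_correlated(features, cutoff=0.9):
    # features: {name: list of values}. Greedily keep a feature only if
    # its |r| with every already-kept feature stays at or below the cutoff.
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) <= cutoff
               for k in kept):
            kept.append(name)
    return kept
```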
Affiliation(s)
- Lahanda Purage G. Sherminie
- Department of Radiography/Radiotherapy, Faculty of Allied Health Sciences, University of Peradeniya, Peradeniya, Sri Lanka
- Mohan L. Jayatilake
- Department of Radiography/Radiotherapy, Faculty of Allied Health Sciences, University of Peradeniya, Peradeniya, Sri Lanka
- Badra Hewavithana
- Department of Radiology, Faculty of Medicine, University of Peradeniya, Peradeniya, Sri Lanka
- Bimali S. Weerakoon
- Department of Radiography/Radiotherapy, Faculty of Allied Health Sciences, University of Peradeniya, Peradeniya, Sri Lanka
- Sahan M. Vijithananda
- Department of Radiology, Faculty of Medicine, University of Peradeniya, Peradeniya, Sri Lanka
108
Ajuwon BI, Richardson A, Roper K, Lidbury BA. Clinical Validity of a Machine Learning Decision Support System for Early Detection of Hepatitis B Virus: A Binational External Validation Study. Viruses 2023; 15:1735. PMID: 37632077. PMCID: PMC10458613. DOI: 10.3390/v15081735.
Abstract
HepB LiveTest is a machine learning decision support system developed for the early detection of hepatitis B virus (HBV). However, there is a lack of evidence on its generalisability. In this study, we aimed to externally assess the clinical validity and portability of HepB LiveTest in predicting HBV infection among independent patient cohorts from Nigeria and Australia. The performance of HepB LiveTest was evaluated by constructing receiver operating characteristic curves and estimating the area under the curve. DeLong's method was used to estimate the 95% confidence interval (CI) of the area under the receiver operating characteristic curve (AUROC). Compared to the Australian cohort, patients in the derivation cohort of HepB LiveTest and the hospital-based Nigerian cohort were younger (mean age, 45.5 years vs. 38.8 years vs. 40.8 years, respectively; p < 0.001) and had a higher incidence of HBV infection (1.9% vs. 69.4% vs. 57.3%). In the hospital-based Nigerian cohort, HepB LiveTest performed optimally with an AUROC of 0.94 (95% CI, 0.91-0.97). The model provided tailored predictions that ensured most cases of HBV infection did not go undetected. However, its discriminatory measure dropped to 0.60 (95% CI, 0.56-0.64) in the Australian cohort. These findings indicate that HepB LiveTest exhibits adequate cross-site transportability and clinical validity in the hospital-based Nigerian patient cohort but shows limited performance in the Australian cohort. Whilst HepB LiveTest holds promise for reducing HBV prevalence in underserved populations, caution is warranted when implementing the model in older populations, particularly in regions with low incidence of HBV infection.
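The AUROC with a DeLong 95% CI reported above can be sketched from the placement values that DeLong's method is built on. This is an illustrative pure-Python version; function and variable names are ours, and the authors' analysis code is not reproduced.

```python
import math

def auroc_delong_ci(y, scores, z=1.96):
    # AUROC with a Wald-type 95% CI from DeLong's placement values.
    cases = [s for s, yi in zip(scores, y) if yi == 1]
    controls = [s for s, yi in zip(scores, y) if yi == 0]
    m, n = len(cases), len(controls)
    psi = lambda c, d: 1.0 if c > d else 0.5 if c == d else 0.0
    # Placement of each case among controls, and vice versa.
    v10 = [sum(psi(c, d) for d in controls) / n for c in cases]
    v01 = [sum(psi(c, d) for c in cases) / m for d in controls]
    auc = sum(v10) / m
    s10 = sum((v - auc) ** 2 for v in v10) / (m - 1)
    s01 = sum((v - auc) ** 2 for v in v01) / (n - 1)
    se = math.sqrt(s10 / m + s01 / n)
    return auc, max(0.0, auc - z * se), min(1.0, auc + z * se)
```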
Affiliation(s)
- Busayo I. Ajuwon
- National Centre for Epidemiology and Population Health, ANU College of Health and Medicine, The Australian National University, Acton, Canberra, ACT 2601, Australia
- Department of Biosciences and Biotechnology, Faculty of Pure and Applied Sciences, Kwara State University, Malete 241103, Nigeria
- Alice Richardson
- Statistical Support Network, The Australian National University, Acton, Canberra, ACT 2601, Australia
- Katrina Roper
- National Centre for Epidemiology and Population Health, ANU College of Health and Medicine, The Australian National University, Acton, Canberra, ACT 2601, Australia
- Brett A. Lidbury
- National Centre for Epidemiology and Population Health, ANU College of Health and Medicine, The Australian National University, Acton, Canberra, ACT 2601, Australia
109
Glans M, Kempen TGH, Jakobsson U, Kragh Ekstam A, Bondesson Å, Midlöv P. Identifying older adults at increased risk of medication-related readmission to hospital within 30 days of discharge: development and validation of a risk assessment tool. BMJ Open 2023; 13:e070559. PMID: 37536970. PMCID: PMC10401249. DOI: 10.1136/bmjopen-2022-070559.
Abstract
OBJECTIVE To develop and validate a risk assessment tool for identifying older adults (≥65 years) at increased risk of possibly medication-related readmission to hospital within 30 days of discharge.
DESIGN Retrospective cohort study.
SETTING The risk score was developed using data from a hospital in southern Sweden and validated using data from four hospitals in mid-eastern Sweden.
PARTICIPANTS The development cohort (n=720) was admitted to hospital during 2017; the validation cohort (n=892) was admitted during 2017-2018.
MEASURES The risk assessment tool aims to predict possibly medication-related readmission to hospital within 30 days of discharge. Variables known at first admission and individually associated with possibly medication-related readmission were used in development. The included variables were assigned points, and Youden's index was used to set a threshold score. The risk score was calculated for all individuals in both cohorts. The area under the receiver operating characteristic (ROC) curve (c-index) was used to measure the discrimination of the developed risk score. Sensitivity, specificity, and positive and negative predictive values were calculated using cross-tabulation.
RESULTS The developed risk assessment tool, the Hospitalisations, Own home, Medications, and Emergency admission (HOME) Score, had a c-index of 0.69 in the development cohort and 0.65 in the validation cohort. At the threshold score, it showed a sensitivity of 76%, a specificity of 54%, a positive predictive value of 29%, and a negative predictive value of 90% in the development cohort.
CONCLUSION The HOME Score can be used to identify older adults at increased risk of possibly medication-related readmission within 30 days of discharge. The tool is easy to use and includes variables available in electronic health records at admission, making it possible to implement risk-reducing activities during the hospital stay as well as at discharge and in transitions of care. Further studies are needed to investigate the clinical usefulness of the HOME Score as well as the benefits of implemented activities.
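The evaluation the abstract describes (a Youden's-index threshold, the c-index for discrimination, and cross-tabulated sensitivity, specificity, PPV, and NPV) can be sketched in a few lines of plain Python. This is an illustrative reconstruction only: the function names and toy data are assumptions, not taken from the paper, but the formulas follow the standard definitions of these metrics.

```python
def c_index(scores, labels):
    # c-statistic (ROC AUC): probability that a randomly chosen positive
    # case scores higher than a randomly chosen negative one (ties = 0.5).
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def confusion_metrics(scores, labels, threshold):
    # Cross-tabulate "predicted high risk" (score >= threshold) vs outcome.
    tp = sum(s >= threshold and y == 1 for s, y in zip(scores, labels))
    fp = sum(s >= threshold and y == 0 for s, y in zip(scores, labels))
    fn = sum(s < threshold and y == 1 for s, y in zip(scores, labels))
    tn = sum(s < threshold and y == 0 for s, y in zip(scores, labels))
    safe = lambda a, b: a / b if b else 0.0  # guard against empty cells
    return {"sensitivity": safe(tp, tp + fn),
            "specificity": safe(tn, tn + fp),
            "ppv": safe(tp, tp + fp),
            "npv": safe(tn, tn + fn)}

def youden_threshold(scores, labels):
    # Choose the cut-off maximising Youden's J = sensitivity + specificity - 1.
    def j(t):
        m = confusion_metrics(scores, labels, t)
        return m["sensitivity"] + m["specificity"] - 1
    return max(sorted(set(scores)), key=j)

# Toy example (hypothetical scores, not HOME Score data):
scores = [5, 4, 3, 2, 2, 1]   # illustrative risk scores
labels = [1, 1, 1, 0, 0, 0]   # 1 = readmitted within 30 days
auc = c_index(scores, labels)            # 1.0: perfect separation here
t = youden_threshold(scores, labels)     # 3
m = confusion_metrics(scores, labels, t)
```

In the paper's setting the same computation is run twice: once on the development cohort (yielding the reported c-index of 0.69 and the threshold-level operating characteristics) and once, with the threshold fixed, on the validation cohort.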
Affiliation(s)
- Maria Glans
- Center for Primary Health Care Research, Department of Clinical Sciences, Lund University, Malmö, Sweden
- Kristianstad-Hässleholm Hospitals, Department of Medications, Region Skåne, Kristianstad, Sweden
| | - Thomas Gerardus Hendrik Kempen
- Department of Pharmacy, Uppsala University, Uppsala, Sweden
- Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Utrecht University, Utrecht, The Netherlands
| | - Ulf Jakobsson
- Center for Primary Health Care Research, Department of Clinical Sciences, Lund University, Malmö, Sweden
| | - Annika Kragh Ekstam
- Kristianstad-Hässleholm Hospitals, Department of Orthopaedics, Region Skåne, Kristianstad, Sweden
| | - Åsa Bondesson
- Center for Primary Health Care Research, Department of Clinical Sciences, Lund University, Malmö, Sweden
- Department of Medicines Management and Informatics, Region Skåne, Kristianstad, Sweden
| | - Patrik Midlöv
- Center for Primary Health Care Research, Department of Clinical Sciences, Lund University, Malmö, Sweden
| |
|
110
|
de Hond AAH, Shah VB, Kant IMJ, Van Calster B, Steyerberg EW, Hernandez-Boussard T. Perspectives on validation of clinical predictive algorithms. NPJ Digit Med 2023; 6:86. [PMID: 37149704 PMCID: PMC10163568 DOI: 10.1038/s41746-023-00832-9] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2023] [Accepted: 04/28/2023] [Indexed: 05/08/2023] Open
Affiliation(s)
- Anne A H de Hond
- Clinical AI Implementation and Research Lab, Leiden University Medical Centre, Leiden, the Netherlands.
- Department of Medicine (Biomedical Informatics), Stanford University, Stanford, CA, USA.
- Department of Biomedical Data Sciences, Leiden University Medical Centre, Leiden, the Netherlands.
| | - Vaibhavi B Shah
- Department of Medicine (Biomedical Informatics), Stanford University, Stanford, CA, USA
| | - Ilse M J Kant
- Department of Digital Health, University Medical Center Utrecht, Utrecht, the Netherlands
| | - Ben Van Calster
- Department of Biomedical Data Sciences, Leiden University Medical Centre, Leiden, the Netherlands
- Department of Development & Regeneration, KU Leuven, Leuven, Belgium
| | - Ewout W Steyerberg
- Clinical AI Implementation and Research Lab, Leiden University Medical Centre, Leiden, the Netherlands
- Department of Biomedical Data Sciences, Leiden University Medical Centre, Leiden, the Netherlands
| | - Tina Hernandez-Boussard
- Department of Medicine (Biomedical Informatics), Stanford University, Stanford, CA, USA
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
- Department of Epidemiology & Population Health (by courtesy), Stanford University, Stanford, CA, USA
| |
|