1
Moss L, Shaw M, Piper I, Hawthorne C. From bed to bench and back again: Challenges facing deployment of intracranial pressure data analysis in clinical environments. Brain Spine 2024; 4:102858. [PMID: 39105104] [PMCID: PMC11298855] [DOI: 10.1016/j.bas.2024.102858]
Abstract
Introduction Numerous complex physiological models derived from intracranial pressure (ICP) monitoring data have been developed. More recently, techniques such as machine learning are being used to develop increasingly sophisticated models to aid in clinical decision-making tasks such as diagnosis and prediction. Whilst their potential clinical impact may be significant, few models based on ICP data are routinely available at a patient's bedside. Further, the ability to refine models using ongoing patient data collection is rare. In this paper we identify and discuss the challenges faced when converting insight from ICP data analysis into deployable tools at the patient bedside. Research question To provide an overview of challenges facing implementation of sophisticated ICP models and analyses at the patient bedside. Material and methods A narrative review of the barriers facing implementation of sophisticated ICP models and analyses at the patient bedside in a neurocritical care unit combined with a descriptive case study (the CHART-ADAPT project) on the topic. Results Key barriers found were technical, analytical, and integrity related. Examples included: lack of interoperability of medical devices for data collection and/or model deployment; inadequate infrastructure, hindering analysis of large volumes of high frequency patient data; a lack of clinical confidence in a model; and ethical, trust, security and patient confidentiality considerations governing the secondary use of patient data. Discussion and conclusion To realise the benefits of ICP data analysis, the results need to be promptly delivered and meaningfully communicated. Multiple barriers to implementation remain and solutions which address real-world challenges are required.
Affiliation(s)
- Laura Moss
  - Dept. of Clinical Physics, NHS Greater Glasgow and Clyde, Glasgow, United Kingdom
  - College of Medicine, Veterinary and Life Sciences, University of Glasgow, Glasgow, United Kingdom
- Martin Shaw
  - Dept. of Clinical Physics, NHS Greater Glasgow and Clyde, Glasgow, United Kingdom
  - College of Medicine, Veterinary and Life Sciences, University of Glasgow, Glasgow, United Kingdom
- Ian Piper
  - College of Medicine, Veterinary and Life Sciences, University of Glasgow, Glasgow, United Kingdom
- Christopher Hawthorne
  - Dept. of Neuroanaesthesia, Institute of Neurological Sciences, NHS Greater Glasgow and Clyde, Glasgow, United Kingdom
2
Kale AU, Hogg HDJ, Pearson R, Glocker B, Golder S, Coombe A, Waring J, Liu X, Moore DJ, Denniston AK. Detecting Algorithmic Errors and Patient Harms for AI-Enabled Medical Devices in Randomized Controlled Trials: Protocol for a Systematic Review. JMIR Res Protoc 2024; 13:e51614. [PMID: 38941147] [PMCID: PMC11245650] [DOI: 10.2196/51614]
Abstract
BACKGROUND Artificial intelligence (AI) medical devices have the potential to transform existing clinical workflows and ultimately improve patient outcomes. AI medical devices have shown potential for a range of clinical tasks, including diagnostics, prognostics, and therapeutic decision-making such as drug dosing. There is, however, an urgent need to ensure that these technologies remain safe for all populations. Recent literature demonstrates the need for rigorous performance error analysis to identify issues such as algorithmic encoding of spurious correlations (eg, with protected characteristics) or specific failure modes that may lead to patient harm. Guidelines for reporting on studies that evaluate AI medical devices require the mention of performance error analysis; however, there is still a lack of understanding of how performance errors should be analyzed in clinical studies and which harms authors should aim to detect and report. OBJECTIVE This systematic review will assess the frequency and severity of AI errors and adverse events (AEs) in randomized controlled trials (RCTs) investigating AI medical devices as interventions in clinical settings. The review will also explore how performance errors are analyzed, including whether the analysis covers subgroup-level outcomes. METHODS This systematic review will identify and select RCTs assessing AI medical devices. Search strategies will be deployed in MEDLINE (Ovid), Embase (Ovid), Cochrane CENTRAL, and clinical trial registries to identify relevant papers. RCTs identified in bibliographic databases will be cross-referenced with clinical trial registries. The primary outcomes of interest are the frequency and severity of AI errors, patient harms, and reported AEs. Quality assessment of RCTs will be based on version 2 of the Cochrane risk-of-bias tool (RoB 2). Data analysis will include a comparison of error rates and patient harms between study arms, and a meta-analysis of the rates of patient harm in control versus intervention arms will be conducted if appropriate. RESULTS The project was registered on PROSPERO in February 2023. Preliminary searches have been completed and the search strategy has been designed in consultation with an information specialist and a methodologist. Title and abstract screening started in September 2023. Full-text screening is ongoing, and data collection and analysis began in April 2024. CONCLUSIONS Evaluations of AI medical devices have shown promising results; however, reporting of studies has been variable. Detection, analysis, and reporting of performance errors and patient harms are vital to robustly assess the safety of AI medical devices in RCTs. Scoping searches have illustrated that the reporting of harms is variable, often with no mention of AEs. The findings of this systematic review will identify the frequency and severity of AI performance errors and patient harms and generate insights into how errors should be analyzed to account for both overall and subgroup performance. TRIAL REGISTRATION PROSPERO CRD42023387747; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=387747. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) PRR1-10.2196/51614.
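The planned meta-analysis of patient-harm rates in control versus intervention arms could take the shape of a standard inverse-variance fixed-effect pooling of log risk ratios. This is a sketch only: the protocol does not specify its pooling model, and the trial counts below are invented for illustration.

```python
import numpy as np

def pooled_risk_ratio(trials):
    """Inverse-variance fixed-effect pooling of log risk ratios.

    trials: list of (events_intervention, n_intervention,
                     events_control, n_control) tuples, one per RCT.
    Returns (pooled_rr, ci_lower, ci_upper) with a 95% CI.
    """
    log_rr, weights = [], []
    for ei, ni, ec, nc in trials:
        rr = (ei / ni) / (ec / nc)
        # Standard variance of the log risk ratio.
        var = 1 / ei - 1 / ni + 1 / ec - 1 / nc
        log_rr.append(np.log(rr))
        weights.append(1 / var)
    log_rr, weights = np.array(log_rr), np.array(weights)
    pooled = (weights * log_rr).sum() / weights.sum()
    se = np.sqrt(1 / weights.sum())
    return np.exp(pooled), np.exp(pooled - 1.96 * se), np.exp(pooled + 1.96 * se)

# Two hypothetical trials: (harms_ai, n_ai, harms_control, n_control).
rr, lo, hi = pooled_risk_ratio([(10, 100, 20, 100), (5, 50, 8, 50)])
```

A random-effects model (eg, DerSimonian-Laird) would be the usual alternative when between-trial heterogeneity is expected.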
Affiliation(s)
- Aditya U Kale
  - Institute of Inflammation and Ageing, University of Birmingham, Birmingham, United Kingdom
  - University Hospitals Birmingham NHS Foundation Trust, Birmingham, United Kingdom
  - NIHR Birmingham Biomedical Research Centre, Birmingham, United Kingdom
  - NIHR Incubator for AI and Digital Health Research, Birmingham, United Kingdom
- Henry David Jeffry Hogg
  - Population Health Science Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, United Kingdom
- Russell Pearson
  - Medicines and Healthcare Products Regulatory Agency, London, United Kingdom
- Ben Glocker
  - Kheiron Medical Technologies, London, United Kingdom
  - Department of Computing, Imperial College London, London, United Kingdom
- Su Golder
  - Department of Health Sciences, University of York, York, United Kingdom
- April Coombe
  - Institute of Applied Health Research, University of Birmingham, Birmingham, United Kingdom
- Justin Waring
  - Health Services Management Centre, University of Birmingham, Birmingham, United Kingdom
- Xiaoxuan Liu
  - Institute of Inflammation and Ageing, University of Birmingham, Birmingham, United Kingdom
  - University Hospitals Birmingham NHS Foundation Trust, Birmingham, United Kingdom
  - NIHR Birmingham Biomedical Research Centre, Birmingham, United Kingdom
  - NIHR Incubator for AI and Digital Health Research, Birmingham, United Kingdom
- David J Moore
  - Institute of Applied Health Research, University of Birmingham, Birmingham, United Kingdom
- Alastair K Denniston
  - Institute of Inflammation and Ageing, University of Birmingham, Birmingham, United Kingdom
  - University Hospitals Birmingham NHS Foundation Trust, Birmingham, United Kingdom
  - NIHR Birmingham Biomedical Research Centre, Birmingham, United Kingdom
  - NIHR Incubator for AI and Digital Health Research, Birmingham, United Kingdom
3
Wang Y, Fu W, Zhang Y, Wang D, Gu Y, Wang W, Xu H, Ge X, Ye C, Fang J, Su L, Wang J, He W, Zhang X, Feng R. Constructing and implementing a performance evaluation indicator set for artificial intelligence decision support systems in pediatric outpatient clinics: an observational study. Sci Rep 2024; 14:14482. [PMID: 38914707] [PMCID: PMC11196575] [DOI: 10.1038/s41598-024-64893-w]
Abstract
Artificial intelligence (AI) decision support systems in pediatric healthcare have a complex application background. Because an AI decision support system (AI-DSS) can be costly, once deployed it is crucial to monitor its performance, interpret its results, and update it to ensure consistent, ongoing success. Therefore, a set of evaluation indicators was developed specifically for AI-DSS in pediatric healthcare, enabling continuous and systematic performance monitoring. The study unfolded in two stages. The first stage established the evaluation indicator set through a literature review, a focus group interview, and expert consultation using the Delphi method. In the second stage, weight analysis was conducted: subjective weights were calculated from expert opinions using the analytic hierarchy process, while objective weights were determined using the entropy weight method. The subjective and objective weights were then synthesized to form the combined weight. In the two rounds of expert consultation, the authority coefficients were 0.834 and 0.846, and Kendall's coefficient of concordance was 0.135 in Round 1 and 0.312 in Round 2. The final evaluation indicator set has three first-class indicators, fifteen second-class indicators, and forty-seven third-class indicators. Indicator I-1 (Organizational performance) carries the highest weight, followed by Indicator I-2 (Societal performance) and Indicator I-3 (User experience performance), in the objective and combined weights. Conversely, 'Societal performance' holds the most weight among the subjective weights, followed by 'Organizational performance' and 'User experience performance'. In this study, a comprehensive and specialized set of evaluation indicators for AI-DSS in the pediatric outpatient clinic was established and then implemented. Continuous evaluation still requires long-term data collection to optimize the weight proportions of the established indicators.
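The weighting scheme described above, objective weights from the entropy weight method synthesized with subjective AHP weights, can be sketched as follows. The score matrix and the multiplicative synthesis rule are illustrative assumptions; the paper's actual data and synthesis formula are not reproduced here.

```python
import numpy as np

def entropy_weights(X):
    """Objective indicator weights via the entropy weight method.

    X: (m alternatives x n indicators) matrix of non-negative scores.
    Indicators whose scores vary more across alternatives carry more
    information and therefore receive larger weights.
    """
    m, _ = X.shape
    p = X / X.sum(axis=0)            # column-normalize into proportions
    k = 1.0 / np.log(m)
    with np.errstate(divide="ignore", invalid="ignore"):
        plogp = np.where(p > 0, p * np.log(p), 0.0)
    e = -k * plogp.sum(axis=0)       # entropy of each indicator, in [0, 1]
    d = 1.0 - e                      # degree of divergence
    return d / d.sum()

def combine_weights(subjective, objective):
    """Multiplicative synthesis of AHP (subjective) and entropy (objective) weights."""
    w = np.asarray(subjective) * np.asarray(objective)
    return w / w.sum()

# Toy 3-alternatives x 2-indicators matrix: the second indicator is constant,
# so the entropy method should push nearly all weight onto the first.
X = np.array([[0.9, 0.5],
              [0.1, 0.5],
              [0.5, 0.5]])
w_obj = entropy_weights(X)
w_combined = combine_weights([0.3, 0.7], [0.6, 0.4])
```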
Affiliation(s)
- Yingwen Wang
  - Nursing Department, Children's Hospital of Fudan University, Shanghai, 201102, China
- Weijia Fu
  - Medical Information Center, Children's Hospital of Fudan University, Shanghai, 201102, China
- Yuejie Zhang
  - School of Computer Science, Fudan University, Shanghai, 200438, China
- Daoyang Wang
  - School of Public Health, Fudan University, Shanghai, 200032, China
- Ying Gu
  - Nursing Department, Children's Hospital of Fudan University, Shanghai, 201102, China
- Weibing Wang
  - School of Public Health, Fudan University, Shanghai, 200032, China
- Hong Xu
  - Nephrology Department, Children's Hospital of Fudan University, Shanghai, 201102, China
- Xiaoling Ge
  - Statistical and Data Management Center, Children's Hospital of Fudan University, Shanghai, 201102, China
- Chengjie Ye
  - Medical Information Center, Children's Hospital of Fudan University, Shanghai, 201102, China
- Jinwu Fang
  - School of Public Health, Fudan University, Shanghai, 200032, China
- Ling Su
  - Statistical and Data Management Center, Children's Hospital of Fudan University, Shanghai, 201102, China
- Jiayu Wang
  - National Health Commission Key Laboratory of Neonatal Diseases (Fudan University), Children's Hospital of Fudan University, Shanghai, 201102, China
- Wen He
  - Respiratory Department, Children's Hospital of Fudan University, Shanghai, 201102, China
- Xiaobo Zhang
  - Respiratory Department, Children's Hospital of Fudan University, Shanghai, 201102, China
- Rui Feng
  - School of Computer Science, Fudan University, 2005 Songhu Road, Shanghai, 200438, China
4
Sezgin E, McKay I. Behavioral health and generative AI: a perspective on future of therapies and patient care. NPJ Mental Health Research 2024; 3:25. [PMID: 38849499] [PMCID: PMC11161641] [DOI: 10.1038/s44184-024-00067-w]
Affiliation(s)
- Emre Sezgin
  - The Abigail Wexner Research Institute at Nationwide Children's Hospital, Columbus, OH, USA
  - The Ohio State University College of Medicine, Columbus, OH, USA
- Ian McKay
  - The Ohio State University College of Medicine, Columbus, OH, USA
  - Department of Psychiatry and Behavioral Health, Nationwide Children's Hospital, Columbus, OH, USA
5
Hornstein S, Scharfenberger J, Lueken U, Wundrack R, Hilbert K. Predicting recurrent chat contact in a psychological intervention for the youth using natural language processing. NPJ Digit Med 2024; 7:132. [PMID: 38762694] [PMCID: PMC11102489] [DOI: 10.1038/s41746-024-01121-9]
Abstract
Chat-based counseling hotlines have emerged as a promising low-threshold intervention for youth mental health. However, despite the resulting availability of large text corpora, little work has investigated Natural Language Processing (NLP) applications within this setting. Therefore, this preregistered approach (OSF: XA4PN) utilizes a sample of approximately 19,000 children and young adults that received a chat consultation from a 24/7 crisis service in Germany. Around 800,000 messages were used to predict whether chatters would contact the service again, as this would allow the provision of, or redirection to, additional treatment. We trained an XGBoost classifier on the words of the anonymized conversations, using repeated cross-validation and Bayesian optimization for hyperparameter search. The best model achieved an AUROC score of 0.68 (p < 0.01) on the previously unseen 3942 newest consultations. A Shapley-value-based (SHAP) explainability approach revealed that words indicating younger age or female gender and terms related to self-harm and suicidal thoughts were associated with a higher chance of recontacting. We conclude that NLP-based predictions of recurrent contact are a promising path toward personalized care at chat hotlines.
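A rough sketch of the modeling setup described above: a gradient-boosted classifier over conversation words, evaluated with repeated stratified cross-validation scored by AUROC. It uses scikit-learn, with GradientBoostingClassifier standing in for XGBoost and invented toy messages in place of the non-public chat data; the Bayesian hyperparameter search and SHAP analysis are omitted.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline

# Toy stand-ins for anonymized consultations (one string per chatter) and
# a binary label: did the chatter contact the service again?
texts = [
    "i feel so alone and sad tonight", "nobody listens i hurt myself",
    "thoughts of self harm again", "i am scared and alone",
    "everything hurts i cant sleep", "i feel hopeless and alone",
    "thanks the chat really helped", "i feel much better now",
    "good advice thank you so much", "things are looking up thanks",
    "school went well today thanks", "i am doing fine now thank you",
]
recontact = np.array([1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0])

# Bag-of-words features feeding a gradient-boosted tree classifier.
model = make_pipeline(TfidfVectorizer(),
                      GradientBoostingClassifier(random_state=0))

# Repeated stratified cross-validation, scored by AUROC as in the study.
cv = RepeatedStratifiedKFold(n_splits=3, n_repeats=2, random_state=0)
scores = cross_val_score(model, texts, recontact, cv=cv, scoring="roc_auc")
```

In practice the final model would then be refit on all training data and evaluated once on a held-out, temporally newer test set, as the study describes.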
Affiliation(s)
- Silvan Hornstein
  - Department of Psychology, Humboldt-Universität zu Berlin, 10099 Berlin, Germany
- Ulrike Lueken
  - Department of Psychology, Humboldt-Universität zu Berlin, 10099 Berlin, Germany
  - German Center for Mental Health (DZPG), partner site Berlin/Potsdam, Potsdam, Germany
- Richard Wundrack
  - Department of Psychology, Humboldt-Universität zu Berlin, 10099 Berlin, Germany
- Kevin Hilbert
  - Department of Psychology, Humboldt-Universität zu Berlin, 10099 Berlin, Germany
6
Poddar M, Marwaha JS, Yuan W, Romero-Brufau S, Brat GA. An operational guide to translational clinical machine learning in academic medical centers. NPJ Digit Med 2024; 7:129. [PMID: 38760407] [PMCID: PMC11101468] [DOI: 10.1038/s41746-024-01094-9]
Abstract
Few published data science tools are ever translated from academia to real-world clinical settings for which they were intended. One dimension of this problem is the software engineering task of turning published academic projects into tools that are usable at the bedside. Given the complexity of the data ecosystem in large health systems, this task often represents a significant barrier to the real-world deployment of data science tools for prospective piloting and evaluation. Many information technology companies have created Machine Learning Operations (MLOps) teams to help with such tasks at scale, but the low penetration of home-grown data science tools in regular clinical practice precludes the formation of such teams in healthcare organizations. Based on experiences deploying data science tools at two large academic medical centers (Beth Israel Deaconess Medical Center, Boston, MA; Mayo Clinic, Rochester, MN), we propose a strategy to facilitate this transition from academic product to operational tool, defining the responsibilities of the principal investigator, data scientist, machine learning engineer, health system IT administrator, and clinician end-user throughout the process. We first enumerate the technical resources and stakeholders needed to prepare for model deployment. We then propose an approach to planning how the final product will work from data extraction and analysis to visualization of model outputs. Finally, we describe how the team should execute on this plan. We hope to guide health systems aiming to deploy minimum viable data science tools and realize their value in clinical practice.
Affiliation(s)
- Mukund Poddar
  - Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA
  - Department of Surgery, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Jayson S Marwaha
  - Department of Surgery, Beth Israel Deaconess Medical Center, Boston, MA, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- William Yuan
  - Department of Surgery, Beth Israel Deaconess Medical Center, Boston, MA, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Santiago Romero-Brufau
  - Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA
  - Department of Otolaryngology Head & Neck Surgery, Mayo Clinic, Rochester, MN, USA
- Gabriel A Brat
  - Department of Surgery, Beth Israel Deaconess Medical Center, Boston, MA, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
7
Khan SD, Hoodbhoy Z, Raja MHR, Kim JY, Hogg HDJ, Manji AAA, Gulamali F, Hasan A, Shaikh A, Tajuddin S, Khan NS, Patel MR, Balu S, Samad Z, Sendak MP. Frameworks for procurement, integration, monitoring, and evaluation of artificial intelligence tools in clinical settings: A systematic review. PLOS Digital Health 2024; 3:e0000514. [PMID: 38809946] [PMCID: PMC11135672] [DOI: 10.1371/journal.pdig.0000514]
Abstract
Research on the applications of artificial intelligence (AI) tools in medicine has increased exponentially over the last few years, but implementation in clinical practice has not seen a commensurate increase, with a lack of consensus on how to implement and maintain such tools. This systematic review aims to summarize frameworks focusing on procuring, implementing, monitoring, and evaluating AI tools in clinical practice. A comprehensive literature search, following PRISMA guidelines, was performed on MEDLINE, Wiley Cochrane, Scopus, and EBSCO databases to identify and include articles recommending practices, frameworks, or guidelines for AI procurement, integration, monitoring, and evaluation. From the included articles, data regarding study aim, use of a framework, rationale of the framework, and details regarding AI implementation involving procurement, integration, monitoring, and evaluation were extracted. The extracted details were then mapped onto the domains of the Donabedian Plan, Do, Study, Act cycle. The search yielded 17,537 unique articles, of which 47 were evaluated for inclusion based on their full texts and 25 were included in the review. Common themes included transparency, feasibility of operation within and integration into existing workflows, validation of the tool using predefined performance indicators, and improving the algorithm and/or adjusting the tool to improve performance. Among the four domains (Plan, Do, Study, Act), the most common was Plan (84%, n = 21), followed by Study (60%, n = 15), Do (52%, n = 13), and Act (24%, n = 6). Among 172 authors, only 1 (0.6%) was from a low-income country (LIC) and 2 (1.2%) were from lower-middle-income countries (LMICs). Healthcare professionals cite the implementation of AI tools within clinical settings as challenging owing to low levels of evidence focusing on integration in the Do and Act domains. The current healthcare AI landscape calls for increased data sharing and knowledge translation to facilitate common goals and reap maximum clinical benefit.
Affiliation(s)
- Sarim Dawar Khan
  - CITRIC Health Data Science Centre, Department of Medicine, Aga Khan University, Karachi, Pakistan
- Zahra Hoodbhoy
  - CITRIC Health Data Science Centre, Department of Medicine, Aga Khan University, Karachi, Pakistan
  - Department of Paediatrics and Child Health, Aga Khan University, Karachi, Pakistan
- Jee Young Kim
  - Duke Institute for Health Innovation, Duke University School of Medicine, Durham, North Carolina, United States
- Henry David Jeffry Hogg
  - Population Health Science Institute, Newcastle University, Newcastle upon Tyne, United Kingdom
  - Newcastle upon Tyne Hospitals NHS Foundation Trust, Newcastle upon Tyne, United Kingdom
  - Moorfields Eye Hospital NHS Foundation Trust, London, United Kingdom
- Afshan Anwar Ali Manji
  - CITRIC Health Data Science Centre, Department of Medicine, Aga Khan University, Karachi, Pakistan
- Freya Gulamali
  - Duke Institute for Health Innovation, Duke University School of Medicine, Durham, North Carolina, United States
- Alifia Hasan
  - Duke Institute for Health Innovation, Duke University School of Medicine, Durham, North Carolina, United States
- Asim Shaikh
  - CITRIC Health Data Science Centre, Department of Medicine, Aga Khan University, Karachi, Pakistan
- Salma Tajuddin
  - CITRIC Health Data Science Centre, Department of Medicine, Aga Khan University, Karachi, Pakistan
- Nida Saddaf Khan
  - CITRIC Health Data Science Centre, Department of Medicine, Aga Khan University, Karachi, Pakistan
- Manesh R. Patel
  - Duke Clinical Research Institute, Duke University School of Medicine, Durham, North Carolina, United States
  - Division of Cardiology, Duke University School of Medicine, Durham, North Carolina, United States
- Suresh Balu
  - Duke Institute for Health Innovation, Duke University School of Medicine, Durham, North Carolina, United States
- Zainab Samad
  - CITRIC Health Data Science Centre, Department of Medicine, Aga Khan University, Karachi, Pakistan
  - Department of Medicine, Aga Khan University, Karachi, Pakistan
- Mark P. Sendak
  - Duke Institute for Health Innovation, Duke University School of Medicine, Durham, North Carolina, United States
8
Boag W, Hasan A, Kim JY, Revoir M, Nichols M, Ratliff W, Gao M, Zilberstein S, Samad Z, Hoodbhoy Z, Ali M, Khan NS, Patel M, Balu S, Sendak M. The algorithm journey map: a tangible approach to implementing AI solutions in healthcare. NPJ Digit Med 2024; 7:87. [PMID: 38594344] [PMCID: PMC11003994] [DOI: 10.1038/s41746-024-01061-4]
Abstract
When integrating AI tools in healthcare settings, complex interactions between technologies and primary users are not always fully understood or visible. This deficient and ambiguous understanding hampers attempts by healthcare organizations to adopt AI/ML, and it also creates new challenges for researchers seeking to identify opportunities for simplifying adoption and developing best practices for the use of AI-based solutions. Our study fills this gap by documenting the process of designing, building, and maintaining an AI solution called SepsisWatch at Duke University Health System. We conducted 20 interviews with the team of engineers and scientists that led the multi-year effort to build the tool, integrate it into practice, and maintain the solution. This "Algorithm Journey Map" enumerates all social and technical activities throughout the AI solution's procurement, development, integration, and full lifecycle management. In addition to mapping the "who?" and "what?" of the adoption of the AI tool, we also share several lessons learned throughout the algorithm journey map, including modeling assumptions, stakeholder inclusion, and organizational structure. In doing so, we identify generalizable insights about how to recognize and navigate barriers to AI/ML adoption in healthcare settings. We expect that this effort will further the development of best practices for operationalizing and sustaining ethical principles in algorithmic systems.
Affiliation(s)
- William Boag
  - Duke Institute for Health Innovation, Durham, NC, USA
- Alifia Hasan
  - Duke Institute for Health Innovation, Durham, NC, USA
- Jee Young Kim
  - Duke Institute for Health Innovation, Durham, NC, USA
- Mike Revoir
  - Duke Institute for Health Innovation, Durham, NC, USA
- Michael Gao
  - Duke Institute for Health Innovation, Durham, NC, USA
- Shira Zilberstein
  - Duke Institute for Health Innovation, Durham, NC, USA
  - Harvard University, Cambridge, MA, USA
- Manesh Patel
  - Duke University School of Medicine, Durham, NC, USA
- Suresh Balu
  - Duke Institute for Health Innovation, Durham, NC, USA
- Mark Sendak
  - Duke Institute for Health Innovation, Durham, NC, USA
9
Hashemi Gheinani A, Kim J, You S, Adam RM. Bioinformatics in urology - molecular characterization of pathophysiology and response to treatment. Nat Rev Urol 2024; 21:214-242. [PMID: 37604982] [DOI: 10.1038/s41585-023-00805-3]
Abstract
The application of bioinformatics has revolutionized the practice of medicine in the past 20 years. From early studies that uncovered subtypes of cancer to broad efforts spearheaded by the Cancer Genome Atlas initiative, the use of bioinformatics strategies to analyse high-dimensional data has provided unprecedented insights into the molecular basis of disease. In addition to the identification of disease subtypes - which enables risk stratification - informatics analysis has facilitated the identification of novel risk factors and drivers of disease, biomarkers of progression and treatment response, as well as possibilities for drug repurposing or repositioning; moreover, bioinformatics has guided research towards precision and personalized medicine. Implementation of specific computational approaches such as artificial intelligence, machine learning and molecular subtyping has yet to become widespread in urology clinical practice for reasons of cost, disruption of clinical workflow and need for prospective validation of informatics approaches in independent patient cohorts. Solving these challenges might accelerate routine integration of bioinformatics into clinical settings.
Affiliation(s)
- Ali Hashemi Gheinani
  - Department of Urology, Boston Children's Hospital, Boston, MA, USA
  - Department of Surgery, Harvard Medical School, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Urology, Inselspital, Bern, Switzerland
  - Department for BioMedical Research, University of Bern, Bern, Switzerland
- Jina Kim
  - Department of Urology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
  - Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Sungyong You
  - Department of Urology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
  - Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Rosalyn M Adam
  - Department of Urology, Boston Children's Hospital, Boston, MA, USA
  - Department of Surgery, Harvard Medical School, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
10
van de Sande D, Chung EFF, Oosterhoff J, van Bommel J, Gommers D, van Genderen ME. To warrant clinical adoption AI models require a multi-faceted implementation evaluation. NPJ Digit Med 2024; 7:58. [PMID: 38448743] [PMCID: PMC10918103] [DOI: 10.1038/s41746-024-01064-1]
Abstract
Although artificial intelligence (AI) technology is progressing at an unprecedented rate, our ability to translate these advancements into clinical value and adoption at the bedside remains comparatively limited. This paper reviews the current use of implementation outcomes in randomized controlled trials evaluating AI-based clinical decision support and finds limited adoption. To advance trust in and clinical adoption of AI, there is a need to bridge the gap between traditional quantitative metrics and implementation outcomes to better grasp the reasons behind the success or failure of AI systems and improve their translation into clinical value.
Affiliation(s)
- Davy van de Sande
  - Erasmus MC University Medical Center, Department of Adult Intensive Care, Rotterdam, The Netherlands
- Eline Fung Fen Chung
  - Erasmus MC University Medical Center, Department of Adult Intensive Care, Rotterdam, The Netherlands
- Jacobien Oosterhoff
  - Delft University of Technology, Faculty of Technology, Policy and Management, Delft, The Netherlands
- Jasper van Bommel
  - Erasmus MC University Medical Center, Department of Adult Intensive Care, Rotterdam, The Netherlands
- Diederik Gommers
  - Erasmus MC University Medical Center, Department of Adult Intensive Care, Rotterdam, The Netherlands
- Michel E van Genderen
  - Erasmus MC University Medical Center, Department of Adult Intensive Care, Rotterdam, The Netherlands
11
Kwong JCC, Nickel GC, Wang SCY, Kvedar JC. Integrating artificial intelligence into healthcare systems: more than just the algorithm. NPJ Digit Med 2024; 7:52. [PMID: 38429418] [PMCID: PMC10907626] [DOI: 10.1038/s41746-024-01066-z]
Affiliation(s)
- Jethro C C Kwong
- Division of Urology, Department of Surgery, University of Toronto, Toronto, ON, Canada.
- Temerty Centre for AI Research and Education in Medicine, University of Toronto, Toronto, ON, Canada.
12
Adeoye J, Su YX. Leveraging artificial intelligence for perioperative cancer risk assessment of oral potentially malignant disorders. Int J Surg 2024; 110:1677-1686. [PMID: 38051932] [PMCID: PMC10942172] [DOI: 10.1097/js9.0000000000000979]
Abstract
Oral potentially malignant disorders (OPMDs) are mucosal conditions with an inherent disposition to develop oral squamous cell carcinoma. Surgical management is the most preferred strategy to prevent malignant transformation in OPMDs, and surgical approaches to treatment include conventional scalpel excision, laser surgery, cryotherapy, and photodynamic therapy. However, because not all patients with OPMDs will develop oral squamous cell carcinoma in their lifetime, there is a need to stratify patients according to their risk of malignant transformation to streamline surgical intervention for those at highest risk. Artificial intelligence (AI) has the potential to integrate the disparate factors influencing malignant transformation for more robust, precise, and personalized cancer risk stratification of patients with OPMDs than current methods, to determine the need for surgical resection, excision, or re-excision. Therefore, this article overviews existing AI models and tools, presents a clinical implementation pathway, and discusses necessary refinements to aid the clinical application of AI-based platforms for cancer risk stratification of OPMDs in surgical practice.
Affiliation(s)
- Yu-Xiong Su
- Division of Oral and Maxillofacial Surgery, Faculty of Dentistry, University of Hong Kong, Hong Kong SAR, People’s Republic of China
13
Jin W, Fatehi M, Guo R, Hamarneh G. Evaluating the clinical utility of artificial intelligence assistance and its explanation on the glioma grading task. Artif Intell Med 2024; 148:102751. [PMID: 38325929] [DOI: 10.1016/j.artmed.2023.102751]
Abstract
Clinical evaluation evidence and model explainability are key gatekeepers to ensure the safe, accountable, and effective use of artificial intelligence (AI) in clinical settings. We conducted a clinical user-centered evaluation with 35 neurosurgeons to assess the utility of AI assistance and its explanation on the glioma grading task. Each participant read 25 brain MRI scans of patients with gliomas, and gave their judgment on the glioma grading without and with the assistance of AI prediction and explanation. The AI model was trained on the BraTS dataset with 88.0% accuracy. The AI explanation was generated using the explainable AI algorithm of SmoothGrad, which was selected from 16 algorithms based on the criterion of being truthful to the AI decision process. Results showed that compared to the average accuracy of 82.5±8.7% when physicians performed the task alone, physicians' task performance increased to 87.7±7.3% with statistical significance (p-value = 0.002) when assisted by AI prediction, and remained at almost the same level of 88.5±7.0% (p-value = 0.35) with the additional assistance of AI explanation. Based on quantitative and qualitative results, the observed improvement in physicians' task performance assisted by AI prediction was mainly because physicians' decision patterns converged to be similar to AI, as physicians only switched their decisions when disagreeing with AI. The insignificant change in physicians' performance with the additional assistance of AI explanation was because the AI explanations did not provide explicit reasons, contexts, or descriptions of clinical features to help doctors discern potentially incorrect AI predictions. The evaluation showed the clinical utility of AI to assist physicians on the glioma grading task, and identified the limitations and clinical usage gaps of existing explainable AI techniques for future improvement.
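The SmoothGrad method named in the abstract averages input gradients over several noisy copies of an input to denoise the resulting saliency map. The sketch below is not the study's implementation: it is a minimal pure-Python illustration of the idea, with a toy quadratic model and hypothetical weights `w` standing in for the trained network.

```python
import random

def smoothgrad(grad_fn, x, n_samples=50, sigma=0.15, seed=0):
    """SmoothGrad: average the gradient of the model output w.r.t. the
    input over several noisy copies of the input, which denoises the
    resulting saliency map."""
    rng = random.Random(seed)
    acc = [0.0] * len(x)
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        grad = grad_fn(noisy)
        acc = [a + g for a, g in zip(acc, grad)]
    return [a / n_samples for a in acc]

# Toy stand-in for a trained model: f(x) = sum(w_i * x_i^2),
# whose exact input gradient is 2 * w_i * x_i.
w = [0.5, -1.0, 2.0]  # hypothetical weights, not from the study
grad_fn = lambda x: [2.0 * wi * xi for wi, xi in zip(w, x)]

saliency = smoothgrad(grad_fn, [1.0, 1.0, 1.0])
# Each component hovers near the exact gradient [1.0, -2.0, 4.0]
```

For an image model, `x` would be the flattened pixel values and `grad_fn` a backpropagation call; the averaging logic is the same.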
Affiliation(s)
- Weina Jin
- School of Computing Science, Simon Fraser University, Burnaby, Canada.
- Mostafa Fatehi
- Division of Neurosurgery, The University of British Columbia, Vancouver, Canada
- Ru Guo
- Division of Neurosurgery, The University of British Columbia, Vancouver, Canada
- Ghassan Hamarneh
- School of Computing Science, Simon Fraser University, Burnaby, Canada
14
Jung J, Dai J, Liu B, Wu Q. Artificial intelligence in fracture detection with different image modalities and data types: A systematic review and meta-analysis. PLOS Digit Health 2024; 3:e0000438. [PMID: 38289965] [PMCID: PMC10826962] [DOI: 10.1371/journal.pdig.0000438]
Abstract
Artificial Intelligence (AI), encompassing Machine Learning and Deep Learning, has increasingly been applied to fracture detection using diverse imaging modalities and data types. This systematic review and meta-analysis aimed to assess the efficacy of AI in detecting fractures through various imaging modalities and data types (image, tabular, or both) and to synthesize the existing evidence related to AI-based fracture detection. Peer-reviewed studies developing and validating AI for fracture detection were identified through searches in multiple electronic databases without time limitations. A hierarchical meta-analysis model was used to calculate pooled sensitivity and specificity. A diagnostic accuracy quality assessment was performed to evaluate bias and applicability. Of the 66 eligible studies, 54 identified fractures using imaging-related data, nine using tabular data, and three using both. Vertebral fractures were the most common outcome (n = 20), followed by hip fractures (n = 18). Hip fractures exhibited the highest pooled sensitivity (92%; 95% CI: 87-96, p < 0.01) and specificity (90%; 95% CI: 85-93, p < 0.01). Pooled sensitivity and specificity using image data (92%; 95% CI: 90-94, p < 0.01; and 91%; 95% CI: 88-93, p < 0.01) were higher than those using tabular data (81%; 95% CI: 77-85, p < 0.01; and 83%; 95% CI: 76-88, p < 0.01), respectively. Radiographs demonstrated the highest pooled sensitivity (94%; 95% CI: 90-96, p < 0.01) and specificity (92%; 95% CI: 89-94, p < 0.01). Patient selection and reference standards were major concerns in assessing diagnostic accuracy for bias and applicability. AI displays high diagnostic accuracy for various fracture outcomes, indicating potential utility in healthcare systems for fracture diagnosis. However, enhanced transparency in reporting and adherence to standardized guidelines are necessary to improve the clinical applicability of AI. Review Registration: PROSPERO (CRD42021240359).
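The pooled estimates above come from a hierarchical (bivariate) meta-analysis model. As a simplified illustration only, the sketch below pools per-study sensitivities by inverse-variance weighting on the logit scale; the study counts are invented, and a full reproduction would need the bivariate model rather than this fixed-effect shortcut.

```python
import math

def pooled_sensitivity(studies):
    """Fixed-effect, inverse-variance pooling of per-study sensitivities
    on the logit scale (a deliberate simplification of the hierarchical
    model used in the meta-analysis)."""
    num = den = 0.0
    for tp, fn in studies:
        # 0.5 continuity correction guards against zero cells
        p = (tp + 0.5) / (tp + fn + 1.0)
        logit = math.log(p / (1.0 - p))
        var = 1.0 / (tp + 0.5) + 1.0 / (fn + 0.5)  # variance of the logit
        num += logit / var
        den += 1.0 / var
    return 1.0 / (1.0 + math.exp(-num / den))  # back-transform to a proportion

# Hypothetical (true positive, false negative) counts for three studies
studies = [(90, 10), (45, 5), (180, 20)]
sens = pooled_sensitivity(studies)  # close to 0.90
```

The logit transform keeps the pooled estimate inside (0, 1) and makes the per-study variances approximately normal, which is why meta-analyses of proportions typically pool on that scale.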
Affiliation(s)
- Jongyun Jung
- Department of Biomedical Informatics (Dr. Qing Wu, Jongyun Jung, and Jingyuan Dai), College of Medicine, The Ohio State University, Columbus, Ohio, United States of America
- Jingyuan Dai
- Department of Biomedical Informatics (Dr. Qing Wu, Jongyun Jung, and Jingyuan Dai), College of Medicine, The Ohio State University, Columbus, Ohio, United States of America
- Bowen Liu
- Department of Mathematics and Statistics, Division of Computing, Analytics, and Mathematics, School of Science and Engineering (Bowen Liu), University of Missouri-Kansas City, Kansas City, Missouri, United States of America
- Qing Wu
- Department of Biomedical Informatics (Dr. Qing Wu, Jongyun Jung, and Jingyuan Dai), College of Medicine, The Ohio State University, Columbus, Ohio, United States of America
15
Föllmer B, Williams MC, Dey D, Arbab-Zadeh A, Maurovich-Horvat P, Volleberg RHJA, Rueckert D, Schnabel JA, Newby DE, Dweck MR, Guagliumi G, Falk V, Vázquez Mézquita AJ, Biavati F, Išgum I, Dewey M. Roadmap on the use of artificial intelligence for imaging of vulnerable atherosclerotic plaque in coronary arteries. Nat Rev Cardiol 2024; 21:51-64. [PMID: 37464183] [DOI: 10.1038/s41569-023-00900-3]
Abstract
Artificial intelligence (AI) is likely to revolutionize the way medical images are analysed and has the potential to improve the identification and analysis of vulnerable or high-risk atherosclerotic plaques in coronary arteries, leading to advances in the treatment of coronary artery disease. However, coronary plaque analysis is challenging owing to cardiac and respiratory motion, as well as the small size of cardiovascular structures. Moreover, the analysis of coronary imaging data is time-consuming, can be performed only by clinicians with dedicated cardiovascular imaging training, and is subject to considerable interreader and intrareader variability. AI has the potential to improve the assessment of images of vulnerable plaque in coronary arteries, but requires robust development, testing and validation. Combining human expertise with AI might facilitate the reliable and valid interpretation of images obtained using CT, MRI, PET, intravascular ultrasonography and optical coherence tomography. In this Roadmap, we review existing evidence on the application of AI to the imaging of vulnerable plaque in coronary arteries and provide consensus recommendations developed by an interdisciplinary group of experts on AI and non-invasive and invasive coronary imaging. We also outline future requirements of AI technology to address bias, uncertainty, explainability and generalizability, which are all essential for the acceptance of AI and its clinical utility in handling the anticipated growing volume of coronary imaging procedures.
Affiliation(s)
- Bernhard Föllmer
- Department of Radiology, Charité - Universitätsmedizin Berlin, Berlin, Germany.
- Damini Dey
- Biomedical Imaging Research Institute and Department of Imaging, Medicine and Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Armin Arbab-Zadeh
- Division of Cardiology, Department of Medicine, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Pál Maurovich-Horvat
- Department of Radiology, Medical Imaging Center, Semmelweis University, Budapest, Hungary
- Rick H J A Volleberg
- Department of Cardiology, Radboud University Medical Center, Nijmegen, Netherlands
- Daniel Rueckert
- Artificial Intelligence in Medicine and Healthcare, Technical University of Munich, Munich, Germany
- Department of Computing, Imperial College London, London, UK
- Julia A Schnabel
- School of Biomedical Imaging and Imaging Sciences, King's College London, London, UK
- Institute of Machine Learning in Biomedical Imaging, Helmholtz Munich, Neuherberg, Germany
- School of Computation, Information and Technology, Technical University of Munich, Munich, Germany
- David E Newby
- Centre for Cardiovascular Science, University of Edinburgh, Edinburgh, UK
- Marc R Dweck
- Centre for Cardiovascular Science, University of Edinburgh, Edinburgh, UK
- Giulio Guagliumi
- Division of Cardiology, IRCCS Galeazzi Sant'Ambrogio Hospital, Milan, Italy
- Volkmar Falk
- Department of Cardiothoracic and Vascular Surgery, Deutsches Herzzentrum der Charité, Charité Universitätsmedizin, Berlin, Germany
- Department of Health Science and Technology, ETH Zurich, Zurich, Switzerland
- Berlin Institute of Health at Charité and DZHK (German Centre for Cardiovascular Research), Partner Site, Berlin, Germany
- Federico Biavati
- Department of Radiology, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Ivana Išgum
- Department of Biomedical Engineering and Physics, Amsterdam University Medical Center, University of Amsterdam, Amsterdam, Netherlands
- Department of Radiology and Nuclear Medicine, Amsterdam UMC, University of Amsterdam, Amsterdam, Netherlands
- Informatics Institute, Faculty of Science, University of Amsterdam, Amsterdam, Netherlands
- Marc Dewey
- Department of Radiology, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Berlin Institute of Health, Campus Charité Mitte, Berlin, Germany
- DZHK (German Centre for Cardiovascular Research), Partner Site Berlin and Deutsches Herzzentrum der Charité (DHZC), Charité - Universitätsmedizin Berlin, Berlin, Germany
16
Shiwani T, Relton S, Evans R, Kale A, Heaven A, Clegg A, Todd O. New Horizons in artificial intelligence in the healthcare of older people. Age Ageing 2023; 52:afad219. [PMID: 38124256] [PMCID: PMC10733173] [DOI: 10.1093/ageing/afad219]
Abstract
Artificial intelligence (AI) in healthcare describes algorithm-based computational techniques which manage and analyse large datasets to make inferences and predictions. There are many potential applications of AI in the care of older people, from clinical decision support systems that can support identification of delirium from clinical records to wearable devices that can predict the risk of a fall. We held four meetings of older people, clinicians and AI researchers. Three priority areas were identified for AI application in the care of older people. These included: monitoring and early diagnosis of disease, stratified care and care coordination between healthcare providers. However, the meetings also highlighted concerns that AI may exacerbate health inequity for older people through bias within AI models, lack of external validation amongst older people, infringements on privacy and autonomy, insufficient transparency of AI models and lack of safeguarding for errors. Creating effective interventions for older people requires a person-centred approach to account for the needs of older people, as well as sufficient clinical and technological governance to meet standards of generalisability, transparency and effectiveness. Education of clinicians and patients is also needed to ensure appropriate use of AI technologies, with investment in technological infrastructure required to ensure equity of access.
Affiliation(s)
- Taha Shiwani
- Academic Unit for Ageing & Stroke Research, Bradford Institute for Health Research, Bradford Teaching Hospitals NHS Foundation Trust, Duckworth Lane, Bradford, West Yorkshire BD9 6RJ, UK
- Samuel Relton
- Leeds Institute of Health Sciences, University of Leeds, Leeds, UK
- Ruth Evans
- Leeds Institute of Health Sciences, University of Leeds, Leeds, UK
- Aditya Kale
- Academic Unit of Ophthalmology, Institute of Inflammation & Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- Anne Heaven
- Academic Unit for Ageing & Stroke Research, Bradford Institute for Health Research, Bradford Teaching Hospitals NHS Foundation Trust, Duckworth Lane, Bradford, West Yorkshire BD9 6RJ, UK
- Andrew Clegg
- Academic Unit for Ageing & Stroke Research, Bradford Institute for Health Research, Bradford Teaching Hospitals NHS Foundation Trust, Duckworth Lane, Bradford, West Yorkshire BD9 6RJ, UK
- Oliver Todd
- Academic Unit for Ageing & Stroke Research, Bradford Institute for Health Research, Bradford Teaching Hospitals NHS Foundation Trust, Duckworth Lane, Bradford, West Yorkshire BD9 6RJ, UK
17
McCradden MD, Joshi S, Anderson JA, London AJ. A normative framework for artificial intelligence as a sociotechnical system in healthcare. Patterns (N Y) 2023; 4:100864. [PMID: 38035190] [PMCID: PMC10682751] [DOI: 10.1016/j.patter.2023.100864]
Abstract
Artificial intelligence (AI) tools are of great interest to healthcare organizations for their potential to improve patient care, yet their translation into clinical settings remains inconsistent. One of the reasons for this gap is that good technical performance does not inevitably result in patient benefit. We advocate for a conceptual shift wherein AI tools are seen as components of an intervention ensemble. The intervention ensemble describes the constellation of practices that, together, bring about benefit to patients or health systems. Shifting from a narrow focus on the tool itself toward the intervention ensemble prioritizes a "sociotechnical" vision for translation of AI that values all components of use that support beneficial patient outcomes. The intervention ensemble approach can be used for regulation, institutional oversight, and for AI adopters to responsibly and ethically appraise, evaluate, and use AI tools.
Affiliation(s)
- Melissa D. McCradden
- Department of Bioethics, The Hospital for Sick Children, Toronto, ON, Canada
- Genetics & Genome Biology Research Program, Peter Gilgan Center for Research & Learning, Toronto, ON, Canada
- Division of Clinical & Public Health, Dalla Lana School of Public Health, Toronto, ON, Canada
- Shalmali Joshi
- Department of Biomedical Informatics, Department of Computer Science (Affiliate), Data Science Institute, Columbia University, New York, NY, USA
- James A. Anderson
- Department of Bioethics, The Hospital for Sick Children, Toronto, ON, Canada
- Institute for Health Policy, Management, and Evaluation, University of Toronto, Toronto, ON, Canada
- Alex John London
- Department of Philosophy and Center for Ethics and Policy, Carnegie Mellon University, Pittsburgh, PA, USA
18
Lin M, Zhou Q, Lei T, Shang N, Zheng Q, He X, Wang N, Xie H. Deep learning system improved detection efficacy of fetal intracranial malformations in a randomized controlled trial. NPJ Digit Med 2023; 6:191. [PMID: 37833395] [PMCID: PMC10575919] [DOI: 10.1038/s41746-023-00932-6]
Abstract
Congenital malformations of the central nervous system are among the most common major congenital malformations. Deep learning systems have come to the fore in prenatal diagnosis of congenital malformation, but the impact of deep learning-assisted detection of congenital intracranial malformations from fetal neurosonographic images has not been evaluated. Here we report a three-way crossover, randomized controlled trial (Trial Registration: ChiCTR2100048233) that assesses the efficacy of a deep learning system, the Prenatal Ultrasound Diagnosis Artificial Intelligence Conduct System (PAICS), in assisting fetal intracranial malformation detection. A total of 709 fetal neurosonographic images/videos are read interactively by 36 sonologists of different expertise levels in three reading modes: unassisted mode (without PAICS assistance), concurrent mode (using PAICS from the beginning of the assessment) and second mode (using PAICS after a fully unaided interpretation). Aided by PAICS, the average accuracy of the unassisted mode (73%) is increased in the concurrent mode (80%; P < 0.001) and the second mode (82%; P < 0.001). Correspondingly, the AUC is increased from 0.85 to 0.89 and to 0.90, respectively (P < 0.001 for all). The median read time per case is slightly increased in concurrent mode but substantially prolonged in the second mode, from 6 s to 7 s and to 11 s (P < 0.001 for all). In conclusion, PAICS in both concurrent and second modes has the potential to improve sonologists' performance in detecting fetal intracranial malformations from neurosonographic data. PAICS is more efficient when used concurrently for all readers.
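The AUC figures above summarise reader discrimination in each mode. As an illustration of how such an AUC can be computed, the sketch below uses the Mann-Whitney formulation; the confidence scores are invented, not taken from the trial.

```python
def auc(pos_scores, neg_scores):
    """AUC via the Mann-Whitney statistic: the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative
    case, with ties counted as one half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Invented reader-confidence scores for malformed vs. normal scans
malformed = [0.9, 0.8, 0.75, 0.6]
normal = [0.7, 0.4, 0.3, 0.2]
result = auc(malformed, normal)  # 15 of 16 pairs ranked correctly -> 0.9375
```

This pairwise form is quadratic in the number of cases but makes the probabilistic meaning of AUC explicit; production code would use the equivalent rank-sum formula.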
Affiliation(s)
- Meifang Lin
- Department of Ultrasonic Medicine, Fetal Medical Center, First Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong, China
- Qian Zhou
- Department of Medical Statistics, Clinical Trials Unit, First Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong, China and Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Ting Lei
- Department of Ultrasonic Medicine, Fetal Medical Center, First Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong, China
- Ning Shang
- Department of Ultrasound, Guangdong Women and Children Hospital, Guangzhou, Guangdong, China
- Qiao Zheng
- Department of Ultrasonic Medicine, Fetal Medical Center, First Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong, China
- Xiaoqin He
- Department of Ultrasound, Women and Children's Hospital affiliated to Xiamen University, Xiamen, Fujian, China
- Nan Wang
- Guangzhou Aiyunji Information Technology Co., Ltd, Guangzhou, Guangdong, China
- Hongning Xie
- Department of Ultrasonic Medicine, Fetal Medical Center, First Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong, China
19
McCradden M, Hui K, Buchman DZ. Evidence, ethics and the promise of artificial intelligence in psychiatry. J Med Ethics 2023; 49:573-579. [PMID: 36581457] [PMCID: PMC10423547] [DOI: 10.1136/jme-2022-108447]
Abstract
Researchers are studying how artificial intelligence (AI) can be used to better detect, prognosticate and subgroup diseases. The idea that AI might advance medicine's understanding of biological categories of psychiatric disorders, as well as provide better treatments, is appealing given the historical challenges with prediction, diagnosis and treatment in psychiatry. Given the power of AI to analyse vast amounts of information, some clinicians may feel obligated to align their clinical judgements with the outputs of the AI system. However, a potential epistemic privileging of AI in clinical judgements may lead to unintended consequences that could negatively affect patient treatment, well-being and rights. The implications are also relevant to precision medicine, digital twin technologies and predictive analytics generally. We propose that a commitment to epistemic humility can help promote judicious clinical decision-making at the interface of big data and AI in psychiatry.
Affiliation(s)
- Melissa McCradden
- Joint Centre for Bioethics, University of Toronto Dalla Lana School of Public Health, Toronto, Ontario, Canada
- Bioethics, The Hospital for Sick Children, Toronto, Ontario, Canada
- Genetics & Genome Biology, Peter Gilgan Centre for Research and Learning, Toronto, Ontario, Canada
- Katrina Hui
- Everyday Ethics Lab, Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Department of Psychiatry, University of Toronto, Toronto, Ontario, Canada
- Daniel Z Buchman
- Joint Centre for Bioethics, University of Toronto Dalla Lana School of Public Health, Toronto, Ontario, Canada
- Everyday Ethics Lab, Centre for Addiction and Mental Health, Toronto, Ontario, Canada
20
Wehkamp K, Krawczak M, Schreiber S. The Quality and Utility of Artificial Intelligence in Patient Care. Dtsch Arztebl Int 2023; 120:463-469. [PMID: 37218054] [PMCID: PMC10487679] [DOI: 10.3238/arztebl.m2023.0124]
Abstract
BACKGROUND Artificial intelligence (AI) is increasingly being used in patient care. In the future, physicians will need to understand not only the basic functioning of AI applications, but also their quality, utility, and risks. METHODS This article is based on a selective review of the literature on the principles, quality, limitations, and benefits of AI applications in patient care, along with examples of individual applications. RESULTS The number of AI applications in patient care is rising, with more than 500 approvals in the United States to date. Their quality and utility are based on a number of interdependent factors, including the real-life setting, the type and amount of data collected, the choice of variables used by the application, the algorithms used, and the goal and implementation of each application. Bias (which may be hidden) and errors can arise at all these levels. Any evaluation of the quality and utility of an AI application must, therefore, be conducted according to the scientific principles of evidence-based medicine-a requirement that is often hampered by a lack of transparency. CONCLUSION AI has the potential to improve patient care while meeting the challenge of dealing with an ever-increasing surfeit of information and data in medicine with limited human resources. The limitations and risks of AI applications require critical and responsible consideration. This can best be achieved through a combination of scientific.
Affiliation(s)
- Kai Wehkamp
- Department of Internal Medicine I, University Medical Center Schleswig-Holstein, Campus Lübeck, Kiel, Germany
- Department for Medical Management, MSH Medical School Hamburg, Hamburg, Germany
- Michael Krawczak
- Institute of Medical Informatics and Statistics, Christian-Albrechts-University of Kiel, University Medical Center Schleswig-Holstein Campus Kiel, Germany
- Stefan Schreiber
- Department of Internal Medicine I, University Medical Center Schleswig-Holstein, Campus Lübeck, Kiel, Germany
- Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, University Medical Center Schleswig-Holstein Campus Kiel, Germany
21
Trottet C, Vogels T, Keitel K, Kulinkina AV, Tan R, Cobuccio L, Jaggi M, Hartley MA. Modular Clinical Decision Support Networks (MoDN)-Updatable, interpretable, and portable predictions for evolving clinical environments. PLOS Digit Health 2023; 2:e0000108. [PMID: 37459285] [PMCID: PMC10351690] [DOI: 10.1371/journal.pdig.0000108]
Abstract
Clinical Decision Support Systems (CDSS) have the potential to improve and standardise care with probabilistic guidance. However, many CDSS deploy static, generic rule-based logic, resulting in inequitably distributed accuracy and inconsistent performance in evolving clinical environments. Data-driven models could resolve this issue by updating predictions according to the data collected. However, the volume of data required necessitates collaborative learning between analogous CDSSs, which are often imperfectly interoperable (IIO) or unshareable. We propose Modular Clinical Decision Support Networks (MoDN), which allow flexible, privacy-preserving learning across IIO datasets, are robust to the systematic missingness common to CDSS-derived data, and provide interpretable, continuous predictive feedback to the clinician. MoDN is a novel decision tree composed of feature-specific neural network modules that can be combined in any number or combination to make any number or combination of diagnostic predictions, updatable at each step of a consultation. The model is validated on a real-world CDSS-derived dataset comprising 3,192 paediatric outpatients in Tanzania. MoDN significantly outperforms 'monolithic' baseline models (which take all features at once at the end of a consultation), with a mean macro F1 score across all diagnoses of 0.749 vs 0.651 for logistic regression and 0.620 for a multilayer perceptron (p < 0.001). To test collaborative learning between IIO datasets, we create subsets with various percentages of feature overlap and port a MoDN model trained on one subset to another. Even with only 60% of features in common, fine-tuning a MoDN model on the new dataset, or simply composing a model from MoDN modules, matched the ideal scenario of sharing data in a perfectly interoperable setting. MoDN integrates into consultation logic by providing interpretable, continuous feedback on the predictive potential of each question in a CDSS questionnaire. The modular design allows it to compartmentalise training updates to specific features and to learn collaboratively between IIO datasets without sharing any data.
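The macro F1 metric used to compare MoDN against the monolithic baselines weights every diagnosis equally, regardless of prevalence. The sketch below is a minimal single-label version (MoDN itself predicts multiple concurrent diagnoses); the toy labels are invented.

```python
from collections import defaultdict

def macro_f1(y_true, y_pred):
    """Macro-averaged F1: compute F1 per class, then average with equal
    weight, so rare diagnoses count as much as common ones."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    f1s = []
    for c in set(y_true) | set(y_pred):
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Invented single-label toy example
y_true = ["malaria", "malaria", "anaemia", "pneumonia"]
y_pred = ["malaria", "anaemia", "anaemia", "pneumonia"]
score = macro_f1(y_true, y_pred)  # (2/3 + 2/3 + 1) / 3 = 7/9
```

Micro-averaging would instead pool all counts before computing one F1, letting frequent diagnoses dominate; the macro form is the fairer choice when class prevalence is skewed, as in outpatient diagnosis.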
Affiliation(s)
- Cécile Trottet
- Intelligent Global Health Research Group, Machine Learning and Optimization Laboratory, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
- Thijs Vogels
- Intelligent Global Health Research Group, Machine Learning and Optimization Laboratory, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
- Kristina Keitel
- Division of Pediatric Emergency Medicine, Department of Pediatrics, Inselspital, Bern University Hospital, University of Bern, Switzerland
- Alexandra V. Kulinkina
- Digital Health Unit, Swiss Center for International Health, Swiss Tropical and Public Health Institute, Allschwil, Switzerland
- University of Basel, Basel, Switzerland
- Rainer Tan
- Clinical Research Unit, Swiss Tropical and Public Health Institute, Allschwil, Switzerland
- Ifakara Health Institute, Ifakara, Tanzania
- Center for Primary Care and Public Health (Unisanté), Lausanne, Switzerland
- Ludovico Cobuccio
- Clinical Research Unit, Swiss Tropical and Public Health Institute, Allschwil, Switzerland
- Martin Jaggi
- Machine Learning and Optimization Laboratory, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
- Mary-Anne Hartley
- Intelligent Global Health Research Group, Machine Learning and Optimization Laboratory, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
- Laboratory of Intelligent Global Health Technologies, Biomedical Informatics and Data Science, Yale School of Medicine, New Haven, CT, USA
22
Eskofier BM, Klucken J. Predictive Models for Health Deterioration: Understanding Disease Pathways for Personalized Medicine. Annu Rev Biomed Eng 2023; 25:131-156. [PMID: 36854259] [DOI: 10.1146/annurev-bioeng-110220-030247]
Abstract
Artificial intelligence (AI) and machine learning (ML) methods are currently widely employed in medicine and healthcare. A PubMed search returns more than 100,000 articles on these topics published between 2018 and 2022 alone. Notwithstanding several recent reviews in various subfields of AI and ML in medicine, we have yet to see a comprehensive review around the methods' use in longitudinal analysis and prediction of an individual patient's health status within a personalized disease pathway. This review seeks to fill that gap. After an overview of the AI and ML methods employed in this field and of specific medical applications of models of this type, the review discusses the strengths and limitations of current studies and looks ahead to future strands of research in this field. We aim to enable interested readers to gain a detailed impression of the research currently available and accordingly plan future work around predictive models for deterioration in health status.
Affiliation(s)
- Bjoern M Eskofier
  - Machine Learning and Data Analytics Lab, Department of Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Jochen Klucken
  - Digital Medicine Group, Luxembourg Centre for Systems Biomedicine, Université du Luxembourg, Belvaux, Luxembourg
  - Digital Medicine Group, Department of Precision Health, Luxembourg Institute of Health, Strassen, Luxembourg
  - Centre Hospitalier de Luxembourg, Luxembourg City, Luxembourg
23
Marwaha JS, Raza MM, Kvedar JC. The digital transformation of surgery. NPJ Digit Med 2023; 6:103. [PMID: 37258642] [PMCID: PMC10232406] [DOI: 10.1038/s41746-023-00846-3]
Abstract
Rapid advances in digital technology and artificial intelligence in recent years have already begun to transform many industries, and are beginning to make headway into healthcare. There is tremendous potential for new digital technologies to improve the care of surgical patients. In this piece, we highlight work being done to advance surgical care using machine learning, computer vision, wearable devices, remote patient monitoring, and virtual and augmented reality. We describe ways these technologies can be used to improve the practice of surgery, and discuss opportunities and challenges to their widespread adoption and use in operating rooms and at the bedside.
Affiliation(s)
- Jayson S Marwaha
  - Beth Israel Deaconess Medical Center, Boston, MA, USA
  - Harvard Medical School, Boston, MA, USA
- Joseph C Kvedar
  - Harvard Medical School, Boston, MA, USA
  - Mass General Brigham, Boston, MA, USA
24
Sun F, Yao J, Du S, Qian F, Appleton AA, Tao C, Xu H, Liu L, Dai Q, Joyce BT, Nannini DR, Hou L, Zhang K. Social Determinants, Cardiovascular Disease, and Health Care Cost: A Nationwide Study in the United States Using Machine Learning. J Am Heart Assoc 2023; 12:e027919. [PMID: 36802713] [PMCID: PMC10111459] [DOI: 10.1161/jaha.122.027919]
Abstract
Background Existing studies on cardiovascular diseases (CVDs) often focus on individual-level behavioral risk factors, but research examining social determinants is limited. This study applies a novel machine learning approach to identify the key predictors of county-level care costs and prevalence of CVDs (including atrial fibrillation, acute myocardial infarction, congestive heart failure, and ischemic heart disease). Methods and Results We applied the extreme gradient boosting machine learning approach to a total of 3137 counties. Data are from the Interactive Atlas of Heart Disease and Stroke and a variety of national data sets. We found that although demographic composition (eg, percentages of Black people and older adults) and risk factors (eg, smoking and physical inactivity) are among the most important predictors for inpatient care costs and CVD prevalence, contextual factors such as social vulnerability and racial and ethnic segregation are particularly important for the total and outpatient care costs. Poverty and income inequality are the major contributors to the total care costs for counties that are in nonmetro areas or have high segregation or social vulnerability levels. Racial and ethnic segregation is particularly important in shaping the total care costs for counties with low poverty rates or low social vulnerability levels. Demographic composition, education, and social vulnerability are consistently important across different scenarios. Conclusions The findings highlight the differences in predictors for different types of CVD cost outcomes and the importance of social determinants. Interventions directed toward areas that have been economically and socially marginalized may aid in reducing the impact of CVDs.
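As a rough illustration of the modeling approach this abstract describes, the sketch below fits a gradient-boosted ensemble to synthetic data and ranks predictors by permutation importance. It is a hypothetical stand-in only: it uses scikit-learn's `GradientBoostingRegressor` rather than the XGBoost implementation the study used, and the features and cost outcome are simulated, not the Interactive Atlas data.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for county-level predictors (demographic composition,
# behavioral risk factors, social vulnerability) and a care-cost outcome.
X, y = make_regression(n_samples=500, n_features=8, n_informative=4,
                       noise=5.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)

# Rank predictors by permutation importance on held-out counties,
# analogous to identifying "key predictors" of costs and prevalence.
imp = permutation_importance(model, X_te, y_te, n_repeats=5, random_state=0)
ranking = np.argsort(imp.importances_mean)[::-1]
print("top predictors (feature indices):", ranking[:4].tolist())
```

Permutation importance is one generic way to interrogate a boosted model; the study itself may have used XGBoost's built-in gain-based importances instead.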
Affiliation(s)
- Feinuo Sun
  - Global Aging and Community Initiative, Mount Saint Vincent University, Halifax, Nova Scotia, Canada
- Jie Yao
  - Department of Epidemiology and Biostatistics, School of Public Health, University at Albany, State University of New York, Albany, NY
- Shichao Du
  - Department of Sociology, University at Albany, State University of New York, Albany, NY
- Feng Qian
  - Department of Health Policy, Management and Behavior, School of Public Health, University at Albany, State University of New York, Albany, NY
- Allison A Appleton
  - Department of Epidemiology and Biostatistics, School of Public Health, University at Albany, State University of New York, Albany, NY
- Cui Tao
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX
- Hua Xu
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX
- Lei Liu
  - Division of Biostatistics, Washington University in St. Louis, St. Louis, MO
- Qi Dai
  - Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, School of Medicine, Vanderbilt University, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN
- Brian T Joyce
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL
- Drew R Nannini
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL
- Lifang Hou
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL
- Kai Zhang
  - Department of Environmental Health Sciences, School of Public Health, University at Albany, State University of New York, Albany, NY
25
Gurevich E, El Hassan B, El Morr C. Equity within AI systems: What can health leaders expect? Healthc Manage Forum 2023; 36:119-124. [PMID: 36226507] [PMCID: PMC9976641] [DOI: 10.1177/08404704221125368]
Abstract
Artificial Intelligence (AI) for health has great potential; it has already proven successful in enhancing patient outcomes, facilitating professional work, and benefiting administration. However, AI presents challenges related to health equity, defined as an opportunity for people to reach their fullest health potential. This article discusses the opportunities and challenges that AI presents in health and examines ways in which inequities related to AI can be mitigated.
Affiliation(s)
- Christo El Morr
  - York University, Toronto, Ontario, Canada
26
Walter W, Pohlkamp C, Meggendorfer M, Nadarajah N, Kern W, Haferlach C, Haferlach T. Artificial intelligence in hematological diagnostics: Game changer or gadget? Blood Rev 2023; 58:101019. [PMID: 36241586] [DOI: 10.1016/j.blre.2022.101019]
Abstract
The future of clinical diagnosis and treatment of hematologic diseases will inevitably involve the integration of artificial intelligence (AI)-based systems into routine practice to support the hematologists' decision making. Several studies have shown that AI-based models can already be used to automatically differentiate cells, reliably detect malignant cell populations, support chromosome banding analysis, and interpret clinical variants, contributing to early disease detection and prognosis. However, even the best tool can become useless if it is misapplied or the results are misinterpreted. Therefore, in order to comprehensively judge and correctly apply newly developed AI-based systems, the hematologist must have a basic understanding of the general concepts of machine learning. In this review, we provide the hematologist with a comprehensive overview of various machine learning techniques, their current implementations and approaches in different diagnostic subfields (e.g., cytogenetics, molecular genetics), and the limitations and unresolved challenges of the systems.
Affiliation(s)
- Wencke Walter, Christian Pohlkamp, Manja Meggendorfer, Niroshan Nadarajah, Wolfgang Kern, Claudia Haferlach, Torsten Haferlach
  - MLL Munich Leukemia Laboratory, Max-Lebsche-Platz 31, 81377 München, Germany
27
Rocheteau E. On the role of artificial intelligence in psychiatry. Br J Psychiatry 2023; 222:54-57. [PMID: 36093950] [DOI: 10.1192/bjp.2022.132]
Abstract
Recently, there has been growing interest in artificial intelligence (AI) to improve the efficiency and personalisation of mental health services. So far, progress has been slow; however, advancements in deep learning may change this. This paper discusses the role for AI in psychiatry, in particular (a) diagnostic tools, (b) monitoring of symptoms, and (c) delivering personalised treatment recommendations. Finally, I discuss ethical concerns and technological limitations.
Affiliation(s)
- Emma Rocheteau
  - School of Clinical Medicine, University of Cambridge, UK
  - Department of Computer Science and Technology, University of Cambridge, UK
28
Marwaha JS, Chen HW, Habashy K, Choi J, Spain DA, Brat GA. Appraising the Quality of Development and Reporting in Surgical Prediction Models. JAMA Surg 2023; 158:214-216. [PMID: 36449299] [PMCID: PMC9713676] [DOI: 10.1001/jamasurg.2022.4488]
Abstract
This cross-sectional study uses the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis reporting guideline to assess 120 published studies about surgical prediction models.
Affiliation(s)
- Jayson S Marwaha
  - Beth Israel Deaconess Medical Center, Department of Surgery, Boston, Massachusetts
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts
- Hao Wei Chen
  - Beth Israel Deaconess Medical Center, Department of Surgery, Boston, Massachusetts
- Karl Habashy
  - American University of Beirut Medical Center, Beirut, Lebanon
- Jeff Choi
  - Department of Surgery, Stanford University, Palo Alto, California
  - Department of Biomedical Data Science, Stanford University, Palo Alto, California
- David A Spain
  - Department of Surgery, Stanford University, Palo Alto, California
- Gabriel A Brat
  - Beth Israel Deaconess Medical Center, Department of Surgery, Boston, Massachusetts
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts
29
Riester MR, Zullo AR. Prediction tool Development and Implementation in pharmacy praCTice (PreDICT) proposed guidance. Am J Health Syst Pharm 2023; 80:111-123. [PMID: 36242567] [DOI: 10.1093/ajhp/zxac298]
Abstract
PURPOSE Proposed guidance is presented for Prediction tool Development and Implementation in pharmacy praCTice (PreDICT). This guidance aims to assist pharmacists and their collaborators with planning, developing, and implementing custom risk prediction tools for use by pharmacists in their own health systems or practice settings. We aimed to describe general considerations that would be relevant to most prediction tools designed for use in health systems or other pharmacy practice settings. SUMMARY The PreDICT proposed guidance is organized into 3 sequential phases: (1) planning, (2) development and validation, and (3) testing and refining prediction tools for real-world use. Each phase is accompanied by a checklist of considerations designed to be used by pharmacists or their trainees (eg, residents) during the planning or conduct of a prediction tool project. Commentary and a worked example are also provided to highlight some of the most relevant and impactful considerations for each phase. CONCLUSION The proposed guidance for PreDICT is a pharmacist-focused set of checklists for planning, developing, and implementing prediction tools in pharmacy practice. The list of considerations and accompanying commentary can be used as a reference by pharmacists or their trainees before or during the completion of a prediction tool project.
Affiliation(s)
- Melissa R Riester
  - Department of Health Services, Policy, and Practice, Brown University School of Public Health, Providence, RI, USA
- Andrew R Zullo
  - Departments of Health Services, Policy, and Practice and Epidemiology, Brown University School of Public Health, Providence, RI
  - Department of Pharmacy, Rhode Island Hospital, Providence, RI, USA
30
Park SH, Han K, Jang HY, Park JE, Lee JG, Kim DW, Choi J. Methods for Clinical Evaluation of Artificial Intelligence Algorithms for Medical Diagnosis. Radiology 2023; 306:20-31. [PMID: 36346314] [DOI: 10.1148/radiol.220182]
Abstract
Adequate clinical evaluation of artificial intelligence (AI) algorithms before adoption in practice is critical. Clinical evaluation aims to confirm acceptable AI performance through adequate external testing and confirm the benefits of AI-assisted care compared with conventional care through appropriately designed and conducted studies, for which prospective studies are desirable. This article explains some of the fundamental methodological points that should be considered when designing and appraising the clinical evaluation of AI algorithms for medical diagnosis. The specific topics addressed include the following: (a) the importance of external testing of AI algorithms and strategies for conducting the external testing effectively, (b) the various metrics and graphical methods for evaluating the AI performance as well as essential methodological points to note in using and interpreting them, (c) paired study designs primarily for comparative performance evaluation of conventional and AI-assisted diagnoses, (d) parallel study designs primarily for evaluating the effect of AI intervention with an emphasis on randomized clinical trials, and (e) up-to-date guidelines for reporting clinical studies on AI, with an emphasis on guidelines registered in the EQUATOR Network library. Sound methodological knowledge of these topics will aid the design, execution, reporting, and appraisal of clinical evaluation of AI.
Affiliation(s)
- Seong Ho Park, Kyunghwa Han, Hye Young Jang, Ji Eun Park, June-Goo Lee, Dong Wook Kim, Jaesoon Choi
  - From the Department of Radiology and Research Institute of Radiology (S.H.P., J.E.P., D.W.K.) and Department of Biomedical Engineering (J.C.), Asan Medical Center, University of Ulsan College of Medicine, 88, Olympic-ro 43-gil, Songpa-gu, Seoul 05505, South Korea; Department of Radiology, Research Institute of Radiological Science and Center for Clinical Imaging Data Science, Yonsei University College of Medicine, Seoul, South Korea (K.H.); Department of Radiology, National Cancer Center, Goyang, South Korea (H.Y.J.); and Biomedical Engineering Research Center, Asan Institute for Life Sciences, University of Ulsan College of Medicine, Seoul, South Korea (J.G.L.)
31
Sendak M, Vidal D, Trujillo S, Singh K, Liu X, Balu S. Editorial: Surfacing best practices for AI software development and integration in healthcare. Front Digit Health 2023; 5:1150875. [PMID: 36895323] [PMCID: PMC9989472] [DOI: 10.3389/fdgth.2023.1150875]
Affiliation(s)
- Mark Sendak
  - Duke Institute for Health Innovation, Durham, NC, United States
- Karandeep Singh
  - Division of Nephrology, Department of Internal Medicine, University of Michigan, Ann Arbor, MI, United States
- Xiaoxuan Liu
  - Academic Unit of Ophthalmology, Institute of Inflammation and Ageing, University of Birmingham, Birmingham, United Kingdom
- Suresh Balu
  - Duke Institute for Health Innovation, Durham, NC, United States
32
Adeoye J, Zheng LW, Thomson P, Choi SW, Su YX. Explainable ensemble learning model improves identification of candidates for oral cancer screening. Oral Oncol 2023; 136:106278. [PMID: 36525782] [DOI: 10.1016/j.oraloncology.2022.106278]
Abstract
OBJECTIVES Artificial intelligence could enhance the use of disparate risk factors (crude method) for better stratification of patients to be screened for oral cancer. This study aims to construct a meta-classifier that considers diverse risk factors to identify patients at risk of oral cancer and other suspicious oral diseases for targeted screening. MATERIALS AND METHODS A retrospective dataset from a community oral cancer screening program was used to construct and train the novel voting meta-classifier. Comprehensive risk factor information from this dataset was used as input features for eleven supervised learning algorithms, which served as base learners and provided predicted probabilities that are weighted and aggregated by the meta-classifier. The training dataset was augmented using SMOTE-ENN. Additionally, Shapley additive explanations (SHAP) values were generated to implement the explainability of the model and display the important risk factors. RESULTS Our meta-classifier had an internal validation recall, specificity, and AUROC of 0.83, 0.86, and 0.85 for identifying the risk of oral cancer and 0.92, 0.60, and 0.76 for identifying suspicious oral mucosal disease, respectively. Upon external validation, the meta-classifier had a significantly higher AUROC than the crude/current method used for identifying the risk of oral cancer (0.78 vs 0.46; p = 0.001). Also, the meta-classifier had better recall than the crude method for predicting the risk of suspicious oral mucosal diseases (0.78 vs 0.47). CONCLUSION Overall, these findings show that our approach optimizes the use of risk factors in identifying patients for oral cancer screening, which suggests potential for clinical application.
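The weighted soft-voting design this abstract describes can be sketched with scikit-learn's `VotingClassifier`, which aggregates a weighted average of base learners' predicted probabilities. The example below is a minimal, hypothetical illustration on synthetic data: it uses three base learners instead of the study's eleven, arbitrary weights, and omits the SMOTE-ENN augmentation and SHAP explanations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for per-patient risk-factor features and a binary
# "at risk / refer for screening" label; class weights mimic the
# imbalance typical of screening cohorts.
X, y = make_classification(n_samples=600, n_features=10, n_informative=5,
                           weights=[0.8, 0.2], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Soft voting: the meta-classifier averages the base learners' predicted
# probabilities, weighted per learner (weights here are illustrative).
meta = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("rf", RandomForestClassifier(n_estimators=100,
                                              random_state=0)),
                ("nb", GaussianNB())],
    voting="soft",
    weights=[1.0, 2.0, 1.0],
).fit(X_tr, y_tr)

auroc = roc_auc_score(y_te, meta.predict_proba(X_te)[:, 1])
print(f"internal validation AUROC: {auroc:.2f}")
```

In the study the per-learner weights were presumably tuned rather than fixed, and the held-out AUROC here plays the role of the internal validation metrics quoted above.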
Affiliation(s)
- John Adeoye
  - Division of Oral and Maxillofacial Surgery, Faculty of Dentistry, University of Hong Kong, Hong Kong, China
- Li-Wu Zheng
  - Division of Oral and Maxillofacial Surgery, Faculty of Dentistry, University of Hong Kong, Hong Kong, China
- Peter Thomson
  - College of Medicine and Dentistry, James Cook University, Cairns, Queensland, Australia
- Siu-Wai Choi
  - Division of Oral and Maxillofacial Surgery, Faculty of Dentistry, University of Hong Kong, Hong Kong, China
- Yu-Xiong Su
  - Division of Oral and Maxillofacial Surgery, Faculty of Dentistry, University of Hong Kong, Hong Kong, China
33
Joyce C, Markossian TW, Nikolaides J, Ramsey E, Thompson HM, Rojas JC, Sharma B, Dligach D, Oguss MK, Cooper RS, Afshar M. The Evaluation of a Clinical Decision Support Tool Using Natural Language Processing to Screen Hospitalized Adults for Unhealthy Substance Use: Protocol for a Quasi-Experimental Design. JMIR Res Protoc 2022; 11:e42971. [PMID: 36534461] [PMCID: PMC9808720] [DOI: 10.2196/42971]
Abstract
BACKGROUND Automated and data-driven methods for screening using natural language processing (NLP) and machine learning may replace resource-intensive manual approaches in the usual care of patients hospitalized with conditions related to unhealthy substance use. The rigorous evaluation of tools that use artificial intelligence (AI) is necessary to demonstrate effectiveness before system-wide implementation. An NLP tool to use routinely collected data in the electronic health record was previously validated for diagnostic accuracy in a retrospective study for screening unhealthy substance use. Our next step is a noninferiority design incorporated into a research protocol for clinical implementation with prospective evaluation of clinical effectiveness in a large health system. OBJECTIVE This study aims to provide a study protocol to evaluate health outcomes and the costs and benefits of an AI-driven automated screener compared to manual human screening for unhealthy substance use. METHODS A pre-post design is proposed to evaluate 12 months of manual screening followed by 12 months of automated screening across surgical and medical wards at a single medical center. The preintervention period consists of usual care with manual screening by nurses and social workers and referrals to a multidisciplinary Substance Use Intervention Team (SUIT). Facilitated by an NLP pipeline in the postintervention period, clinical notes from the first 24 hours of hospitalization will be processed and scored by a machine learning model, and the SUIT will be similarly alerted to patients who flagged positive for substance misuse. Flowsheets within the electronic health record have been updated to capture rates of interventions for the primary outcome (brief intervention/motivational interviewing, medication-assisted treatment, naloxone dispensing, and referral to outpatient care).
Effectiveness in terms of patient outcomes will be determined by noninferior rates of interventions (primary outcome), as well as rates of readmission within 6 months, average time to consult, and discharge rates against medical advice (secondary outcomes) in the postintervention period by a SUIT compared to the preintervention period. A separate analysis will be performed to assess the costs and benefits to the health system by using automated screening. Changes from the pre- to postintervention period will be assessed in covariate-adjusted generalized linear mixed-effects models. RESULTS The study will begin in September 2022. Monthly data monitoring and Data Safety Monitoring Board reporting are scheduled every 6 months throughout the study period. We anticipate reporting final results by June 2025. CONCLUSIONS The use of augmented intelligence for clinical decision support is growing with an increasing number of AI tools. We provide a research protocol for prospective evaluation of an automated NLP system for screening unhealthy substance use using a noninferiority design to demonstrate comprehensive screening that may be as effective as manual screening but less costly via automated solutions. TRIAL REGISTRATION ClinicalTrials.gov NCT03833804; https://clinicaltrials.gov/ct2/show/NCT03833804. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) DERR1-10.2196/42971.
Affiliation(s)
- Cara Joyce
  - Department of Computer Science, Loyola University Chicago, Chicago, IL, United States
- Talar W Markossian
  - Department of Public Health Sciences, Loyola University Chicago, Maywood, IL, United States
- Jenna Nikolaides
  - Department of Psychiatry, Rush University Medical Center, Chicago, IL, United States
- Elisabeth Ramsey
  - Department of Psychiatry, Rush University Medical Center, Chicago, IL, United States
- Hale M Thompson
  - Department of Psychiatry, Rush University Medical Center, Chicago, IL, United States
- Juan C Rojas
  - Department of Psychiatry, Rush University Medical Center, Chicago, IL, United States
- Brihat Sharma
  - Department of Psychiatry, Rush University Medical Center, Chicago, IL, United States
- Dmitriy Dligach
  - Department of Computer Science, Loyola University Chicago, Chicago, IL, United States
- Madeline K Oguss
  - Department of Medicine, University of Wisconsin-Madison, Madison, WI, United States
- Richard S Cooper
  - Department of Public Health Sciences, Loyola University Chicago, Maywood, IL, United States
- Majid Afshar
  - Department of Medicine, University of Wisconsin-Madison, Madison, WI, United States
34
Park SH, Choi JI, Fournier L, Vasey B. Randomized Clinical Trials of Artificial Intelligence in Medicine: Why, When, and How? Korean J Radiol 2022; 23:1119-1125. [PMID: 36447410] [PMCID: PMC9747266] [DOI: 10.3348/kjr.2022.0834]
Affiliation(s)
- Seong Ho Park
  - Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
- Joon-Il Choi
  - Department of Radiology, Seoul St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea
- Laure Fournier
  - Department of Radiology, Université Paris Cité, AP-HP, Hôpital Européen Georges Pompidou, PARCC UMRS 970, INSERM, Paris, France
- Baptiste Vasey
  - Nuffield Department of Surgical Sciences, University of Oxford, Oxford, UK
Collapse
|
35
Adeoye J, Akinshipo A, Koohi-Moghadam M, Thomson P, Su YX. Construction of machine learning-based models for cancer outcomes in low and lower-middle income countries: A scoping review. Front Oncol 2022; 12:976168. [DOI: 10.3389/fonc.2022.976168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2022] [Accepted: 11/14/2022] [Indexed: 12/05/2022] Open
Abstract
Background: The impact and utility of machine learning (ML)-based prediction tools for cancer outcomes, including assistive diagnosis, risk stratification, and adjunctive decision-making, have been largely described and realized in high-income and upper-middle-income countries. However, statistical projections estimate higher cancer incidence and mortality risks in low- and lower-middle-income countries (LLMICs). This review therefore aimed to evaluate the utilization, model construction methods, and degree of implementation of ML-based models for cancer outcomes in LLMICs.
Methods: The PubMed/Medline, Scopus, and Web of Science databases were searched, and articles describing the use of ML-based models for cancer among local populations in LLMICs between 2002 and 2022 were included. A total of 140 articles from 22,516 citations met the eligibility criteria and were included in this study.
Results: ML-based models from LLMICs were more often based on traditional ML algorithms than on deep or deep hybrid learning. We found that the construction of ML-based models was skewed toward particular LLMICs such as India, Iran, Pakistan, and Egypt, with a paucity of applications in sub-Saharan Africa. Moreover, models for breast, head and neck, and brain cancer outcomes were explored most frequently. Many models were deemed suboptimal according to the Prediction model Risk of Bias Assessment Tool (PROBAST) owing to sample size constraints and technical flaws in ML modeling, even though their reported performance accuracy ranged from 0.65 to 1.00. While development and internal validation were described for all included models (n=137), only 4.4% (6/137) had been validated in independent cohorts and 0.7% (1/137) had been assessed for clinical impact and efficacy.
Conclusion: Overall, the application of ML for modeling cancer outcomes in LLMICs is increasing; however, model development is largely unsatisfactory. We recommend model retraining using larger sample sizes, intensified external validation practices, and increased impact assessment studies using randomized controlled trial designs.
Systematic review registration: https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=308345, identifier CRD42022308345.
36
Developing robust benchmarks for driving forward AI innovation in healthcare. Nat Mach Intell 2022. [DOI: 10.1038/s42256-022-00559-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
37
Han SS, Navarrete-Dechent C, Liopyris K, Kim MS, Park GH, Woo SS, Park J, Shin JW, Kim BR, Kim MJ, Donoso F, Villanueva F, Ramirez C, Chang SE, Halpern A, Kim SH, Na JI. The degradation of performance of a state-of-the-art skin image classifier when applied to patient-driven internet search. Sci Rep 2022; 12:16260. [PMID: 36171272 PMCID: PMC9519737 DOI: 10.1038/s41598-022-20632-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2022] [Accepted: 09/15/2022] [Indexed: 11/09/2022] Open
Abstract
Model Dermatology ( https://modelderm.com ; Build2021) is a publicly testable neural network that can classify 184 skin disorders. We aimed to investigate whether our algorithm could classify clinical images from an Internet community as well as tertiary care center datasets. Consecutive images from an Internet skin cancer community ('RD' dataset, 1,282 images posted between 25 January 2020 and 30 July 2021; https://reddit.com/r/melanoma ) were analyzed retrospectively, along with hospital datasets (Edinburgh dataset, 1,300 images; SNU dataset, 2,101 images; TeleDerm dataset, 340 consecutive images). The algorithm's performance was equivalent to that of dermatologists on the curated clinical datasets (Edinburgh and SNU). However, its performance deteriorated on the RD and TeleDerm datasets because of insufficient image quality and the presence of out-of-distribution disorders, respectively. On the RD dataset, the algorithm's Top-1/3 accuracy (39.2%/67.2%) and AUC (0.800) were equivalent to those of general physicians (36.8%/52.9%) and more accurate than laypersons using random Internet searches (19.2%/24.4%). The algorithm's Top-1/3 accuracy was affected by inadequate image quality (adequate = 43.2%/71.3% versus inadequate = 32.9%/60.8%), whereas participant performance did not deteriorate (adequate = 35.8%/52.7% vs. inadequate = 38.4%/53.3%). In this report, algorithm performance was significantly affected by a change from the intended setting, implying that AI algorithms that perform at dermatologist level in an in-distribution setting may not show the same level of performance in out-of-distribution settings.
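The Top-1/Top-3 accuracy figures quoted above have a simple operational definition: a prediction counts as correct if the true diagnosis appears among the model's k highest-scoring classes. A minimal sketch of that metric (hypothetical scores and labels for illustration, not the study's data or code):

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # rank class indices by descending score and keep the top k
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)

# Hypothetical class scores for 3 samples over 4 diagnosis classes
scores = [[0.1, 0.6, 0.2, 0.1],
          [0.5, 0.1, 0.3, 0.1],
          [0.2, 0.2, 0.5, 0.1]]
labels = [1, 2, 2]  # true class index per sample

print(top_k_accuracy(scores, labels, 1))  # Top-1: 2 of 3 correct
print(top_k_accuracy(scores, labels, 3))  # Top-3: all 3 correct
```

The gap between Top-1 and Top-3 in the abstract reflects exactly this loosening of the criterion: the correct diagnosis is often near, but not at, the top of the model's ranking.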
Affiliation(s)
- Seung Seog Han: Department of Dermatology, I Dermatology Clinic, Seoul, Korea; IDerma Inc., Seoul, Korea
- Cristian Navarrete-Dechent: Department of Dermatology, School of Medicine, Pontificia Universidad Católica de Chile, Santiago, Chile
- Konstantinos Liopyris: Department of Dermatology, University of Athens, Andreas Syggros Hospital of Skin and Venereal Diseases, Athens, Greece
- Myoung Shin Kim: Department of Dermatology, Sanggye Paik Hospital, Inje University College of Medicine, Seoul, Korea
- Gyeong Hun Park: Department of Dermatology, Dongtan Sacred Heart Hospital, Hallym University College of Medicine, Seoul, Korea
- Sang Seok Woo: Department of Plastic and Reconstructive Surgery, Kangnam Sacred Heart Hospital, Hallym University College of Medicine, 1, Singil-ro, Yeongdeungpo-gu, Seoul, 07441, Korea
- Juhyun Park: Department of Dermatology, Seoul National University Bundang Hospital, 82 Gumi-Ro 173 Beon-Gil, Seongnam, 463-707, Gyeonggi, Korea
- Jung Won Shin: Department of Dermatology, Seoul National University Bundang Hospital, 82 Gumi-Ro 173 Beon-Gil, Seongnam, 463-707, Gyeonggi, Korea
- Bo Ri Kim: Department of Dermatology, Seoul National University Bundang Hospital, 82 Gumi-Ro 173 Beon-Gil, Seongnam, 463-707, Gyeonggi, Korea
- Min Jae Kim: Department of Dermatology, Seoul National University Bundang Hospital, 82 Gumi-Ro 173 Beon-Gil, Seongnam, 463-707, Gyeonggi, Korea
- Francisca Donoso: Department of Dermatology, School of Medicine, Pontificia Universidad Católica de Chile, Santiago, Chile
- Francisco Villanueva: Department of Dermatology, School of Medicine, Pontificia Universidad Católica de Chile, Santiago, Chile
- Cristian Ramirez: Department of Dermatology, School of Medicine, Pontificia Universidad Católica de Chile, Santiago, Chile
- Sung Eun Chang: Department of Dermatology, Asan Medical Center, Ulsan University College of Medicine, Seoul, Korea
- Allan Halpern: Dermatology Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Seong Hwan Kim: Department of Plastic and Reconstructive Surgery, Kangnam Sacred Heart Hospital, Hallym University College of Medicine, 1, Singil-ro, Yeongdeungpo-gu, Seoul, 07441, Korea
- Jung-Im Na: Department of Dermatology, Seoul National University Bundang Hospital, 82 Gumi-Ro 173 Beon-Gil, Seongnam, 463-707, Gyeonggi, Korea
38
Plana D, Shung DL, Grimshaw AA, Saraf A, Sung JJY, Kann BH. Randomized Clinical Trials of Machine Learning Interventions in Health Care: A Systematic Review. JAMA Netw Open 2022; 5:e2233946. [PMID: 36173632 PMCID: PMC9523495 DOI: 10.1001/jamanetworkopen.2022.33946] [Citation(s) in RCA: 47] [Impact Index Per Article: 23.5] [Indexed: 12/16/2022] Open
Abstract
IMPORTANCE Despite the potential of machine learning to improve multiple aspects of patient care, barriers to clinical adoption remain. Randomized clinical trials (RCTs) are often a prerequisite to large-scale clinical adoption of an intervention, and important questions remain regarding how machine learning interventions are being incorporated into clinical trials in health care. OBJECTIVE To systematically examine the design, reporting standards, risk of bias, and inclusivity of RCTs for medical machine learning interventions. EVIDENCE REVIEW In this systematic review, the Cochrane Library, Google Scholar, Ovid Embase, Ovid MEDLINE, PubMed, Scopus, and Web of Science Core Collection online databases were searched, and citation chasing was done to find relevant articles published from the inception of each database to October 15, 2021. Search terms for machine learning, clinical decision-making, and RCTs were used. Exclusion criteria included implementation of a non-RCT design, absence of original data, and evaluation of nonclinical interventions. Data were extracted from published articles. Trial characteristics, including primary intervention, demographics, adherence to the CONSORT-AI reporting guideline, and Cochrane risk of bias, were analyzed. FINDINGS The literature search yielded 19 737 articles, of which 41 were RCTs involving a median of 294 participants (range, 17-2488 participants). A total of 16 RCTs (39%) were published in 2021, 21 (51%) were conducted at single sites, and 15 (37%) involved endoscopy. No trials adhered to all CONSORT-AI standards. Common reasons for nonadherence were not assessing poor-quality or unavailable input data (38 trials [93%]), not analyzing performance errors (38 [93%]), and not including a statement regarding code or algorithm availability (37 [90%]). Overall risk of bias was high in 7 trials (17%). Of the 11 trials (27%) that reported race and ethnicity data, the median proportion of participants from underrepresented minority groups was 21% (range, 0%-51%). CONCLUSIONS AND RELEVANCE This systematic review found that despite the large number of medical machine learning-based algorithms in development, few RCTs for these technologies have been conducted. Among published RCTs, there was high variability in adherence to reporting standards and risk of bias, and a lack of participants from underrepresented minority groups. These findings merit attention and should be considered in future RCT design and reporting.
Affiliation(s)
- Dennis L Shung: Department of Medicine, Yale University, New Haven, Connecticut
- Alyssa A Grimshaw: Harvey Cushing/John Hay Whitney Medical Library, Yale University, New Haven, Connecticut
- Anurag Saraf: Department of Radiation Oncology, Massachusetts General Hospital, Boston, Massachusetts
- Joseph J Y Sung: Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore
- Benjamin H Kann: Artificial Intelligence in Medicine Program, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts
39
Afshar M. To err is machine: Considerations on the clinical impact of machine learning models in patients with unhealthy alcohol use. Alcohol Clin Exp Res 2022; 46:912-914. [PMID: 35429003 DOI: 10.1111/acer.14842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2022] [Revised: 04/07/2022] [Accepted: 04/09/2022] [Indexed: 11/28/2022]
Affiliation(s)
- Majid Afshar: Department of Medicine, School of Medicine and Public Health, University of Wisconsin, Madison, Wisconsin, USA
40
London AJ. Artificial intelligence in medicine: Overcoming or recapitulating structural challenges to improving patient care? Cell Rep Med 2022; 3:100622. [PMID: 35584620 PMCID: PMC9133460 DOI: 10.1016/j.xcrm.2022.100622] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 12/21/2021] [Revised: 02/10/2022] [Accepted: 04/06/2022] [Indexed: 01/09/2023]
Abstract
There is considerable enthusiasm about the prospect that artificial intelligence (AI) will help to improve the safety and efficacy of health services and the efficiency of health systems. To realize this potential, however, AI systems will have to overcome structural problems in the culture and practice of medicine and the organization of health systems that impact the data from which AI models are built, the environments into which they will be deployed, and the practices and incentives that structure their development. This perspective elaborates on some of these structural challenges and provides recommendations to address potential shortcomings.
Affiliation(s)
- Alex John London: Department of Philosophy and Center for Ethics and Policy, Carnegie Mellon University, Pittsburgh, PA 15228, USA
41
Adeoye J, Akinshipo A, Thomson P, Su YX. Artificial intelligence-based prediction for cancer-related outcomes in Africa: Status and potential refinements. J Glob Health 2022; 12:03017. [PMID: 35493779 PMCID: PMC9022723 DOI: 10.7189/jogh.12.03017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open
Affiliation(s)
- John Adeoye: Division of Oral and Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, China; Oral Cancer Research Theme, Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, China
- Abdulwarith Akinshipo: Department of Oral and Maxillofacial Pathology and Biology, Faculty of Dentistry, University of Lagos, Lagos, Nigeria
- Peter Thomson: College of Medicine and Dentistry, James Cook University, Cairns, Queensland, Australia
- Yu-Xiong Su: Division of Oral and Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, China; Oral Cancer Research Theme, Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, China
42
Hersh WR, Cohen AM, Nguyen MM, Bensching KL, Deloughery TG. Clinical study applying machine learning to detect a rare disease: results and lessons learned. JAMIA Open 2022; 5:ooac053. [PMID: 35783073 PMCID: PMC9243401 DOI: 10.1093/jamiaopen/ooac053] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/07/2022] [Revised: 05/06/2022] [Accepted: 06/10/2022] [Indexed: 11/16/2022] Open
Abstract
Machine learning has the potential to improve identification of patients for appropriate diagnostic testing and treatment, including those who have rare diseases for which effective treatments are available, such as acute hepatic porphyria (AHP). We trained a machine learning model on 205 571 complete electronic health records from a single medical center based on 30 known cases to identify 22 patients with classic symptoms of AHP that had neither been diagnosed nor tested for AHP. We offered urine porphobilinogen testing to these patients via their clinicians. Of the 7 who agreed to testing, none were positive for AHP. We explore the reasons for this and provide lessons learned for further work evaluating machine learning to detect AHP and other rare diseases.
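One reason a model's flagged patients can all test negative, as the authors explore, is that base rates work against screening for very rare diseases: even a strong classifier produces mostly false positives when prevalence is tiny. A hedged sketch of the Bayes' rule arithmetic behind this (illustrative sensitivity, specificity, and prevalence values, not figures from the study):

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(disease | model flags patient), computed via Bayes' rule."""
    true_pos = sensitivity * prevalence              # flagged and diseased
    false_pos = (1 - specificity) * (1 - prevalence)  # flagged but healthy
    return true_pos / (true_pos + false_pos)

# Illustrative numbers: a model with 90% sensitivity and 99% specificity
# screening for a disease affecting ~1 in 20,000 people.
ppv = positive_predictive_value(0.90, 0.99, 1 / 20_000)
print(f"{ppv:.4f}")  # roughly 0.0045 -> the vast majority of flags are false positives
```

Under these assumptions fewer than 1 in 200 flagged patients would actually have the disease, which is consistent with the broader lesson that confirmatory testing of model-flagged rare-disease candidates can plausibly yield no positives in a small sample.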
Affiliation(s)
- William R Hersh: Department of Medical Informatics & Clinical Epidemiology, School of Medicine, Oregon Health & Science University, Portland, Oregon, USA
- Aaron M Cohen: Department of Medical Informatics & Clinical Epidemiology, School of Medicine, Oregon Health & Science University, Portland, Oregon, USA
- Michelle M Nguyen: Department of Medical Informatics & Clinical Epidemiology, School of Medicine, Oregon Health & Science University, Portland, Oregon, USA
- Katherine L Bensching: Department of Medicine, School of Medicine, Oregon Health & Science University, Portland, Oregon, USA
- Thomas G Deloughery: Department of Medicine, School of Medicine, Oregon Health & Science University, Portland, Oregon, USA
43
Han SS, Kim YJ, Moon IJ, Jung JM, Lee MY, Lee WJ, Won CH, Lee MW, Kim SH, Navarrete-Dechent C, Chang SE. Evaluation of Artificial Intelligence-assisted Diagnosis of Skin Neoplasms: a single-center, paralleled, unmasked, randomized controlled trial. J Invest Dermatol 2022; 142:2353-2362.e2. [PMID: 35183551 DOI: 10.1016/j.jid.2022.02.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/03/2021] [Revised: 01/26/2022] [Accepted: 02/08/2022] [Indexed: 11/24/2022]
Abstract
A randomized trial (KCT0005614; https://cris.nih.go.kr ) was conducted at a tertiary care institute in South Korea to evaluate whether artificial intelligence (AI) could augment the accuracy of non-expert physicians in a real-world setting that included diverse out-of-distribution conditions. Four non-dermatology trainees and four dermatology residents examined randomly allocated patients with skin lesions suspicious for skin cancer, with or without the real-time assistance of an AI algorithm (https://b2020.modelderm.com#world; convolutional neural networks). Among 576 consecutive cases (Fitzpatrick skin phototypes III or IV) with suspicious lesions out of the initial 603 recruitments, the accuracy of the AI-assisted group (n=295, 53.9%) was significantly higher than that of the unaided group (n=281, 43.8%; P=0.019). The augmentation was most pronounced among the non-dermatology trainees, who had the least experience in dermatology: 54.7% (n=150) with AI assistance versus 30.7% (n=138) without (P<0.0001). The augmentation was not significant among the dermatology residents. The algorithm also helped trainees in the AI-assisted group consider more differential diagnoses than the unaided group (2.09 diagnoses versus 1.95; P=0.0005). In this single-center, unmasked, paralleled, randomized controlled trial, the multiclass AI algorithm augmented the diagnostic accuracy of non-expert physicians in dermatology.
Affiliation(s)
- Seung Seog Han: Department of Dermatology, I Dermatology Clinic, Seoul, Korea; IDerma, Inc., Seoul, Korea
- Young Jae Kim: Department of Dermatology, Asan Medical Center, Ulsan University College of Medicine, Seoul, Korea
- Ik Jun Moon: Department of Dermatology, Asan Medical Center, Ulsan University College of Medicine, Seoul, Korea
- Joon Min Jung: Department of Dermatology, Asan Medical Center, Ulsan University College of Medicine, Seoul, Korea
- Mi Young Lee: Department of Dermatology, Asan Medical Center, Ulsan University College of Medicine, Seoul, Korea
- Woo Jin Lee: Department of Dermatology, Asan Medical Center, Ulsan University College of Medicine, Seoul, Korea
- Chong Hyun Won: Department of Dermatology, Asan Medical Center, Ulsan University College of Medicine, Seoul, Korea
- Mi Woo Lee: Department of Dermatology, Asan Medical Center, Ulsan University College of Medicine, Seoul, Korea
- Seong Hwan Kim: Department of Plastic and Reconstructive Surgery, Kangnam Sacred Heart Hospital, Hallym University College of Medicine, Seoul, Korea
- Cristian Navarrete-Dechent: Department of Dermatology, School of Medicine, Pontificia Universidad Católica de Chile, Santiago, Chile
- Sung Eun Chang: Department of Dermatology, Asan Medical Center, Ulsan University College of Medicine, Seoul, Korea