1. Gomez C, Smith BL, Zayas A, Unberath M, Canares T. Explainable AI decision support improves accuracy during telehealth strep throat screening. Commun Med 2024; 4:149. [PMID: 39048726; PMCID: PMC11269612; DOI: 10.1038/s43856-024-00568-x]
Abstract
BACKGROUND Artificial intelligence-based (AI) clinical decision support systems (CDSS) that use unconventional data, such as smartphone-acquired images, promise transformational opportunities for telehealth, including remote diagnosis. Although such solutions' potential remains largely untapped, providers' trust and understanding are vital for effective adoption. This study examines how different human-AI interaction paradigms affect clinicians' responses to an emerging AI CDSS for detecting streptococcal pharyngitis (strep throat) from smartphone throat images. METHODS In a randomized experiment, we tested explainable AI strategies using three AI-based CDSS prototypes for strep throat prediction. Participants received clinical vignettes via an online survey and were asked to predict the disease state and offer clinical recommendations. The first set of vignettes included a prediction from a validated CDSS (the Modified Centor Score); the second randomly introduced one of the explainable AI prototypes. We used linear models to assess the effect of explainable AI on clinicians' accuracy, confirmatory testing rates, and perceived trust and understanding of the CDSS. RESULTS The study, involving 121 telehealth providers, shows that, compared with the Centor Score, AI-based CDSS can improve clinicians' predictions. Despite higher agreement with the AI, participants reported lower trust in its advice than in the Centor Score, leading to more requests for in-person confirmatory testing. CONCLUSIONS Effectively integrating AI is crucial for the telehealth-based diagnosis of infectious diseases, given the implications of antibiotic over-prescription. We demonstrate that AI-based CDSS can improve the accuracy of remote strep throat screening, yet our findings underscore the necessity of enhancing human-machine collaboration, particularly regarding trust and intelligibility, so that providers and patients can capitalize on AI interventions and smartphones for virtual healthcare.
Affiliation(s)
- Catalina Gomez
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Alisa Zayas
  - Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Mathias Unberath
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
  - Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Therese Canares
  - Division of Pediatric Emergency Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
2. Wang Y, Fu W, Zhang Y, Wang D, Gu Y, Wang W, Xu H, Ge X, Ye C, Fang J, Su L, Wang J, He W, Zhang X, Feng R. Constructing and implementing a performance evaluation indicator set for artificial intelligence decision support systems in pediatric outpatient clinics: an observational study. Sci Rep 2024; 14:14482. [PMID: 38914707; PMCID: PMC11196575; DOI: 10.1038/s41598-024-64893-w]
Abstract
Artificial intelligence (AI) decision support systems in pediatric healthcare have a complex application background. Because an AI decision support system (AI-DSS) can be costly, once deployed it is crucial to monitor its performance, interpret its successes, and update it to ensure continued success. Therefore, a set of evaluation indicators was developed explicitly for AI-DSS in pediatric healthcare, enabling continuous and systematic performance monitoring. The study unfolded in two stages. The first stage established the evaluation indicator set through a literature review, a focus group interview, and expert consultation using the Delphi method. In the second stage, weight analysis was conducted: subjective weights were calculated from expert opinions through the analytic hierarchy process, while objective weights were determined using the entropy weight method. The subjective and objective weights were then synthesized to form the combined weight. In the two rounds of expert consultation, the authority coefficients were 0.834 and 0.846; Kendall's coefficient of concordance was 0.135 in Round 1 and 0.312 in Round 2. The final evaluation indicator set has three first-class indicators, fifteen second-class indicators, and forty-seven third-class indicators. Indicator I-1 (Organizational performance) carries the highest weight, followed by Indicator I-2 (Societal performance) and Indicator I-3 (User experience performance), in both the objective and combined weights. Conversely, 'Societal performance' holds the most weight among the subjective weights, followed by 'Organizational performance' and 'User experience performance'. In this study, a comprehensive and specialized set of evaluation indicators for AI-DSS in the pediatric outpatient clinic was established and then implemented. Continuous evaluation still requires long-term data collection to optimize the weight proportions of the established indicators.
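The weighting scheme this abstract describes (objective entropy weights synthesized with subjective AHP weights) can be sketched in a few lines. The function names and the multiplicative synthesis rule below are illustrative assumptions, one common combination scheme rather than the authors' exact formula:

```python
import math

def entropy_weights(matrix):
    """Objective weights via the entropy weight method.
    matrix: rows = alternatives, columns = positive-valued indicators."""
    n, m = len(matrix), len(matrix[0])
    divergences = []
    for j in range(m):
        col = [row[j] for row in matrix]
        total = sum(col)
        p = [x / total for x in col]
        # Shannon entropy of the column, normalised to [0, 1] by ln(n)
        e = -sum(pi * math.log(pi) for pi in p if pi > 0) / math.log(n)
        divergences.append(1 - e)  # higher divergence -> more informative
    s = sum(divergences)
    return [d / s for d in divergences]

def combine_weights(subjective, objective):
    """Multiplicative synthesis of subjective (e.g. AHP) and objective weights."""
    prod = [s * o for s, o in zip(subjective, objective)]
    total = sum(prod)
    return [p / total for p in prod]
```

An indicator whose column is constant across alternatives receives zero entropy weight; with equal subjective weights, the combined weight reduces to the objective weight.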
Affiliation(s)
- Yingwen Wang
  - Nursing Department, Children's Hospital of Fudan University, Shanghai, 201102, China
- Weijia Fu
  - Medical Information Center, Children's Hospital of Fudan University, Shanghai, 201102, China
- Yuejie Zhang
  - School of Computer Science, Fudan University, Shanghai, 200438, China
- Daoyang Wang
  - School of Public Health, Fudan University, Shanghai, 200032, China
- Ying Gu
  - Nursing Department, Children's Hospital of Fudan University, Shanghai, 201102, China
- Weibing Wang
  - School of Public Health, Fudan University, Shanghai, 200032, China
- Hong Xu
  - Nephrology Department, Children's Hospital of Fudan University, Shanghai, 201102, China
- Xiaoling Ge
  - Statistical and Data Management Center, Children's Hospital of Fudan University, Shanghai, 201102, China
- Chengjie Ye
  - Medical Information Center, Children's Hospital of Fudan University, Shanghai, 201102, China
- Jinwu Fang
  - School of Public Health, Fudan University, Shanghai, 200032, China
- Ling Su
  - Statistical and Data Management Center, Children's Hospital of Fudan University, Shanghai, 201102, China
- Jiayu Wang
  - National Health Commission Key Laboratory of Neonatal Diseases (Fudan University), Children's Hospital of Fudan University, Shanghai, 201102, China
- Wen He
  - Respiratory Department, Children's Hospital of Fudan University, Shanghai, 201102, China
- Xiaobo Zhang
  - Respiratory Department, Children's Hospital of Fudan University, Shanghai, 201102, China
- Rui Feng
  - School of Computer Science, Fudan University, 2005 Songhu Road, Shanghai, 200438, China
3. Scott IA, van der Vegt A, Lane P, McPhail S, Magrabi F. Achieving large-scale clinician adoption of AI-enabled decision support. BMJ Health Care Inform 2024; 31:e100971. [PMID: 38816209; PMCID: PMC11141172; DOI: 10.1136/bmjhci-2023-100971]
Abstract
Computerised decision support (CDS) tools enabled by artificial intelligence (AI) seek to enhance the accuracy and efficiency of clinician decision-making at the point of care. Statistical models developed using machine learning (ML) underpin most current tools. However, despite thousands of models and hundreds of regulator-approved tools internationally, large-scale uptake into routine clinical practice has proved elusive. While underdeveloped system readiness and limited investment in AI/ML within Australia, and perhaps other countries, are impediments, clinician ambivalence towards adopting these tools at scale could be a major inhibitor. We propose a set of principles and several strategic enablers for obtaining broad clinician acceptance of AI/ML-enabled CDS tools.
Affiliation(s)
- Ian A Scott
  - Internal Medicine and Clinical Epidemiology, Princess Alexandra Hospital, Brisbane, Queensland, Australia
  - Centre for Health Services Research, The University of Queensland Faculty of Medicine and Biomedical Sciences, Brisbane, Queensland, Australia
- Anton van der Vegt
  - Digital Health Centre, The University of Queensland Faculty of Medicine and Biomedical Sciences, Herston, Queensland, Australia
- Paul Lane
  - Safety, Quality and Innovation, The Prince Charles Hospital, Brisbane, Queensland, Australia
- Steven McPhail
  - Australian Centre for Health Services Innovation, Queensland University of Technology Faculty of Health, Brisbane, Queensland, Australia
- Farah Magrabi
  - Macquarie University, Sydney, New South Wales, Australia
4. Lam BD, Dodge LE, Zerbey S, Robertson W, Rosovsky RP, Lake L, Datta S, Elavakanar P, Adamski A, Reyes N, Abe K, Vlachos IS, Zwicker JI, Patell R. The potential use of artificial intelligence for venous thromboembolism prophylaxis and management: clinician and healthcare informatician perspectives. Sci Rep 2024; 14:12010. [PMID: 38796561; PMCID: PMC11127994; DOI: 10.1038/s41598-024-62535-9]
Abstract
Venous thromboembolism (VTE) is the leading cause of preventable death in hospitalized patients. Artificial intelligence (AI) and machine learning (ML) can support guidelines recommending an individualized approach to risk assessment and prophylaxis. We conducted electronic surveys asking clinicians and healthcare informaticians about their perspectives on AI/ML for VTE prevention and management. Of 101 respondents to the informatician survey, most were 40 years or older, male, clinicians and data scientists, and had performed research on AI/ML. Of the 607 US-based respondents to the clinician survey, most were 40 years or younger, female, physicians, and had never used AI to inform clinical practice. Most informaticians agreed that AI/ML can be used to manage VTE (56.0%). Over one-third were concerned that clinicians would not use the technology (38.9%), but the majority of clinicians believed that AI/ML probably or definitely can help with VTE prevention (70.1%). The most common concern in both groups was a perceived lack of transparency (informaticians 54.4%; clinicians 25.4%). These two surveys revealed that key stakeholders are interested in AI/ML for VTE prevention and management, and they identified potential barriers to address prior to implementation.
Affiliation(s)
- Barbara D Lam
  - Division of Hematology, Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, 330 Brookline Avenue, Boston, MA, 02215, USA
  - Division of Clinical Informatics, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, USA
- Laura E Dodge
  - Department of Obstetrics and Gynecology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
  - Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Sabrina Zerbey
  - Division of Hematology, Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, 330 Brookline Avenue, Boston, MA, 02215, USA
- William Robertson
  - Weber State University, Ogden, UT, USA
  - National Blood Clot Alliance, Philadelphia, PA, USA
- Rachel P Rosovsky
  - Division of Hematology, Department of Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Siddhant Datta
  - Division of Hospital Medicine, Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
- Pavania Elavakanar
  - Division of Hematology, Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, 330 Brookline Avenue, Boston, MA, 02215, USA
- Alys Adamski
  - Division of Blood Disorders, National Center on Birth Defects and Developmental Disabilities, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Nimia Reyes
  - Division of Blood Disorders, National Center on Birth Defects and Developmental Disabilities, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Karon Abe
  - Division of Blood Disorders, National Center on Birth Defects and Developmental Disabilities, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Ioannis S Vlachos
  - Department of Pathology, Cancer Research Institute, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
- Jeffrey I Zwicker
  - Division of Hematology, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Rushad Patell
  - Division of Hematology, Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, 330 Brookline Avenue, Boston, MA, 02215, USA
5. Scott IA, Zuccon G. The new paradigm in machine learning - foundation models, large language models and beyond: a primer for physicians. Intern Med J 2024; 54:705-715. [PMID: 38715436; DOI: 10.1111/imj.16393]
Abstract
Foundation machine learning models are deep learning models capable of performing many different tasks using different data modalities such as text, audio, images and video. They represent a major shift from traditional task-specific machine learning prediction models. Large language models (LLMs), brought to wide public prominence in the form of ChatGPT, are text-based foundation models that have the potential to transform medicine by enabling automation of a range of tasks, including writing discharge summaries, answering patients' questions and assisting in clinical decision-making. However, such models are not without risk and can potentially cause harm if their development, evaluation and use are devoid of proper scrutiny. This narrative review describes the different types of LLMs, their emerging applications, their potential limitations and biases, and their likely future translation into clinical practice.
Affiliation(s)
- Ian A Scott
  - Centre for Health Services Research, University of Queensland, Woolloongabba, Australia
- Guido Zuccon
  - School of Electrical Engineering and Computer Sciences, The University of Queensland, St Lucia, Queensland, Australia
6. Collins GS, Moons KGM, Dhiman P, Riley RD, Beam AL, Van Calster B, Ghassemi M, Liu X, Reitsma JB, van Smeden M, Boulesteix AL, Camaradou JC, Celi LA, Denaxas S, Denniston AK, Glocker B, Golub RM, Harvey H, Heinze G, Hoffman MM, Kengne AP, Lam E, Lee N, Loder EW, Maier-Hein L, Mateen BA, McCradden MD, Oakden-Rayner L, Ordish J, Parnell R, Rose S, Singh K, Wynants L, Logullo P. TRIPOD+AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ 2024; 385:e078378. [PMID: 38626948; PMCID: PMC11019967; DOI: 10.1136/bmj-2023-078378]
Affiliation(s)
- Gary S Collins
  - Centre for Statistics in Medicine, UK EQUATOR Centre, Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
- Karel G M Moons
  - Julius Centre for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
- Paula Dhiman
  - Centre for Statistics in Medicine, UK EQUATOR Centre, Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
- Richard D Riley
  - Institute of Applied Health Research, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
  - National Institute for Health and Care Research (NIHR) Birmingham Biomedical Research Centre, Birmingham, UK
- Andrew L Beam
  - Department of Epidemiology, Harvard T H Chan School of Public Health, Boston, MA, USA
- Ben Van Calster
  - Department of Development and Regeneration, KU Leuven, Leuven, Belgium
  - Department of Biomedical Data Science, Leiden University Medical Centre, Leiden, Netherlands
- Marzyeh Ghassemi
  - Department of Electrical Engineering and Computer Science, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Xiaoxuan Liu
  - Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
  - University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Johannes B Reitsma
  - Julius Centre for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
- Maarten van Smeden
  - Julius Centre for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
- Anne-Laure Boulesteix
  - Institute for Medical Information Processing, Biometry and Epidemiology, Faculty of Medicine, Ludwig-Maximilians-University of Munich and Munich Centre of Machine Learning, Germany
- Jennifer Catherine Camaradou
  - Patient representative, Health Data Research UK patient and public involvement and engagement group
  - Patient representative, University of East Anglia, Faculty of Health Sciences, Norwich Research Park, Norwich, UK
- Leo Anthony Celi
  - Beth Israel Deaconess Medical Center, Boston, MA, USA
  - Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Department of Biostatistics, Harvard T H Chan School of Public Health, Boston, MA, USA
- Spiros Denaxas
  - Institute of Health Informatics, University College London, London, UK
  - British Heart Foundation Data Science Centre, London, UK
- Alastair K Denniston
  - National Institute for Health and Care Research (NIHR) Birmingham Biomedical Research Centre, Birmingham, UK
  - Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- Ben Glocker
  - Department of Computing, Imperial College London, London, UK
- Robert M Golub
  - Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Georg Heinze
  - Section for Clinical Biometrics, Centre for Medical Data Science, Medical University of Vienna, Vienna, Austria
- Michael M Hoffman
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Vector Institute for Artificial Intelligence, Toronto, ON, Canada
- Emily Lam
  - Patient representative, Health Data Research UK patient and public involvement and engagement group
- Naomi Lee
  - National Institute for Health and Care Excellence, London, UK
- Elizabeth W Loder
  - The BMJ, London, UK
  - Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Lena Maier-Hein
  - Department of Intelligent Medical Systems, German Cancer Research Centre, Heidelberg, Germany
- Bilal A Mateen
  - Institute of Health Informatics, University College London, London, UK
  - Wellcome Trust, London, UK
  - Alan Turing Institute, London, UK
- Melissa D McCradden
  - Department of Bioethics, Hospital for Sick Children, Toronto, ON, Canada
  - Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, Canada
- Lauren Oakden-Rayner
  - Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA, Australia
- Johan Ordish
  - Medicines and Healthcare products Regulatory Agency, London, UK
- Richard Parnell
  - Patient representative, Health Data Research UK patient and public involvement and engagement group
- Sherri Rose
  - Department of Health Policy and Center for Health Policy, Stanford University, Stanford, CA, USA
- Karandeep Singh
  - Department of Epidemiology, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, Netherlands
- Laure Wynants
  - Department of Epidemiology, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, Netherlands
- Patricia Logullo
  - Centre for Statistics in Medicine, UK EQUATOR Centre, Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
7. Wong CYT, O'Byrne C, Taribagil P, Liu T, Antaki F, Keane PA. Comparing code-free and bespoke deep learning approaches in ophthalmology. Graefes Arch Clin Exp Ophthalmol 2024. [PMID: 38446200; DOI: 10.1007/s00417-024-06432-x]
Abstract
AIM Code-free deep learning (CFDL) allows clinicians without coding expertise to build high-quality artificial intelligence (AI) models without writing code. In this review, we comprehensively examine the advantages that CFDL offers over bespoke, expert-designed deep learning (DL). As exemplars, we use the following tasks: (1) diabetic retinopathy screening, (2) retinal multi-disease classification, (3) surgical video classification, (4) oculomics and (5) resource management. METHODS We searched MEDLINE (through PubMed) for studies reporting CFDL applications in ophthalmology, from inception to June 25, 2023, using the keywords 'autoML' AND 'ophthalmology'. After identifying 5 CFDL studies addressing our target tasks, we performed a subsequent search for corresponding bespoke DL studies focused on the same tasks. Only English-language articles with full text available were included. Reviews, editorials, protocols, and case reports or case series were excluded. We identified ten relevant studies for this review. RESULTS Overall, studies were optimistic about CFDL's advantages over bespoke DL in the five ophthalmological tasks, but much of this discussion was one-dimensional and left wide applicability gaps. A high-quality assessment of when CFDL is preferable to bespoke DL warrants a context-specific, weighted assessment of clinician intent, patient acceptance and cost-effectiveness. We conclude that CFDL and bespoke DL each have unique assets and that neither can replace the other; their benefits must be weighed case by case. Future studies are warranted to perform a multidimensional analysis of both techniques and to address the limitations of suboptimal dataset quality, poor applicability and non-regulated study designs. CONCLUSION For clinicians without DL expertise or easy access to AI experts, CFDL allows the prototyping of novel clinical AI systems. CFDL models can complement bespoke models, depending on the task at hand; a multidimensional, weighted evaluation of the factors involved in implementing those models for a given task is warranted.
Affiliation(s)
- Carolyn Yu Tung Wong
  - Institute of Ophthalmology, University College London, 11-43 Bath St, London, EC1V 9EL, UK
  - Moorfields Eye Hospital NHS Foundation Trust, London, UK
  - Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong SAR, China
- Ciara O'Byrne
  - Institute of Ophthalmology, University College London, 11-43 Bath St, London, EC1V 9EL, UK
  - Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Priyal Taribagil
  - Institute of Ophthalmology, University College London, 11-43 Bath St, London, EC1V 9EL, UK
  - Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Timing Liu
  - Institute of Ophthalmology, University College London, 11-43 Bath St, London, EC1V 9EL, UK
  - Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Fares Antaki
  - Institute of Ophthalmology, University College London, 11-43 Bath St, London, EC1V 9EL, UK
  - Moorfields Eye Hospital NHS Foundation Trust, London, UK
  - The CHUM School of Artificial Intelligence in Healthcare, Montreal, QC, Canada
- Pearse Andrew Keane
  - Institute of Ophthalmology, University College London, 11-43 Bath St, London, EC1V 9EL, UK
  - Moorfields Eye Hospital NHS Foundation Trust, London, UK
  - NIHR Moorfields Biomedical Research Centre, London, UK
8. Wu DY, Fang YV, Vo DT, Spangler A, Seiler SJ. Detailed Image Data Quality and Cleaning Practices for Artificial Intelligence Tools for Breast Cancer. JCO Clin Cancer Inform 2024; 8:e2300074. [PMID: 38552191; PMCID: PMC10994436; DOI: 10.1200/cci.23.00074]
Abstract
Standardizing image-data preparation practices can improve the accuracy and consistency of AI diagnostic tools for breast cancer.
Affiliation(s)
- Dolly Y. Wu
  - Volunteer Services, UT Southwestern Medical Center, Dallas, TX
- Yisheng V. Fang
  - Department of Pathology, UT Southwestern Medical Center, Dallas, TX
- Dat T. Vo
  - Department of Radiation Oncology, UT Southwestern Medical Center, Dallas, TX
- Ann Spangler
  - Retired, Department of Radiation Oncology, UT Southwestern Medical Center, Dallas, TX
9. Bräuner KB, Tsouchnika A, Mashkoor M, Williams R, Rosen AW, Hartwig MFS, Bulut M, Dohrn N, Rijnbeek P, Gögenur I. Prediction of 30-day, 90-day, and 1-year mortality after colorectal cancer surgery using a data-driven approach. Int J Colorectal Dis 2024; 39:31. [PMID: 38421482; PMCID: PMC10904562; DOI: 10.1007/s00384-024-04607-w]
Abstract
PURPOSE To develop prediction models for short-term mortality risk assessment following colorectal cancer surgery. METHODS Data were harmonized from four Danish observational health databases into the Observational Medical Outcomes Partnership Common Data Model. Using a data-driven approach with Least Absolute Shrinkage and Selection Operator (LASSO) logistic regression on preoperative data, we developed 30-day, 90-day, and 1-year mortality prediction models. We assessed discriminative performance using the areas under the receiver operating characteristic and precision-recall curves, and calibration using the calibration slope, intercept, and calibration-in-the-large. We additionally assessed model performance in subgroups of curative, palliative, elective, and emergency surgery. RESULTS A total of 57,521 patients were included in the study population, 51.1% male and with a median age of 72 years. The models showed good discrimination, with areas under the receiver operating characteristic curve of 0.88, 0.878, and 0.861 for 30-day, 90-day, and 1-year mortality, respectively, and calibration-in-the-large values of 1.01, 0.99, and 0.99. The overall incidence of mortality was 4.48% at 30 days, 6.64% at 90 days, and 12.8% at 1 year. Subgroup analysis showed no improvement in discrimination or calibration when separating the cohort into elective, emergency, curative, and palliative surgery. CONCLUSION We trained prediction models for short-term mortality risk on a data set of four combined national health databases, with good discrimination and calibration. A single cohort including all operated patients yielded better-performing models than cohorts based on subgroups.
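The discrimination and calibration measures reported in this abstract can be illustrated with a minimal sketch. The AUROC is computed by pairwise rank comparison, and calibration-in-the-large is taken here as the observed-to-expected event ratio, an assumption about the exact definition the authors used; the toy data are invented:

```python
def auroc(y_true, y_score):
    """Area under the ROC curve via pairwise rank comparison:
    the probability that a random event outranks a random non-event."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def calibration_in_the_large(y_true, y_score):
    """Observed-to-expected event ratio; values near 1.0 indicate
    good overall calibration (one common operationalisation)."""
    return sum(y_true) / sum(y_score)

# Toy example: four patients with predicted 30-day mortality risks
y = [0, 0, 1, 1]
p = [0.10, 0.40, 0.35, 0.80]
print(auroc(y, p))  # 0.75
print(calibration_in_the_large(y, p))
```

A calibration-in-the-large above 1.0, as in the 30-day model (1.01), means slightly more deaths were observed than the model predicted on average.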
Affiliation(s)
- Karoline Bendix Bräuner
  - Center for Surgical Science, Zealand University Hospital, Køge, Lykkebækvej 1, 4600, Køge, Denmark
- Andi Tsouchnika
  - Center for Surgical Science, Zealand University Hospital, Køge, Lykkebækvej 1, 4600, Køge, Denmark
- Maliha Mashkoor
  - Center for Surgical Science, Zealand University Hospital, Køge, Lykkebækvej 1, 4600, Køge, Denmark
- Ross Williams
  - Department of Medical Informatics, Erasmus University Medical Center, Doctor Molewaterplein 40, 3015 GD, Rotterdam, Netherlands
- Andreas Weinberger Rosen
  - Center for Surgical Science, Zealand University Hospital, Køge, Lykkebækvej 1, 4600, Køge, Denmark
- Mustafa Bulut
  - Center for Surgical Science, Zealand University Hospital, Køge, Lykkebækvej 1, 4600, Køge, Denmark
  - University of Copenhagen, The Faculty of Health Science, Blegdamsvej 6, 2200, Copenhagen N, Denmark
- Niclas Dohrn
  - Center for Surgical Science, Zealand University Hospital, Køge, Lykkebækvej 1, 4600, Køge, Denmark
  - Department of Surgery, Copenhagen University Hospital, Herlev & Gentofte, Borgmester Ib Juuls vej 1, 2730, Herlev, Denmark
- Peter Rijnbeek
  - Department of Medical Informatics, Erasmus University Medical Center, Doctor Molewaterplein 40, 3015 GD, Rotterdam, Netherlands
- Ismail Gögenur
  - Center for Surgical Science, Zealand University Hospital, Køge, Lykkebækvej 1, 4600, Køge, Denmark
  - University of Copenhagen, The Faculty of Health Science, Blegdamsvej 6, 2200, Copenhagen N, Denmark
10. Cai Y, Cai YQ, Tang LY, Wang YH, Gong M, Jing TC, Li HJ, Li-Ling J, Hu W, Yin Z, Gong DX, Zhang GW. Artificial intelligence in the risk prediction models of cardiovascular disease and development of an independent validation screening tool: a systematic review. BMC Med 2024; 22:56. [PMID: 38317226; PMCID: PMC10845808; DOI: 10.1186/s12916-024-03273-7]
Abstract
BACKGROUND A comprehensive overview of artificial intelligence (AI) for cardiovascular disease (CVD) prediction and a screening tool of AI models (AI-Ms) for independent external validation are lacking. This systematic review aims to identify, describe, and appraise AI-Ms of CVD prediction in the general and special populations and develop a new independent validation score (IVS) for AI-Ms replicability evaluation. METHODS PubMed, Web of Science, Embase, and IEEE library were searched up to July 2021. Data extraction and analysis were performed for the populations, distribution, predictors, algorithms, etc. The risk of bias was evaluated with the prediction risk of bias assessment tool (PROBAST). Subsequently, we designed IVS for model replicability evaluation with five steps in five items, including transparency of algorithms, performance of models, feasibility of reproduction, risk of reproduction, and clinical implication, respectively. The review is registered in PROSPERO (No. CRD42021271789). RESULTS In 20,887 screened references, 79 articles (82.5% in 2017-2021) were included, which contained 114 datasets (67 in Europe and North America, but 0 in Africa). We identified 486 AI-Ms, of which the majority were in development (n = 380), but none of them had undergone independent external validation. A total of 66 idiographic algorithms were found; however, 36.4% were used only once and only 39.4% over three times. A large number of different predictors (range 5-52,000, median 21) and large-span sample size (range 80-3,660,000, median 4466) were observed. All models were at high risk of bias according to PROBAST, primarily due to the incorrect use of statistical methods. IVS analysis confirmed only 10 models as "recommended"; however, 281 and 187 were "not recommended" and "warning," respectively. 
CONCLUSION AI has led the digital revolution in the field of CVD prediction, but the field remains at an early stage of development, owing to defects in research design, reporting, and evaluation systems. The IVS we developed may contribute to independent external validation and to the development of this field.
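The five-item structure of the IVS described above lends itself to a simple checklist scorer. The sketch below is illustrative only: the item names follow the abstract, but the one-point-per-item scoring rule and the verdict cut-offs are hypothetical assumptions, not the published IVS weighting.

```python
# Illustrative five-item replicability checklist in the spirit of the IVS.
# Item names follow the abstract; the scoring rule and thresholds are invented.

IVS_ITEMS = [
    "transparency_of_algorithms",
    "performance_of_models",
    "feasibility_of_reproduction",
    "risk_of_reproduction",
    "clinical_implication",
]

def ivs_score(assessment: dict) -> int:
    """Count how many of the five checklist items a model satisfies."""
    return sum(1 for item in IVS_ITEMS if assessment.get(item, False))

def ivs_verdict(score: int) -> str:
    """Map a score to one of the review's three verdicts; cut-offs are illustrative."""
    if score >= 4:
        return "recommended"
    if score >= 2:
        return "warning"
    return "not recommended"
```

A reviewer would fill the assessment dict per model and read off the verdict; the real IVS applies its five steps sequentially rather than as a flat sum.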
Affiliation(s)
- Yue Cai
- China Medical University, Shenyang, 110122, China
- Yu-Qing Cai
- China Medical University, Shenyang, 110122, China
- Li-Ying Tang
- China Medical University, Shenyang, 110122, China
- Yi-Han Wang
- China Medical University, Shenyang, 110122, China
- Mengchun Gong
- Digital Health China Co. Ltd, Beijing, 100089, China
- Tian-Ci Jing
- Smart Hospital Management Department, the First Hospital of China Medical University, Shenyang, 110001, China
- Hui-Jun Li
- Shenyang Medical & Film Science and Technology Co. Ltd., Shenyang, 110001, China
- Enduring Medicine Smart Innovation Research Institute, Shenyang, 110001, China
- Jesse Li-Ling
- Institute of Genetic Medicine, School of Life Science, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, 610065, China
- Wei Hu
- Bayi Orthopedic Hospital, Chengdu, 610017, China
- Zhihua Yin
- Department of Epidemiology, School of Public Health, China Medical University, Shenyang, 110122, China
- Da-Xin Gong
- Smart Hospital Management Department, the First Hospital of China Medical University, Shenyang, 110001, China
- The Internet Hospital Branch of the Chinese Research Hospital Association, Beijing, 100006, China
- Guang-Wei Zhang
- Smart Hospital Management Department, the First Hospital of China Medical University, Shenyang, 110001, China
- The Internet Hospital Branch of the Chinese Research Hospital Association, Beijing, 100006, China
11
Chang RSK, Nguyen S, Chen Z, Foster E, Kwan P. Role of machine learning in the management of epilepsy: a systematic review protocol. BMJ Open 2024; 14:e079785. [PMID: 38272549] [PMCID: PMC10823996] [DOI: 10.1136/bmjopen-2023-079785]
Abstract
INTRODUCTION Machine learning is a rapidly expanding field and is already incorporated into many aspects of medicine, including diagnostics, prognostication, and clinical decision-support tools. Epilepsy is a common and disabling neurological disorder; however, its management remains challenging in many cases despite expanding therapeutic options. We present a systematic review protocol to explore the role of machine learning in the management of epilepsy. METHODS AND ANALYSIS This protocol has been drafted with reference to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) for Protocols. A literature search will be conducted in databases including MEDLINE, Embase, Scopus, and Web of Science. A PRISMA flow chart will be constructed to summarise the study workflow. As the scope of this review is the clinical application of machine learning, the selection of papers will focus on studies directly related to clinical decision-making in the management of epilepsy, specifically the prediction of response to antiseizure medications, development of drug-resistant epilepsy, and epilepsy surgery and neuromodulation outcomes. Data will be extracted following the CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies (CHARMS). The Prediction model Risk Of Bias ASsessment Tool (PROBAST) will be used for the quality assessment of the included studies. Syntheses of quantitative data will be presented in narrative format. ETHICS AND DISSEMINATION As this study is a systematic review that does not involve patients or animals, ethics approval is not required. The results of the systematic review will be submitted to peer-reviewed journals for publication and presented at academic conferences. PROSPERO REGISTRATION NUMBER CRD42023442156.
Affiliation(s)
- Richard Shek-Kwan Chang
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, Victoria, Australia
- Shani Nguyen
- Monash University Faculty of Medicine Nursing and Health Sciences, Melbourne, Victoria, Australia
- Zhibin Chen
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, Victoria, Australia
- Emma Foster
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, Victoria, Australia
- Patrick Kwan
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, Victoria, Australia
12
Bacchi S, Kovoor J, Gupta A, Chan W. Should this artificial intelligence algorithm be used in my practice now? A checklist approach. Clin Exp Ophthalmol 2024; 52:123-125. [PMID: 38220471] [DOI: 10.1111/ceo.14307]
Affiliation(s)
- Stephen Bacchi
- Flinders Medical Centre, Bedford Park, South Australia, Australia
- College of Medicine and Public Health, Flinders University of South Australia, Australia
- University of Adelaide, Adelaide, South Australia, Australia
- Royal Adelaide Hospital, Adelaide, South Australia, Australia
- Joshua Kovoor
- University of Adelaide, Adelaide, South Australia, Australia
- Royal Adelaide Hospital, Adelaide, South Australia, Australia
- Aashray Gupta
- University of Adelaide, Adelaide, South Australia, Australia
- Gold Coast University Hospital, Southport, Queensland, Australia
- WengOnn Chan
- University of Adelaide, Adelaide, South Australia, Australia
- Royal Adelaide Hospital, Adelaide, South Australia, Australia
13
Rahrooh A, Garlid AO, Bartlett K, Coons W, Petousis P, Hsu W, Bui AAT. Towards a framework for interoperability and reproducibility of predictive models. J Biomed Inform 2024; 149:104551. [PMID: 38000765] [DOI: 10.1016/j.jbi.2023.104551]
Abstract
The development and deployment of machine learning (ML) models for biomedical research and healthcare currently lacks standard methodologies. Although tools for model replication are numerous, without a unifying blueprint it remains difficult to scientifically reproduce predictive ML models for any number of reasons (e.g., assumptions regarding data distributions and preprocessing, unclear test metrics, etc.) and ultimately, questions around generalizability and transportability are not readily answered. To facilitate scientific reproducibility, we built upon the Predictive Model Markup Language (PMML) to capture essential information. As a key component of the PREdictive Model Index and Exchange REpository (PREMIERE) platform, we present the Automated Metadata Pipeline (AMP) for conversion of a given predictive ML model into an extended PMML file that autocompletes an ML-based checklist, assessing model elements for interoperability and reproducibility. We demonstrate this pipeline on multiple test cases with three different ML algorithms and health-related datasets, providing a foundation for future predictive model reproducibility, sharing, and comparison.
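The idea above of capturing model metadata in a PMML-style XML file can be sketched with the standard library alone. PMML does define `DataDictionary`, `DataField`, and `Extension` elements, but the exact layout below, the `Model` and `Metric` names, and the checklist content are illustrative placeholders, not the actual AMP/PREMIERE schema.

```python
# Sketch of serializing model metadata into a PMML-like XML document.
# Element/attribute choices are illustrative, not the real AMP/PREMIERE format.
import xml.etree.ElementTree as ET

def model_to_pmml_like(name, algorithm, features, metrics):
    root = ET.Element("PMML", version="4.4")
    model = ET.SubElement(root, "Model", name=name, algorithm=algorithm)
    # Input fields, loosely following PMML's DataDictionary idea.
    dd = ET.SubElement(model, "DataDictionary")
    for feature in features:
        ET.SubElement(dd, "DataField", name=feature, optype="continuous")
    # PMML's Extension element is its designated hook for custom content,
    # e.g. a reproducibility checklist auto-filled by a metadata pipeline.
    ext = ET.SubElement(model, "Extension", name="reproducibility-checklist")
    for key, value in metrics.items():
        ET.SubElement(ext, "Metric", name=key, value=str(value))
    return ET.tostring(root, encoding="unicode")

xml_doc = model_to_pmml_like(
    "risk-model", "logistic_regression",
    features=["age", "temperature"], metrics={"auc": 0.81},
)
```

Round-tripping such a file through a parser is what lets a checklist be verified automatically rather than by hand.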
Affiliation(s)
- Al Rahrooh
- Medical & Imaging Informatics (MII) Group, University of California Los Angeles (UCLA), Los Angeles, CA, USA.
- Anders O Garlid
- Medical & Imaging Informatics (MII) Group, University of California Los Angeles (UCLA), Los Angeles, CA, USA
- Kelly Bartlett
- Medical & Imaging Informatics (MII) Group, University of California Los Angeles (UCLA), Los Angeles, CA, USA
- Warren Coons
- Medical & Imaging Informatics (MII) Group, University of California Los Angeles (UCLA), Los Angeles, CA, USA
- Panayiotis Petousis
- Clinical and Translational Science Institute (CTSI), University of California Los Angeles (UCLA), Los Angeles, CA, USA
- William Hsu
- Medical & Imaging Informatics (MII) Group, University of California Los Angeles (UCLA), Los Angeles, CA, USA
- Alex A T Bui
- Medical & Imaging Informatics (MII) Group, University of California Los Angeles (UCLA), Los Angeles, CA, USA; Clinical and Translational Science Institute (CTSI), University of California Los Angeles (UCLA), Los Angeles, CA, USA
14
Zantvoort K, Scharfenberger J, Boß L, Lehr D, Funk B. Finding the Best Match - a Case Study on the (Text-)Feature and Model Choice in Digital Mental Health Interventions. J Healthc Inform Res 2023; 7:447-479. [PMID: 37927375] [PMCID: PMC10620349] [DOI: 10.1007/s41666-023-00148-z]
Abstract
With the need for psychological help long exceeding the supply, finding ways of scaling and better allocating mental health support is a necessity. This paper contributes by investigating how best to predict intervention dropout and failure to allow for a need-based adaptation of treatment. We systematically compare the predictive power of different text representation methods (metadata, TF-IDF, sentiment and topic analysis, and word embeddings) in combination with supplementary numerical inputs (socio-demographic, evaluation, and closed-question data). Additionally, we address the research gap of which ML model types - ranging from linear to sophisticated deep learning models - are best suited for different features and outcome variables. To this end, we analyze nearly 16,000 open-text answers from 849 German-speaking users in a Digital Mental Health Intervention (DMHI) for stress. Our research shows that - contrary to previous findings - there is great promise in using neural network approaches on DMHI text data. We propose a task-specific LSTM-based model architecture to tackle the challenge of long input sequences and thereby demonstrate the potential of word embeddings (AUC scores of up to 0.7) for predictions in DMHIs. Despite the relatively small data set, sequential deep learning models, on average, outperform simpler features such as metadata and bag-of-words approaches when predicting dropout. The conclusion is that user-generated text from the first two sessions carries predictive power regarding patients' dropout and intervention failure risk. Furthermore, the match between the sophistication of features and models needs to be closely considered to optimize results, and additional non-text features increase prediction results. Supplementary Information The online version contains supplementary material available at 10.1007/s41666-023-00148-z.
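Of the text representations compared above, TF-IDF is the simplest to reproduce. The dependency-free sketch below uses raw term counts with a smoothed inverse document frequency; a real pipeline would more likely use a library implementation such as scikit-learn's TfidfVectorizer, and the toy corpus here is invented.

```python
# Minimal TF-IDF sketch: per-document {term: weight} dicts with smoothed idf.
import math
from collections import Counter

def tfidf(corpus):
    """Return one {term: tf * idf} dict per document."""
    tokenized = [doc.lower().split() for doc in corpus]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in tokenized for term in set(doc))
    # Smoothed idf, so terms present in every document still get weight > 0.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in df}
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)
        vectors.append({t: tf[t] * idf[t] for t in tf})
    return vectors

docs = ["i feel stressed at work", "work was fine today", "i slept fine"]
vecs = tfidf(docs)
```

Rarer terms ("stressed") outweigh common ones ("work"), which is exactly the property that makes TF-IDF a useful baseline against learned embeddings.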
Affiliation(s)
- Kirsten Zantvoort
- Institute of Information Systems, Leuphana University, Lüneburg, Germany
- Leif Boß
- Institute of Psychology, Leuphana University, Lüneburg, Germany
- Dirk Lehr
- Institute of Psychology, Leuphana University, Lüneburg, Germany
- Burkhardt Funk
- Institute of Information Systems, Leuphana University, Lüneburg, Germany
15
McFadden BR, Reynolds M, Inglis TJJ. Developing machine learning systems worthy of trust for infection science: a requirement for future implementation into clinical practice. Front Digit Health 2023; 5:1260602. [PMID: 37829595] [PMCID: PMC10565494] [DOI: 10.3389/fdgth.2023.1260602]
Abstract
Infection science is a discipline of healthcare which includes clinical microbiology, public health microbiology, mechanisms of microbial disease, and antimicrobial countermeasures. Its importance has become more apparent in recent years during the SARS-CoV-2 (COVID-19) pandemic, which highlighted the critical operational domains within infection science - the hospital, clinical laboratory, and public health environments - for preventing, managing, and treating infectious diseases. However, as the global community transitions beyond the pandemic, the importance of infection science remains, with emerging infectious diseases, bloodstream infections, sepsis, and antimicrobial resistance becoming increasingly significant contributors to the global burden of disease. Machine learning (ML) is frequently applied in healthcare and medical domains, and there is growing interest in applying ML techniques to problems in infection science. This has the potential to address several key aspects, including improving patient outcomes, optimising workflows in the clinical laboratory, and supporting the management of public health. However, despite promising results, the implementation of ML into clinical practice and workflows is limited. Enabling the migration of ML models from the research to the real-world environment requires the development of trustworthy ML systems that support the requirements of users, stakeholders, and regulatory agencies. This paper provides readers with a brief introduction to infection science, outlines the principles of trustworthy ML systems, provides examples of the application of these principles in infection science, and proposes future directions for moving towards the development of trustworthy ML systems in infection science.
Affiliation(s)
- Benjamin R. McFadden
- School of Physics, Mathematics and Computing, University of Western Australia, Perth, WA, Australia
- Mark Reynolds
- School of Physics, Mathematics and Computing, University of Western Australia, Perth, WA, Australia
- Timothy J. J. Inglis
- Western Australian Country Health Service, Perth, WA, Australia
- School of Medicine, University of Western Australia, Perth, WA, Australia
- Department of Microbiology, Pathwest Laboratory Medicine, Perth, WA, Australia
16
Zmudzki F, Smeets RJEM. Machine learning clinical decision support for interdisciplinary multimodal chronic musculoskeletal pain treatment. Front Pain Res 2023; 4:1177070. [PMID: 37228809] [PMCID: PMC10203229] [DOI: 10.3389/fpain.2023.1177070]
Abstract
Introduction Chronic musculoskeletal pain is a prevalent condition impacting around 20% of people globally, leaving patients living with pain, fatigue, restricted social and employment capacity, and reduced quality of life. Interdisciplinary multimodal pain treatment programs have been shown to provide positive outcomes by supporting patients in modifying their behavior and improving pain management, focusing attention on specific patient-valued goals rather than on fighting pain. Methods Given the complex nature of chronic pain, there is no single clinical measure to assess outcomes from multimodal pain programs. Using Centre for Integral Rehabilitation data from 2019-2021 (n = 2,364), we developed a multidimensional machine learning framework of 13 outcome measures across 5 clinically relevant domains: activity/disability, pain, fatigue, coping, and quality of life. Machine learning models for each endpoint were trained separately using the 30 most important of 55 demographic and baseline variables, selected by minimum redundancy maximum relevance feature selection. Five-fold cross-validation identified the best-performing algorithms, which were rerun on deidentified source data to verify prognostic accuracy. Results Individual algorithm performance ranged from 0.49 to 0.65 AUC, reflecting characteristic outcome variation across patients and unbalanced training data, with high positive proportions of up to 86% for some measures. As expected, no single outcome provided a reliable indicator; however, the complete set of algorithms established a stratified prognostic patient profile. Patient-level validation achieved consistent prognostic assessment of outcomes for 75.3% of the study group (n = 1,953). Clinician review of a sample of predicted-negative patients (n = 81) independently confirmed algorithm accuracy and suggests the prognostic profile is potentially valuable for patient selection and goal setting.
Discussion These results indicate that although no single algorithm was individually conclusive, the complete stratified profile consistently identified patient outcomes. The predictive profile offers a promising contribution for clinicians and patients, assisting with personalized assessment and goal setting, program engagement, and improved patient outcomes.
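The per-algorithm AUC figures reported above can be computed with the rank-based (Mann-Whitney) estimator: the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counting half. A minimal sketch follows; production code would normally use a library routine such as sklearn.metrics.roc_auc_score, and the example labels and scores are invented.

```python
# Rank-based AUC estimator: P(score_pos > score_neg), ties count 0.5.

def auc(y_true, y_score):
    """Mann-Whitney AUC over binary labels (1 = positive, 0 = negative)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    # Compare every positive score against every negative score.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC near 0.5 (as for some endpoints above) means the scores barely separate positives from negatives, which is why the study leans on the full profile of models rather than any single one.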
Affiliation(s)
- Fredrick Zmudzki
- Époque Consulting, Sydney, NSW, Australia
- Social Policy Research Centre, University of New South Wales, Sydney, NSW, Australia
- Rob J. E. M. Smeets
- Department of Rehabilitation Medicine, Care and Public Health Research Institute (CAPHRI), Faculty of Health, Life Sciences and Medicine, Maastricht University, Maastricht, Netherlands
- CIR Rehabilitation, Eindhoven, Netherlands
- Pain in Motion International Research Group (PiM), Brussels, Belgium
17
Fraser AG, Biasin E, Bijnens B, Bruining N, Caiani EG, Cobbaert K, Davies RH, Gilbert SH, Hovestadt L, Kamenjasevic E, Kwade Z, McGauran G, O'Connor G, Vasey B, Rademakers FE. Artificial intelligence in medical device software and high-risk medical devices - a review of definitions, expert recommendations and regulatory initiatives. Expert Rev Med Devices 2023; 20:467-491. [PMID: 37157833] [DOI: 10.1080/17434440.2023.2184685]
Abstract
INTRODUCTION Artificial intelligence (AI) encompasses a wide range of algorithms that carry risks when used to support decisions about diagnosis or treatment, so professional and regulatory bodies are recommending how they should be managed. AREAS COVERED AI systems may qualify as standalone medical device software (MDSW) or be embedded within a medical device. Within the European Union (EU), AI software must undergo a conformity assessment procedure to be approved as a medical device. The draft EU Regulation on AI proposes rules that will apply across industry sectors, while for devices the Medical Device Regulation also applies. In the CORE-MD project (Coordinating Research and Evidence for Medical Devices), we have surveyed definitions and summarize initiatives made by professional consensus groups, regulators, and standardization bodies. EXPERT OPINION The level of clinical evidence required should be determined according to each application and to the legal and methodological factors that contribute to risk, including accountability, transparency, and interpretability. EU guidance for MDSW based on international recommendations does not yet describe the clinical evidence needed for medical AI software. Regulators, notified bodies, manufacturers, clinicians, and patients would all benefit from common standards for the clinical evaluation of high-risk AI applications and from transparency of their evidence and performance.
Affiliation(s)
- Alan G Fraser
- University Hospital of Wales, School of Medicine, Cardiff University, Heath Park, Cardiff, U.K
- KU Leuven, Leuven, Belgium
- Bart Bijnens
- Engineering Sciences, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain
- Nico Bruining
- Department of Clinical and Experimental Information processing (Digital Cardiology), Erasmus Medical Center, Thoraxcenter, Rotterdam, the Netherlands
- Enrico G Caiani
- Department of Electronics, Information and Biomedical Engineering, Politecnico di Milano, Milan, Italy
- Rhodri H Davies
- Institute of Cardiovascular Science, University College London, London, U.K
- Stephen H Gilbert
- Technische Universität Dresden, Else Kröner Fresenius Center for Digital Health, Dresden, Germany
- Baptiste Vasey
- Nuffield Department of Surgical Sciences, University of Oxford, Oxford, UK
18
Ledziński Ł, Grześk G. Artificial Intelligence Technologies in Cardiology. J Cardiovasc Dev Dis 2023; 10:202. [DOI: 10.3390/jcdd10050202]
Abstract
As the world produces exabytes of data, there is a growing need to find new methods that are more suitable for dealing with complex datasets. Artificial intelligence (AI) has significant potential to impact the healthcare industry, which is already on the road to change with the digital transformation of vast quantities of information. The implementation of AI has already achieved success in the domains of molecular chemistry and drug discovery. The reduction in costs and in the time needed for experiments to predict the pharmacological activities of new molecules is a milestone in science. These successful applications of AI algorithms provide hope for a revolution in healthcare systems. A significant part of artificial intelligence is machine learning (ML), of which there are three main types: supervised learning, unsupervised learning, and reinforcement learning. In this review, the full scope of the AI workflow is presented, with explanations of the most often used ML algorithms and descriptions of performance metrics for both regression and classification. A brief introduction to explainable artificial intelligence (XAI) is provided, with examples of technologies that have been developed for XAI. We review important AI implementations in cardiology for supervised, unsupervised, and reinforcement learning and natural language processing, emphasizing the algorithms used. Finally, we discuss the need to establish legal, ethical, and methodological requirements for the deployment of AI models in medicine.
Affiliation(s)
- Łukasz Ledziński
- Department of Cardiology and Clinical Pharmacology, Faculty of Health Sciences, Collegium Medicum in Bydgoszcz, Nicolaus Copernicus University in Toruń, Ujejskiego 75, 85-168 Bydgoszcz, Poland
- Grzegorz Grześk
- Department of Cardiology and Clinical Pharmacology, Faculty of Health Sciences, Collegium Medicum in Bydgoszcz, Nicolaus Copernicus University in Toruń, Ujejskiego 75, 85-168 Bydgoszcz, Poland
19
Pham N, Hill V, Rauschecker A, Lui Y, Niogi S, Fillipi CG, Chang P, Zaharchuk G, Wintermark M. Critical Appraisal of Artificial Intelligence-Enabled Imaging Tools Using the Levels of Evidence System. AJNR Am J Neuroradiol 2023; 44:E21-E28. [PMID: 37080722] [PMCID: PMC10171388] [DOI: 10.3174/ajnr.a7850]
Abstract
Clinical adoption of an artificial intelligence-enabled imaging tool requires critical appraisal of its life cycle from development to implementation by using a systematic, standardized, and objective approach that can verify both its technical and clinical efficacy. Toward this concerted effort, the ASFNR/ASNR Artificial Intelligence Workshop Technology Working Group is proposing a hierarchical evaluation system based on the quality, type, and amount of scientific evidence that the artificial intelligence-enabled tool can demonstrate for each component of its life cycle. The current proposal is modeled after the levels of evidence in medicine, with the uppermost level of the hierarchy showing the strongest evidence for potential impact on patient care and health care outcomes. The intended goal of establishing an evidence-based evaluation system is to encourage transparency, foster an understanding of how artificial intelligence tools are created and how they make decisions, and promote reporting of the relevant data on the efficacy of the artificial intelligence tools that are developed. The proposed system is an essential step toward a more formalized, clinically validated, and regulated framework for the safe and effective deployment of artificial intelligence imaging applications in clinical practice.
Affiliation(s)
- N Pham
- Department of Radiology (N.P., G.Z.), Stanford School of Medicine, Palo Alto, California
- V Hill
- Department of Radiology (V.H.), Northwestern University Feinberg School of Medicine, Chicago, Illinois
- A Rauschecker
- Department of Radiology (A.R.), University of California, San Francisco, San Francisco, California
- Y Lui
- Department of Radiology (Y.L.), NYU Grossman School of Medicine, New York, New York
- S Niogi
- Department of Radiology (S.N.), Weill Cornell Medicine, New York, New York
- C G Fillipi
- Department of Radiology (C.G.F.), Tufts University School of Medicine, Boston, Massachusetts
- P Chang
- Department of Radiology (P.C.), University of California, Irvine, Irvine, California
- G Zaharchuk
- Department of Radiology (N.P., G.Z.), Stanford School of Medicine, Palo Alto, California
- M Wintermark
- Department of Neuroradiology (M.W.), The University of Texas MD Anderson Cancer Center, Houston, Texas
20
Portuondo-Jiménez J, Barrio I, España PP, García J, Villanueva A, Gascón M, Rodríguez L, Larrea N, García-Gutierrez S, Quintana JM. Clinical prediction rules for adverse evolution in patients with COVID-19 by the Omicron variant. Int J Med Inform 2023; 173:105039. [PMID: 36921481] [PMCID: PMC9988314] [DOI: 10.1016/j.ijmedinf.2023.105039]
Abstract
OBJECTIVE We identify factors related to SARS-CoV-2 infection linked to hospitalization, ICU admission, and mortality, and develop clinical prediction rules. METHODS Retrospective cohort study of 380,081 patients with SARS-CoV-2 infection from March 1, 2020 to January 9, 2022, including a subsample of 46,402 patients who attended Emergency Departments (EDs) and had data on vital signs. For derivation and external validation of the prediction rule, two different periods were considered: before and after the emergence of the Omicron variant, respectively. Data collected included sociodemographic data, COVID-19 vaccination status, baseline comorbidities and treatments, other background data, and vital signs at triage at EDs. The predictive models for the ED and whole samples were developed using multivariate logistic regression with Lasso penalization. RESULTS In the multivariable models, common predictive factors of death among ED patients were greater age; male sex; lack of vaccination; dementia; heart failure; liver and kidney disease; hemiplegia or paraplegia; coagulopathy; interstitial pulmonary disease; malignant tumors; chronic systemic use of steroids; higher temperature; low O2 saturation; and altered blood pressure and heart rate. The predictors of an adverse evolution were the same, with the exception of liver disease and the inclusion of cystic fibrosis. Similar predictors were found to be related to hospital admission, including liver disease, arterial hypertension, and baseline prescription of immunosuppressants. Similar models for the whole sample, without vital signs, are also presented. CONCLUSIONS We propose easily calculable, highly predictive risk scales based on basic information that also perform under the current Omicron variant and may help manage such patients in primary, emergency, and hospital care.
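The Lasso-penalized logistic regression used above minimizes the logistic log-loss plus an L1 penalty on the coefficients, which shrinks uninformative predictors to exactly zero. Below is a minimal sketch of that objective only; the fitting itself would be done with a solver such as scikit-learn's LogisticRegression with penalty="l1", and the toy data are invented.

```python
# Sketch of the objective minimized by Lasso-penalized logistic regression:
# mean logistic loss + lambda * sum(|beta|). Evaluation only, no solver.
import math

def l1_logistic_loss(beta, X, y, lam):
    """Mean logistic log-loss over (X, y) plus an L1 penalty on beta."""
    n = len(y)
    loss = 0.0
    for xi, yi in zip(X, y):
        z = sum(b * x for b, x in zip(beta, xi))  # linear predictor
        # log(1 + exp(-z)) for positives, log(1 + exp(z)) for negatives
        loss += math.log(1 + math.exp(-z)) if yi == 1 else math.log(1 + math.exp(z))
    return loss / n + lam * sum(abs(b) for b in beta)
```

Increasing lam trades a little fit for sparser coefficients, which is what makes the resulting risk scales short and easy to calculate at triage.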
Affiliation(s)
- Janire Portuondo-Jiménez
- Osakidetza Basque Health Service, Sub-Directorate for Primary Care Coordination, Vitoria-Gasteiz, Spain; Biocruces Bizkaia Health Research Institute, Barakaldo, Spain; Network for Research on Chronicity, Primary Care, and Health Promotion (RICAPPS), Spain
- Irantzu Barrio
- University of the Basque Country UPV/EHU, Department of Mathematics, Leioa, Spain; Basque Center for Applied Mathematics, BCAM, Spain
- Pedro P España
- Biocruces Bizkaia Health Research Institute, Barakaldo, Spain; Osakidetza Basque Health Service, Galdakao-Usansolo University Hospital, Respiratory Unit, Galdakao, Spain
- Julia García
- Basque Government Department of Health, Office of Healthcare Planning, Organization and Evaluation, Basque Country, Spain
- Ane Villanueva
- Network for Research on Chronicity, Primary Care, and Health Promotion (RICAPPS), Spain; Osakidetza Basque Health Service, Galdakao-Usansolo University Hospital, Research Unit, Galdakao, Spain; Health Service Research Network on Chronic Diseases (REDISSEC), Bilbao, Spain; Kronikgune Institute for Health Services Research, Barakaldo, Spain
- María Gascón
- Network for Research on Chronicity, Primary Care, and Health Promotion (RICAPPS), Spain; Osakidetza Basque Health Service, Galdakao-Usansolo University Hospital, Research Unit, Galdakao, Spain; Health Service Research Network on Chronic Diseases (REDISSEC), Bilbao, Spain; Kronikgune Institute for Health Services Research, Barakaldo, Spain
- Nere Larrea
- Network for Research on Chronicity, Primary Care, and Health Promotion (RICAPPS), Spain; Osakidetza Basque Health Service, Galdakao-Usansolo University Hospital, Research Unit, Galdakao, Spain; Health Service Research Network on Chronic Diseases (REDISSEC), Bilbao, Spain; Kronikgune Institute for Health Services Research, Barakaldo, Spain
- Susana García-Gutierrez
- Network for Research on Chronicity, Primary Care, and Health Promotion (RICAPPS), Spain; Osakidetza Basque Health Service, Galdakao-Usansolo University Hospital, Research Unit, Galdakao, Spain; Health Service Research Network on Chronic Diseases (REDISSEC), Bilbao, Spain; Kronikgune Institute for Health Services Research, Barakaldo, Spain
- José M Quintana
- Network for Research on Chronicity, Primary Care, and Health Promotion (RICAPPS), Spain; Osakidetza Basque Health Service, Galdakao-Usansolo University Hospital, Research Unit, Galdakao, Spain; Health Service Research Network on Chronic Diseases (REDISSEC), Bilbao, Spain; Kronikgune Institute for Health Services Research, Barakaldo, Spain
21
Strudwick G, Castellanos A, Castillo A, Gomes PJ, Li J, VanderMeer D. Nurses' Work Concerns and Disenchantment During the COVID-19 Pandemic: Machine Learning Analysis of Web-Based Discussions. JMIR Nurs 2023; 6:e40676. [PMID: 36608261] [PMCID: PMC9907981] [DOI: 10.2196/40676]
Abstract
BACKGROUND Web-based forums provide a space for communities of interest to exchange ideas and experiences. Nurse professionals used these forums during the COVID-19 pandemic to share their experiences and concerns. OBJECTIVE The objective of this study was to examine the nurse-generated content to capture the evolution of nurses' work concerns during the COVID-19 pandemic. METHODS We analyzed 14,060 posts related to the COVID-19 pandemic from March 2020 to April 2021. The data analysis stage included unsupervised machine learning and thematic qualitative analysis. We used an unsupervised machine learning approach, latent Dirichlet allocation, to identify salient topics in the collected posts. A human-in-the-loop analysis complemented the machine learning approach, categorizing topics into themes and subthemes. We developed insights into nurses' evolving perspectives based on temporal changes. RESULTS We identified themes for biweekly periods and grouped them into 20 major themes based on the work concern inventory framework. Dominant work concerns varied throughout the study period. A detailed analysis of the patterns in how themes evolved over time enabled us to create narratives of work concerns. CONCLUSIONS The analysis demonstrates that professional web-based forums capture nuanced details about nurses' work concerns and workplace stressors during the COVID-19 pandemic. Monitoring and assessment of web-based discussions could provide useful data for health care organizations to understand how their primary caregivers are affected by external pressures and internal managerial decisions and design more effective responses and planning during crises.
Affiliation(s)
- Arturo Castellanos
- Mason School of Business, The College of William & Mary, Williamsburg, VA, United States
- Alfred Castillo
- Information Systems and Business Analytics Department, Florida International University, Miami, FL, United States
- Paulo J Gomes
- Information Systems and Business Analytics Department, Florida International University, Miami, FL, United States
- Juanjuan Li
- Nicole Wertheim College of Nursing & Health Sciences, Florida International University, Miami, FL, United States
- Debra VanderMeer
- Information Systems and Business Analytics Department, Florida International University, Miami, FL, United States
22
Setting up of a machine learning algorithm for the identification of severe liver fibrosis profile in the general US population cohort. Int J Med Inform 2023; 170:104932. [PMID: 36459836 DOI: 10.1016/j.ijmedinf.2022.104932] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2022] [Revised: 11/19/2022] [Accepted: 11/21/2022] [Indexed: 11/27/2022]
Abstract
BACKGROUND The progress of digital transformation in clinical practice opens the door to shifting the current clinical pathway for liver disease diagnosis from a late-stage to an early-stage approach. Early diagnosis of liver fibrosis can prevent the progression of the disease and decrease liver-related morbidity and mortality. Here, we developed a machine learning (ML) algorithm based on standard parameters that can identify liver fibrosis in the general US population. MATERIALS AND METHODS Starting from a public database (National Health and Nutrition Examination Survey, NHANES), representative of the American population with 7265 eligible subjects (control population n = 6828, with Fibroscan values E < 9.7 kPa; target population n = 437, with Fibroscan values E ≥ 9.7 kPa), we set up an SVM algorithm able to identify individuals with liver fibrosis in the general US population. The algorithm setup involved the removal of missing data and a sampling optimization step to manage the data imbalance (only ∼5 % of the dataset is the target population). RESULTS For feature selection, we performed an unbiased analysis, starting from 33 clinical, anthropometric, and biochemical parameters regardless of their previous application as biomarkers of liver disease. Through PCA analysis, we identified the 26 most significant features and then used them to set up a sampling method on an SVM algorithm. The best sampling technique to manage the data imbalance was found to be oversampling through SMOTE-NC. For final model validation, we utilized a subset of 300 individuals (150 with liver fibrosis and 150 controls), subtracted from the main dataset prior to sampling. Performances were evaluated on multiple independent runs. CONCLUSIONS We provide proof of concept of an ML clinical decision support tool for liver fibrosis diagnosis in the general US population. Although the presented ML model is at this stage only a prototype, it might in the future be implemented and applied to broad screening programs for liver fibrosis.
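The imbalance-handling step can be sketched as follows. The study used SMOTE-NC (typically via the imbalanced-learn package); as a dependency-free stand-in, this sketch uses plain random oversampling of the minority class with scikit-learn, which is a simpler technique than SMOTE-NC. All data here are synthetic with a ~5% positive rate mimicking the reported imbalance, not NHANES.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.utils import resample
from sklearn.metrics import balanced_accuracy_score

# Synthetic stand-in for the cohort: ~5% positive (fibrosis-like) class
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (950, 5)), rng.normal(1.5, 1.0, (50, 5))])
y = np.array([0] * 950 + [1] * 50)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Oversample the minority class on the training split only, so the test
# split keeps the real-world class ratio
X_min_up, y_min_up = resample(
    X_tr[y_tr == 1], y_tr[y_tr == 1],
    replace=True, n_samples=int((y_tr == 0).sum()), random_state=0,
)
X_bal = np.vstack([X_tr[y_tr == 0], X_min_up])
y_bal = np.concatenate([y_tr[y_tr == 0], y_min_up])

clf = SVC(kernel="rbf").fit(X_bal, y_bal)
bal_acc = balanced_accuracy_score(y_te, clf.predict(X_te))
print(f"balanced accuracy: {bal_acc:.3f}")
```

Oversampling only the training split mirrors the paper's design choice of subtracting the validation subset prior to sampling.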
23
Koçak B, Cuocolo R, dos Santos DP, Stanzione A, Ugga L. Must-have Qualities of Clinical Research on Artificial Intelligence and Machine Learning. Balkan Med J 2023; 40:3-12. [PMID: 36578657 PMCID: PMC9874249 DOI: 10.4274/balkanmedj.galenos.2022.2022-11-51] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/21/2022] [Accepted: 12/06/2022] [Indexed: 12/30/2022] Open
Abstract
In the field of computer science known as artificial intelligence, algorithms imitate reasoning tasks that are typically performed by humans. The techniques that allow machines to learn and improve at tasks such as recognition and prediction, which form the basis of clinical practice, are referred to as machine learning, a subfield of artificial intelligence. The number of artificial intelligence- and machine learning-related publications in clinical journals has grown exponentially, driven by recent developments in computation and the accessibility of simple tools. However, clinicians are often not included in data science teams, which may limit the clinical relevance, explainability, workflow compatibility, and quality improvement of artificial intelligence solutions. This creates a language barrier between clinicians and artificial intelligence developers. Healthcare practitioners sometimes lack a basic understanding of artificial intelligence research because the approach is difficult for non-specialists to understand. Furthermore, many editors and reviewers of medical publications might not be familiar with the fundamental ideas behind these technologies, which may prevent journals from publishing high-quality artificial intelligence studies or, worse still, allow the publication of low-quality works. In this review, we aim to improve readers’ artificial intelligence literacy and critical thinking. To that end, we concentrate on what we consider the 10 most important qualities of artificial intelligence research: valid scientific purpose, high-quality data set, robust reference standard, robust input, no information leakage, optimal bias-variance tradeoff, proper model evaluation, proven clinical utility, transparent reporting, and open science. Before designing a study, one should have a well-defined, sound scientific purpose. The study should then be backed by a high-quality data set, robust input, and a solid reference standard. The artificial intelligence development pipeline should prevent information leakage. For the models, an optimal bias-variance tradeoff should be achieved, and generalizability must be adequately assessed. The clinical value of the final models must also be established. After the study, thought should be given to transparent reporting of the process and results, as well as to open science for sharing data, code, and models. We hope this work improves the artificial intelligence literacy and mindset of its readers.
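The "no information leakage" quality above is most often violated by fitting preprocessing (e.g., a scaler or feature selector) on the full dataset before cross-validation. A minimal sketch of the leak-free pattern, using synthetic data and an assumed scikit-learn setup:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

# Wrapping the scaler inside the pipeline means it is re-fit on the
# training folds of every split; fitting it on all of X beforehand would
# leak test-fold statistics into training.
model = make_pipeline(StandardScaler(), LogisticRegression())
scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print(f"leak-free cross-validated AUC: {scores.mean():.3f}")
```

The same pipeline idiom extends to imputation, feature selection, and resampling, all of which must stay inside the cross-validation loop.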
Affiliation(s)
- Burak Koçak
- Clinic of Radiology, University of Health Sciences Turkey, Başakşehir Çam and Sakura City Hospital, İstanbul, Turkey
- Renato Cuocolo
- Department of Medicine, Surgery and Dentistry, University of Salerno, Baronissi, Italy
- Daniel Pinto dos Santos
- Department of Radiology, University Hospital of Cologne, Cologne, Germany
- Department of Radiology, University Hospital of Frankfurt, Frankfurt, Germany
- Arnaldo Stanzione
- Department of Advanced Biomedical Sciences, University of Naples “Federico II”, Napoli, Italy
- Lorenzo Ugga
- Department of Advanced Biomedical Sciences, University of Naples “Federico II”, Napoli, Italy
24
Susanty S, Sufriyana H, Su ECY, Chuang YH. Questionnaire-free machine-learning method to predict depressive symptoms among community-dwelling older adults. PLoS One 2023; 18:e0280330. [PMID: 36696383 PMCID: PMC9876369 DOI: 10.1371/journal.pone.0280330] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2022] [Accepted: 12/27/2022] [Indexed: 01/26/2023] Open
Abstract
The 15-item Geriatric Depression Scale (GDS-15) is widely used to screen for depressive symptoms among older populations. This study aimed to develop and validate a questionnaire-free, machine-learning model as an alternative triage test to the GDS-15 among community-dwelling older adults. The best models were the random forest (RF) and a deep-insight visible neural network by internal validation, but their performances were indistinguishable under external validation. The AUROC of the RF model was 0.619 (95% CI 0.610 to 0.627) on the external validation set with a non-local ethnic group. Our triage test allows healthcare professionals to preliminarily screen for depressive symptoms in older adults without using a questionnaire. If the model shows a positive result, the GDS-15 can be used as a follow-up measure. This preliminary screening can save considerable time and effort for healthcare providers and older adults, especially those who are illiterate.
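The triage logic above, where a model flags likely positives from routinely available variables and only flagged individuals go on to the GDS-15 questionnaire, can be sketched as follows. Features, data, and the 0.3 screening threshold are illustrative assumptions, not the study's variables or operating point.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Synthetic questionnaire-free predictors (e.g. demographics, records)
rng = np.random.default_rng(2)
X = rng.normal(size=(500, 8))
y = (X[:, 0] + rng.normal(scale=1.5, size=500) > 1).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
proba = rf.predict_proba(X_te)[:, 1]
auroc = roc_auc_score(y_te, proba)
print(f"AUROC: {auroc:.3f}")

# Triage rule: only individuals above the screening threshold would go on
# to the GDS-15 questionnaire (threshold chosen for illustration)
refer = proba >= 0.3
print(f"fraction referred for GDS-15: {refer.mean():.2f}")
```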
Affiliation(s)
- Sri Susanty
- School of Nursing, College of Nursing, Taipei Medical University, Taipei, Taiwan
- Nursing Study Program, Faculty of Medicine, Universitas Halu Oleo, Kendari, Southeast Sulawesi, Indonesia
- Herdiantri Sufriyana
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Department of Medical Physiology, Faculty of Medicine, Universitas Nahdlatul Ulama Surabaya, Surabaya, Indonesia
- Emily Chia-Yu Su
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei, Taiwan
- Research Center for Artificial Intelligence in Medicine, Taipei Medical University, Taipei, Taiwan
- Yeu-Hui Chuang
- School of Nursing, College of Nursing, Taipei Medical University, Taipei, Taiwan
- Center for Nursing and Healthcare Research in Clinical Practice Application, Wan Fang Hospital, Taipei Medical University, Taipei, Taiwan
25
Daye D, Wiggins WF, Lungren MP, Alkasab T, Kottler N, Allen B, Roth CJ, Bizzo BC, Durniak K, Brink JA, Larson DB, Dreyer KJ, Langlotz CP. Implementation of Clinical Artificial Intelligence in Radiology: Who Decides and How? Radiology 2022; 305:555-563. [PMID: 35916673 PMCID: PMC9713445 DOI: 10.1148/radiol.212151] [Citation(s) in RCA: 48] [Impact Index Per Article: 24.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/29/2021] [Revised: 03/30/2022] [Accepted: 04/12/2022] [Indexed: 01/03/2023]
Abstract
As the role of artificial intelligence (AI) in clinical practice evolves, governance structures oversee the implementation, maintenance, and monitoring of clinical AI algorithms to enhance quality, manage resources, and ensure patient safety. This article establishes a framework for the infrastructure required for clinical AI implementation and presents a road map for governance. The road map answers four key questions: Who decides which tools to implement? What factors should be considered when assessing an application for implementation? How should applications be implemented in clinical practice? Finally, how should tools be monitored and maintained after clinical implementation? Among the many challenges for the implementation of AI in clinical practice, devising flexible governance structures that can quickly adapt to a changing environment will be essential to ensure quality patient care and practice improvement objectives.
Affiliation(s)
- Dania Daye, Walter F. Wiggins, Matthew P. Lungren, Tarik Alkasab, Nina Kottler, Bibb Allen, Christopher J. Roth, Bernardo C. Bizzo, Kimberly Durniak, James A. Brink, David B. Larson
- From the Department of Radiology, Massachusetts General Hospital, Harvard Medical School, 55 Fruit St, GRB 297, Boston, MA 02155 (D.D., T.A., B.C.B., K.D., J.A.B., K.J.D.); Department of Radiology, Duke University, Durham, NC (W.F.W., C.J.R.); Department of Radiology, Stanford University, Stanford, Calif (M.P.L., D.B.L., C.P.L.); Radiology Partners, El Segundo, Calif (N.K.); and Department of Radiology, Grandview Medical Center, Birmingham, Ala (B.A.)
26
Laukka E, Hammarén M, Kanste O. Nurse leaders' and digital service developers' perceptions of the future role of artificial intelligence in specialized medical care: An interview study. J Nurs Manag 2022; 30:3838-3846. [PMID: 35970487 PMCID: PMC10087264 DOI: 10.1111/jonm.13769] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/02/2022] [Revised: 08/01/2022] [Accepted: 08/11/2022] [Indexed: 12/30/2022]
Abstract
AIM To describe nurse leaders' and digital service developers' perceptions of the future role of artificial intelligence (AI) in specialized medical care. BACKGROUND Use of AI has rapidly increased in health care. However, nurse leaders' and developers' perceptions of AI and its future in specialized medical care remain under-researched. METHOD Descriptive qualitative methodology was applied. Data were collected through six focus groups and interviews with nurse leaders (n = 20) and digital service developers (n = 10), conducted remotely in 2021 at a university hospital in Finland. The data were subjected to inductive content analysis. RESULTS The data yielded 25 sub-categories, 10 categories and three main categories of participants' perceptions. The main categories described AI as transforming work, care and services, and organizations. CONCLUSIONS According to our respondents, AI will have a significant future role in specialized medical care, but it will likely reinforce, rather than replace, clinicians or traditional care. They also believe that it may have several positive consequences for clinicians' and leaders' work as well as for organizations and patients. IMPLICATIONS FOR NURSING MANAGEMENT Nurse leaders should be familiar with the potential of AI but also aware of its risks. Such leaders can better support the development of AI-based health services that improve clinicians' workflows.
Affiliation(s)
- Elina Laukka
- Research Unit of Nursing Science and Health Management, University of Oulu, Oulu, Finland
- Mira Hammarén
- Research Unit of Nursing Science and Health Management, University of Oulu, Oulu, Finland
- Outi Kanste
- Research Unit of Nursing Science and Health Management, University of Oulu, Oulu, Finland; Medical Research Center, Oulu University Hospital, Oulu, Finland
27
Explainable medical imaging AI needs human-centered design: guidelines and evidence from a systematic review. NPJ Digit Med 2022; 5:156. [PMID: 36261476 PMCID: PMC9581990 DOI: 10.1038/s41746-022-00699-2] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/15/2022] [Accepted: 09/29/2022] [Indexed: 11/16/2022] Open
Abstract
Transparency in Machine Learning (ML), often also referred to as interpretability or explainability, attempts to reveal the working mechanisms of complex models. From a human-centered design perspective, transparency is not a property of the ML model but an affordance, i.e., a relationship between algorithm and users. Thus, prototyping and user evaluations are critical to attaining solutions that afford transparency. Following human-centered design principles in highly specialized and high stakes domains, such as medical image analysis, is challenging due to the limited access to end users and the knowledge imbalance between those users and ML designers. To investigate the state of transparent ML in medical image analysis, we conducted a systematic review of the literature from 2012 to 2021 in PubMed, EMBASE, and Compendex databases. We identified 2508 records and 68 articles met the inclusion criteria. Current techniques in transparent ML are dominated by computational feasibility and barely consider end users, e.g. clinical stakeholders. Despite the different roles and knowledge of ML developers and end users, no study reported formative user research to inform the design and development of transparent ML models. Only a few studies validated transparency claims through empirical user evaluations. These shortcomings put contemporary research on transparent ML at risk of being incomprehensible to users, and thus, clinically irrelevant. To alleviate these shortcomings in forthcoming research, we introduce the INTRPRT guideline, a design directive for transparent ML systems in medical image analysis. The INTRPRT guideline suggests human-centered design principles, recommending formative user research as the first step to understand user needs and domain requirements. Following these guidelines increases the likelihood that the algorithms afford transparency and enable stakeholders to capitalize on the benefits of transparent ML.
28
Fehr J, Jaramillo-Gutierrez G, Oala L, Gröschel MI, Bierwirth M, Balachandran P, Werneck-Leite A, Lippert C. Piloting a Survey-Based Assessment of Transparency and Trustworthiness with Three Medical AI Tools. Healthcare (Basel) 2022; 10:healthcare10101923. [PMID: 36292369 PMCID: PMC9601535 DOI: 10.3390/healthcare10101923] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2022] [Revised: 09/18/2022] [Accepted: 09/21/2022] [Indexed: 11/04/2022] Open
Abstract
Artificial intelligence (AI) offers the potential to support healthcare delivery, but poorly trained or validated algorithms bear risks of harm. Ethical guidelines identify transparency about model development and validation as a requirement for trustworthy AI. Abundant guidance exists for providing transparency through reporting, yet poorly reported medical AI tools remain common. To close this transparency gap, we developed and piloted a framework to quantify the transparency of medical AI tools in three use cases. Our framework comprises a survey reporting on the intended use, training and validation data and processes, ethical considerations, and deployment recommendations. The transparency of each response was scored as 0, 0.5, or 1 to reflect whether the requested information was not, partially, or fully provided. Additionally, we assessed on an analogous three-point scale whether the provided responses fulfilled the transparency requirement for a set of trustworthiness criteria from ethical guidelines. The degree of transparency and trustworthiness was calculated on a scale from 0% to 100%. Our assessment of three medical AI use cases pinpointed reporting gaps and resulted in transparency scores of 67% for two use cases and 59% for the third. We report anecdotal evidence that business constraints and limited information from external datasets were major obstacles to providing transparency for the three use cases. The observed transparency gaps also lowered the degree of trustworthiness, indicating compliance gaps with ethical guidelines. All three pilot use cases faced challenges in providing transparency about medical AI tools, but more studies are needed to investigate these challenges in the wider medical AI sector. Applying this framework for an external assessment of transparency may be infeasible if business constraints prevent the disclosure of information. New strategies may be necessary to enable audits of medical AI tools while preserving business secrets.
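The scoring scheme described above reduces to simple arithmetic: each survey item is scored 0 (not provided), 0.5 (partial), or 1 (fully provided), and the degree of transparency is the mean item score expressed as a percentage. A minimal sketch with hypothetical item names (not the authors' actual survey):

```python
# Illustrative survey items (hypothetical names, not the authors' survey)
item_scores = {
    "intended use described": 1.0,
    "training data characterized": 0.5,
    "validation process reported": 1.0,
    "ethical considerations stated": 0.5,
    "deployment recommendations given": 0.0,
}

# Each item must use the three-level scale from the framework
assert all(s in (0.0, 0.5, 1.0) for s in item_scores.values())

# Degree of transparency: mean item score expressed as a percentage
transparency_pct = 100 * sum(item_scores.values()) / len(item_scores)
print(f"transparency: {transparency_pct:.0f}%")  # 3.0 over 5 items -> 60%
```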
Affiliation(s)
- Jana Fehr
- Digital Engineering Faculty, University of Potsdam, 14482 Potsdam, Germany
- Digital Health & Machine Learning, Hasso Plattner Institute, 14482 Potsdam, Germany
- Luis Oala
- Department of Artificial Intelligence, Fraunhofer HHI, 10587 Berlin, Germany
- Matthias I. Gröschel
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
- Manuel Bierwirth
- ITU/WHO Focus Group AI4H, 1211 Geneva, Switzerland
- Alumnus, Goethe Frankfurt University, 60323 Frankfurt am Main, Germany
- Pradeep Balachandran
- ITU/WHO Focus Group AI4H, 1211 Geneva, Switzerland
- Technical Consultant (Digital Health), Thiruvananthapuram 695010, India
- Christoph Lippert
- Digital Engineering Faculty, University of Potsdam, 14482 Potsdam, Germany
- Digital Health & Machine Learning, Hasso Plattner Institute, 14482 Potsdam, Germany
29
Abstract
The deployment of machine learning for tasks relevant to complementing standard of care and advancing tools for precision health has gained much attention in the clinical community, thus meriting further investigation into its broader use. As an introduction to predictive modelling using machine learning, we conducted a review of the recent literature that explains standard taxonomies, terminology and central concepts to a broad clinical readership. Articles aimed at readers with little or no prior experience of commonly used methods or typical workflows are summarised, and key references are highlighted. Continual interdisciplinary developments in data science, biostatistics and epidemiology also motivated us to further discuss emerging topics in predictive and data-driven (hypothesis-less) analytics with machine learning. Through two methodological deep dives using examples from precision psychiatry and outcome prediction after lymphoma, we highlight how the use of, for example, natural language processing can outperform established clinical risk scores and aid dynamic prediction and adaptive care strategies. Such realistic and detailed examples allow for critical analysis of the importance of new technological advances in artificial intelligence for clinical decision-making. New clinical decision support systems can assist in prevention and care by leveraging precision medicine.
Affiliation(s)
- Sandra Eloranta
- Division of Clinical Epidemiology, Department of Medicine Solna, Karolinska Institutet, Stockholm, Sweden
- Magnus Boman
- Division of Software and Computer Systems, School of Electrical Engineering and Computer Science, KTH, Stockholm, Sweden; Department of Learning, Informatics, Management, and Ethics, Karolinska Institutet, Stockholm, Sweden
30
Bräuner KB, Rosen AW, Tsouchnika A, Walbech JS, Gögenur M, Lin VA, Clausen JSR, Gögenur I. Developing prediction models for short-term mortality after surgery for colorectal cancer using a Danish national quality assurance database. Int J Colorectal Dis 2022; 37:1835-1843. [PMID: 35849195 DOI: 10.1007/s00384-022-04207-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 06/20/2022] [Indexed: 02/04/2023]
Abstract
PURPOSE The majority of colorectal cancer surgeries are performed electively, and treatment is often decided at the multidisciplinary team conference. Although the average 30-day mortality rate is low, there is substantial population heterogeneity from young, healthy patients to frail, elderly patients. The individual risk of surgery can vary widely, and tailoring treatment for colorectal cancer may lead to better outcomes. This requires prediction of risk that is accurate and available prior to surgery. METHODS Data from the Danish Colorectal Cancer Group database was transformed into the Observational Medical Outcomes Partnership Common Data Model. Models were developed to predict the risk of mortality within 30, 90, and 180 days after colorectal cancer surgery using only covariates decided at the multidisciplinary team conference. Several machine-learning models were trained, but due to superior performance, a Least Absolute Shrinkage and Selection Operator logistic regression was used for the final model. Performance was assessed with discrimination (area under the receiver operating characteristic and precision recall curve) and calibration measures (calibration in large, intercept, slope, and Brier score). RESULTS The cohort contained 65,612 patients operated for colorectal cancer in the period from 2001 to 2019 in Denmark. The Least Absolute Shrinkage and Selection Operator model showed an area under the receiver operating characteristic for 30-, 90-, and 180-day mortality after colorectal cancer surgery of 0.871 (95% CI: 0.86-0.882), 0.874 (95% CI: 0.864-0.882), and 0.876 (95% CI: 0.867-0.883) and calibration in large of 1.01, 0.98, and 1.01, respectively. CONCLUSION The postoperative short-term mortality prediction model showed excellent discrimination and calibration using only preoperatively known predictors.
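The modelling approach above, L1-penalized ("LASSO") logistic regression evaluated with both discrimination (AUROC) and calibration measures, can be sketched as follows. Data are synthetic with a rare outcome mimicking short-term mortality; the covariates, penalty strength, and event rate are illustrative assumptions, not the Danish registry or the authors' tuned model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, brier_score_loss

# Synthetic cohort: 12 "preoperative" covariates, rare binary outcome
rng = np.random.default_rng(3)
X = rng.normal(size=(2000, 12))
logit = -2.5 + X[:, 0] + 0.5 * X[:, 1]
y = (rng.random(2000) < 1 / (1 + np.exp(-logit))).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# L1 penalty shrinks uninformative coefficients toward zero (LASSO)
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
lasso.fit(X_tr, y_tr)

p = lasso.predict_proba(X_te)[:, 1]
auroc = roc_auc_score(y_te, p)                 # discrimination
brier = brier_score_loss(y_te, p)              # overall calibration/accuracy
cal_in_large = y_te.mean() / p.mean()          # observed/expected events
print(f"AUROC={auroc:.3f}  Brier={brier:.3f}  O/E={cal_in_large:.2f}")
```

An observed/expected ratio near 1 corresponds to the "calibration in large" values around 1.0 reported in the abstract.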
Collapse
Affiliation(s)
- Karoline B Bräuner
- Center for Surgical Science, Zealand University Hospital, Lykkebækvej 1, 4600, Køge, Denmark
- Andreas W Rosen
- Center for Surgical Science, Zealand University Hospital, Lykkebækvej 1, 4600, Køge, Denmark
- Adamantia Tsouchnika
- Center for Surgical Science, Zealand University Hospital, Lykkebækvej 1, 4600, Køge, Denmark
- Julie S Walbech
- Center for Surgical Science, Zealand University Hospital, Lykkebækvej 1, 4600, Køge, Denmark
- Mikail Gögenur
- Center for Surgical Science, Zealand University Hospital, Lykkebækvej 1, 4600, Køge, Denmark
- Viviane A Lin
- Center for Surgical Science, Zealand University Hospital, Lykkebækvej 1, 4600, Køge, Denmark
- Johan S R Clausen
- Center for Surgical Science, Zealand University Hospital, Lykkebækvej 1, 4600, Køge, Denmark
- Ismail Gögenur
- Center for Surgical Science, Zealand University Hospital, Lykkebækvej 1, 4600, Køge, Denmark; The Faculty of Health Science, University of Copenhagen, Blegdamsvej 6, 2200, Copenhagen N, Denmark
31
Momtazmanesh S, Nowroozi A, Rezaei N. Artificial Intelligence in Rheumatoid Arthritis: Current Status and Future Perspectives: A State-of-the-Art Review. Rheumatol Ther 2022; 9:1249-1304. [PMID: 35849321 PMCID: PMC9510088 DOI: 10.1007/s40744-022-00475-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2022] [Accepted: 06/24/2022] [Indexed: 11/23/2022] Open
Abstract
Investigation of the potential applications of artificial intelligence (AI), including machine learning (ML) and deep learning (DL) techniques, is an exponentially growing field in medicine and healthcare. These methods can be critical in providing high-quality care to patients with chronic rheumatological diseases lacking an optimal treatment, like rheumatoid arthritis (RA), which is the second most prevalent autoimmune disease. Herein, after reviewing the basic concepts of AI, we summarize the advances in its applications in RA clinical practice and research. We provide directions for future investigations in this field after reviewing the current knowledge gaps and the technical and ethical challenges in applying AI. Automated models have been widely used to improve RA diagnosis since the early 2000s, drawing on a variety of techniques, e.g., support vector machines, random forests, and artificial neural networks. AI algorithms can facilitate screening and identification of susceptible groups; diagnosis using omics, imaging, clinical, and sensor data; patient detection within electronic health records (EHR), i.e., phenotyping; treatment response assessment; monitoring of disease course; determination of prognosis; novel drug discovery; and enhancement of basic science research. They can also aid in risk assessment for the incidence of comorbidities, e.g., cardiovascular diseases, in patients with RA. However, the proposed models may vary significantly in their performance and reliability. Despite the promising results achieved by AI models in enhancing early diagnosis and management of patients with RA, they are not fully ready to be incorporated into clinical practice. Future investigations are required to ensure the development of reliable and generalizable algorithms while carefully looking for any potential sources of bias or misconduct.
We showed that a growing body of evidence supports the potential role of AI in revolutionizing screening, diagnosis, and management of patients with RA. However, multiple obstacles hinder clinical applications of AI models. Incorporating the machine and/or deep learning algorithms into real-world settings would be a key step in the progress of AI in medicine.
Affiliation(s)
- Sara Momtazmanesh
- School of Medicine, Tehran University of Medical Sciences, Tehran, Iran; Network of Immunity in Infection, Malignancy and Autoimmunity (NIIMA), Universal Scientific Education and Research Network (USERN), Tehran, Iran; Research Center for Immunodeficiencies, Pediatrics Center of Excellence, Children's Medical Center, Tehran University of Medical Sciences, Dr. Gharib St, Keshavarz Blvd, Tehran, Iran
- Ali Nowroozi
- School of Medicine, Tehran University of Medical Sciences, Tehran, Iran; Network of Immunity in Infection, Malignancy and Autoimmunity (NIIMA), Universal Scientific Education and Research Network (USERN), Tehran, Iran
- Nima Rezaei
- Network of Immunity in Infection, Malignancy and Autoimmunity (NIIMA), Universal Scientific Education and Research Network (USERN), Tehran, Iran; Research Center for Immunodeficiencies, Pediatrics Center of Excellence, Children's Medical Center, Tehran University of Medical Sciences, Dr. Gharib St, Keshavarz Blvd, Tehran, Iran; Department of Immunology, School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
32
Stafford IS, Gosink MM, Mossotto E, Ennis S, Hauben M. A Systematic Review of Artificial Intelligence and Machine Learning Applications to Inflammatory Bowel Disease, with Practical Guidelines for Interpretation. Inflamm Bowel Dis 2022; 28:1573-1583. [PMID: 35699597 PMCID: PMC9527612 DOI: 10.1093/ibd/izac115] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 02/03/2022] [Indexed: 12/15/2022]
Abstract
BACKGROUND Inflammatory bowel disease (IBD) is a chronic gastrointestinal disease with an unpredictable disease course. Computational methods such as machine learning (ML) have the potential to stratify IBD patients for the provision of individualized care. The use of ML methods for IBD was surveyed, with an additional focus on how the field has changed over time. METHODS On May 6, 2021, a systematic review was conducted through a search of MEDLINE and Embase databases, with the search structure ("machine learning" OR "artificial intelligence") AND ("Crohn* Disease" OR "Ulcerative Colitis" OR "Inflammatory Bowel Disease"). Exclusion criteria included studies not written in English, no human patient data, publication before 2001, studies that were not peer reviewed, nonautoimmune disease comorbidity research, and record types that were not primary research. RESULTS Seventy-eight (of 409) records met the inclusion criteria. Random forest methods were most prevalent, and there was an increase in neural networks, mainly applied to imaging data sets. The main applications of ML to clinical tasks were diagnosis (18 of 78), disease course (22 of 78), and disease severity (16 of 78). The median sample size was 263. Clinical and microbiome-related data sets were most popular. Five percent of studies used an external data set after training and testing for additional model validation. DISCUSSION Availability of longitudinal and deep phenotyping data could lead to better modeling. Machine learning pipelines that account for imbalanced data and perform feature selection only on training data will generate more generalizable models. Machine learning models are increasingly being applied to more complex clinical tasks for specific phenotypes, indicating progress towards personalized medicine for IBD.
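The recommendation to perform feature selection only on training data can be made concrete with a small sketch. The synthetic data, the mean-difference filter, and the nearest-centroid classifier below are all assumptions of this example, not any reviewed study's pipeline; the point is structural: the filter is refit inside each cross-validation fold, so test rows never influence which features are kept.

```python
import random

def select_features(X, y, k):
    """Toy filter: rank features by absolute mean difference between classes."""
    def score(j):
        a = [row[j] for row, lab in zip(X, y) if lab == 1]
        b = [row[j] for row, lab in zip(X, y) if lab == 0]
        return abs(sum(a) / len(a) - sum(b) / len(b))
    return sorted(range(len(X[0])), key=score, reverse=True)[:k]

def centroid(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def leakage_free_cv(X, y, k=2, n_folds=5):
    """Cross-validation in which feature selection sees training rows only."""
    n, hits = len(X), []
    for f in range(n_folds):
        test = set(range(f, n, n_folds))
        train = [i for i in range(n) if i not in test]
        # selection is refit on this fold's training rows -- no leakage from test rows
        feats = select_features([X[i] for i in train], [y[i] for i in train], k)
        c0 = centroid([[X[i][j] for j in feats] for i in train if y[i] == 0])
        c1 = centroid([[X[i][j] for j in feats] for i in train if y[i] == 1])
        for i in test:
            d0 = sum((X[i][j] - c0[m]) ** 2 for m, j in enumerate(feats))
            d1 = sum((X[i][j] - c1[m]) ** 2 for m, j in enumerate(feats))
            hits.append((1 if d1 < d0 else 0) == y[i])
    return sum(hits) / len(hits)

random.seed(0)
# 40 synthetic patients, 6 features; only feature 0 carries the class signal
y = [i % 2 for i in range(40)]
X = [[random.gauss(2.0 * y[i] if j == 0 else 0.0, 1.0) for j in range(6)] for i in range(40)]
print(leakage_free_cv(X, y))
```

Running the filter once on the full data set before splitting would let the test rows vote on the feature ranking, which is exactly the optimistic bias the review warns against.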
Affiliation(s)
- Enrico Mossotto
- Human Genetics and Genomic Medicine, University of Southampton, Southampton, UK
- Sarah Ennis
- Address correspondence to: Sarah Ennis, Department of Human Genetics and Genomic Medicine, University of Southampton, Southampton, UK ()
33
Al-Zaiti SS, Alghwiri AA, Hu X, Clermont G, Peace A, Macfarlane P, Bond R. A clinician's guide to understanding and critically appraising machine learning studies: a checklist for Ruling Out Bias Using Standard Tools in Machine Learning (ROBUST-ML). EUROPEAN HEART JOURNAL. DIGITAL HEALTH 2022; 3:125-140. [PMID: 36713011 PMCID: PMC9708024 DOI: 10.1093/ehjdh/ztac016] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/20/2021] [Revised: 02/11/2022] [Indexed: 05/06/2023]
Abstract
Developing functional machine learning (ML)-based models to address unmet clinical needs requires unique considerations for optimal clinical utility. Recent debates about the rigour, transparency, explainability, and reproducibility of ML models, terms which are defined in this article, have raised concerns about their clinical utility and suitability for integration into current evidence-based practice paradigms. This featured article focuses on increasing the literacy of ML among clinicians by providing them with the knowledge and tools needed to understand and critically appraise clinical studies focused on ML. A checklist is provided for evaluating the rigour and reproducibility of the four ML building blocks: data curation, feature engineering, model development, and clinical deployment. Checklists like this are important for quality assurance and to ensure that ML studies are rigorously and confidently reviewed by clinicians and are guided by domain knowledge of the setting in which the findings will be applied. Bridging the gap between clinicians, healthcare scientists, and ML engineers can address many shortcomings and pitfalls of ML-based solutions and their potential deployment at the bedside.
Affiliation(s)
- Alaa A Alghwiri
- Data Science Core, The Provost Office, University of Pittsburgh, Pittsburgh, PA, USA
- Xiao Hu
- Center for Data Science, Emory University, Atlanta, GA, USA
- Gilles Clermont
- Departments of Critical Care Medicine, Mathematics, Clinical and Translational Science, and Industrial Engineering, University of Pittsburgh, Pittsburgh, PA, USA
- Aaron Peace
- The Clinical Translational Research and Innovation Centre, Northern Ireland, UK
- Peter Macfarlane
- Institute of Health and Wellbeing, Electrocardiology Section, University of Glasgow, Glasgow, UK
- Raymond Bond
- School of Computing, Ulster University, Ulster, UK
34
Chan SL, Lee JW, Ong MEH, Siddiqui FJ, Graves N, Ho AFW, Liu N. Implementation of prediction models in the emergency department from an implementation science perspective—Determinants, outcomes and real-world impact: A scoping review protocol. PLoS One 2022; 17:e0267965. [PMID: 35551537 PMCID: PMC9097992 DOI: 10.1371/journal.pone.0267965] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/15/2021] [Accepted: 04/19/2022] [Indexed: 11/28/2022] Open
Abstract
The number of prediction models developed for use in emergency departments (EDs) has been increasing in recent years to complement traditional triage systems. However, most of these models have only reached the development or validation phase, and few have been implemented in clinical practice. There is a gap in knowledge on the real-world performance of prediction models in the ED and how they can be implemented successfully into routine practice. Existing reviews of prediction models in the ED have also mainly focused on model development and validation. The aim of this scoping review is to summarize the current landscape and understanding of the implementation of prediction models in the ED. This scoping review follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) checklist. We will include studies that report implementation outcomes and/or contextual determinants according to the RE-AIM/PRISM framework for prediction models used in EDs. We will include outcomes or contextual determinants studied at any point in the implementation process except for effectiveness, where only post-implementation results will be included. Conference abstracts, theses and dissertations, letters to editors, commentaries, non-research documents and non-English full-text articles will be excluded. Four databases (MEDLINE (through PubMed), Embase, Scopus and CINAHL) will be searched from their inception using a combination of search terms related to the population, intervention and outcomes. Two reviewers will independently screen articles for inclusion, and any discrepancies will be resolved with a third reviewer. Results from included studies will be summarized narratively according to the RE-AIM/PRISM outcomes and domains. Where appropriate, a simple descriptive summary of quantitative outcomes may be performed.
Affiliation(s)
- Sze Ling Chan
- Health Services Research Centre, Singapore Health Services, Singapore, Singapore
- Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore
- Jin Wee Lee
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore
- Marcus Eng Hock Ong
- Health Services Research Centre, Singapore Health Services, Singapore, Singapore
- Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore
- Department of Emergency Medicine, Singapore General Hospital, Singapore, Singapore
- Prehospital Emergency Research Centre, Duke-NUS Medical School, Singapore, Singapore
- Fahad Javaid Siddiqui
- Prehospital Emergency Research Centre, Duke-NUS Medical School, Singapore, Singapore
- Nicholas Graves
- Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore
- Andrew Fu Wah Ho
- Department of Emergency Medicine, Singapore General Hospital, Singapore, Singapore
- Prehospital Emergency Research Centre, Duke-NUS Medical School, Singapore, Singapore
- Nan Liu
- Health Services Research Centre, Singapore Health Services, Singapore, Singapore
- Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore
- Prehospital Emergency Research Centre, Duke-NUS Medical School, Singapore, Singapore
- SingHealth AI Health Program, Singapore Health Services, Singapore, Singapore
- Institute of Data Science, National University of Singapore, Singapore, Singapore
35
Wang G, Chen Y. Enabling Legal Risk Management Model for International Corporation with Deep Learning and Self Data Mining. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:6385404. [PMID: 35432517 PMCID: PMC9007679 DOI: 10.1155/2022/6385404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/24/2021] [Revised: 02/24/2022] [Accepted: 03/04/2022] [Indexed: 11/17/2022]
Abstract
In uncertain times, risk management is critical in keeping companies from acting rashly and wrongly, allowing them to become more flexible and resilient. The investment and operational risks of international cooperative production projects differ from those of domestic projects: they are more likely to occur, cause more severe damage, and are harder to prevent and control. As a result, companies must develop a scientific, logical, and comprehensive risk management system and procedure when "reaching out" to undertake international joint production projects. In this work, we use machine learning (ML) to build a legal risk assessment model for international cooperative production projects, evaluate its validity, divide risk into five categories, and suggest countermeasures for the risk variables discovered at each risk level. The outputs of the individual classifiers are then fused at the decision level using an SDM (self-organizing data mining) approach, resulting in a multiclassifier fusion model (MCFM) for early warning. In the context of the sustainable development goals, this methodology also allows for a sustainability assessment through risk evaluation. The experimental results show that the MCFM-SDM model outperforms a single classifier and other MCFMs in terms of early-warning accuracy and stability, confirming the model's usefulness and superiority.
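The decision-level fusion step can be illustrated generically. The abstract does not specify the SDM fusion rule, so the weighted vote below is an assumption for illustration only, not the authors' method:

```python
from collections import Counter

def fuse_decisions(labels, weights=None):
    """Decision-level fusion: combine the risk labels emitted by several base
    classifiers with a (weighted) vote. One label per classifier; equal weights
    by default."""
    weights = weights or [1] * len(labels)
    tally = Counter()
    for label, w in zip(labels, weights):
        tally[label] += w
    return tally.most_common(1)[0][0]

# three base classifiers grade the same case into one of five risk levels
print(fuse_decisions(["high", "medium", "high"]))       # equal weights
print(fuse_decisions(["low", "high"], weights=[1, 3]))  # trust the second model more
```

Fusing at the decision level (labels) rather than the feature level keeps the base classifiers independent, which is what lets an ensemble outperform any single member when their errors are uncorrelated.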
Affiliation(s)
- Guiling Wang
- Guangdong Justice Police Vocational College Department of Law, Guangzhou, Guangdong, China
- Yimin Chen
- GF Securities Co., Ltd, Guangzhou, Guangdong, China
36
Kamel Rahimi A, Canfell OJ, Chan W, Sly B, Pole JD, Sullivan C, Shrapnel S. Machine learning models for diabetes management in acute care using electronic medical records: A systematic review. Int J Med Inform 2022; 162:104758. [PMID: 35398812 DOI: 10.1016/j.ijmedinf.2022.104758] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2021] [Revised: 03/24/2022] [Accepted: 03/29/2022] [Indexed: 12/23/2022]
Abstract
BACKGROUND Machine learning (ML) is a subset of artificial intelligence (AI) that is used to predict and potentially prevent adverse patient outcomes. There is increasing interest in the application of these models in digital hospitals to improve clinical decision-making and chronic disease management, particularly for patients with diabetes. The potential of ML models using electronic medical records (EMR) to improve the clinical care of hospitalised patients with diabetes is currently unknown. OBJECTIVE The aim was to systematically identify and critically review the published literature examining the development and validation of ML models using EMR data for improving the care of hospitalised adult patients with diabetes. METHODS The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were followed. Four databases were searched (Embase, PubMed, IEEE and Web of Science) for studies published between January 2010 and January 2022. The reference lists of the eligible articles were manually searched. Articles that examined adults and both developed and validated ML models using EMR data were included. Studies conducted in primary care and community care settings were excluded. Studies were independently screened and data were extracted using Covidence® systematic review software. For data extraction and critical appraisal, the Checklist for Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies (CHARMS) was followed. Risk of bias was assessed using the Prediction model Risk Of Bias Assessment Tool (PROBAST). Quality of reporting was assessed by adherence to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) guideline. The IJMEDI checklist was followed to assess the quality of the ML models and the reproducibility of their outcomes. The external validation methodology of the studies was appraised.
RESULTS Of the 1317 studies screened, twelve met the inclusion criteria. Eight studies developed ML models to predict dysglycaemic episodes in hospitalised patients with diabetes, one study developed a ML model to predict total insulin dosage, two studies predicted risk of readmission, and one study improved the prediction of hospital readmission for inpatients with diabetes. The included studies were heterogeneous with regard to ML types, cohorts, input predictors, sample sizes, performance and validation metrics, and clinical outcomes. Two studies adhered to the TRIPOD guideline. The methodological reporting of all the studies was evaluated to be at high risk of bias. The quality of the ML models in all studies was assessed as poor. Robust external validation was not performed on any of the studies. No models were implemented or evaluated in routine clinical care. CONCLUSIONS This review identified a limited number of ML models developed to improve the inpatient management of diabetes. No ML models were implemented in real hospital settings. Future research needs to enhance the development, reporting and validation steps to enable the integration of ML models into routine clinical care.
Affiliation(s)
- Amir Kamel Rahimi
- Centre for Health Services Research, Faculty of Medicine, The University of Queensland, Herston 4006, Brisbane, Australia; Digital Health Cooperative Research Centre, Australian Government, Sydney, New South Wales, Australia
- Oliver J Canfell
- Centre for Health Services Research, Faculty of Medicine, The University of Queensland, Herston 4006, Brisbane, Australia; Digital Health Cooperative Research Centre, Australian Government, Sydney, New South Wales, Australia; UQ Business School, The University of Queensland, St Lucia 4072, Brisbane, Australia
- Wilkin Chan
- The School of Clinical Medicine, The University of Queensland, Herston 4006, Brisbane, Australia
- Benjamin Sly
- Centre for Health Services Research, Faculty of Medicine, The University of Queensland, Herston 4006, Brisbane, Australia; Princess Alexandra Hospital, 199 Ipswich Road, Woolloongabba 4102, Brisbane, Australia
- Jason D Pole
- Centre for Health Services Research, Faculty of Medicine, The University of Queensland, Herston 4006, Brisbane, Australia; Dalla Lana School of Public Health, The University of Toronto, Toronto, Canada; ICES, Toronto, Canada
- Clair Sullivan
- Centre for Health Services Research, Faculty of Medicine, The University of Queensland, Herston 4006, Brisbane, Australia; Metro North Hospital and Health Service, Department of Health, Queensland Government, Herston 4006, Brisbane, Australia
- Sally Shrapnel
- Centre for Health Services Research, Faculty of Medicine, The University of Queensland, Herston 4006, Brisbane, Australia; The School of Mathematics and Physics, The University of Queensland, St Lucia 4072, Brisbane, Australia
37
Cerrato P, Halamka J, Pencina M. A proposal for developing a platform that evaluates algorithmic equity and accuracy. BMJ Health Care Inform 2022; 29:bmjhci-2021-100423. [PMID: 35410952 PMCID: PMC9003600 DOI: 10.1136/bmjhci-2021-100423] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2021] [Accepted: 01/06/2022] [Indexed: 01/21/2023] Open
Abstract
We are at a pivotal moment in the development of healthcare artificial intelligence (AI), a point at which enthusiasm for machine learning has outpaced the scientific evidence needed to support the equity and accuracy of diagnostic and therapeutic algorithms. This proposal examines algorithmic biases, including those related to race, gender and socioeconomic status, and accuracy, including the paucity of prospective studies and the lack of multisite validation. We then suggest solutions to these problems. We describe the Mayo Clinic, Duke University, Change Healthcare project that is evaluating 35.1 billion healthcare records for bias. And we propose 'Ingredients'-style labels and an AI evaluation/testing system to help clinicians judge the merits of products and services that include algorithms. Said testing would include input data sources and types, dataset population composition, algorithm validation techniques, bias assessment evaluation and performance metrics.
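The bias assessment this proposal calls for can start with something as simple as stratifying standard performance metrics by subgroup. A minimal sketch using hypothetical labels and groups (not the 35.1 billion-record project, whose methods the abstract does not detail):

```python
def subgroup_report(y_true, y_pred, groups):
    """Accuracy and true-positive rate per subgroup; large gaps between groups
    flag a possible equity problem worth investigating."""
    report = {}
    for g in sorted(set(groups)):
        idx = [i for i, gg in enumerate(groups) if gg == g]
        pos = [i for i in idx if y_true[i] == 1]
        report[g] = {
            "n": len(idx),
            "accuracy": sum(y_pred[i] == y_true[i] for i in idx) / len(idx),
            # TPR is undefined when a subgroup has no positives
            "tpr": sum(y_pred[i] == 1 for i in pos) / len(pos) if pos else None,
        }
    return report

y_true = [1, 0, 1, 0, 1, 1]
y_pred = [1, 0, 0, 0, 1, 0]
groups = ["a", "a", "b", "b", "a", "b"]
print(subgroup_report(y_true, y_pred, groups))
```

Equal true-positive rates across groups is one common fairness criterion (equal opportunity); which criterion an 'Ingredients'-style label should report is exactly the kind of choice the proposal argues needs standardization.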
Affiliation(s)
- Paul Cerrato
- Senior Research Analyst/Communications Specialist, Mayo Clinic Platform, Mayo Clinic Rochester, Rochester, Minnesota, USA
- John Halamka
- President of Mayo Clinic Platform, Mayo Clinic Rochester, Rochester, Minnesota, USA
- Michael Pencina
- Vice Dean for Data Science and Information Technology, Duke University, Durham, North Carolina, USA
38
King H, Wright J, Treanor D, Williams B, Randell R. What works where and how for uptake and impact of artificial intelligence in pathology: A review of theories for a realist evaluation. J Med Internet Res 2022; 25:e38039. [PMID: 37093631 PMCID: PMC10167589 DOI: 10.2196/38039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2022] [Revised: 06/14/2022] [Accepted: 07/11/2022] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND There is increasing interest in the use of artificial intelligence (AI) in pathology to increase accuracy and efficiency. To date, studies of clinicians' perceptions of AI have found only moderate acceptability, suggesting the need for further research regarding how to integrate it into clinical practice. OBJECTIVE The aim of the study was to determine contextual factors that may support or constrain the uptake of AI in pathology. METHODS To go beyond a simple listing of barriers and facilitators, we drew on the approach of realist evaluation and undertook a review of the literature to elicit stakeholders' theories of how, for whom, and in what circumstances AI can provide benefit in pathology. Searches were designed by an information specialist and peer-reviewed by a second information specialist. Searches were run on the arXiv.org repository, MEDLINE, and the Health Management Information Consortium, with additional searches undertaken on a range of websites to identify gray literature. In line with a realist approach, we also made use of relevant theory. Included documents were indexed in NVivo 12, using codes to capture different contexts, mechanisms, and outcomes that could affect the introduction of AI in pathology. Coded data were used to produce narrative summaries of each of the identified contexts, mechanisms, and outcomes, which were then translated into theories in the form of context-mechanism-outcome configurations. RESULTS A total of 101 relevant documents were identified. Our analysis indicates that the benefits that can be achieved will vary according to the size and nature of the pathology department's workload and the extent to which pathologists work collaboratively; the major perceived benefit for specialist centers is in reducing workload. For uptake of AI, pathologists' trust is essential. 
Existing theories suggest that if pathologists are able to "make sense" of AI, engage in the adoption process, receive support in adapting their work processes, and can identify potential benefits to its introduction, it is more likely to be accepted. CONCLUSIONS For uptake of AI in pathology, for all but the most simple quantitative tasks, measures will be required that either increase confidence in the system or provide users with an understanding of the performance of the system. For specialist centers, efforts should focus on reducing workload rather than increasing accuracy. Designers also need to give careful thought to usability and how AI is integrated into pathologists' workflow.
Affiliation(s)
- Henry King
- Faculty of Medicine & Health, University of Leeds, Leeds, United Kingdom
- Judy Wright
- Faculty of Medicine & Health, University of Leeds, Leeds, United Kingdom
- Darren Treanor
- Faculty of Medicine & Health, University of Leeds, Leeds, United Kingdom
- Leeds Teaching Hospitals NHS Trust, Leeds, United Kingdom
- Department of Clinical Pathology, and Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
- Center for Medical Image Science and Visualization, Linköping University, Linköping, Sweden
- Rebecca Randell
- Faculty of Health Studies, University of Bradford, Bradford, United Kingdom
- Wolfson Centre for Applied Health Research, Bradford, United Kingdom
39
Binvignat M, Pedoia V, Butte AJ, Louati K, Klatzmann D, Berenbaum F, Mariotti-Ferrandiz E, Sellam J. Use of machine learning in osteoarthritis research: a systematic literature review. RMD Open 2022; 8:e001998. [PMID: 35296530 PMCID: PMC8928401 DOI: 10.1136/rmdopen-2021-001998] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/04/2021] [Accepted: 02/16/2022] [Indexed: 11/21/2022] Open
Abstract
OBJECTIVE The aim of this systematic literature review was to provide a comprehensive and exhaustive overview of the use of machine learning (ML) in the clinical care of osteoarthritis (OA). METHODS A systematic literature review was performed in July 2021 using MEDLINE PubMed with key words and MeSH terms. For each selected article, the number of patients, ML algorithms used, type of data analysed, validation methods and data availability were collected. RESULTS From 1148 screened articles, 46 were selected and analysed; most were published after 2017. Twelve articles were related to diagnosis, 7 to prediction, 4 to phenotyping, 12 to severity and 11 to progression. The number of patients included ranged from 18 to 5749. Overall, 35% of the articles described the use of deep learning, and 74% described imaging analyses. A total of 85% of the articles involved knee OA and 15% hip OA. No study investigated hand OA. Most of the studies involved the same cohort, with data from the OA Initiative described in 46% of the articles and the MOST and Cohort Hip and Cohort Knee cohorts in 11% and 7%, respectively. Data and source codes were described as publicly available in 54% and 22% of the articles, respectively. External validation was provided in only 7% of the articles. CONCLUSION This review proposes an up-to-date overview of ML approaches used in clinical OA research and will help to enhance their application in this field.
Affiliation(s)
- Marie Binvignat
- Department of Rheumatology, Hôpital Saint-Antoine, Assistance Publique - Hôpitaux de Paris (AP-HP), Centre de Recherche Saint-Antoine, Inserm UMRS_938, Sorbonne Université, Paris, France
- Bakar Computational Health Science Institute, University of California, San Francisco, California, USA
- Immunology Immunopathology Immunotherapy UMRS_959, Sorbonne Université, Paris, France
- Valentina Pedoia
- Center for Intelligent Imaging (CI2), Department of Radiology and Biomedical Imaging, University of California, San Francisco, California, USA
- Atul J Butte
- Bakar Computational Health Science Institute, University of California, San Francisco, California, USA
- Karine Louati
- Department of Rheumatology, Hôpital Saint-Antoine, AP-HP, Centre de Recherche Saint-Antoine, Inserm UMRS_938, Sorbonne Université, Paris, France
- David Klatzmann
- Immunology Immunopathology Immunotherapy UMRS_959, Sorbonne Université, Paris, France
- Biotherapy (CIC-BTi) and Inflammation Immunopathology-Biotherapy Department (i2B), Hôpital Pitié-Salpêtrière, AP-HP, Paris, France
- Francis Berenbaum
- Department of Rheumatology, Hôpital Saint-Antoine, AP-HP, Centre de Recherche Saint-Antoine, Inserm UMRS_938, Sorbonne Université, Paris, France
- Jérémie Sellam
- Department of Rheumatology, Hôpital Saint-Antoine, AP-HP, Centre de Recherche Saint-Antoine, Inserm UMRS_938, Sorbonne Université, Paris, France
40
Sujan M, Pool R, Salmon P. Eight human factors and ergonomics principles for healthcare artificial intelligence. BMJ Health Care Inform 2022; 29:bmjhci-2021-100516. [PMID: 35121617 PMCID: PMC8819549 DOI: 10.1136/bmjhci-2021-100516] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/17/2021] [Accepted: 01/26/2022] [Indexed: 01/21/2023] Open
Affiliation(s)
- Mark Sujan
- Human Factors Everywhere, Woking, UK; Chartered Institute of Ergonomics and Human Factors, Birmingham, UK
- Paul Salmon
- Centre for Human Factors and Sociotechnical Systems, University of the Sunshine Coast, Maroochydore DC, Queensland, Australia
41
Crossnohere NL, Elsaid M, Paskett J, Bose-Brill S, Bridges JFP. Guidelines for artificial intelligence in medicine: A literature review and content analysis of frameworks. J Med Internet Res 2022; 24:e36823. [PMID: 36006692 PMCID: PMC9459836 DOI: 10.2196/36823] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/26/2022] [Revised: 06/02/2022] [Accepted: 07/14/2022] [Indexed: 12/15/2022] Open
Abstract
Background Artificial intelligence (AI) is rapidly expanding in medicine despite a lack of consensus on its application and evaluation. Objective We sought to identify current frameworks guiding the application and evaluation of AI for predictive analytics in medicine and to describe their content. We also assessed which stages of the AI translational spectrum (ie, AI development, reporting, evaluation, implementation, and surveillance) each framework addressed. Methods We performed a literature review of frameworks regarding the oversight of AI in medicine. The search included key topics such as “artificial intelligence,” “machine learning,” “guidance as topic,” and “translational science,” and spanned the period 2014-2022. Documents were included if they provided generalizable guidance regarding the use or evaluation of AI in medicine. Included frameworks are summarized descriptively and were subjected to content analysis. A novel evaluation matrix was developed and applied to appraise the frameworks’ coverage of content areas across translational stages. Results Fourteen frameworks are featured in the review: six provide descriptive guidance and eight provide reporting checklists for medical applications of AI. Content analysis revealed five considerations related to the oversight of AI in medicine across frameworks: transparency, reproducibility, ethics, effectiveness, and engagement. All frameworks discuss transparency, reproducibility, ethics, and effectiveness, while only half discuss engagement. The evaluation matrix revealed that frameworks were most likely to report AI considerations for the translational stage of development and least likely to report considerations for the translational stage of surveillance.
Conclusions Existing frameworks for the application and evaluation of AI in medicine notably offer less input on the role of engagement in oversight and regarding the translational stage of surveillance. Identifying and optimizing strategies for engagement are essential to ensure that AI can meaningfully benefit patients and other end users.
Affiliation(s)
- Norah L Crossnohere
- Department of Biomedical Informatics, The Ohio State University College of Medicine, Columbus, OH, United States
- Division of General Internal Medicine, Department of Internal Medicine, The Ohio State University College of Medicine, Columbus, OH, United States
- Mohamed Elsaid
- Department of Biomedical Informatics, The Ohio State University College of Medicine, Columbus, OH, United States
- Jonathan Paskett
- Department of Biomedical Informatics, The Ohio State University College of Medicine, Columbus, OH, United States
- Seuli Bose-Brill
- Division of General Internal Medicine, Department of Internal Medicine, The Ohio State University College of Medicine, Columbus, OH, United States
- John F P Bridges
- Department of Biomedical Informatics, The Ohio State University College of Medicine, Columbus, OH, United States
42
Niemiec E. Will the EU Medical Device Regulation help to improve the safety and performance of medical AI devices? Digit Health 2022; 8:20552076221089079. [PMID: 35386955] [PMCID: PMC8977702] [DOI: 10.1177/20552076221089079]
Abstract
Concerns have been raised over the quality of evidence on the performance of medical artificial intelligence devices, including devices that are already on the market in the USA and Europe. Recently, the Medical Device Regulation, which aims to set high standards of safety and quality, has become applicable in the European Union. The aim of this article is to discuss whether, and how, the Medical Device Regulation will help improve the safety and performance of medical artificial intelligence devices entering the market. The Medical Device Regulation introduces new rules for risk classification of the devices, which will result in more devices subjected to a higher degree of scrutiny before entering the market; more stringent requirements on clinical evaluation, including the requirement for appraisal of clinical data; new requirements for post-market surveillance, which may help spot early on any new, unexpected side effects and risks of the devices; and requirements for notified bodies, including for expertise of the personnel and consideration of relevant best practice documents. The guidance of the Medical Device Coordination Group on clinical evaluation of medical device software and the MEDDEV 2.7 guideline on clinical evaluation also attend to some of the problems identified in studies on medical artificial intelligence devices. The Medical Device Regulation will likely help improve the safety and performance of the medical artificial intelligence devices on the European market. The impact of the Regulation, however, is also dependent on its adequate enforcement by the European Union member states.
Affiliation(s)
- Emilia Niemiec
- Medical Ethics Division, Department of Clinical Sciences, Lund University, Sweden
43
De Souza LT, Silva Filho WE, Santana Lima B, Silva T, Takeshita W. Artificial intelligence in oral radiology: a checklist proposal. J Oral Maxillofac Radiol 2022. [DOI: 10.4103/jomr.jomr_21_22]
44
Lim MJR, Quek RHC, Ng KJ, Loh NHW, Lwin S, Teo K, Nga VDW, Yeo TT, Motani M. Machine learning models prognosticate functional outcomes better than clinical scores in spontaneous intracerebral haemorrhage. J Stroke Cerebrovasc Dis 2021; 31:106234. [PMID: 34896819] [DOI: 10.1016/j.jstrokecerebrovasdis.2021.106234]
Abstract
OBJECTIVE This study aims to develop and compare the use of deep neural networks (DNN) and support vector machines (SVM) to clinical prognostic scores for prognosticating 30-day mortality and 90-day poor functional outcome (PFO) in spontaneous intracerebral haemorrhage (SICH). MATERIALS AND METHODS We conducted a retrospective cohort study of 297 SICH patients between December 2014 and May 2016. Clinical data was collected from electronic medical records using standardized data collection forms. The machine learning workflow included imputation of missing data, dimensionality reduction, imbalanced-class correction, and evaluation using cross-validation and comparison of accuracy against clinical prognostic scores. RESULTS 32 (11%) patients had 30-day mortality while 177 (63%) patients had 90-day PFO. For prognosticating 30-day mortality, the class-balanced accuracies for DNN (0.875; 95% CI 0.800-0.950; McNemar's p-value 1.000) and SVM (0.848; 95% CI 0.767-0.930; McNemar's p-value 0.791) were comparable to that of the original ICH score (0.833; 95% CI 0.748-0.918). The c-statistics for DNN (0.895; DeLong's p-value 0.715), and SVM (0.900; DeLong's p-value 0.619), though greater than that of the original ICH score (0.862), were not significantly different. For prognosticating 90-day PFO, the class-balanced accuracies for DNN (0.853; 95% CI 0.772-0.934; McNemar's p-value 0.003) and SVM (0.860; 95% CI 0.781-0.939; McNemar's p-value 0.004) were better than that of the ICH-Grading Scale (0.706; 95% CI 0.600-0.812). The c-statistic for SVM (0.883; DeLong's p-value 0.022) was significantly greater than that of the ICH-Grading Scale (0.778), while the c-statistic for DNN was 0.864 (DeLong's p-value 0.055). CONCLUSION We showed that the SVM model performs significantly better than clinical prognostic scores in predicting 90-day PFO in SICH.
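The comparison in this abstract rests on two statistics: class-balanced accuracy (the mean of per-class recalls) and McNemar's test on paired predictions. A minimal sketch of both, assuming invented labels and predictions rather than the study's data or code (the `model` and `score` arrays and all resulting numbers are purely illustrative):

```python
# Illustrative sketch (not the study's code): class-balanced accuracy and
# McNemar's test statistic for comparing two predictors on the same cases.
# All labels below are invented; 1 = poor functional outcome, 0 = good.

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls (sensitivity and specificity for binary labels)."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        members = [i for i, y in enumerate(y_true) if y == c]
        correct = sum(1 for i in members if y_pred[i] == c)
        recalls.append(correct / len(members))
    return sum(recalls) / len(recalls)

def mcnemar_statistic(y_true, pred_a, pred_b):
    """Continuity-corrected chi-square statistic on discordant pairs,
    i.e. cases where exactly one of the two predictors is correct."""
    b = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a == t and m != t)
    c = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a != t and m == t)
    if b + c == 0:
        return 0.0
    return (abs(b - c) - 1) ** 2 / (b + c)

y_true = [1, 1, 1, 0, 0, 0, 0, 1]   # invented outcomes
model  = [1, 1, 0, 0, 0, 1, 0, 1]   # hypothetical ML-model predictions
score  = [1, 0, 0, 0, 1, 1, 0, 1]   # hypothetical clinical-score predictions

print(balanced_accuracy(y_true, model))         # -> 0.75
print(mcnemar_statistic(y_true, model, score))  # -> 0.5
```

The DeLong test the study also reports compares c-statistics (areas under the ROC curve) rather than paired correctness, and is omitted from this sketch.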
Affiliation(s)
- Mervyn Jun Rui Lim
- Division of Neurosurgery, University Surgical Centre, National University Hospital, Singapore.
- Kai Jie Ng
- Yong Loo Lin School of Medicine, National University of Singapore
- Ne-Hooi Will Loh
- Department of Anaesthesia, National University Hospital, Singapore
- Sein Lwin
- Division of Neurosurgery, University Surgical Centre, National University Hospital, Singapore
- Kejia Teo
- Division of Neurosurgery, University Surgical Centre, National University Hospital, Singapore
- Vincent Diong Weng Nga
- Division of Neurosurgery, University Surgical Centre, National University Hospital, Singapore
- Tseng Tsai Yeo
- Division of Neurosurgery, University Surgical Centre, National University Hospital, Singapore
- Mehul Motani
- Department of Electrical and Computer Engineering, National University of Singapore; N.1 Institute for Health, National University of Singapore; Institute for Data Science, National University of Singapore
45
Sajjadian M, Lam RW, Milev R, Rotzinger S, Frey BN, Soares CN, Parikh SV, Foster JA, Turecki G, Müller DJ, Strother SC, Farzan F, Kennedy SH, Uher R. Machine learning in the prediction of depression treatment outcomes: a systematic review and meta-analysis. Psychol Med 2021; 51:2742-2751. [PMID: 35575607] [DOI: 10.1017/s0033291721003871]
Abstract
BACKGROUND Multiple treatments are effective for major depressive disorder (MDD), but the outcomes of each treatment vary broadly among individuals. Accurate prediction of outcomes is needed to help select a treatment that is likely to work for a given person. We aim to examine the performance of machine learning methods in delivering replicable predictions of treatment outcomes. METHODS Of 7732 non-duplicate records identified through literature search, we retained 59 eligible reports and extracted data on sample, treatment, predictors, machine learning method, and treatment outcome prediction. A minimum sample size of 100 and an adequate validation method were used to identify adequate-quality studies. The effects of study features on prediction accuracy were tested with mixed-effects models. Fifty-four of the studies provided accuracy estimates or other estimates that allowed calculation of balanced accuracy of predicting outcomes of treatment. RESULTS Eight adequate-quality studies reported a mean accuracy of 0.63 [95% confidence interval (CI) 0.56-0.71], which was significantly lower than a mean accuracy of 0.75 (95% CI 0.72-0.78) in the other 46 studies. Among the adequate-quality studies, accuracies were higher when predicting treatment resistance (0.69) and lower when predicting remission (0.60) or response (0.56). The choice of machine learning method, feature selection, and the ratio of features to individuals were not associated with reported accuracy. CONCLUSIONS The negative relationship between study quality and prediction accuracy, combined with a lack of independent replication, invites caution when evaluating the potential of machine learning applications for personalizing the treatment of depression.
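The balanced accuracy this review pools is just the mean of sensitivity and specificity, which lets studies that report only those two figures be placed on a common, prevalence-independent scale. A small illustrative sketch; the study names and numbers below are invented, not taken from the review:

```python
# Illustrative sketch: recovering balanced accuracy from a study's reported
# sensitivity and specificity. Both studies and all numbers are invented.

def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    # Mean of the two class-wise recalls; unlike raw accuracy it is
    # unaffected by outcome prevalence, so it is comparable across cohorts.
    return (sensitivity + specificity) / 2

reported = {"study A": (0.70, 0.56), "study B": (0.80, 0.70)}
for name, (sens, spec) in reported.items():
    print(name, round(balanced_accuracy(sens, spec), 2))
```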
Affiliation(s)
- Mehri Sajjadian
- Department of Psychiatry, Dalhousie University, Halifax, NS, Canada
- Raymond W Lam
- Department of Psychiatry, University of British Columbia, Vancouver, BC, Canada
- Roumen Milev
- Department of Psychiatry and Psychology, Queen's University, Providence Care Hospital, Kingston, ON, Canada
- Susan Rotzinger
- Department of Psychiatry, University of Toronto, Toronto, ON, Canada
- Department of Psychiatry, St. Michael's Hospital, University of Toronto, Toronto, ON, Canada
- Benicio N Frey
- Department of Psychiatry and Behavioural Neurosciences, McMaster University, Hamilton, ON, Canada
- Mood Disorders Program and Women's Health Concerns Clinic, St. Joseph's Healthcare Hamilton, Hamilton, ON, Canada
- Claudio N Soares
- Department of Psychiatry, Queen's University School of Medicine, Kingston, ON, Canada
- Sagar V Parikh
- Department of Psychiatry, University of Michigan, Ann Arbor, MI, USA
- Jane A Foster
- Department of Psychiatry & Behavioural Neurosciences, St. Joseph's Healthcare, Hamilton, ON, Canada
- Gustavo Turecki
- Department of Psychiatry, Douglas Institute, McGill University, Montreal, QC, Canada
- Daniel J Müller
- Campbell Family Mental Health Research Institute, Center for Addiction and Mental Health, Toronto, ON, Canada
- Department of Psychiatry, University of Toronto, Toronto, ON, Canada
- Stephen C Strother
- Baycrest and Department of Medical Biophysics, Rotman Research Center, University of Toronto, Toronto, ON, Canada
- Faranak Farzan
- eBrain Lab, School of Mechatronic Systems Engineering, Simon Fraser University, Surrey, BC, Canada
- Sidney H Kennedy
- Department of Psychiatry, University of Toronto, Toronto, ON, Canada
- Department of Psychiatry, St. Michael's Hospital, University of Toronto, Toronto, ON, Canada
- Department of Psychiatry, University Health Network, Toronto, ON, Canada
- Krembil Research Centre, University Health Network, University of Toronto, Toronto, ON, Canada
- Rudolf Uher
- Department of Psychiatry, Dalhousie University, Halifax, NS, Canada
46
Oala L, Murchison AG, Balachandran P, Choudhary S, Fehr J, Leite AW, Goldschmidt PG, Johner C, Schörverth EDM, Nakasi R, Meyer M, Cabitza F, Baird P, Prabhu C, Weicken E, Liu X, Wenzel M, Vogler S, Akogo D, Alsalamah S, Kazim E, Koshiyama A, Piechottka S, Macpherson S, Shadforth I, Geierhofer R, Matek C, Krois J, Sanguinetti B, Arentz M, Bielik P, Calderon-Ramirez S, Abbood A, Langer N, Haufe S, Kherif F, Pujari S, Samek W, Wiegand T. Machine learning for health: algorithm auditing & quality control. J Med Syst 2021; 45:105. [PMID: 34729675] [PMCID: PMC8562935] [DOI: 10.1007/s10916-021-01783-y]
Abstract
Developers proposing new machine learning for health (ML4H) tools often pledge to match or even surpass the performance of existing tools, yet the reality is usually more complicated. Reliable deployment of ML4H to the real world is challenging as examples from diabetic retinopathy or Covid-19 screening show. We envision an integrated framework of algorithm auditing and quality control that provides a path towards the effective and reliable application of ML systems in healthcare. In this editorial, we give a summary of ongoing work towards that vision and announce a call for participation to the special issue Machine Learning for Health: Algorithm Auditing & Quality Control in this journal to advance the practice of ML4H auditing.
Affiliation(s)
- Jana Fehr
- Hasso-Plattner-Institute of Digital Engineering, Potsdam, Germany
- Alixandro Werneck Leite
- Machine Learning Laboratory in Finance and Organizations, Universidade de Brasília, Brasília, Brazil
- Xiaoxuan Liu
- University Hospitals Birmingham NHS Foundation Trust & Academic Unit of Ophthalmology, Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, United Kingdom
- Shada Alsalamah
- Information Systems Department, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
- Digital Health and Innovation Department, Science Division, World Health Organization, Winterthur, Switzerland
- Emre Kazim
- University College London, London, United Kingdom
- Joachim Krois
- Oral Diagnostics, Digital Health and Health Services Research, Charité-Universitätsmedizin, Berlin, Germany
- Matthew Arentz
- Department of Global Health, University of Washington, Washington, USA
- Nicolas Langer
- Department of Psychology, University of Zurich, Zürich, Switzerland
- Ferath Kherif
- Laboratory for Research in Neuroimaging, Department of Clinical Neuroscience, Lausanne University Hospital and University of Lausanne, Lausanne, Switzerland
- Sameer Pujari
- Digital Health and Innovation Department, Science Division, World Health Organization, Winterthur, Switzerland
47
Falconer N, Abdel-Hafez A, Scott IA, Marxen S, Canaris S, Barras M. Systematic review of machine learning models for personalised dosing of heparin. Br J Clin Pharmacol 2021; 87:4124-4139. [PMID: 33835524] [DOI: 10.1111/bcp.14852]
Abstract
AIM To identify and critically appraise studies of prediction models, developed using machine learning (ML) methods, for determining the optimal dosing of unfractionated heparin (UFH). METHODS Embase, PubMed, CINAHL, Web of Science, International Pharmaceutical Abstracts and IEEE Xplore databases were searched from inception to 31 January 2020 to identify relevant studies using key search terms synonymous with artificial intelligence or ML, 'prediction', 'dose', 'activated partial thromboplastin time (aPTT)' and 'UFH.' Studies had to have used ML methods for developing models that predicted optimal dose of UFH or target therapeutic aPTT levels in the hospital setting. The CHARMS Checklist was used to assess quality and risk of bias of included studies. RESULTS Of 8393 retrieved abstracts, 61 underwent full text review and eight studies met inclusion criteria. Four studies described models for predicting aPTT, three studies described models predicting optimal dose of heparin during dialysis and one study described a model that used surrogate outcomes of clotting and bleeding to predict a therapeutic aPTT. Studies varied widely in reporting of study participants, feature characterisation and selection, handling of missing data, sample size calculations and the intended clinical application of the model. Only one study conducted an external validation and no studies evaluated model impacts in clinical practice. CONCLUSION Studies of ML models for UFH dosing are few and none report a model ready for routine clinical use. Existing studies are limited by low methodological quality, inadequate reporting of study factors and absence of external validation and impact analysis.
Affiliation(s)
- Nazanin Falconer
- Department of Pharmacy, Princess Alexandra Hospital, Brisbane, Queensland, 4102, Australia
- School of Pharmacy, The University of Queensland, Brisbane, Queensland, 4102, Australia
- Centre for Health Services Research, The University of Queensland, Level two, Building 33, Princess Alexandra Hospital, Brisbane, 4102, Australia
- Ahmad Abdel-Hafez
- Clinical Informatics, Princess Alexandra Hospital, Brisbane, Queensland, 4102, Australia
- Ian A Scott
- Department of Internal Medicine and Clinical Epidemiology, Princess Alexandra Hospital, Brisbane, Queensland, Australia
- School of Clinical Medicine, Faculty of Medicine, The University of Queensland, 4102, Australia
- Sven Marxen
- Department of Pharmacy, Logan and Beaudesert Hospitals, Meadowbrook, Metro South Health, Brisbane, QLD, 4131, Australia
- Stephen Canaris
- Clinical Informatics, Princess Alexandra Hospital, Brisbane, Queensland, 4102, Australia
- Michael Barras
- Department of Pharmacy, Princess Alexandra Hospital, Brisbane, Queensland, 4102, Australia
- School of Pharmacy, The University of Queensland, Brisbane, Queensland, 4102, Australia
48
Reddy S, Rogers W, Makinen VP, Coiera E, Brown P, Wenzel M, Weicken E, Ansari S, Mathur P, Casey A, Kelly B. Evaluation framework to guide implementation of AI systems into healthcare settings. BMJ Health Care Inform 2021; 28:bmjhci-2021-100444. [PMID: 34642177] [PMCID: PMC8513218] [DOI: 10.1136/bmjhci-2021-100444]
Abstract
Objectives To date, many artificial intelligence (AI) systems have been developed in healthcare, but adoption has been limited. This may be due to inappropriate or incomplete evaluation and a lack of internationally recognised AI standards on evaluation. To have confidence in the generalisability of AI systems in healthcare and to enable their integration into workflows, there is a need for a practical yet comprehensive instrument to assess the translational aspects of the available AI systems. Currently available evaluation frameworks for AI in healthcare focus on reporting and regulatory aspects but offer little guidance on assessing the translational aspects of AI systems, such as the functional, utility and ethical components. Methods To address this gap and create a framework that assesses real-world systems, an international team has developed a translationally focused evaluation framework termed ‘Translational Evaluation of Healthcare AI (TEHAI)’. A critical review of the literature assessed existing evaluation and reporting frameworks and their gaps. Next, using health technology evaluation and translational principles, reporting components were identified for consideration. These were independently reviewed for consensus inclusion in a final framework by an international panel of eight experts. Results TEHAI includes three main components: capability, utility and adoption. The emphasis on translational and ethical features of model development and deployment distinguishes TEHAI from other evaluation instruments. Specifically, the evaluation components can be applied at any stage of the development and deployment of the AI system. Discussion One major limitation of existing reporting or evaluation frameworks is their narrow focus. TEHAI, because of its strong foundation in translational research models and its emphasis on safety, translational value and generalisability, has not only a theoretical basis but also practical application to assessing real-world systems. Conclusion The translational research approach used to develop TEHAI should see it applied not just to the evaluation of clinical AI in research settings, but more broadly to guide the evaluation of working clinical systems.
Affiliation(s)
- Sandeep Reddy
- School of Medicine, Deakin University, Geelong, Victoria, Australia
- Wendy Rogers
- Department of Philosophy, Macquarie University, Sydney, New South Wales, Australia
- Ville-Petteri Makinen
- South Australian Health and Medical Research Institute, Adelaide, South Australia, Australia
- Enrico Coiera
- Australian Institute of Health Innovation, Macquarie University, Sydney, New South Wales, Australia
- Pieta Brown
- Orion Health, Auckland, New Zealand
- Markus Wenzel
- Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute (HHI), Berlin, Germany
- Eva Weicken
- Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute (HHI), Berlin, Germany
- Saba Ansari
- Deakin University Faculty of Health, Geelong, Victoria, Australia
- Piyush Mathur
- Anesthesiology Institute, Cleveland Clinic, Cleveland, Ohio, USA
- Aaron Casey
- South Australian Health and Medical Research Institute, Adelaide, South Australia, Australia
- Blair Kelly
- Deakin University Faculty of Health, Geelong, Victoria, Australia
49
Allen B, Dreyer K, Stibolt R, Agarwal S, Coombs L, Treml C, Elkholy M, Brink L, Wald C. Evaluation and real-world performance monitoring of artificial intelligence models in clinical practice: try it, buy it, check it. J Am Coll Radiol 2021; 18:1489-1496. [PMID: 34599876] [DOI: 10.1016/j.jacr.2021.08.022]
Abstract
The pace of regulatory clearance of artificial intelligence (AI) algorithms for radiology continues to accelerate, and numerous algorithms are becoming available for use in clinical practice. End users of AI in radiology should be aware that AI algorithms may not work as expected when used beyond the institutions in which they were trained, and model performance may degrade over time. In this article, we discuss why regulatory clearance alone may not be enough to ensure AI will be safe and effective in all radiological practices, and we review strategies and available resources for evaluating AI models before clinical use and monitoring their performance to ensure efficacy and patient safety.
Affiliation(s)
- Bibb Allen
- Chief Medical Officer, ACR Data Science Institute; Department of Radiology, Grandview Medical Center, Birmingham, Alabama.
- Keith Dreyer
- Chief Science Officer, ACR Data Science Institute; Massachusetts General Hospital, Boston, Massachusetts
- Robert Stibolt
- Diagnostic Radiology, Brookwood Baptist Health, Birmingham, Alabama
- Chris Treml
- ACR Data Science Institute, Reston, Virginia
- Laura Brink
- ACR Data Science Institute, Reston, Virginia
50
Haymond S, McCudden C. Rise of the machines: artificial intelligence and the clinical laboratory. J Appl Lab Med 2021; 6:1640-1654. [PMID: 34379752] [DOI: 10.1093/jalm/jfab075]
Abstract
BACKGROUND Artificial intelligence (AI) is rapidly being developed and implemented to augment and automate decision-making across healthcare systems. As an essential part of these systems, laboratories will see significant growth in AI applications for the foreseeable future. CONTENT In laboratory medicine, AI can be used for operational decision-making and automating or augmenting human-based workflows. Specific applications include instrument automation, error detection, forecasting, result interpretation, test utilization, genomics, and image analysis. If they are not doing so today, clinical laboratories will be using AI routinely in the future; laboratory experts should therefore understand their potential role in this new area and the opportunities offered by AI technologies. The roles of laboratorians range from passively providing data to fuel algorithms to developing entirely new algorithms, with subject matter expertise a natural fit in between. The technical development of algorithms is only part of the overall picture; the type, availability, and quality of data are at least as important. Implementation of AI algorithms also poses technical and usability challenges that need to be understood for it to succeed. Finally, as AI algorithms continue to become available, it is important to understand how to evaluate their validity and utility in the real world. SUMMARY This review provides an overview of what AI is, examples of how it is currently being used in laboratory medicine, different ways for laboratorians to get involved in algorithm development, and key considerations for AI algorithm implementation and critical evaluation.
Affiliation(s)
- Shannon Haymond
- Department of Pathology, Northwestern University Feinberg School of Medicine, Chicago, IL; Ann & Robert H. Lurie Children's Hospital of Chicago, Chicago, IL
- Christopher McCudden
- Department of Pathology & Laboratory Medicine, University of Ottawa; The Ottawa Hospital; and the Eastern Ontario Regional Laboratory Association, Canada