1
Yu S, Jeon BR, Liu C, Kim D, Park HI, Park HD, Shin JH, Lee JH, Choi Q, Kim S, Yun YM, Cho EJ. Laboratory Preparation for Digital Medicine in Healthcare 4.0: An Investigation Into the Awareness and Applications of Big Data and Artificial Intelligence. Ann Lab Med 2024; 44:562-571. [PMID: 38953115] [PMCID: PMC11375187] [DOI: 10.3343/alm.2024.0111]
Abstract
Background Healthcare 4.0 refers to the integration of advanced technologies, such as artificial intelligence (AI) and big data analysis, into the healthcare sector. Recognizing the impact of Healthcare 4.0 technologies in laboratory medicine (LM), we sought to assess the overall awareness and implementation of Healthcare 4.0 among members of the Korean Society for Laboratory Medicine (KSLM). Methods A web-based survey was conducted using an anonymous questionnaire. The survey comprised 36 questions covering demographic information (seven questions), big data (10 questions), and AI (19 questions). Results In total, 182 (17.9%) of 1,017 KSLM members participated in the survey. Thirty-two percent of respondents considered AI to be the most important technology in LM in the era of Healthcare 4.0, closely followed by 31% who favored big data. Approximately 80% of respondents were familiar with big data but had not conducted research using it, and 71% were willing to participate in future big data research conducted by the KSLM. Across the various divisions, respondents viewed AI as most valuable in molecular genetics. More than half of the respondents were open to using AI as an assistive tool rather than a complete replacement for their roles. Conclusions This survey highlighted KSLM members' awareness of the potential applications and implications of big data and AI. We emphasize the complexity of AI integration in healthcare, citing technical and ethical challenges that lead to diverse opinions on its impact on employment and training. This underscores the need for a holistic approach to adopting new technologies.
Affiliation(s)
- Shinae Yu
- Department of Laboratory Medicine, Haeundae Paik Hospital, Inje University College of Medicine, Busan, Korea
- Byung Ryul Jeon
- Department of Laboratory Medicine & Genetics, Soonchunhyang University Bucheon Hospital, Soonchunhyang University College of Medicine, Bucheon, Korea
- Changseung Liu
- Department of Laboratory Medicine, Gangneung Asan Hospital, University of Ulsan College of Medicine, Gangneung, Korea
- Dokyun Kim
- Department of Laboratory Medicine and Research Institute of Bacterial Resistance, Yonsei University College of Medicine, Seoul, Korea
- Hae-Il Park
- Department of Laboratory Medicine, College of Medicine, The Catholic University of Korea, Seoul, Korea
- Hyung Doo Park
- Department of Laboratory Medicine and Genetics, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
- Jeong Hwan Shin
- Department of Laboratory Medicine, Inje University College of Medicine, Busan, Korea
- Jun Hyung Lee
- Department of Laboratory Medicine, GC Labs, Yongin, Korea
- Qute Choi
- Department of Laboratory Medicine, Chungnam National University Sejong Hospital, Chungnam National University School of Medicine, Daejeon, Korea
- Sollip Kim
- Department of Laboratory Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
- Yeo Min Yun
- Department of Laboratory Medicine, Konkuk University Medical Center, Konkuk University School of Medicine, Seoul, Korea
- Eun-Jung Cho
- Department of Laboratory Medicine, Hallym University Dongtan Sacred Heart Hospital, Hallym University College of Medicine, Hwaseong, Korea
2
Spinos D, Martinos A, Petsiou DP, Mistry N, Garas G. Artificial Intelligence in Temporal Bone Imaging: A Systematic Review. Laryngoscope 2024. [PMID: 39352072] [DOI: 10.1002/lary.31809]
Abstract
OBJECTIVE The human temporal bone comprises more than 30 identifiable anatomical components. With the demand for precise image interpretation in this complex region, the use of artificial intelligence (AI) applications is steadily increasing. This systematic review aims to highlight the current role of AI in temporal bone imaging. DATA SOURCES MEDLINE (PubMed), the Cochrane Library, and EMBASE were systematically searched for English-language publications. REVIEW METHODS The search algorithm consisted of key terms such as 'artificial intelligence,' 'machine learning,' 'deep learning,' 'neural network,' 'temporal bone,' and 'vestibular schwannoma.' Additionally, manual retrieval was conducted to capture any studies potentially missed in the initial search. All abstracts and full texts were screened against the inclusion and exclusion criteria. RESULTS A total of 72 studies were included; 95.8% were retrospective, and 88.9% were based on internal databases. Approximately two-thirds involved an AI-to-human comparison. Computed tomography (CT) was the imaging modality in 54.2% of the studies, with vestibular schwannoma (VS) being the most frequently studied condition (37.5%). Fifty-eight of the 72 articles employed neural networks, with 72.2% using various types of convolutional neural network models. Quality assessment of the included publications yielded a mean score of 13.6 ± 2.5 on a 20-point scale based on the CONSORT-AI extension. CONCLUSION Current research data highlight AI's potential to enhance diagnostic accuracy with faster results and fewer performance errors than clinicians, thus improving patient care. However, the shortcomings of the existing research, often marked by heterogeneity and variable quality, underscore the need for more standardized methodological approaches to ensure the consistency and reliability of future data. LEVEL OF EVIDENCE NA Laryngoscope, 2024.
Affiliation(s)
- Dimitrios Spinos
- South Warwickshire NHS Foundation Trust, Warwick, UK
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Anastasios Martinos
- National and Kapodistrian University of Athens School of Medicine, Athens, Greece
- Nina Mistry
- Gloucestershire Hospitals NHS Foundation Trust, ENT, Head and Neck Surgery, Gloucester, UK
- George Garas
- Surgical Innovation Centre, Department of Surgery and Cancer, Imperial College London, St. Mary's Hospital, London, UK
- Athens Medical Center, Marousi & Psychiko Clinic, Athens, Greece
3
Joussellin V, Meneyrol E, Lederlin M, Jouneau S, Terzi N, Tadié JM, Gacouin A. Admission chest CT scan of intensive care patients with interstitial lung disease: Unveiling its limited predictive value through visual and automated analyses in a retrospective study (ILDICTO). Respir Med Res 2024; 86:101140. [PMID: 39357461] [DOI: 10.1016/j.resmer.2024.101140]
Abstract
BACKGROUND Predicting the clinical course of patients with interstitial lung disease (ILD) admitted to the intensive care unit (ICU) for acute respiratory failure (ARF) can be challenging. This study aimed to characterize the prognostic value of the admission chest CT scan in this situation. METHODS We retrospectively included ILD patients admitted to a French ICU for acute respiratory failure requiring oxygen. Patients with lymphangitis carcinomatosis and ANCA vasculitis were excluded. We analyzed every admission chest CT scan using two different approaches: a visual analysis (grading the extent of traction bronchiectasis, ground glass, and honeycombing) and an automated analysis (grading the extent of ground glass and consolidation with dedicated software). The primary outcome was ICU mortality. RESULTS Between January 2014 and October 2020, 81 patients presented with acute respiratory failure and ILD on the admission chest CT scan. In univariate analysis, only the main pulmonary artery diameter differed between patients who survived and those who died in the ICU (30 vs. 32 mm, p = 0.021). In multivariate analysis, none of the radiological findings was associated with ICU mortality. Visual and automated analyses did not yield different results, with a strong correlation between the two methods. However, the identification of a usual interstitial pneumonia (UIP) pattern (and the presence of honeycombing) was associated with a poorer response to corticosteroid therapy. CONCLUSION Our study showed that the extent of radiological findings and the severity of fibrosis indices on admission chest CT scans of ILD patients admitted to the ICU for ARF were not associated with subsequent deterioration.
Affiliation(s)
- Vincent Joussellin
- CHU Rennes, Maladies Infectieuses et Réanimation Médicale, F-35033 Rennes, France; Université Rennes1, Faculté de Médecine, Biosit, F-35043 Rennes, France
- Eric Meneyrol
- CHU Rennes, Maladies Infectieuses et Réanimation Médicale, F-35033 Rennes, France; Université Rennes1, Faculté de Médecine, Biosit, F-35043 Rennes, France
- Mathieu Lederlin
- Department of Radiology, CHU Rennes, Univ Rennes, LTSI, INSERM U1099, Rennes, France
- Stéphane Jouneau
- Department of Respiratory Medicine, Reference Centre for Rare Pulmonary Diseases, CHU Rennes, Univ Rennes, Rennes, France; IRSET UMR1085, Univ Rennes, Rennes, France
- Nicolas Terzi
- CHU Rennes, Maladies Infectieuses et Réanimation Médicale, F-35033 Rennes, France; Université Rennes1, Faculté de Médecine, Biosit, F-35043 Rennes, France; Inserm-CIC-1414, Faculté de Médecine, Université Rennes I, IFR 140, F-35033 Rennes, France
- Jean-Marc Tadié
- CHU Rennes, Maladies Infectieuses et Réanimation Médicale, F-35033 Rennes, France; Université Rennes1, Faculté de Médecine, Biosit, F-35043 Rennes, France; Inserm-CIC-1414, Faculté de Médecine, Université Rennes I, IFR 140, F-35033 Rennes, France
- Arnaud Gacouin
- CHU Rennes, Maladies Infectieuses et Réanimation Médicale, F-35033 Rennes, France; Université Rennes1, Faculté de Médecine, Biosit, F-35043 Rennes, France; Inserm-CIC-1414, Faculté de Médecine, Université Rennes I, IFR 140, F-35033 Rennes, France
4
Lim CY, Sohn B, Seong M, Kim EY, Kim ST, Won SY. Need for Transparency and Clinical Interpretability in Hemorrhagic Stroke Artificial Intelligence Research: Promoting Effective Clinical Application. Yonsei Med J 2024; 65:611-618. [PMID: 39313452] [PMCID: PMC11427125] [DOI: 10.3349/ymj.2024.0007]
Abstract
PURPOSE This study aimed to evaluate the quality of artificial intelligence (AI)/machine learning (ML) studies on hemorrhagic stroke using the Minimum Information for Medical AI Reporting (MINIMAR) and Minimum Information About Clinical Artificial Intelligence Modeling (MI-CLAIM) frameworks to promote clinical application. MATERIALS AND METHODS PubMed, MEDLINE, and Embase were searched for AI/ML studies on hemorrhagic stroke. Out of the 531 articles found, 29 relevant original research articles were included. MINIMAR and MI-CLAIM scores were assigned by two experienced radiologists to assess the quality of the studies. RESULTS We analyzed 29 investigations that utilized AI/ML in the field of hemorrhagic stroke, involving a median of 224.5 patients. The majority of studies focused on diagnostic outcomes using computed tomography scans (89.7%) and were published in computer science journals (48.3%). The overall adherence rates to reporting guidelines, as assessed through the MINIMAR and MI-CLAIM frameworks, were 47.6% and 46.0%, respectively. In MINIMAR, none of the studies reported the socioeconomic status of the patients or how missing values had been addressed. In MI-CLAIM, only two studies applied model-examination techniques to improve model interpretability. Transparency and reproducibility were limited, as only 10.3% of the studies had publicly shared their code. Cohen's kappa between the two radiologists was 0.811 and 0.779 for MINIMAR and MI-CLAIM, respectively. CONCLUSION The overall reporting quality of published AI/ML studies on hemorrhagic stroke is suboptimal. It is necessary to incorporate model examination techniques for interpretability and promote code openness to enhance transparency and increase the clinical applicability of AI/ML studies.
Affiliation(s)
- Chae Young Lim
- Department of Radiology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
- Beomseok Sohn
- Department of Radiology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
- Minjung Seong
- Department of Radiology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
- Eung Yeop Kim
- Department of Radiology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
- Sung Tae Kim
- Department of Radiology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
- So Yeon Won
- Department of Radiology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
5
Is EE, Menekseoglu AK. Comparative performance of artificial intelligence models in rheumatology board-level questions: evaluating Google Gemini and ChatGPT-4o. Clin Rheumatol 2024. [PMID: 39340572] [DOI: 10.1007/s10067-024-07154-5]
Abstract
OBJECTIVES This study evaluates the performance of AI models, ChatGPT-4o and Google Gemini, in answering rheumatology board-level questions, comparing their effectiveness, reliability, and applicability in clinical practice. METHOD A cross-sectional study was conducted using 420 rheumatology questions from the BoardVitals question bank, excluding 27 visual data questions. Both artificial intelligence models categorized the questions according to difficulty (easy, medium, hard) and answered them. In addition, the reliability of the answers was assessed by asking the questions a second time. The accuracy, reliability, and difficulty categorization of the AI models' responses to the questions were analyzed. RESULTS ChatGPT-4o answered 86.9% of the questions correctly, significantly outperforming Google Gemini's 60.2% accuracy (p < 0.001). When the questions were asked a second time, the success rate was 86.7% for ChatGPT-4o and 60.5% for Google Gemini. Both models mainly categorized questions as medium difficulty. ChatGPT-4o showed higher accuracy in various rheumatology subfields, notably in Basic and Clinical Science (p = 0.028), Osteoarthritis (p = 0.023), and Rheumatoid Arthritis (p < 0.001). CONCLUSIONS ChatGPT-4o significantly outperformed Google Gemini in rheumatology board-level questions. This demonstrates the success of ChatGPT-4o in situations requiring complex and specialized knowledge related to rheumatological diseases. The performance of both AI models decreased as question difficulty increased. This study demonstrates the potential of AI in clinical applications and suggests that its use as a tool to assist clinicians may improve healthcare efficiency in the future. Future studies using real clinical scenarios and real board questions are recommended. Key Points:
• ChatGPT-4o significantly outperformed Google Gemini in answering rheumatology board-level questions, achieving 86.9% accuracy compared with Google Gemini's 60.2%.
• For both AI models, the correct-answer rate decreased as question difficulty increased.
• The study demonstrates the potential for AI models to be used in clinical practice as tools to assist clinicians and improve healthcare efficiency.
Affiliation(s)
- Enes Efe Is
- Department of Physical Medicine and Rehabilitation, Sisli Hamidiye Etfal Training and Research Hospital, University of Health Sciences, Seyrantepe Campus, Cumhuriyet ve Demokrasi Avenue, Istanbul, Turkey
- Ahmet Kivanc Menekseoglu
- Department of Physical Medicine and Rehabilitation, Kanuni Sultan Süleyman Training and Research Hospital, University of Health Sciences, Istanbul, Turkey
6
Novak A, Ather S, Gill A, Aylward P, Maskell G, Cowell GW, Espinosa Morgado AT, Duggan T, Keevill M, Gamble O, Akrama O, Belcher E, Taberham R, Hallifax R, Bahra J, Banerji A, Bailey J, James A, Ansaripour A, Spence N, Wrightson J, Jarral W, Barry S, Bhatti S, Astley K, Shadmaan A, Ghelman S, Baenen A, Oke J, Bloomfield C, Johnson H, Beggs M, Gleeson F. Evaluation of the impact of artificial intelligence-assisted image interpretation on the diagnostic performance of clinicians in identifying pneumothoraces on plain chest X-ray: a multi-case multi-reader study. Emerg Med J 2024; 41:602-609. [PMID: 39009424] [DOI: 10.1136/emermed-2023-213620]
Abstract
BACKGROUND Artificial intelligence (AI)-assisted image interpretation is a fast-developing area of clinical innovation. Most research to date has focused on the performance of AI-assisted algorithms in comparison with that of radiologists, rather than evaluating the algorithms' impact on the clinicians who often undertake initial image interpretation in routine clinical practice. This study assessed the impact of AI-assisted image interpretation on the diagnostic performance of frontline acute care clinicians for the detection of pneumothoraces (PTX). METHODS A multicentre blinded multi-case multi-reader study was conducted between October 2021 and January 2022. The online study recruited 18 clinician readers from six different clinical specialties, with differing levels of seniority, across four English hospitals. The study included 395 plain CXR images, 189 positive for PTX and 206 negative. The reference standard was the consensus opinion of two thoracic radiologists, with a third acting as arbitrator. The General Electric Healthcare Critical Care Suite (GEHC CCS) PTX algorithm was applied to the final dataset. Readers individually interpreted the dataset without AI assistance, recording the presence or absence of a PTX and a confidence rating. Following a 'washout' period, this process was repeated including the AI output. RESULTS The algorithm detected or ruled out a PTX with an overall AUROC of 0.939. Overall reader sensitivity increased by 11.4% (95% CI 4.8 to 18.0; p=0.002), from 66.8% (95% CI 57.3 to 76.2) unaided to 78.1% (95% CI 72.2 to 84.0) aided; specificity was 93.9% (95% CI 90.9 to 97.0) without AI and 95.8% (95% CI 93.7 to 97.9) with AI (p=0.247). The junior reader subgroup showed the largest improvement at 21.7% (95% CI 10.9 to 32.6), increasing from 56.0% (95% CI 37.7 to 74.3) to 77.7% (95% CI 65.8 to 89.7; p<0.01). CONCLUSION The study indicates that AI-assisted image interpretation significantly enhances the diagnostic accuracy of clinicians in detecting PTX, particularly benefiting less experienced practitioners. While overall interpretation time remained unchanged, the use of AI improved diagnostic confidence and sensitivity, especially among junior clinicians. These findings underscore the potential of AI to support less experienced clinicians in acute care settings.
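The sensitivity and specificity figures above are simple confusion-matrix ratios over the 189 PTX-positive and 206 PTX-negative images. As a minimal illustrative sketch (the reader counts below are invented for the example, not taken from the study):

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical unaided reader on 189 positive / 206 negative images:
# 126 pneumothoraces correctly flagged, 13 normal films called positive.
sens, spec = sensitivity_specificity(tp=126, fn=63, tn=193, fp=13)
print(f"sensitivity={sens:.1%}, specificity={spec:.1%}")
```

With these invented counts the reader's sensitivity is 66.7% and specificity 93.7%, close to the unaided figures reported in the abstract.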
Affiliation(s)
- Alex Novak
- Emergency Department, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Sarim Ather
- Radiology Department, Oxford University Hospitals, Oxford, UK
- Avneet Gill
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Peter Aylward
- Report and Image Quality Control (RAIQC), London, UK
- Giles Maskell
- Royal Cornwall Hospitals NHS Trust, Truro, Cornwall, UK
- Tom Duggan
- Buckinghamshire Healthcare NHS Trust, Amersham, UK
- Melissa Keevill
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Olivia Gamble
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Osama Akrama
- Emergency Department, Royal Berkshire NHS Foundation Trust, Reading, UK
- Rhona Taberham
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Rob Hallifax
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Jasdeep Bahra
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Jon Bailey
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Antonia James
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Ali Ansaripour
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Nathan Spence
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- John Wrightson
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Waqas Jarral
- Frimley Health NHS Foundation Trust, Frimley, UK
- Steven Barry
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Saher Bhatti
- Frimley Health NHS Foundation Trust, Frimley, UK
- Kerry Astley
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Amied Shadmaan
- GE Healthcare Diagnostic Imaging, Little Chalfont, Buckinghamshire, UK
- Jason Oke
- University of Oxford Greyfriars, Oxford, UK
- Mark Beggs
- University of Oxford, Oxford, Oxfordshire, UK
- Fergus Gleeson
- Radiology Department, Oxford University Hospitals, Oxford, UK
7
Wahid KA, Kaffey ZY, Farris DP, Humbert-Vidan L, Moreno AC, Rasmussen M, Ren J, Naser MA, Netherton TJ, Korreman S, Balakrishnan G, Fuller CD, Fuentes D, Dohopolski MJ. Artificial intelligence uncertainty quantification in radiotherapy applications - A scoping review. Radiother Oncol 2024; 201:110542. [PMID: 39299574] [DOI: 10.1016/j.radonc.2024.110542]
Abstract
BACKGROUND/PURPOSE The use of artificial intelligence (AI) in radiotherapy (RT) is expanding rapidly. However, there exists a notable lack of clinician trust in AI models, underscoring the need for effective uncertainty quantification (UQ) methods. The purpose of this study was to scope existing literature related to UQ in RT, identify areas of improvement, and determine future directions. METHODS We followed the PRISMA-ScR scoping review reporting guidelines. We utilized the population (human cancer patients), concept (utilization of AI UQ), context (radiotherapy applications) framework to structure our search and screening process. We conducted a systematic search spanning seven databases, supplemented by manual curation, up to January 2024. Our search yielded a total of 8980 articles for initial review. Manuscript screening and data extraction were performed in Covidence. Data extraction categories included general study characteristics, RT characteristics, AI characteristics, and UQ characteristics. RESULTS We identified 56 articles published from 2015 to 2024. Ten domains of RT applications were represented; most studies evaluated auto-contouring (50%), followed by image synthesis (13%) and multiple applications simultaneously (11%). Twelve disease sites were represented, with head and neck cancer being the most common disease site independent of application space (32%). Imaging data was used in 91% of studies, while only 13% incorporated RT dose information. Most studies focused on failure detection as the main application of UQ (60%), with Monte Carlo dropout being the most commonly implemented UQ method (32%), followed by ensembling (16%). 55% of studies did not share code or datasets. CONCLUSION Our review revealed a lack of diversity in UQ for RT applications beyond auto-contouring. Moreover, we identified a clear need to study additional UQ methods, such as conformal prediction. Our results may incentivize the development of guidelines for reporting and implementation of UQ in RT.
Affiliation(s)
- Kareem A Wahid
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA; Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Zaphanlene Y Kaffey
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- David P Farris
- Research Medical Library, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Laia Humbert-Vidan
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Amy C Moreno
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Jintao Ren
- Department of Oncology, Aarhus University Hospital, Denmark
- Mohamed A Naser
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Tucker J Netherton
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Stine Korreman
- Department of Oncology, Aarhus University Hospital, Denmark
- Clifton D Fuller
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- David Fuentes
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Michael J Dohopolski
- Department of Radiation Oncology, The University of Texas Southwestern Medical Center, Dallas, TX, USA
8
Pham JH, Thongprayoon C, Miao J, Suppadungsuk S, Koirala P, Craici IM, Cheungpasitporn W. Large language model triaging of simulated nephrology patient inbox messages. Front Artif Intell 2024; 7:1452469. [PMID: 39315245] [PMCID: PMC11417033] [DOI: 10.3389/frai.2024.1452469]
Abstract
Background Efficient triage of patient communications is crucial for timely medical attention and improved care. This study evaluates ChatGPT's accuracy in categorizing nephrology patient inbox messages, assessing its potential in outpatient settings. Methods One hundred and fifty simulated patient inbox messages were created based on cases typically encountered in everyday practice at a nephrology outpatient clinic. These messages were triaged as non-urgent, urgent, and emergent by two nephrologists. The messages were then submitted to ChatGPT-4 for independent triage into the same categories. The inquiry process was performed twice with a two-week period in between. ChatGPT responses were graded as correct (agreement with physicians), overestimation (higher priority), or underestimation (lower priority). Results In the first trial, ChatGPT correctly triaged 140 (93%) messages, overestimated the priority of 4 messages (3%), and underestimated the priority of 6 messages (4%). In the second trial, it correctly triaged 140 (93%) messages, overestimated the priority of 9 (6%), and underestimated the priority of 1 (1%). The accuracy did not depend on the urgency level of the message (p = 0.19). The internal agreement of ChatGPT responses was 92% with an intra-rater Kappa score of 0.88. Conclusion ChatGPT-4 demonstrated high accuracy in triaging nephrology patient messages, highlighting the potential for AI-driven triage systems to enhance operational efficiency and improve patient care in outpatient clinics.
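The intra-rater agreement statistic reported above (Cohen's kappa) can be computed directly from its definition, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance from the marginal label frequencies. A minimal sketch with toy triage labels (invented for illustration, not the study data):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label both times.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from the marginal frequencies of each category.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Toy example: two runs of triage labels over the same ten messages.
run1 = ["non-urgent"] * 5 + ["urgent"] * 3 + ["emergent"] * 2
run2 = ["non-urgent"] * 5 + ["urgent"] * 2 + ["non-urgent"] + ["emergent"] * 2
print(round(cohens_kappa(run1, run2), 3))
```

For these toy labels the two runs agree on 9 of 10 messages (p_o = 0.9) while chance agreement is 0.4, giving κ ≈ 0.83.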
Affiliation(s)
- Justin H Pham
- Mayo Clinic College of Medicine and Science, Mayo Clinic, Rochester, MN, United States
- Charat Thongprayoon
- Department of Nephrology and Hypertension, Mayo Clinic, Rochester, MN, United States
- Jing Miao
- Department of Nephrology and Hypertension, Mayo Clinic, Rochester, MN, United States
- Supawadee Suppadungsuk
- Department of Nephrology and Hypertension, Mayo Clinic, Rochester, MN, United States
- Faculty of Medicine Ramathibodi Hospital, Chakri Naruebodindra Medical Institute, Mahidol University, Samut Prakan, Thailand
- Priscilla Koirala
- Department of Internal Medicine, Mayo Clinic, Rochester, MN, United States
- Iasmina M Craici
- Department of Nephrology and Hypertension, Mayo Clinic, Rochester, MN, United States
- Wisit Cheungpasitporn
- Department of Nephrology and Hypertension, Mayo Clinic, Rochester, MN, United States
9
Novak A, Hollowday M, Espinosa Morgado AT, Oke J, Shelmerdine S, Woznitza N, Metcalfe D, Costa ML, Wilson S, Kiam JS, Vaz J, Limphaibool N, Ventre J, Jones D, Greenhalgh L, Gleeson F, Welch N, Mistry A, Devic N, Teh J, Ather S. Evaluating the impact of artificial intelligence-assisted image analysis on the diagnostic accuracy of front-line clinicians in detecting fractures on plain X-rays (FRACT-AI): protocol for a prospective observational study. BMJ Open 2024; 14:e086061. [PMID: 39237277] [PMCID: PMC11381697] [DOI: 10.1136/bmjopen-2024-086061]
Abstract
INTRODUCTION Missed fractures are the most frequent diagnostic error attributed to clinicians in UK emergency departments and a significant cause of patient morbidity. Recently, advances in computer vision have led to artificial intelligence (AI)-enhanced model developments, which can support clinicians in the detection of fractures. Previous research has shown these models to have promising effects on diagnostic performance, but their impact on the diagnostic accuracy of clinicians in the National Health Service (NHS) setting has not yet been fully evaluated. METHODS AND ANALYSIS A dataset of 500 plain radiographs derived from Oxford University Hospitals (OUH) NHS Foundation Trust will be collated to include all bones except the skull, facial bones and cervical spine. The dataset will be split evenly between radiographs showing one or more fractures and those without. The reference ground truth for each image will be established through independent review by two senior musculoskeletal radiologists. A third senior radiologist will resolve disagreements between two primary radiologists. The dataset will be analysed by a commercially available AI tool, BoneView (Gleamer, Paris, France), and its accuracy for detecting fractures will be determined with reference to the ground truth diagnosis. We will undertake a multiple case multiple reader study in which clinicians interpret all images without AI support, then repeat the process with access to AI algorithm output following a 4-week washout. 18 clinicians will be recruited as readers from four hospitals in England, from six distinct clinical groups, each with three levels of seniority (early-stage, mid-stage and later-stage career). Changes in the accuracy, confidence and speed of reporting will be compared with and without AI support. Readers will use a secure web-based DICOM (Digital Imaging and Communications in Medicine) viewer (www.raiqc.com), allowing radiograph viewing and abnormality identification. 
Pooled analyses will be reported for overall reader performance as well as for subgroups including clinical role, level of seniority, pathological finding and difficulty of image. ETHICS AND DISSEMINATION The study has been approved by the UK Healthcare Research Authority (IRAS 310995, approved on 13 December 2022). The use of anonymised retrospective radiographs has been authorised by OUH NHS Foundation Trust. The results will be presented at relevant conferences and published in a peer-reviewed journal. TRIAL REGISTRATION NUMBERS This study is registered with ISRCTN (ISRCTN19562541) and ClinicalTrials.gov (NCT06130397). The paper reports the results of a substudy of STEDI2 (Simulation Training for Emergency Department Imaging Phase 2).
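The core comparison in a multiple-reader, multiple-case design like this is paired: each reader interprets the same images with and without AI, so per-reader accuracy can be compared on discordant images. The sketch below is an illustrative, stdlib-only toy (the reads and image labels are hypothetical, not study data; the protocol's actual pooled analysis will be more elaborate), showing per-reader accuracy and an exact McNemar test on the discordant pairs.

```python
from math import comb

def accuracy(reads, truth):
    """Fraction of images read correctly (1 = fracture present, 0 = absent)."""
    return sum(r == t for r, t in zip(reads, truth)) / len(truth)

def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar test on discordant image pairs:
    b = correct only without AI, c = correct only with AI."""
    n = b + c
    if n == 0:
        return 1.0
    # binomial tail at p = 0.5, doubled for a two-sided test
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical reads by one clinician over eight images.
truth   = [1, 1, 1, 0, 0, 0, 1, 0]
unaided = [1, 0, 1, 0, 1, 0, 0, 0]
aided   = [1, 1, 1, 0, 0, 0, 1, 1]

b = sum(u == t != a for u, a, t in zip(unaided, aided, truth))  # lost with AI
c = sum(a == t != u for u, a, t in zip(unaided, aided, truth))  # gained with AI

print(accuracy(unaided, truth), accuracy(aided, truth))
print(mcnemar_exact_p(b, c))  # large p here: a toy sample this small is inconclusive
```

In practice such per-reader results would then be pooled across the 18 readers and broken down by the subgroups named above.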
Affiliation(s)
- Alex Novak
- Emergency Medicine Research Oxford, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Max Hollowday
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Jason Oke
- Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK
- Susan Shelmerdine
- Clinical Radiology, Great Ormond Street Hospital for Children, London, UK
- Radiology, UCL GOSH ICH, London, UK
- NIHR Great Ormond Street Hospital Biomedical Research Centre, London, UK
- Nick Woznitza
- Radiology, University College London Hospitals NHS Foundation Trust, London, UK
- Canterbury Christ Church University, Canterbury, UK
- David Metcalfe
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Matthew L Costa
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK
- Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences (NDORMS), Oxford Trauma & Emergency Care (OxTEC), University of Oxford, Oxford, UK
- Sarah Wilson
- Frimley Health NHS Foundation Trust, Frimley, UK
- Jian Shen Kiam
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- James Vaz
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Fergus Gleeson
- Department of Oncology, University of Oxford, Oxford, UK
- Nick Welch
- Patient and Public Involvement Member, Oxford, UK
- Alpesh Mistry
- Liverpool University Hospitals NHS Foundation Trust, Liverpool, UK
- North West MSK Imaging, Liverpool, UK
- Natasa Devic
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- James Teh
- Nuffield Orthopaedic Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Sarim Ather
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK
10. Hennessy A, Tran TH, Sasikumar SN, Al-Falahi Z. Machine learning, advanced data analysis, and a role in pregnancy care? How can we help improve preeclampsia outcomes? Pregnancy Hypertens 2024; 37:101137. [PMID: 38875933 DOI: 10.1016/j.preghy.2024.101137]
Abstract
The value of machine learning in maternal health, and in particular in the prediction of preeclampsia, will only be realised when high-quality clinical data are provided, representative populations are included, different health systems and models of care are compared, and a culture of rapid use and application of real-time data and outcomes is established. This review has been undertaken to provide an overview of the language and early results of machine learning in a pregnancy and preeclampsia context. Clinicians of all backgrounds are encouraged to learn the language of machine learning (ML) and artificial intelligence (AI) to better understand their potential and utility to improve outcomes for women and their families. This review will outline some definitions and features of ML that will benefit clinicians' knowledge in the preeclampsia discipline, and also outline some of the future possibilities for preeclampsia-focussed clinicians via an understanding of AI. It will further explore the critical importance of defining the risk and the outcome being determined.
Affiliation(s)
- Annemarie Hennessy
- Campbelltown Hospital, South Western Sydney Local Health District, Sydney, Australia; Western Sydney University, Sydney, Australia; University of Sydney, Sydney, Australia.
- Tu Hao Tran
- Campbelltown Hospital, South Western Sydney Local Health District, Sydney, Australia; Ingham Institute for Applied Medical Research, SWERI (South Western Emergency Research Institute), Australia.
- Suraj Narayanan Sasikumar
- Ingham Institute for Applied Medical Research, SWERI (South Western Emergency Research Institute), Australia.
- Zaidon Al-Falahi
- University of Sydney, Sydney, Australia; Ingham Institute for Applied Medical Research, SWERI (South Western Emergency Research Institute), Australia.
11. Cresswell K, de Keizer N, Magrabi F, Williams R, Rigby M, Prgomet M, Kukhareva P, Wong ZSY, Scott P, Craven CK, Georgiou A, Medlock S, Brender McNair J, Ammenwerth E. Evaluating Artificial Intelligence in Clinical Settings-Let Us Not Reinvent the Wheel. J Med Internet Res 2024; 26:e46407. [PMID: 39110494 PMCID: PMC11339570 DOI: 10.2196/46407]
Abstract
Given the requirement to minimize the risks and maximize the benefits of technology applications in health care provision, there is an urgent need to incorporate theory-informed health IT (HIT) evaluation frameworks into existing and emerging guidelines for the evaluation of artificial intelligence (AI). Such frameworks can help developers, implementers, and strategic decision makers to build on experience and the existing empirical evidence base. We provide a pragmatic conceptual overview of selected concrete examples of how existing theory-informed HIT evaluation frameworks may be used to inform the safe development and implementation of AI in health care settings. The list is not exhaustive and is intended to illustrate applications in line with various stakeholder requirements. Existing HIT evaluation frameworks can help to inform AI-based development and implementation by supporting developers and strategic decision makers in considering relevant technology, user, and organizational dimensions. This can facilitate the design of technologies, their implementation in user and organizational settings, and the sustainability and scalability of technologies.
Affiliation(s)
- Kathrin Cresswell
- Usher Institute, The University of Edinburgh, Usher Building, Edinburgh, United Kingdom
- Nicolette de Keizer
- Amsterdam UMC, University of Amsterdam, Medical Informatics, Amsterdam, Netherlands
- Amsterdam Public Health Research Institute, Digital Health and Quality of Care, Amsterdam, Netherlands
- Farah Magrabi
- Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
- Robin Williams
- Institute for the Study of Science, Technology and Innovation, The University of Edinburgh, Edinburgh, United Kingdom
- Michael Rigby
- School of Social, Political and Global Studies and School of Primary, Community and Social Care, Keele University, Keele, United Kingdom
- Mirela Prgomet
- Faculty of Medicine, Health and Human Sciences, Macquarie University, Sydney, Australia
- Polina Kukhareva
- Department of Biomedical Informatics, University of Utah, Utah, UT, United States
- Philip Scott
- University of Wales Trinity St David, Swansea, United Kingdom
- Catherine K Craven
- University of Texas Health Science Center, San Antonio, TX, United States
- Andrew Georgiou
- Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
- Stephanie Medlock
- Amsterdam UMC, University of Amsterdam, Medical Informatics, Amsterdam, Netherlands
- Amsterdam Public Health, Methodology & Aging & Later Life, Amsterdam, Netherlands
- Jytte Brender McNair
- Department of Health Science and Technology, Aalborg University, Aalborg, Denmark
- Elske Ammenwerth
- Institute of Medical Informatics, Private University for Health Sciences and Health Technology, UMIT TIROL, Hall in Tirol, Austria
12. Arzamasov K, Vasilev Y, Zelenova M, Pestrenin L, Busygina Y, Bobrovskaya T, Chetverikov S, Shikhmuradov D, Pankratov A, Kirpichev Y, Sinitsyn V, Son I, Omelyanskaya O. Independent evaluation of the accuracy of 5 artificial intelligence software for detecting lung nodules on chest X-rays. Quant Imaging Med Surg 2024; 14:5288-5303. [PMID: 39144030 PMCID: PMC11320553 DOI: 10.21037/qims-24-160]
Abstract
Background The integration of artificial intelligence (AI) into medicine is growing, with some experts predicting its standalone use soon. However, skepticism remains due to limited positive outcomes from independent validations. This research evaluates AI software's effectiveness in analyzing chest X-rays (CXR) to identify lung nodules, a possible lung cancer indicator. Methods This retrospective study analyzed 7,670,212 record pairs from radiological exams conducted between 2020 and 2022 during the Moscow Computer Vision Experiment, focusing on CXR and computed tomography (CT) scans. All images were acquired during clinical routine. The final dataset comprised 100 CXR images (50 with lung nodules, 50 without), selected consecutively and based on inclusion and exclusion criteria, to evaluate the performance of all five AI-based solutions participating in the Moscow Computer Vision Experiment and analyzing CXR. The evaluation was performed in three stages. In the first stage, the probability of a lung nodule obtained from each AI service was compared with the ground truth (1 = nodule present, 0 = no nodule). In the second stage, three radiologists evaluated the segmentation of nodules performed by the AI services (1 = nodule correctly segmented, 0 = nodule incorrectly segmented or not segmented at all). In the third stage, the same radiologists additionally evaluated the classification of the nodules (1 = nodule correctly segmented and classified, 0 = all other cases). The results obtained in stages 2 and 3 were compared with the ground truth, which was common to all three stages. For each stage, diagnostic accuracy metrics were calculated for each AI service. Results Three software solutions (Celsus, Lunit INSIGHT CXR, and qXR) demonstrated diagnostic metrics that matched or surpassed the vendor specifications; the best achieved an area under the receiver operating characteristic curve (AUC) of 0.956 [95% confidence interval (CI): 0.918 to 0.994].
However, when evaluated by three radiologists for accurate nodule segmentation and classification, all solutions performed below the vendor-declared metrics, with the highest AUC reaching 0.812 (95% CI: 0.744 to 0.879). Meanwhile, all AI services demonstrated 100% specificity at stages 2 and 3 of the study. Conclusions To ensure the reliability and applicability of AI-based software, it is crucial to validate performance metrics using high-quality datasets and engage radiologists in the evaluation process. Developers are recommended to improve the accuracy of the underlying models before allowing the standalone use of the software for lung nodule detection. The dataset created during the study may be accessed at https://mosmed.ai/datasets/mosmeddatargogksnalichiemiotsutstviemlegochnihuzlovtipvii/.
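The stage-1 evaluation described above reduces to scoring AI-issued nodule probabilities against a binary ground truth. As a minimal, stdlib-only sketch (the probabilities and labels below are hypothetical toy values, not study data), sensitivity and specificity come from thresholding the scores, and the AUC can be computed rank-wise as the probability that a nodule case outranks a normal case:

```python
def sens_spec(scores, labels, thr=0.5):
    """Sensitivity and specificity of thresholded probabilities vs. ground truth."""
    tp = sum(s >= thr and y == 1 for s, y in zip(scores, labels))
    tn = sum(s < thr and y == 0 for s, y in zip(scores, labels))
    pos = sum(labels)
    neg = len(labels) - pos
    return tp / pos, tn / neg

def auc(scores, labels):
    """AUC as the fraction of (nodule, normal) pairs ranked correctly (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical AI probabilities; 1 = nodule per the ground truth.
scores = [0.9, 0.8, 0.4, 0.35, 0.7, 0.2]
labels = [1,   1,   1,   0,    0,   0]

print(sens_spec(scores, labels))  # sensitivity and specificity at the 0.5 cutoff
print(auc(scores, labels))        # 8/9: one of nine case pairs is mis-ranked
```

Confidence intervals like the 95% CI quoted above would additionally require a variance estimate (e.g. DeLong's method) or bootstrapping, which is beyond this sketch.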
Affiliation(s)
- Kirill Arzamasov
- State Budget-Funded Health Care Institution of the City of Moscow “Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department”, Moscow, Russian Federation
- MIREA – Russian Technological University, Moscow, Russian Federation
- Yuriy Vasilev
- State Budget-Funded Health Care Institution of the City of Moscow “Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department”, Moscow, Russian Federation
- Federal State Budgetary Institution “National Medical and Surgical Center named after N.I. Pirogov” of the Ministry of Health of the Russian Federation, Moscow, Russian Federation
- Maria Zelenova
- State Budget-Funded Health Care Institution of the City of Moscow “Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department”, Moscow, Russian Federation
- Lev Pestrenin
- State Budget-Funded Health Care Institution of the City of Moscow “Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department”, Moscow, Russian Federation
- Yulia Busygina
- State Budget-Funded Health Care Institution of the City of Moscow “Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department”, Moscow, Russian Federation
- Tatiana Bobrovskaya
- State Budget-Funded Health Care Institution of the City of Moscow “Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department”, Moscow, Russian Federation
- Sergey Chetverikov
- State Budget-Funded Health Care Institution of the City of Moscow “Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department”, Moscow, Russian Federation
- David Shikhmuradov
- State Budget-Funded Health Care Institution of the City of Moscow “Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department”, Moscow, Russian Federation
- Andrey Pankratov
- State Budget-Funded Health Care Institution of the City of Moscow “Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department”, Moscow, Russian Federation
- Yury Kirpichev
- State Budget-Funded Health Care Institution of the City of Moscow “Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department”, Moscow, Russian Federation
- Valentin Sinitsyn
- State Budget-Funded Health Care Institution of the City of Moscow “Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department”, Moscow, Russian Federation
- Lomonosov Moscow State University, Moscow, Russian Federation
- Irina Son
- Federal State Budgetary Educational Institution of Further Professional Education “Russian Medical Academy of Continuous Professional Education” of the Ministry of Healthcare of the Russian Federation, Moscow, Russian Federation
- Olga Omelyanskaya
- State Budget-Funded Health Care Institution of the City of Moscow “Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department”, Moscow, Russian Federation
13. Cè M, Ibba S, Cellina M, Tancredi C, Fantesini A, Fazzini D, Fortunati A, Perazzo C, Presta R, Montanari R, Forzenigo L, Carrafiello G, Papa S, Alì M. Radiologists' perceptions on AI integration: An in-depth survey study. Eur J Radiol 2024; 177:111590. [PMID: 38959557 DOI: 10.1016/j.ejrad.2024.111590]
Abstract
PURPOSE To assess the perceptions and attitudes of radiologists toward the adoption of artificial intelligence (AI) in clinical practice. METHODS A survey was conducted among members of the SIRM Lombardy. Radiologists' attitudes were assessed comprehensively, covering satisfaction with AI-based tools, propensity for innovation, and optimism for the future. The questionnaire consisted of two sections: the first gathered demographic and professional information using categorical responses, while the second evaluated radiologists' attitudes toward AI through Likert-type responses ranging from 1 to 5 (with 1 representing extremely negative attitudes, 3 indicating a neutral stance, and 5 reflecting extremely positive attitudes). Questionnaire refinement involved an iterative process with expert panels and a pilot phase to enhance consistency and eliminate redundancy. Exploratory data analysis employed descriptive statistics and visual assessment of Likert plots, supported by non-parametric tests for subgroup comparisons for a thorough analysis of specific emerging patterns. RESULTS The survey yielded 232 valid responses. The findings reveal a generally optimistic outlook on AI adoption, especially among young radiologists (<30) and seasoned professionals (>60, p < 0.01). However, while 36.2% (84 out of 232) of subjects reported daily use of AI-based tools, only a third considered their contribution decisive (30%, 25 out of 84). AI literacy varied, with a notable proportion feeling inadequately informed (36%, 84 out of 232), particularly among younger radiologists (46%, p < 0.01). Positive attitudes toward the potential of AI to improve detection and characterization of anomalies and to reduce workload (positive answers >80%) were consistent across subgroups. Radiologists' opinions were more skeptical about the role of AI in enhancing decision-making processes, including the choice of further investigation, and in personalized medicine in general.
Overall, respondents recognized AI's significant impact on the radiology profession, viewing it as an opportunity (61%, 141 out of 232) rather than a threat (18%, 42 out of 232), with a majority expressing belief in AI's relevance to future radiologists' career choices (60%, 139 out of 232). However, there were some concerns, particularly among breast radiologists (20 of 232 respondents), regarding the potential impact of AI on the profession. Eighty-four percent of respondents still consider the final assessment by the radiologist to be essential. CONCLUSION Our results indicate an overall positive attitude towards the adoption of AI in radiology, though this is moderated by concerns regarding training and practical efficacy. Addressing AI literacy gaps, especially among younger radiologists, is essential. Furthermore, proactively adapting to technological advancements is crucial to fully leverage AI's potential benefits. Despite the generally positive outlook among radiologists, there remains significant work to be done to enhance the integration and widespread use of AI tools in clinical practice.
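The analysis pipeline this survey describes (descriptive Likert summaries plus non-parametric subgroup comparisons) can be sketched in a few lines. The following is an illustrative, stdlib-only toy with hypothetical responses; in a real analysis the p-value for the rank statistic would come from, e.g., scipy.stats.mannwhitneyu.

```python
from collections import Counter
from statistics import median

def likert_summary(responses):
    """Descriptive summary of 1-5 Likert responses."""
    c = Counter(responses)
    n = len(responses)
    return {"n": n,
            "median": median(responses),
            "pct_positive": (c[4] + c[5]) / n,   # 4-5 = positive attitude
            "pct_negative": (c[1] + c[2]) / n}

def mann_whitney_u(a, b):
    """Mann-Whitney U statistic for two subgroups, using midranks for ties."""
    combined = sorted(a + b)
    rank = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        rank[combined[i]] = (i + 1 + j) / 2  # average rank of the tied block
        i = j
    r_a = sum(rank[x] for x in a)
    return r_a - len(a) * (len(a) + 1) / 2

# Hypothetical attitude scores for two seniority subgroups.
young  = [5, 4, 4, 3, 2]
senior = [5, 5, 4, 4, 4]

print(likert_summary(young))   # 60% positive in this toy subgroup
print(mann_whitney_u(young, senior))
```

Midranks matter here because Likert data are heavily tied; collapsing ties incorrectly would bias the U statistic.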
Affiliation(s)
- Maurizio Cè
- Postgraduation School of Radiodiagnostic, University of Milan, via Festa del Perdono 7, 20122 Milan, Italy
- Simona Ibba
- Unit of Diagnostic Imaging and Stereotactic Radiosurgery, CDI Centro Diagnostico Italiano S.p.A., Via Simone Saint Bon 20, 20147 Milan, Italy.
- Michaela Cellina
- Radiology Department, ASST Fatebenefratelli Sacco, Piazza Principessa Clotilde 3, 20121 Milan, Italy.
- Chiara Tancredi
- University Suor Orsola Benincasa, corso Vittorio Emanuele 292, 80135 Naples, Italy.
- Deborah Fazzini
- Unit of Diagnostic Imaging and Stereotactic Radiosurgery, CDI Centro Diagnostico Italiano S.p.A., Via Simone Saint Bon 20, 20147 Milan, Italy.
- Alice Fortunati
- Postgraduation School of Radiodiagnostic, University of Milan, via Festa del Perdono 7, 20122 Milan, Italy.
- Chiara Perazzo
- Postgraduation School of Radiodiagnostic, University of Milan, via Festa del Perdono 7, 20122 Milan, Italy.
- Roberta Presta
- University Suor Orsola Benincasa, corso Vittorio Emanuele 292, 80135 Naples, Italy.
- Roberto Montanari
- University Suor Orsola Benincasa, corso Vittorio Emanuele 292, 80135 Naples, Italy; RE:LAB s.r.l., Via Tamburini, 5, 42122 Reggio Emilia, Italy.
- Laura Forzenigo
- Radiology Department, Fondazione IRCCS Cà Granda Ospedale Maggiore Policlinico, Via Francesco Sforza, 35, 20122, Milan, Italy
- Gianpaolo Carrafiello
- Postgraduation School of Radiodiagnostic, University of Milan, via Festa del Perdono 7, 20122 Milan, Italy; Radiology Department, Fondazione IRCCS Cà Granda Ospedale Maggiore Policlinico, Via Francesco Sforza, 35, 20122, Milan, Italy; Department of Biomedical Sciences for Health, Università degli Studi di Milano, Via Mangiagalli 31, 20133 Milan, Italy
- Sergio Papa
- Unit of Diagnostic Imaging and Stereotactic Radiosurgery, CDI Centro Diagnostico Italiano S.p.A., Via Simone Saint Bon 20, 20147 Milan, Italy.
- Marco Alì
- Unit of Diagnostic Imaging and Stereotactic Radiosurgery, CDI Centro Diagnostico Italiano S.p.A., Via Simone Saint Bon 20, 20147 Milan, Italy; Bracco Imaging SpA, Via Caduti di Marcinelle, 20134 Milan, Italy.
14. Tisherman RT, Bulleit C, Champagne AA, Fatora GC, Lau BC. There is high variability in quantitative measurement techniques in glenohumeral capsular measurements for shoulder instability: A systematic review. Knee Surg Sports Traumatol Arthrosc 2024; 32:2161-2169. [PMID: 38796731 DOI: 10.1002/ksa.12236]
Abstract
PURPOSE Instability of the glenohumeral joint remains a complex clinical issue with high rates of surgical failure and significant morbidity. Advances in specific radiologic measurements involving the glenoid and the humerus have provided insight into glenohumeral pathology, which can be corrected surgically towards improving patient outcomes. The contributions of capsular pathology to ongoing instability remain unclear. The purpose of this study is to provide a systematic review of existing glenohumeral capsular measurement techniques published in the last 15 years. METHODS A systematic review of multiple databases was performed following PRISMA guidelines for all primary research articles between 2008 and 2023 with quantitative measurements of the glenohumeral capsule in patients with instability, including anterior, posterior and multi-directional instability. RESULTS There were a total of 14 articles meeting the inclusion criteria. High variability in measurement methodology across studies was observed, including variable amounts of intra-articular contrast, heterogeneity among magnetic resonance sequence acquisitions, differences in measurements performed and the specific approach taken to compute each measurement. CONCLUSION There is a need for standardization of methods in the measurement of glenohumeral capsular pathology in the setting of glenohumeral instability to allow for cross-study analysis. LEVEL OF EVIDENCE Level III.
Affiliation(s)
- Clark Bulleit
- Duke University Hospital, Durham, North Carolina, USA
- Brian C Lau
- Duke University Hospital, Durham, North Carolina, USA
15. Kaya K, Gietzen C, Hahnfeldt R, Zoubi M, Emrich T, Halfmann MC, Sieren MM, Elser Y, Krumm P, Brendel JM, Nikolaou K, Haag N, Borggrefe J, Krüchten RV, Müller-Peltzer K, Ehrengut C, Denecke T, Hagendorff A, Goertz L, Gertz RJ, Bunck AC, Maintz D, Persigehl T, Lennartz S, Luetkens JA, Jaiswal A, Iuga AI, Pennig L, Kottlors J. Generative Pre-trained Transformer 4 analysis of cardiovascular magnetic resonance reports in suspected myocarditis: A multicenter study. J Cardiovasc Magn Reson 2024; 26:101068. [PMID: 39079602 DOI: 10.1016/j.jocmr.2024.101068]
Abstract
BACKGROUND Diagnosing myocarditis relies on multimodal data, including cardiovascular magnetic resonance (CMR), clinical symptoms, and blood values. The correct interpretation and integration of CMR findings require radiological expertise and knowledge. We aimed to investigate the performance of Generative Pre-trained Transformer 4 (GPT-4), a large language model, for report-based medical decision-making in the context of cardiac MRI for suspected myocarditis. METHODS This retrospective study includes CMR reports from 396 patients with suspected myocarditis across eight centers. CMR reports and patient data including blood values, age, and further clinical information were provided to GPT-4 and to radiologists with 1 (resident 1), 2 (resident 2), and 4 years (resident 3) of experience in CMR and knowledge of the 2018 Lake Louise Criteria. The final impression of the report regarding the radiological assessment of whether myocarditis is present or not was not provided. The performance of GPT-4 and the human readers was compared to a consensus reading (two board-certified radiologists with 8 and 10 years of experience in CMR). Sensitivity, specificity, and accuracy were calculated. RESULTS GPT-4 yielded an accuracy of 83%, sensitivity of 90%, and specificity of 78%, which was comparable to the physician with 1 year of experience (R1: 86%, 90%, 84%, p = 0.14) and lower than that of more experienced physicians (R2: 89%, 86%, 91%, p = 0.007 and R3: 91%, 85%, 96%, p < 0.001). GPT-4 and human readers showed a higher diagnostic performance when results from T1- and T2-mapping sequences were part of the reports, with statistical significance for residents 1 and 3 (p = 0.004 and p = 0.02, respectively).
CONCLUSION GPT-4 yielded good accuracy for diagnosing myocarditis based on CMR reports in a large dataset from multiple centers and therefore holds the potential to serve as a diagnostic decision-supporting tool in this capacity, particularly for less experienced physicians. Further studies are required to explore the full potential and elucidate educational aspects of the integration of large language models in medical decision-making.
Affiliation(s)
- Kenan Kaya
- Institute for Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.
- Carsten Gietzen
- Institute for Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Robert Hahnfeldt
- Institute for Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Maher Zoubi
- Institute for Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Bonn, University of Bonn, Bonn, Germany
- Tilman Emrich
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes-Gutenberg-University, Mainz, Germany; Division of Cardiovascular Imaging, Department of Radiology and Radiological Science, Medical University of South Carolina, Charleston, South Carolina, USA; German Centre for Cardiovascular Research, Partner Site Rhine-Main, Mainz, Germany
- Moritz C Halfmann
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes-Gutenberg-University, Mainz, Germany
- Malte Maria Sieren
- Department of Radiology and Nuclear Medicine, UKSH, Campus Lübeck, Lübeck, Germany; Institute of Interventional Radiology, UKSH, Campus Lübeck, Lübeck, Germany
- Yannic Elser
- Department of Radiology and Nuclear Medicine, UKSH, Campus Lübeck, Lübeck, Germany
- Patrick Krumm
- Department of Radiology, Diagnostic and Interventional Radiology, University of Tübingen, Tübingen, Germany
- Jan M Brendel
- Department of Radiology, Diagnostic and Interventional Radiology, University of Tübingen, Tübingen, Germany
- Konstantin Nikolaou
- Department of Radiology, Diagnostic and Interventional Radiology, University of Tübingen, Tübingen, Germany
- Nina Haag
- Institute for Radiology, Neuroradiology and Nuclear Medicine Johannes Wesling University Hospital/Mühlenkreiskliniken, Bochum/Minden, Germany
- Jan Borggrefe
- Institute for Radiology, Neuroradiology and Nuclear Medicine Johannes Wesling University Hospital/Mühlenkreiskliniken, Bochum/Minden, Germany
- Ricarda von Krüchten
- Department of Diagnostic and Interventional Radiology, Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Katharina Müller-Peltzer
- Department of Diagnostic and Interventional Radiology, Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Constantin Ehrengut
- Department of Diagnostic and Interventional Radiology, University of Leipzig, Leipzig, Germany
- Timm Denecke
- Department of Diagnostic and Interventional Radiology, University of Leipzig, Leipzig, Germany
- Lukas Goertz
- Institute for Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Roman J Gertz
- Institute for Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Alexander Christian Bunck
- Institute for Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- David Maintz
- Institute for Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Thorsten Persigehl
- Institute for Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Simon Lennartz
- Institute for Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Julian A Luetkens
- Institute for Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Bonn, University of Bonn, Bonn, Germany
- Astha Jaiswal
- Institute for Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Andra Iza Iuga
- Institute for Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Lenhard Pennig
- Institute for Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Jonathan Kottlors
- Institute for Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
16. Menekşeoğlu AK, İş EE. Comparative performance of artificial intelligence models in physical medicine and rehabilitation board-level questions. Rev Assoc Med Bras (1992) 2024; 70:e20240241. [PMID: 39045939 PMCID: PMC11262310 DOI: 10.1590/1806-9282.20240241]
Abstract
OBJECTIVES The aim of this study was to compare the performance of the artificial intelligence models ChatGPT-3.5, ChatGPT-4, and Google Bard in answering Physical Medicine and Rehabilitation board-style questions, assessing their capabilities in medical education and potential clinical applications. METHODS A comparative cross-sectional study was conducted using the PMR100, an example question set for the American Board of Physical Medicine and Rehabilitation Part I exam, focusing on the artificial intelligence models' ability to answer and categorize questions by difficulty. The study evaluated the artificial intelligence models and analyzed them for accuracy, reliability, and alignment with difficulty levels determined by physiatrists. RESULTS ChatGPT-4 led with a 74% success rate, followed by Bard at 66%, and ChatGPT-3.5 at 63.8%. Bard showed remarkable answer consistency, altering responses in only 1% of cases. The difficulty assessment by the ChatGPT models closely matched that of physiatrists. The study highlighted nuanced differences in the artificial intelligence models' performance across various Physical Medicine and Rehabilitation subfields. CONCLUSION The study illustrates the potential of artificial intelligence in medical education and clinical settings, with ChatGPT-4 showing a slight edge in performance. It emphasizes the importance of artificial intelligence as a supportive tool for physiatrists, despite the need for careful oversight of artificial intelligence-generated responses to ensure patient safety.
Affiliation(s)
- Ahmet Kıvanç Menekşeoğlu
- University of Health Sciences, Kanuni Sultan Süleyman Education and Training Hospital, Department of Physical Medicine and Rehabilitation – İstanbul, Turkey
- Enes Efe İş
- University of Health Sciences, Sisli Etfal Education and Training Hospital, Department of Physical Medicine and Rehabilitation – İstanbul, Turkey
17
Kathait AS, Garza-Frias E, Sikka T, Schultz TJ, Bizzo B, Kalra MK, Dreyer KJ. Assessing Laterality Errors in Radiology: Comparing Generative Artificial Intelligence and Natural Language Processing. J Am Coll Radiol 2024:S1546-1440(24)00591-X. [PMID: 38960083] [DOI: 10.1016/j.jacr.2024.06.014]
Abstract
PURPOSE We compared the performance of generative artificial intelligence (AI) (Augmented Transformer Assisted Radiology Intelligence [ATARI, Microsoft Nuance, Microsoft Corporation, Redmond, Washington]) and natural language processing (NLP) tools for identifying laterality errors in radiology reports and images. METHODS We used an NLP-based tool (mPower, Microsoft Nuance) to identify radiology reports flagged for laterality errors in its Quality Assurance Dashboard. The NLP model detects and highlights laterality mismatches in radiology reports. From an initial pool of 1,124 radiology reports flagged by the NLP for laterality errors, we selected and evaluated 898 reports that encompassed radiography, CT, MRI, and ultrasound modalities to ensure comprehensive coverage. A radiologist reviewed each radiology report to assess whether the flagged laterality errors were present (reporting error, true-positive) or absent (NLP error, false-positive). Next, we applied ATARI to 237 radiology reports and images with consecutive NLP true-positive (118 reports) and false-positive (119 reports) laterality errors. We estimated the accuracy of the NLP and generative AI tools to identify overall and modality-wise laterality errors. RESULTS Among the 898 NLP-flagged laterality errors, 64% (574 of 898) were NLP errors and 36% (324 of 898) were reporting errors. The text query ATARI feature correctly identified the absence of laterality mismatch (NLP false-positives) with a 97.4% accuracy (115 of 118 reports; 95% confidence interval [CI] = 96.5%-98.3%). Combined vision and text query resulted in 98.3% accuracy (116 of 118 reports or images; 95% CI = 97.6%-99.0%), and vision query alone had a 98.3% accuracy (116 of 118 images; 95% CI = 97.6%-99.0%). CONCLUSION The generative AI-empowered ATARI prototype outperformed the assessed NLP tool for determining true and false laterality errors in radiology reports while enabling an image-based laterality determination. Underlying errors in ATARI text query in complex radiology reports emphasize the need for further improvement in the technology.
Affiliation(s)
- Anjaneya Singh Kathait
- Research Fellow, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts.
- Emiliano Garza-Frias
- Research Fellow, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts; Mass General Brigham AI, Boston, Massachusetts
- Tejash Sikka
- Research Fellow, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts
- Thomas J Schultz
- Research Fellow, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts; Senior Director, Enterprise Medical Imaging, Mass General Brigham AI, Boston, Massachusetts
- Bernardo Bizzo
- Research Fellow, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts; Mass General Brigham AI, Boston, Massachusetts; ACR DSI (Data Science Institute) Senior Scientist and the Senior Director, Digital Clinical Research Organization
- Mannudeep K Kalra
- Research Fellow, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts; Scientific Director, Mass General Brigham AI, Boston, Massachusetts
- Keith J Dreyer
- Research Fellow, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts; Chief Data Science Officer, Mass General Brigham AI, Boston, Massachusetts; ACR DSI Chief Science Officer; Chief Imaging Information, Mass General Brigham; Vice Chairman of Radiology-Informatics, Massachusetts General Hospital and Brigham and Women's Hospital; and Co-Chair, Mass General Brigham AI Imaging AI Governance Committee
18
Albaladejo A, Lorleac'h A, Allain JS. [The spring of artificial intelligence: AI vs. expert for internal medicine cases]. Rev Med Interne 2024; 45:409-414. [PMID: 38331591] [DOI: 10.1016/j.revmed.2024.01.012]
Abstract
INTRODUCTION The "Printemps de la Médecine Interne" are training days for Francophone internists. The clinical cases presented during these days are complex. This study aims to evaluate the diagnostic capabilities of the non-specialized artificial intelligence language models ChatGPT-4 and Bard by confronting them with the puzzles of the "Printemps de la Médecine Interne". METHODS Clinical cases from the "Printemps de la Médecine Interne" 2021 and 2022 were submitted to two language models: ChatGPT-4 and Bard. In case of a wrong answer, a second attempt was offered. We then compared the responses of human internist experts to those of the artificial intelligence. RESULTS Of the 12 clinical cases submitted, human internist experts diagnosed nine, ChatGPT-4 diagnosed three, and Bard diagnosed one. One of the cases solved by ChatGPT-4 was not solved by the internist experts. The language models responded within a few seconds. CONCLUSIONS Currently, the diagnostic skills of ChatGPT-4 and Bard are inferior to those of human experts in solving complex clinical cases but are very promising. Recently made available to the general public, they already have impressive capabilities, calling into question the role of the diagnostic physician. It would be advisable to adapt the rules or subjects of future "Printemps de la Médecine Interne" so that they cannot be solved by a public language model.
Affiliation(s)
- A Albaladejo
- Médecine interne et immunologie clinique, CHU de Rennes, 2, rue Henri-le-Guilloux, 35000 Rennes, France.
- A Lorleac'h
- Groupement hospitalier Bretagne Sud, 5, avenue Choiseul, 56100 Lorient, France.
- J-S Allain
- Groupement hospitalier Bretagne Sud, 5, avenue Choiseul, 56100 Lorient, France.
19
Fujima N, Nakagawa J, Kameda H, Ikebe Y, Harada T, Shimizu Y, Tsushima N, Kano S, Homma A, Kwon J, Yoneyama M, Kudo K. Improvement of image quality in diffusion-weighted imaging with model-based deep learning reconstruction for evaluations of the head and neck. MAGMA 2024; 37:439-447. [PMID: 37989922] [DOI: 10.1007/s10334-023-01129-4]
Abstract
OBJECTIVES To investigate the utility of deep learning (DL)-based image reconstruction using a model-based approach in head and neck diffusion-weighted imaging (DWI). MATERIALS AND METHODS We retrospectively analyzed the cases of 41 patients who underwent head/neck DWI. The DWI in 25 patients demonstrated an untreated lesion. We performed qualitative and quantitative assessments of the DWI with both DL- and conventional parallel imaging (PI)-based reconstructions. For the qualitative assessment, we visually evaluated the overall image quality, soft tissue conspicuity, degree of artifact(s), and lesion conspicuity on a five-point scale. In the quantitative assessment, we measured the signal-to-noise ratio (SNR) of the bilateral parotid glands, submandibular gland, the posterior muscle, and the lesion. We then calculated the contrast-to-noise ratio (CNR) between the lesion and the adjacent muscle. RESULTS Significant differences were observed in the qualitative analysis between the DWI with PI-based and DL-based reconstructions for all of the evaluation items (p < 0.001). In the quantitative analysis, significant differences in the SNR and CNR between the DWI with PI-based and DL-based reconstructions were observed for all of the evaluation items (p = 0.002 to p < 0.001). DISCUSSION DL-based image reconstruction with the model-based technique effectively provided sufficient image quality in head/neck DWI.
Affiliation(s)
- Noriyuki Fujima
- Department of Diagnostic and Interventional Radiology, Hokkaido University Hospital, N14 W5, Kita-Ku, Sapporo, 060-8638, Japan.
- Junichi Nakagawa
- Department of Diagnostic and Interventional Radiology, Hokkaido University Hospital, N14 W5, Kita-Ku, Sapporo, 060-8638, Japan
- Hiroyuki Kameda
- Faculty of Dental Medicine Department of Radiology, Hokkaido University, N13 W7, Kita-Ku, Sapporo, Hokkaido, 060-8586, Japan
- Yohei Ikebe
- Department of Diagnostic Imaging, Graduate School of Medicine, Hokkaido University, N15 W7, Kita-Ku, Sapporo, Hokkaido, 060-8638, Japan
- Center for Cause of Death Investigation, Faculty of Medicine, Hokkaido University, N15 W7, Kita-Ku, Sapporo, Hokkaido, 060-8638, Japan
- Taisuke Harada
- Center for Cause of Death Investigation, Faculty of Medicine, Hokkaido University, N15 W7, Kita-Ku, Sapporo, Hokkaido, 060-8638, Japan
- Yukie Shimizu
- Department of Diagnostic and Interventional Radiology, Hokkaido University Hospital, N14 W5, Kita-Ku, Sapporo, 060-8638, Japan
- Nayuta Tsushima
- Department of Otolaryngology-Head and Neck Surgery, Faculty of Medicine and Graduate School of Medicine, Hokkaido University, N15 W7, Kita Ku, Sapporo, 060-8638, Japan
- Satoshi Kano
- Department of Otolaryngology-Head and Neck Surgery, Faculty of Medicine and Graduate School of Medicine, Hokkaido University, N15 W7, Kita Ku, Sapporo, 060-8638, Japan
- Akihiro Homma
- Department of Otolaryngology-Head and Neck Surgery, Faculty of Medicine and Graduate School of Medicine, Hokkaido University, N15 W7, Kita Ku, Sapporo, 060-8638, Japan
- Jihun Kwon
- Philips Japan, 3-37 Kohnan 2-Chome, Minato-Ku, Tokyo, 108-8507, Japan
- Masami Yoneyama
- Philips Japan, 3-37 Kohnan 2-Chome, Minato-Ku, Tokyo, 108-8507, Japan
- Kohsuke Kudo
- Department of Diagnostic Imaging, Graduate School of Medicine, Hokkaido University, N15 W7, Kita-Ku, Sapporo, Hokkaido, 060-8638, Japan
- Medical AI Research and Development Center, Hokkaido University Hospital, N14 W5, Kita-Ku, Sapporo, Hokkaido, 060-8638, Japan
- Global Center for Biomedical Science and Engineering, Faculty of Medicine, Hokkaido University, N14 W5, Kita-Ku, Sapporo, Hokkaido, 060-8638, Japan
20
Zhang S, Yang B, Yang H, Zhao J, Zhang Y, Gao Y, Monteiro O, Zhang K, Liu B, Wang S. Potential rapid intraoperative cancer diagnosis using dynamic full-field optical coherence tomography and deep learning: A prospective cohort study in breast cancer patients. Sci Bull (Beijing) 2024; 69:1748-1756. [PMID: 38702279] [DOI: 10.1016/j.scib.2024.03.061]
Abstract
An intraoperative diagnosis is critical for precise cancer surgery. However, traditional intraoperative assessments based on hematoxylin and eosin (H&E) histology, such as frozen section, are time-, resource-, and labor-intensive, and raise specimen-consumption concerns. Here, we report a near-real-time automated cancer diagnosis workflow for breast cancer that combines dynamic full-field optical coherence tomography (D-FFOCT), a label-free optical imaging method, and deep learning for bedside tumor diagnosis during surgery. To classify benign and malignant breast tissues, we conducted a prospective cohort trial. In the modeling group (n = 182), D-FFOCT images were captured from April 26 to June 20, 2018, encompassing 48 benign lesions, 114 invasive ductal carcinomas (IDC), 10 invasive lobular carcinomas, 4 ductal carcinomas in situ (DCIS), and 6 rare tumors. A deep learning model was built and fine-tuned on 10,357 D-FFOCT patches. Subsequently, from June 22 to August 17, 2018, independent tests (n = 42) were conducted on 10 benign lesions, 29 IDC, 1 DCIS, and 2 rare tumors. The model yielded excellent performance, with an accuracy of 97.62%, sensitivity of 96.88%, and specificity of 100%; only one IDC was misclassified. Meanwhile, the acquisition of the D-FFOCT images was non-destructive and did not require any tissue preparation or staining procedures. In the simulated intraoperative margin evaluation procedure, the time required for our novel workflow (approximately 3 min) was significantly shorter than that required for traditional procedures (approximately 30 min). These findings indicate that the combination of D-FFOCT and deep learning algorithms can streamline intraoperative cancer diagnosis independently of traditional pathology laboratory procedures.
MESH Headings
- Humans
- Breast Neoplasms/diagnostic imaging
- Breast Neoplasms/surgery
- Breast Neoplasms/pathology
- Tomography, Optical Coherence/methods
- Deep Learning
- Female
- Prospective Studies
- Middle Aged
- Carcinoma, Ductal, Breast/diagnostic imaging
- Carcinoma, Ductal, Breast/surgery
- Carcinoma, Ductal, Breast/pathology
- Aged
- Adult
- Carcinoma, Intraductal, Noninfiltrating/diagnostic imaging
- Carcinoma, Intraductal, Noninfiltrating/surgery
- Carcinoma, Intraductal, Noninfiltrating/pathology
- Intraoperative Period
Affiliation(s)
- Shuwei Zhang
- Breast Center, Peking University People's Hospital, Beijing 100044, China
- Bin Yang
- China ESG Institute, Capital University of Economics and Business, Beijing 100070, China; Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Houpu Yang
- Breast Center, Peking University People's Hospital, Beijing 100044, China
- Jin Zhao
- Breast Center, Peking University People's Hospital, Beijing 100044, China
- Yuanyuan Zhang
- Department of Pathology, Peking University People's Hospital, Beijing 100044, China
- Yuanxu Gao
- Center for Biomedicine and Innovations, Faculty of Medicine, Macau University of Science and Technology, Macao 999078, China
- Olivia Monteiro
- Center for Biomedicine and Innovations, Faculty of Medicine, Macau University of Science and Technology, Macao 999078, China
- Kang Zhang
- Center for Biomedicine and Innovations, Faculty of Medicine, Macau University of Science and Technology, Macao 999078, China; College of Future Technology, Peking University, Beijing 100091, China.
- Bo Liu
- School of Mathematical and Computational Sciences, Massey University, Auckland 0745, New Zealand.
- Shu Wang
- Breast Center, Peking University People's Hospital, Beijing 100044, China.
21
Wahid KA, Kaffey ZY, Farris DP, Humbert-Vidan L, Moreno AC, Rasmussen M, Ren J, Naser MA, Netherton TJ, Korreman S, Balakrishnan G, Fuller CD, Fuentes D, Dohopolski MJ. Artificial Intelligence Uncertainty Quantification in Radiotherapy Applications - A Scoping Review. medRxiv [Preprint] 2024:2024.05.13.24307226. [PMID: 38798581] [PMCID: PMC11118597] [DOI: 10.1101/2024.05.13.24307226]
Abstract
Background/purpose The use of artificial intelligence (AI) in radiotherapy (RT) is expanding rapidly. However, there exists a notable lack of clinician trust in AI models, underscoring the need for effective uncertainty quantification (UQ) methods. The purpose of this study was to scope existing literature related to UQ in RT, identify areas of improvement, and determine future directions. Methods We followed the PRISMA-ScR scoping review reporting guidelines. We utilized the population (human cancer patients), concept (utilization of AI UQ), context (radiotherapy applications) framework to structure our search and screening process. We conducted a systematic search spanning seven databases, supplemented by manual curation, up to January 2024. Our search yielded a total of 8,980 articles for initial review. Manuscript screening and data extraction were performed in Covidence. Data extraction categories included general study characteristics, RT characteristics, AI characteristics, and UQ characteristics. Results We identified 56 articles published from 2015-2024. Ten domains of RT applications were represented; most studies evaluated auto-contouring (50%), followed by image synthesis (13%), and multiple applications simultaneously (11%). Twelve disease sites were represented, with head and neck cancer being the most common disease site independent of application space (32%). Imaging data was used in 91% of studies, while only 13% incorporated RT dose information. Most studies focused on failure detection as the main application of UQ (60%), with Monte Carlo dropout being the most commonly implemented UQ method (32%), followed by ensembling (16%). 55% of studies did not share code or datasets. Conclusion Our review revealed a lack of diversity in UQ for RT applications beyond auto-contouring. Moreover, there was a clear need to study additional UQ methods, such as conformal prediction.
Our results may incentivize the development of guidelines for reporting and implementation of UQ in RT.
Affiliation(s)
- Kareem A. Wahid
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Zaphanlene Y. Kaffey
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- David P. Farris
- Research Medical Library, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Laia Humbert-Vidan
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Amy C. Moreno
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Jintao Ren
- Department of Oncology, Aarhus University Hospital, Denmark
- Mohamed A. Naser
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Tucker J. Netherton
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Stine Korreman
- Department of Oncology, Aarhus University Hospital, Denmark
- Clifton D. Fuller
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- David Fuentes
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Michael J. Dohopolski
- Department of Radiation Oncology, The University of Texas Southwestern Medical Center, Dallas, Texas, USA
22
Baturu M, Solakhan M, Kazaz TG, Bayrak O. Frequently asked questions on erectile dysfunction: evaluating artificial intelligence answers with expert mentorship. Int J Impot Res 2024:10.1038/s41443-024-00898-3. [PMID: 38714784] [DOI: 10.1038/s41443-024-00898-3]
Abstract
The present study assessed the accuracy of artificial intelligence-generated responses to frequently asked questions on erectile dysfunction. A cross-sectional analysis involved 56 erectile dysfunction-related questions searched on Google, categorized into nine sections: causes, diagnosis, treatment options, treatment complications, protective measures, relationship with other illnesses, treatment costs, treatment with herbal agents, and appointments. Responses from ChatGPT 3.5, ChatGPT 4, and BARD were evaluated by two experienced urology experts using the F1 and global quality scores (GQS) for accuracy, relevance, and comprehensibility. ChatGPT 3.5 and ChatGPT 4 achieved higher GQS than BARD in categories such as causes (4.5 ± 0.54, 4.5 ± 0.51, 3.15 ± 1.01, respectively, p < 0.001), treatment options (4.35 ± 0.6, 4.5 ± 0.43, 2.71 ± 1.38, respectively, p < 0.001), protective measures (5.0 ± 0, 5.0 ± 0, 4 ± 0.5, respectively, p = 0.013), relationships with other illnesses (4.58 ± 0.58, 4.83 ± 0.25, 3.58 ± 0.8, respectively, p = 0.006), and treatment with herbal agents (3 ± 0.61, 3.33 ± 0.83, 1.8 ± 1.09, respectively, p = 0.043). F1 scores in the categories causes (1), diagnosis (0.857), treatment options (0.726), and protective measures (1) indicated their alignment with the guidelines. There was no significant difference between ChatGPT 3.5 and ChatGPT 4 regarding answer quality, but both outperformed BARD in the GQS. These results emphasize the need to continually enhance and validate AI-generated medical information, underscoring the importance of artificial intelligence systems in delivering reliable information on erectile dysfunction.
Affiliation(s)
- Muharrem Baturu
- Department of Urology, University of Gaziantep, Gaziantep, Turkey
- Mehmet Solakhan
- Department of Urology, Hasan Kalyoncu University, Gaziantep, Turkey
- Omer Bayrak
- Department of Urology, University of Gaziantep, Gaziantep, Turkey.
23
Ingvar Å, Oloruntoba A, Sashindranath M, Miller R, Soyer HP, Guitera P, Caccetta T, Shumack S, Abbott L, Arnold C, Lawn C, Button-Sloan A, Janda M, Mar V. Minimum labelling requirements for dermatology artificial intelligence-based Software as Medical Device (SaMD): A consensus statement. Australas J Dermatol 2024; 65:e21-e29. [PMID: 38419186] [DOI: 10.1111/ajd.14222]
Abstract
BACKGROUND/OBJECTIVES Artificial intelligence (AI) holds remarkable potential to improve care delivery in dermatology. End users (health professionals and the general public) of AI-based Software as Medical Devices (SaMD) require relevant labelling information to ensure that these devices can be used appropriately. Currently, there are no clear minimum labelling requirements for dermatology AI-based SaMDs. METHODS Common labelling recommendations for AI-based SaMD identified in a recent literature review were evaluated by an Australian expert panel in digital health and dermatology via a modified Delphi consensus process. A nine-point Likert scale was used to indicate the importance of 10 items, and voting was conducted to determine the specific characteristics to include for some items. Consensus was achieved when more than 75% of the experts agreed that inclusion of information was necessary. RESULTS There was robust consensus supporting inclusion of all proposed items as minimum labelling requirements: indication for use, intended user, training and test data sets, algorithm design, image processing techniques, clinical validation, performance metrics, limitations, updates, and adverse events. Nearly all suggested characteristics of the labelling items received endorsement, except for some characteristics related to performance metrics. Moreover, there was consensus that uniform labelling criteria should apply across all AI categories and risk classes set out by the Therapeutic Goods Administration. CONCLUSIONS This study provides critical evidence for setting labelling standards by the Therapeutic Goods Administration to safeguard patients, health professionals, consumers, industry, and regulatory bodies from AI-based dermatology SaMDs that do not currently provide adequate information about how they were developed and tested.
Affiliation(s)
- Åsa Ingvar
- Victorian Melanoma Service, Alfred Health, Melbourne, Victoria, Australia
- School of Public Health and Preventive Medicine, Monash University, Melbourne, Victoria, Australia
- Department of Dermatology, Skåne University Hospital, Lund, Sweden
- Department of Clinical Sciences, Lund University, Lund, Sweden
- Maithili Sashindranath
- School of Public Health and Preventive Medicine, Monash University, Melbourne, Victoria, Australia
- Robert Miller
- Australasian College of Dermatologists, Sydney, Australia
- H Peter Soyer
- Australasian College of Dermatologists, Sydney, Australia
- Dermatology Research Centre, Frazer Institute, The University of Queensland, Brisbane, Queensland, Australia
- Pascale Guitera
- Australasian College of Dermatologists, Sydney, Australia
- Faculty of Medicine and Health, The University of Sydney, Sydney, New South Wales, Australia
- Sydney Melanoma Diagnostic Centre, Royal Prince Alfred Hospital, Camperdown, New South Wales, Australia
- Melanoma Institute Australia, The University of Sydney, Sydney, New South Wales, Australia
- Tony Caccetta
- Australasian College of Dermatologists, Sydney, Australia
- Perth Dermatology Clinic, Perth, Western Australia, Australia
- Stephen Shumack
- Australasian College of Dermatologists, Sydney, Australia
- Royal North Shore Hospital of Sydney, Sydney, New South Wales, Australia
- Lisa Abbott
- Australasian College of Dermatologists, Sydney, Australia
- Faculty of Medicine and Health, The University of Sydney, Sydney, New South Wales, Australia
- The Skin Hospital, Sydney, New South Wales, Australia
- Chris Arnold
- BioGrid Australia Ltd, Melbourne, Australia
- Hodgson Associates, Melbourne, Australia
- Australasian Society of Cosmetic Dermatologists, Melbourne, Australia
- Craig Lawn
- Melanoma Institute Australia, The University of Sydney, Sydney, New South Wales, Australia
- Centre of Excellence in Melanoma Imaging, Brisbane, Queensland, Australia
- Monika Janda
- Australasian College of Dermatologists, Sydney, Australia
- Dermatology Research Centre, Frazer Institute, The University of Queensland, Brisbane, Queensland, Australia
- Centre for Health Services Research, The University of Queensland, Brisbane, Queensland, Australia
- Victoria Mar
- Victorian Melanoma Service, Alfred Health, Melbourne, Victoria, Australia
- School of Public Health and Preventive Medicine, Monash University, Melbourne, Victoria, Australia
- Australasian College of Dermatologists, Sydney, Australia
24
Oloruntoba A, Ingvar Å, Sashindranath M, Anthony O, Abbott L, Guitera P, Caccetta T, Janda M, Soyer HP, Mar V. Examining labelling guidelines for AI-based software as a medical device: A review and analysis of dermatology mobile applications in Australia. Australas J Dermatol 2024. [PMID: 38693690] [DOI: 10.1111/ajd.14269]
Abstract
In recent years, there has been a surge in the development of AI-based Software as a Medical Device (SaMD), particularly in visual specialties such as dermatology. In Australia, the Therapeutic Goods Administration (TGA) regulates AI-based SaMD to ensure its safe use. Proper labelling of these devices is crucial to ensure that healthcare professionals and the general public understand how to use them and interpret results accurately. However, guidelines for labelling AI-based SaMD in dermatology are lacking, which may result in products failing to provide essential information about algorithm development and performance metrics. This review examines existing labelling guidelines for AI-based SaMD across visual medical specialties, with a specific focus on dermatology. Common recommendations for labelling are identified and applied to currently available dermatology AI-based SaMD mobile applications to determine usage of these labels. Of the 21 AI-based SaMD mobile applications identified, none fully comply with common labelling recommendations. Results highlight the need for standardized labelling guidelines. Ensuring transparency and accessibility of information is essential for the safe integration of AI into health care and preventing potential risks associated with inaccurate clinical decisions.
Affiliation(s)
- Åsa Ingvar
- School of Public Health and Preventive Medicine, Monash University, Melbourne, Victoria, Australia
- Victorian Melanoma Service, Alfred Health, Melbourne, Victoria, Australia
- Department of Dermatology, Skåne University Hospital, Lund University, Lund, Sweden
- Department of Clinical Sciences, Skåne University Hospital, Lund University, Lund, Sweden
- Maithili Sashindranath
- School of Public Health and Preventive Medicine, Monash University, Melbourne, Victoria, Australia
- Ojochonu Anthony
- Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, Victoria, Australia
- Lisa Abbott
- Melanoma Institute Australia, The University of Sydney, Sydney, New South Wales, Australia
- Pascale Guitera
- Faculty of Medicine and Health, The University of Sydney, Sydney, New South Wales, Australia
- Sydney Melanoma Diagnostic Centre, Royal Prince Alfred Hospital, Camperdown, New South Wales, Australia
- Perth Dermatology Clinic, Perth, Western Australia, Australia
- Tony Caccetta
- Perth Dermatology Clinic, Perth, Western Australia, Australia
- Monika Janda
- Dermatology Research Centre, Frazer Institute, The University of Queensland, Brisbane, Queensland, Australia
- H Peter Soyer
- Dermatology Research Centre, Frazer Institute, The University of Queensland, Brisbane, Queensland, Australia
- Victoria Mar
- School of Public Health and Preventive Medicine, Monash University, Melbourne, Victoria, Australia
- Victorian Melanoma Service, Alfred Health, Melbourne, Victoria, Australia
25
Hörst F, Rempe M, Heine L, Seibold C, Keyl J, Baldini G, Ugurel S, Siveke J, Grünwald B, Egger J, Kleesiek J. CellViT: Vision Transformers for precise cell segmentation and classification. Med Image Anal 2024; 94:103143. [PMID: 38507894 DOI: 10.1016/j.media.2024.103143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2023] [Revised: 02/14/2024] [Accepted: 03/12/2024] [Indexed: 03/22/2024]
Abstract
Nuclei detection and segmentation in hematoxylin and eosin-stained (H&E) tissue images are important clinical tasks and crucial for a wide range of applications. However, the task is challenging owing to variance in nuclei staining and size, overlapping boundaries, and nuclei clustering. While convolutional neural networks have been extensively used for this task, we explore the potential of Transformer-based networks combined with large-scale pre-training in this domain. We therefore introduce CellViT, a Vision Transformer-based deep learning architecture for automated instance segmentation of cell nuclei in digitized tissue samples. CellViT is trained and evaluated on the PanNuke dataset, one of the most challenging nuclei instance segmentation datasets, consisting of nearly 200,000 nuclei annotated into 5 clinically important classes across 19 tissue types. We demonstrate the superiority of large-scale in-domain and out-of-domain pre-trained Vision Transformers by leveraging the recently published Segment Anything Model and a ViT encoder pre-trained on 104 million histological image patches, achieving state-of-the-art nuclei detection and instance segmentation performance on PanNuke with a mean panoptic quality of 0.50 and an F1 detection score of 0.83. The code is publicly available at https://github.com/TIO-IKIM/CellViT.
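The panoptic quality (PQ) metric reported above jointly scores how well instances are matched and how well matched instances overlap. A minimal sketch of the standard PQ computation (not the authors' implementation; the IoU values below are hypothetical toy numbers):

```python
def panoptic_quality(matched_ious, num_fp, num_fn):
    """Panoptic quality: sum of IoUs over true-positive matches,
    normalized by |TP| + 0.5*|FP| + 0.5*|FN|.
    A predicted and a ground-truth instance count as a true positive
    only when their IoU exceeds 0.5, so each matched IoU here is > 0.5."""
    tp = len(matched_ious)
    denom = tp + 0.5 * num_fp + 0.5 * num_fn
    return sum(matched_ious) / denom if denom else 0.0

# Toy example: two matched nuclei (IoU 0.8 and 0.7),
# one spurious prediction (FP), one missed nucleus (FN).
pq = panoptic_quality([0.8, 0.7], num_fp=1, num_fn=1)
print(round(pq, 2))  # 0.5
```

PQ therefore penalizes both poor segmentation of matched nuclei (lower IoUs) and detection errors (each FP or FN adds half a count to the denominator).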
Affiliation(s)
- Fabian Hörst
- Institute for AI in Medicine (IKIM), University Hospital Essen (AöR), 45131 Essen, Germany; Cancer Research Center Cologne Essen (CCCE), West German Cancer Center Essen, University Hospital Essen (AöR), 45147 Essen, Germany.
- Moritz Rempe
- Institute for AI in Medicine (IKIM), University Hospital Essen (AöR), 45131 Essen, Germany; Cancer Research Center Cologne Essen (CCCE), West German Cancer Center Essen, University Hospital Essen (AöR), 45147 Essen, Germany
- Lukas Heine
- Institute for AI in Medicine (IKIM), University Hospital Essen (AöR), 45131 Essen, Germany; Cancer Research Center Cologne Essen (CCCE), West German Cancer Center Essen, University Hospital Essen (AöR), 45147 Essen, Germany
- Constantin Seibold
- Institute for AI in Medicine (IKIM), University Hospital Essen (AöR), 45131 Essen, Germany; Clinic for Nuclear Medicine, University Hospital Essen (AöR), 45147 Essen, Germany
- Julius Keyl
- Institute for AI in Medicine (IKIM), University Hospital Essen (AöR), 45131 Essen, Germany; Institute of Pathology, University Hospital Essen (AöR), 45147 Essen, Germany
- Giulia Baldini
- Institute for AI in Medicine (IKIM), University Hospital Essen (AöR), 45131 Essen, Germany; Institute of Interventional and Diagnostic Radiology and Neuroradiology, University Hospital Essen (AöR), 45147 Essen, Germany
- Selma Ugurel
- Department of Dermatology, University Hospital Essen (AöR), 45147 Essen, Germany; German Cancer Consortium (DKTK, Partner site Essen), 69120 Heidelberg, Germany
- Jens Siveke
- West German Cancer Center, partner site Essen, a partnership between German Cancer Research Center (DKFZ) and University Hospital Essen, University Hospital Essen (AöR), 45147 Essen, Germany; Bridge Institute of Experimental Tumor Therapy (BIT) and Division of Solid Tumor Translational Oncology (DKTK), West German Cancer Center Essen, University Hospital Essen (AöR), University of Duisburg-Essen, 45147 Essen, Germany
- Barbara Grünwald
- Department of Urology, West German Cancer Center, 45147 University Hospital Essen (AöR), Germany; Princess Margaret Cancer Centre, M5G 2M9 Toronto, Ontario, Canada
- Jan Egger
- Institute for AI in Medicine (IKIM), University Hospital Essen (AöR), 45131 Essen, Germany; Cancer Research Center Cologne Essen (CCCE), West German Cancer Center Essen, University Hospital Essen (AöR), 45147 Essen, Germany
- Jens Kleesiek
- Institute for AI in Medicine (IKIM), University Hospital Essen (AöR), 45131 Essen, Germany; Cancer Research Center Cologne Essen (CCCE), West German Cancer Center Essen, University Hospital Essen (AöR), 45147 Essen, Germany; German Cancer Consortium (DKTK, Partner site Essen), 69120 Heidelberg, Germany; Department of Physics, TU Dortmund University, 44227 Dortmund, Germany
26
VanBerlo B, Hoey J, Wong A. A survey of the impact of self-supervised pretraining for diagnostic tasks in medical X-ray, CT, MRI, and ultrasound. BMC Med Imaging 2024; 24:79. [PMID: 38580932 PMCID: PMC10998380 DOI: 10.1186/s12880-024-01253-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2023] [Accepted: 03/18/2024] [Indexed: 04/07/2024] Open
Abstract
Self-supervised pretraining has been observed to be effective at improving feature representations for transfer learning, leveraging large amounts of unlabelled data. This review summarizes recent research into its usage in X-ray, computed tomography, magnetic resonance, and ultrasound imaging, concentrating on studies that compare self-supervised pretraining to fully supervised learning for diagnostic tasks such as classification and segmentation. The most pertinent finding is that self-supervised pretraining generally improves downstream task performance compared to full supervision, most prominently when unlabelled examples greatly outnumber labelled examples. Based on the aggregate evidence, recommendations are provided for practitioners considering using self-supervised learning. Motivated by limitations identified in current research, directions and practices for future study are suggested, such as integrating clinical knowledge with theoretically justified self-supervised learning methods, evaluating on public datasets, growing the modest body of evidence for ultrasound, and characterizing the impact of self-supervised pretraining on generalization.
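Contrastive objectives are a common form of the self-supervised pretraining surveyed here: two augmented views of the same image are pulled together in embedding space while other images are pushed apart. A minimal NumPy sketch of the InfoNCE loss underlying such methods (illustrative only; not tied to any specific study in the review):

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.1):
    """InfoNCE: for each row i of z1, row i of z2 is the positive pair;
    all other rows of z2 act as negatives."""
    # L2-normalize embeddings so dot products are cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature             # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_softmax))        # positives on the diagonal

# Perfectly aligned views with orthogonal negatives -> near-zero loss.
z = np.eye(4)
loss = info_nce_loss(z, z)
print(loss)  # close to 0
```

No labels appear anywhere in the objective, which is why such pretraining can exploit the large pools of unlabelled imaging data the review emphasizes.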
Affiliation(s)
- Blake VanBerlo
- Cheriton School of Computer Science, 200 University Ave W, N2L 3G1, Waterloo, Canada.
- Jesse Hoey
- Cheriton School of Computer Science, 200 University Ave W, N2L 3G1, Waterloo, Canada
- Alexander Wong
- Department of Systems Design Engineering, 200 University Ave W, N2L 3G1, Waterloo, Canada
27
Stueckle CA, Haage P. The radiologist as a physician - artificial intelligence as a way to overcome tension between the patient, technology, and referring physicians - a narrative review. ROFO-FORTSCHR RONTG 2024. [PMID: 38569517 DOI: 10.1055/a-2271-0799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/05/2024]
Abstract
BACKGROUND Data volumes that grow over time lead to a shortage of radiologists' time. The use of systems based on artificial intelligence (AI) offers opportunities to relieve the burden on radiologists. AI systems are usually optimized for one radiological area. Radiologists must understand the basic features of how such a system works technically in order to assess its weaknesses and possible errors and to use its strengths. This "explainability" creates trust in an AI system and shows its limits. METHOD Based on an expanded Medline search for the key words "radiology, artificial intelligence, referring physician interaction, patient interaction, job satisfaction, communication of findings, expectations", subjectively selected additional relevant articles were considered for this narrative review. RESULTS The use of AI is well advanced, especially in radiology. The programmer should provide the radiologist with clear explanations of how the system works. All systems on the market have strengths and weaknesses. Some optimizations are unintentionally specific, because the system is adapted too precisely to a certain environment that often does not exist in practice - this is known as "overfitting". There are also specific weak points in the systems, so-called "adversarial examples", which lead to fatal misdiagnoses by the AI even though they cannot be visually distinguished from an unremarkable finding by the radiologist. The user must know which diseases the system is trained for, which organ systems the AI recognizes and takes into account, and, accordingly, which it does not properly assess. Users therefore can and must critically review the results and adjust the findings if necessary. Correctly applied, AI can save the radiologist time: a user who knows how the system works only has to spend a short amount of time checking the results. The time saved can be used for communication with patients and referring physicians and thus contribute to higher job satisfaction. CONCLUSION Radiology is a constantly evolving specialty with enormous responsibility, as radiologists often make the diagnosis to be treated. AI-supported systems should be used consistently to provide relief and support. Radiologists need to know the strengths, weaknesses, and areas of application of these AI systems in order to save time. The time gained can be used for communication with patients and referring physicians. KEY POINTS · Explainable AI systems help to improve workflow and to save time. · The physician must critically review AI results under consideration of the limitations of the AI. · The AI system will only provide useful results if it has been adapted to the data type and data origin. · A communicating radiologist who takes an interest in the patient is important for the visibility of the discipline. CITATION FORMAT · Stueckle CA, Haage P. The radiologist as a physician - artificial intelligence as a way to overcome tension between the patient, technology, and referring physicians - a narrative review. Fortschr Röntgenstr 2024; DOI: 10.1055/a-2271-0799.
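The "overfitting" warned about above is easy to reproduce in miniature: a model adapted too precisely to its training environment fits noise rather than signal. A small illustrative sketch (synthetic data, unrelated to the reviewed systems):

```python
import numpy as np

rng = np.random.default_rng(42)
x_train = np.linspace(0.0, 1.0, 10)
y_train = 2.0 * x_train + rng.normal(0.0, 0.2, size=10)  # noisy linear data
x_test = np.linspace(0.03, 0.97, 50)
y_test = 2.0 * x_test  # the true underlying relationship

def mse(coeffs, x, y):
    """Mean squared error of a polynomial model on (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

simple = np.polyfit(x_train, y_train, deg=1)    # matches the true model
flexible = np.polyfit(x_train, y_train, deg=7)  # over-adapted model

# The flexible model always fits the training data at least as well...
print(mse(simple, x_train, y_train), mse(flexible, x_train, y_train))
# ...but typically generalizes worse to data it has not seen.
print(mse(simple, x_test, y_test), mse(flexible, x_test, y_test))
```

The degree-7 fit chases the noise in the ten training points, which is exactly the failure mode of an AI system tuned to one environment and deployed in another.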
Affiliation(s)
- Patrick Haage
- Diagnostic and Interventional Radiology, HELIOS Universitätsklinikum Wuppertal, Germany
28
Marcus E, Teuwen J. Artificial intelligence and explanation: How, why, and when to explain black boxes. Eur J Radiol 2024; 173:111393. [PMID: 38417186 DOI: 10.1016/j.ejrad.2024.111393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/10/2024] [Accepted: 02/22/2024] [Indexed: 03/01/2024]
Abstract
Artificial intelligence (AI) is taking nearly all fields of science by storm. One notorious property of AI algorithms is their so-called black box character; in particular, they are said to be inherently unexplainable. Such characteristics naturally pose a problem for the medical world, including radiology. The patient journey is filled with explanations along the way, from diagnosis to treatment, follow-up, and more. If we were to replace part of these steps with non-explanatory algorithms, we could lose our grip on vital aspects such as finding mistakes, patient trust, and even the creation of new knowledge. In this article, we argue that, even for the darkest of black boxes, there is hope of understanding them. In particular, we compare the situation of understanding black box models to that of understanding the laws of nature in physics. In the case of physics, we are given a 'black box' law of nature, about which there is no upfront explanation. However, as current physical theories show, we can learn plenty about them. During this discussion, we present the process by which such explanations are made and the human role therein, keeping a solid focus on radiological AI situations. We outline the role of AI developers in this process, but also the critical role fulfilled by the practitioners, the radiologists, in providing a healthy system of continuous improvement of AI models. Furthermore, we explore the role of the explainable AI (XAI) research program in the broader context we describe.
Affiliation(s)
- Eric Marcus
- AI for Oncology, Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, the Netherlands.
- Jonas Teuwen
- AI for Oncology, Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, the Netherlands; Department of Radiology and Nuclear Medicine, Radboud University Medical Center, PO Box 9101, 6500 HB, Nijmegen, the Netherlands
29
Heredia-Negrón F, Tosado-Rodríguez EL, Meléndez-Berrios J, Nieves B, Amaya-Ardila CP, Roche-Lima A. Assessing the Impact of AI Education on Hispanic Healthcare Professionals' Perceptions and Knowledge. EDUCATION SCIENCES 2024; 14:339. [PMID: 38818527 PMCID: PMC11138866 DOI: 10.3390/educsci14040339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Indexed: 06/01/2024]
Abstract
This study investigates the awareness and perceptions of artificial intelligence (AI) among Hispanic healthcare-related professionals, focusing on integrating AI in healthcare. The study participants were recruited from an asynchronous course offered twice within a year at the University of Puerto Rico Medical Science Campus, titled "Artificial Intelligence and Machine Learning Applied to Health Disparities Research", which aimed to bridge the gaps in AI knowledge among participants. The participants were divided into Experimental (n = 32; data-illiterate) and Control (n = 18; data-literate) groups, and pre-test and post-test surveys were administered to assess knowledge and attitudes toward AI. Descriptive statistics, power analysis, and the Mann-Whitney U test were employed to determine the influence of the course on participants' comprehension and perspectives regarding AI. Results indicate significant improvements in knowledge and attitudes among participants, emphasizing the effectiveness of the course in enhancing understanding and fostering positive attitudes toward AI. Findings also reveal limited practical exposure to AI applications, highlighting the need for improved integration into education. This research highlights the significance of educating healthcare professionals about AI to enable its advantageous incorporation into healthcare procedures. The study provides valuable perspectives from a broad spectrum of healthcare workers, serving as a basis for future investigations and educational endeavors aimed at AI implementation in healthcare.
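The Mann-Whitney U test used to compare the pre-test and post-test groups is rank-based. A self-contained sketch of the U statistic itself (midranks for ties; a significance test, as used in the study, would additionally require the null distribution or a normal approximation):

```python
def mann_whitney_u(a, b):
    """Return (U_a, U_b) for samples a and b, using midranks for ties."""
    values = list(a) + list(b)
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(values):
        # Find the block of tied values and assign each its midrank.
        j = i
        while j < len(values) and values[order[j]] == values[order[i]]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        for k in range(i, j):
            ranks[order[k]] = midrank
        i = j
    rank_sum_a = sum(ranks[: len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2.0
    return u_a, len(a) * len(b) - u_a

# Every value of b exceeds every value of a -> U_a = 0, U_b = n_a * n_b.
print(mann_whitney_u([1, 2, 3], [4, 5, 6]))  # (0.0, 9.0)
```

Because the test only uses ranks, it makes no normality assumption about the survey scores, which is why it suits small, ordinal pre/post questionnaire data.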
Affiliation(s)
- Frances Heredia-Negrón
- CCRHD RCMI-Program, Medical Sciences Campus, University of Puerto Rico, San Juan, PR 00934, USA
- Joshua Meléndez-Berrios
- CCRHD RCMI-Program, Medical Sciences Campus, University of Puerto Rico, San Juan, PR 00934, USA
- Brenda Nieves
- CCRHD RCMI-Program, Medical Sciences Campus, University of Puerto Rico, San Juan, PR 00934, USA
- Claudia P. Amaya-Ardila
- Department of Biostatistics and Epidemiology, Medical Science Campus, University of Puerto Rico, San Juan, PR 00934, USA
- Abiel Roche-Lima
- CCRHD RCMI-Program, Medical Sciences Campus, University of Puerto Rico, San Juan, PR 00934, USA
30
Flory MN, Napel S, Tsai EB. Artificial Intelligence in Radiology: Opportunities and Challenges. Semin Ultrasound CT MR 2024; 45:152-160. [PMID: 38403128 DOI: 10.1053/j.sult.2024.02.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/27/2024]
Abstract
Artificial intelligence's (AI) emergence in radiology elicits both excitement and uncertainty. AI holds promise for improving radiology with regard to clinical practice, education, and research opportunities. Yet AI systems are trained on select datasets that can contain bias and inaccuracies. Radiologists must understand these limitations and engage with AI developers at every step of the process - from algorithm initiation and design to development and implementation - to maximize the benefits and minimize the harms this technology can enable.
Affiliation(s)
- Marta N Flory
- Department of Radiology, Stanford University School of Medicine, Center for Academic Medicine, Palo Alto, CA
- Sandy Napel
- Department of Radiology, Stanford University School of Medicine, Center for Academic Medicine, Palo Alto, CA
- Emily B Tsai
- Department of Radiology, Stanford University School of Medicine, Center for Academic Medicine, Palo Alto, CA.
31
Ciet P, Eade C, Ho ML, Laborie LB, Mahomed N, Naidoo J, Pace E, Segal B, Toso S, Tschauner S, Vamyanmane DK, Wagner MW, Shelmerdine SC. The unintended consequences of artificial intelligence in paediatric radiology. Pediatr Radiol 2024; 54:585-593. [PMID: 37665368 DOI: 10.1007/s00247-023-05746-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/05/2023] [Revised: 08/07/2023] [Accepted: 08/08/2023] [Indexed: 09/05/2023]
Abstract
Over the past decade, there has been a dramatic rise in interest in the application of artificial intelligence (AI) in radiology. Originally only 'narrow' AI tasks were possible; however, with the increasing availability of data, coupled with easy access to powerful computer processing, we are becoming better able to generate complex and nuanced prediction models and elaborate solutions for healthcare. Nevertheless, these AI models are not without failings, and sometimes the intended use of these solutions may not lead to predictable impacts for patients, society or those working within the healthcare profession. In this article, we provide an overview of the latest opinions regarding AI ethics, bias, limitations, challenges and considerations that we should all contemplate in this exciting and expanding field, with special attention to how this applies to the unique aspects of a paediatric population. By embracing AI technology and fostering a multidisciplinary approach, it is hoped that we can harness the power AI brings whilst minimising harm and ensuring a beneficial impact on radiology practice.
Affiliation(s)
- Pierluigi Ciet
- Department of Radiology and Nuclear Medicine, Erasmus MC - Sophia's Children's Hospital, Rotterdam, The Netherlands
- Department of Medical Sciences, University of Cagliari, Cagliari, Italy
- Mai-Lan Ho
- University of Missouri, Columbia, MO, USA
- Lene Bjerke Laborie
- Department of Radiology, Section for Paediatrics, Haukeland University Hospital, Bergen, Norway
- Department of Clinical Medicine, University of Bergen, Bergen, Norway
- Nasreen Mahomed
- Department of Radiology, University of Witwatersrand, Johannesburg, South Africa
- Jaishree Naidoo
- Paediatric Diagnostic Imaging, Dr J Naidoo Inc., Johannesburg, South Africa
- Envisionit Deep AI Ltd, Coveham House, Downside Bridge Road, Cobham, UK
- Erika Pace
- Department of Diagnostic Radiology, The Royal Marsden NHS Foundation Trust, London, UK
- Bradley Segal
- Department of Radiology, University of Witwatersrand, Johannesburg, South Africa
- Seema Toso
- Pediatric Radiology, Children's Hospital, University Hospitals of Geneva, Geneva, Switzerland
- Sebastian Tschauner
- Division of Paediatric Radiology, Department of Radiology, Medical University of Graz, Graz, Austria
- Dhananjaya K Vamyanmane
- Department of Pediatric Radiology, Indira Gandhi Institute of Child Health, Bangalore, India
- Matthias W Wagner
- Department of Diagnostic Imaging, Division of Neuroradiology, The Hospital for Sick Children, Toronto, Canada
- Department of Medical Imaging, University of Toronto, Toronto, ON, Canada
- Department of Neuroradiology, University Hospital Augsburg, Augsburg, Germany
- Susan C Shelmerdine
- Department of Clinical Radiology, Great Ormond Street Hospital for Children NHS Foundation Trust, Great Ormond Street, London, WC1H 3JH, UK.
- Great Ormond Street Hospital for Children, UCL Great Ormond Street Institute of Child Health, London, UK.
- NIHR Great Ormond Street Hospital Biomedical Research Centre, 30 Guilford Street, Bloomsbury, London, UK.
- Department of Clinical Radiology, St George's Hospital, London, UK.
32
Bharadwaj UU, Chin CT, Majumdar S. Practical Applications of Artificial Intelligence in Spine Imaging: A Review. Radiol Clin North Am 2024; 62:355-370. [PMID: 38272627 DOI: 10.1016/j.rcl.2023.10.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/27/2024]
Abstract
Artificial intelligence (AI), a transformative technology with unprecedented potential in medical imaging, can be applied to various spinal pathologies. AI-based approaches may improve imaging efficiency, diagnostic accuracy, and interpretation, which is essential for positive patient outcomes. This review explores AI algorithms, techniques, and applications in spine imaging, highlighting diagnostic impact and challenges with future directions for integrating AI into spine imaging workflow.
Affiliation(s)
- Upasana Upadhyay Bharadwaj
- Department of Radiology and Biomedical Imaging, University of California San Francisco, 1700 4th Street, Byers Hall, Suite 203, Room 203D, San Francisco, CA 94158, USA
- Cynthia T Chin
- Department of Radiology and Biomedical Imaging, University of California San Francisco, 505 Parnassus Avenue, Box 0628, San Francisco, CA 94143, USA.
- Sharmila Majumdar
- Department of Radiology and Biomedical Imaging, University of California San Francisco, 1700 4th Street, Byers Hall, Suite 203, Room 203D, San Francisco, CA 94158, USA
33
Hill DLG. AI in imaging: the regulatory landscape. Br J Radiol 2024; 97:483-491. [PMID: 38366148 PMCID: PMC11027239 DOI: 10.1093/bjr/tqae002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2023] [Revised: 12/03/2023] [Accepted: 12/26/2023] [Indexed: 02/18/2024] Open
Abstract
Artificial intelligence (AI) methods have been applied to medical imaging for several decades, but in the last few years, the number of publications and the number of AI-enabled medical devices coming on the market have increased significantly. While some AI-enabled approaches are proving very valuable, systematic reviews of the AI imaging field identify significant weaknesses in a substantial proportion of the literature. Medical device regulators have recently become more proactive in publishing guidance documents and recognizing standards that require the development and validation of AI-enabled medical devices to be more rigorous than for traditional "rule-based" software. In particular, developers are required to better identify and mitigate risks (such as bias) that arise in AI-enabled devices, and to ensure that the devices are validated in a realistic clinical setting so that their output is clinically meaningful. While this evolving regulatory landscape means that device developers will take longer to bring novel AI-based medical imaging devices to market, such additional rigour is necessary to address existing weaknesses in the field and to ensure that patients and healthcare professionals can trust AI-enabled devices. The academic community would also benefit from taking this regulatory framework into account, to improve the quality of the literature and to make it easier for academically developed AI tools to make the transition to medical devices that impact healthcare.
34
Sood A, Mansoor N, Memmi C, Lynch M, Lynch J. Generative pretrained transformer-4, an artificial intelligence text predictive model, has a high capability for passing novel written radiology exam questions. Int J Comput Assist Radiol Surg 2024:10.1007/s11548-024-03071-9. [PMID: 38381363 DOI: 10.1007/s11548-024-03071-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/02/2023] [Accepted: 02/01/2024] [Indexed: 02/22/2024]
Abstract
PURPOSE AI image interpretation, through convolutional neural networks, shows increasing capability within radiology. These models have achieved impressive performance on specific tasks in controlled settings but possess inherent limitations, such as the inability to consider clinical context. We assess the ability of large language models (LLMs) within the context of radiology specialty exams to determine whether they can evaluate relevant clinical information. METHODS A database of questions was created from official sample, author-written, and textbook questions based on the Royal College of Radiologists (United Kingdom) FRCR 2A and American Board of Radiology (ABR) Certifying examinations. The questions were input into the Generative Pretrained Transformer (GPT) versions 3 and 4, with prompting to answer the questions. RESULTS One thousand and seventy-two questions were evaluated by GPT-3 and GPT-4: 495 (46.2%) for the FRCR 2A and 577 (53.8%) for the ABR exam. There were 890 single best answer (SBA) and 182 true/false questions. GPT-4 was correct on 629/890 (70.7%) SBA and 151/182 (83.0%) true/false questions. There was no degradation in performance on author-written questions. GPT-4 performed significantly better than GPT-3, which selected the correct answer on 282/890 (31.7%) SBA and 111/182 (61.0%) true/false questions. Performance of GPT-4 was similar across both examinations for all categories of question. CONCLUSION The newest generation of LLMs, GPT-4, demonstrates high capability in answering radiology exam questions. It shows marked improvement over GPT-3, suggesting further improvements in accuracy are possible. Further research is needed to explore the clinical applicability of these AI models in real-world settings.
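The reported percentages follow directly from the raw counts; a quick sanity check of the figures quoted in the abstract (counts taken from the text):

```python
# (correct, total) pairs as reported in the abstract
results = {
    ("GPT-4", "SBA"): (629, 890),
    ("GPT-4", "true/false"): (151, 182),
    ("GPT-3", "SBA"): (282, 890),
    ("GPT-3", "true/false"): (111, 182),
}
for (model, qtype), (correct, total) in results.items():
    print(f"{model} {qtype}: {100 * correct / total:.1f}%")

# Question counts are internally consistent:
# 495 FRCR + 577 ABR = 890 SBA + 182 true/false = 1072 questions.
print(495 + 577 == 890 + 182 == 1072)  # True
```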
Affiliation(s)
- Avnish Sood
- King's College London, Strand, London, WC2R 2LS, UK
- Nina Mansoor
- Department of Neuroradiology, Kings College Hospital, Denmark Hill, London, SE59RS, UK
- Caroline Memmi
- Imperial College London, Exhibition Road, London, SW7 2AZ, UK
- Magnus Lynch
- King's College London Centre for Stem Cells and Regenerative Medicine, Guy's Hospital, Great Maze Pond, London, UK
- St John's Institute of Dermatology, King's College London, London, UK
- Jeremy Lynch
- Department of Neuroradiology, Kings College Hospital, Denmark Hill, London, SE59RS, UK.
35
Payne DL, Xu X, Faraji F, John K, Pradas KF, Bernard VV, Bangiyev L, Prasanna P. Automated Detection of Cervical Spinal Stenosis and Cord Compression via Vision Transformer and Rules-Based Classification. AJNR Am J Neuroradiol 2024; 45:ajnr.A8141. [PMID: 38360785 PMCID: PMC11288556 DOI: 10.3174/ajnr.a8141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2023] [Accepted: 12/15/2023] [Indexed: 02/17/2024]
Abstract
BACKGROUND AND PURPOSE Cervical spinal cord compression, defined as spinal cord deformity and severe narrowing of the spinal canal in the cervical region, can lead to severe clinical consequences, including intractable pain, sensory disturbance, paralysis, and even death, and may require emergent intervention to prevent negative outcomes. Despite the critical nature of cord compression, no automated tool is available to alert clinical radiologists to the presence of such findings. This study aims to demonstrate the ability of a vision transformer (ViT) model for the accurate detection of cervical cord compression. MATERIALS AND METHODS A clinically diverse cohort of 142 cervical spine MRIs was identified, 34% of which were normal or had mild stenosis, 31% with moderate stenosis, and 35% with cord compression. Utilizing gradient-echo images, slices were labeled as no cord compression/mild stenosis, moderate stenosis, or severe stenosis/cord compression. Segmentation of the spinal canal was performed and confirmed by neuroradiology faculty. A pretrained ViT model was fine-tuned to predict section-level severity by using a train:validation:test split of 60:20:20. Each examination was assigned an overall severity based on the highest level of section severity, with an examination labeled as positive for cord compression if ≥1 section was predicted in the severe category. Additionally, 2 convolutional neural network (CNN) models (ResNet50, DenseNet121) were tested in the same manner. RESULTS The ViT model outperformed both CNN models at the section level, achieving section-level accuracy of 82%, compared with 72% and 78% for ResNet50 and DenseNet121, respectively. ViT patient-level classification achieved accuracy of 93%, sensitivity of 0.90, positive predictive value of 0.90, specificity of 0.95, and negative predictive value of 0.95. Receiver operating characteristic area under the curve was greater for ViT than for either CNN. CONCLUSIONS This classification approach using a ViT model and rules-based classification accurately detects the presence of cervical spinal cord compression at the patient level. In this study, the ViT model outperformed both conventional CNN approaches at the section and patient levels. If implemented in the clinical setting, such a tool may streamline neuroradiology workflow, improving efficiency and consistency.
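The rules-based step is simple to express: a patient is flagged positive when any section is predicted severe, and the usual screening metrics follow from the resulting confusion counts. A sketch with hypothetical toy counts chosen to mirror the reported 0.90/0.95 ratios (not the study's actual confusion matrix):

```python
def patient_positive(section_predictions):
    """Rules-based aggregation: positive if any section is 'severe'."""
    return any(p == "severe" for p in section_predictions)

def screening_metrics(tp, fp, tn, fn):
    """Standard patient-level classification metrics from confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

print(patient_positive(["mild", "moderate", "severe"]))  # True
# Hypothetical counts whose ratios match the reported 0.90/0.95 values:
print(screening_metrics(tp=9, fp=1, tn=19, fn=1))
```

Taking the maximum section severity per examination is a deliberately conservative rule: a single severe section is enough to raise the patient-level alert.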
Collapse
Affiliation(s)
- David L Payne
- From the Department of Radiology (D.L.P., F.F., K.J., K.F.P., V.V.B., L.B.), Stony Brook University Hospital, Stony Brook, New York
- Department of Biomedical Informatics (D.L.P., X.X., F.F., K.J., P.P.), Stony Brook University, Stony Brook, New York
- Xuan Xu
- Department of Biomedical Informatics (D.L.P., X.X., F.F., K.J., P.P.), Stony Brook University, Stony Brook, New York
- Farshid Faraji
- From the Department of Radiology (D.L.P., F.F., K.J., K.F.P., V.V.B., L.B.), Stony Brook University Hospital, Stony Brook, New York
- Department of Biomedical Informatics (D.L.P., X.X., F.F., K.J., P.P.), Stony Brook University, Stony Brook, New York
- Kevin John
- From the Department of Radiology (D.L.P., F.F., K.J., K.F.P., V.V.B., L.B.), Stony Brook University Hospital, Stony Brook, New York
- Department of Biomedical Informatics (D.L.P., X.X., F.F., K.J., P.P.), Stony Brook University, Stony Brook, New York
- Katherine Ferra Pradas
- From the Department of Radiology (D.L.P., F.F., K.J., K.F.P., V.V.B., L.B.), Stony Brook University Hospital, Stony Brook, New York
- Vahni Vishala Bernard
- From the Department of Radiology (D.L.P., F.F., K.J., K.F.P., V.V.B., L.B.), Stony Brook University Hospital, Stony Brook, New York
- Lev Bangiyev
- From the Department of Radiology (D.L.P., F.F., K.J., K.F.P., V.V.B., L.B.), Stony Brook University Hospital, Stony Brook, New York
- Prateek Prasanna
- Department of Biomedical Informatics (D.L.P., X.X., F.F., K.J., P.P.), Stony Brook University, Stony Brook, New York
36
Kelly BS, Mathur P, McGuinness G, Dillon H, Lee EH, Yeom KW, Lawlor A, Killeen RP. A Radiomic "Warning Sign" of Progression on Brain MRI in Individuals with MS. AJNR Am J Neuroradiol 2024; 45:236-243. [PMID: 38216299 DOI: 10.3174/ajnr.a8104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/02/2023] [Accepted: 11/08/2023] [Indexed: 01/14/2024]
Abstract
BACKGROUND AND PURPOSE MS is a chronic progressive, idiopathic, demyelinating disorder whose diagnosis is contingent on the interpretation of MR imaging. New MR imaging lesions are an early biomarker of disease progression. We aimed to evaluate a machine learning model based on radiomics features in predicting progression on MR imaging of the brain in individuals with MS. MATERIALS AND METHODS This retrospective cohort study with external validation on open-access data obtained full ethics approval. Longitudinal MR imaging data for patients with MS were collected and processed for machine learning. Radiomics features were extracted at the future location of a new lesion in the patients' prior MR imaging ("prelesion"). Additionally, "control" samples were obtained from the normal-appearing white matter for each participant. Machine learning models for binary classification were trained and tested and then evaluated on the external data. RESULTS The total number of participants was 167. Of the 147 in the training/test set, 102 were women and 45 were men. The average age was 42 years (range, 21-74 years). The best-performing radiomics-based model was XGBoost, with accuracy, precision, recall, and F1-score of 0.91, 0.91, 0.91, and 0.91 on the test set, and 0.74, 0.74, 0.74, and 0.70 on the external validation set. The 5 most important radiomics features to the XGBoost model were associated with the overall heterogeneity and low gray-level emphasis of the segmented regions. Probability maps were produced to illustrate potential future clinical applications. CONCLUSIONS Our machine learning model based on radiomics features successfully differentiated prelesions from normal-appearing white matter. This outcome suggests that radiomics features from normal-appearing white matter could serve as an imaging biomarker for progression of MS on MR imaging.
Affiliation(s)
- Brendan S Kelly
- From the Department of Radiology (B.S.K., G.M., H.D., R.P.K.), St. Vincent's University Hospital, Dublin, Ireland
- Insight Centre for Data Analytics (B.S.K., P.M., A.L.), University College Dublin, Dublin, Ireland
- Wellcome Trust and Health Research Board (B.S.K.), Irish Clinical Academic Training, Dublin, Ireland
- School of Medicine (B.S.K.), University College Dublin, Dublin, Ireland
- Prateek Mathur
- Insight Centre for Data Analytics (B.S.K., P.M., A.L.), University College Dublin, Dublin, Ireland
- Gerard McGuinness
- From the Department of Radiology (B.S.K., G.M., H.D., R.P.K.), St. Vincent's University Hospital, Dublin, Ireland
- Henry Dillon
- From the Department of Radiology (B.S.K., G.M., H.D., R.P.K.), St. Vincent's University Hospital, Dublin, Ireland
- Edward H Lee
- Lucille Packard Children's Hospital at Stanford (E.H.L., K.W.Y.), Stanford, California
- Kristen W Yeom
- Lucille Packard Children's Hospital at Stanford (E.H.L., K.W.Y.), Stanford, California
- Aonghus Lawlor
- Insight Centre for Data Analytics (B.S.K., P.M., A.L.), University College Dublin, Dublin, Ireland
- Ronan P Killeen
- From the Department of Radiology (B.S.K., G.M., H.D., R.P.K.), St. Vincent's University Hospital, Dublin, Ireland
37
Boverhof BJ, Redekop WK, Bos D, Starmans MPA, Birch J, Rockall A, Visser JJ. Radiology AI Deployment and Assessment Rubric (RADAR) to bring value-based AI into radiological practice. Insights Imaging 2024; 15:34. [PMID: 38315288 PMCID: PMC10844175 DOI: 10.1186/s13244-023-01599-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2023] [Accepted: 11/14/2023] [Indexed: 02/07/2024] Open
Abstract
OBJECTIVE To provide a comprehensive framework for value assessment of artificial intelligence (AI) in radiology. METHODS This paper presents the RADAR framework, which has been adapted from Fryback and Thornbury's imaging efficacy framework to facilitate the valuation of radiology AI from conception to local implementation. Local efficacy has been newly introduced to underscore the importance of appraising an AI technology within its local environment. Furthermore, the RADAR framework is illustrated through a myriad of study designs that help assess value. RESULTS RADAR presents a seven-level hierarchy, providing radiologists, researchers, and policymakers with a structured approach to the comprehensive assessment of value in radiology AI. RADAR is designed to be dynamic and meet the different valuation needs throughout the AI's lifecycle. Initial phases like technical and diagnostic efficacy (RADAR-1 and RADAR-2) are assessed pre-clinical deployment via in silico clinical trials and cross-sectional studies. Subsequent stages, spanning from diagnostic thinking to patient outcome efficacy (RADAR-3 to RADAR-5), require clinical integration and are explored via randomized controlled trials and cohort studies. Cost-effectiveness efficacy (RADAR-6) takes a societal perspective on financial feasibility, addressed via health-economic evaluations. The final level, RADAR-7, determines how prior valuations translate locally, evaluated through budget impact analysis, multi-criteria decision analyses, and prospective monitoring. CONCLUSION The RADAR framework offers a comprehensive framework for valuing radiology AI. Its layered, hierarchical structure, combined with a focus on local relevance, aligns RADAR seamlessly with the principles of value-based radiology. CRITICAL RELEVANCE STATEMENT The RADAR framework advances artificial intelligence in radiology by delineating a much-needed framework for comprehensive valuation. 
KEY POINTS • Radiology artificial intelligence lacks a comprehensive approach to value assessment. • The RADAR framework provides a dynamic, hierarchical method for thorough valuation of radiology AI. • RADAR advances clinical radiology by bridging the artificial intelligence implementation gap.
Affiliation(s)
- Bart-Jan Boverhof
- Erasmus School of Health Policy and Management, Erasmus University Rotterdam, Rotterdam, The Netherlands
- W Ken Redekop
- Erasmus School of Health Policy and Management, Erasmus University Rotterdam, Rotterdam, The Netherlands
- Daniel Bos
- Department of Epidemiology, Erasmus University Medical Centre, Rotterdam, The Netherlands
- Department of Radiology & Nuclear Medicine, Erasmus University Medical Centre, Rotterdam, The Netherlands
- Martijn P A Starmans
- Department of Radiology & Nuclear Medicine, Erasmus University Medical Centre, Rotterdam, The Netherlands
- Andrea Rockall
- Department of Surgery & Cancer, Imperial College London, London, UK
- Jacob J Visser
- Department of Radiology & Nuclear Medicine, Erasmus University Medical Centre, Rotterdam, The Netherlands
38
Omoumi P, Richiardi J. Independent Evaluation of Commercial Diagnostic AI Solutions: A Necessary Step toward Increased Transparency. Radiology 2024; 310:e233299. [PMID: 38193839 DOI: 10.1148/radiol.233299] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/10/2024]
Affiliation(s)
- Patrick Omoumi
- From the Department of Radiology, Lausanne University Hospital and University of Lausanne, Bugnon 46, CH-1011 Lausanne, Switzerland
- Jonas Richiardi
- From the Department of Radiology, Lausanne University Hospital and University of Lausanne, Bugnon 46, CH-1011 Lausanne, Switzerland
39
Wahid KA, Fuentes D. Weak Supervision, Strong Results: Achieving High Performance in Intracranial Hemorrhage Detection with Fewer Annotation Labels. Radiol Artif Intell 2024; 6:e230598. [PMID: 38294326 PMCID: PMC10831509 DOI: 10.1148/ryai.230598] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2023] [Revised: 12/18/2023] [Accepted: 12/29/2023] [Indexed: 02/01/2024]
Affiliation(s)
- Kareem A. Wahid
- From the Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd, Houston, TX 77030
- David Fuentes
- From the Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd, Houston, TX 77030
40
Kidera E, Koyasu S, Hirata K, Hamaji M, Nakamoto R, Nakamoto Y. Convolutional neural network-based program to predict lymph node metastasis of non-small cell lung cancer using 18F-FDG PET. Ann Nucl Med 2024; 38:71-80. [PMID: 37755604 DOI: 10.1007/s12149-023-01866-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/06/2023] [Accepted: 09/11/2023] [Indexed: 09/28/2023]
Abstract
PURPOSE To develop a convolutional neural network (CNN)-based program to analyze maximum intensity projection (MIP) images of 2-deoxy-2-[F-18]fluoro-D-glucose (FDG) positron emission tomography (PET) scans, aimed at predicting lymph node metastasis of non-small cell lung cancer (NSCLC), and to evaluate its effectiveness in providing diagnostic assistance to radiologists. METHODS We obtained PET images of NSCLC from public datasets, including those of 435 patients with available N-stage information, which were divided into a training set (n = 304) and a test set (n = 131). We generated 36 MIP images for each patient. A residual network (ResNet-50)-based CNN was trained using the MIP images of the training set to predict lymph node metastasis. Lymph node metastasis in the test set was predicted by the trained CNN as well as by seven radiologists twice: first without and then with CNN assistance. Diagnostic performance metrics, including accuracy and prediction error (the difference between the truth and the predictions), were calculated, and reading times were recorded. RESULTS In the test set, 67 (51%) patients exhibited lymph node metastases, and the CNN yielded a predictive accuracy of 0.748. With the assistance of the CNN, the prediction error was significantly reduced for six of the seven radiologists, although the accuracy did not change significantly. The prediction time was significantly reduced for five of the seven radiologists, with a median reduction ratio of 38.0%. CONCLUSION The CNN-based program could potentially assist radiologists in predicting lymph node metastasis by increasing diagnostic confidence and reducing reading time without affecting diagnostic accuracy, at least in the limited setting of MIP images.
Affiliation(s)
- Eitaro Kidera
- Department of Radiology, Kishiwada City Hospital, Kishiwada, Japan
- Department of Diagnostic Imaging and Nuclear Medicine, Graduate School of Medicine, Kyoto University, 54 Shogoin Kawahara-cho, Sakyo-ku, Kyoto, 606-8507, Japan
- Sho Koyasu
- Department of Diagnostic Imaging and Nuclear Medicine, Graduate School of Medicine, Kyoto University, 54 Shogoin Kawahara-cho, Sakyo-ku, Kyoto, 606-8507, Japan
- Kenji Hirata
- Department of Diagnostic Imaging, Graduate School of Medicine, Hokkaido University, Sapporo, Japan
- Masatsugu Hamaji
- Department of Thoracic Surgery, Kyoto University Hospital, Kyoto University, Kyoto, Japan
- Ryusuke Nakamoto
- Department of Diagnostic Imaging and Nuclear Medicine, Graduate School of Medicine, Kyoto University, 54 Shogoin Kawahara-cho, Sakyo-ku, Kyoto, 606-8507, Japan
- Yuji Nakamoto
- Department of Diagnostic Imaging and Nuclear Medicine, Graduate School of Medicine, Kyoto University, 54 Shogoin Kawahara-cho, Sakyo-ku, Kyoto, 606-8507, Japan
41
Ueda D, Kakinuma T, Fujita S, Kamagata K, Fushimi Y, Ito R, Matsui Y, Nozaki T, Nakaura T, Fujima N, Tatsugami F, Yanagawa M, Hirata K, Yamada A, Tsuboyama T, Kawamura M, Fujioka T, Naganawa S. Fairness of artificial intelligence in healthcare: review and recommendations. Jpn J Radiol 2024; 42:3-15. [PMID: 37540463 PMCID: PMC10764412 DOI: 10.1007/s11604-023-01474-3] [Citation(s) in RCA: 25] [Impact Index Per Article: 25.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2023] [Accepted: 07/17/2023] [Indexed: 08/05/2023]
Abstract
In this review, we address the issue of fairness in the clinical integration of artificial intelligence (AI) in the medical field. As the clinical adoption of deep learning algorithms, a subfield of AI, progresses, concerns have arisen regarding the impact of AI biases and discrimination on patient health. This review aims to provide a comprehensive overview of concerns associated with AI fairness; discuss strategies to mitigate AI biases; and emphasize the need for cooperation among physicians, AI researchers, AI developers, policymakers, and patients to ensure equitable AI integration. First, we define and introduce the concept of fairness in AI applications in healthcare and radiology, emphasizing the benefits and challenges of incorporating AI into clinical practice. Next, we delve into concerns regarding fairness in healthcare, addressing the various causes of biases in AI and potential concerns such as misdiagnosis, unequal access to treatment, and ethical considerations. We then outline strategies for addressing fairness, such as the importance of diverse and representative data and algorithm audits. Additionally, we discuss ethical and legal considerations such as data privacy, responsibility, accountability, transparency, and explainability in AI. Finally, we present the Fairness of Artificial Intelligence Recommendations in healthcare (FAIR) statement to offer best practices. Through these efforts, we aim to provide a foundation for discussing the responsible and equitable implementation and deployment of AI in healthcare.
Affiliation(s)
- Daiju Ueda
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, 1-4-3 Asahi-Machi, Abeno-ku, Osaka, 545-8585, Japan
- Shohei Fujita
- Department of Radiology, University of Tokyo, Bunkyo-ku, Tokyo, Japan
- Koji Kamagata
- Department of Radiology, Juntendo University Graduate School of Medicine, Bunkyo-ku, Tokyo, Japan
- Yasutaka Fushimi
- Department of Diagnostic Imaging and Nuclear Medicine, Kyoto University Graduate School of Medicine, Sakyo-ku, Kyoto, Japan
- Rintaro Ito
- Department of Radiology, Nagoya University Graduate School of Medicine, Nagoya, Aichi, Japan
- Yusuke Matsui
- Department of Radiology, Faculty of Medicine, Dentistry and Pharmaceutical Sciences, Okayama University, Kita-ku, Okayama, Japan
- Taiki Nozaki
- Department of Radiology, Keio University School of Medicine, Shinjuku-ku, Tokyo, Japan
- Takeshi Nakaura
- Department of Diagnostic Radiology, Kumamoto University Graduate School of Medicine, Chuo-ku, Kumamoto, Japan
- Noriyuki Fujima
- Department of Diagnostic and Interventional Radiology, Hokkaido University Hospital, Sapporo, Japan
- Fuminari Tatsugami
- Department of Diagnostic Radiology, Hiroshima University, Minami-ku, Hiroshima, Japan
- Masahiro Yanagawa
- Department of Radiology, Osaka University Graduate School of Medicine, Suita City, Osaka, Japan
- Kenji Hirata
- Department of Diagnostic Imaging, Graduate School of Medicine, Hokkaido University, Kita-ku, Sapporo, Hokkaido, Japan
- Akira Yamada
- Department of Radiology, Shinshu University School of Medicine, Matsumoto, Nagano, Japan
- Takahiro Tsuboyama
- Department of Radiology, Osaka University Graduate School of Medicine, Suita City, Osaka, Japan
- Mariko Kawamura
- Department of Radiology, Nagoya University Graduate School of Medicine, Nagoya, Aichi, Japan
- Tomoyuki Fujioka
- Department of Diagnostic Radiology, Tokyo Medical and Dental University, Bunkyo-ku, Tokyo, Japan
- Shinji Naganawa
- Department of Radiology, Nagoya University Graduate School of Medicine, Nagoya, Aichi, Japan
42
Silberstein J, Wee C, Gupta A, Seymour H, Ghotra SS, Sá dos Reis C, Zhang G, Sun Z. Artificial Intelligence-Assisted Detection of Osteoporotic Vertebral Fractures on Lateral Chest Radiographs in Post-Menopausal Women. J Clin Med 2023; 12:7730. [PMID: 38137799 PMCID: PMC10743975 DOI: 10.3390/jcm12247730] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/20/2023] [Revised: 12/06/2023] [Accepted: 12/15/2023] [Indexed: 12/24/2023] Open
Abstract
Osteoporotic vertebral fractures (OVFs) are often not reported by radiologists on routine chest radiographs. This study aims to investigate the clinical value of a newly developed artificial intelligence (AI) tool, Ofeye 1.0, for automated detection of OVFs on lateral chest radiographs in post-menopausal women (>60 years) who were referred for chest x-rays for other reasons. A total of 510 de-identified lateral chest radiographs from three clinical sites were retrieved and analysed using the Ofeye 1.0 tool. These images were then reviewed by a consultant radiologist, with the findings serving as the reference standard for determining the diagnostic performance of the AI tool for the detection of OVFs. OVFs that were missed in the original radiologist reports but detected by the AI tool were found in 28.8% of images. The AI tool demonstrated high specificity of 92.8% (95% CI: 89.6, 95.2%), moderate accuracy of 80.3% (95% CI: 76.3, 80.4%), positive predictive value (PPV) of 73.7% (95% CI: 65.2, 80.8%), and negative predictive value (NPV) of 81.5% (95% CI: 79, 83.8%), but low sensitivity of 49% (95% CI: 40.7, 57.3%). The AI tool showed improved sensitivity compared with the original radiologist reports, whose sensitivity was 20.8% (95% CI: 14.5, 28.4%). The new AI tool can be used as a complementary tool in routine diagnostic reporting to reduce missed OVFs in elderly women.
Affiliation(s)
- Jenna Silberstein
- Discipline of Medical Radiation Science, Curtin Medical School, Curtin University, Perth, WA 6102, Australia
- Cleo Wee
- Curtin Medical School, Curtin University, Perth, WA 6102, Australia (C.W., A.G.)
- Ashu Gupta
- Curtin Medical School, Curtin University, Perth, WA 6102, Australia (C.W., A.G.)
- Radiology Department, Fiona Stanley Hospital, Murdoch, WA 6105, Australia
- Hannah Seymour
- Department of Geriatrics and Aged Care, Fiona Stanley Hospital, Murdoch, WA 6150, Australia
- Switinder Singh Ghotra
- Department of Radiology, Hospital of Yverdon-les-Bains (eHnv), 1400 Yverdon-les-Bains, Switzerland
- School of Health Sciences (HESAV), University of Applied Sciences and Arts Western Switzerland (HES-SO), 1011 Lausanne, Switzerland
- Cláudia Sá dos Reis
- School of Health Sciences (HESAV), University of Applied Sciences and Arts Western Switzerland (HES-SO), 1011 Lausanne, Switzerland
- Guicheng Zhang
- School of Population Health, Curtin University, Perth, WA 6102, Australia
- Zhonghua Sun
- Discipline of Medical Radiation Science, Curtin Medical School, Curtin University, Perth, WA 6102, Australia
- Curtin Health Research Innovation Institute (CHIRI), Curtin University, Perth, WA 6102, Australia
43
Jiang VS, Pavlovic ZJ, Hariton E. The Role of Artificial Intelligence and Machine Learning in Assisted Reproductive Technologies. Obstet Gynecol Clin North Am 2023; 50:747-762. [PMID: 37914492 DOI: 10.1016/j.ogc.2023.09.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/03/2023]
Abstract
Artificial intelligence (AI) and machine learning, the form most commonly used in medicine, offer powerful tools that leverage the strengths of large data sets and intelligent algorithms. These systems can help to revolutionize the delivery of treatments, access to medical care, and improvement of outcomes, particularly in the realm of reproductive medicine. Whether through more robust oocyte and embryo grading or more accurate follicular measurement, AI will be able to aid clinicians, and most importantly patients, in providing the best possible and individualized care. However, despite all of the potential strengths of AI, algorithms are not immune to bias and are vulnerable to the many socioeconomic and demographic biases from which current healthcare systems suffer. Wrong diagnoses as well as the furthering of healthcare discrimination are real possibilities if both the capabilities and limitations of AI are not well understood. Appropriate knowledge of how AI can best operate within medicine, and specifically reproductive medicine, will enable clinicians both to create and to utilize machine learning-based innovations that further reproductive medicine and ultimately achieve the goal of building healthy families.
Affiliation(s)
- Victoria S Jiang
- Division of Reproductive Endocrinology & Infertility, Vincent Department of Obstetrics and Gynecology, Massachusetts General Hospital/Harvard Medical School, 55 Fruit Street, Suite 10A, Boston, MA 02116, USA
- Zoran J Pavlovic
- Department of Obstetrics and Gynecology/Reproductive Endocrinology and Infertility, University of South Florida, Morsani College of Medicine, 2 Tampa General Circle, 6th Floor, Suite 6022, Tampa, FL 33602, USA
- Eduardo Hariton
- Reproductive Science Center of the San Francisco Bay Area, 100 Park Place #200, San Ramon, CA 94583, USA
44
Eckstein J. [Artificial intelligence in internal medicine: from theory to practical application in practices and hospitals]. Innere Medizin (Heidelberg, Germany) 2023; 64:1017-1022. [PMID: 37847260 PMCID: PMC10602942 DOI: 10.1007/s00108-023-01604-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Accepted: 09/19/2023] [Indexed: 10/18/2023]
Abstract
The integration of artificial intelligence (AI) technologies has the potential to improve both the efficiency and the quality of medical care. Applications of AI have already become established in various specialized fields of internal medicine, whereas in other fields applications are still in various phases of development. One important aspect to elucidate is the effect of AI on the interaction between patients and healthcare personnel. A further factor is the comprehensibility of the mode of functioning of the AI-based algorithms involved. In addition to the necessary confidence-building measures, integration of the technology into existing systems should be pursued to achieve appropriate acceptance and widespread availability and to relieve pressure on personnel at the administrative level.
Affiliation(s)
- Jens Eckstein
- Klinik für Innere Medizin, Universitätsspital Basel, Basel, Schweiz.
- Innovationsmanagement, Universitätsspital Basel, Hebelstr. 10, 4031, Basel, Schweiz.
45
Hong GS, Jang M, Kyung S, Cho K, Jeong J, Lee GY, Shin K, Kim KD, Ryu SM, Seo JB, Lee SM, Kim N. Overcoming the Challenges in the Development and Implementation of Artificial Intelligence in Radiology: A Comprehensive Review of Solutions Beyond Supervised Learning. Korean J Radiol 2023; 24:1061-1080. [PMID: 37724586 PMCID: PMC10613849 DOI: 10.3348/kjr.2023.0393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2023] [Revised: 07/01/2023] [Accepted: 07/30/2023] [Indexed: 09/21/2023] Open
Abstract
Artificial intelligence (AI) in radiology is a rapidly developing field with several prospective clinical studies demonstrating its benefits in clinical practice. In 2022, the Korean Society of Radiology held a forum to discuss the challenges and drawbacks in AI development and implementation. Various barriers hinder the successful application and widespread adoption of AI in radiology, such as limited annotated data, data privacy and security, data heterogeneity, imbalanced data, model interpretability, overfitting, and integration with clinical workflows. In this review, some of the various possible solutions to these challenges are presented and discussed; these include training with longitudinal and multimodal datasets, dense training with multitask learning and multimodal learning, self-supervised contrastive learning, various image modifications and syntheses using generative models, explainable AI, causal learning, federated learning with large data models, and digital twins.
Affiliation(s)
- Gil-Sun Hong
- Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Miso Jang
- Department of Convergence Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Sunggu Kyung
- Department of Biomedical Engineering, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Kyungjin Cho
- Department of Convergence Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Department of Biomedical Engineering, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Jiheon Jeong
- Department of Convergence Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Grace Yoojin Lee
- Department of Convergence Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Keewon Shin
- Laboratory for Biosignal Analysis and Perioperative Outcome Research, Biomedical Engineering Center, Asan Institute of Lifesciences, Asan Medical Center, Seoul, Republic of Korea
- Ki Duk Kim
- Department of Convergence Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Seung Min Ryu
- Department of Orthopedic Surgery, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Joon Beom Seo
- Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Sang Min Lee
- Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Namkug Kim
- Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Department of Convergence Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
46
Moassefi M, Rouzrokh P, Conte GM, Vahdati S, Fu T, Tahmasebi A, Younis M, Farahani K, Gentili A, Kline T, Kitamura FC, Huo Y, Kuanar S, Younis K, Erickson BJ, Faghani S. Reproducibility of Deep Learning Algorithms Developed for Medical Imaging Analysis: A Systematic Review. J Digit Imaging 2023; 36:2306-2312. [PMID: 37407841 PMCID: PMC10501962 DOI: 10.1007/s10278-023-00870-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/16/2022] [Revised: 06/08/2023] [Accepted: 06/09/2023] [Indexed: 07/07/2023] Open
Abstract
Since 2000, there have been more than 8000 publications on radiology artificial intelligence (AI). AI breakthroughs allow complex tasks to be automated and even performed beyond human capabilities. However, the lack of details on methods and algorithm code undercuts their scientific value. Many science subfields have recently faced a reproducibility crisis, eroding trust in processes and results and contributing to the rise in retractions of scientific papers. For the same reasons, conducting research in deep learning (DL) also requires reproducibility. Although several valuable manuscript checklists for AI in medical imaging exist, they are not focused specifically on reproducibility. In this study, we conducted a systematic review of recently published papers in the field of DL to evaluate whether the descriptions of their methodology could allow the reproducibility of their findings. We focused on the Journal of Digital Imaging (JDI), a specialized journal that publishes papers on AI and medical imaging. We used the keyword "Deep Learning" and collected the articles published between January 2020 and January 2022. We screened all the articles and included the ones that reported the development of a DL tool in medical imaging. We extracted the reported details about the dataset, data handling steps, data splitting, model details, and performance metrics of each included article. We found 148 articles. Eighty were included after screening for articles that reported developing a DL model for medical image analysis. Five studies had made their code publicly available, and 35 studies had utilized publicly available datasets. We provide figures showing the ratio and absolute count of reported items from the included studies. According to our cross-sectional study, in JDI publications on DL in medical imaging, authors infrequently report the key elements of their study needed to make it reproducible.
Affiliation(s)
- Mana Moassefi
- Artificial Intelligence Lab, Department of Radiology, Mayo Clinic, Rochester, MN, USA
- Pouria Rouzrokh
- Artificial Intelligence Lab, Department of Radiology, Mayo Clinic, Rochester, MN, USA
- Orthopedic Surgery Artificial Intelligence Laboratory (OSAIL), Department of Orthopedic Surgery, Mayo Clinic, Rochester, MN, USA
- Gian Marco Conte
- Artificial Intelligence Lab, Department of Radiology, Mayo Clinic, Rochester, MN, USA
- Sanaz Vahdati
- Artificial Intelligence Lab, Department of Radiology, Mayo Clinic, Rochester, MN, USA
- Tianyuan Fu
- Department of Radiology, University Hospitals Cleveland, Cleveland, OH, USA
- Aylin Tahmasebi
- Department of Radiology, Thomas Jefferson University, Philadelphia, PA, USA
- Mira Younis
- Cleveland Clinic Children's, Cleveland, OH, USA
- Keyvan Farahani
- National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Amilcare Gentili
- Department of Radiology, University of California, San Diego, CA, USA
- Timothy Kline
- Department of Radiology, Mayo Clinic, Rochester, MN, USA
- Yuankai Huo
- Department of Electrical Engineering & Computer Science, Vanderbilt University, Nashville, TN, USA
- Shiba Kuanar
- Artificial Intelligence Lab, Department of Radiology, Mayo Clinic, Rochester, MN, USA
- Bradley J Erickson
- Artificial Intelligence Lab, Department of Radiology, Mayo Clinic, Rochester, MN, USA
- Shahriar Faghani
- Artificial Intelligence Lab, Department of Radiology, Mayo Clinic, Rochester, MN, USA
| |
Collapse
|
47
|
Lonsdale H, Gray GM, Ahumada LM, Matava CT. Machine Vision and Image Analysis in Anesthesia: Narrative Review and Future Prospects. Anesth Analg 2023; 137:830-840. [PMID: 37712476 DOI: 10.1213/ane.0000000000006679] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 09/16/2023]
Abstract
Machine vision describes the use of artificial intelligence to interpret, analyze, and derive predictions from image or video data. Machine vision-based techniques are already in clinical use in radiology, ophthalmology, and dermatology, where some applications currently equal or exceed the performance of specialty physicians in areas of image interpretation. While machine vision in anesthesia has many potential applications, its development remains in its infancy in our specialty. Early research for machine vision in anesthesia has focused on automated recognition of anatomical structures during ultrasound-guided regional anesthesia or line insertion; recognition of the glottic opening and vocal cords during video laryngoscopy; prediction of the difficult airway using facial images; and clinical alerts for endobronchial intubation detected on chest radiograph. Current machine vision applications measuring the distance between endotracheal tube tip and carina have demonstrated noninferior performance compared to board-certified physicians. The performance and potential uses of machine vision for anesthesia will only grow with the advancement of underlying machine vision algorithm technical performance developed outside of medicine, such as convolutional neural networks and transfer learning. This article summarizes recently published works of interest, provides a brief overview of techniques used to create machine vision applications, explains frequently used terms, and discusses challenges the specialty will encounter as we embrace the advantages that this technology may bring to future clinical practice and patient care. As machine vision emerges onto the clinical stage, it is critically important that anesthesiologists are prepared to confidently assess which of these devices are safe, appropriate, and bring added value to patient care.
Affiliation(s)
- Hannah Lonsdale
- Division of Pediatric Anesthesiology, Department of Anesthesiology, Vanderbilt University Medical Center, Nashville, Tennessee
- Geoffrey M Gray
- Center for Pediatric Data Science and Analytics Methodology, Johns Hopkins All Children's Hospital, St Petersburg, Florida
- Luis M Ahumada
- Center for Pediatric Data Science and Analytics Methodology, Johns Hopkins All Children's Hospital, St Petersburg, Florida
- Clyde T Matava
- Department of Anesthesia and Pain Medicine, The Hospital for Sick Children, Toronto, Ontario, Canada
- Department of Anesthesiology and Pain Medicine, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
|
48
|
Hu Y, Jiang S, Yu X, Huang S, Lan Z, Yu Y, Zhang X, Chen J, Zhang J. Automatic epicardial adipose tissue segmentation in pulmonary computed tomography venography using nnU-Net. Quant Imaging Med Surg 2023; 13:6482-6492. [PMID: 37869313 PMCID: PMC10585557 DOI: 10.21037/qims-23-233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/25/2023] [Accepted: 07/21/2023] [Indexed: 10/24/2023]
Abstract
Background Epicardial adipose tissue (EAT) is a key aspect in the investigation of cardiac pathophysiology. We sought to develop a deep learning (DL) model for fully automatic extraction and quantification of EAT through pulmonary computed tomography venography (PCTV) images. Methods In this retrospective study, we included 128 patients with atrial fibrillation and PCTV from 2 hospitals. A DL model for automated EAT segmentation was developed from a training set of 51 patients and a validation set of 13 patients from hospital A. The algorithm was further validated using an internal test set of 16 patients from hospital A and an external test set of 48 patients from hospital B. The consistency and measurement agreement of EAT quantification were compared between the DL model and the conventional manual protocol using the Dice score coefficient (DSC), Hausdorff distance (HD95), Pearson correlation coefficient, and Bland-Altman plot. Results In the internal and external test set, automated segmentation with DL was successful in all cases. The total analysis time was shorter for DL than for manual reconstruction (5.43±2.52 vs. 106.20±15.90 min; P<0.001). The EAT segmented with the DL model had good consistency with manual segmentation (the DSC of the internal and external test sets were 0.92±0.02 and 0.88±0.03, respectively). The quantification of EAT evaluated with the 2 methods showed excellent correlation (all correlation coefficients >0.9; all P values <0.001) and minimal measurement difference. Conclusions The proposed DL model achieved fully automatic quantification of EAT from PCTV images. The yielded results were highly consistent with those of manual quantification.
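The segmentation agreement reported in this abstract rests on the Dice score coefficient (DSC), twice the overlap of two masks divided by their combined size. A minimal sketch of that metric follows; the masks and sizes here are toy values for illustration, not data from the study:

```python
import numpy as np

def dice_score(a: np.ndarray, b: np.ndarray) -> float:
    """Dice similarity coefficient (DSC) between two binary masks."""
    a = a.astype(bool)
    b = b.astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: perfect agreement by convention
    return 2.0 * np.logical_and(a, b).sum() / denom

# Toy 2D masks standing in for automated vs. manual EAT segmentations
auto_mask = np.zeros((8, 8), dtype=bool)
manual_mask = np.zeros((8, 8), dtype=bool)
auto_mask[2:6, 2:6] = True    # 16 voxels
manual_mask[3:7, 2:6] = True  # 16 voxels, 12 of which overlap
print(round(dice_score(auto_mask, manual_mask), 3))  # 2*12/(16+16) = 0.75
```

A DSC near 0.9, as reported above for the internal test set, indicates that the automated and manual masks overlap almost completely relative to their combined volume.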
Affiliation(s)
- Yifan Hu
- Department of Radiology, Dongtai People’s Hospital, Yancheng, China
- Shanshan Jiang
- Department of Clinical and Technical Support, Philips Healthcare, Xi’an, China
- Xiaojin Yu
- Department of Radiology, Dongtai People’s Hospital, Yancheng, China
- Sicong Huang
- Department of Clinical and Technical Support, Philips Healthcare, Xi’an, China
- Ziting Lan
- Department of Radiology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Yarong Yu
- Department of Radiology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Xiaohui Zhang
- Department of Clinical Science, Philips Healthcare, Shanghai, China
- Jin Chen
- Department of Radiology, Dongtai People’s Hospital, Yancheng, China
- Jiayin Zhang
- Department of Radiology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
|
49
|
Kelly BS, Judge C, Hoare S, Colleran G, Lawlor A, Killeen RP. How to apply evidence-based practice to the use of artificial intelligence in radiology (EBRAI) using the data algorithm training output (DATO) method. Br J Radiol 2023; 96:20220215. [PMID: 37086062 PMCID: PMC10546467 DOI: 10.1259/bjr.20220215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2022] [Revised: 03/17/2023] [Accepted: 03/22/2023] [Indexed: 04/23/2023] Open
Abstract
OBJECTIVE As the number of radiology artificial intelligence (AI) papers increases, there are new challenges for reviewing the AI literature, as well as differences to be aware of for those familiar with the clinical radiology literature. We aim to introduce a tool to aid in this process. METHODS In evidence-based practice (EBP), you must Ask, Search, Appraise, Apply and Evaluate to come to an evidence-based decision. The bottom-up evidence-based radiology (EBR) method allows for a systematic way of choosing the correct radiological investigation or treatment. Just as the population intervention comparison outcome (PICO) method is an established means of asking an answerable question, the data algorithm training output (DATO) method introduced herein complements PICO by considering Data, Algorithm, Training and Output in the use of AI to answer the question. RESULTS We illustrate the DATO method with a worked example concerning bone age assessment from skeletal radiographs. After a systematic search, 17 bone age estimation papers (5 of which externally validated their results) were appraised. The paper with the best DATO metrics found that an ensemble model combining uncorrelated, high performing simple models should achieve error rates comparable to human performance. CONCLUSION Considering DATO in the application of EBR to AI is a simple systematic approach to this potentially daunting subject. ADVANCES IN KNOWLEDGE The growth of AI in radiology means that radiologists and related professionals now need to be able to review not only clinical radiological literature but also research using AI methods. Considering Data, Algorithm, Training and Output in the application of EBR to AI is a simple systematic approach to this potentially daunting subject.
Affiliation(s)
- Aonghus Lawlor
- Insight Centre for Data Analytics, University College Dublin, Belfield, Dublin, Ireland
|
50
|
Sachpekidis C, Enqvist O, Ulén J, Kopp-Schneider A, Pan L, Jauch A, Hajiyianni M, John L, Weinhold N, Sauer S, Goldschmidt H, Edenbrandt L, Dimitrakopoulou-Strauss A. Application of an artificial intelligence-based tool in [ 18F]FDG PET/CT for the assessment of bone marrow involvement in multiple myeloma. Eur J Nucl Med Mol Imaging 2023; 50:3697-3708. [PMID: 37493665 PMCID: PMC10547616 DOI: 10.1007/s00259-023-06339-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/09/2023] [Accepted: 07/09/2023] [Indexed: 07/27/2023]
Abstract
PURPOSE [18F]FDG PET/CT is an imaging modality of high performance in multiple myeloma (MM). Nevertheless, the inter-observer reproducibility in PET/CT scan interpretation may be hampered by the different patterns of bone marrow (BM) infiltration in the disease. Although many approaches have been recently developed to address the issue of standardization, none can yet be considered a standard method in the interpretation of PET/CT. We herein aim to validate a novel three-dimensional deep learning-based tool on PET/CT images for automated assessment of the intensity of BM metabolism in MM patients. MATERIALS AND METHODS Whole-body [18F]FDG PET/CT scans of 35 consecutive, previously untreated MM patients were studied. All patients were investigated in the context of an open-label, multicenter, randomized, active-controlled, phase 3 trial (GMMG-HD7). Qualitative (visual) analysis classified the PET/CT scans into three groups based on the presence and number of focal [18F]FDG-avid lesions as well as the degree of diffuse [18F]FDG uptake in the BM. The proposed automated method for BM metabolism assessment is based on an initial CT-based segmentation of the skeleton, its transfer to the SUV PET images, the subsequent application of different SUV thresholds, and refinement of the resulting regions using postprocessing. In the present analysis, six different SUV thresholds (Approaches 1-6) were applied for the definition of pathological tracer uptake in the skeleton [Approach 1: liver SUVmedian × 1.1 (axial skeleton), gluteal muscles SUVmedian × 4 (extremities). Approach 2: liver SUVmedian × 1.5 (axial skeleton), gluteal muscles SUVmedian × 4 (extremities). Approach 3: liver SUVmedian × 2 (axial skeleton), gluteal muscles SUVmedian × 4 (extremities). Approach 4: ≥ 2.5. Approach 5: ≥ 2.5 (axial skeleton), ≥ 2.0 (extremities). Approach 6: SUVmax liver]. 
Using the resulting masks, subsequent calculations of the whole-body metabolic tumor volume (MTV) and total lesion glycolysis (TLG) in each patient were performed. A correlation analysis was performed between the automated PET values and the results of the visual PET/CT analysis as well as the histopathological, cytogenetical, and clinical data of the patients. RESULTS BM segmentation and calculation of MTV and TLG after the application of the deep learning tool were feasible in all patients. A significant positive correlation (p < 0.05) was observed between the results of the visual analysis of the PET/CT scans for the three patient groups and the MTV and TLG values after the employment of all six [18F]FDG uptake thresholds. In addition, there were significant differences between the three patient groups with regard to their MTV and TLG values for all applied thresholds of pathological tracer uptake. Furthermore, we could demonstrate a significant, moderate, positive correlation of BM plasma cell infiltration and plasma levels of β2-microglobulin with the automated quantitative PET/CT parameters MTV and TLG after utilization of Approaches 1, 2, 4, and 5. CONCLUSIONS The automated, volumetric, whole-body PET/CT assessment of the BM metabolic activity in MM is feasible with the herein applied method and correlates with clinically relevant parameters in the disease. This methodology offers a potentially reliable tool in the direction of optimization and standardization of PET/CT interpretation in MM. Based on the present promising findings, the deep learning-based approach will be further evaluated in future prospective studies with larger patient cohorts.
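The metabolic tumor volume (MTV) and total lesion glycolysis (TLG) computed from the thresholded masks follow standard definitions: MTV is the volume of voxels above the pathological SUV threshold, and TLG is the SUV sum over those voxels times the voxel volume (equivalently SUVmean × MTV). A toy sketch under those standard definitions, with illustrative names and values not drawn from the study:

```python
import numpy as np

def mtv_tlg(suv: np.ndarray, skeleton_mask: np.ndarray,
            threshold: float, voxel_ml: float):
    """MTV (mL) and TLG for skeleton voxels whose SUV meets a
    pathological-uptake threshold (e.g., Approach 4: SUV >= 2.5)."""
    lesion = skeleton_mask & (suv >= threshold)
    mtv = lesion.sum() * voxel_ml        # volume of suprathreshold voxels
    tlg = suv[lesion].sum() * voxel_ml   # equals SUVmean * MTV
    return mtv, tlg

# Toy example: four skeleton voxels of 2 mL each with SUVs 1.0-4.0;
# at threshold 2.5 only the 3.0 and 4.0 voxels count as lesion
suv = np.array([1.0, 2.0, 3.0, 4.0])
mask = np.array([True, True, True, True])
mtv, tlg = mtv_tlg(suv, mask, threshold=2.5, voxel_ml=2.0)
print(mtv, tlg)  # 4.0 (mL), 14.0
```

Each of the six thresholding approaches above would simply change the `threshold` passed in (fixed SUV cutoffs, or multiples of the liver or gluteal-muscle reference SUV).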
Affiliation(s)
- Christos Sachpekidis
- Clinical Cooperation Unit Nuclear Medicine, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
- Olof Enqvist
- Eigenvision AB, Malmö, Sweden
- Department of Electrical Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Leyun Pan
- Clinical Cooperation Unit Nuclear Medicine, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
- Anna Jauch
- Institute of Human Genetics, University of Heidelberg, Heidelberg, Germany
- Marina Hajiyianni
- Department of Internal Medicine V, University Hospital Heidelberg and National Center for Tumor Diseases (NCT), Heidelberg, Germany
- Lukas John
- Department of Internal Medicine V, University Hospital Heidelberg and National Center for Tumor Diseases (NCT), Heidelberg, Germany
- Niels Weinhold
- Department of Internal Medicine V, University Hospital Heidelberg and National Center for Tumor Diseases (NCT), Heidelberg, Germany
- Sandra Sauer
- Department of Internal Medicine V, University Hospital Heidelberg and National Center for Tumor Diseases (NCT), Heidelberg, Germany
- Hartmut Goldschmidt
- Department of Internal Medicine V, University Hospital Heidelberg and National Center for Tumor Diseases (NCT), Heidelberg, Germany
- Lars Edenbrandt
- Department of Clinical Physiology, Region Västra Götaland, Sahlgrenska University Hospital, Gothenburg, Sweden
- Department of Molecular and Clinical Medicine, Institute of Medicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Antonia Dimitrakopoulou-Strauss
- Clinical Cooperation Unit Nuclear Medicine, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
|