1
Sarangi PK, Narayan RK, Mohakud S, Vats A, Sahani D, Mondal H. Assessing the Capability of ChatGPT, Google Bard, and Microsoft Bing in Solving Radiology Case Vignettes. Indian J Radiol Imaging 2024; 34:276-282. [PMID: 38549897; PMCID: PMC10972658; DOI: 10.1055/s-0043-1777746]
Abstract
Background The field of radiology relies on accurate interpretation of medical images for effective diagnosis and patient care. Recent advancements in artificial intelligence (AI) and natural language processing have sparked interest in exploring the potential of AI models to assist radiologists. However, limited research has assessed the performance of AI models in radiology case interpretation, particularly in comparison to human experts. Objective This study aimed to evaluate the performance of ChatGPT, Google Bard, and Bing in solving radiology case vignettes (Fellowship of the Royal College of Radiologists 2A [FRCR2A] examination-style questions) by comparing their responses to those provided by two radiology residents. Methods A total of 120 multiple-choice questions based on radiology case vignettes were formulated according to the pattern of the FRCR2A examination. The questions were presented to ChatGPT, Google Bard, and Bing. Two residents took the examination with the same questions in 3 hours. The responses generated by the AI models were collected and compared to the answer keys, and the explanations given for the answers were rated by the two radiologists. A cutoff of 60% was set as the passing score. Results The two residents (63.33 and 57.5%) outperformed the three AI models: Bard (44.17%), Bing (53.33%), and ChatGPT (45%), but only one resident passed the examination. The response patterns among the five respondents were significantly different (p = 0.0117). In addition, the agreement among the generative AI models was significant (intraclass correlation coefficient [ICC] = 0.628), but there was no agreement between the residents (kappa = -0.376). The explanations provided by the generative AI models in support of their answers were 44.72% accurate. Conclusion Humans exhibited superior accuracy compared to the AI models, showcasing a stronger comprehension of the subject matter.
None of the three AI models included in the study achieved the minimum percentage needed to pass an FRCR2A examination. However, the generative AI models showed significant agreement in their answers, whereas the residents exhibited low agreement, highlighting a lack of consistency in their responses.
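The pass/fail outcome reported above follows directly from the per-respondent scores and the 60% cutoff; a minimal sketch, using only the figures quoted in this abstract:

```python
# Scores reported in the Results section (percent correct) and the 60% pass
# cutoff from the Methods section of the abstract above.
scores = {
    "Resident 1": 63.33,
    "Resident 2": 57.5,
    "Bard": 44.17,
    "Bing": 53.33,
    "ChatGPT": 45.0,
}
PASS_CUTOFF = 60.0

results = {name: score >= PASS_CUTOFF for name, score in scores.items()}
passed = [name for name, ok in results.items() if ok]
print(passed)  # only one of the five respondents clears the cutoff
```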
Affiliation(s)
- Pradosh Kumar Sarangi: Department of Radiodiagnosis, All India Institute of Medical Sciences, Deoghar, Jharkhand, India
- Ravi Kant Narayan: Department of Anatomy, ESIC Medical College & Hospital, Bihta, Patna, Bihar, India
- Sudipta Mohakud: Department of Radiodiagnosis, All India Institute of Medical Sciences, Bhubaneswar, Odisha, India
- Aditi Vats: Department of Radiodiagnosis, All India Institute of Medical Sciences, Bhubaneswar, Odisha, India
- Debabrata Sahani: Department of Radiodiagnosis, All India Institute of Medical Sciences, Bhubaneswar, Odisha, India
- Himel Mondal: Department of Physiology, All India Institute of Medical Sciences, Deoghar, Jharkhand, India
2
Lockwood P, Burton C, Woznitza N, Shaw T. Assessing the barriers and enablers to the implementation of the diagnostic radiographer musculoskeletal X-ray reporting service within the NHS in England: a systematic literature review. BMC Health Serv Res 2023; 23:1270. [PMID: 37974199; PMCID: PMC10655396; DOI: 10.1186/s12913-023-10161-y]
Abstract
INTRODUCTION The United Kingdom (UK) government's healthcare policy in the early 1990s paved the way for the adoption of skills mix development and the implementation of diagnostic radiographers' X-ray reporting service. Current clinical practice within the public UK healthcare system reflects the same pressures of increased demand for patient imaging and limited capacity of the reporting workforce (radiographers and radiologists) as in the 1990s. This study aimed to identify, define and assess the longitudinal macro, meso, and micro barriers and enablers to the implementation of the diagnostic radiographer musculoskeletal X-ray reporting service in the National Health Service (NHS) in England. METHODS Multiple independent databases were searched, including PubMed, Ovid MEDLINE, Embase, CINAHL, and Google Scholar, as well as journal databases (Scopus, Wiley), healthcare databases (NHS Evidence Database, Cochrane Library) and grey literature databases (OpenGrey, GreyNet International, and the British Library EThOS repository), and the search was recorded in a PRISMA flow chart. A combination of keywords, Boolean logic, truncation, parentheses and wildcards was applied, with inclusion/exclusion criteria and a time frame of 1995-2022. The literature was assessed against the Joanna Briggs Institute's critical appraisal checklists. Meta-aggregation was used to synthesize each paper, which was coded using NVivo, with context grouped into macro-, meso-, and micro-level sources and categorised into subgroups of enablers and barriers. RESULTS The wide and diverse range of data (n = 241 papers) identified barriers and enablers of implementation, which were categorised into measures of macro, meso, and micro levels, and thematic categories of context, culture, environment, and leadership. CONCLUSION The literature since 1995 has reframed the debates on implementation of the radiographer reporting role and has been instrumental in shaping clinical practice.
There has been clear influence upon both meso-level (professional body) and macro-level (governmental/health service) policies and guidance, which have shaped change at micro-level NHS Trust organisations. There is evidence of a shift in culturally entrenched legacy perspectives within and between different meso-level professional bodies around skills mix acceptance and role boundaries. This has helped shape capacity building of the reporting workforce. All of these have contributed to conceptual understandings of the skills mix workforce within modern radiology services.
Affiliation(s)
- P Lockwood: School of Allied Health Professions, Faculty of Medicine, Health and Social Care, Canterbury Christ Church University, North Holmes Road, Canterbury, Kent, UK (present address)
- C Burton: School of Allied Health Professions, Faculty of Medicine, Health and Social Care, Canterbury Christ Church University, North Holmes Road, Canterbury, Kent, UK (present address)
- N Woznitza: School of Allied Health Professions, Faculty of Medicine, Health and Social Care, Canterbury Christ Church University, North Holmes Road, Canterbury, Kent, UK (present address); Radiology Department, University College London Hospitals NHS Foundation Trust, 235 Euston Road, London, UK
- T Shaw: School of Allied Health Professions, Faculty of Medicine, Health and Social Care, Canterbury Christ Church University, North Holmes Road, Canterbury, Kent, UK (present address)
3
Shelmerdine SC, Martin H, Shirodkar K, Shamshuddin S, Weir-McCall JR. Can artificial intelligence pass the Fellowship of the Royal College of Radiologists examination? Multi-reader diagnostic accuracy study. BMJ 2022; 379:e072826. [PMID: 36543352; PMCID: PMC9768816; DOI: 10.1136/bmj-2022-072826]
Abstract
OBJECTIVE To determine whether an artificial intelligence candidate could pass the rapid (radiographic) reporting component of the Fellowship of the Royal College of Radiologists (FRCR) examination. DESIGN Prospective multi-reader diagnostic accuracy study. SETTING United Kingdom. PARTICIPANTS One artificial intelligence candidate (Smarturgences, Milvue) and 26 radiologists who had passed the FRCR examination in the preceding 12 months. MAIN OUTCOME MEASURES Accuracy and pass rate of the artificial intelligence compared with radiologists across 10 mock FRCR rapid reporting examinations (each examination containing 30 radiographs, requiring a 90% accuracy rate to pass). RESULTS When non-interpretable images were excluded from the analysis, the artificial intelligence candidate achieved an average overall accuracy of 79.5% (95% confidence interval 74.1% to 84.3%) and passed two of 10 mock FRCR examinations. The average radiologist achieved an average accuracy of 84.8% (76.1-91.9%) and passed four of 10 mock examinations. The sensitivity for the artificial intelligence was 83.6% (95% confidence interval 76.2% to 89.4%) and the specificity was 75.2% (66.7% to 82.5%), compared with summary estimates across all radiologists of 84.1% (81.0% to 87.0%) and 87.3% (85.0% to 89.3%). Of the 148/300 radiographs that were correctly interpreted by >90% of radiologists, the artificial intelligence candidate was incorrect in 14/148 (9%). Of the 20/300 radiographs that most (>50%) radiologists interpreted incorrectly, the artificial intelligence candidate was correct in 10/20 (50%). Most imaging pitfalls related to interpretation of musculoskeletal rather than chest radiographs. CONCLUSIONS When special dispensation for the artificial intelligence candidate was provided (that is, exclusion of non-interpretable images), the artificial intelligence candidate was able to pass two of 10 mock examinations.
Potential exists for the artificial intelligence candidate to improve its radiographic interpretation skills by focusing on musculoskeletal cases and learning to interpret radiographs of the axial skeleton and abdomen that are currently considered "non-interpretable."
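The accuracy, sensitivity, and specificity figures above are standard confusion-matrix metrics. A minimal sketch of how they are derived, using hypothetical counts (the paper's raw confusion-matrix counts are not given in the abstract):

```python
def metrics(tp, tn, fp, fn):
    """Standard confusion-matrix metrics: accuracy, sensitivity, specificity."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return accuracy, sensitivity, specificity

# Hypothetical counts chosen only to illustrate the calculation.
acc, sens, spec = metrics(tp=46, tn=40, fp=10, fn=9)
print(f"accuracy={acc:.1%} sensitivity={sens:.1%} specificity={spec:.1%}")
```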
Affiliation(s)
- Susan Cheng Shelmerdine: Department of Clinical Radiology, Great Ormond Street Hospital for Children, London, UK; UCL Great Ormond Street Institute of Child Health, Great Ormond Street Hospital for Children, London, UK; NIHR Great Ormond Street Hospital Biomedical Research Centre, London, UK; Department of Clinical Radiology, St George's Hospital, London, UK
- Helena Martin: Department of Clinical Radiology, St George's Hospital, London, UK
- Kapil Shirodkar: Department of Radiology, University Hospitals of Morecambe Bay NHS Trust, Royal Lancaster Infirmary, Lancaster, UK
- Sameer Shamshuddin: Department of Radiology, University Hospitals of Morecambe Bay NHS Trust, Royal Lancaster Infirmary, Lancaster, UK
- Jonathan Richard Weir-McCall: School of Clinical Medicine, University of Cambridge, Cambridge, UK; Department of Radiology, Royal Papworth Hospital, Cambridge, UK
4
Alkhalaf ZSA, Yakar D, de Groot JC, Dierckx RAJO, Kwee TC. Medical knowledge and clinical productivity: independently correlated metrics during radiology residency. Eur Radiol 2021; 31:5344-5350. [PMID: 33449176; PMCID: PMC8213654; DOI: 10.1007/s00330-020-07646-3]
Abstract
Objective To determine the association between medical knowledge relevant to radiology practice (as measured by the Dutch radiology progress test [DRPT]) and clinical productivity during radiology residency. Methods This study analyzed the results of 6 DRPTs and time period-matched clinical production points of radiology residents affiliated with a tertiary care academic medical center between 2013 and 2016. Spearman correlation analysis was performed to determine the association between DRPT percentile scores and average daily clinical production points. Linear regression analyses were performed to determine the association of DRPT percentile scores with average daily clinical production points, adjusted for the age and gender of the radiology resident and postgraduate year. Results Eighty-four DRPTs with time period-matched clinical production points were included. These 84 DRPTs were taken by 29 radiology residents (18 males and 11 females) with a median age of 31 years (range: 26-38 years). The Spearman correlation coefficient between DRPT percentile scores and average daily clinical production points was 0.550 (95% confidence interval: 0.381-0.694) (p < 0.001), indicating a significant moderate positive association. On multivariate analysis, average daily clinical production points (β coefficient of 0.035, p = 0.003), female gender of the radiology resident (β coefficient of 12.690, p = 0.001), and postgraduate year (β coefficient of 10.179, p < 0.001) were significantly associated with DRPT percentile scores. These three independent variables achieved an adjusted R2 of 0.527. Conclusion Clinical productivity is independently associated with medical knowledge relevant to radiology practice during radiology residency. These findings indicate that the clinical productivity of a resident could be a potentially relevant metric in a radiology training program.
Key Points • There is a significant moderate correlation between medical knowledge relevant to radiology practice and clinical productivity during radiology residency. • Medical knowledge relevant to radiology practice remains independently associated with clinical productivity during radiology residency after adjustment for postgraduate year and gender. • Clinical productivity of a resident may be regarded as a potentially relevant metric in a radiology training program.
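The headline statistic in this study is a Spearman rank correlation. A self-contained sketch of the computation on synthetic data (the actual DRPT scores and production points are not public, so the numbers here are illustrative only):

```python
import random

def ranks(xs):
    """1-based ranks, with average ranks assigned to ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of tied positions i+1 .. j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Synthetic stand-ins for the study's 84 observations: daily production points
# and DRPT percentiles constructed to be moderately positively related.
random.seed(0)
production = [random.uniform(10, 50) for _ in range(84)]
drpt = [p * 1.5 + random.gauss(0, 15) for p in production]
rho = spearman(production, drpt)
print(f"rho = {rho:.3f}")  # moderate positive, same sign as the study's 0.550
```

In practice `scipy.stats.spearmanr` would also return a p-value; the from-scratch version above just makes the rank-then-correlate idea explicit.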
Affiliation(s)
- Zahraa S A Alkhalaf, Derya Yakar, Jan Cees de Groot, Rudi A J O Dierckx, Thomas C Kwee: Department of Radiology, Nuclear Medicine and Molecular Imaging, Medical Imaging Center, University Medical Center Groningen, University of Groningen, P.O. Box 30.001, 9700 RB, Groningen, The Netherlands
5
Tan ASM. Review of Junior Resident Plain Film Reporting and Audit in Singapore. J Grad Med Educ 2020; 12:493-497. [PMID: 32879692; PMCID: PMC7450738; DOI: 10.4300/jgme-d-19-00678.1]
Abstract
BACKGROUND Graduate medical education in Singapore recently underwent significant restructuring, leading to the accreditation of residency programs by the Accreditation Council for Graduate Medical Education-International (ACGME-I). In radiology, this involved a change in the teaching and quality assurance of plain film (PF) reporting. PF reported by junior residents (postgraduate years 1-3) are subject to a 50% random audit. To date, national data on junior resident performance in PF reporting have not been published. OBJECTIVE We reviewed performance in PF reporting under the current teaching and audit framework. METHODS Retrospective review of junior resident-reported PF audit data from all 3 radiology residency programs in Singapore. The number of residents audited, the number of PF reported and audited, and major discrepancy rates were analyzed. RESULTS On average, 86 440 PF were audited annually nationwide from an estimated 184 288 junior resident-reported PF. Each program trained between 4 and 24 junior residents annually (mean 15), averaging about 44 each year nationwide. A mean of 28 813 PF were audited annually in each program (range 4355-50 880). An estimated mean of 4148 PF (range 1452-9752) were reported per junior resident per year, about 346 PF per month. The major discrepancy rate ranged from 0.04% to 1.13% (mean 0.34%). One resident required remediation in the study period. CONCLUSIONS Structured residency training in Singapore has produced a high level of junior resident competency in PF interpretation.
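The audit figures in this abstract are internally consistent; a quick back-of-envelope check using only the numbers quoted above:

```python
# Figures quoted in the abstract above.
audited_per_year = 86_440      # PF audited annually nationwide
reported_per_year = 184_288    # estimated junior-resident-reported PF annually
pf_per_resident_per_year = 4148

audit_fraction = audited_per_year / reported_per_year
print(f"audit fraction: {audit_fraction:.1%}")  # close to the stated 50% random audit
print(round(pf_per_resident_per_year / 12))     # monthly volume per resident, ~346
```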
6
Ooi SKG, Makmur A, Soon AYQ, Fook-Chong S, Liew C, Sia SY, Ting YH, Lim CY. Attitudes toward artificial intelligence in radiology with learner needs assessment within radiology residency programmes: a national multi-programme survey. Singapore Med J 2019; 62:126-134. [PMID: 31680181; DOI: 10.11622/smedj.2019141]
Abstract
INTRODUCTION We aimed to assess the attitudes and learner needs of radiology residents and faculty radiologists regarding artificial intelligence (AI) and machine learning (ML) in radiology. METHODS A web-based questionnaire, designed using SurveyMonkey, was sent out to residents and faculty radiologists in all three radiology residency programmes in Singapore. The questionnaire comprised four sections and aimed to evaluate respondents' current experience, attempts at self-learning, perceptions of career prospects and expectations of an AI/ML curriculum in their residency programme. Respondents' anonymity was ensured. RESULTS A total of 125 respondents (86 male, 39 female; 70 residents, 55 faculty radiologists) completed the questionnaire. The majority agreed that AI/ML will drastically change radiology practice (88.8%) and makes radiology more exciting (76.0%), and most would still choose to specialise in radiology if given a choice (80.0%). 64.8% viewed themselves as novices in their understanding of AI/ML, 76.0% planned to further advance their AI/ML knowledge and 67.2% were keen to get involved in an AI/ML research project. An overwhelming majority (84.8%) believed that AI/ML knowledge should be taught during residency, and most opined that this was as important as imaging physics and clinical skills/knowledge curricula (80.0% and 72.8%, respectively). More than half thought that their residency programme had not adequately implemented AI/ML teaching (59.2%). In subgroup analyses, male and tech-savvy respondents were more involved in AI/ML activities, leading to better technical understanding. CONCLUSION A growing optimism towards radiology undergoing technological transformation and AI/ML implementation has led to a strong demand for an AI/ML curriculum in residency education.
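The percentages above can be converted back to respondent counts out of n = 125 as a plausibility check; a minimal sketch (the statement labels are paraphrases of the abstract, not the survey's exact wording):

```python
n = 125  # total respondents reported in the abstract
percentages = {
    "AI/ML will drastically change radiology practice": 88.8,
    "makes radiology more exciting": 76.0,
    "would still choose radiology": 80.0,
    "AI/ML should be taught during residency": 84.8,
    "programme has not adequately implemented AI/ML teaching": 59.2,
}
counts = {statement: round(n * pct / 100) for statement, pct in percentages.items()}
for statement, count in counts.items():
    print(f"{count}/{n} agreed: {statement}")
```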
Affiliation(s)
- Su Kai Gideon Ooi: Department of Nuclear Medicine and Molecular Imaging, Division of Radiological Sciences, Singapore General Hospital, Singapore
- Andrew Makmur: Department of Diagnostic Imaging, National University Hospital, Singapore
- Charlene Liew: Department of Diagnostic Radiology, Changi General Hospital, Singapore
- Soon Yiew Sia: Department of Diagnostic Imaging, National University Hospital, Singapore
- Yong Han Ting: Department of Diagnostic Radiology, Tan Tock Seng Hospital, Singapore
- Chee Yeong Lim: Department of Diagnostic Radiology, Division of Radiological Sciences, Singapore General Hospital, Singapore