1
Weber S, Wyszynski M, Godefroid M, Plattfaut R, Niehaves B. How do medical professionals make sense (or not) of AI? A social-media-based computational grounded theory study and an online survey. Comput Struct Biotechnol J 2024; 24:146-159. [PMID: 38434249] [PMCID: PMC10904922] [DOI: 10.1016/j.csbj.2024.02.009]
Abstract
To investigate the opinions and attitudes of medical professionals towards adopting AI-enabled healthcare technologies in their daily business, we used a mixed-methods approach. Study 1 employed a qualitative computational grounded theory approach analyzing 181 Reddit threads in the subreddit r/medicine. Utilizing an unsupervised machine learning clustering method, we identified three key themes: (1) consequences of AI, (2) the physician-AI relationship, and (3) a proposed way forward. In particular, Reddit posts related to the first two themes indicated that medical professionals' fear of being replaced by AI and skepticism toward AI played a major role in the discussions. Moreover, the results suggest that this fear is driven by low or moderate knowledge about AI. Posts related to the third theme focused on factual discussions about how AI and medicine have to be designed to become broadly adopted in health care. Study 2 quantitatively examined the relationship between the fear of AI, knowledge about AI, and medical professionals' intention to use AI-enabled technologies in more detail. Results based on a sample of 223 medical professionals who participated in the online survey revealed that the intention to use AI technologies increases with increasing knowledge about AI and that this effect is moderated by the fear of being replaced by AI.
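The moderation effect reported in Study 2 (the link between AI knowledge and use intention varying with fear of being replaced) corresponds statistically to an interaction term in a regression model. The sketch below fits such a model to simulated data; the coefficients, noise level, and variable names are illustrative assumptions, not the study's actual model or data.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 223                                    # sample size matching the survey

# Simulated, standardized predictors (illustrative only).
knowledge = rng.normal(size=n)             # knowledge about AI
fear = rng.normal(size=n)                  # fear of being replaced by AI

# Assumed ground truth: knowledge helps, fear dampens that effect.
intention = (0.5 * knowledge - 0.2 * fear
             - 0.3 * knowledge * fear
             + rng.normal(scale=0.5, size=n))

# Design matrix with an interaction (moderation) term.
X = np.column_stack([np.ones(n), knowledge, fear, knowledge * fear])
beta, *_ = np.linalg.lstsq(X, intention, rcond=None)

for name, b in zip(["intercept", "knowledge", "fear", "knowledge x fear"], beta):
    print(f"{name:>16}: {b:+.2f}")
```

A significant negative interaction coefficient is what "moderated by fear" means operationally: the slope of intention on knowledge shrinks as fear grows.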
Affiliation(s)
- Sebastian Weber
- University of Bremen, Digital Public, Bibliothekstr. 1, 28359 Bremen, Germany
- Marc Wyszynski
- University of Bremen, Digital Public, Bibliothekstr. 1, 28359 Bremen, Germany
- Marie Godefroid
- University of Siegen, Information Systems, Kohlbettstr. 15, 57072 Siegen, Germany
- Ralf Plattfaut
- University of Duisburg-Essen, Information Systems and Transformation Management, Universitätsstr. 9, 45141 Essen, Germany
- Bjoern Niehaves
- University of Bremen, Digital Public, Bibliothekstr. 1, 28359 Bremen, Germany
2
Kenny R, Fischhoff B, Davis A, Canfield C. Improving Social Bot Detection Through Aid and Training. Hum Factors 2024; 66:2323-2344. [PMID: 37963198] [PMCID: PMC11382440] [DOI: 10.1177/00187208231210145]
Abstract
OBJECTIVE We test the effects of three aids on individuals' ability to detect social bots among Twitter personas: a bot indicator score, a training video, and a warning. BACKGROUND Detecting social bots can prevent online deception. We use a simulated social media task to evaluate three aids. METHOD Lay participants judged whether each of 60 Twitter personas was a human or social bot in a simulated online environment; the probability of each persona being a bot was estimated from the agreement between three machine learning algorithms. Experiment 1 compared a control group and two intervention groups, one provided with a bot indicator score for each tweet and the other with a warning about social bots. Experiment 2 compared a control group and two intervention groups, one receiving the bot indicator scores and the other a training video focused on heuristics for identifying social bots. RESULTS The bot indicator score intervention improved predictive performance and reduced overconfidence in both experiments. The training video was also effective, although somewhat less so. The warning had no effect. Participants rarely reported willingness to share content for a persona that they labeled as a bot, even when they agreed with it. CONCLUSIONS Informative interventions improved social bot detection; warning alone did not. APPLICATION We offer an experimental testbed and methodology that can be used to evaluate and refine interventions designed to reduce vulnerability to social bots. We show the value of two interventions that could be applied in many settings.
Affiliation(s)
- Ryan Kenny
- United States Army, Fayetteville, NC, USA
- Alex Davis
- Carnegie Mellon University, Pittsburgh, PA, USA
- Casey Canfield
- Missouri University of Science and Technology, Rolla, MO, USA
3
Wahid KA, Kaffey ZY, Farris DP, Humbert-Vidan L, Moreno AC, Rasmussen M, Ren J, Naser MA, Netherton TJ, Korreman S, Balakrishnan G, Fuller CD, Fuentes D, Dohopolski MJ. Artificial intelligence uncertainty quantification in radiotherapy applications - A scoping review. Radiother Oncol 2024; 201:110542. [PMID: 39299574] [DOI: 10.1016/j.radonc.2024.110542]
Abstract
BACKGROUND/PURPOSE The use of artificial intelligence (AI) in radiotherapy (RT) is expanding rapidly. However, there exists a notable lack of clinician trust in AI models, underscoring the need for effective uncertainty quantification (UQ) methods. The purpose of this study was to scope existing literature related to UQ in RT, identify areas of improvement, and determine future directions. METHODS We followed the PRISMA-ScR scoping review reporting guidelines. We utilized the population (human cancer patients), concept (utilization of AI UQ), context (radiotherapy applications) framework to structure our search and screening process. We conducted a systematic search spanning seven databases, supplemented by manual curation, up to January 2024. Our search yielded a total of 8980 articles for initial review. Manuscript screening and data extraction were performed in Covidence. Data extraction categories included general study characteristics, RT characteristics, AI characteristics, and UQ characteristics. RESULTS We identified 56 articles published from 2015 to 2024. Ten domains of RT applications were represented; most studies evaluated auto-contouring (50%), followed by image synthesis (13%) and multiple applications simultaneously (11%). Twelve disease sites were represented, with head and neck cancer being the most common disease site independent of application space (32%). Imaging data were used in 91% of studies, while only 13% incorporated RT dose information. Most studies focused on failure detection as the main application of UQ (60%), with Monte Carlo dropout being the most commonly implemented UQ method (32%), followed by ensembling (16%). Of the included studies, 55% did not share code or datasets. CONCLUSION Our review revealed a lack of diversity in UQ for RT applications beyond auto-contouring. Moreover, we identified a clear need to study additional UQ methods, such as conformal prediction. Our results may incentivize the development of guidelines for the reporting and implementation of UQ in RT.
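Monte Carlo dropout, the most commonly implemented UQ method among the reviewed studies, keeps dropout active at inference time and treats the spread of repeated stochastic forward passes as a measure of predictive uncertainty. A minimal numpy sketch with a toy two-layer network (all weights, sizes, and names are illustrative, not taken from any reviewed study):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer network with fixed (pretend "trained") weights.
W1 = rng.normal(size=(4, 16))
W2 = rng.normal(size=(16, 1))

def forward(x, dropout_p=0.5):
    """One stochastic forward pass with dropout kept ON at inference."""
    h = np.maximum(x @ W1, 0.0)               # ReLU hidden layer
    mask = rng.random(h.shape) > dropout_p    # Bernoulli dropout mask
    h = h * mask / (1.0 - dropout_p)          # inverted-dropout scaling
    return 1.0 / (1.0 + np.exp(-(h @ W2)))    # sigmoid output

def mc_dropout_predict(x, n_samples=100):
    """Monte Carlo dropout: repeat stochastic passes, summarize the spread."""
    preds = np.stack([forward(x) for _ in range(n_samples)])
    return preds.mean(axis=0), preds.std(axis=0)  # prediction, uncertainty

x = rng.normal(size=(1, 4))
mean, std = mc_dropout_predict(x)
print(mean.shape, std.shape)  # (1, 1) (1, 1)
```

In practice a large standard deviation flags inputs (e.g. contours or synthetic images) whose prediction should be reviewed by a human, which is the failure-detection use dominant in the review.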
Affiliation(s)
- Kareem A Wahid
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA; Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Zaphanlene Y Kaffey
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- David P Farris
- Research Medical Library, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Laia Humbert-Vidan
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Amy C Moreno
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Jintao Ren
- Department of Oncology, Aarhus University Hospital, Denmark
- Mohamed A Naser
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Tucker J Netherton
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Stine Korreman
- Department of Oncology, Aarhus University Hospital, Denmark
- Clifton D Fuller
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- David Fuentes
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Michael J Dohopolski
- Department of Radiation Oncology, The University of Texas Southwestern Medical Center, Dallas, TX, USA
4
Mortlock R, Lucas C. Generative artificial intelligence (Gen-AI) in pharmacy education: Utilization and implications for academic integrity: A scoping review. Explor Res Clin Soc Pharm 2024; 15:100481. [PMID: 39184524] [PMCID: PMC11341932] [DOI: 10.1016/j.rcsop.2024.100481]
Abstract
Introduction Generative artificial intelligence (Gen-AI), exemplified by the widely adopted ChatGPT, has garnered significant attention in recent years. Its application spans various health education domains, including pharmacy, where its potential benefits and drawbacks have become increasingly apparent. Despite the growing adoption of Gen-AI such as ChatGPT in pharmacy education, there remains a critical need to assess and mitigate associated risks. This review explores the literature and potential strategies for mitigating risks associated with the integration of Gen-AI in pharmacy education. Aim To conduct a scoping review to identify the implications of Gen-AI in pharmacy education and its use and emerging evidence, with a particular focus on strategies that mitigate potential risks to academic integrity. Methods A scoping review strategy was employed in accordance with the PRISMA-ScR guidelines. Databases searched included PubMed, ERIC (Education Resources Information Center), Scopus and ProQuest from August 2023 to 20 February 2024, including all relevant records from 1 January 2000 to 20 February 2024 relating specifically to LLM use within pharmacy education. A grey literature search was also conducted due to the emerging nature of this topic. Policies, procedures, and documents from institutions such as universities and colleges, including standards, guidelines, and policy documents, were hand searched and reviewed in their most updated form. These documents were not published in the scientific literature or indexed in academic search engines. Results Articles (n = 12) were derived from the scientific databases and records (n = 9) from the grey literature. Potential uses and benefits of Gen-AI within pharmacy education were identified in all included published articles; however, there was a paucity of published articles that considered the potential risks to academic integrity. Grey literature records held the largest proportion of risk mitigation strategies, largely focusing on increased academic and student education and training on the ethical use of Gen-AI, as well as considerations for redesigning current assessments likely to be at risk from Gen-AI use. Conclusion Drawing upon existing literature, this review highlights the importance of evidence-based approaches to address the challenges posed by Gen-AI such as ChatGPT in pharmacy education settings. Additionally, whilst mitigation strategies are suggested, primarily drawn from the grey literature, there is a paucity of traditionally published scientific literature outlining strategies for the practical and ethical implementation of Gen-AI within pharmacy education. Further research related to the responsible and ethical use of Gen-AI in pharmacy curricula, and studies related to strategies adopted to mitigate risks to academic integrity, would be beneficial.
Affiliation(s)
- R. Mortlock
- Graduate School of Health, Faculty of Health, University of Technology, Sydney, Australia
- C. Lucas
- Graduate School of Health, Faculty of Health, University of Technology, Sydney, Australia
- School of Population Health, Faculty of Medicine and Health, University of NSW, Sydney, Australia
- Connected Intelligence Centre (CIC), University of Technology Sydney, Australia
5
Topff L, Steltenpool S, Ranschaert ER, Ramanauskas N, Menezes R, Visser JJ, Beets-Tan RGH, Hartkamp NS. Artificial intelligence-assisted double reading of chest radiographs to detect clinically relevant missed findings: a two-centre evaluation. Eur Radiol 2024; 34:5876-5885. [PMID: 38466390] [PMCID: PMC11364654] [DOI: 10.1007/s00330-024-10676-w]
Abstract
OBJECTIVES To evaluate an artificial intelligence (AI)-assisted double reading system for detecting clinically relevant missed findings on routinely reported chest radiographs. METHODS A retrospective study was performed in two institutions, a secondary care hospital and tertiary referral oncology centre. Commercially available AI software performed a comparative analysis of chest radiographs and radiologists' authorised reports using a deep learning and natural language processing algorithm, respectively. The AI-detected discrepant findings between images and reports were assessed for clinical relevance by an external radiologist, as part of the commercial service provided by the AI vendor. The selected missed findings were subsequently returned to the institution's radiologist for final review. RESULTS In total, 25,104 chest radiographs of 21,039 patients (mean age 61.1 years ± 16.2 [SD]; 10,436 men) were included. The AI software detected discrepancies between imaging and reports in 21.1% (5289 of 25,104). After review by the external radiologist, 0.9% (47 of 5289) of cases were deemed to contain clinically relevant missed findings. The institution's radiologists confirmed 35 of 47 missed findings (74.5%) as clinically relevant (0.1% of all cases). Missed findings consisted of lung nodules (71.4%, 25 of 35), pneumothoraces (17.1%, 6 of 35) and consolidations (11.4%, 4 of 35). CONCLUSION The AI-assisted double reading system was able to identify missed findings on chest radiographs after report authorisation. The approach required an external radiologist to review the AI-detected discrepancies. The number of clinically relevant missed findings by radiologists was very low. CLINICAL RELEVANCE STATEMENT The AI-assisted double reader workflow was shown to detect diagnostic errors and could be applied as a quality assurance tool. Although clinically relevant missed findings were rare, there is potential impact given the common use of chest radiography. 
KEY POINTS
• A commercially available double reading system supported by artificial intelligence was evaluated to detect reporting errors in chest radiographs (n = 25,104) from two institutions.
• Clinically relevant missed findings were found in 0.1% of chest radiographs and consisted of unreported lung nodules, pneumothoraces and consolidations.
• Applying AI software as a secondary reader after report authorisation can assist in reducing diagnostic errors without interrupting the radiologist's reading workflow. However, the number of AI-detected discrepancies was considerable and required review by a radiologist to assess their relevance.
Affiliation(s)
- Laurens Topff
- Department of Radiology, Netherlands Cancer Institute, Amsterdam, The Netherlands
- GROW School for Oncology and Reproduction, Maastricht University, Maastricht, The Netherlands
- Sanne Steltenpool
- Department of Radiology and Nuclear Medicine, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands
- Department of Radiology, Elisabeth-TweeSteden Hospital, Tilburg, The Netherlands
- Erik R Ranschaert
- Department of Radiology, St. Nikolaus Hospital, Eupen, Belgium
- Ghent University, Ghent, Belgium
- Naglis Ramanauskas
- Oxipit UAB, Vilnius, Lithuania
- Department of Radiology, Nuclear Medicine and Medical Physics, Institute of Biomedical Sciences, Faculty of Medicine, Vilnius University, Vilnius, Lithuania
- Renee Menezes
- Biostatistics Centre, Department of Psychosocial Research and Epidemiology, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Jacob J Visser
- Department of Radiology and Nuclear Medicine, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands
- Regina G H Beets-Tan
- Department of Radiology, Netherlands Cancer Institute, Amsterdam, The Netherlands
- GROW School for Oncology and Reproduction, Maastricht University, Maastricht, The Netherlands
- Nolan S Hartkamp
- Department of Radiology, Elisabeth-TweeSteden Hospital, Tilburg, The Netherlands
6
Desolda G, Dimauro G, Esposito A, Lanzilotti R, Matera M, Zancanaro M. A Human-AI interaction paradigm and its application to rhinocytology. Artif Intell Med 2024; 155:102933. [PMID: 39094227] [DOI: 10.1016/j.artmed.2024.102933]
Abstract
This article explores Human-Centered Artificial Intelligence (HCAI) in medical cytology, with a focus on enhancing the interaction with AI. It presents a Human-AI interaction paradigm that emphasizes explainability and user control of AI systems: an iterative negotiation process based on three interaction strategies that aim to (i) elaborate the system outcomes through iterative steps (Iterative Exploration), (ii) explain the AI system's behavior or decisions (Clarification), and (iii) allow non-expert users to trigger simple retraining of the AI model (Reconfiguration). This interaction paradigm guides the redesign of an existing AI-based tool for microscopic analysis of the nasal mucosa, and the resulting tool is tested with rhinocytologists. The article discusses the results of the conducted evaluation and outlines lessons learned that are relevant for AI in medicine.
Affiliation(s)
- Giuseppe Desolda
- Department of Computer Science, University of Bari Aldo Moro, Via E. Orabona 4, Bari, 70125, Italy
- Giovanni Dimauro
- Department of Computer Science, University of Bari Aldo Moro, Via E. Orabona 4, Bari, 70125, Italy
- Andrea Esposito
- Department of Computer Science, University of Bari Aldo Moro, Via E. Orabona 4, Bari, 70125, Italy
- Rosa Lanzilotti
- Department of Computer Science, University of Bari Aldo Moro, Via E. Orabona 4, Bari, 70125, Italy
- Maristella Matera
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, Milan, 20133, Italy
- Massimo Zancanaro
- Department of Psychology and Cognitive Science, University of Trento, Corso Bettini 31, Rovereto, 38068, Italy; Fondazione Bruno Kessler, Povo, Trento, 38123, Italy
7
Moosavi A, Huang S, Vahabi M, Motamedivafa B, Tian N, Mahmood R, Liu P, Sun CL. Prospective Human Validation of Artificial Intelligence Interventions in Cardiology: A Scoping Review. JACC Adv 2024; 3:101202. [PMID: 39372457] [PMCID: PMC11450923] [DOI: 10.1016/j.jacadv.2024.101202]
Abstract
Background Despite the potential of artificial intelligence (AI) in enhancing cardiovascular care, its integration into clinical practice is limited by a lack of evidence on its effectiveness with respect to human experts or gold standard practices in real-world settings. Objectives The purpose of this study was to identify AI interventions in cardiology that have been prospectively validated against human expert benchmarks or gold standard practices, to assess their effectiveness, and to identify future research areas. Methods We systematically reviewed Scopus and MEDLINE to identify peer-reviewed publications that involved prospective human validation of AI-based interventions in cardiology from January 2015 to December 2023. Results Of 2,351 initial records, 64 studies were included. Among these studies, 59 (92.2%) were published after 2020. A total of 11 (17.2%) randomized controlled trials were published. AI interventions in 44 articles (68.75%) reported definite clinical or operational improvements over human experts. These interventions were mostly used in imaging (n = 14, 21.9%), ejection fraction (n = 10, 15.6%), arrhythmia (n = 9, 14.1%), and coronary artery disease (n = 12, 18.8%) application areas. Convolutional neural networks were the most common predictive model (n = 44, 69%), and images were the most used data type (n = 38, 54.3%). Only 22 (34.4%) studies made their models or data accessible. Conclusions This review identifies the potential of AI in cardiology, with models often performing equally well as human counterparts for specific and clearly scoped tasks suitable for such models. Nonetheless, the limited number of randomized controlled trials emphasizes the need for continued validation, especially in real-world settings that closely examine joint human-AI decision-making.
Affiliation(s)
- Amirhossein Moosavi
- Telfer School of Management, University of Ottawa, Ottawa, Ontario, Canada
- University of Ottawa Heart Institute, University of Ottawa, Ottawa, Ontario, Canada
- Steven Huang
- University of Ottawa Heart Institute, University of Ottawa, Ottawa, Ontario, Canada
- Maryam Vahabi
- Telfer School of Management, University of Ottawa, Ottawa, Ontario, Canada
- University of Ottawa Heart Institute, University of Ottawa, Ottawa, Ontario, Canada
- Bahar Motamedivafa
- Telfer School of Management, University of Ottawa, Ottawa, Ontario, Canada
- University of Ottawa Heart Institute, University of Ottawa, Ottawa, Ontario, Canada
- Nelly Tian
- Marshall School of Business, University of Southern California, Los Angeles, California, USA
- Rafid Mahmood
- Telfer School of Management, University of Ottawa, Ottawa, Ontario, Canada
- Peter Liu
- University of Ottawa Heart Institute, University of Ottawa, Ottawa, Ontario, Canada
- Christopher L.F. Sun
- Telfer School of Management, University of Ottawa, Ottawa, Ontario, Canada
- University of Ottawa Heart Institute, University of Ottawa, Ottawa, Ontario, Canada
8
McCradden MD, Stedman I. Explaining decisions without explainability? Artificial intelligence and medicolegal accountability. Future Healthc J 2024; 11:100171. [PMID: 39371527] [PMCID: PMC11452834] [DOI: 10.1016/j.fhj.2024.100171]
Abstract
[Graphical abstract only; no text abstract available]
Affiliation(s)
- Melissa D. McCradden
- Australian Institute for Machine Learning, University of Adelaide, Australia
- Women's and Children's Hospital, Adelaide, Australia
- SickKids Research Institute, Toronto, Canada
- Ian Stedman
- School of Public Policy and Administration at York University, Toronto, Ontario, Canada
9
|
Dingel J, Kleine AK, Cecil J, Sigl AL, Lermer E, Gaube S. Predictors of Health Care Practitioners' Intention to Use AI-Enabled Clinical Decision Support Systems: Meta-Analysis Based on the Unified Theory of Acceptance and Use of Technology. J Med Internet Res 2024; 26:e57224. [PMID: 39102675 PMCID: PMC11333871 DOI: 10.2196/57224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/15/2024] [Revised: 05/03/2024] [Accepted: 05/13/2024] [Indexed: 08/07/2024] Open
Abstract
BACKGROUND Artificial intelligence-enabled clinical decision support systems (AI-CDSSs) offer potential for improving health care outcomes, but their adoption among health care practitioners remains limited. OBJECTIVE This meta-analysis identified predictors influencing health care practitioners' intention to use AI-CDSSs based on the Unified Theory of Acceptance and Use of Technology (UTAUT). Additional predictors were examined based on existing empirical evidence. METHODS The literature search using electronic databases, forward searches, conference programs, and personal correspondence yielded 7731 results, of which 17 (0.22%) studies met the inclusion criteria. Random-effects meta-analysis, relative weight analyses, and meta-analytic moderation and mediation analyses were used to examine the relationships between relevant predictor variables and the intention to use AI-CDSSs. RESULTS The meta-analysis results supported the application of the UTAUT to the context of the intention to use AI-CDSSs. The results showed that performance expectancy (r=0.66), effort expectancy (r=0.55), social influence (r=0.66), and facilitating conditions (r=0.66) were positively associated with the intention to use AI-CDSSs, in line with the predictions of the UTAUT. The meta-analysis further identified positive attitude (r=0.63), trust (r=0.73), anxiety (r=-0.41), perceived risk (r=-0.21), and innovativeness (r=0.54) as additional relevant predictors. Trust emerged as the most influential predictor overall. The results of the moderation analyses show that the relationship between social influence and use intention becomes weaker with increasing age. In addition, the relationship between effort expectancy and use intention was stronger for diagnostic AI-CDSSs than for devices that combined diagnostic and treatment recommendations. Finally, the relationship between facilitating conditions and use intention was mediated through performance and effort expectancy. 
CONCLUSIONS This meta-analysis contributes to the understanding of the predictors of intention to use AI-CDSSs based on an extended UTAUT model. More research is needed to substantiate the identified relationships and explain the observed variations in effect sizes by identifying relevant moderating factors. The research findings bear important implications for the design and implementation of training programs for health care practitioners to ease the adoption of AI-CDSSs into their practice.
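Pooled correlations like those reported above are typically obtained by Fisher z-transforming each study's correlation and combining the transformed values under a random-effects model. The sketch below implements DerSimonian-Laird pooling from first principles; the input correlations and sample sizes are illustrative, not the meta-analysis's data.

```python
import math

def pool_correlations(rs, ns):
    """Random-effects (DerSimonian-Laird) pooling of correlations
    via Fisher's z transform."""
    zs = [math.atanh(r) for r in rs]          # Fisher z transform
    vs = [1.0 / (n - 3) for n in ns]          # sampling variance of z
    w = [1.0 / v for v in vs]                 # fixed-effect weights
    z_fe = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
    # DerSimonian-Laird between-study variance tau^2
    q = sum(wi * (zi - z_fe) ** 2 for wi, zi in zip(w, zs))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(rs) - 1)) / c)
    # Random-effects weights, pooled z, back-transformed to r
    w_re = [1.0 / (v + tau2) for v in vs]
    z_re = sum(wi * zi for wi, zi in zip(w_re, zs)) / sum(w_re)
    return math.tanh(z_re)

# Illustrative per-study correlations and sample sizes (not the paper's data).
print(round(pool_correlations([0.60, 0.70, 0.55], [120, 200, 150]), 2))
```

The z transform is used because the sampling distribution of a raw correlation is skewed; tau² inflates each study's variance to account for true between-study heterogeneity before weighting.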
Affiliation(s)
- Julius Dingel
- Human-AI-Interaction Group, Center for Leadership and People Management, Ludwig Maximilian University of Munich, Munich, Germany
- Anne-Kathrin Kleine
- Human-AI-Interaction Group, Center for Leadership and People Management, Ludwig Maximilian University of Munich, Munich, Germany
- Julia Cecil
- Human-AI-Interaction Group, Center for Leadership and People Management, Ludwig Maximilian University of Munich, Munich, Germany
- Anna Leonie Sigl
- Department of Liberal Arts and Sciences, Technical University of Applied Sciences Augsburg, Augsburg, Germany
- Eva Lermer
- Human-AI-Interaction Group, Center for Leadership and People Management, Ludwig Maximilian University of Munich, Munich, Germany
- Department of Liberal Arts and Sciences, Technical University of Applied Sciences Augsburg, Augsburg, Germany
- Susanne Gaube
- Human Factors in Healthcare, Global Business School for Health, University College London, London, United Kingdom
10
Rainey C, Bond R, McConnell J, Hughes C, Kumar D, McFadden S. Reporting radiographers' interaction with Artificial Intelligence: how do different forms of AI feedback impact trust and decision switching? PLOS Digit Health 2024; 3:e0000560. [PMID: 39110687] [PMCID: PMC11305567] [DOI: 10.1371/journal.pdig.0000560]
Abstract
Artificial Intelligence (AI) has been increasingly integrated into healthcare settings, including the radiology department, to aid radiographic image interpretation, including reporting by radiographers. Trust has been cited as a barrier to effective clinical implementation of AI, and calibrating trust appropriately will be important to ensure the ethical use of these systems for the benefit of the patient, clinician and health services. Means of explainable AI, such as heatmaps, have been proposed to increase AI transparency and trust by elucidating which parts of an image the AI 'focussed on' when making its decision. The aim of this novel study was to quantify the impact of different forms of AI feedback on expert clinicians' trust. Whilst this study was conducted in the UK, it has potential international application and impact for AI interface design, either globally or in countries with similar cultural and/or economic status to the UK. A convolutional neural network was built for this study: trained, validated and tested on a publicly available dataset of MUsculoskeletal RAdiographs (MURA), with binary diagnoses and Gradient Class Activation Maps (GradCAM) as outputs. Reporting radiographers (n = 12) were recruited from all four regions of the UK. Qualtrics was used to present each participant with a total of 18 complete examinations from the MURA test dataset (each examination contained more than one radiographic image). Participants were presented with the images first, the images with heatmaps next, and finally an AI binary diagnosis, in sequential order. Perception of trust in the AI system was obtained following the presentation of each heatmap and binary feedback, and participants were asked to indicate whether they would change their mind (decision switch) in response to the AI feedback. Participants disagreed with the AI heatmaps for the abnormal examinations 45.8% of the time and agreed with the binary feedback on 86.7% of examinations (26/30 presentations). Only two participants indicated that they would decision switch in response to all AI feedback (GradCAM and binary) (0.7%, n = 2) across all datasets. 22.2% (n = 32) of participants agreed with the localisation of pathology on the heatmap. The level of agreement with the GradCAM heatmaps and the binary diagnosis was correlated with trust (GradCAM: -.515 and -.584, significant large negative correlations at the .01 level; binary diagnosis: -.309 and -.369, significant medium negative correlations at the .01 level). This study shows that, for the participants in this study, the extent of agreement with both the AI binary diagnosis and the heatmap is correlated with trust in AI, where greater agreement with the form of AI feedback is associated with greater trust, in particular for the heatmap form of feedback. Forms of explainable AI should be developed with cognisance of the need for precision and accuracy in localisation to promote appropriate trust in clinical end users.
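GradCAM, the heatmap method this study used as explainable-AI feedback, weights each feature map of the last convolutional layer by the global-average-pooled gradient of the class score with respect to that map, sums the weighted maps, and applies a ReLU. For a linear head over globally average-pooled feature maps those gradients reduce to the head weights, which permits a compact numpy sketch (toy activations and weights, purely illustrative, not the study's network):

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy "last conv layer" activations: K feature maps of size H x W.
K, H, W = 8, 6, 6
A = rng.random((K, H, W))

# Linear head over global-average-pooled maps:
# score = sum_k w_head[k] * mean(A[k]).
w_head = rng.normal(size=K)

# GradCAM channel weights: the gradient of the score w.r.t. each map,
# global-average-pooled. For this head, d(score)/dA[k] = w_head[k] / (H * W)
# everywhere, so the pooled gradient is simply w_head[k] / (H * W).
alpha = w_head / (H * W)

# Weighted sum of feature maps, then ReLU: keep only positive evidence.
heatmap = np.maximum(np.tensordot(alpha, A, axes=1), 0.0)

# Normalize to [0, 1] for display as an overlay on the radiograph.
heatmap /= heatmap.max() + 1e-8
print(heatmap.shape)  # (6, 6)
```

In a real CNN the channel weights come from backpropagated gradients rather than a closed form; deep-learning frameworks obtain them with automatic differentiation, and the low-resolution map is upsampled onto the input image.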
Affiliation(s)
- Clare Rainey
- Ulster University, School of Health Sciences, York St, Belfast, Northern Ireland
- Raymond Bond
- Ulster University, School of Computing, York St, Belfast, Northern Ireland
- Ciara Hughes
- Ulster University, School of Health Sciences, York St, Belfast, Northern Ireland
- Devinder Kumar
- School of Medicine, Stanford University, California, United States of America
- Sonyia McFadden
- Ulster University, School of Health Sciences, York St, Belfast, Northern Ireland

11
Brady AP, Allen B, Chong J, Kotter E, Kottler N, Mongan J, Oakden-Rayner L, Pinto Dos Santos D, Tang A, Wald C, Slavotinek J. Developing, Purchasing, Implementing and Monitoring AI Tools in Radiology: Practical Considerations. A Multi-Society Statement From the ACR, CAR, ESR, RANZCR & RSNA. J Am Coll Radiol 2024; 21:1292-1310. [PMID: 38276923 DOI: 10.1016/j.jacr.2023.12.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/27/2024]
Abstract
Artificial intelligence (AI) carries the potential for unprecedented disruption in radiology, with possible positive and negative consequences. The integration of AI in radiology holds the potential to revolutionize healthcare practices by advancing the diagnosis, quantification, and management of multiple medical conditions. Nevertheless, the ever-growing availability of AI tools in radiology highlights an increasing need to critically evaluate claims for their utility and to differentiate safe product offerings from potentially harmful or fundamentally unhelpful ones. This multi-society paper, presenting the views of radiology societies in the USA, Canada, Europe, Australia, and New Zealand, defines the potential practical problems and ethical issues surrounding the incorporation of AI into radiological practice. In addition to delineating the main points of concern that developers, regulators, and purchasers of AI tools should consider prior to their introduction into clinical practice, this statement also suggests methods to monitor their stability and safety in clinical use, and their suitability for possible autonomous function. This statement is intended to serve as a useful summary of the practical issues which should be considered by all parties involved in the development of radiology AI resources and their implementation as clinical tools.
Affiliation(s)
- Bibb Allen
- Department of Radiology, Grandview Medical Center, Birmingham, Alabama; American College of Radiology Data Science Institute, Reston, Virginia
- Jaron Chong
- Department of Medical Imaging, Schulich School of Medicine and Dentistry, Western University, London, ON, Canada
- Elmar Kotter
- Department of Diagnostic and Interventional Radiology, Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Nina Kottler
- Radiology Partners, El Segundo, California; Stanford Center for Artificial Intelligence in Medicine & Imaging, Palo Alto, California
- John Mongan
- Department of Radiology and Biomedical Imaging, University of California, San Francisco, California
- Lauren Oakden-Rayner
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, Australia
- Daniel Pinto Dos Santos
- Department of Radiology, University Hospital of Cologne, Cologne, Germany; Department of Radiology, University Hospital of Frankfurt, Frankfurt, Germany
- An Tang
- Department of Radiology, Radiation Oncology, and Nuclear Medicine, Université de Montréal, Montréal, Québec, Canada
- Christoph Wald
- Department of Radiology, Lahey Hospital & Medical Center, Burlington, Massachusetts; Tufts University Medical School, Boston, Massachusetts; Commission on Informatics, and Member, Board of Chancellors, American College of Radiology, Virginia
- John Slavotinek
- South Australia Medical Imaging, Flinders Medical Centre Adelaide, Adelaide, Australia; College of Medicine and Public Health, Flinders University, Adelaide, Australia

12
Montomoli J, Bitondo MM, Cascella M, Rezoagli E, Romeo L, Bellini V, Semeraro F, Gamberini E, Frontoni E, Agnoletti V, Altini M, Benanti P, Bignami EG. Algor-ethics: charting the ethical path for AI in critical care. J Clin Monit Comput 2024; 38:931-939. [PMID: 38573370 PMCID: PMC11297831 DOI: 10.1007/s10877-024-01157-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2023] [Accepted: 03/22/2024] [Indexed: 04/05/2024]
Abstract
The integration of Clinical Decision Support Systems (CDSS) based on artificial intelligence (AI) into healthcare is a groundbreaking evolution with enormous potential, but its development and ethical implementation present unique challenges, particularly in critical care, where physicians often deal with life-threatening conditions requiring rapid action and with patients who are unable to participate in the decision-making process. Moreover, the development of AI-based CDSS is complex and should address different sources of bias, including data acquisition, health disparities, domain shifts during clinical use, and cognitive biases in decision-making. In this scenario, algor-ethics is essential; it emphasizes the integration of 'Human-in-the-Loop' and 'Algorithmic Stewardship' principles and the benefits of advanced data engineering. The establishment of Clinical AI Departments (CAID) is necessary to lead AI innovation in healthcare, ensuring ethical integrity and human-centered development in this rapidly evolving field.
Affiliation(s)
- Jonathan Montomoli
- Department of Anesthesia and Intensive Care, Infermi Hospital, Romagna Local Health Authority, Viale Settembrini 2, Rimini, 47923, Italy
- Health Services Research, Evaluation and Policy Unit, Romagna Local Health Authority, Viale Settembrini 2, Rimini, 47923, Italy
- Maria Maddalena Bitondo
- Department of Anesthesia and Intensive Care, Infermi Hospital, Romagna Local Health Authority, Viale Settembrini 2, Rimini, 47923, Italy
- Marco Cascella
- Unit of Anesthesia and Pain Medicine, Department of Medicine, Surgery and Dentistry "Scuola Medica Salernitana," University of Salerno, Baronissi, Salerno, Italy
- Emanuele Rezoagli
- School of Medicine and Surgery, University of Milano-Bicocca, Via Cadore, 48, Monza, 20900, Italy
- Dipartimento di Emergenza e Urgenza, Terapia intensiva e Semintensiva adulti e pediatrica, Fondazione IRCCS San Gerardo dei Tintori, Via Pergolesi, 33, Monza, 20900, Italy
- Luca Romeo
- Department of Economics and Law, University of Macerata, Macerata, 62100, Italy
- Valentina Bellini
- Anesthesiology, Critical Care and Pain Medicine Division, Department of Medicine and Surgery, University of Parma, Via Gramsci 14, Parma, 43125, Italy
- Federico Semeraro
- Department of Anesthesia, Intensive Care and Prehospital Emergency, Ospedale Maggiore Carlo Alberto Pizzardi, Largo Bartolo Nigrisoli, 2, Bologna, 40133, Italy
- Emiliano Gamberini
- Department of Anesthesia and Intensive Care, Infermi Hospital, Romagna Local Health Authority, Viale Settembrini 2, Rimini, 47923, Italy
- Emanuele Frontoni
- Department of Political Sciences, Communication and International Relations, University of Macerata, Macerata, 62100, Italy
- Vanni Agnoletti
- Department of Surgery and Trauma, Anesthesia and Intensive Care Unit, Maurizio Bufalini Hospital, Romagna Local Health Authority, Viale Giovanni Ghirotti, 286, Cesena, 47521, Italy
- Mattia Altini
- Hospital Care Sector, Emilia-Romagna Region, Via Aldo Moro, 21, Bologna, 40127, Italy
- Paolo Benanti
- Pontifical Gregorian University, Piazza della Pilotta 4, Roma, 00187, Italy
- Elena Giovanna Bignami
- Anesthesiology, Critical Care and Pain Medicine Division, Department of Medicine and Surgery, University of Parma, Via Gramsci 14, Parma, 43125, Italy

13
Reis M, Reis F, Kunde W. Influence of believed AI involvement on the perception of digital medical advice. Nat Med 2024:10.1038/s41591-024-03180-7. [PMID: 39054373 DOI: 10.1038/s41591-024-03180-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/05/2024] [Accepted: 07/04/2024] [Indexed: 07/27/2024]
Abstract
Large language models offer novel opportunities to seek digital medical advice. While previous research has primarily addressed the performance of such artificial intelligence (AI)-based tools, public perception of these advancements has received little attention. In two preregistered studies (n = 2,280), we presented participants with scenarios of patients obtaining medical advice. All participants received identical information, but we manipulated the putative source of this advice ('AI', 'human physician', 'human + AI'). 'AI'- and 'human + AI'-labeled advice was evaluated as significantly less reliable and less empathetic than 'human'-labeled advice. Moreover, participants indicated lower willingness to follow the advice when AI was believed to be involved in advice generation. Our findings point toward an anti-AI bias in the reception of digital medical advice, even when AI is supposedly supervised by physicians. Given the tremendous potential of AI for medicine, elucidating ways to counteract this bias should be an important objective of future research.
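The core analysis in a labeling study of this kind is a between-condition comparison of ratings for identical advice. The sketch below simulates that design with invented 7-point reliability ratings (the means, spread, and group sizes are assumptions, not the study's data) and summarises the labeling effect with Cohen's d.

```python
import numpy as np

# Simulated ratings for identical advice under two putative-source labels.
rng = np.random.default_rng(1)
human_labeled = rng.normal(5.2, 1.0, 200)  # hypothetical 'human' ratings
ai_labeled = rng.normal(4.8, 1.0, 200)     # hypothetical 'AI' ratings

def cohens_d(x, y):
    """Standardised mean difference with a pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled_sd = np.sqrt(((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1))
                        / (nx + ny - 2))
    return (x.mean() - y.mean()) / pooled_sd

d = cohens_d(human_labeled, ai_labeled)  # positive: human label rated higher
```

A positive d here reflects the same direction of effect the abstract reports: the human-labeled condition is rated more favourably than the AI-labeled one.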
Affiliation(s)
- Moritz Reis
- Institute of Psychology, Julius-Maximilians-Universität Würzburg, Würzburg, Germany
- Judge Business School, University of Cambridge, Cambridge, UK
- Florian Reis
- Medical Affairs, Pfizer Pharma GmbH, Berlin, Germany
- Wilfried Kunde
- Institute of Psychology, Julius-Maximilians-Universität Würzburg, Würzburg, Germany

14
Kostick-Quenet K, Lang BH, Smith J, Hurley M, Blumenthal-Barby J. Trust criteria for artificial intelligence in health: normative and epistemic considerations. J Med Ethics 2024; 50:544-551. [PMID: 37979976 PMCID: PMC11101592 DOI: 10.1136/jme-2023-109338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/16/2023] [Accepted: 11/02/2023] [Indexed: 11/20/2023]
Abstract
Rapid advancements in artificial intelligence and machine learning (AI/ML) in healthcare raise pressing questions about how much users should trust AI/ML systems, particularly for high-stakes clinical decision-making. Ensuring that user trust is properly calibrated to a tool's computational capacities and limitations has both practical and ethical implications, given that overtrust or undertrust can drive over-reliance or under-reliance on algorithmic tools, with significant implications for patient safety and health outcomes. It is, thus, important to better understand how variability in trust criteria across stakeholders, settings, tools and use cases may influence approaches to using AI/ML tools in real settings. As part of a 5-year, multi-institutional Agency for Healthcare Research and Quality-funded study, we identify trust criteria for a survival prediction algorithm intended to support clinical decision-making for left ventricular assist device therapy, using semistructured interviews (n=40) with patients and physicians, analysed via thematic analysis. Findings suggest that physicians and patients share similar empirical considerations for trust, which were primarily epistemic in nature, focused on the accuracy and validity of AI/ML estimates. Trust evaluations considered the nature, integrity and relevance of training data rather than the computational nature of the algorithms themselves, suggesting a need to distinguish 'source' from 'functional' explainability. To a lesser extent, trust criteria were also relational (endorsement from others) and sometimes based on personal beliefs and experience. We discuss implications for promoting appropriate and responsible trust calibration for clinical decision-making using AI/ML.
Affiliation(s)
- Kristin Kostick-Quenet
- Center for Medical Ethics and Health Policy, Baylor College of Medicine, Houston, Texas, USA
- Benjamin H Lang
- Center for Medical Ethics and Health Policy, Baylor College of Medicine, Houston, Texas, USA
- Department of Philosophy, University of Oxford, Oxford, Oxfordshire, UK
- Jared Smith
- Center for Medical Ethics and Health Policy, Baylor College of Medicine, Houston, Texas, USA
- Meghan Hurley
- Center for Medical Ethics and Health Policy, Baylor College of Medicine, Houston, Texas, USA

15
Chang JY, Makary MS. Evolving and Novel Applications of Artificial Intelligence in Thoracic Imaging. Diagnostics (Basel) 2024; 14:1456. [PMID: 39001346 PMCID: PMC11240935 DOI: 10.3390/diagnostics14131456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2024] [Revised: 07/01/2024] [Accepted: 07/06/2024] [Indexed: 07/16/2024] Open
Abstract
The advent of artificial intelligence (AI) is revolutionizing medicine, particularly radiology. With the development of newer models, AI applications are demonstrating improved performance and versatile utility in the clinical setting. Thoracic imaging is an area of profound interest, given the prevalence of chest imaging and the significant health implications of thoracic diseases. This review aims to highlight the promising applications of AI within thoracic imaging. It examines the role of AI, including its contributions to improving diagnostic evaluation and interpretation, enhancing workflow, and aiding in invasive procedures. Next, it further highlights the current challenges and limitations faced by AI, such as the necessity of 'big data', ethical and legal considerations, and bias in representation. Lastly, it explores the potential directions for the application of AI in thoracic radiology.
Affiliation(s)
- Jin Y Chang
- Department of Radiology, The Ohio State University College of Medicine, Columbus, OH 43210, USA
- Mina S Makary
- Department of Radiology, The Ohio State University College of Medicine, Columbus, OH 43210, USA
- Division of Vascular and Interventional Radiology, Department of Radiology, The Ohio State University Wexner Medical Center, Columbus, OH 43210, USA

16
Day TG, Matthew J, Budd SF, Venturini L, Wright R, Farruggia A, Vigneswaran TV, Zidere V, Hajnal JV, Razavi R, Simpson JM, Kainz B. Interaction between clinicians and artificial intelligence to detect fetal atrioventricular septal defects on ultrasound: how can we optimize collaborative performance? Ultrasound Obstet Gynecol 2024; 64:28-35. [PMID: 38197584 DOI: 10.1002/uog.27577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/26/2023] [Revised: 12/19/2023] [Accepted: 12/30/2023] [Indexed: 01/11/2024]
Abstract
OBJECTIVES Artificial intelligence (AI) has shown promise in improving the performance of fetal ultrasound screening in detecting congenital heart disease (CHD). The effect of giving AI advice to human operators has not been studied in this context. Giving additional information about AI model workings, such as confidence scores for AI predictions, may be a way of further improving performance. Our aims were to investigate whether AI advice improved overall diagnostic accuracy (using a single CHD lesion as an exemplar), and to determine what, if any, additional information given to clinicians optimized the overall performance of the clinician-AI team. METHODS An AI model was trained to classify a single fetal CHD lesion (atrioventricular septal defect (AVSD)), using a retrospective cohort of 121 130 cardiac four-chamber images extracted from 173 ultrasound scan videos (98 with normal hearts, 75 with AVSD); a ResNet50 model architecture was used. Temperature scaling of model prediction probability was performed on a validation set, and gradient-weighted class activation maps (grad-CAMs) produced. Ten clinicians (two consultant fetal cardiologists, three trainees in pediatric cardiology and five fetal cardiac sonographers) were recruited from a center of fetal cardiology to participate. Each participant was shown 2000 fetal four-chamber images in a random order (1000 normal and 1000 AVSD). The dataset comprised 500 images, each shown in four conditions: (1) image alone without AI output; (2) image with binary AI classification; (3) image with AI model confidence; and (4) image with grad-CAM image overlays. The clinicians were asked to classify each image as normal or AVSD. RESULTS A total of 20 000 image classifications were recorded from 10 clinicians. 
The AI model alone achieved an accuracy of 0.798 (95% CI, 0.760-0.832), a sensitivity of 0.868 (95% CI, 0.834-0.902) and a specificity of 0.728 (95% CI, 0.702-0.754), and the clinicians without AI achieved an accuracy of 0.844 (95% CI, 0.834-0.854), a sensitivity of 0.827 (95% CI, 0.795-0.858) and a specificity of 0.861 (95% CI, 0.828-0.895). Showing a binary (normal or AVSD) AI model output resulted in significant improvement in accuracy to 0.865 (P < 0.001). This effect was seen in both experienced and less-experienced participants. Giving incorrect AI advice resulted in a significant deterioration in overall accuracy, from 0.761 to 0.693 (P < 0.001), which was driven by an increase in both Type-I and Type-II errors by the clinicians. This effect was worsened by showing model confidence (accuracy, 0.649; P < 0.001) or grad-CAM (accuracy, 0.644; P < 0.001). CONCLUSIONS AI has the potential to improve performance when used in collaboration with clinicians, even if the model performance does not reach expert level. Giving additional information about model workings such as model confidence and class activation map image overlays did not improve overall performance, and actually worsened performance for images for which the AI model was incorrect. © 2024 The Authors. Ultrasound in Obstetrics & Gynecology published by John Wiley & Sons Ltd on behalf of International Society of Ultrasound in Obstetrics and Gynecology.
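Temperature scaling, the calibration step the authors applied before showing model confidence, divides the logits by a scalar T before the softmax. The sketch below illustrates the effect on a single hypothetical logit vector; the logits and T = 2.0 are assumptions for illustration (T is normally fitted on a validation set by minimising negative log-likelihood).

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = np.array([2.0, -1.0])   # hypothetical [AVSD, normal] scores
p_raw = softmax(logits)          # uncalibrated confidence
p_cal = softmax(logits / 2.0)    # T > 1 softens overconfident outputs

# The predicted class is unchanged; only the reported confidence is tempered.
assert np.argmax(p_raw) == np.argmax(p_cal)
```

Because dividing by T > 1 shrinks all logit gaps, the calibrated probability of the top class is always pulled toward uniform, which is the intended correction for an overconfident network.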
Affiliation(s)
- T G Day
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- Department of Congenital Heart Disease, Evelina London Children's Healthcare, Guy's and St Thomas' NHS Foundation Trust, London, UK
- J Matthew
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- S F Budd
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- L Venturini
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- R Wright
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- A Farruggia
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- T V Vigneswaran
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- Department of Congenital Heart Disease, Evelina London Children's Healthcare, Guy's and St Thomas' NHS Foundation Trust, London, UK
- V Zidere
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- Department of Congenital Heart Disease, Evelina London Children's Healthcare, Guy's and St Thomas' NHS Foundation Trust, London, UK
- Harris Birthright Research Centre, King's College London NHS Foundation Trust, London, UK
- J V Hajnal
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- R Razavi
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- Department of Congenital Heart Disease, Evelina London Children's Healthcare, Guy's and St Thomas' NHS Foundation Trust, London, UK
- J M Simpson
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- Department of Congenital Heart Disease, Evelina London Children's Healthcare, Guy's and St Thomas' NHS Foundation Trust, London, UK
- B Kainz
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- Department of Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
- Department of Computing, Faculty of Engineering, Imperial College London, London, UK

17
Chen H, Ma X, Rives H, Serpedin A, Yao P, Rameau A. Trust in Machine Learning Driven Clinical Decision Support Tools Among Otolaryngologists. Laryngoscope 2024; 134:2799-2804. [PMID: 38230948 DOI: 10.1002/lary.31260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2023] [Revised: 11/29/2023] [Accepted: 12/20/2023] [Indexed: 01/18/2024]
Abstract
BACKGROUND Machine learning-driven clinical decision support tools (ML-CDST) are on the verge of being integrated into clinical settings, including in Otolaryngology-Head & Neck Surgery. In this study, we investigated whether such tools may influence otolaryngologists' diagnostic judgement. METHODS Otolaryngologists were recruited virtually across the United States for this experiment on human-AI interaction. Participants were shown 12 different video-stroboscopic exams from patients with previously diagnosed laryngopharyngeal reflux or vocal fold paresis and asked to determine the presence of disease. They were then exposed to a random diagnosis purportedly resulting from an ML-CDST and given the opportunity to revise their diagnosis. The ML-CDST output was presented with no explanation, a general explanation, or a specific explanation of its logic. The impact of the ML-CDST on diagnostic judgement was assessed with McNemar's test. RESULTS Forty-five participants were recruited. When participants reported less confidence (268 observations), they were significantly (p = 0.001) more likely to change their diagnostic judgement after exposure to ML-CDST output than when they reported more confidence (238 observations). Participants were also more likely to change their diagnostic judgement when presented with a specific explanation of the CDST logic (p = 0.048). CONCLUSIONS Our study suggests that otolaryngologists are susceptible to accepting ML-CDST diagnostic recommendations, especially when less confident. Otolaryngologists' trust in ML-CDST output is increased when it is accompanied by a specific explanation of its logic. LEVEL OF EVIDENCE 2 Laryngoscope, 134:2799-2804, 2024.
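McNemar's test, used above to compare paired pre- and post-exposure judgements, depends only on the discordant pairs (cases where the two judgements differ in opposite directions). A minimal exact version can be sketched as follows; the counts b and c are invented for illustration, not the study's data.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact p-value from discordant pairs b (switched one way)
    and c (switched the other way)."""
    n = b + c
    k = min(b, c)
    # Under H0, each discordant pair flips either way with probability 0.5,
    # so the smaller count follows a Binomial(n, 0.5) tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

p_value = mcnemar_exact(b=15, c=4)  # hypothetical discordant-pair counts
```

The exact binomial form is preferred over the chi-square approximation when discordant counts are small, as is common in studies with a few dozen participants.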
Affiliation(s)
- Hannah Chen
- Sean Parker Institute for the Voice, Department of Otolaryngology-Head and Neck Surgery, Weill Cornell Medicine, New York, New York, USA
- Xiaoyue Ma
- Division of Biostatistics, Department of Population Health Sciences, Weill Cornell Medical College, New York, New York, USA
- Hal Rives
- Sean Parker Institute for the Voice, Department of Otolaryngology-Head and Neck Surgery, Weill Cornell Medicine, New York, New York, USA
- Aisha Serpedin
- Sean Parker Institute for the Voice, Department of Otolaryngology-Head and Neck Surgery, Weill Cornell Medicine, New York, New York, USA
- Peter Yao
- Sean Parker Institute for the Voice, Department of Otolaryngology-Head and Neck Surgery, Weill Cornell Medicine, New York, New York, USA
- Anaïs Rameau
- Sean Parker Institute for the Voice, Department of Otolaryngology-Head and Neck Surgery, Weill Cornell Medicine, New York, New York, USA

18
Kotter E, Pinto Dos Santos D. [Ethics and artificial intelligence]. Radiologie (Heidelb) 2024; 64:498-502. [PMID: 38499692 DOI: 10.1007/s00117-024-01286-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Accepted: 02/26/2024] [Indexed: 03/20/2024]
Abstract
The introduction of artificial intelligence (AI) into radiology promises to enhance efficiency and improve diagnostic accuracy, yet it also raises manifold ethical questions. These include data protection issues, the future role of radiologists, liability when using AI systems, and the avoidance of bias. To prevent data bias, the datasets need to be compiled carefully and to be representative of the target population. Accordingly, the upcoming European Union AI act sets particularly high requirements for the datasets used in training medical AI systems. Cognitive bias occurs when radiologists place too much trust in the results provided by AI systems (overreliance). So far, diagnostic AI systems are used almost exclusively as "second look" systems. If diagnostic AI systems are to be used in the future as "first look" systems or even as autonomous AI systems in order to enhance efficiency in radiology, the question of liability needs to be addressed, comparable to liability for autonomous driving. Such use of AI would also significantly change the role of radiologists.
Affiliation(s)
- Elmar Kotter
- Klinik für Diagnostische und Interventionelle Radiologie, Universitätsklinikum Freiburg, Hugstetterstr. 55, 79106, Freiburg, Germany
- Daniel Pinto Dos Santos
- Institut für Diagnostische und Interventionelle Radiologie, Uniklinik Köln, Kerpener Str. 62, 50937, Köln, Germany
- Institut für Diagnostische und Interventionelle Radiologie, Universitätsklinik Frankfurt, Theodor-Stern-Kai 7, 60596, Frankfurt am Main, Germany

19
Hasani AM, Singh S, Zahergivar A, Ryan B, Nethala D, Bravomontenegro G, Mendhiratta N, Ball M, Farhadi F, Malayeri A. Evaluating the performance of Generative Pre-trained Transformer-4 (GPT-4) in standardizing radiology reports. Eur Radiol 2024; 34:3566-3574. [PMID: 37938381 DOI: 10.1007/s00330-023-10384-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2023] [Revised: 09/01/2023] [Accepted: 09/08/2023] [Indexed: 11/09/2023]
Abstract
OBJECTIVE Radiology reporting is an essential component of clinical diagnosis and decision-making. With the advent of advanced artificial intelligence (AI) models like GPT-4 (Generative Pre-trained Transformer 4), there is growing interest in evaluating their potential for optimizing or generating radiology reports. This study aimed to compare the quality and content of radiologist-generated and GPT-4 AI-generated radiology reports. METHODS A comparative study design was employed in the study, where a total of 100 anonymized radiology reports were randomly selected and analyzed. Each report was processed by GPT-4, resulting in the generation of a corresponding AI-generated report. Quantitative and qualitative analysis techniques were utilized to assess similarities and differences between the two sets of reports. RESULTS The AI-generated reports showed comparable quality to radiologist-generated reports in most categories. Significant differences were observed in clarity (p = 0.027), ease of understanding (p = 0.023), and structure (p = 0.050), favoring the AI-generated reports. AI-generated reports were more concise, with 34.53 fewer words and 174.22 fewer characters on average, but had greater variability in sentence length. Content similarity was high, with an average Cosine Similarity of 0.85, Sequence Matcher Similarity of 0.52, BLEU Score of 0.5008, and BERTScore F1 of 0.8775. CONCLUSION The results of this proof-of-concept study suggest that GPT-4 can be a reliable tool for generating standardized radiology reports, offering potential benefits such as improved efficiency, better communication, and simplified data extraction and analysis. However, limitations and ethical implications must be addressed to ensure the safe and effective implementation of this technology in clinical practice. 
CLINICAL RELEVANCE STATEMENT The findings of this study suggest that GPT-4 (Generative Pre-trained Transformer 4), an advanced AI model, has the potential to significantly contribute to the standardization and optimization of radiology reporting, offering improved efficiency and communication in clinical practice.
KEY POINTS
• Large language model-generated radiology reports exhibited high content similarity and moderate structural resemblance to radiologist-generated reports.
• Performance metrics highlighted the strong matching of word selection and order, as well as high semantic similarity between AI- and radiologist-generated reports.
• Large language models demonstrated potential for generating standardized radiology reports, improving efficiency and communication in clinical settings.
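Two of the similarity metrics reported above can be sketched with the standard library alone: cosine similarity over bag-of-words counts and difflib's SequenceMatcher ratio. The two one-line report snippets are invented examples; BLEU and BERTScore require dedicated libraries (e.g. nltk, bert-score) and are omitted here.

```python
from collections import Counter
from difflib import SequenceMatcher
from math import sqrt

def cosine_sim(a, b):
    """Cosine similarity between bag-of-words token-count vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (sqrt(sum(v * v for v in va.values()))
            * sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

reference = "No acute cardiopulmonary abnormality is identified"
generated = "No acute cardiopulmonary abnormality identified"

cos = cosine_sim(reference, generated)                     # token overlap
seq = SequenceMatcher(None, reference, generated).ratio()  # character overlap
```

Cosine similarity ignores word order entirely, whereas the SequenceMatcher ratio rewards long shared character runs, which is why the paper's two scores (0.85 vs 0.52) can diverge so widely on the same report pairs.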
Affiliation(s)
- Amir M Hasani
- Laboratory of Translation Research, National Heart, Lung, and Blood Institute, NIH, Bethesda, MD, USA
- Shiva Singh
- Radiology & Imaging Sciences Department, Clinical Center, NIH, Bethesda, MD, USA
- Aryan Zahergivar
- Radiology & Imaging Sciences Department, Clinical Center, NIH, Bethesda, MD, USA
- Beth Ryan
- Urology Oncology Branch, National Cancer Institute, NIH, Bethesda, MD, USA
- Daniel Nethala
- Urology Oncology Branch, National Cancer Institute, NIH, Bethesda, MD, USA
- Neil Mendhiratta
- Urology Oncology Branch, National Cancer Institute, NIH, Bethesda, MD, USA
- Mark Ball
- Urology Oncology Branch, National Cancer Institute, NIH, Bethesda, MD, USA
- Faraz Farhadi
- Radiology & Imaging Sciences Department, Clinical Center, NIH, Bethesda, MD, USA
- Ashkan Malayeri
- Radiology & Imaging Sciences Department, Clinical Center, NIH, Bethesda, MD, USA

20
Yuan W, Du Z, Han S. Semi-supervised skin cancer diagnosis based on self-feedback threshold focal learning. Discov Oncol 2024; 15:180. [PMID: 38776027 PMCID: PMC11111630 DOI: 10.1007/s12672-024-01043-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/22/2024] [Accepted: 05/17/2024] [Indexed: 05/25/2024] Open
Abstract
The worldwide prevalence of skin cancer necessitates accurate diagnosis to alleviate public health burdens. Although the application of artificial intelligence in image analysis and pattern recognition has improved the accuracy and efficiency of early skin cancer diagnosis, existing supervised learning methods are limited by their reliance on a large amount of labeled data. To overcome the limitations of data labeling and enhance the performance of diagnostic models, this study proposes a semi-supervised skin cancer diagnostic model based on Self-feedback Threshold Focal Learning (STFL), capable of utilizing partially labeled data and a large pool of unlabeled medical images for training models in unseen scenarios. The proposed model dynamically adjusts the selection threshold for unlabeled samples during training, effectively filtering reliable unlabeled samples, and uses focal learning to mitigate the impact of class imbalance in further training. The study is experimentally validated on the HAM10000 dataset, which includes images of various types of skin lesions, with experiments conducted across different scales of labeled samples. With just 500 annotated samples, the model demonstrates robust performance (0.77 accuracy, 0.6408 Kappa, 0.77 recall, 0.7426 precision, and 0.7462 F1-score), showcasing its efficiency with limited labeled data. Further, comprehensive testing validates the semi-supervised model's significant advancements in diagnostic accuracy and efficiency, underscoring the value of integrating unlabeled data. This model offers a new perspective on medical image processing and contributes robust scientific support for the early diagnosis and treatment of skin cancer.
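The focal-learning component mentioned above down-weights easy, well-classified examples so training concentrates on hard (often minority-class) samples. A minimal sketch of the standard focal loss follows; gamma = 2.0 and the probabilities are illustrative choices, not the paper's settings.

```python
import numpy as np

def focal_loss(p_true, gamma=2.0):
    """Focal loss given the probability the model assigns to the true class:
    -(1 - p)^gamma * log(p). At gamma = 0 this reduces to cross-entropy."""
    p = np.clip(float(p_true), 1e-7, 1.0)  # avoid log(0)
    return -((1.0 - p) ** gamma) * np.log(p)

easy_loss = focal_loss(0.95)  # confident, correct -> heavily down-weighted
hard_loss = focal_loss(0.30)  # uncertain -> dominates the gradient signal
```

The modulating factor (1 - p)^gamma is what makes the easy example contribute orders of magnitude less loss than the hard one, counteracting class imbalance without resampling.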
Affiliation(s)
- Weicheng Yuan
- College of Basic Medicine, Hebei Medical University, Zhongshan East, Shijiazhuang, 050017, Hebei, China
- Zeyu Du
- School of Health Science, University of Manchester, Sackville Street, Manchester, 610101, England, UK
- Shuo Han
- Department of Anatomy, Hebei Medical University, Zhongshan East, Shijiazhuang, 050017, Hebei, China.
21
Rosen S, Saban M. Evaluating the reliability of ChatGPT as a tool for imaging test referral: a comparative study with a clinical decision support system. Eur Radiol 2024; 34:2826-2837. [PMID: 37828297 DOI: 10.1007/s00330-023-10230-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/15/2023] [Revised: 07/28/2023] [Accepted: 08/01/2023] [Indexed: 10/14/2023]
Abstract
OBJECTIVES As the technology continues to evolve, we can expect artificial intelligence (AI) to be used in increasingly sophisticated ways to make diagnoses and decisions, such as suggesting the most appropriate imaging referrals. We aim to explore whether Chat Generative Pretrained Transformer (ChatGPT) can provide accurate imaging referrals for clinical use that are at least as good as the ESR iGuide. METHODS A comparative study was conducted in a tertiary hospital. Data were collected from 97 consecutive cases admitted to the emergency department with abdominal complaints. We compared the imaging test referral recommendations suggested by the ESR iGuide and by ChatGPT and analyzed cases of disagreement. In addition, we selected cases where ChatGPT recommended a chest abdominal pelvis (CAP) CT (n = 66) and asked four specialists to grade the appropriateness of the referral. RESULTS ChatGPT recommendations were consistent with the recommendations provided by the ESR iGuide. No statistical differences were found in the appropriateness of referrals by age or gender. In a sub-analysis of CAP cases, high agreement between ChatGPT and the specialists was found. Cases of disagreement (12.4%) were further analyzed and presented themes of vague recommendations such as "it would be advisable" and "this would help to rule out." CONCLUSIONS ChatGPT's ability to guide the selection of appropriate tests may be comparable to some degree with the ESR iGuide. Issues such as the clinical, ethical, and regulatory implications still need to be addressed prior to clinical implementation. Further studies are needed to confirm these findings. CLINICAL RELEVANCE STATEMENT The article explores the potential of using advanced language models, such as ChatGPT, in healthcare as a CDS for selecting appropriate imaging tests. Using ChatGPT can improve the efficiency of the decision-making process. KEY POINTS: • ChatGPT recommendations were highly consistent with the recommendations provided by the ESR iGuide. • ChatGPT's ability to guide the selection of appropriate tests may be comparable to some degree with the ESR iGuide's.
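Agreement between two referral sources of the kind compared here is typically summarized as raw percent agreement plus a chance-corrected statistic such as Cohen's kappa. The sketch below is illustrative only; the eight referral labels are invented, not the study's data.

```python
def percent_agreement(a, b):
    """Fraction of cases where both raters gave the same recommendation."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Chance-corrected agreement between two raters on categorical labels."""
    n = len(a)
    po = percent_agreement(a, b)                  # observed agreement
    pe = sum((a.count(c) / n) * (b.count(c) / n)  # agreement expected by chance
             for c in set(a) | set(b))
    return (po - pe) / (1 - pe)

# Hypothetical referrals for 8 cases (CT / ultrasound / X-ray).
chatgpt = ["CT", "CT", "US", "CT", "XR", "CT", "US", "CT"]
iguide  = ["CT", "CT", "US", "CT", "XR", "US", "US", "CT"]
agreement = percent_agreement(chatgpt, iguide)  # 7/8 = 0.875
kappa = cohens_kappa(chatgpt, iguide)
```

Kappa is the more conservative figure because frequent recommendations (here, CT) make chance agreement likely.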
Affiliation(s)
- Shani Rosen
- Department of Health Technology and Policy Evaluation, Gertner Institute for Epidemiology and Health Policy, Institute of Epidemiology & Health Policy Research, Sheba Medical Center, Tel HaShomer, Ramat-Gan, Israel
- Nursing Department, School of Health Sciences, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Mor Saban
- Nursing Department, School of Health Sciences, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel.
22
Brady AP, Allen B, Chong J, Kotter E, Kottler N, Mongan J, Oakden-Rayner L, Dos Santos DP, Tang A, Wald C, Slavotinek J. Developing, Purchasing, Implementing and Monitoring AI Tools in Radiology: Practical Considerations. A Multi-Society Statement From the ACR, CAR, ESR, RANZCR & RSNA. Can Assoc Radiol J 2024; 75:226-244. [PMID: 38251882 DOI: 10.1177/08465371231222229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2024] Open
Abstract
Artificial Intelligence (AI) carries the potential for unprecedented disruption in radiology, with possible positive and negative consequences. The integration of AI in radiology holds the potential to revolutionize healthcare practices by advancing diagnosis, quantification, and management of multiple medical conditions. Nevertheless, the ever‑growing availability of AI tools in radiology highlights an increasing need to critically evaluate claims for its utility and to differentiate safe product offerings from potentially harmful, or fundamentally unhelpful ones. This multi‑society paper, presenting the views of Radiology Societies in the USA, Canada, Europe, Australia, and New Zealand, defines the potential practical problems and ethical issues surrounding the incorporation of AI into radiological practice. In addition to delineating the main points of concern that developers, regulators, and purchasers of AI tools should consider prior to their introduction into clinical practice, this statement also suggests methods to monitor their stability and safety in clinical use, and their suitability for possible autonomous function. This statement is intended to serve as a useful summary of the practical issues which should be considered by all parties involved in the development of radiology AI resources, and their implementation as clinical tools.
Affiliation(s)
- Bibb Allen
- Department of Radiology, Grandview Medical Center, Birmingham, AL, USA
- Data Science Institute, American College of Radiology, Reston, VA, USA
- Jaron Chong
- Department of Medical Imaging, Schulich School of Medicine and Dentistry, Western University, London, ON, Canada
- Elmar Kotter
- Department of Diagnostic and Interventional Radiology, Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Nina Kottler
- Radiology Partners, El Segundo, CA, USA
- Stanford Center for Artificial Intelligence in Medicine & Imaging, Palo Alto, CA, USA
- John Mongan
- Department of Radiology and Biomedical Imaging, University of California, San Francisco, CA, USA
- Lauren Oakden-Rayner
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA, Australia
- Daniel Pinto Dos Santos
- Department of Radiology, University Hospital of Cologne, Cologne, Germany
- Department of Radiology, University Hospital of Frankfurt, Frankfurt, Germany
- An Tang
- Department of Radiology, Radiation Oncology, and Nuclear Medicine, Université de Montréal, Montréal, QC, Canada
- Christoph Wald
- Department of Radiology, Lahey Hospital & Medical Center, Burlington, MA, USA
- Tufts University Medical School, Boston, MA, USA
- American College of Radiology, Reston, VA, USA
- John Slavotinek
- South Australia Medical Imaging, Flinders Medical Centre Adelaide, SA, Australia
- College of Medicine and Public Health, Flinders University, Adelaide, SA, Australia
23
Cecil J, Lermer E, Hudecek MFC, Sauer J, Gaube S. Explainability does not mitigate the negative impact of incorrect AI advice in a personnel selection task. Sci Rep 2024; 14:9736. [PMID: 38679619 PMCID: PMC11056364 DOI: 10.1038/s41598-024-60220-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/24/2024] [Accepted: 04/19/2024] [Indexed: 05/01/2024] Open
Abstract
Despite the rise of decision support systems enabled by artificial intelligence (AI) in personnel selection, their impact on decision-making processes is largely unknown. Consequently, we conducted five experiments (N = 1403 students and Human Resource Management (HRM) employees) investigating how people interact with AI-generated advice in a personnel selection task. In all pre-registered experiments, we presented correct and incorrect advice. In Experiments 1a and 1b, we manipulated the source of the advice (human vs. AI). In Experiments 2a, 2b, and 2c, we further manipulated the type of explainability of AI advice (2a and 2b: heatmaps and 2c: charts). We hypothesized that accurate and explainable advice improves decision-making. The independent variables were regressed on task performance, perceived advice quality and confidence ratings. The results consistently showed that incorrect advice negatively impacted performance, as people failed to dismiss it (i.e., overreliance). Additionally, we found that the effects of source and explainability of advice on the dependent variables were limited. The lack of reduction in participants' overreliance on inaccurate advice when the systems' predictions were made more explainable highlights the complexity of human-AI interaction and the need for regulation and quality standards in HRM.
Affiliation(s)
- Julia Cecil
- Department of Psychology, LMU Center for Leadership and People Management, LMU Munich, Munich, Germany.
- Eva Lermer
- Department of Psychology, LMU Center for Leadership and People Management, LMU Munich, Munich, Germany
- Department of Business Psychology, Technical University of Applied Sciences Augsburg, Augsburg, Germany
- Matthias F C Hudecek
- Department of Experimental Psychology, University of Regensburg, Regensburg, Germany
- Jan Sauer
- Department of Business Administration, University of Applied Sciences Amberg-Weiden, Weiden, Germany
- Susanne Gaube
- Department of Psychology, LMU Center for Leadership and People Management, LMU Munich, Munich, Germany
- UCL Global Business School for Health, University College London, London, UK
24
Jiang T, Chen C, Zhou Y, Cai S, Yan Y, Sui L, Lai M, Song M, Zhu X, Pan Q, Wang H, Chen X, Wang K, Xiong J, Chen L, Xu D. Deep learning-assisted diagnosis of benign and malignant parotid tumors based on ultrasound: a retrospective study. BMC Cancer 2024; 24:510. [PMID: 38654281 PMCID: PMC11036551 DOI: 10.1186/s12885-024-12277-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2024] [Accepted: 04/16/2024] [Indexed: 04/25/2024] Open
Abstract
BACKGROUND To develop a deep learning (DL) model utilizing ultrasound images, and to evaluate its efficacy in distinguishing between benign and malignant parotid tumors (PTs), as well as its practicality in assisting clinicians with accurate diagnosis. METHODS A total of 2211 ultrasound images of 980 pathologically confirmed PTs (training set: n = 721; validation set: n = 82; internal-test set: n = 89; external-test set: n = 88) from 907 patients were retrospectively included in this study. Five DL networks of varying depths were constructed; the optimal model was selected and its diagnostic performance evaluated using the area under the curve (AUC) of the receiver-operating characteristic (ROC). Furthermore, radiologists of different seniority were compared with and without the optimal model as a diagnostic aid. Additionally, the diagnostic confusion matrix of the optimal model was calculated, and the characteristics of misjudged cases were analyzed and summarized. RESULTS ResNet18 demonstrated superior diagnostic performance, with an AUC of 0.947, accuracy of 88.5%, sensitivity of 78.2%, and specificity of 92.7% in the internal-test set, and an AUC of 0.925, accuracy of 89.8%, sensitivity of 83.3%, and specificity of 90.6% in the external-test set. The PTs were subjectively assessed twice by six radiologists, with and without the assistance of the model. With the assistance of the model, both junior and senior radiologists demonstrated enhanced diagnostic performance: in the internal-test set, AUC values increased by 0.062 and 0.082 for junior radiologists and by 0.066 and 0.106 for senior radiologists, respectively.
CONCLUSIONS The DL model based on ultrasound images demonstrates exceptional capability in distinguishing between benign and malignant PTs, assisting radiologists of varying expertise levels to achieve higher diagnostic performance and serving as a noninvasive imaging adjunct for clinical diagnosis.
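The reader-study comparison above rests on computing each radiologist's AUC with and without model assistance. A minimal sketch using the rank-statistic form of the AUC; the suspicion scores and labels below are invented for illustration, not the study's data.

```python
def roc_auc(scores, labels):
    """AUC via the Mann-Whitney U statistic: the probability that a random
    malignant case is scored higher than a random benign case."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical suspicion scores for 4 tumors (1 = malignant, 0 = benign).
labels     = [1, 1, 0, 0]
unassisted = [0.6, 0.4, 0.5, 0.3]  # one malignant case ranked below a benign one
assisted   = [0.8, 0.7, 0.5, 0.2]  # DL assistance separates the classes fully
gain = roc_auc(assisted, labels) - roc_auc(unassisted, labels)  # 1.00 - 0.75
```

In the study, a per-reader gain of this kind (e.g. +0.062 to +0.106) is what quantifies the value of the model as an aid.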
Affiliation(s)
- Tian Jiang
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, 310022, Hangzhou, Zhejiang, China
- Postgraduate training base Alliance of Wenzhou Medical University (Zhejiang Cancer Hospital), 310022, Hangzhou, Zhejiang, China
- Zhejiang Provincial Research Center for Cancer Intelligent Diagnosis and Molecular Technology, 310022, Hangzhou, Zhejiang, China
- Chen Chen
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, 310022, Hangzhou, Zhejiang, China
- Wenling Big Data and Artificial Intelligence Institute in Medicine, 317502, Taizhou, Zhejiang, China
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), 317502, Taizhou, Zhejiang, China
- Yahan Zhou
- Wenling Big Data and Artificial Intelligence Institute in Medicine, 317502, Taizhou, Zhejiang, China
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), 317502, Taizhou, Zhejiang, China
- Shenzhou Cai
- Wenling Big Data and Artificial Intelligence Institute in Medicine, 317502, Taizhou, Zhejiang, China
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), 317502, Taizhou, Zhejiang, China
- Yuqi Yan
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, 310022, Hangzhou, Zhejiang, China
- Postgraduate training base Alliance of Wenzhou Medical University (Zhejiang Cancer Hospital), 310022, Hangzhou, Zhejiang, China
- Wenling Big Data and Artificial Intelligence Institute in Medicine, 317502, Taizhou, Zhejiang, China
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), 317502, Taizhou, Zhejiang, China
- Lin Sui
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, 310022, Hangzhou, Zhejiang, China
- Postgraduate training base Alliance of Wenzhou Medical University (Zhejiang Cancer Hospital), 310022, Hangzhou, Zhejiang, China
- Wenling Big Data and Artificial Intelligence Institute in Medicine, 317502, Taizhou, Zhejiang, China
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), 317502, Taizhou, Zhejiang, China
- Min Lai
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, 310022, Hangzhou, Zhejiang, China
- Zhejiang Provincial Research Center for Cancer Intelligent Diagnosis and Molecular Technology, 310022, Hangzhou, Zhejiang, China
- Second Clinical College, Zhejiang University of Traditional Chinese Medicine, 310022, Hangzhou, Zhejiang, China
- Mei Song
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, 310022, Hangzhou, Zhejiang, China
- Zhejiang Provincial Research Center for Cancer Intelligent Diagnosis and Molecular Technology, 310022, Hangzhou, Zhejiang, China
- Xi Zhu
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, 310022, Hangzhou, Zhejiang, China
- Wenling Big Data and Artificial Intelligence Institute in Medicine, 317502, Taizhou, Zhejiang, China
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), 317502, Taizhou, Zhejiang, China
- Qianmeng Pan
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), 317502, Taizhou, Zhejiang, China
- Hui Wang
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), 317502, Taizhou, Zhejiang, China
- Xiayi Chen
- Wenling Big Data and Artificial Intelligence Institute in Medicine, 317502, Taizhou, Zhejiang, China
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), 317502, Taizhou, Zhejiang, China
- Kai Wang
- Dongyang Hospital Affiliated to Wenzhou Medical University, 322100, Jinhua, Zhejiang, China
- Jing Xiong
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, 518000, Shenzhen, Guangdong, China
- Liyu Chen
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, 310022, Hangzhou, Zhejiang, China.
- Zhejiang Provincial Research Center for Cancer Intelligent Diagnosis and Molecular Technology, 310022, Hangzhou, Zhejiang, China.
- Dong Xu
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, 310022, Hangzhou, Zhejiang, China.
- Postgraduate training base Alliance of Wenzhou Medical University (Zhejiang Cancer Hospital), 310022, Hangzhou, Zhejiang, China.
- Zhejiang Provincial Research Center for Cancer Intelligent Diagnosis and Molecular Technology, 310022, Hangzhou, Zhejiang, China.
- Wenling Big Data and Artificial Intelligence Institute in Medicine, 317502, Taizhou, Zhejiang, China.
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), 317502, Taizhou, Zhejiang, China.
25
Vaidya A, Chen RJ, Williamson DFK, Song AH, Jaume G, Yang Y, Hartvigsen T, Dyer EC, Lu MY, Lipkova J, Shaban M, Chen TY, Mahmood F. Demographic bias in misdiagnosis by computational pathology models. Nat Med 2024; 30:1174-1190. [PMID: 38641744 DOI: 10.1038/s41591-024-02885-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2023] [Accepted: 02/23/2024] [Indexed: 04/21/2024]
Abstract
Despite increasing numbers of regulatory approvals, deep learning-based computational pathology systems often overlook the impact of demographic factors on performance, potentially leading to biases. This concern is all the more important as computational pathology has leveraged large public datasets that underrepresent certain demographic groups. Using publicly available data from The Cancer Genome Atlas and the EBRAINS brain tumor atlas, as well as internal patient data, we show that whole-slide image classification models display marked performance disparities across different demographic groups when used to subtype breast and lung carcinomas and to predict IDH1 mutations in gliomas. For example, when using common modeling approaches, we observed performance gaps (in area under the receiver operating characteristic curve) between white and Black patients of 3.0% for breast cancer subtyping, 10.9% for lung cancer subtyping and 16.0% for IDH1 mutation prediction in gliomas. We found that richer feature representations obtained from self-supervised vision foundation models reduce performance variations between groups. These representations provide improvements upon weaker models even when those weaker models are combined with state-of-the-art bias mitigation strategies and modeling choices. Nevertheless, self-supervised vision foundation models do not fully eliminate these discrepancies, highlighting the continuing need for bias mitigation efforts in computational pathology. Finally, we demonstrate that our results extend to other demographic factors beyond patient race. Given these findings, we encourage regulatory and policy agencies to integrate demographic-stratified evaluation into their assessment guidelines.
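The performance gaps reported here come from stratifying a fixed evaluation set by demographic group and comparing per-group AUCs. A hedged sketch of such an audit is below; the rank-statistic `roc_auc` helper and the toy scores, labels, and group tags are illustrative assumptions, not the study's models or data.

```python
def roc_auc(scores, labels):
    """Rank-statistic (Mann-Whitney) AUC for binary labels."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_auc_gap(scores, labels, groups):
    """Per-group AUC plus the max-minus-min performance gap across groups."""
    per_group = {}
    for g in sorted(set(groups)):
        idx = [i for i, gg in enumerate(groups) if gg == g]
        per_group[g] = roc_auc([scores[i] for i in idx],
                               [labels[i] for i in idx])
    return per_group, max(per_group.values()) - min(per_group.values())

# Invented cohort: one model evaluated on two demographic groups.
scores = [0.9, 0.8, 0.2, 0.1, 0.7, 0.3, 0.6, 0.2]
labels = [1, 1, 0, 0, 1, 1, 0, 0]
groups = ["A"] * 4 + ["B"] * 4
per_group, gap = stratified_auc_gap(scores, labels, groups)  # gap = 0.25
```

Reporting `per_group` alongside the pooled metric is exactly the demographic-stratified evaluation the authors urge regulators to require.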
Affiliation(s)
- Anurag Vaidya
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Health Sciences and Technology, Harvard-MIT, Cambridge, MA, USA
- Richard J Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Drew F K Williamson
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology and Laboratory Medicine, Emory University School of Medicine, Atlanta, GA, USA
- Andrew H Song
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Guillaume Jaume
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Yuzhe Yang
- Electrical Engineering and Computer Science, MIT, Cambridge, MA, USA
- Thomas Hartvigsen
- School of Data Science, University of Virginia, Charlottesville, VA, USA
- Emma C Dyer
- T.H. Chan School of Public Health, Harvard University, Cambridge, MA, USA
- Ming Y Lu
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Electrical Engineering and Computer Science, MIT, Cambridge, MA, USA
- Jana Lipkova
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Muhammad Shaban
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Tiffany Y Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Faisal Mahmood
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA.
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA.
- Harvard Data Science Initiative, Harvard University, Cambridge, MA, USA.
26
Balagopalan A, Baldini I, Celi LA, Gichoya J, McCoy LG, Naumann T, Shalit U, van der Schaar M, Wagstaff KL. Machine learning for healthcare that matters: Reorienting from technical novelty to equitable impact. PLOS DIGITAL HEALTH 2024; 3:e0000474. [PMID: 38620047 PMCID: PMC11018283 DOI: 10.1371/journal.pdig.0000474] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/06/2023] [Accepted: 02/18/2024] [Indexed: 04/17/2024]
Abstract
Despite significant technical advances in machine learning (ML) over the past several years, the tangible impact of this technology in healthcare has been limited. This is due not only to the particular complexities of healthcare, but also to structural issues in the machine learning for healthcare (MLHC) community, which broadly rewards technical novelty over tangible, equitable impact. We structure our work as a healthcare-focused echo of the 2012 paper "Machine Learning that Matters", which highlighted such structural issues in the ML community at large and offered a series of clearly defined "Impact Challenges" to which the field should orient itself. Drawing on the expertise of a diverse and international group of authors, we engage in a narrative review and examine issues in the research background environment, training processes, evaluation metrics, and deployment protocols which act to limit the real-world applicability of MLHC. Broadly, we seek to distinguish between machine learning ON healthcare data and machine learning FOR healthcare: the former sees healthcare as merely a source of interesting technical challenges, while the latter regards ML as a tool in service of meeting tangible clinical needs. We offer specific recommendations for a series of stakeholders in the field, from ML researchers and clinicians, to the institutions in which they work, and the governments which regulate their data access.
Affiliation(s)
- Aparna Balagopalan
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology; Cambridge, Massachusetts, United States of America
- Ioana Baldini
- IBM Research; Yorktown Heights, New York, United States of America
- Leo Anthony Celi
- Laboratory for Computational Physiology, Massachusetts Institute of Technology; Cambridge, Massachusetts, United States of America
- Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center; Boston, Massachusetts, United States of America
- Department of Biostatistics, Harvard T.H. Chan School of Public Health; Boston, Massachusetts, United States of America
- Judy Gichoya
- Department of Radiology and Imaging Sciences, School of Medicine, Emory University; Atlanta, Georgia, United States of America
- Liam G. McCoy
- Division of Neurology, Department of Medicine, University of Alberta; Edmonton, Alberta, Canada
- Tristan Naumann
- Microsoft Research; Redmond, Washington, United States of America
- Uri Shalit
- The Faculty of Data and Decision Sciences, Technion; Haifa, Israel
- Mihaela van der Schaar
- Department of Applied Mathematics and Theoretical Physics, University of Cambridge; Cambridge, United Kingdom
- The Alan Turing Institute; London, United Kingdom
27
Ciet P, Eade C, Ho ML, Laborie LB, Mahomed N, Naidoo J, Pace E, Segal B, Toso S, Tschauner S, Vamyanmane DK, Wagner MW, Shelmerdine SC. The unintended consequences of artificial intelligence in paediatric radiology. Pediatr Radiol 2024; 54:585-593. [PMID: 37665368 DOI: 10.1007/s00247-023-05746-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/05/2023] [Revised: 08/07/2023] [Accepted: 08/08/2023] [Indexed: 09/05/2023]
Abstract
Over the past decade, there has been a dramatic rise in interest in the application of artificial intelligence (AI) in radiology. Originally only 'narrow' AI tasks were possible; however, with the increasing availability of data, combined with easy access to powerful computer processing, we are becoming able to generate complex and nuanced prediction models and elaborate solutions for healthcare. Nevertheless, these AI models are not without their failings, and sometimes the intended use of these solutions may not lead to predictable impacts for patients, society or those working within the healthcare profession. In this article, we provide an overview of the latest opinions regarding AI ethics, bias, limitations, challenges and considerations that we should all contemplate in this exciting and expanding field, with special attention to how these apply to the unique aspects of a paediatric population. By embracing AI technology and fostering a multidisciplinary approach, it is hoped that we can harness the power AI brings whilst minimising harm and ensuring a beneficial impact on radiology practice.
Affiliation(s)
- Pierluigi Ciet
- Department of Radiology and Nuclear Medicine, Erasmus MC - Sophia's Children's Hospital, Rotterdam, The Netherlands
- Department of Medical Sciences, University of Cagliari, Cagliari, Italy
- Mai-Lan Ho
- University of Missouri, Columbia, MO, USA
- Lene Bjerke Laborie
- Department of Radiology, Section for Paediatrics, Haukeland University Hospital, Bergen, Norway
- Department of Clinical Medicine, University of Bergen, Bergen, Norway
- Nasreen Mahomed
- Department of Radiology, University of Witwatersrand, Johannesburg, South Africa
- Jaishree Naidoo
- Paediatric Diagnostic Imaging, Dr J Naidoo Inc., Johannesburg, South Africa
- Envisionit Deep AI Ltd, Coveham House, Downside Bridge Road, Cobham, UK
- Erika Pace
- Department of Diagnostic Radiology, The Royal Marsden NHS Foundation Trust, London, UK
- Bradley Segal
- Department of Radiology, University of Witwatersrand, Johannesburg, South Africa
- Seema Toso
- Pediatric Radiology, Children's Hospital, University Hospitals of Geneva, Geneva, Switzerland
- Sebastian Tschauner
- Division of Paediatric Radiology, Department of Radiology, Medical University of Graz, Graz, Austria
- Dhananjaya K Vamyanmane
- Department of Pediatric Radiology, Indira Gandhi Institute of Child Health, Bangalore, India
- Matthias W Wagner
- Department of Diagnostic Imaging, Division of Neuroradiology, The Hospital for Sick Children, Toronto, Canada
- Department of Medical Imaging, University of Toronto, Toronto, ON, Canada
- Department of Neuroradiology, University Hospital Augsburg, Augsburg, Germany
- Susan C Shelmerdine
- Department of Clinical Radiology, Great Ormond Street Hospital for Children NHS Foundation Trust, Great Ormond Street, London, WC1H 3JH, UK.
- Great Ormond Street Hospital for Children, UCL Great Ormond Street Institute of Child Health, London, UK.
- NIHR Great Ormond Street Hospital Biomedical Research Centre, 30 Guilford Street, Bloomsbury, London, UK.
- Department of Clinical Radiology, St George's Hospital, London, UK.
Collapse
|
28
|
Simmons C, DeGrasse J, Polakovic S, Aibinder W, Throckmorton T, Noerdlinger M, Papandrea R, Trenhaile S, Schoch B, Gobbato B, Routman H, Parsons M, Roche CP. Initial clinical experience with a predictive clinical decision support tool for anatomic and reverse total shoulder arthroplasty. EUROPEAN JOURNAL OF ORTHOPAEDIC SURGERY & TRAUMATOLOGY : ORTHOPEDIE TRAUMATOLOGIE 2024; 34:1307-1318. [PMID: 38095688 DOI: 10.1007/s00590-023-03796-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/09/2023] [Accepted: 11/19/2023] [Indexed: 04/02/2024]
Abstract
PURPOSE Clinical decision support tools (CDSTs) are software that generate patient-specific assessments that can be used to better inform healthcare provider decision making. Machine learning (ML)-based CDSTs have recently been developed for anatomic (aTSA) and reverse (rTSA) total shoulder arthroplasty to facilitate more data-driven, evidence-based decision making. Using this shoulder CDST as an example, this external validation study provides an overview of how ML-based algorithms are developed and discusses the limitations of these tools. METHODS An external validation for a novel CDST was conducted on 243 patients (120F/123M) who received a personalized prediction prior to surgery and had short-term clinical follow-up from 3 months to 2 years after primary aTSA (n = 43) or rTSA (n = 200). The outcome score and active range of motion predictions were compared to each patient's actual result at each timepoint, with the accuracy quantified by the mean absolute error (MAE). RESULTS The results of this external validation demonstrate the CDST accuracy to be similar (within 10%) or better than the MAEs from the published internal validation. A few predictive models were observed to have substantially lower MAEs than the internal validation, specifically, Constant (31.6% better), active abduction (22.5% better), global shoulder function (20.0% better), active external rotation (19.0% better), and active forward elevation (16.2% better), which is encouraging; however, the sample size was small. CONCLUSION A greater understanding of the limitations of ML-based CDSTs will facilitate more responsible use and build trust and confidence, potentially leading to greater adoption. As CDSTs evolve, we anticipate greater shared decision making between the patient and surgeon with the aim of achieving even better outcomes and greater levels of patient satisfaction.
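The headline accuracy metric in this validation, mean absolute error (MAE), is simple to reproduce. The sketch below is illustrative only; the values are hypothetical, not the study's data:

```python
import statistics

def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and observed outcomes."""
    return statistics.mean(abs(p - a) for p, a in zip(predicted, actual))

# Hypothetical outcome-score predictions vs. observed values at one follow-up timepoint.
predicted = [72.0, 65.5, 80.0, 58.0]
actual = [70.0, 68.0, 77.5, 61.0]

mae = mean_absolute_error(predicted, actual)
print(mae)  # -> 2.5 score points
```

A lower MAE at each timepoint indicates that the CDST's predictions track patients' actual recovery more closely.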
Collapse
Affiliation(s)
- Chelsey Simmons
- University of Florida, PO Box 116250, Gainesville, FL, 32605, USA
- Exactech, 2320 NW 66th Court, Gainesville, FL, 32653, USA
| | | | | | - William Aibinder
- University of Michigan, 1500 E. Medical Center Drive, Ann Arbor, MI, 48109, USA
| | | | - Mayo Noerdlinger
- Atlantic Orthopaedics and Sports Medicine, 1900 Lafayette Road, Portsmouth, NH, USA
| | | | | | - Bradley Schoch
- Mayo Clinic, Florida, 4500 San Pablo Rd., Jacksonville, FL, 32224, USA
| | - Bruno Gobbato
- R. José Emmendoerfer, 1449, Nova Brasília, Jaraguá do Sul, SC, 89252-278, Brazil
| | - Howard Routman
- Atlantis Orthopedics, 900 Village Square Crossing, #170, Palm Beach Gardens, FL, 33410, USA
| | - Moby Parsons
- 333 Borthwick Ave Suite #301, Portsmouth, NH, 03801, USA
| | | |
Collapse
|
29
|
Anderson JW, Visweswaran S. Algorithmic Individual Fairness and Healthcare: A Scoping Review. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.03.25.24304853. [PMID: 38585746 PMCID: PMC10996729 DOI: 10.1101/2024.03.25.24304853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/09/2024]
Abstract
Objective Statistical and artificial intelligence algorithms are increasingly being developed for use in healthcare. These algorithms may reflect biases that magnify disparities in clinical care, and there is a growing need for understanding how algorithmic biases can be mitigated in pursuit of algorithmic fairness. Individual fairness in algorithms constrains algorithms to the notion that "similar individuals should be treated similarly." We conducted a scoping review on algorithmic individual fairness to understand the current state of research in the metrics and methods developed to achieve individual fairness and its applications in healthcare. Methods We searched three databases, PubMed, ACM Digital Library, and IEEE Xplore, for algorithmic individual fairness metrics, algorithmic bias mitigation, and healthcare applications. Our search was restricted to articles published between January 2013 and September 2023. We identified 1,886 articles through database searches and one additional article manually; from these, we included 30 articles in the review. Data from the selected articles were extracted, and the findings were synthesized. Results Based on the 30 articles in the review, we identified several themes, including philosophical underpinnings of fairness, individual fairness metrics, mitigation methods for achieving individual fairness, implications of achieving individual fairness on group fairness and vice versa, fairness metrics that combined individual fairness and group fairness, software for measuring and optimizing individual fairness, and applications of individual fairness in healthcare. Conclusion While there has been significant work on algorithmic individual fairness in recent years, the definition, use, and study of individual fairness remain in their infancy, especially in healthcare. Future research is needed to apply and evaluate individual fairness in healthcare comprehensively.
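The individual-fairness constraint quoted here is usually formalized (following Dwork et al.) as a Lipschitz condition: a model's outputs may differ by no more than a constant times the distance between the individuals. A minimal sketch, using invented feature vectors and risk scores rather than anything from the reviewed studies, flags pairs that violate this condition:

```python
import itertools
import math

def individual_fairness_violations(individuals, scores, lipschitz=1.0):
    """Return index pairs whose score gap exceeds lipschitz * distance,
    i.e. 'similar individuals' who were not 'treated similarly'."""
    violations = []
    for (i, xi), (j, xj) in itertools.combinations(enumerate(individuals), 2):
        if abs(scores[i] - scores[j]) > lipschitz * math.dist(xi, xj):
            violations.append((i, j))
    return violations

# Hypothetical patient feature vectors and model risk scores.
people = [(0.20, 0.50), (0.21, 0.50), (0.90, 0.10)]
scores = [0.30, 0.80, 0.95]
print(individual_fairness_violations(people, scores))  # -> [(0, 1)]
```

The first two individuals are nearly identical yet receive very different scores, so the pair (0, 1) is flagged; choosing the distance metric and the Lipschitz constant is precisely where much of the reviewed literature's difficulty lies.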
Collapse
Affiliation(s)
| | - Shyam Visweswaran
- Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA
| |
Collapse
|
30
|
Wei ML, Tada M, So A, Torres R. Artificial intelligence and skin cancer. Front Med (Lausanne) 2024; 11:1331895. [PMID: 38566925 PMCID: PMC10985205 DOI: 10.3389/fmed.2024.1331895] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2023] [Accepted: 02/26/2024] [Indexed: 04/04/2024] Open
Abstract
Artificial intelligence is poised to rapidly reshape many fields, including skin cancer screening and diagnosis, both as a disruptive and an assistive technology. Together with the collection and availability of large medical data sets, artificial intelligence will become a powerful tool that physicians can leverage in their diagnoses and treatment plans for patients. This comprehensive review focuses on current progress toward AI applications for patients, primary care providers, dermatologists, and dermatopathologists; explores the diverse applications of image and molecular processing for skin cancer; and highlights AI's potential for patient self-screening and for improving diagnostic accuracy among non-dermatologists. We additionally delve into the challenges and barriers to clinical implementation, paths forward, and areas of active research.
Collapse
Affiliation(s)
- Maria L. Wei
- Department of Dermatology, University of California, San Francisco, San Francisco, CA, United States
- Dermatology Service, San Francisco VA Health Care System, San Francisco, CA, United States
| | - Mikio Tada
- Institute for Neurodegenerative Diseases, University of California, San Francisco, San Francisco, CA, United States
| | - Alexandra So
- School of Medicine, University of California, San Francisco, San Francisco, CA, United States
| | - Rodrigo Torres
- Dermatology Service, San Francisco VA Health Care System, San Francisco, CA, United States
| |
Collapse
|
31
|
Campion JR, O'Connor DB, Lahiff C. Human-artificial intelligence interaction in gastrointestinal endoscopy. World J Gastrointest Endosc 2024; 16:126-135. [PMID: 38577646 PMCID: PMC10989254 DOI: 10.4253/wjge.v16.i3.126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/31/2023] [Revised: 01/18/2024] [Accepted: 02/23/2024] [Indexed: 03/14/2024] Open
Abstract
The number and variety of applications of artificial intelligence (AI) in gastrointestinal (GI) endoscopy is growing rapidly. New technologies based on machine learning (ML) and convolutional neural networks (CNNs) are at various stages of development and deployment to assist patients and endoscopists in preparing for endoscopic procedures, in detection, diagnosis and classification of pathology during endoscopy and in confirmation of key performance indicators. Platforms based on ML and CNNs require regulatory approval as medical devices. Interactions between humans and the technologies we use are complex and are influenced by design, behavioural and psychological elements. Due to the substantial differences between AI and prior technologies, important differences may be expected in how we interact with advice from AI technologies. Human–AI interaction (HAII) may be optimised by developing AI algorithms to minimise false positives and designing platform interfaces to maximise usability. Human factors influencing HAII may include automation bias, alarm fatigue, algorithm aversion, learning effect and deskilling. Each of these areas merits further study in the specific setting of AI applications in GI endoscopy and professional societies should engage to ensure that sufficient emphasis is placed on human-centred design in development of new AI technologies.
Collapse
Affiliation(s)
- John R Campion
- Department of Gastroenterology, Mater Misericordiae University Hospital, Dublin D07 AX57, Ireland
- School of Medicine, University College Dublin, Dublin D04 C7X2, Ireland
| | - Donal B O'Connor
- Department of Surgery, Trinity College Dublin, Dublin D02 R590, Ireland
| | - Conor Lahiff
- Department of Gastroenterology, Mater Misericordiae University Hospital, Dublin D07 AX57, Ireland
- School of Medicine, University College Dublin, Dublin D04 C7X2, Ireland
| |
Collapse
|
32
|
Brady AP, Allen B, Chong J, Kotter E, Kottler N, Mongan J, Oakden-Rayner L, Pinto Dos Santos D, Tang A, Wald C, Slavotinek J. Developing, purchasing, implementing and monitoring AI tools in radiology: Practical considerations. A multi-society statement from the ACR, CAR, ESR, RANZCR & RSNA. J Med Imaging Radiat Oncol 2024; 68:7-26. [PMID: 38259140 DOI: 10.1111/1754-9485.13612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/23/2023] [Accepted: 11/23/2023] [Indexed: 01/24/2024]
Abstract
Artificial Intelligence (AI) carries the potential for unprecedented disruption in radiology, with possible positive and negative consequences. The integration of AI in radiology holds the potential to revolutionize healthcare practices by advancing diagnosis, quantification, and management of multiple medical conditions. Nevertheless, the ever-growing availability of AI tools in radiology highlights an increasing need to critically evaluate claims for its utility and to differentiate safe product offerings from potentially harmful, or fundamentally unhelpful ones. This multi-society paper, presenting the views of Radiology Societies in the USA, Canada, Europe, Australia, and New Zealand, defines the potential practical problems and ethical issues surrounding the incorporation of AI into radiological practice. In addition to delineating the main points of concern that developers, regulators, and purchasers of AI tools should consider prior to their introduction into clinical practice, this statement also suggests methods to monitor their stability and safety in clinical use, and their suitability for possible autonomous function. This statement is intended to serve as a useful summary of the practical issues which should be considered by all parties involved in the development of radiology AI resources, and their implementation as clinical tools.
Collapse
Affiliation(s)
| | - Bibb Allen
- Department of Radiology, Grandview Medical Center, Birmingham, Alabama, USA
- American College of Radiology Data Science Institute, Reston, Virginia, USA
| | - Jaron Chong
- Department of Medical Imaging, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada
| | - Elmar Kotter
- Department of Diagnostic and Interventional Radiology, Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
| | - Nina Kottler
- Radiology Partners, El Segundo, California, USA
- Stanford Center for Artificial Intelligence in Medicine & Imaging, Palo Alto, California, USA
| | - John Mongan
- Department of Radiology and Biomedical Imaging, University of California, San Francisco, San Francisco, California, USA
| | - Lauren Oakden-Rayner
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, South Australia, Australia
| | - Daniel Pinto Dos Santos
- Department of Radiology, University Hospital of Cologne, Cologne, Germany
- Department of Radiology, University Hospital of Frankfurt, Frankfurt, Germany
| | - An Tang
- Department of Radiology, Radiation Oncology, and Nuclear Medicine, Université de Montréal, Montreal, Quebec, Canada
| | - Christoph Wald
- Department of Radiology, Lahey Hospital & Medical Center, Burlington, Massachusetts, USA
- Tufts University Medical School, Boston, Massachusetts, USA
- Commission On Informatics, and Member, Board of Chancellors, American College of Radiology, Reston, Virginia, USA
| | - John Slavotinek
- South Australia Medical Imaging, Flinders Medical Centre Adelaide, Adelaide, South Australia, Australia
- College of Medicine and Public Health, Flinders University, Adelaide, South Australia, Australia
| |
Collapse
|
33
|
Groh M, Badri O, Daneshjou R, Koochek A, Harris C, Soenksen LR, Doraiswamy PM, Picard R. Deep learning-aided decision support for diagnosis of skin disease across skin tones. Nat Med 2024; 30:573-583. [PMID: 38317019 PMCID: PMC10878981 DOI: 10.1038/s41591-023-02728-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/02/2023] [Accepted: 11/16/2023] [Indexed: 02/07/2024]
Abstract
Although advances in deep learning systems for image-based medical diagnosis demonstrate their potential to augment clinical decision-making, the effectiveness of physician-machine partnerships remains an open question, in part because physicians and algorithms are both susceptible to systematic errors, especially for diagnosis of underrepresented populations. Here we present results from a large-scale digital experiment involving board-certified dermatologists (n = 389) and primary-care physicians (n = 459) from 39 countries to evaluate the accuracy of diagnoses submitted by physicians in a store-and-forward teledermatology simulation. In this experiment, physicians were presented with 364 images spanning 46 skin diseases and asked to submit up to four differential diagnoses. Specialists and generalists achieved diagnostic accuracies of 38% and 19%, respectively, but both specialists and generalists were four percentage points less accurate for the diagnosis of images of dark skin as compared to light skin. Fair deep learning system decision support improved the diagnostic accuracy of both specialists and generalists by more than 33%, but exacerbated the gap in the diagnostic accuracy of generalists across skin tones. These results demonstrate that well-designed physician-machine partnerships can enhance the diagnostic accuracy of physicians, illustrating that success in improving overall diagnostic accuracy does not necessarily address bias.
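The study's headline numbers are top-k diagnostic accuracies stratified by skin tone (each physician submitted up to four differential diagnoses). A hedged sketch of that computation, using invented toy cases rather than the study's data:

```python
def top_k_accuracy(cases, k=4):
    """Fraction of cases whose true diagnosis appears in the top-k differential."""
    hits = sum(1 for truth, differential in cases if truth in differential[:k])
    return hits / len(cases)

# Hypothetical (true diagnosis, submitted differential) pairs per skin-tone group.
light_skin = [("psoriasis", ["psoriasis", "eczema"]),
              ("melanoma", ["nevus", "melanoma"])]
dark_skin = [("psoriasis", ["eczema", "lichen planus"]),
             ("melanoma", ["melanoma"])]

gap = top_k_accuracy(light_skin) - top_k_accuracy(dark_skin)
print(gap)  # -> 0.5, the accuracy gap across skin-tone groups
```

Tracking this gap, not just overall accuracy, is what reveals whether decision support narrows or widens disparities across skin tones.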
Collapse
Affiliation(s)
- Matthew Groh
- Northwestern University Kellogg School of Management, Evanston, IL, USA.
- MIT Media Lab, Cambridge, MA, USA.
| | - Omar Badri
- Northeast Dermatology Associates, Beverly, MA, USA
| | - Roxana Daneshjou
- Stanford Department of Biomedical Data Science, Stanford, CA, USA
- Stanford Department of Dermatology, Redwood City, CA, USA
| | | | | | - Luis R Soenksen
- Wyss Institute for Bioinspired Engineering at Harvard, Boston, MA, USA
| | - P Murali Doraiswamy
- MIT Media Lab, Cambridge, MA, USA
- Duke University School of Medicine, Durham, NC, USA
| | | |
Collapse
|
34
|
Brady AP, Allen B, Chong J, Kotter E, Kottler N, Mongan J, Oakden-Rayner L, Dos Santos DP, Tang A, Wald C, Slavotinek J. Developing, purchasing, implementing and monitoring AI tools in radiology: practical considerations. A multi-society statement from the ACR, CAR, ESR, RANZCR & RSNA. Insights Imaging 2024; 15:16. [PMID: 38246898 PMCID: PMC10800328 DOI: 10.1186/s13244-023-01541-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2024] Open
Abstract
Artificial Intelligence (AI) carries the potential for unprecedented disruption in radiology, with possible positive and negative consequences. The integration of AI in radiology holds the potential to revolutionize healthcare practices by advancing diagnosis, quantification, and management of multiple medical conditions. Nevertheless, the ever-growing availability of AI tools in radiology highlights an increasing need to critically evaluate claims for its utility and to differentiate safe product offerings from potentially harmful, or fundamentally unhelpful ones. This multi-society paper, presenting the views of Radiology Societies in the USA, Canada, Europe, Australia, and New Zealand, defines the potential practical problems and ethical issues surrounding the incorporation of AI into radiological practice. In addition to delineating the main points of concern that developers, regulators, and purchasers of AI tools should consider prior to their introduction into clinical practice, this statement also suggests methods to monitor their stability and safety in clinical use, and their suitability for possible autonomous function. This statement is intended to serve as a useful summary of the practical issues which should be considered by all parties involved in the development of radiology AI resources, and their implementation as clinical tools.
Key points:
• The incorporation of artificial intelligence (AI) in radiological practice demands increased monitoring of its utility and safety.
• Cooperation between developers, clinicians, and regulators will allow all involved to address ethical issues and monitor AI performance.
• AI can fulfil its promise to advance patient well-being if all steps from development to integration in healthcare are rigorously evaluated.
Collapse
Affiliation(s)
| | - Bibb Allen
- Department of Radiology, Grandview Medical Center, Birmingham, AL, USA
- American College of Radiology Data Science Institute, Reston, VA, USA
| | - Jaron Chong
- Department of Medical Imaging, Schulich School of Medicine and Dentistry, Western University, London, ON, Canada
| | - Elmar Kotter
- Department of Diagnostic and Interventional Radiology, Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
| | - Nina Kottler
- Radiology Partners, El Segundo, CA, USA
- Stanford Center for Artificial Intelligence in Medicine & Imaging, Palo Alto, CA, USA
| | - John Mongan
- Department of Radiology and Biomedical Imaging, University of California, San Francisco, USA
| | - Lauren Oakden-Rayner
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, Australia
| | - Daniel Pinto Dos Santos
- Department of Radiology, University Hospital of Cologne, Cologne, Germany
- Department of Radiology, University Hospital of Frankfurt, Frankfurt, Germany
| | - An Tang
- Department of Radiology, Radiation Oncology, and Nuclear Medicine, Université de Montréal, Montréal, Québec, Canada
| | - Christoph Wald
- Department of Radiology, Lahey Hospital & Medical Center, Burlington, MA, USA
- Tufts University Medical School, Boston, MA, USA
- Commission On Informatics, and Member, Board of Chancellors, American College of Radiology, Virginia, USA
| | - John Slavotinek
- South Australia Medical Imaging, Flinders Medical Centre Adelaide, Adelaide, Australia
- College of Medicine and Public Health, Flinders University, Adelaide, Australia
| |
Collapse
|
35
|
Nguyen T. ChatGPT in Medical Education: A Precursor for Automation Bias? JMIR MEDICAL EDUCATION 2024; 10:e50174. [PMID: 38231545 PMCID: PMC10831594 DOI: 10.2196/50174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/21/2023] [Accepted: 12/11/2023] [Indexed: 01/18/2024]
Abstract
Artificial intelligence (AI) in health care has the promise of providing accurate and efficient results. However, AI can also be a black box, where the logic behind its results is nonrational. There are concerns if these questionable results are used in patient care. As physicians have the duty to provide care based on their clinical judgment in addition to their patients' values and preferences, it is crucial that physicians validate the results from AI. Yet, there are some physicians who exhibit a phenomenon known as automation bias, where there is an assumption from the user that AI is always right. This is a dangerous mindset, as users exhibiting automation bias will not validate the results, given their trust in AI systems. Several factors impact a user's susceptibility to automation bias, such as inexperience or being born in the digital age. In this editorial, I argue that these factors and a lack of AI education in the medical school curriculum cause automation bias. I also explore the harms of automation bias and why prospective physicians need to be vigilant when using AI. Furthermore, it is important to consider what attitudes are being taught to students when introducing ChatGPT, which could be some students' first time using AI, prior to their use of AI in the clinical setting. Therefore, in attempts to avoid the problem of automation bias in the long-term, in addition to incorporating AI education into the curriculum, as is necessary, the use of ChatGPT in medical education should be limited to certain tasks. Otherwise, having no constraints on what ChatGPT should be used for could lead to automation bias.
Collapse
Affiliation(s)
- Tina Nguyen
- The University of Texas Medical Branch, Galveston, TX, United States
| |
Collapse
|
36
|
Dot G, Gajny L, Ducret M. [The challenges of artificial intelligence in odontology]. Med Sci (Paris) 2024; 40:79-84. [PMID: 38299907 DOI: 10.1051/medsci/2023199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/02/2024] Open
Abstract
Artificial intelligence has numerous potential applications in dentistry, as these algorithms aim to improve the efficiency and safety of several clinical situations. While the first commercial solutions are being proposed, most of these algorithms have not been sufficiently validated for clinical use. This article describes the challenges surrounding the development of these new tools, to help clinicians to keep a critical eye on this technology.
Collapse
Affiliation(s)
- Gauthier Dot
- UFR odontologie, université Paris Cité, Paris, France - AP-HP, hôpital Pitié-Salpêtrière, service de médecine bucco-dentaire, Paris, France - Institut de biomécanique humaine Georges Charpak, école nationale supérieure d'Arts et Métiers, Paris, France
| | - Laurent Gajny
- Institut de biomécanique humaine Georges Charpak, école nationale supérieure d'Arts et Métiers, Paris, France
| | - Maxime Ducret
- Faculté d'odontologie, université Claude Bernard Lyon 1, hospices civils de Lyon, Lyon, France
| |
Collapse
|
37
|
Brady AP, Allen B, Chong J, Kotter E, Kottler N, Mongan J, Oakden-Rayner L, dos Santos DP, Tang A, Wald C, Slavotinek J. Developing, Purchasing, Implementing and Monitoring AI Tools in Radiology: Practical Considerations. A Multi-Society Statement from the ACR, CAR, ESR, RANZCR and RSNA. Radiol Artif Intell 2024; 6:e230513. [PMID: 38251899 PMCID: PMC10831521 DOI: 10.1148/ryai.230513] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2024]
Abstract
Artificial Intelligence (AI) carries the potential for unprecedented disruption in radiology, with possible positive and negative consequences. The integration of AI in radiology holds the potential to revolutionize healthcare practices by advancing diagnosis, quantification, and management of multiple medical conditions. Nevertheless, the ever-growing availability of AI tools in radiology highlights an increasing need to critically evaluate claims for its utility and to differentiate safe product offerings from potentially harmful, or fundamentally unhelpful ones. This multi-society paper, presenting the views of Radiology Societies in the USA, Canada, Europe, Australia, and New Zealand, defines the potential practical problems and ethical issues surrounding the incorporation of AI into radiological practice. In addition to delineating the main points of concern that developers, regulators, and purchasers of AI tools should consider prior to their introduction into clinical practice, this statement also suggests methods to monitor their stability and safety in clinical use, and their suitability for possible autonomous function. This statement is intended to serve as a useful summary of the practical issues which should be considered by all parties involved in the development of radiology AI resources, and their implementation as clinical tools. This article is simultaneously published in Insights into Imaging (DOI 10.1186/s13244-023-01541-3), Journal of Medical Imaging and Radiation Oncology (DOI 10.1111/1754-9485.13612), Canadian Association of Radiologists Journal (DOI 10.1177/08465371231222229), Journal of the American College of Radiology (DOI 10.1016/j.jacr.2023.12.005), and Radiology: Artificial Intelligence (DOI 10.1148/ryai.230513). Keywords: Artificial Intelligence, Radiology, Automation, Machine Learning. Published under a CC BY 4.0 license. ©The Author(s) 2024. Editor's Note: The RSNA Board of Directors has endorsed this article. It has not undergone review or editing by this journal.
Collapse
Affiliation(s)
| | - Bibb Allen
- Department of Radiology, Grandview Medical Center, Birmingham, AL, USA
- American College of Radiology Data Science Institute, Reston, VA, USA
| | - Jaron Chong
- Department of Medical Imaging, Schulich School of Medicine and Dentistry, Western University, London, ON, Canada
| | - Elmar Kotter
- Department of Diagnostic and Interventional Radiology, Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
| | - Nina Kottler
- Radiology Partners, El Segundo, CA, USA
- Stanford Center for Artificial Intelligence in Medicine & Imaging, Palo Alto, CA, USA
| | - John Mongan
- Department of Radiology and Biomedical Imaging, University of California, San Francisco, USA
| | - Lauren Oakden-Rayner
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, Australia
| | - Daniel Pinto dos Santos
- Department of Radiology, University Hospital of Cologne, Cologne, Germany
- Department of Radiology, University Hospital of Frankfurt, Frankfurt, Germany
| | - An Tang
- Department of Radiology, Radiation Oncology, and Nuclear Medicine, Université de Montréal, Montréal, Québec, Canada
| | - Christoph Wald
- Department of Radiology, Lahey Hospital & Medical Center, Burlington, MA, USA
- Tufts University Medical School, Boston, MA, USA
- Commission On Informatics, and Member, Board of Chancellors, American College of Radiology, Virginia, USA
| | - John Slavotinek
- South Australia Medical Imaging, Flinders Medical Centre Adelaide, Adelaide, Australia
- College of Medicine and Public Health, Flinders University, Adelaide, Australia
| |
Collapse
|
38
|
Teneggi J, Yi PH, Sulam J. Examination-Level Supervision for Deep Learning-based Intracranial Hemorrhage Detection on Head CT Scans. Radiol Artif Intell 2024; 6:e230159. [PMID: 38294324 PMCID: PMC10831525 DOI: 10.1148/ryai.230159] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2023] [Revised: 11/02/2023] [Accepted: 12/05/2023] [Indexed: 02/01/2024]
Abstract
Purpose To compare the effectiveness of weak supervision (ie, with examination-level labels only) and strong supervision (ie, with image-level labels) in training deep learning models for detection of intracranial hemorrhage (ICH) on head CT scans. Materials and Methods In this retrospective study, an attention-based convolutional neural network was trained with either local (ie, image level) or global (ie, examination level) binary labels on the Radiological Society of North America (RSNA) 2019 Brain CT Hemorrhage Challenge dataset of 21 736 examinations (8876 [40.8%] ICH) and 752 422 images (107 784 [14.3%] ICH). The CQ500 (436 examinations; 212 [48.6%] ICH) and CT-ICH (75 examinations; 36 [48.0%] ICH) datasets were employed for external testing. Performance in detecting ICH was compared between weak (examination-level labels) and strong (image-level labels) learners as a function of the number of labels available during training. Results On examination-level binary classification, strong and weak learners did not have different area under the receiver operating characteristic curve values on the internal validation split (0.96 vs 0.96; P = .64) and the CQ500 dataset (0.90 vs 0.92; P = .15). Weak learners outperformed strong ones on the CT-ICH dataset (0.95 vs 0.92; P = .03). Weak learners had better section-level ICH detection performance when more than 10 000 labels were available for training (average f1 = 0.73 vs 0.65; P < .001). Weakly supervised models trained on the entire RSNA dataset required 35 times fewer labels than equivalent strong learners. Conclusion Strongly supervised models did not achieve better performance than weakly supervised ones, which could reduce radiologist labor requirements for prospective dataset curation. Keywords: CT, Head/Neck, Brain/Brain Stem, Hemorrhage Supplemental material is available for this article. © RSNA, 2023 See also commentary by Wahid and Fuentes in this issue.
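The weak-supervision setup can be illustrated with the simplest multiple-instance pooling: when only an examination-level label exists, per-image predictions must be aggregated into one examination-level score before any loss is computed. The sketch below uses max-pooling and invented probabilities; the paper itself uses an attention-based network, which effectively learns a weighted pooling rather than a hard max:

```python
def examination_prediction(image_probs):
    """Pool per-image ICH probabilities to one examination-level score:
    an exam is suspicious if any single CT section looks suspicious."""
    return max(image_probs)

# Hypothetical per-section ICH probabilities for two head CT examinations.
exam_a = [0.02, 0.05, 0.91, 0.10]  # one suspicious section
exam_b = [0.01, 0.03, 0.04]        # all sections look clean

print(examination_prediction(exam_a))  # -> 0.91, exam flagged
print(examination_prediction(exam_b))  # -> 0.04, exam not flagged
```

Because only the pooled score is supervised, annotators never need to mark individual sections, which is the source of the 35-fold label saving the authors quantify.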
Collapse
Affiliation(s)
- Jacopo Teneggi
- From the Department of Computer Science (J.T.), Department of Biomedical Engineering (J.S.), and Mathematical Institute for Data Science (MINDS) (J.S., J.T.), Johns Hopkins University, 3400 N Charles St, Clark Hall, Suite 320, Baltimore, MD 21218; and University of Maryland Medical Intelligent Imaging Center (UM2ii), Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine, Baltimore, Md (P.H.Y.)
| | - Paul H. Yi
- From the Department of Computer Science (J.T.), Department of Biomedical Engineering (J.S.), and Mathematical Institute for Data Science (MINDS) (J.S., J.T.), Johns Hopkins University, 3400 N Charles St, Clark Hall, Suite 320, Baltimore, MD 21218; and University of Maryland Medical Intelligent Imaging Center (UM2ii), Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine, Baltimore, Md (P.H.Y.)
| | - Jeremias Sulam
- From the Department of Computer Science (J.T.), Department of Biomedical Engineering (J.S.), and Mathematical Institute for Data Science (MINDS) (J.S., J.T.), Johns Hopkins University, 3400 N Charles St, Clark Hall, Suite 320, Baltimore, MD 21218; and University of Maryland Medical Intelligent Imaging Center (UM2ii), Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine, Baltimore, Md (P.H.Y.)
| |
39
Jabbour S, Fouhey D, Shepard S, Valley TS, Kazerooni EA, Banovic N, Wiens J, Sjoding MW. Measuring the Impact of AI in the Diagnosis of Hospitalized Patients: A Randomized Clinical Vignette Survey Study. JAMA 2023; 330:2275-2284. [PMID: 38112814 PMCID: PMC10731487 DOI: 10.1001/jama.2023.22295]
Abstract
Importance Artificial intelligence (AI) could support clinicians when diagnosing hospitalized patients; however, systematic bias in AI models could worsen clinician diagnostic accuracy. Recent regulatory guidance has called for AI models to include explanations to mitigate errors made by models, but the effectiveness of this strategy has not been established. Objectives To evaluate the impact of systematically biased AI on clinician diagnostic accuracy and to determine if image-based AI model explanations can mitigate model errors. Design, Setting, and Participants Randomized clinical vignette survey study administered between April 2022 and January 2023 across 13 US states involving hospitalist physicians, nurse practitioners, and physician assistants. Interventions Clinicians were shown 9 clinical vignettes of patients hospitalized with acute respiratory failure, including their presenting symptoms, physical examination, laboratory results, and chest radiographs. Clinicians were then asked to determine the likelihood of pneumonia, heart failure, or chronic obstructive pulmonary disease as the underlying cause(s) of each patient's acute respiratory failure. To establish baseline diagnostic accuracy, clinicians were shown 2 vignettes without AI model input. Clinicians were then randomized to see 6 vignettes with AI model input with or without AI model explanations. Among these 6 vignettes, 3 vignettes included standard-model predictions, and 3 vignettes included systematically biased model predictions. Main Outcomes and Measures Clinician diagnostic accuracy for pneumonia, heart failure, and chronic obstructive pulmonary disease. Results Median participant age was 34 years (IQR, 31-39) and 241 (57.7%) were female. Four hundred fifty-seven clinicians were randomized and completed at least 1 vignette, with 231 randomized to AI model predictions without explanations, and 226 randomized to AI model predictions with explanations. 
Clinicians' baseline diagnostic accuracy was 73.0% (95% CI, 68.3% to 77.8%) for the 3 diagnoses. When shown a standard AI model without explanations, clinician accuracy increased over baseline by 2.9 percentage points (95% CI, 0.5 to 5.2) and by 4.4 percentage points (95% CI, 2.0 to 6.9) when clinicians were also shown AI model explanations. Systematically biased AI model predictions decreased clinician accuracy by 11.3 percentage points (95% CI, 7.2 to 15.5) compared with baseline, and providing biased AI model predictions with explanations decreased clinician accuracy by 9.1 percentage points (95% CI, 4.9 to 13.2) compared with baseline, representing a nonsignificant improvement of 2.3 percentage points (95% CI, -2.7 to 7.2) compared with the systematically biased AI model. Conclusions and Relevance Although standard AI models improved diagnostic accuracy, systematically biased AI models reduced diagnostic accuracy, and commonly used image-based AI model explanations did not mitigate this harmful effect. Trial Registration ClinicalTrials.gov Identifier: NCT06098950.
Affiliation(s)
- Sarah Jabbour
- Computer Science and Engineering, University of Michigan, Ann Arbor
- David Fouhey
- Computer Science and Engineering, University of Michigan, Ann Arbor
- Now with Computer Science, Courant Institute, New York University, New York
- Now with Electrical and Computer Engineering, Tandon School of Engineering, New York University, New York
- Thomas S. Valley
- Pulmonary and Critical Care Medicine, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor
- Ella A. Kazerooni
- Department of Radiology, University of Michigan Medical School, Ann Arbor
- Nikola Banovic
- Computer Science and Engineering, University of Michigan, Ann Arbor
- Jenna Wiens
- Computer Science and Engineering, University of Michigan, Ann Arbor
- Michael W. Sjoding
- Pulmonary and Critical Care Medicine, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor
40
Smith CM, Weathers AL, Lewis SL. An overview of clinical machine learning applications in neurology. J Neurol Sci 2023; 455:122799. [PMID: 37979413 DOI: 10.1016/j.jns.2023.122799]
Abstract
Machine learning techniques for clinical applications are evolving, and the potential impact this will have on clinical neurology is important to recognize. By providing a broad overview on this growing paradigm of clinical tools, this article aims to help healthcare professionals in neurology prepare to navigate both the opportunities and challenges brought on through continued advancements in machine learning. This narrative review first elaborates on how machine learning models are organized and implemented. Machine learning tools are then classified by clinical application, with examples of uses within neurology described in more detail. Finally, this article addresses limitations and considerations regarding clinical machine learning applications in neurology.
Affiliation(s)
- Colin M Smith
- Lehigh Valley Fleming Neuroscience Institute, 1250 S Cedar Crest Blvd., Allentown, PA 18103, USA
- Allison L Weathers
- Cleveland Clinic Information Technology Division, 9500 Euclid Ave., Cleveland, OH 44195, USA
- Steven L Lewis
- Lehigh Valley Fleming Neuroscience Institute, 1250 S Cedar Crest Blvd., Allentown, PA 18103, USA
41
Funer F, Liedtke W, Tinnemeyer S, Klausen AD, Schneider D, Zacharias HU, Langanke M, Salloch S. Responsibility and decision-making authority in using clinical decision support systems: an empirical-ethical exploration of German prospective professionals' preferences and concerns. J Med Ethics 2023; 50:6-11. [PMID: 37217277 PMCID: PMC10803986 DOI: 10.1136/jme-2022-108814]
Abstract
Machine learning-driven clinical decision support systems (ML-CDSSs) seem impressively promising for future routine and emergency care. However, reflection on their clinical implementation reveals a wide array of ethical challenges. The preferences, concerns and expectations of professional stakeholders remain largely unexplored. Empirical research, however, may help to clarify the conceptual debate and its aspects in terms of their relevance for clinical practice. This study explores, from an ethical point of view, future healthcare professionals' attitudes to potential changes in responsibility and decision-making authority when using ML-CDSSs. Twenty-seven semistructured interviews were conducted with German medical students and nursing trainees. The data were analysed using qualitative content analysis according to Kuckartz. Interviewees' reflections are presented under three themes that the interviewees describe as closely related: (self-)attribution of responsibility, decision-making authority and the need for (professional) experience. The results illustrate the conceptual interconnectedness of professional responsibility and the structural and epistemic preconditions that must be met for clinicians to fulfil their responsibility in a meaningful manner. The study also sheds light on the four relata of responsibility understood as a relational concept. The article closes with concrete suggestions for the ethically sound clinical implementation of ML-CDSSs.
Affiliation(s)
- Florian Funer
- Institute of Ethics, History and Philosophy of Medicine, Hannover Medical School, Hannover, Germany
- Institute of Ethics and History of Medicine, Eberhard Karls University Tübingen, Tübingen, Germany
- Wenke Liedtke
- Department of Social Work, Protestant University of Applied Sciences RWL, Bochum, Germany
- Sara Tinnemeyer
- Institute of Ethics, History and Philosophy of Medicine, Hannover Medical School, Hannover, Germany
- Diana Schneider
- Competence Center Emerging Technologies, Fraunhofer Institute for Systems and Innovation Research ISI, Karlsruhe, Germany
- Helena U Zacharias
- Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Hannover Medical School, Hannover, Germany
- Martin Langanke
- Department of Social Work, Protestant University of Applied Sciences RWL, Bochum, Germany
- Sabine Salloch
- Institute of Ethics, History and Philosophy of Medicine, Hannover Medical School, Hannover, Germany
42
Banerji CRS, Chakraborti T, Harbron C, MacArthur BD. Clinical AI tools must convey predictive uncertainty for each individual patient. Nat Med 2023; 29:2996-2998. [PMID: 37821686 DOI: 10.1038/s41591-023-02562-7]
Affiliation(s)
- Christopher R S Banerji
- The Alan Turing Institute, London, UK.
- University College London Hospitals, NHS Foundation Trust, London, UK.
- UCL Cancer Institute, Faculty of Medical Sciences, University College London, London, UK.
- Tapabrata Chakraborti
- The Alan Turing Institute, London, UK
- UCL Cancer Institute, Faculty of Medical Sciences, University College London, London, UK
- Ben D MacArthur
- The Alan Turing Institute, London, UK
- Faculty of Medicine, University of Southampton, Southampton, UK
- Mathematical Sciences, University of Southampton, Southampton, UK
43
Nagendran M, Festor P, Komorowski M, Gordon AC, Faisal AA. Quantifying the impact of AI recommendations with explanations on prescription decision making. NPJ Digit Med 2023; 6:206. [PMID: 37935953 PMCID: PMC10630476 DOI: 10.1038/s41746-023-00955-z]
Abstract
The influence of AI recommendations on physician behaviour remains poorly characterised. We assess how clinicians' decisions may be influenced by additional information more broadly, and how this influence can be modified by the source of the information (human peers or AI) and by the presence or absence of an AI explanation (XAI, here using simple feature importance). We used a modified between-subjects design in which intensive care doctors (N = 86) were presented, on a computer, with a patient case in each of 16 trials and prompted to prescribe continuous values for two drugs. We used a multi-factorial experimental design with four arms, where each clinician experienced all four arms on different subsets of our 24 patients. The four arms were (i) baseline (control), (ii) a peer-clinician scenario showing what doses had been prescribed by other doctors, (iii) an AI suggestion and (iv) an XAI suggestion. We found that additional information (peer, AI or XAI) had a strong influence on prescriptions (significantly so for AI, not for peers), but simple XAI did not have a greater influence than AI alone. Neither attitudes toward AI nor clinical experience correlated with the AI-supported decisions, nor was there a correlation between how useful doctors self-reported finding the XAI and whether the XAI actually influenced their prescriptions. Our findings suggest that the marginal impact of simple XAI was low in this setting, and we also cast doubt on the utility of self-reports as a valid metric for assessing XAI in clinical experts.
Affiliation(s)
- Myura Nagendran
- UKRI Centre for Doctoral Training in AI for Healthcare, Imperial College London, London, UK
- Division of Anaesthetics, Pain Medicine, and Intensive Care, Imperial College London, London, UK
- Brain and Behaviour Lab, Imperial College London, London, UK
- Paul Festor
- UKRI Centre for Doctoral Training in AI for Healthcare, Imperial College London, London, UK
- Brain and Behaviour Lab, Imperial College London, London, UK
- Department of Computing, Imperial College London, London, UK
- Matthieu Komorowski
- Division of Anaesthetics, Pain Medicine, and Intensive Care, Imperial College London, London, UK
- Anthony C Gordon
- Division of Anaesthetics, Pain Medicine, and Intensive Care, Imperial College London, London, UK
- Aldo A Faisal
- UKRI Centre for Doctoral Training in AI for Healthcare, Imperial College London, London, UK
- Brain and Behaviour Lab, Imperial College London, London, UK
- Department of Computing, Imperial College London, London, UK
- Institute of Artificial & Human Intelligence, University of Bayreuth, Bayreuth, Germany
44
Li MD, Little BP. Appropriate Reliance on Artificial Intelligence in Radiology Education. J Am Coll Radiol 2023; 20:1126-1130. [PMID: 37392983 DOI: 10.1016/j.jacr.2023.04.019]
Abstract
Users of artificial intelligence (AI) can become overreliant on AI, negatively affecting the performance of human-AI teams. For a future in which radiologists use interpretive AI tools routinely in clinical practice, radiology education will need to evolve to provide radiologists with the skills to use AI appropriately and wisely. In this work, we examine how overreliance on AI may develop in radiology trainees and explore how this problem can be mitigated, including through the use of AI-augmented education. Radiology trainees will still need to develop the perceptual skills and mastery of knowledge fundamental to radiology to use AI safely. We propose a framework for radiology trainees to use AI tools with appropriate reliance, drawing on lessons from human-AI interactions research.
Affiliation(s)
- Matthew D Li
- Department of Radiology and Diagnostic Imaging, Faculty of Medicine & Dentistry, University of Alberta, Edmonton, Alberta, Canada.
- Brent P Little
- Mayo Clinic College of Medicine and Science, Department of Radiology, Division of Cardiothoracic Imaging, Mayo Clinic Florida, Florida; Committee Member, ACR Appropriateness Criteria Thoracic Imaging
45
Ghassemi M. Presentation matters for AI-generated clinical advice. Nat Hum Behav 2023; 7:1833-1835. [PMID: 37985904 DOI: 10.1038/s41562-023-01721-7]
Affiliation(s)
- Marzyeh Ghassemi
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Institute for Medical Engineering & Science, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Vector Institute, Toronto, Ontario, Canada.
46
Schlicker N, Langer M, Hirsch MC. [How trustworthy is artificial intelligence? A model for the conflict between objectivity and subjectivity]. Inn Med (Heidelb) 2023; 64:1051-1057. [PMID: 37737496 DOI: 10.1007/s00108-023-01602-1]
Abstract
For the integration of artificial intelligence (AI) systems into medical processes, it is crucial to address both the trustworthiness of these systems and the trust that physicians and patients place in them. Too much trust can result in physicians uncritically relying on this technology, while too little trust may result in physicians not taking advantage of the full potential of AI-based technology in making decisions. To strike a balance between these extremes, it is essential to correctly assess the trustworthiness of a system; only then is it possible to decide whether or not the system can be trusted. This article describes these relationships for the medical context. We show why trustworthiness and trust are important in the use of AI-based systems and how individuals can arrive at an accurate assessment of the trustworthiness of AI-based systems.
Affiliation(s)
- Nadine Schlicker
- Institut für Künstliche Intelligenz in der Medizin, Philipps-Universität Marburg, Baldingerstr., 35043, Marburg, Germany
- Markus Langer
- Fachbereich Psychologie, Arbeitseinheit Digitalisierung in psychologischen Handlungsfeldern, Philipps-Universität Marburg, Marburg, Germany
- Martin C Hirsch
- Institut für Künstliche Intelligenz in der Medizin, Philipps-Universität Marburg, Baldingerstr., 35043, Marburg, Germany
47
Vijayakumar S, Lee VV, Leong QY, Hong SJ, Blasiak A, Ho D. Physicians' Perspectives on AI in Clinical Decision Support Systems: Interview Study of the CURATE.AI Personalized Dose Optimization Platform. JMIR Hum Factors 2023; 10:e48476. [PMID: 37902825 PMCID: PMC10644191 DOI: 10.2196/48476]
Abstract
BACKGROUND Physicians play a key role in integrating new clinical technology into care practices through user feedback and growth propositions to the technology's developers. As stakeholders throughout the technology iteration process, physicians can provide nuanced insights into the workings of the technologies being explored. Understanding their perceptions can therefore be critical for clinical validation, implementation, and downstream adoption. Given the increasing prevalence of clinical decision support systems (CDSSs), there remains a need for an in-depth understanding of physicians' perceptions of and expectations for their downstream implementation. This paper explores physicians' perceptions of integrating CURATE.AI, a novel artificial intelligence (AI)-based, clinical-stage personalized dosing CDSS, into clinical practice. OBJECTIVE This study aims to understand physicians' perspectives on integrating CURATE.AI into clinical work and to gather insights on considerations for the implementation of AI-based CDSS tools. METHODS A total of 12 participants completed semistructured interviews examining their knowledge, experience, attitudes, perceived risks, and anticipated future course of the personalized combination therapy dosing platform CURATE.AI. Interviews were audio recorded, transcribed verbatim, and coded manually. The data were thematically analyzed. RESULTS Overall, 3 broad themes and 9 subthemes were identified through thematic analysis. The themes covered considerations that physicians perceived as significant across various stages of new technology development, including trial, clinical implementation, and mass adoption. CONCLUSIONS The study laid out the various ways physicians interpreted an AI-based personalized dosing CDSS, CURATE.AI, for their clinical practice. The research showed that physicians' expectations during the different stages of technology exploration can be nuanced and layered, with implementation expectations that are relevant for technology developers and researchers.
Affiliation(s)
- Smrithi Vijayakumar
- The N.1 Institute for Health, National University of Singapore, Singapore, Singapore
- V Vien Lee
- The N.1 Institute for Health, National University of Singapore, Singapore, Singapore
- Qiao Ying Leong
- The N.1 Institute for Health, National University of Singapore, Singapore, Singapore
- Soo Jung Hong
- Department of Communications and New Media, National University of Singapore, Singapore, Singapore
- Agata Blasiak
- The N.1 Institute for Health, National University of Singapore, Singapore, Singapore
- Department of Biomedical Engineering, National University of Singapore, Singapore, Singapore
- The Institute for Digital Medicine (WisDM), Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Department of Pharmacology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Dean Ho
- The N.1 Institute for Health, National University of Singapore, Singapore, Singapore
- Department of Biomedical Engineering, National University of Singapore, Singapore, Singapore
- The Institute for Digital Medicine (WisDM), Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Department of Pharmacology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
48
Joo H, Mathis MR, Tam M, James C, Han P, Mangrulkar RS, Friedman CP, Vydiswaran VGV. Applying AI and Guidelines to Assist Medical Students in Recognizing Patients With Heart Failure: Protocol for a Randomized Trial. JMIR Res Protoc 2023; 12:e49842. [PMID: 37874618 PMCID: PMC10630872 DOI: 10.2196/49842]
Abstract
BACKGROUND The integration of artificial intelligence (AI) into health care is transforming both clinical practice and medical education. AI-based systems aim to improve the efficacy of clinical tasks, enhancing diagnostic accuracy and tailoring treatment delivery. As AI becomes increasingly prevalent in health care, it is critical for health care providers to use these systems responsibly to mitigate bias, ensure effective outcomes, and provide safe clinical practice. In this study, the clinical task is the identification of heart failure (HF) prior to surgery, with the intention of enhancing clinical decision-making skills. HF is a common and severe disease, but detection remains challenging due to its subtle manifestation, often concurrent with other medical conditions, and the absence of a simple and effective diagnostic test. While advanced HF algorithms have been developed, the use of these AI-based systems to enhance clinical decision-making in medical education remains understudied. OBJECTIVE This research protocol demonstrates our study design, the systematic procedure for selecting surgical cases from electronic health records, and the interventions. The primary objective of this study is to measure the effectiveness of interventions aimed at improving HF recognition before surgery; the second objective is to evaluate the impact of inaccurate AI recommendations; and the third objective is to explore the relationship between the inclination to accept AI recommendations and their accuracy. METHODS Our study used a 3 × 2 factorial design (intervention type × order of pre-post sets) for this randomized trial with medical students. The student participants are asked to complete a 30-minute e-learning module that includes key information about the intervention and a 5-question quiz, and a 60-minute review of 20 surgical cases to determine the presence of HF.
To mitigate selection bias in the pre- and posttests, we adopted a feature-based systematic sampling procedure. From a pool of 703 expert-reviewed surgical cases, 20 were selected based on features such as case complexity, model performance, and positive and negative labels. This study comprises three interventions: (1) a direct AI-based recommendation with a predicted HF score, (2) an indirect AI-based recommendation gauged through the area under the curve metric, and (3) an HF guideline-based intervention. RESULTS As of July 2023, 62 of the enrolled medical students have completed their participation in this study, including a short quiz and the review of 20 surgical cases. Subject enrollment commenced in August 2022 and will end in December 2023, with the goal of recruiting 75 medical students in years 3 and 4 with clinical experience. CONCLUSIONS We demonstrated a study protocol for a randomized trial measuring the effectiveness of interventions using AI and HF guidelines among medical students to enhance HF recognition in preoperative care with electronic health record data. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) DERR1-10.2196/49842.
Affiliation(s)
- Hyeon Joo
- Department of Learning Health Sciences, University of Michigan, Ann Arbor, MI, United States
- Michael R Mathis
- Department of Anesthesiology, University of Michigan, Ann Arbor, MI, United States
- Marty Tam
- Department of Internal Medicine, Cardiology, University of Michigan, Ann Arbor, MI, United States
- Cornelius James
- Department of Learning Health Sciences, University of Michigan, Ann Arbor, MI, United States
- Department of Pediatrics, University of Michigan, Ann Arbor, MI, United States
- Department of Internal Medicine, University of Michigan, Ann Arbor, MI, United States
- Peijin Han
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, United States
- Rajesh S Mangrulkar
- Department of Learning Health Sciences, University of Michigan, Ann Arbor, MI, United States
- Department of Internal Medicine, University of Michigan, Ann Arbor, MI, United States
- Charles P Friedman
- Department of Learning Health Sciences, University of Michigan, Ann Arbor, MI, United States
- School of Information, University of Michigan, Ann Arbor, MI, United States
- V G Vinod Vydiswaran
- Department of Learning Health Sciences, University of Michigan, Ann Arbor, MI, United States
- School of Information, University of Michigan, Ann Arbor, MI, United States
49
Vicente L, Matute H. Humans inherit artificial intelligence biases. Sci Rep 2023; 13:15737. [PMID: 37789032 PMCID: PMC10547752 DOI: 10.1038/s41598-023-42384-8]
Abstract
Artificial intelligence recommendations are sometimes erroneous and biased. In our research, we hypothesized that people who perform a (simulated) medical diagnostic task assisted by a biased AI system will reproduce the model's bias in their own decisions, even when they move to a context without AI support. In three experiments, participants completed a medical-themed classification task with or without the help of a biased AI system. The biased recommendations by the AI influenced participants' decisions. Moreover, when those participants, assisted by the AI, moved on to perform the task without assistance, they made the same errors as the AI had made during the previous phase. Thus, participants' responses mimicked AI bias even when the AI was no longer making suggestions. These results provide evidence of human inheritance of AI bias.
Affiliation(s)
- Lucía Vicente
- Department of Psychology, Deusto University, Avenida Universidades 24, 48007, Bilbao, Spain
- Helena Matute
- Department of Psychology, Deusto University, Avenida Universidades 24, 48007, Bilbao, Spain
50
Carboni C, Wehrens R, van der Veen R, de Bont A. Eye for an AI: More-than-seeing, fauxtomation, and the enactment of uncertain data in digital pathology. Soc Stud Sci 2023; 53:712-737. [PMID: 37154611 PMCID: PMC10543128 DOI: 10.1177/03063127231167589]
Abstract
Artificial intelligence (AI) tools are being developed to assist with increasingly complex diagnostic tasks in medicine. This produces epistemic disruption in diagnostic processes, even in the absence of AI itself, through the datafication and digitalization encouraged by the promissory discourses around AI. In this study of the digitization of an academic pathology department, we mobilize Barad's agential realist framework to examine these epistemic disruptions. Narratives and expectations around AI-assisted diagnostics, which are inextricable from material changes, enact specific types of organizational change, and produce epistemic objects that facilitate the emergence of some epistemic practices and subjects but hinder others. Agential realism allows us to simultaneously study epistemic, ethical, and ontological changes enacted through digitization efforts, while keeping a close eye on the attendant organizational changes. Based on ethnographic analysis of pathologists' changing work processes, we identify three different types of uncertainty produced by digitization: sensorial, intra-active, and fauxtomated uncertainty. Sensorial and intra-active uncertainty stem from the ontological otherness of digital objects, materialized in their affordances, and result in digital slides' partial illegibility. Fauxtomated uncertainty stems from the quasi-automated making of digital slides, which complicates the question of responsibility for epistemic objects and related knowledge by marginalizing the human.
Affiliation(s)
- Chiara Carboni
- Erasmus University Rotterdam, Rotterdam, The Netherlands
- Rik Wehrens
- Erasmus University Rotterdam, Rotterdam, The Netherlands