1
Lehnen NC, Schievelkamp AH, Gronemann C, Haase R, Krause I, Gansen M, Fleckenstein T, Dorn F, Radbruch A, Paech D. Impact of an AI software on the diagnostic performance and reading time for the detection of cerebral aneurysms on time of flight MR-angiography. Neuroradiology 2024:10.1007/s00234-024-03351-w. [PMID: 38619571 DOI: 10.1007/s00234-024-03351-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Accepted: 03/29/2024] [Indexed: 04/16/2024]
Abstract
PURPOSE: To evaluate the impact of an AI-based software trained to detect cerebral aneurysms on TOF-MRA on the diagnostic performance and reading times across readers with varying experience levels.
METHODS: One hundred eighty-six MRI studies were reviewed by six readers to detect cerebral aneurysms. Initially, readings were assisted by the CNN-based software mdbrain. After 6 weeks, a second reading was conducted without software assistance. The results were compared to the consensus reading of two neuroradiological specialists, and sensitivity (lesion and patient level), specificity (patient level), and false positives per case were calculated for the group of all readers, for the subgroup of physicians, and for each individual reader. Reading times were also measured for each reader.
RESULTS: The dataset contained 54 aneurysms. The readers had no experience (three medical students), 2 years of experience (resident in neuroradiology), 6 years of experience (radiologist), and 12 years of experience (neuroradiologist). Significant improvements in overall specificity and in the overall number of false positives per case were observed in the reading with AI support. For the physicians, we found significant improvements in sensitivity on lesion and patient level and in false positives per case. Four readers experienced reduced reading times with the software, while two encountered increased times.
CONCLUSION: In the reading with the AI-based software, we observed significant improvements in specificity and false positives per case for the group of all readers, and significant improvements in sensitivity and false positives per case for the physicians. Further studies are needed to investigate the effects of the AI-based software in a prospective setting.
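The per-reader measures named in the Methods (lesion-level and patient-level sensitivity, patient-level specificity, false positives per case) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the `CaseReading` structure, the `reading_metrics` function, the toy counts, and the patient-level rules (a positive patient counts as detected if at least one reference aneurysm was found; a negative patient counts as correct if no finding was marked) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class CaseReading:
    """One reader's result for one study (hypothetical structure)."""
    true_lesions: int      # aneurysms in the reference standard
    detected_lesions: int  # reference aneurysms the reader found
    false_positives: int   # reader findings without a reference match


def reading_metrics(cases):
    """Compute lesion- and patient-level measures for one reader."""
    positives = [c for c in cases if c.true_lesions > 0]
    negatives = [c for c in cases if c.true_lesions == 0]

    total_lesions = sum(c.true_lesions for c in cases)
    found_lesions = sum(c.detected_lesions for c in cases)

    # Assumed rule: a positive patient is detected if any true lesion was found.
    patients_found = sum(1 for c in positives if c.detected_lesions > 0)
    # Assumed rule: a negative patient is correct if nothing was marked.
    clean_negatives = sum(1 for c in negatives if c.false_positives == 0)

    return {
        "lesion_sensitivity": found_lesions / total_lesions,
        "patient_sensitivity": patients_found / len(positives),
        "patient_specificity": clean_negatives / len(negatives),
        "false_positives_per_case": sum(c.false_positives for c in cases) / len(cases),
    }


# Toy data, invented for illustration only (not from the study).
cases = [
    CaseReading(true_lesions=2, detected_lesions=1, false_positives=0),
    CaseReading(true_lesions=1, detected_lesions=1, false_positives=1),
    CaseReading(true_lesions=1, detected_lesions=0, false_positives=0),
    CaseReading(true_lesions=0, detected_lesions=0, false_positives=0),
    CaseReading(true_lesions=0, detected_lesions=0, false_positives=1),
]

metrics = reading_metrics(cases)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Lesion-level sensitivity counts every aneurysm separately, whereas the patient-level figures collapse each study to a single positive or negative call, which is why the two sensitivities can differ for the same reader.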
Affiliation(s)
- Nils C Lehnen
  - Department of Neuroradiology, University Hospital Bonn, Rheinische Friedrich-Wilhelms-Universität Bonn, 53127, Bonn, Germany
  - Research Group Clinical Neuroimaging, German Center for Neurodegenerative Diseases (DZNE), Bonn, Germany
- Arndt-Hendrik Schievelkamp
  - Department of Neuroradiology, University Hospital Bonn, Rheinische Friedrich-Wilhelms-Universität Bonn, 53127, Bonn, Germany
- Christian Gronemann
  - Department of Neuroradiology, University Hospital Bonn, Rheinische Friedrich-Wilhelms-Universität Bonn, 53127, Bonn, Germany
- Robert Haase
  - Department of Neuroradiology, University Hospital Bonn, Rheinische Friedrich-Wilhelms-Universität Bonn, 53127, Bonn, Germany
- Inga Krause
  - Department of Neuroradiology, University Hospital Bonn, Rheinische Friedrich-Wilhelms-Universität Bonn, 53127, Bonn, Germany
- Max Gansen
  - Department of Neuroradiology, University Hospital Bonn, Rheinische Friedrich-Wilhelms-Universität Bonn, 53127, Bonn, Germany
- Tobias Fleckenstein
  - Department of Neuroradiology, University Hospital Bonn, Rheinische Friedrich-Wilhelms-Universität Bonn, 53127, Bonn, Germany
- Franziska Dorn
  - Department of Neuroradiology, University Hospital Bonn, Rheinische Friedrich-Wilhelms-Universität Bonn, 53127, Bonn, Germany
- Alexander Radbruch
  - Department of Neuroradiology, University Hospital Bonn, Rheinische Friedrich-Wilhelms-Universität Bonn, 53127, Bonn, Germany
  - Research Group Clinical Neuroimaging, German Center for Neurodegenerative Diseases (DZNE), Bonn, Germany
- Daniel Paech
  - Department of Neuroradiology, University Hospital Bonn, Rheinische Friedrich-Wilhelms-Universität Bonn, 53127, Bonn, Germany
2
Lehnen NC, Dorn F, Wiest IC, Zimmermann H, Radbruch A, Kather JN, Paech D. Data Extraction from Free-Text Reports on Mechanical Thrombectomy in Acute Ischemic Stroke Using ChatGPT: A Retrospective Analysis. Radiology 2024; 311:e232741. [PMID: 38625006 DOI: 10.1148/radiol.232741] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/17/2024]
Abstract
Background: Procedural details of mechanical thrombectomy in patients with ischemic stroke are important predictors of clinical outcome and are collected for prospective studies or national stroke registries. To date, these data are collected manually by human readers, a labor-intensive task that is prone to errors.
Purpose: To evaluate the use of the large language models (LLMs) GPT-4 and GPT-3.5 to extract data from neuroradiology reports on mechanical thrombectomy in patients with ischemic stroke.
Materials and Methods: This retrospective study included consecutive reports from patients with ischemic stroke who underwent mechanical thrombectomy between November 2022 and September 2023 at institution 1 and between September 2016 and December 2019 at institution 2. A set of 20 reports was used to optimize the prompt, and the ability of the LLMs to extract procedural data from the reports was compared using the McNemar test. Data manually extracted by an interventional neuroradiologist served as the reference standard.
Results: A total of 100 internal reports from 100 patients (mean age, 74.7 years ± 13.2 [SD]; 53 female) and 30 external reports from 30 patients (mean age, 72.7 years ± 13.5; 18 male) were included. All reports were successfully processed by GPT-4 and GPT-3.5. Of 2800 data entries, 2631 (94.0% [95% CI: 93.0, 94.8]; range per category, 61%-100%) data points were correctly extracted by GPT-4 without the need for further postprocessing. With 1788 of 2800 correct data entries, GPT-3.5 produced fewer correct data entries than did GPT-4 (63.9% [95% CI: 62.0, 65.6]; range per category, 14%-99%; P < .001). For the external reports, GPT-4 extracted 760 of 840 (90.5% [95% CI: 88.3, 92.4]) correct data entries, while GPT-3.5 extracted 539 of 840 (64.2% [95% CI: 60.8, 67.4]; P < .001).
Conclusion: Compared with GPT-3.5, GPT-4 more frequently extracted correct procedural data from free-text reports on mechanical thrombectomy performed in patients with ischemic stroke.
© RSNA, 2024. Supplemental material is available for this article.
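The percentages in the Results can be recomputed from the reported counts. The sketch below does so and attaches a Wilson score interval to each proportion. The abstract does not state which interval method the authors used, so the Wilson interval is an assumption here, and its bounds may differ from the published ones in the last digit. The function name `wilson_interval` and the dictionary labels are illustrative.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


# Counts of correct data entries as reported in the abstract.
reported = {
    "GPT-4, internal": (2631, 2800),
    "GPT-3.5, internal": (1788, 2800),
    "GPT-4, external": (760, 840),
    "GPT-3.5, external": (539, 840),
}

for label, (correct, total) in reported.items():
    low, high = wilson_interval(correct, total)
    print(f"{label}: {correct / total:.1%} [{low:.1%}, {high:.1%}]")
```

Note that the between-model comparison in the study uses the McNemar test, which works on paired per-entry outcomes; those pairs are not available from the abstract, so only the marginal proportions are reproduced here.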
Affiliation(s)
- Nils C Lehnen
- Franziska Dorn
- Isabella C Wiest
- Hanna Zimmermann
- Alexander Radbruch
- Jakob Nikolas Kather
- Daniel Paech
From the Department of Neuroradiology, University Hospital Bonn, Rheinische Friedrich-Wilhelms-Universität Bonn, Venusberg-Campus 1, 53127 Bonn, Germany (N.C.L., F.D., A.R., D.P.); Research Group Clinical Neuroimaging, German Center for Neurodegenerative Diseases (DZNE), Bonn, Germany (N.C.L., A.R.); Department of Medicine II, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany (I.C.W.); Else Kroener Fresenius Center for Digital Health, Medical Faculty Carl Gustav Carus, Technical University Dresden, Dresden, Germany (I.C.W., J.N.K.); Institute of Neuroradiology, University Hospital, LMU Munich, Munich, Germany (H.Z.); and Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, Mass (D.P.)
4
Lehnen NC, Haase R, Schmeel FC, Vatter H, Dorn F, Radbruch A, Paech D. Automated Detection of Cerebral Aneurysms on TOF-MRA Using a Deep Learning Approach: An External Validation Study. AJNR Am J Neuroradiol 2022; 43:1700-1705. [PMID: 36357154 DOI: 10.3174/ajnr.a7695] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/03/2022] [Accepted: 10/05/2022] [Indexed: 11/12/2022]
Abstract
BACKGROUND AND PURPOSE: Cerebral aneurysms carry a risk of rupture, severe disability, and death. Early detection of cerebral aneurysms is therefore crucial to ensure timely treatment, if necessary. AI-based software tools are expected to enhance radiologists' performance in detecting pathologies like cerebral aneurysms in the future. Our aim was to evaluate the diagnostic performance of an artificial intelligence-based software designed to detect intracranial aneurysms on TOF-MRA.
MATERIALS AND METHODS: One hundred ninety-one MR imaging data sets were analyzed using the software mdbrain for the presence of intracranial aneurysms on TOF-MRA obtained using two 3T MR imaging scanners or a 1.5T MR imaging scanner according to our clinical standard protocol. The results were compared with the reading of an experienced radiologist as a criterion standard to measure the sensitivity, specificity, positive and negative predictive values, and accuracy of the software. Additionally, detection rates depending on size, morphology, and location of the aneurysms were evaluated.
RESULTS: Fifty-four aneurysms were detected by the expert reader. The overall sensitivity of the software for the detection of cerebral aneurysms was 72.6%, the specificity was 87.2%, and the accuracy was 82.6%. The positive predictive value was 67.9%, and the negative predictive value was 88.5%. We observed a sensitivity of 100% for saccular aneurysms of >5 mm without signs of thrombosis and low detection rates for fusiform or thrombosed aneurysms of 33.3% and 16.7%, respectively. Of 8 aneurysms that were not included in the initial written reports but were detected retrospectively by the expert reader, 4 were detected by the software.
CONCLUSIONS: Our data suggest that the software can assist radiologists in reporting TOF-MRA. The software was highly reliable in detecting saccular aneurysms, while for fusiform or thrombosed aneurysms, further improvements are needed. Further studies are necessary to investigate the impact of the software on detection rates, interrater reliability, and reading times.
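The predictive values and accuracy reported in the Results all derive from a 2x2 table of software output against the criterion standard. The abstract does not give that table, so the counts below are hypothetical and chosen only to show the arithmetic; the function name `diagnostic_summary` is likewise illustrative.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Derive standard diagnostic measures from 2x2 confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),  # detected among truly positive
        "specificity": tn / (tn + fp),  # cleared among truly negative
        "ppv": tp / (tp + fp),          # truly positive among flagged
        "npv": tn / (tn + fn),          # truly negative among cleared
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts for illustration only (not taken from the study).
summary = diagnostic_summary(tp=8, fp=2, fn=2, tn=28)
for name, value in summary.items():
    print(f"{name}: {value:.1%}")
```

Unlike sensitivity and specificity, the predictive values depend on how common aneurysms are in the sample, so they do not transfer directly to a population with a different prevalence.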
Affiliation(s)
- N C Lehnen
- R Haase
- F C Schmeel
- H Vatter
- F Dorn
- A Radbruch
- D Paech
From the Departments of Neuroradiology (N.C.L., R.H., F.C.S., F.D., A.R., D.P.) and Neurosurgery (H.V.), University Hospital Bonn, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany