1
Huang G, Li Y, Jameel S, Long Y, Papanastasiou G. From explainable to interpretable deep learning for natural language processing in healthcare: How far from reality? Comput Struct Biotechnol J 2024; 24:362-373. [PMID: 38800693 PMCID: PMC11126530 DOI: 10.1016/j.csbj.2024.05.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Revised: 05/03/2024] [Accepted: 05/03/2024] [Indexed: 05/29/2024]
Abstract
Deep learning (DL) has substantially enhanced natural language processing (NLP) in healthcare research. However, the increasing complexity of DL-based NLP necessitates transparent model interpretability, or at least explainability, for reliable decision-making. This work presents a thorough scoping review of explainable and interpretable DL in healthcare NLP. The term "eXplainable and Interpretable Artificial Intelligence" (XIAI) is introduced to distinguish XAI from IAI. Different models are further categorized based on their functionality (model-, input-, output-based) and scope (local, global). Our analysis shows that attention mechanisms are the most prevalent emerging IAI technique. The use of IAI is growing, distinguishing it from XAI. The major challenges identified are that most XIAI does not explore "global" modelling processes, the lack of best practices, and the lack of systematic evaluation and benchmarks. One important opportunity is to use attention mechanisms to enhance multi-modal XIAI for personalized medicine. Additionally, combining DL with causal logic holds promise. Our discussion encourages the integration of XIAI in Large Language Models (LLMs) and domain-specific smaller models. In conclusion, XIAI adoption in healthcare requires dedicated in-house expertise. Collaboration with domain experts, end-users, and policymakers can lead to ready-to-use XIAI methods across NLP and medical tasks. While challenges exist, XIAI techniques offer a valuable foundation for interpretable NLP algorithms in healthcare.
Affiliation(s)
- Guangming Huang
- School of Computer Science and Electronic Engineering, University of Essex, Colchester, CO4 3SQ, United Kingdom
- Yingya Li
- Harvard Medical School and Boston Children's Hospital, Boston, 02115, United States
- Shoaib Jameel
- Electronics and Computer Science, University of Southampton, Southampton, SO17 1BJ, United Kingdom
- Yunfei Long
- School of Computer Science and Electronic Engineering, University of Essex, Colchester, CO4 3SQ, United Kingdom
2
Murnan AW, Tscholl JJ, Ganta R, Duah HO, Qasem I, Sezgin E. Identification of Child Survivors of Sex Trafficking From Electronic Health Records: An Artificial Intelligence Guided Approach. CHILD MALTREATMENT 2024; 29:601-611. [PMID: 37545138 PMCID: PMC11000265 DOI: 10.1177/10775595231194599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/08/2023]
Abstract
Survivors of child sex trafficking (SCST) experience high rates of adverse health outcomes. During their victimization, survivors regularly seek healthcare yet fail to be identified. This study sought to utilize artificial intelligence (AI) to identify SCST and describe the elements of their healthcare presentation. An AI-supported keyword search was conducted to identify SCST within the electronic medical records (EMR) of ∼1.5 million patients at a large midwestern pediatric hospital. Descriptive analyses were used to evaluate associated diagnoses and clinical presentation. A sex trafficking-related keyword was identified in 0.18% of patient charts. Among this cohort, the most common associated diagnostic codes were for Confirmed Sexual/Physical Assault; Trauma and Stress-Related Disorders; Depressive Disorders; Anxiety Disorders; and Suicidal Ideation. Our findings are consistent with the myriad of known adverse physical and psychological outcomes among SCST and illuminate the future potential of AI technology to improve screening and research efforts surrounding all aspects of this vulnerable population.
Affiliation(s)
- Aaron W Murnan
- College of Nursing, University of Cincinnati, Cincinnati, OH, USA
- Jennifer J Tscholl
- Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA
- Division of Child and Family Advocacy, Center for Family Safety and Healing, Nationwide Children's Hospital, Columbus, OH, USA
- Rajesh Ganta
- Information Technology Research and Innovation, Abigail Wexner Research Institute, Nationwide Children's Hospital, Columbus, OH, USA
- Henry O Duah
- College of Nursing, University of Cincinnati, Cincinnati, OH, USA
- Islam Qasem
- College of Nursing, University of Cincinnati, Cincinnati, OH, USA
- Emre Sezgin
- Information Technology Research and Innovation, Abigail Wexner Research Institute, Nationwide Children's Hospital, Columbus, OH, USA
- Center for Biobehavioral Health, The Abigail Wexner Research Institute, Nationwide Children's Hospital, Columbus, OH, USA
3
Fan D, Miao R, Huang H, Wang X, Li S, Huang Q, Yang S, Deng R. Multimodal ischemic stroke recurrence prediction model based on the capsule neural network and support vector machine. Medicine (Baltimore) 2024; 103:e39217. [PMID: 39213233 PMCID: PMC11365640 DOI: 10.1097/md.0000000000039217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 03/06/2024] [Accepted: 07/17/2024] [Indexed: 09/04/2024]
Abstract
Ischemic stroke (IS) has a high recurrence rate. Machine learning (ML) models based on single-modal data, either biochemical tests or imaging, have been developed to predict stroke recurrence. However, the prediction accuracy of these models is not sufficiently high. Therefore, this study aimed to collect biochemical detection and magnetic resonance imaging (MRI) data to establish a dataset and propose a high-performance heterogeneous multimodal IS recurrence prediction model based on deep learning. This is a retrospective cohort study. Data were retrospectively collected from 634 IS patients in Zhuhai, China, and a 12-month follow-up was conducted to determine stroke recurrence. We propose the ischemic stroke multi-group learning (ISGL) model, an integrated model for predicting the recurrence risk of multimodal IS in patients, based on a capsule neural network and a linear support vector machine (SVM). Two capsule neural network prediction models based on T1 and T2 signals in the MRI data and an SVM prediction model based on biochemical test data were established. Finally, the final judgment of the integrated model was determined by a vote among these models. The ISGL model was compared with 6 classical ML and deep learning models: k-nearest neighbors, SVM, logistic regression, random forest, eXtreme Gradient Boosting, and visual geometry group (VGG). The results revealed that the accuracy, specificity, sensitivity and the area under the curve of the ISGL model were 95%, 96%, 94%, and 95%, respectively. Among the comparison models, the VGG method exhibited the best performance, but its performance was much lower than that of the ISGL model. Analysis of the importance of biochemical test data revealed that low-density lipoprotein, smoking, and heart disease history were the positively correlated factors, and total cholesterol, high-density lipoprotein, and diabetes were the negatively correlated factors. This study proposes the ISGL model, which uses MRI and biochemical data together to predict IS recurrence. This combination yielded better performance than the other ML models. Additionally, this study identified risk factors related to recurrence, which can be used to intervene early in high-risk patients and promote the secondary prevention of stroke.
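The voting step described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the vote is taken, so a simple hard majority vote over the three base models is assumed here; the function names, the tie-breaking rule, and the example predictions are hypothetical and are not taken from the paper.

```python
from collections import Counter

def majority_vote(votes):
    """Return the most frequent label; ties go to the smallest label (assumed rule)."""
    counts = Counter(votes)
    top = max(counts.values())
    return min(label for label, n in counts.items() if n == top)

def ensemble_predict(per_model_predictions):
    """Combine per-model label lists (one list per model) patient by patient."""
    return [majority_vote(votes) for votes in zip(*per_model_predictions)]

# Hypothetical binary outputs (1 = recurrence) for four patients.
t1_capsule = [1, 0, 1, 0]   # capsule network on T1 MRI (illustrative values)
t2_capsule = [1, 1, 0, 0]   # capsule network on T2 MRI (illustrative values)
biochem_svm = [0, 1, 1, 0]  # linear SVM on biochemical tests (illustrative values)

final = ensemble_predict([t1_capsule, t2_capsule, biochem_svm])
print(final)
```

With three binary classifiers a tie cannot occur, so the tie-breaking rule only matters if the number of base models is even.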
Affiliation(s)
- Daying Fan
- Nursing Department, The Affiliated Hospital of Zunyi Medical University, Zunyi, China
- Rui Miao
- Basic Teaching Department, Zhuhai Campus of Zunyi Medical University, Zhu Hai, China
- Hao Huang
- Neurological Department, The Affiliated Hospital of Zunyi Medical University, Zunyi, China
- Xianlin Wang
- Nursing Department, The Affiliated Hospital of Zunyi Medical University, Zunyi, China
- Siyuan Li
- Nursing Department, The Affiliated Hospital of Zunyi Medical University, Zunyi, China
- Qinghua Huang
- Basic Teaching Department, Zhuhai Campus of Zunyi Medical University, Zhu Hai, China
- Shan Yang
- Nursing Department, The Affiliated Hospital of Zunyi Medical University, Zunyi, China
- Renli Deng
- Nursing Department, The Affiliated Hospital of Zunyi Medical University, Zunyi, China
4
Ong CJ, Huang Q, Kim ISY, Pohlmann J, Chatzidakis S, Brush B, Zhang Y, Du Y, Malinger LA, Benjamin EJ, Dupuis J, Greer DM, Smirnakis SM, Trinquart L. Association of Dynamic Trajectories of Time-Series Data and Life-Threatening Mass Effect in Large Middle Cerebral Artery Stroke. Neurocrit Care 2024:10.1007/s12028-024-02036-9. [PMID: 38955931 DOI: 10.1007/s12028-024-02036-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Accepted: 06/05/2024] [Indexed: 07/04/2024]
Abstract
BACKGROUND Life-threatening, space-occupying mass effect due to cerebral edema and/or hemorrhagic transformation is an early complication of patients with middle cerebral artery stroke. Little is known about longitudinal trajectories of laboratory and vital signs leading up to radiographic and clinical deterioration related to this mass effect. METHODS We curated a retrospective data set of 635 patients with large middle cerebral artery stroke totaling 95,463 data points for 10 longitudinal covariates and 40 time-independent covariates. We assessed trajectories of the 10 longitudinal variables during the 72 h preceding three outcomes representative of life-threatening mass effect: midline shift ≥ 5 mm, pineal gland shift (PGS) > 4 mm, and decompressive hemicraniectomy (DHC). We used a "backward-looking" trajectory approach. Patients were aligned based on outcome occurrence time and the trajectory of each variable was assessed before that outcome by accounting for cases and noncases, adjusting for confounders. We evaluated longitudinal trajectories with Cox proportional time-dependent regression. RESULTS Of 635 patients, 49.0% were female, and the mean age was 69 years. Thirty-five percent of patients had midline shift ≥ 5 mm, 24.3% of patients had PGS > 4 mm, and 10.7% of patients underwent DHC. Backward-looking trajectories showed mild increases in white blood cell count (10-11 K/UL within 72 h), temperature (up to half a degree within 24 h), and sodium levels (1-3 mEq/L within 24 h) before the three outcomes of interest. We also observed a decrease in heart rate (75-65 beats per minute) 24 h before DHC. We found a significant association between increased white blood cell count with PGS > 4 mm (hazard ratio 1.05, p value 0.007). CONCLUSIONS Longitudinal profiling adjusted for confounders demonstrated that white blood cell count, temperature, and sodium levels appear to increase before radiographic and clinical indicators of space-occupying mass effect. These findings will inform the development of multivariable dynamic risk models to aid prediction of life-threatening, space-occupying mass effect.
Affiliation(s)
- Charlene J Ong
- Department of Neurology, Boston Medical Center, 1 Boston Medical Center PI, Boston, MA, 02118, USA.
- Chobanian and Avedisian School of Medicine, Boston University School of Medicine, 85 E Concord St, Boston, MA, 02118, USA.
- Department of Neurology, Brigham & Women's Hospital, Harvard Medical School, 75 Francis St, Boston, MA, 02115, USA.
- Qiuxi Huang
- Department of Epidemiology, Boston University School of Public Health, 715 Albany St, Boston, MA, 02118, USA
- Ivy So Yeon Kim
- Department of Neurology, Boston Medical Center, 1 Boston Medical Center PI, Boston, MA, 02118, USA
- Jack Pohlmann
- Department of Neurology, Boston Medical Center, 1 Boston Medical Center PI, Boston, MA, 02118, USA
- Stefanos Chatzidakis
- Department of Neurology, Brigham & Women's Hospital, Harvard Medical School, 75 Francis St, Boston, MA, 02115, USA
- Benjamin Brush
- New York University Langone Hospital and NYU Grossman School of Medicine, 550 1st Ave, New York, NY, 10016, USA
- Yihan Zhang
- Department of Neurology, Boston Medical Center, 1 Boston Medical Center PI, Boston, MA, 02118, USA
- Yili Du
- Chobanian and Avedisian School of Medicine, Boston University School of Medicine, 85 E Concord St, Boston, MA, 02118, USA
- Leigh Ann Malinger
- Chobanian and Avedisian School of Medicine, Boston University School of Medicine, 85 E Concord St, Boston, MA, 02118, USA
- Emelia J Benjamin
- Department of Epidemiology, Boston University School of Public Health, 715 Albany St, Boston, MA, 02118, USA
- Department of Cardiology, Boston Medical Center and Boston University Chobanian and Avedisian School of Medicine, 85 E Concord St, Boston, MA, 02118, USA
- Josée Dupuis
- Department of Epidemiology, Boston University School of Public Health, 715 Albany St, Boston, MA, 02118, USA
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, 2001 McGill College, Montreal, QC, Canada
- David M Greer
- Department of Neurology, Boston Medical Center, 1 Boston Medical Center PI, Boston, MA, 02118, USA
- Chobanian and Avedisian School of Medicine, Boston University School of Medicine, 85 E Concord St, Boston, MA, 02118, USA
- Stelios M Smirnakis
- Department of Neurology, Brigham & Women's Hospital, Harvard Medical School, 75 Francis St, Boston, MA, 02115, USA
- Department of Neurology, Jamaica Plain Veterans Administration Medical Center, 150 S Huntington Ave, Boston, MA, 02130, USA
- Ludovic Trinquart
- Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, 800 Washington St, Boston, MA, 02111, USA
- Tufts Clinical and Translational Science Institute, Tufts University, 419 Boston Ave, Medford, MA, 02155, USA
5
Pohlmann JE, Kim ISY, Brush B, Sambhu KM, Conti L, Saglam H, Milos K, Yu L, Cronin MFM, Balogun O, Chatzidakis S, Zhang Y, Trinquart L, Huang Q, Smirnakis SM, Benjamin EJ, Dupuis J, Greer DM, Ong CJ. Association of large core middle cerebral artery stroke and hemorrhagic transformation with hospitalization outcomes. Sci Rep 2024; 14:10008. [PMID: 38693282 PMCID: PMC11063151 DOI: 10.1038/s41598-024-60635-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Accepted: 04/25/2024] [Indexed: 05/03/2024]
Abstract
Historically, investigators have not differentiated between patients with and without hemorrhagic transformation (HT) in large core ischemic stroke at risk for life-threatening mass effect (LTME) from cerebral edema. Our objective was to determine whether LTME occurs faster in those with HT compared to those without. We conducted a two-center retrospective study of patients with ≥ 1/2 MCA territory infarct between 2006 and 2021. We tested the association of time-to-LTME and HT subtype (parenchymal, petechial) using Cox regression, controlling for age, mean arterial pressure, glucose, tissue plasminogen activator, mechanical thrombectomy, National Institutes of Health Stroke Scale, antiplatelets, anticoagulation, temperature, and stroke side. Secondary and exploratory outcomes included mass effect-related death, all-cause death, disposition, and decompressive hemicraniectomy. Of 840 patients, 358 (42.6%) had no HT, 403 (48.0%) patients had petechial HT, and 79 (9.4%) patients had parenchymal HT. LTME occurred in 317 (37.7%) and 100 (11.9%) had mass effect-related deaths. Parenchymal (HR 8.24, 95% CI 5.46-12.42, p < 0.01) and petechial HT (HR 2.47, 95% CI 1.92-3.17, p < 0.01) were significantly associated with time-to-LTME and mass effect-related death. Understanding different risk factors and sequelae of mass effect with and without HT is critical for informed clinical decisions.
Affiliation(s)
- Jack E Pohlmann
- Department of Neurology, Boston Medical Center, 1 Boston Medical Center PI, Boston, MA, 02118, USA
- Department of Epidemiology, School of Public Health, Boston University, 715 Albany St, Boston, MA, 02118, USA
- Ivy So Yeon Kim
- Department of Neurology, Boston Medical Center, 1 Boston Medical Center PI, Boston, MA, 02118, USA
- Benjamin Brush
- Department of Neurology, NYU Langone Medical Center, 550 1st Ave, New York, NY, 10016, USA
- Krishna M Sambhu
- Department of Neurology, Boston University School of Medicine, Chobanian and Avedisian School of Medicine, 85 E Concord St., Suite 1116, Boston, MA, 02118, USA
- Lucas Conti
- Department of Neurology, Boston University School of Medicine, Chobanian and Avedisian School of Medicine, 85 E Concord St., Suite 1116, Boston, MA, 02118, USA
- Hanife Saglam
- Department of Neurology, Brigham & Women's Hospital, Harvard Medical School, 75 Francis St, Boston, MA, 02115, USA
- Katie Milos
- Department of Neurology, Boston Medical Center, 1 Boston Medical Center PI, Boston, MA, 02118, USA
- Lillian Yu
- Department of Neurology, Boston Medical Center, 1 Boston Medical Center PI, Boston, MA, 02118, USA
- Michael F M Cronin
- Department of Neurology, Boston University School of Medicine, Chobanian and Avedisian School of Medicine, 85 E Concord St., Suite 1116, Boston, MA, 02118, USA
- Oluwafemi Balogun
- Department of Neurology, Boston Medical Center, 1 Boston Medical Center PI, Boston, MA, 02118, USA
- Stefanos Chatzidakis
- Department of Neurology, Brigham & Women's Hospital, Harvard Medical School, 75 Francis St, Boston, MA, 02115, USA
- Yihan Zhang
- Department of Neurology, Boston Medical Center, 1 Boston Medical Center PI, Boston, MA, 02118, USA
- Ludovic Trinquart
- Department of Epidemiology, School of Public Health, Boston University, 715 Albany St, Boston, MA, 02118, USA
- Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, 800 Washington St, Boston, MA, 02111, USA
- Tufts Clinical and Translational Science Institute, Tufts University, 419 Boston Ave, Medford, MA, 02155, USA
- Qiuxi Huang
- Department of Epidemiology, School of Public Health, Boston University, 715 Albany St, Boston, MA, 02118, USA
- Stelios M Smirnakis
- Department of Neurology, Brigham & Women's Hospital, Harvard Medical School, 75 Francis St, Boston, MA, 02115, USA
- Department of Neurology, Jamaica Plain Veterans Administration Medical Center, 150 S Huntington Ave, Boston, MA, 02130, USA
- Emelia J Benjamin
- Department of Epidemiology, School of Public Health, Boston University, 715 Albany St, Boston, MA, 02118, USA
- Department of Cardiology, Boston Medical Center and Boston University Chobanian and Avedisian School of Medicine, 85 E Concord St, Boston, MA, 02118, USA
- Josée Dupuis
- Department of Epidemiology, School of Public Health, Boston University, 715 Albany St, Boston, MA, 02118, USA
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, 2001 McGill College, Montreal, QC, Canada
- David M Greer
- Department of Neurology, Boston Medical Center, 1 Boston Medical Center PI, Boston, MA, 02118, USA
- Department of Neurology, Boston University School of Medicine, Chobanian and Avedisian School of Medicine, 85 E Concord St., Suite 1116, Boston, MA, 02118, USA
- Charlene J Ong
- Department of Neurology, Boston Medical Center, 1 Boston Medical Center PI, Boston, MA, 02118, USA.
- Department of Neurology, Boston University School of Medicine, Chobanian and Avedisian School of Medicine, 85 E Concord St., Suite 1116, Boston, MA, 02118, USA.
- Department of Neurology, Brigham & Women's Hospital, Harvard Medical School, 75 Francis St, Boston, MA, 02115, USA.
6
Song JJ, Stafford RA, Pohlmann JE, Kim ISY, Cheekati M, Dennison S, Brush B, Chatzidakis S, Huang Q, Smirnakis SM, Gilmore EJ, Mohammed S, Abdalkader M, Benjamin EJ, Dupuis J, Greer DM, Ong CJ. Later Midline Shift Is Associated with Better Outcomes after Large Middle Cerebral Artery Stroke. RESEARCH SQUARE 2024:rs.3.rs-4189278. [PMID: 38699310 PMCID: PMC11065061 DOI: 10.21203/rs.3.rs-4189278/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2024]
Abstract
Background/Objective Space-occupying cerebral edema is the most feared early complication after large ischemic stroke, occurring in up to 30% of patients with middle cerebral artery (MCA) occlusion, and is reported to peak 2-4 days after injury. Little is known about the factors and outcomes associated with peak edema timing, especially when it occurs after 96 hours. We aimed to characterize differences between patients who experienced maximum midline shift (MLS) or decompressive hemicraniectomy (DHC) in the acute (<48 hours), average (48-96 hours), and subacute (>96 hours) groups and determine whether patients with subacute peak edema timing have improved discharge dispositions. Methods We performed a two-center, retrospective study of patients with ≥1/2 MCA territory infarct and MLS. We constructed a multivariable model to test the association of subacute peak edema and favorable discharge disposition, adjusting for age, admission Alberta Stroke Program Early CT Score (ASPECTS), National Institutes of Health Stroke Scale (NIHSS), acute thrombolytic intervention, cerebral atrophy, maximum MLS, parenchymal hemorrhagic transformation, DHC, and osmotic therapy receipt. Results Of 321 eligible patients with MLS, 32%, 36%, and 32% experienced acute, average, and subacute peak edema. Subacute peak edema was significantly associated with higher odds of favorable discharge than non-subacute swelling, adjusting for confounders (aOR, 1.85; 95% CI, 1.05-3.31). Conclusions Subacute peak edema after large MCA stroke is associated with better discharge disposition compared to earlier peak edema courses. Understanding how the timing of cerebral edema affects risk of unfavorable discharge has important implications for treatment decisions and prognostication.
Affiliation(s)
- Sydney Dennison
- Department of Epidemiology, Boston University School of Public Health
- Qiuxi Huang
- Department of Neurology, Jamaica Plain Veterans Administration Medical Center
- Shariq Mohammed
- Department of Biostatistics, Boston University School of Public Health
- Emelia J Benjamin
- Department of Epidemiology, Boston University School of Public Health
- Josée Dupuis
- Department of Biostatistics, Boston University School of Public Health
- David M Greer
- Boston University Chobanian & Avedisian School of Medicine
- Charlene J Ong
- Boston University Chobanian & Avedisian School of Medicine
7
Lehnen NC, Dorn F, Wiest IC, Zimmermann H, Radbruch A, Kather JN, Paech D. Data Extraction from Free-Text Reports on Mechanical Thrombectomy in Acute Ischemic Stroke Using ChatGPT: A Retrospective Analysis. Radiology 2024; 311:e232741. [PMID: 38625006 DOI: 10.1148/radiol.232741] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/17/2024]
Abstract
Background Procedural details of mechanical thrombectomy in patients with ischemic stroke are important predictors of clinical outcome and are collected for prospective studies or national stroke registries. To date, these data are collected manually by human readers, a labor-intensive task that is prone to errors. Purpose To evaluate the use of the large language models (LLMs) GPT-4 and GPT-3.5 to extract data from neuroradiology reports on mechanical thrombectomy in patients with ischemic stroke. Materials and Methods This retrospective study included consecutive reports from patients with ischemic stroke who underwent mechanical thrombectomy between November 2022 and September 2023 at institution 1 and between September 2016 and December 2019 at institution 2. A set of 20 reports was used to optimize the prompt, and the ability of the LLMs to extract procedural data from the reports was compared using the McNemar test. Data manually extracted by an interventional neuroradiologist served as the reference standard. Results A total of 100 internal reports from 100 patients (mean age, 74.7 years ± 13.2 [SD]; 53 female) and 30 external reports from 30 patients (mean age, 72.7 years ± 13.5; 18 male) were included. All reports were successfully processed by GPT-4 and GPT-3.5. Of 2800 data entries, 2631 (94.0% [95% CI: 93.0, 94.8]; range per category, 61%-100%) data points were correctly extracted by GPT-4 without the need for further postprocessing. With 1788 of 2800 correct data entries, GPT-3.5 produced fewer correct data entries than did GPT-4 (63.9% [95% CI: 62.0, 65.6]; range per category, 14%-99%; P < .001). For the external reports, GPT-4 extracted 760 of 840 (90.5% [95% CI: 88.3, 92.4]) correct data entries, while GPT-3.5 extracted 539 of 840 (64.2% [95% CI: 60.8, 67.4]; P < .001). 
Conclusion Compared with GPT-3.5, GPT-4 more frequently extracted correct procedural data from free-text reports on mechanical thrombectomy performed in patients with ischemic stroke. © RSNA, 2024 Supplemental material is available for this article.
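The paired comparison named in this abstract (the McNemar test) can be sketched as follows. This is a minimal illustration of the exact two-sided form of the test, which uses only the discordant pairs; the abstract does not state which variant the authors applied, and it does not report discordant counts, so the function name and the counts below are hypothetical.

```python
from math import comb

def mcnemar_exact(b, c):
    """Exact two-sided McNemar p-value from discordant pair counts.

    b: entries model A got right and model B got wrong
    c: entries model B got right and model A got wrong
    Under the null hypothesis each discordant pair is a fair coin flip.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    # One-sided binomial tail P(X <= k) with X ~ Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical discordant counts, not taken from the study.
p_value = mcnemar_exact(b=10, c=0)
print(p_value)
```

Entries on which both models agree (both correct or both wrong) do not enter the statistic, which is why only the two discordant counts are needed.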
Affiliation(s)
- Nils C Lehnen
- Franziska Dorn
- Isabella C Wiest
- Hanna Zimmermann
- Alexander Radbruch
- Jakob Nikolas Kather
- Daniel Paech
- From the Department of Neuroradiology, University Hospital Bonn, Rheinische Friedrich-Wilhelms-Universität Bonn, Venusberg-Campus 1, 53127 Bonn, Germany (N.C.L., F.D., A.R., D.P.); Research Group Clinical Neuroimaging, German Center for Neurodegenerative Diseases (DZNE), Bonn, Germany (N.C.L., A.R.); Department of Medicine II, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany (I.C.W.); Else Kroener Fresenius Center for Digital Health, Medical Faculty Carl Gustav Carus, Technical University Dresden, Dresden, Germany (I.C.W., J.N.K.); Institute of Neuroradiology, University Hospital, LMU Munich, Munich, Germany (H.Z.); and Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, Mass (D.P.)
8
Abdulnazar A, Kugic A, Schulz S, Stadlbauer V, Kreuzthaler M. O2 supplementation disambiguation in clinical narratives to support retrospective COVID-19 studies. BMC Med Inform Decis Mak 2024; 24:29. [PMID: 38297364 PMCID: PMC10829265 DOI: 10.1186/s12911-024-02425-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Accepted: 01/15/2024] [Indexed: 02/02/2024]
Abstract
BACKGROUND Oxygen saturation, a key indicator of COVID-19 severity, poses challenges, especially in cases of silent hypoxemia. Electronic health records (EHRs) often contain supplemental oxygen information within clinical narratives. Streamlining patient identification based on oxygen levels is crucial for COVID-19 research, underscoring the need for automated classifiers in discharge summaries to ease the manual review burden on physicians. METHOD We analysed text lines extracted from anonymised COVID-19 patient discharge summaries in German to perform a binary classification task, differentiating patients who received oxygen supplementation and those who did not. Various machine learning (ML) algorithms, including classical ML to deep learning (DL) models, were compared. Classifier decisions were explained using Local Interpretable Model-agnostic Explanations (LIME), which visualize the model decisions. RESULT Classical ML to DL models achieved comparable performance in classification, with an F-measure varying between 0.942 and 0.955, whereas the classical ML approaches were faster. Visualisation of embedding representation of input data reveals notable variations in the encoding patterns between classic and DL encoders. Furthermore, LIME explanations provide insights into the most relevant features at token level that contribute to these observed differences. CONCLUSION Despite a general tendency towards deep learning, these use cases show that classical approaches yield comparable results at lower computational cost. Model prediction explanations using LIME in textual and visual layouts provided a qualitative explanation for the model performance.
Affiliation(s)
- Akhila Abdulnazar
- Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Graz, Austria
- CBmed GmbH - Center for Biomarker Research in Medicine, Graz, Austria
| | - Amila Kugic
- Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Graz, Austria
| | - Stefan Schulz
- Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Graz, Austria
| | - Vanessa Stadlbauer
- CBmed GmbH - Center for Biomarker Research in Medicine, Graz, Austria
- Division of Gastroenterology and Hepatology, Department of Internal Medicine, Medical University of Graz, Graz, Austria
| | - Markus Kreuzthaler
- Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Graz, Austria.
| |
|
9
|
Reichenpfader D, Müller H, Denecke K. Large language model-based information extraction from free-text radiology reports: a scoping review protocol. BMJ Open 2023; 13:e076865. [PMID: 38070902 PMCID: PMC10729196 DOI: 10.1136/bmjopen-2023-076865] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2023] [Accepted: 11/21/2023] [Indexed: 12/18/2023] Open
Abstract
INTRODUCTION Radiological imaging is one of the most frequently performed diagnostic tests worldwide. The free-text contained in radiology reports is currently only rarely used for secondary use purposes, including research and predictive analysis. However, this data might be made available by means of information extraction (IE), based on natural language processing (NLP). Recently, a new approach to NLP, large language models (LLMs), has gained momentum and continues to improve performance of IE-related tasks. The objective of this scoping review is to show the state of research regarding IE from free-text radiology reports based on LLMs, to investigate applied methods and to guide future research by showing open challenges and limitations of current approaches. To our knowledge, no systematic or scoping review of IE from radiology reports based on LLMs has been published. Existing publications are outdated and do not comprise LLM-based methods. METHODS AND ANALYSIS This protocol is designed based on the JBI Manual for Evidence Synthesis, chapter 11.2: 'Development of a scoping review protocol'. Inclusion criteria and a search strategy comprising four databases (PubMed, IEEE Xplore, Web of Science Core Collection and ACM Digital Library) are defined. Furthermore, we describe the screening process, data charting, analysis and presentation of extracted data. ETHICS AND DISSEMINATION This protocol describes the methodology of a scoping literature review and does not comprise research on or with humans, animals or their data. Therefore, no ethical approval is required. After the publication of this protocol and the conduct of the review, its results are going to be published in an open access journal dedicated to biomedical informatics/digital health.
Affiliation(s)
- Daniel Reichenpfader
- Institute for Patient-centered Digital Health, Bern University of Applied Sciences, Bern, Switzerland
| | - Henning Müller
- Department of Radiology and Medical Informatics, Université de Genève, Genève, Switzerland
- Informatics Institute, HES-SO Valais-Wallis, Sierre, Switzerland
| | - Kerstin Denecke
- Institute for Patient-centered Digital Health, Bern University of Applied Sciences, Bern, Switzerland
| |
|
10
|
Hobensack M, Song J, Oh S, Evans L, Davoudi A, Bowles KH, McDonald MV, Barrón Y, Sridharan S, Wallace AS, Topaz M. Social Risk Factors are Associated with Risk for Hospitalization in Home Health Care: A Natural Language Processing Study. J Am Med Dir Assoc 2023; 24:1874-1880.e4. [PMID: 37553081 PMCID: PMC10839109 DOI: 10.1016/j.jamda.2023.06.031] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/04/2023] [Revised: 06/23/2023] [Accepted: 06/25/2023] [Indexed: 08/10/2023]
Abstract
OBJECTIVE This study aimed to develop a natural language processing (NLP) system that identified social risk factors in home health care (HHC) clinical notes and to examine the association between social risk factors and hospitalization or an emergency department (ED) visit. DESIGN Retrospective cohort study. SETTING AND PARTICIPANTS We used standardized assessments and clinical notes from one HHC agency located in the northeastern United States. This included 86,866 episodes of care for 65,593 unique patients. Patients received HHC services between 2015 and 2017. METHODS Guided by HHC experts, we created a vocabulary of social risk factors that influence hospitalization or ED visit risk in the HHC setting. We then developed an NLP system to automatically identify social risk factors documented in clinical notes. We used an adjusted logistic regression model to examine the association between the NLP-based social risk factors and hospitalization or an ED visit. RESULTS On the basis of expert consensus, the following social risk factors emerged: Social Environment, Physical Environment, Education and Literacy, Food Insecurity, Access to Care, and Housing and Economic Circumstances. Our NLP system performed "very good" with an F score of 0.91. Approximately 4% of clinical notes (33% episodes of care) documented a social risk factor. The most frequently documented social risk factors were Physical Environment and Social Environment. Except for Housing and Economic Circumstances, all NLP-based social risk factors were associated with higher odds of hospitalization and ED visits. CONCLUSIONS AND IMPLICATIONS HHC clinicians assess and document social risk factors associated with hospitalizations and ED visits in their clinical notes. Future studies can explore the social risk factors documented in HHC to improve communication across the health care system and to predict patients at risk for being hospitalized or visiting the ED.
Affiliation(s)
| | - Jiyoun Song
- Columbia University School of Nursing, New York City, NY, USA
| | - Sungho Oh
- University of Pennsylvania School of Nursing, Philadelphia, PA, USA
| | - Lauren Evans
- Center for Home Care Policy & Research, VNS Health, New York, NY, USA
| | - Anahita Davoudi
- Center for Home Care Policy & Research, VNS Health, New York, NY, USA
| | - Kathryn H Bowles
- Center for Home Care Policy & Research, VNS Health, New York, NY, USA; Department of Biobehavioral Health Sciences, NewCourtland Center for Transitions and Health, University of Pennsylvania School of Nursing, Philadelphia, PA, USA
| | | | - Yolanda Barrón
- Center for Home Care Policy & Research, VNS Health, New York, NY, USA
| | - Sridevi Sridharan
- Center for Home Care Policy & Research, VNS Health, New York, NY, USA
| | - Andrea S Wallace
- The University of Utah College of Nursing, Salt Lake City, UT, USA
| | - Maxim Topaz
- Columbia University School of Nursing, New York City, NY, USA; Center for Home Care Policy & Research, VNS Health, New York, NY, USA; Data Science Institute, Columbia University, New York City, NY, USA
| |
|
11
|
De Rosario H, Pitarch-Corresa S, Pedrosa I, Vidal-Pedrós M, de Otto-López B, García-Mieres H, Álvarez-Rodríguez L. Applications of Natural Language Processing for the Management of Stroke Disorders: Scoping Review. JMIR Med Inform 2023; 11:e48693. [PMID: 37672328 PMCID: PMC10512117 DOI: 10.2196/48693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Revised: 07/26/2023] [Accepted: 07/28/2023] [Indexed: 09/07/2023] Open
Abstract
BACKGROUND Recent advances in natural language processing (NLP) have heightened the interest of the medical community in its application to health care in general, in particular to stroke, a medical emergency of great impact. In this rapidly evolving context, it is necessary to learn and understand the experience already accumulated by the medical and scientific community. OBJECTIVE The aim of this scoping review was to explore the studies conducted in the last 10 years using NLP to assist the management of stroke emergencies so as to gain insight on the state of the art, its main contexts of application, and the software tools that are used. METHODS Data were extracted from Scopus and Medline through PubMed, using the keywords "natural language processing" and "stroke." Primary research questions were related to the phases, contexts, and types of textual data used in the studies. Secondary research questions were related to the numerical and statistical methods and the software used to process the data. The extracted data were structured in tables and their relative frequencies were calculated. The relationships between categories were analyzed through multiple correspondence analysis. RESULTS Twenty-nine papers were included in the review, with the majority being cohort studies of ischemic stroke published in the last 2 years. The majority of papers focused on the use of NLP to assist in the diagnostic phase, followed by the outcome prognosis, using text data from diagnostic reports and in many cases annotations on medical images. The most frequent approach was based on general machine learning techniques applied to the results of relatively simple NLP methods with the support of ontologies and standard vocabularies. Although smaller in number, there has been an increasing body of studies using deep learning techniques on numerical and vectorized representations of the texts obtained with more sophisticated NLP tools. 
CONCLUSIONS Studies focused on NLP applied to stroke show specific trends that can be compared to the more general application of artificial intelligence to stroke. The purpose of using NLP is often to improve processes in a clinical context rather than to assist in the rehabilitation process. The state of the art in NLP is represented by deep learning architectures, among which Bidirectional Encoder Representations from Transformers has been found to be especially widely used in the medical field in general, and for stroke in particular, with an increasing focus on the processing of annotations on medical images.
Affiliation(s)
- Helios De Rosario
- Instituto de Biomecánica de Valencia, Universitat Politècnica de València, Valencia, Spain
| | | | - Ignacio Pedrosa
- CTIC Centro Tecnológico de la Información y la Comunicación, Gijón, Spain
| | - Marina Vidal-Pedrós
- Instituto de Biomecánica de Valencia, Universitat Politècnica de València, Valencia, Spain
| | | | | | | |
|
12
|
Miller MI, Shih LC, Kolachalama VB. Machine Learning in Clinical Trials: A Primer with Applications to Neurology. Neurotherapeutics 2023; 20:1066-1080. [PMID: 37249836 PMCID: PMC10228463 DOI: 10.1007/s13311-023-01384-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Accepted: 04/21/2023] [Indexed: 05/31/2023] Open
Abstract
We reviewed foundational concepts in artificial intelligence (AI) and machine learning (ML) and discussed ways in which these methodologies may be employed to enhance progress in clinical trials and research, with particular attention to applications in the design, conduct, and interpretation of clinical trials for neurologic diseases. We discussed ways in which ML may help to accelerate the pace of subject recruitment, provide realistic simulation of medical interventions, and enhance remote trial administration via novel digital biomarkers and therapeutics. Lastly, we provide a brief overview of the technical, administrative, and regulatory challenges that must be addressed as ML achieves greater integration into clinical trial workflows.
Affiliation(s)
- Matthew I Miller
- Department of Medicine, Boston University Chobanian & Avedisian School of Medicine, 72 E. Concord Street, Evans 636, Boston, MA, 02118, USA
| | - Ludy C Shih
- Department of Neurology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, 02118, USA
| | - Vijaya B Kolachalama
- Department of Medicine, Boston University Chobanian & Avedisian School of Medicine, 72 E. Concord Street, Evans 636, Boston, MA, 02118, USA.
- Department of Computer Science and Faculty of Computing & Data Sciences, Boston University, Boston, MA, 02115, USA.
| |
|
13
|
Hsu E, Bako AT, Potter T, Pan AP, Britz GW, Tannous J, Vahidy FS. Extraction of Radiological Characteristics From Free-Text Imaging Reports Using Natural Language Processing Among Patients With Ischemic and Hemorrhagic Stroke: Algorithm Development and Validation. JMIR AI 2023; 2:e42884. [PMID: 38875556 PMCID: PMC11041442 DOI: 10.2196/42884] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2022] [Revised: 01/10/2023] [Accepted: 04/08/2023] [Indexed: 06/16/2024]
Abstract
BACKGROUND Neuroimaging is the gold-standard diagnostic modality for all patients suspected of stroke. However, the unstructured nature of imaging reports remains a major challenge to extracting useful information from electronic health records systems. Despite the increasing adoption of natural language processing (NLP) for radiology reports, information extraction for many stroke imaging features has not been systematically evaluated. OBJECTIVE In this study, we propose an NLP pipeline, which adopts the state-of-the-art ClinicalBERT model with domain-specific pretraining and task-oriented fine-tuning to extract 13 stroke features from head computed tomography imaging notes. METHODS We used the model to generate structured data sets with information on the presence or absence of common stroke features for 24,924 patients with strokes. We compared the survival characteristics of patients with and without features of severe stroke (eg, midline shift, perihematomal edema, or mass effect) using the Kaplan-Meier curve and log-rank tests. RESULTS Pretrained on 82,073 head computed tomography notes with 13.7 million words and fine-tuned on 200 annotated notes, our HeadCT_BERT model achieved an average area under receiver operating characteristic curve of 0.9831, F1-score of 0.8683, and accuracy of 97%. Among patients with acute ischemic stroke, admissions with any severe stroke feature in initial imaging notes were associated with a lower probability of survival (P<.001). CONCLUSIONS Our proposed NLP pipeline achieved high performance and has the potential to improve medical research and patient safety.
Affiliation(s)
- Enshuo Hsu
- Center for Health Data Science and Analytics, Houston Methodist Research Institute, Houston, TX, United States
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, United States
| | - Abdulaziz T Bako
- Center for Health Data Science and Analytics, Houston Methodist Research Institute, Houston, TX, United States
| | - Thomas Potter
- Center for Health Data Science and Analytics, Houston Methodist Research Institute, Houston, TX, United States
| | - Alan P Pan
- Center for Health Data Science and Analytics, Houston Methodist Research Institute, Houston, TX, United States
| | - Gavin W Britz
- Department of Neurosurgery, Houston Methodist Neurological Institute, Houston, TX, United States
- Department of Neurology, Weill Cornell Medical College, New York, NY, United States
| | - Jonika Tannous
- Center for Health Data Science and Analytics, Houston Methodist Research Institute, Houston, TX, United States
| | - Farhaan S Vahidy
- Center for Health Data Science and Analytics, Houston Methodist Research Institute, Houston, TX, United States
- Department of Neurosurgery, Houston Methodist Neurological Institute, Houston, TX, United States
- Department of Population Health Sciences, Weill Cornell Medical College, New York, NY, United States
| |
|
14
|
Galbusera F, Cina A, Bassani T, Panico M, Sconfienza LM. Automatic Diagnosis of Spinal Disorders on Radiographic Images: Leveraging Existing Unstructured Datasets With Natural Language Processing. Global Spine J 2023; 13:1257-1266. [PMID: 34219477 PMCID: PMC10416592 DOI: 10.1177/21925682211026910] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 11/16/2022] Open
Abstract
STUDY DESIGN Retrospective study. OBJECTIVES Huge amounts of images and medical reports are being generated in radiology departments. While these datasets can potentially be employed to train artificial intelligence tools to detect findings on radiological images, the unstructured nature of the reports limits the accessibility of information. In this study, we tested if natural language processing (NLP) can be useful to generate training data for deep learning models analyzing planar radiographs of the lumbar spine. METHODS NLP classifiers based on the Bidirectional Encoder Representations from Transformers (BERT) model able to extract structured information from radiological reports were developed and used to generate annotations for a large set of radiographic images of the lumbar spine (N = 10 287). Deep learning (ResNet-18) models aimed at detecting radiological findings directly from the images were then trained and tested on a set of 204 human-annotated images. RESULTS The NLP models had accuracies between 0.88 and 0.98 and specificities between 0.84 and 0.99; 7 out of 12 radiological findings had sensitivity >0.90. The ResNet-18 models showed performances dependent on the specific radiological findings with sensitivities and specificities between 0.53 and 0.93. CONCLUSIONS NLP generates valuable data to train deep learning models able to detect radiological findings in spine images. Despite the noisy nature of reports and NLP predictions, this approach effectively mitigates the difficulties associated with the manual annotation of large quantities of data and opens the way to the era of big data for artificial intelligence in musculoskeletal radiology.
Affiliation(s)
| | - Andrea Cina
- IRCCS Istituto Ortopedico Galeazzi, Milan, Italy
| | - Tito Bassani
- IRCCS Istituto Ortopedico Galeazzi, Milan, Italy
| | - Matteo Panico
- IRCCS Istituto Ortopedico Galeazzi, Milan, Italy
- Department of Chemistry, Materials and Chemical Engineering “Giulio Natta,” Politecnico di Milano, Milan, Italy
| | - Luca Maria Sconfienza
- IRCCS Istituto Ortopedico Galeazzi, Milan, Italy
- Department of Biomedical Sciences for Health, Università degli Studi di Milano, Milan, Italy
| |
|
15
|
Yang L, Huang X, Wang J, Yang X, Ding L, Li Z, Li J. Identifying stroke-related quantified evidence from electronic health records in real-world studies. Artif Intell Med 2023; 140:102552. [PMID: 37210153 DOI: 10.1016/j.artmed.2023.102552] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/18/2021] [Revised: 02/28/2023] [Accepted: 04/11/2023] [Indexed: 05/22/2023]
Abstract
BACKGROUND Stroke is one of the leading causes of death and disability worldwide. The National Institutes of Health Stroke Scale (NIHSS) scores in electronic health records (EHRs), which quantitatively describe patients' neurological deficits in evidence-based treatment, are crucial in stroke-related clinical investigations. However, the free-text format and lack of standardization inhibit their effective use. Automatically extracting the scale scores from the clinical free text so that its potential value in real-world studies is realized has become an important goal. OBJECTIVE This study aims to develop an automated method to extract scale scores from the free text of EHRs. METHODS We propose a two-step pipeline method to identify NIHSS items and numerical scores and validate its feasibility using a freely accessible critical care database: MIMIC-III (Medical Information Mart for Intensive Care III). First, we utilize MIMIC-III to create an annotated corpus. Then, we investigate possible machine learning methods for two subtasks, NIHSS item and score recognition and item-score relation extraction. In the evaluation, we conduct both task-specific and end-to-end evaluations and compare our method with the rule-based method using precision, recall and F1 scores as evaluation metrics. RESULTS We use all available discharge summaries of stroke cases in MIMIC-III. The annotated NIHSS corpus contains 312 cases, 2929 scale items, 2774 scores and 2733 relations. The results show that the best F1-score of our method was 0.9006, which was attained by combining BERT-BiLSTM-CRF and Random Forest, and it outperformed the rule-based method (F1-score = 0.8098). In the end-to-end task, our method could successfully recognize the item "1b level of consciousness questions", the score "1" and their relation "('1b level of consciousness questions', '1', 'has value')" from the sentence "1b level of consciousness questions: said name = 1", while the rule-based method could not. 
CONCLUSIONS The two-step pipeline method we propose is an effective approach to identify NIHSS items, scores and their relations. With its help, clinical investigators can easily retrieve and access structured scale data, thereby supporting stroke-related real-world studies.
Affiliation(s)
- Lin Yang
- Institute of Medical Information and Library, Chinese Academy of Medical Sciences/Peking Union Medical College, Beijing 100020, China; Key Laboratory of Medical Information Intelligent Technology, Chinese Academy of Medical Sciences, Beijing 100020, China
| | - Xiaoshuo Huang
- Institute of Medical Information and Library, Chinese Academy of Medical Sciences/Peking Union Medical College, Beijing 100020, China; School of Health Care Technology, Dalian Neusoft University of Information, Dalian 116023, China
| | - Jiayang Wang
- Institute of Medical Information and Library, Chinese Academy of Medical Sciences/Peking Union Medical College, Beijing 100020, China
| | - Xin Yang
- China National Clinical Research Center for Neurological Diseases, Beijing Tiantan Hospital, Capital Medical University, Beijing 100070, China; National Center for Healthcare Quality Management in Neurological Diseases, Beijing Tiantan Hospital, Capital Medical University, Beijing 100070, China
| | - Lingling Ding
- China National Clinical Research Center for Neurological Diseases, Beijing Tiantan Hospital, Capital Medical University, Beijing 100070, China; Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing 100070, China
| | - Zixiao Li
- China National Clinical Research Center for Neurological Diseases, Beijing Tiantan Hospital, Capital Medical University, Beijing 100070, China; National Center for Healthcare Quality Management in Neurological Diseases, Beijing Tiantan Hospital, Capital Medical University, Beijing 100070, China; Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing 100070, China
| | - Jiao Li
- Institute of Medical Information and Library, Chinese Academy of Medical Sciences/Peking Union Medical College, Beijing 100020, China; Key Laboratory of Medical Information Intelligent Technology, Chinese Academy of Medical Sciences, Beijing 100020, China.
| |
|
16
|
Bobba PS, Sailer A, Pruneski JA, Beck S, Mozayan A, Mozayan S, Arango J, Cohan A, Chheang S. Natural language processing in radiology: Clinical applications and future directions. Clin Imaging 2023; 97:55-61. [PMID: 36889116 DOI: 10.1016/j.clinimag.2023.02.014] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/19/2022] [Revised: 02/10/2023] [Accepted: 02/20/2023] [Indexed: 03/07/2023]
Abstract
Natural language processing (NLP) is a wide range of techniques that allows computers to interact with human text. Applications of NLP in everyday life include language translation aids, chat bots, and text prediction. It has been increasingly utilized in the medical field with increased reliance on electronic health records. As findings in radiology are primarily communicated via text, the field is particularly suited to benefit from NLP based applications. Furthermore, rapidly increasing imaging volume will continue to increase burden on clinicians, emphasizing the need for improvements in workflow. In this article, we highlight the numerous non-clinical, provider focused, and patient focused applications of NLP in radiology. We also comment on challenges associated with development and incorporation of NLP based applications in radiology as well as potential future directions.
Affiliation(s)
- Pratheek S Bobba
- Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, United States
| | - Anne Sailer
- Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, United States
| | | | - Spencer Beck
- Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, United States
| | - Ali Mozayan
- Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, United States
| | - Sara Mozayan
- Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, United States
| | - Jennifer Arango
- Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, United States
| | - Arman Cohan
- Department of Computer Science, Yale University, New Haven, CT, United States
| | - Sophie Chheang
- Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, United States.
| |
|
17
|
Callahan TJ, Stefanksi AL, Ostendorf DM, Wyrwa JM, Davies SJD, Hripcsak G, Hunter LE, Kahn MG. Characterizing Patient Representations for Computational Phenotyping. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2023; 2022:319-328. [PMID: 37128436 PMCID: PMC10148332] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/03/2023]
Abstract
Patient representation learning methods create rich representations of complex data and have potential to further advance the development of computational phenotypes (CP). Currently, these methods are either applied to small predefined concept sets or all available patient data, limiting the potential for novel discovery and reducing the explainability of the resulting representations. We report on an extensive, data-driven characterization of the utility of patient representation learning methods for the purpose of CP development or automatization. We conducted ablation studies to examine the impact of patient representations, built using data from different combinations of data types and sampling windows on rare disease classification. We demonstrated that the data type and sampling window directly impact classification and clustering performance, and these results differ by rare disease group. Our results, although preliminary, exemplify the importance of and need for data-driven characterization in patient representation-based CP development pipelines.
Affiliation(s)
- Tiffany J Callahan
- Columbia University, New York, NY, 10032, USA
- University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
| | | | | | - Jordan M Wyrwa
- University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Children's Hospital Colorado, Aurora, CO, 80045, USA
| | | | | | - Lawrence E Hunter
- University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
| | - Michael G Kahn
- University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
| |
|
18
|
Nunez JJ, Leung B, Ho C, Bates AT, Ng RT. Predicting the Survival of Patients With Cancer From Their Initial Oncology Consultation Document Using Natural Language Processing. JAMA Netw Open 2023; 6:e230813. [PMID: 36848085 PMCID: PMC9972192 DOI: 10.1001/jamanetworkopen.2023.0813] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 03/01/2023] Open
Abstract
IMPORTANCE Predicting short- and long-term survival of patients with cancer may improve their care. Prior predictive models either use data with limited availability or predict the outcome of only 1 type of cancer. OBJECTIVE To investigate whether natural language processing can predict survival of patients with general cancer from a patient's initial oncologist consultation document. DESIGN, SETTING, AND PARTICIPANTS This retrospective prognostic study used data from 47 625 of 59 800 patients who started cancer care at any of the 6 BC Cancer sites located in the province of British Columbia between April 1, 2011, and December 31, 2016. Mortality data were updated until April 6, 2022, and data were analyzed from update until September 30, 2022. All patients with a medical or radiation oncologist consultation document generated within 180 days of diagnosis were included; patients seen for multiple cancers were excluded. EXPOSURES Initial oncologist consultation documents were analyzed using traditional and neural language models. MAIN OUTCOMES AND MEASURES The primary outcome was the performance of the predictive models, including balanced accuracy and receiver operating characteristics area under the curve (AUC). The secondary outcome was investigating what words the models used. RESULTS Of the 47 625 patients in the sample, 25 428 (53.4%) were female and 22 197 (46.6%) were male, with a mean (SD) age of 64.9 (13.7) years. A total of 41 447 patients (87.0%) survived 6 months, 31 143 (65.4%) survived 36 months, and 27 880 (58.5%) survived 60 months, calculated from their initial oncologist consultation. The best models achieved a balanced accuracy of 0.856 (AUC, 0.928) for predicting 6-month survival, 0.842 (AUC, 0.918) for 36-month survival, and 0.837 (AUC, 0.918) for 60-month survival, on a holdout test set. Differences in what words were important for predicting 6- vs 60-month survival were found. 
CONCLUSIONS AND RELEVANCE These findings suggest that models performed comparably with or better than previous models predicting cancer survival and that they may be able to predict survival using readily available data without focusing on 1 cancer type.
Affiliation(s)
- John-Jose Nunez
  - BC Cancer, Vancouver, British Columbia, Canada
  - Department of Computer Science, University of British Columbia, Vancouver, British Columbia, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, British Columbia, Canada
- Cheryl Ho
  - BC Cancer, Vancouver, British Columbia, Canada
- Alan T. Bates
  - BC Cancer, Vancouver, British Columbia, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, British Columbia, Canada
- Raymond T. Ng
  - Department of Computer Science, University of British Columbia, Vancouver, British Columbia, Canada
|
19
|
Kim Y, Kim JH, Kim YM, Song S, Joo HJ. Predicting medical specialty from text based on a domain-specific pre-trained BERT. Int J Med Inform 2023; 170:104956. [PMID: 36512987 PMCID: PMC9731829 DOI: 10.1016/j.ijmedinf.2022.104956] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 07/02/2022] [Revised: 11/15/2022] [Accepted: 12/03/2022] [Indexed: 12/13/2022]
Abstract
BACKGROUND Owing to the prevalence of the coronavirus disease (COVID-19), coping with clinical issues at the individual level has become important to the healthcare system. Accordingly, precise initiation of treatment after a hospital visit is required for expedited processes and effective diagnoses of outpatients. To achieve this, artificial intelligence in medical natural language processing (NLP), such as a healthcare chatbot or a clinical decision support system, can be suitable tools for an advanced clinical system. Furthermore, support for decisions on the medical specialty from the initial visit can be helpful. MATERIALS AND METHODS In this study, we propose a medical specialty prediction model from patient-side medical question text based on pre-trained bidirectional encoder representations from transformers (BERT). The dataset comprised pairs of medical question texts and labeled specialties scraped from a website for the medical question-and-answer service. The model was fine-tuned for predicting the required medical specialty labels among 27 labels from medical question texts. To demonstrate the feasibility, we conducted experiments on a real-world dataset and elaborately evaluated the predictive performance compared with four deep learning NLP models through cross-validation and test set evaluation. RESULTS The proposed model showed improved performance compared with competitive models in terms of overall specialties. In addition, we demonstrate the usefulness of the proposed model by performing case studies for visualization applications. CONCLUSION The proposed model can benefit hospital patient management and reasonable recommendations for specialties for patients.
Affiliation(s)
- Yoojoong Kim
  - School of Computer Science and Information Engineering, The Catholic University of Korea, Bucheon 14662, Republic of Korea
- Jong-Ho Kim
  - Korea University Research Institute for Medical Bigdata Science, Korea University, Seoul 02841, Republic of Korea
  - Department of Cardiology, Cardiovascular Center, Korea University College of Medicine, Seoul 02841, Republic of Korea
- Young-Min Kim
  - School of Interdisciplinary Industrial Studies, Hanyang University, Seoul 04763, Republic of Korea
  - Corresponding author
- Sanghoun Song
  - Department of Linguistics, Korea University, Seoul 02841, Republic of Korea
  - Corresponding author
- Hyung Joon Joo
  - Korea University Research Institute for Medical Bigdata Science, Korea University, Seoul 02841, Republic of Korea
  - Department of Cardiology, Cardiovascular Center, Korea University College of Medicine, Seoul 02841, Republic of Korea
  - Department of Medical Informatics, Korea University College of Medicine, Seoul 02841, Republic of Korea
  - Corresponding author
|
20
|
Natural Language Processing in Radiology: Update on Clinical Applications. J Am Coll Radiol 2022; 19:1271-1285. [PMID: 36029890 DOI: 10.1016/j.jacr.2022.06.016] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 02/21/2022] [Revised: 05/25/2022] [Accepted: 06/03/2022] [Indexed: 11/24/2022]
Abstract
Radiological reports are a valuable source of information used to guide clinical care and support research. Organizing and managing this content, however, frequently requires several manual curations due to the more common unstructured nature of the reports. However, manual review of these reports for clinical knowledge extraction is costly and time-consuming. Natural language processing (NLP) is a set of methods developed to extract structured meaning from a body of text and can be used to optimize the workflow of health care professionals. Specifically, NLP methods can help radiologists as decision support systems and improve the management of patients' medical data. In this study, we highlight the opportunities offered by NLP in the field of radiology. A comprehensive review of the most commonly used NLP methods to extract information from radiological reports and the development of tools to improve radiological workflow using this information is presented. Finally, we review the important limitations of these tools and discuss the relevant observations and trends in the application of NLP to radiology that could benefit the field in the future.
|
21
|
Gunter D, Puac-Polanco P, Miguel O, Thornhill RE, Yu AYX, Liu ZA, Mamdani M, Pou-Prom C, Aviv RI. Rule-based natural language processing for automation of stroke data extraction: a validation study. Neuroradiology 2022; 64:2357-2362. [PMID: 35913525 DOI: 10.1007/s00234-022-03029-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Accepted: 07/25/2022] [Indexed: 11/30/2022]
Abstract
PURPOSE Data extraction from radiology free-text reports is time consuming when performed manually. Recently, more automated extraction methods using natural language processing (NLP) are proposed. A previously developed rule-based NLP algorithm showed promise in its ability to extract stroke-related data from radiology reports. We aimed to externally validate the accuracy of CHARTextract, a rule-based NLP algorithm, to extract stroke-related data from free-text radiology reports. METHODS Free-text reports of CT angiography (CTA) and perfusion (CTP) studies of consecutive patients with acute ischemic stroke admitted to a regional stroke center for endovascular thrombectomy were analyzed from January 2015 to 2021. Stroke-related variables were manually extracted as reference standard from clinical reports, including proximal and distal anterior circulation occlusion, posterior circulation occlusion, presence of ischemia or hemorrhage, Alberta stroke program early CT score (ASPECTS), and collateral status. These variables were simultaneously extracted using a rule-based NLP algorithm. The NLP algorithm's accuracy, specificity, sensitivity, positive predictive value (PPV), and negative predictive value (NPV) were assessed. RESULTS The NLP algorithm's accuracy was > 90% for identifying distal anterior occlusion, posterior circulation occlusion, hemorrhage, and ASPECTS. Accuracy was 85%, 74%, and 79% for proximal anterior circulation occlusion, presence of ischemia, and collateral status respectively. The algorithm confirmed the absence of variables from radiology reports with an 87-100% accuracy. CONCLUSIONS Rule-based NLP has a moderate to good performance for stroke-related data extraction from free-text imaging reports. The algorithm's accuracy was affected by inconsistent report styles and lexicon among reporting radiologists.
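The five measures named in this abstract (accuracy, sensitivity, specificity, PPV, NPV) all derive from one 2x2 comparison of the NLP output against the manual reference standard. A minimal sketch of those definitions follows; the function name and the counts are hypothetical illustrations, not data or code from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return five standard measures for one binary variable.

    tp/fp/tn/fn are counts of NLP outputs compared with a manual
    reference standard (hypothetical inputs, not study data).
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts for a single variable, e.g. "hemorrhage present".
metrics = diagnostic_metrics(tp=40, fp=5, tn=50, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting all five together matters because accuracy alone can look high when a finding is rare; PPV and NPV additionally depend on how common the finding is in the cohort.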
Affiliation(s)
- Dane Gunter
  - The Ottawa Hospital Research Institute, Ottawa, ON, Canada
- Paulo Puac-Polanco
  - Department of Radiology, Radiation Oncology and Medical Physics, University of Ottawa, The Ottawa Hospital Civic Campus Room C110, 1053 Carling Ave, Ottawa, ON K1Y 4E9, Canada
- Olivier Miguel
  - Department of Radiology, Radiation Oncology and Medical Physics, University of Ottawa, The Ottawa Hospital Civic Campus Room C110, 1053 Carling Ave, Ottawa, ON K1Y 4E9, Canada
- Rebecca E Thornhill
  - Division of Medical Physics, Department of Radiology, Radiation Oncology and Medical Physics, University of Ottawa, Ottawa, ON, Canada
- Amy Y X Yu
  - Department of Medicine (Neurology), University of Toronto, Sunnybrook Health Sciences Centre, Toronto, ON, Canada
- Zhongyu A Liu
  - Department of Medicine (Neurology), University of Toronto, Sunnybrook Health Sciences Centre, Toronto, ON, Canada
- Muhammad Mamdani
  - Department of Medicine, Unity Health Toronto, University of Toronto, Toronto, ON, Canada
- Richard I Aviv
  - The Ottawa Hospital Research Institute, Ottawa, ON, Canada
  - Department of Radiology, Radiation Oncology and Medical Physics, University of Ottawa, The Ottawa Hospital Civic Campus Room C110, 1053 Carling Ave, Ottawa, ON K1Y 4E9, Canada
|
22
|
Miller MI, Orfanoudaki A, Cronin M, Saglam H, So Yeon Kim I, Balogun O, Tzalidi M, Vasilopoulos K, Fanaropoulou G, Fanaropoulou NM, Kalin J, Hutch M, Prescott BR, Brush B, Benjamin EJ, Shin M, Mian A, Greer DM, Smirnakis SM, Ong CJ. Natural Language Processing of Radiology Reports to Detect Complications of Ischemic Stroke. Neurocrit Care 2022; 37:291-302. [PMID: 35534660 PMCID: PMC9986939 DOI: 10.1007/s12028-022-01513-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/02/2021] [Accepted: 04/05/2022] [Indexed: 02/01/2023]
Abstract
BACKGROUND Abstraction of critical data from unstructured radiologic reports using natural language processing (NLP) is a powerful tool to automate the detection of important clinical features and enhance research efforts. We present a set of NLP approaches to identify critical findings in patients with acute ischemic stroke from radiology reports of computed tomography (CT) and magnetic resonance imaging (MRI). METHODS We trained machine learning classifiers to identify categorical outcomes of edema, midline shift (MLS), hemorrhagic transformation, and parenchymal hematoma, as well as rule-based systems (RBS) to identify intraventricular hemorrhage (IVH) and continuous MLS measurements within CT/MRI reports. Using a derivation cohort of 2289 reports from 550 individuals with acute middle cerebral artery territory ischemic strokes, we externally validated our models on reports from a separate institution as well as from patients with ischemic strokes in any vascular territory. RESULTS In all data sets, a deep neural network with pretrained biomedical word embeddings (BioClinicalBERT) achieved the highest discrimination performance for binary prediction of edema (area under precision recall curve [AUPRC] > 0.94), MLS (AUPRC > 0.98), hemorrhagic conversion (AUPRC > 0.89), and parenchymal hematoma (AUPRC > 0.76). BioClinicalBERT outperformed lasso regression (p < 0.001) for all outcomes except parenchymal hematoma (p = 0.755). Tailored RBS for IVH and continuous MLS outperformed BioClinicalBERT (p < 0.001) and linear regression, respectively (p < 0.001). CONCLUSIONS Our study demonstrates robust performance and external validity of a core NLP tool kit for identifying both categorical and continuous outcomes of ischemic stroke from unstructured radiographic text data. 
Medically tailored NLP methods have multiple important big data applications, including scalable electronic phenotyping, augmentation of clinical risk prediction models, and facilitation of automatic alert systems in the hospital setting.
Affiliation(s)
- Matthew I Miller
  - Department of Neurology, Boston University School of Medicine, 85 E. Concord St., Suite 1116, Boston, MA, 02118, USA
- Michael Cronin
  - Department of Neurology, Boston University School of Medicine, 85 E. Concord St., Suite 1116, Boston, MA, 02118, USA
- Hanife Saglam
  - Department of Neurology, West Virginia University School of Medicine, Morgantown, WV, USA
- Oluwafemi Balogun
  - Boston Medical Center, Boston, MA, USA
  - Boston University School of Public Health, Boston, MA, USA
- Maria Tzalidi
  - School of Medicine, University of Crete, Heraklion, Greece
- Nina M Fanaropoulou
  - School of Medicine, Faculty of Health Sciences, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Jack Kalin
  - Department of Neurology, Boston University School of Medicine, 85 E. Concord St., Suite 1116, Boston, MA, 02118, USA
- Meghan Hutch
  - Department of Preventive Medicine, Northwestern University, Chicago, IL, USA
  - Department of Neurology, Brigham and Women's Hospital, Boston, MA, USA
- Benjamin Brush
  - Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
- Emelia J Benjamin
  - Department of Neurology, Boston University School of Medicine, 85 E. Concord St., Suite 1116, Boston, MA, 02118, USA
  - Boston University School of Public Health, Boston, MA, USA
- Min Shin
  - Department of Computer Science, University of North Carolina at Charlotte, Charlotte, NC, USA
- Asim Mian
  - Department of Radiology, Boston Medical Center, Boston, MA, USA
- David M Greer
  - Department of Neurology, Boston University School of Medicine, 85 E. Concord St., Suite 1116, Boston, MA, 02118, USA
  - Boston Medical Center, Boston, MA, USA
- Stelios M Smirnakis
  - Department of Neurology, Brigham and Women's Hospital, Boston, MA, USA
  - Harvard Medical School, Boston, MA, USA
  - Jamaica Plain Veterans Administration Hospital, Boston, MA, USA
- Charlene J Ong
  - Department of Neurology, Boston University School of Medicine, 85 E. Concord St., Suite 1116, Boston, MA, 02118, USA
  - Boston Medical Center, Boston, MA, USA
  - Department of Neurology, Brigham and Women's Hospital, Boston, MA, USA
  - Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
  - Harvard Medical School, Boston, MA, USA
|
23
|
Torres-Lopez VM, Rovenolt GE, Olcese AJ, Garcia GE, Chacko SM, Robinson A, Gaiser E, Acosta J, Herman AL, Kuohn LR, Leary M, Soto AL, Zhang Q, Fatima S, Falcone GJ, Payabvash MS, Sharma R, Struck AF, Sheth KN, Westover MB, Kim JA. Development and Validation of a Model to Identify Critical Brain Injuries Using Natural Language Processing of Text Computed Tomography Reports. JAMA Netw Open 2022; 5:e2227109. [PMID: 35972739 PMCID: PMC9382443 DOI: 10.1001/jamanetworkopen.2022.27109] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 02/02/2022] [Accepted: 06/20/2022] [Indexed: 12/17/2022] Open
Abstract
Importance Clinical text reports from head computed tomography (CT) represent rich, incompletely utilized information regarding acute brain injuries and neurologic outcomes. CT reports are unstructured; thus, extracting information at scale requires automated natural language processing (NLP). However, designing new NLP algorithms for each individual injury category is an unwieldy proposition. An NLP tool that summarizes all injuries in head CT reports would facilitate exploration of large data sets for clinical significance of neuroradiological findings. Objective To automatically extract acute brain pathological data and their features from head CT reports. Design, Setting, and Participants This diagnostic study developed a 2-part named entity recognition (NER) NLP model to extract and summarize data on acute brain injuries from head CT reports. The model, termed BrainNERD, extracts and summarizes detailed brain injury information for research applications. Model development included building and comparing 2 NER models using a custom dictionary of terms, including lesion type, location, size, and age, then designing a rule-based decoder using NER outputs to evaluate for the presence or absence of injury subtypes. BrainNERD was evaluated against independent test data sets of manually classified reports, including 2 external validation sets. The model was trained on head CT reports from 1152 patients generated by neuroradiologists at the Yale Acute Brain Injury Biorepository. External validation was conducted using reports from 2 outside institutions. Analyses were conducted from May 2020 to December 2021. Main Outcomes and Measures Performance of the BrainNERD model was evaluated using precision, recall, and F1 scores based on manually labeled independent test data sets. Results A total of 1152 patients (mean [SD] age, 67.6 [16.1] years; 586 [52%] men), were included in the training set. 
NER training using transformer architecture and bidirectional encoder representations from transformers was significantly faster than spaCy. For all metrics, the 10-fold cross-validation performance was 93% to 99%. The final test performance metrics for the NER test data set were 98.82% (95% CI, 98.37%-98.93%) for precision, 98.81% (95% CI, 98.46%-99.06%) for recall, and 98.81% (95% CI, 98.40%-98.94%) for the F score. The expert review comparison metrics were 99.06% (95% CI, 97.89%-99.13%) for precision, 98.10% (95% CI, 97.93%-98.77%) for recall, and 98.57% (95% CI, 97.78%-99.10%) for the F score. The decoder test set metrics were 96.06% (95% CI, 95.01%-97.16%) for precision, 96.42% (95% CI, 94.50%-97.87%) for recall, and 96.18% (95% CI, 95.151%-97.16%) for the F score. Performance in external institution report validation including 1053 head CT reports was greater than 96%. Conclusions and Relevance These findings suggest that the BrainNERD model accurately extracted acute brain injury terms and their properties from head CT text reports. This freely available new tool could advance clinical research by integrating information in easily gathered head CT reports to expand knowledge of acute brain injury radiographic phenotypes.
Affiliation(s)
- Angelo J. Olcese
  - Department of Neurology, Yale University, New Haven, Connecticut
- Sarah M. Chacko
  - Department of Neurology, Yale University, New Haven, Connecticut
- Amber Robinson
  - Department of Neurology, Yale University, New Haven, Connecticut
- Edward Gaiser
  - Department of Neurology, Yale University, New Haven, Connecticut
- Julian Acosta
  - Department of Neurology, Yale University, New Haven, Connecticut
- Alison L. Herman
  - Department of Neurology, Yale University, New Haven, Connecticut
- Lindsey R. Kuohn
  - Department of Neurology, Yale University, New Haven, Connecticut
- Megan Leary
  - Department of Neurology, Yale University, New Haven, Connecticut
- Qiang Zhang
  - Department of Neurology, Yale University, New Haven, Connecticut
- Safoora Fatima
  - Department of Neurology, University of Wisconsin, Madison
- Guido J. Falcone
  - Department of Neurology, Yale University, New Haven, Connecticut
- Richa Sharma
  - Department of Neurology, Yale University, New Haven, Connecticut
- Aaron F. Struck
  - Department of Neurology, University of Wisconsin, Madison
  - William S Middleton Veterans Hospital, Madison, Wisconsin
- Kevin N. Sheth
  - Department of Neurology, Yale University, New Haven, Connecticut
- Jennifer A. Kim
  - Department of Neurology, Yale University, New Haven, Connecticut
|
24
|
Li J, Lin Y, Zhao P, Liu W, Cai L, Sun J, Zhao L, Yang Z, Song H, Lv H, Wang Z. Automatic text classification of actionable radiology reports of tinnitus patients using bidirectional encoder representations from transformer (BERT) and in-domain pre-training (IDPT). BMC Med Inform Decis Mak 2022; 22:200. [PMID: 35907966 PMCID: PMC9338483 DOI: 10.1186/s12911-022-01946-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/14/2022] [Accepted: 07/18/2022] [Indexed: 11/17/2022] Open
Abstract
Background Given the increasing number of people suffering from tinnitus, the accurate categorization of patients with actionable reports is attractive in assisting clinical decision making. However, this process requires experienced physicians and significant human labor. Natural language processing (NLP) has shown great potential in big data analytics of medical texts; yet, its application to domain-specific analysis of radiology reports is limited. Objective The aim of this study is to propose a novel approach in classifying actionable radiology reports of tinnitus patients using bidirectional encoder representations from transformer BERT-based models and evaluate the benefits of in-domain pre-training (IDPT) along with a sequence adaptation strategy. Methods A total of 5864 temporal bone computed tomography (CT) reports are labeled by two experienced radiologists as follows: (1) normal findings without notable lesions; (2) notable lesions but uncorrelated to tinnitus; and (3) at least one lesion considered as potential cause of tinnitus. We then constructed a framework consisting of deep learning (DL) neural networks and self-supervised BERT models. A tinnitus domain-specific corpus is used to pre-train the BERT model to further improve its embedding weights. In addition, we conducted an experiment to evaluate multiple groups of max sequence length settings in BERT to reduce the excessive quantity of calculations. After a comprehensive comparison of all metrics, we determined the most promising approach through the performance comparison of F1-scores and AUC values. Results In the first experiment, the BERT finetune model achieved a more promising result (AUC-0.868, F1-0.760) compared with that of the Word2Vec-based models (AUC-0.767, F1-0.733) on validation data. In the second experiment, the BERT in-domain pre-training model (AUC-0.948, F1-0.841) performed significantly better than the BERT-based model (AUC-0.868, F1-0.760).
Additionally, in the variants of BERT fine-tuning models, Mengzi achieved the highest AUC of 0.878 (F1-0.764). Finally, we found that the BERT max-sequence-length of 128 tokens achieved an AUC of 0.866 (F1-0.736), which is almost equal to the BERT max-sequence-length of 512 tokens (AUC-0.868, F1-0.760). Conclusion In conclusion, we developed a reliable BERT-based framework for tinnitus diagnosis from Chinese radiology reports, along with a sequence adaptation strategy to reduce computational resources while maintaining accuracy. The findings could provide a reference for NLP development in Chinese radiology reports. Supplementary Information The online version contains supplementary material available at 10.1186/s12911-022-01946-y.
Affiliation(s)
- Jia Li
  - Department of Radiology, Beijing Friendship Hospital, Capital Medical University, No. 95 YongAn Road, Beijing, 100050, People's Republic of China
- Yucong Lin
  - School of Medical Technology, Beijing Institute of Technology, No.5 Zhongguancun East Road, Beijing, 100050, People's Republic of China
- Pengfei Zhao
  - Department of Radiology, Beijing Friendship Hospital, Capital Medical University, No. 95 YongAn Road, Beijing, 100050, People's Republic of China
- Wenjuan Liu
  - Department of Radiology, Beijing Friendship Hospital, Capital Medical University, No. 95 YongAn Road, Beijing, 100050, People's Republic of China
- Linkun Cai
  - School of Biological Science and Medical Engineering, Beihang University, No.37 XueYuan Road, Beijing, 100191, People's Republic of China
- Jing Sun
  - Department of Radiology, Beijing Friendship Hospital, Capital Medical University, No. 95 YongAn Road, Beijing, 100050, People's Republic of China
- Lei Zhao
  - Department of Radiology, Beijing Friendship Hospital, Capital Medical University, No. 95 YongAn Road, Beijing, 100050, People's Republic of China
- Zhenghan Yang
  - Department of Radiology, Beijing Friendship Hospital, Capital Medical University, No. 95 YongAn Road, Beijing, 100050, People's Republic of China
- Hong Song
  - School of Computer Science and Technology, Beijing Institute of Technology, No. 5, South Street, Zhongguancun, Haidian District, Beijing, 100050, People's Republic of China
- Han Lv
  - Department of Radiology, Beijing Friendship Hospital, Capital Medical University, No. 95 YongAn Road, Beijing, 100050, People's Republic of China
- Zhenchang Wang
  - Department of Radiology, Beijing Friendship Hospital, Capital Medical University, No. 95 YongAn Road, Beijing, 100050, People's Republic of China
  - School of Biological Science and Medical Engineering, Beihang University, No.37 XueYuan Road, Beijing, 100191, People's Republic of China
|
25
|
Chaudhari GR, Liu T, Chen TL, Joseph GB, Vella M, Lee YJ, Vu TH, Seo Y, Rauschecker AM, McCulloch CE, Sohn JH. Application of a Domain-specific BERT for Detection of Speech Recognition Errors in Radiology Reports. Radiol Artif Intell 2022; 4:e210185. [PMID: 35923373 PMCID: PMC9344210 DOI: 10.1148/ryai.210185] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/05/2021] [Revised: 04/11/2022] [Accepted: 05/10/2022] [Indexed: 06/15/2023]
Abstract
PURPOSE To develop radiology domain-specific bidirectional encoder representations from transformers (BERT) models that can identify speech recognition (SR) errors and suggest corrections in radiology reports. MATERIALS AND METHODS A pretrained BERT model, Clinical BioBERT, was further pretrained on a corpus of 114 008 radiology reports between April 2016 and August 2019 that were retrospectively collected from two hospitals. Next, the model was fine-tuned on a training dataset of generated insertion, deletion, and substitution errors, creating Radiology BERT. This model was retrospectively evaluated on an independent dataset of radiology reports with generated errors (n = 18 885) and on unaltered report sentences (n = 2000) and prospectively evaluated on true clinical SR errors (n = 92). Correction Radiology BERT was separately trained to suggest corrections for detected deletion and substitution errors. Area under the receiver operating characteristic curve (AUC) and bootstrapped 95% CIs were calculated for each evaluation dataset. RESULTS Radiology-specific BERT had AUC values of >0.99 (95% CI: >0.99, >0.99), 0.94 (95% CI: 0.93, 0.94), 0.98 (95% CI: 0.98, 0.98), and 0.97 (95% CI: 0.97, 0.97) for detecting insertion, deletion, substitution, and all errors, respectively, on the independently generated test set. Testing on unaltered report impressions revealed a sensitivity of 82% (28 of 34; 95% CI: 70%, 93%) and specificity of 88% (1521 of 1728; 95% CI: 87%, 90%). Testing on prospective SR errors showed an accuracy of 75% (69 of 92; 95% CI: 65%, 83%). Finally, the correct word was the top suggestion for 45.6% (475 of 1041; 95% CI: 42.5%, 49.3%) of errors. CONCLUSION Radiology-specific BERT models fine-tuned on generated errors were able to identify SR errors in radiology reports and suggest corrections. Keywords: Computer Applications, Technology Assessment. Supplemental material is available for this article.
© RSNA, 2022. See also the commentary by Abajian and Cheung in this issue.
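The percentages in this abstract are printed together with their raw counts, so they can be recomputed directly. The short sketch below does only that arithmetic; the variable and function names are illustrative, and the counts are copied from the abstract text rather than from the study data.

```python
# (label, numerator, denominator, percentage as printed in the abstract)
reported = [
    ("sensitivity", 28, 34, 82.0),
    ("specificity", 1521, 1728, 88.0),
    ("accuracy on prospective SR errors", 69, 92, 75.0),
    ("correct word as top suggestion", 475, 1041, 45.6),
]


def percent(numerator, denominator):
    """Express a proportion as a percentage."""
    return 100.0 * numerator / denominator


for label, num, den, printed in reported:
    value = percent(num, den)
    print(f"{label}: {num}/{den} = {value:.1f}% (abstract: {printed}%)")
```

Each recomputed value rounds to the figure given in the abstract.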
|
26
|
Linna N, Kahn CE. Applications of Natural Language Processing in Radiology: A Systematic Review. Int J Med Inform 2022; 163:104779. [DOI: 10.1016/j.ijmedinf.2022.104779] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/07/2022] [Revised: 03/28/2022] [Accepted: 04/21/2022] [Indexed: 12/27/2022]
|
27
|
Sung SF, Hsieh CY, Hu YH. Early Prediction of Functional Outcomes After Acute Ischemic Stroke Using Unstructured Clinical Text: Retrospective Cohort Study. JMIR Med Inform 2022; 10:e29806. [PMID: 35175201 PMCID: PMC8895286 DOI: 10.2196/29806] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/28/2021] [Revised: 07/17/2021] [Accepted: 01/02/2022] [Indexed: 02/06/2023] Open
Abstract
Background Several prognostic scores have been proposed to predict functional outcomes after an acute ischemic stroke (AIS). Most of these scores are based on structured information and have been used to develop prediction models via the logistic regression method. With the increased use of electronic health records and the progress in computational power, data-driven predictive modeling by using machine learning techniques is gaining popularity in clinical decision-making. Objective We aimed to investigate whether machine learning models created by using unstructured text could improve the prediction of functional outcomes at an early stage after AIS. Methods We identified all consecutive patients who were hospitalized for the first time for AIS from October 2007 to December 2019 by using a hospital stroke registry. The study population was randomly split into a training (n=2885) and test set (n=962). Free text in histories of present illness and computed tomography reports was transformed into input variables via natural language processing. Models were trained by using the extreme gradient boosting technique to predict a poor functional outcome at 90 days poststroke. Model performance on the test set was evaluated by using the area under the receiver operating characteristic curve (AUC). Results The AUCs of text-only models ranged from 0.768 to 0.807 and were comparable to that of the model using National Institutes of Health Stroke Scale (NIHSS) scores (0.811). Models using both patient age and text achieved AUCs of 0.823 and 0.825, which were similar to those of the model containing age and NIHSS scores (0.841); the model containing preadmission comorbidities, level of consciousness, age, and neurological deficit (PLAN) scores (0.837); and the model containing Acute Stroke Registry and Analysis of Lausanne (ASTRAL) scores (0.840). 
Adding variables from clinical text improved the predictive performance of the model containing age and NIHSS scores, the model containing PLAN scores, and the model containing ASTRAL scores (the AUC increased from 0.841 to 0.861, from 0.837 to 0.856, and from 0.840 to 0.860, respectively). Conclusions Unstructured clinical text can be used to improve the performance of existing models for predicting poststroke functional outcomes. However, considering the different terminologies that are used across health systems, each individual health system may consider using the proposed methods to develop and validate its own models.
Affiliation(s)
- Sheng-Feng Sung
  - Division of Neurology, Department of Internal Medicine, Ditmanson Medical Foundation Chia-Yi Christian Hospital, Chiayi City, Taiwan
  - Department of Nursing, Min-Hwei Junior College of Health Care Management, Tainan, Taiwan
- Cheng-Yang Hsieh
  - Department of Neurology, Tainan Sin Lau Hospital, Tainan, Taiwan
- Ya-Han Hu
  - Department of Information Management, National Central University, Taoyuan City, Taiwan
|
28
|
Sung SF, Chen CH, Pan RC, Hu YH, Jeng JS. Natural Language Processing Enhances Prediction of Functional Outcome After Acute Ischemic Stroke. J Am Heart Assoc 2021; 10:e023486. [PMID: 34796719 PMCID: PMC9075227 DOI: 10.1161/jaha.121.023486] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 12/23/2022]
Abstract
Background Conventional prognostic scores usually require predefined clinical variables to predict outcome. The advancement of natural language processing has made it feasible to derive meaning from unstructured data. We aimed to test whether using unstructured text in electronic health records can improve the prediction of functional outcome after acute ischemic stroke. Methods and Results Patients hospitalized for acute ischemic stroke were identified from 2 hospital stroke registries (3847 and 2668 patients, respectively). Prediction models developed using the first cohort were externally validated using the second cohort, and vice versa. Free text in the history of present illness and computed tomography reports was used to build machine learning models using natural language processing to predict poor functional outcome at 90 days poststroke. Four conventional prognostic models were used as baseline models. The area under the receiver operating characteristic curves of the model using history of present illness in the internal and external validation sets were 0.820 and 0.792, respectively, which were comparable to the National Institutes of Health Stroke Scale score (0.811 and 0.807). The model using computed tomography reports achieved area under the receiver operating characteristic curves of 0.758 and 0.658. Adding information from clinical text significantly improved the predictive performance of each baseline model in terms of area under the receiver operating characteristic curves, net reclassification improvement, and integrated discrimination improvement indices (all P<0.001). Swapping the study cohorts led to similar results. Conclusions By using natural language processing, unstructured text in electronic health records can provide an alternative tool for stroke prognostication, and even enhance the performance of existing prognostic scores.
Affiliation(s)
- Sheng-Feng Sung
  - Division of Neurology, Department of Internal Medicine, Ditmanson Medical Foundation Chia-Yi Christian Hospital, Chiayi City, Taiwan; Department of Information Management and Institute of Healthcare Information Management, National Chung Cheng University, Chiayi County, Taiwan; Department of Nursing, Min-Hwei Junior College of Health Care Management, Tainan, Taiwan
- Chih-Hao Chen
  - Stroke Center and Department of Neurology, National Taiwan University Hospital, Taipei, Taiwan
- Ru-Chiou Pan
  - Division of Neurology, Department of Internal Medicine, Ditmanson Medical Foundation Chia-Yi Christian Hospital, Chiayi City, Taiwan
- Ya-Han Hu
  - Department of Information Management, National Central University, Taoyuan City, Taiwan
- Jiann-Shing Jeng
  - Stroke Center and Department of Neurology, National Taiwan University Hospital, Taipei, Taiwan
29
|
Artificial Intelligence: A Shifting Paradigm in Cardio-Cerebrovascular Medicine. J Clin Med 2021; 10:jcm10235710. [PMID: 34884412 PMCID: PMC8658222 DOI: 10.3390/jcm10235710] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/15/2021] [Accepted: 12/02/2021] [Indexed: 12/21/2022] Open
Abstract
The future of healthcare is an organic blend of technology, innovation, and human connection. As artificial intelligence (AI) is gradually becoming a go-to technology in healthcare to improve efficiency and outcomes, we must understand our limitations. We should realize that our goal is not only to provide faster and more efficient care, but also to deliver an integrated solution that ensures the care is fair and not biased toward any subpopulation. In this context, the field of cardio-cerebrovascular diseases, which encompasses a wide range of conditions from heart failure to stroke, has made some advances in providing assistive tools to care providers. This article aimed to provide an overall thematic review of recent developments in AI applications in cardio-cerebrovascular diseases to identify gaps and potential areas of improvement. If well designed, technological engines have the potential to improve healthcare access and equitability while reducing overall costs, diagnostic errors, and disparity in a system that affects patients and providers and strives for efficiency.
30
|
Conic RRZ, Geis C, Vincent HK. Social Determinants of Health in Physiatry: Challenges and Opportunities for Clinical Decision Making and Improving Treatment Precision. Front Public Health 2021; 9:738253. [PMID: 34858922 PMCID: PMC8632538 DOI: 10.3389/fpubh.2021.738253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2021] [Accepted: 10/11/2021] [Indexed: 11/15/2022] Open
Abstract
Physiatry is a medical specialty focused on improving functional outcomes in patients with a variety of medical conditions that affect the brain, spinal cord, peripheral nerves, muscles, bones, joints, ligaments, and tendons. Social determinants of health (SDH) play a key role in determining therapeutic process and patient functional outcomes. Big data and precision medicine have been used in other fields, and to some extent in physiatry, to predict patient outcomes; however, many challenges remain. The interplay between SDH and physiatry outcomes is highly variable depending on different phases of care, and more favorable patient profiles in acute care may be less favorable in the outpatient setting. Furthermore, SDH influence which treatments or interventional procedures are accessible to the patient and thus determine outcomes. This opinion paper describes how existing datasets, in combination with novel data such as movement, gait patterning, and patient-perceived outcomes, could be analyzed with artificial intelligence methods to determine the best treatment plan for individual patients in order to achieve maximal functional capacity.
Affiliation(s)
- Rosalynn R Z Conic
  - Department of Family Medicine and Public Health, University of California, San Diego, San Diego, CA, United States
- Carolyn Geis
  - Department of Physical Medicine and Rehabilitation, University of Florida, Gainesville, FL, United States
- Heather K Vincent
  - Department of Physical Medicine and Rehabilitation, University of Florida, Gainesville, FL, United States
31
|
Steinkamp J, Cook TS. Basic Artificial Intelligence Techniques: Natural Language Processing of Radiology Reports. Radiol Clin North Am 2021; 59:919-931. [PMID: 34689877 DOI: 10.1016/j.rcl.2021.06.003] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 10/20/2022]
Abstract
Natural language processing (NLP) is a subfield of computer science and linguistics that can be applied to extract meaningful information from radiology reports. Symbolic NLP is rule based and well suited to problems that can be explicitly defined by a set of rules. Statistical NLP is better situated to problems that cannot be well defined and requires annotated or labeled examples from which machine learning algorithms can infer the rules. Both symbolic and statistical NLP have found success in a variety of radiology use cases. More recently, deep learning approaches, including transformers, have gained traction and demonstrated good performance.
Affiliation(s)
- Jackson Steinkamp
  - Department of Medicine, Hospital of the University of Pennsylvania, 3400 Spruce Street, Philadelphia, PA 19104, USA
- Tessa S Cook
  - Perelman School of Medicine at the University of Pennsylvania, 3400 Spruce Street, 1 Silverstein Radiology, Philadelphia, PA 19104, USA
32
|
Olthof AW, van Ooijen PMA, Cornelissen LJ. Deep Learning-Based Natural Language Processing in Radiology: The Impact of Report Complexity, Disease Prevalence, Dataset Size, and Algorithm Type on Model Performance. J Med Syst 2021; 45:91. [PMID: 34480231 PMCID: PMC8416876 DOI: 10.1007/s10916-021-01761-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/30/2021] [Accepted: 08/04/2021] [Indexed: 12/12/2022]
Abstract
In radiology, natural language processing (NLP) allows the extraction of valuable information from radiology reports. It can be used for various downstream tasks such as quality improvement, epidemiological research, and monitoring guideline adherence. Class imbalance, variation in dataset size, variation in report complexity, and algorithm type all influence NLP performance but have not yet been systematically and interrelatedly evaluated. In this study, we investigate the effect of these factors on the performance of four types of deep learning-based NLP: a fully connected neural network (Dense), a long short-term memory recurrent neural network (LSTM), a convolutional neural network (CNN), and Bidirectional Encoder Representations from Transformers (BERT). Two datasets consisting of radiologist-annotated reports of both trauma radiographs (n = 2469) and chest radiographs and computed tomography (CT) studies (n = 2255) were split into training sets (80%) and testing sets (20%). The training data were used to train all four model types in 84 experiments (Fracture-data) and 45 experiments (Chest-data) with variation in size and prevalence. Performance was evaluated on sensitivity, specificity, positive predictive value, negative predictive value, area under the curve, and F score. After the NLP of radiology reports, all four model architectures demonstrated high performance, with metrics up to > 0.90. CNN, LSTM, and Dense were outperformed by the BERT algorithm because of its stable results despite variation in training size and prevalence. Awareness of variation in prevalence is warranted because it impacts sensitivity and specificity in opposite directions.
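The evaluation metrics named in this abstract all derive from the four cells of a binary confusion matrix. The following is a minimal sketch of those definitions, not the study's evaluation code; the function name `binary_metrics`, the choice of F1 as the "F score", and the toy counts are assumptions made for illustration.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts.

    tp/fn: positive reports classified correctly/incorrectly.
    tn/fp: negative reports classified correctly/incorrectly.
    """
    sensitivity = _ratio(tp, tp + fn)   # recall on positive reports
    specificity = _ratio(tn, tn + fp)   # recall on negative reports
    ppv = _ratio(tp, tp + fp)           # positive predictive value (precision)
    npv = _ratio(tn, tn + fn)           # negative predictive value
    f1 = _ratio(2 * ppv * sensitivity, ppv + sensitivity)  # assumed F1 variant
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Hypothetical counts for a 200-report test set with 25% prevalence.
toy = binary_metrics(tp=40, fp=10, tn=140, fn=10)
```

Sensitivity and specificity are each computed within one true class, whereas PPV and NPV mix both classes, which is why the latter pair shifts with test-set prevalence even for a fixed classifier.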
Affiliation(s)
- A W Olthof
  - Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Hanzeplein 1, Groningen, The Netherlands; Treant Health Care Group, Department of Radiology, Dr G.H. Amshoffweg 1, Hoogeveen, The Netherlands; Hospital Group Twente (ZGT), Department of Radiology, Almelo, The Netherlands
- P M A van Ooijen
  - Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Hanzeplein 1, Groningen, The Netherlands; Data Science Center in Health (DASH), University of Groningen, University Medical Center Groningen, Machine Learning Lab, L.J. Zielstraweg 2, Groningen, The Netherlands
- L J Cornelissen
  - Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Hanzeplein 1, Groningen, The Netherlands; COSMONiO Imaging BV, L.J. Zielstraweg 2, Groningen, The Netherlands
33
|
Mozayan A, Fabbri AR, Maneevese M, Tocino I, Chheang S. Practical Guide to Natural Language Processing for Radiology. Radiographics 2021; 41:1446-1453. [PMID: 34469212 DOI: 10.1148/rg.2021200113] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Indexed: 01/04/2023]
Abstract
Natural language processing (NLP) is the subset of artificial intelligence focused on the computer interpretation of human language. It is an invaluable tool in the analysis, aggregation, and simplification of free text. It has already demonstrated significant potential in the analysis of radiology reports. There are abundant open-source libraries and tools available that facilitate its application to the benefit of radiology. Radiologists who understand its limitations and potential will be better positioned to evaluate NLP models, understand how they can improve clinical workflow, and facilitate research endeavors involving large amounts of human language. The advent of increasingly affordable and powerful computer processing, the large quantities of medical and radiologic data, and advances in machine learning algorithms have contributed to the large potential of NLP. In turn, radiology has significant potential to benefit from the ability of NLP to convert relatively standardized radiology reports to machine-readable data. NLP benefits from standardized reporting, but because of its ability to interpret free text by using context clues, NLP does not necessarily depend on it. An overview and practical approach to NLP is featured, with specific emphasis on its applications to radiology. A brief history of NLP, the strengths and challenges inherent to its use, and freely available resources and tools are covered to guide further exploration and study within the field. Particular attention is devoted to the recent development of the Word2Vec and BERT (Bidirectional Encoder Representations from Transformers) language models, which have exponentially increased the power and utility of NLP for a variety of applications. Online supplemental material is available for this article. ©RSNA, 2021.
Affiliation(s)
- Ali Mozayan
  - From the Department of Radiology and Biomedical Imaging, Yale School of Medicine, PO Box 208042, Tompkins East 2, New Haven, CT 06520 (A.M., M.M., I.T., S.C.); and Department of Computer Science, Yale University, New Haven, Conn (A.R.F.)
- Alexander R Fabbri
  - From the Department of Radiology and Biomedical Imaging, Yale School of Medicine, PO Box 208042, Tompkins East 2, New Haven, CT 06520 (A.M., M.M., I.T., S.C.); and Department of Computer Science, Yale University, New Haven, Conn (A.R.F.)
- Michelle Maneevese
  - From the Department of Radiology and Biomedical Imaging, Yale School of Medicine, PO Box 208042, Tompkins East 2, New Haven, CT 06520 (A.M., M.M., I.T., S.C.); and Department of Computer Science, Yale University, New Haven, Conn (A.R.F.)
- Irena Tocino
  - From the Department of Radiology and Biomedical Imaging, Yale School of Medicine, PO Box 208042, Tompkins East 2, New Haven, CT 06520 (A.M., M.M., I.T., S.C.); and Department of Computer Science, Yale University, New Haven, Conn (A.R.F.)
- Sophie Chheang
  - From the Department of Radiology and Biomedical Imaging, Yale School of Medicine, PO Box 208042, Tompkins East 2, New Haven, CT 06520 (A.M., M.M., I.T., S.C.); and Department of Computer Science, Yale University, New Haven, Conn (A.R.F.)
34
|
Olthof AW, Shouche P, Fennema EM, IJpma FFA, Koolstra RHC, Stirler VMA, van Ooijen PMA, Cornelissen LJ. Machine learning based natural language processing of radiology reports in orthopaedic trauma. Comput Methods Programs Biomed 2021; 208:106304. [PMID: 34333208 DOI: 10.1016/j.cmpb.2021.106304] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 05/22/2020] [Accepted: 07/18/2021] [Indexed: 06/13/2023]
Abstract
OBJECTIVES To compare different Machine Learning (ML) Natural Language Processing (NLP) methods to classify radiology reports in orthopaedic trauma for the presence of injuries. Assessing NLP performance is a prerequisite for downstream tasks and therefore of importance from a clinical perspective (avoiding missed injuries, quality check, insight into diagnostic yield) as well as from a research perspective (identification of patient cohorts, annotation of radiographs). METHODS Datasets of Dutch radiology reports of injured extremities (n = 2469, 33% fractures) and chest radiographs (n = 799, 20% pneumothorax) were collected in two different hospitals and labeled by radiologists and trauma surgeons for the presence or absence of injuries. NLP classification was applied and optimized by testing different preprocessing steps and different classifiers (Rule-based, ML, and Bidirectional Encoder Representations from Transformers (BERT)). Performance was assessed by F1-score, AUC, sensitivity, specificity and accuracy. RESULTS The deep learning-based BERT model outperforms all other classification methods that were assessed. The model achieved an F1-score of (95 ± 2)% and accuracy of (96 ± 1)% on a dataset of simple reports (n = 2469), and an F1-score of (83 ± 7)% with accuracy of (93 ± 2)% on a dataset of complex reports (n = 799). CONCLUSION BERT NLP outperforms traditional ML and rule-based classifiers when applied to Dutch radiology reports in orthopaedic trauma.
Affiliation(s)
- A W Olthof
  - Department of Radiology, Treant Health Care Group, Dr. G.H. Amshoffweg 1, Hoogeveen, the Netherlands; Department of Radiation Oncology, University Medical Center Groningen, University of Groningen, Hanzeplein 1, Groningen, the Netherlands
- P Shouche
  - Department of Radiation Oncology, University Medical Center Groningen, University of Groningen, Hanzeplein 1, Groningen, the Netherlands
- E M Fennema
  - Department of Trauma Surgery, University Medical Center Groningen, University of Groningen, Hanzeplein 1, Groningen, the Netherlands
- F F A IJpma
  - Department of Trauma Surgery, University Medical Center Groningen, University of Groningen, Hanzeplein 1, Groningen, the Netherlands
- R H C Koolstra
  - Department of Radiology, Treant Health Care Group, Dr. G.H. Amshoffweg 1, Hoogeveen, the Netherlands
- V M A Stirler
  - Department of Trauma Surgery, University Medical Center Groningen, University of Groningen, Hanzeplein 1, Groningen, the Netherlands
- P M A van Ooijen
  - Department of Radiation Oncology, University Medical Center Groningen, University of Groningen, Hanzeplein 1, Groningen, the Netherlands; Machine Learning Lab, Data Science Center in Health (DASH), University Medical Center Groningen, University of Groningen, L.J. Zielstraweg 2, Groningen, the Netherlands
- L J Cornelissen
  - Department of Radiation Oncology, University Medical Center Groningen, University of Groningen, Hanzeplein 1, Groningen, the Netherlands; COSMONiO Imaging BV, L.J. Zielstraweg 2, Groningen, the Netherlands
35
|
Predicting short and long-term mortality after acute ischemic stroke using EHR. J Neurol Sci 2021; 427:117560. [PMID: 34218182 DOI: 10.1016/j.jns.2021.117560] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 05/22/2021] [Revised: 06/21/2021] [Accepted: 06/25/2021] [Indexed: 12/14/2022]
Abstract
OBJECTIVE Despite improvements in treatment, stroke remains a leading cause of mortality and long-term disability. In this study, we leveraged administrative data to build predictive models of short- and long-term post-stroke all-cause mortality. METHODS The study was conducted and reported according to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) guideline. We used patient-level data from electronic health records, three algorithms, and six prediction windows to develop models for post-stroke mortality. RESULTS We included 7144 patients, of whom 5347 had survived their ischemic stroke after two years. The proportion of mortality ranged from 8% (605/7144) within 1 month to 25% (1797/7144) for the 2-year window. The three most common comorbidities were hypertension, dyslipidemia, and diabetes. The best area under the ROC curve (AUROC) was reached with the Random Forest model at 0.82 for the 1-month prediction window. The negative predictive value (NPV) was highest for the shorter prediction windows (0.91 for the 1-month window), and the best positive predictive value (PPV) was reached for the 6-month prediction window at 0.92. Age, hemoglobin levels, and body mass index were the top associated factors. Laboratory variables had higher importance when compared to past medical history and comorbidities. Hypercoagulation state, smoking, and end-stage renal disease were more strongly associated with long-term mortality. CONCLUSION All the selected algorithms could be trained to predict the short- and long-term mortality after stroke. The factors associated with mortality differed depending on the prediction window. Our classifier highlighted the importance of controlling risk factors, as indicated by laboratory measures.
36
|
Casey A, Davidson E, Poon M, Dong H, Duma D, Grivas A, Grover C, Suárez-Paniagua V, Tobin R, Whiteley W, Wu H, Alex B. A systematic review of natural language processing applied to radiology reports. BMC Med Inform Decis Mak 2021; 21:179. [PMID: 34082729 PMCID: PMC8176715 DOI: 10.1186/s12911-021-01533-7] [Citation(s) in RCA: 64] [Impact Index Per Article: 21.3] [Received: 02/09/2021] [Accepted: 05/17/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Natural language processing (NLP) has a significant role in advancing healthcare and has been found to be key in extracting structured information from radiology reports. Understanding recent developments in NLP application to radiology is of significance but recent reviews on this are limited. This study systematically assesses and quantifies recent literature in NLP applied to radiology reports. METHODS We conduct an automated literature search yielding 4836 results using automated filtering, metadata enriching steps and citation search combined with manual review. Our analysis is based on 21 variables including radiology characteristics, NLP methodology, performance, study, and clinical application characteristics. RESULTS We present a comprehensive analysis of the 164 publications retrieved with publications in 2019 almost triple those in 2015. Each publication is categorised into one of 6 clinical application categories. Deep learning use increases in the period but conventional machine learning approaches are still prevalent. Deep learning remains challenged when data is scarce and there is little evidence of adoption into clinical practice. Despite 17% of studies reporting greater than 0.85 F1 scores, it is hard to comparatively evaluate these approaches given that most of them use different datasets. Only 14 studies made their data and 15 their code available with 10 externally validating results. CONCLUSIONS Automated understanding of clinical narratives of the radiology reports has the potential to enhance the healthcare process and we show that research in this field continues to grow. Reproducibility and explainability of models are important if the domain is to move applications into clinical use. More could be done to share code enabling validation of methods on different institutional data and to reduce heterogeneity in reporting of study properties allowing inter-study comparisons. 
Our results have significance for researchers in the field providing a systematic synthesis of existing work to build on, identify gaps, opportunities for collaboration and avoid duplication.
Affiliation(s)
- Arlene Casey
  - School of Literatures, Languages and Cultures (LLC), University of Edinburgh, Edinburgh, Scotland
- Emma Davidson
  - Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, Scotland
- Michael Poon
  - Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, Scotland
- Hang Dong
  - Centre for Medical Informatics, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, Scotland
  - Health Data Research UK, London, UK
- Daniel Duma
  - School of Literatures, Languages and Cultures (LLC), University of Edinburgh, Edinburgh, Scotland
- Andreas Grivas
  - Institute for Language, Cognition and Computation, School of informatics, University of Edinburgh, Edinburgh, Scotland
- Claire Grover
  - Institute for Language, Cognition and Computation, School of informatics, University of Edinburgh, Edinburgh, Scotland
- Víctor Suárez-Paniagua
  - Centre for Medical Informatics, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, Scotland
  - Health Data Research UK, London, UK
- Richard Tobin
  - Institute for Language, Cognition and Computation, School of informatics, University of Edinburgh, Edinburgh, Scotland
- William Whiteley
  - Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, Scotland
  - Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Honghan Wu
  - Health Data Research UK, London, UK
  - Institute of Health Informatics, University College London, London, UK
- Beatrice Alex
  - School of Literatures, Languages and Cultures (LLC), University of Edinburgh, Edinburgh, Scotland
  - Edinburgh Futures Institute, University of Edinburgh, Edinburgh, Scotland
37
|
Datta S, Khanpara S, Riascos RF, Roberts K. Leveraging Spatial Information in Radiology Reports for Ischemic Stroke Phenotyping. AMIA Jt Summits Transl Sci Proc 2021; 2021:170-179. [PMID: 34457131 PMCID: PMC8378604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Classifying fine-grained ischemic stroke phenotypes relies on identifying important clinical information. Radiology reports provide relevant information with context to determine such phenotype information. We focus on stroke phenotypes with location-specific information: brain region affected, laterality, stroke stage, and lacunarity. We use an existing fine-grained spatial information extraction system, Rad-SpatialNet, to identify clinically important information and apply simple domain rules on the extracted information to classify phenotypes. The performance of our proposed approach is promising (recall of 89.62% for classifying brain region and 74.11% for classifying brain region, side, and stroke stage together). Our work demonstrates that an information extraction system based on a fine-grained schema can be utilized to determine complex phenotypes with the inclusion of simple domain rules. These phenotypes have the potential to facilitate stroke research focusing on post-stroke outcome and treatment planning based on the stroke location.
Affiliation(s)
- Surabhi Datta
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX
- Shekhar Khanpara
  - McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX
- Roy F Riascos
  - McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX
- Kirk Roberts
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX
38
|
Park H, Song M, Lee EB, Seo BK, Choi CM. An Attention Model With Transfer Embeddings to Classify Pneumonia-Related Bilingual Imaging Reports: Algorithm Development and Validation. JMIR Med Inform 2021; 9:e24803. [PMID: 33820755 PMCID: PMC8167619 DOI: 10.2196/24803] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/06/2020] [Revised: 12/21/2020] [Accepted: 04/04/2021] [Indexed: 02/01/2023] Open
Abstract
BACKGROUND In the analysis of electronic health records, proper labeling of outcomes is mandatory. To obtain proper information from radiologic reports, several studies were conducted to classify radiologic reports using deep learning. However, the classification of pneumonia in bilingual radiologic reports has not been conducted previously. OBJECTIVE The aim of this research was to classify radiologic reports into pneumonia or no pneumonia using a deep learning method. METHODS A data set of radiology reports for chest computed tomography and chest x-rays of surgical patients from January 2008 to January 2018 in the Asan Medical Center in Korea was retrospectively analyzed. The classification performance of our long short-term memory (LSTM)-Attention model was compared with various deep learning and machine learning methods. The area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve, sensitivity, specificity, accuracy, and F1 score for the models were compared. RESULTS A total of 5450 radiologic reports were included that contained at least one pneumonia-related word. In the test set (n=1090), our proposed model showed 91.01% (992/1090) accuracy (AUROCs for negative, positive, and obscure were 0.98, 0.97, and 0.90, respectively). The top 3 performances of the models were based on FastText or LSTM. The convolutional neural network-based model showed a lower accuracy 73.03% (796/1090) than the other 2 algorithms. The classification of negative results had an F1 score of 0.96, whereas the classification of positive and uncertain results showed a lower performance (positive F1 score 0.83; uncertain F1 score 0.62). In the extra-validation set, our model showed 80.0% (642/803) accuracy (AUROCs for negative, positive, and obscure were 0.92, 0.96, and 0.84, respectively). CONCLUSIONS Our method showed excellent performance in classifying pneumonia in bilingual radiologic reports. 
The method could enrich the research on pneumonia by obtaining exact outcomes from electronic health data.
Affiliation(s)
- Hyung Park
  - Department of Pulmonary and Critical Care Medicine, Asan Medical Center, Seoul, Republic of Korea
- Min Song
  - Yonsei University, Seoul, Republic of Korea
- Chang Min Choi
  - Department of Pulmonary and Critical Care Medicine, Asan Medical Center, Seoul, Republic of Korea; Department of Oncology, Asan Medical Center, Seoul, Republic of Korea
39
|
Sreekrishnan A, Ong CJ, Mahajan R, Prescott B, Smirnakis SM, Bevers MB, Feske SK, Snider SB. Subcortical Sparing Associated with Ambulatory Independence after Hemicraniectomy for Malignant Infarction. J Stroke Cerebrovasc Dis 2021; 30:105850. [PMID: 34000606 DOI: 10.1016/j.jstrokecerebrovasdis.2021.105850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2021] [Accepted: 04/20/2021] [Indexed: 11/12/2022] Open
Affiliation(s)
- Anirudh Sreekrishnan
  - Division of Neurocritical Care, Department of Neurology, Brigham & Women's Hospital, 75 Francis St, 02115 Boston, MA, USA
- Charlene J Ong
  - Division of Neurocritical Care, Department of Neurology, Boston Medical Center, Boston, MA, USA
- Rahul Mahajan
  - Division of Neurocritical Care, Department of Neurology, Brigham & Women's Hospital, 75 Francis St, 02115 Boston, MA, USA
- Brenton Prescott
  - Division of Neurocritical Care, Department of Neurology, Boston Medical Center, Boston, MA, USA
- Stelios M Smirnakis
  - Division of Neurocritical Care, Department of Neurology, Brigham & Women's Hospital, 75 Francis St, 02115 Boston, MA, USA
- Matthew B Bevers
  - Division of Neurocritical Care, Department of Neurology, Brigham & Women's Hospital, 75 Francis St, 02115 Boston, MA, USA
- Steven K Feske
  - Division of Neurocritical Care, Department of Neurology, Brigham & Women's Hospital, 75 Francis St, 02115 Boston, MA, USA
- Samuel B Snider
  - Division of Neurocritical Care, Department of Neurology, Brigham & Women's Hospital, 75 Francis St, 02115 Boston, MA, USA
40
|
Wiggins WF, Kitamura F, Santos I, Prevedello LM. Natural Language Processing of Radiology Text Reports: Interactive Text Classification. Radiol Artif Intell 2021; 3:e210035. [PMID: 34350414 DOI: 10.1148/ryai.2021210035] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/26/2021] [Revised: 04/15/2021] [Accepted: 04/22/2021] [Indexed: 11/11/2022]
Abstract
This report presents a hands-on introduction to natural language processing (NLP) of radiology reports with deep neural networks in Google Colaboratory (Colab) to introduce readers to the rapidly evolving field of NLP. The implementation of the Google Colab notebook was designed with code hidden to facilitate learning for noncoders (ie, individuals with little or no computer programming experience). The data used for this module are the corpus of radiology reports from the Indiana University chest x-ray collection available from the National Library of Medicine's Open-I service. The module guides learners through the process of exploring the data, splitting the data for model training and testing, preparing the data for NLP analysis, and training a deep NLP model to classify the reports as normal or abnormal. Concepts in NLP, such as tokenization, numericalization, language modeling, and word embeddings, are demonstrated in the module. The module is implemented in a guided fashion with the authors presenting the material and explaining concepts. Interactive features and extensive text commentary are provided directly in the notebook to facilitate self-guided learning and experimentation with the module. Keywords: Neural Networks, Negative Expression Recognition, Natural Language Processing, Computer Applications, Informatics © RSNA, 2021.
Affiliation(s)
- Walter F Wiggins
- Department of Radiology, Duke University Health System, Duke University Hospital, Box 3808, 2301 Erwin Rd, Durham, NC 27710 (W.F.W.); Department of Diagnostic Imaging, Universidade Federal de São Paulo, Escola Paulista de Medicina, São Paulo, Brazil (F.K., I.S.); Head of AI, Diagnósticos da América SA (DASA), São Paulo, Brazil (F.K.); FIDI, NESS Health, São Paulo, Brazil (I.S.); and Department of Radiology, Ohio State University, Columbus, Ohio (L.M.P.)
- Felipe Kitamura
- Department of Radiology, Duke University Health System, Duke University Hospital, Box 3808, 2301 Erwin Rd, Durham, NC 27710 (W.F.W.); Department of Diagnostic Imaging, Universidade Federal de São Paulo, Escola Paulista de Medicina, São Paulo, Brazil (F.K., I.S.); Head of AI, Diagnósticos da América SA (DASA), São Paulo, Brazil (F.K.); FIDI, NESS Health, São Paulo, Brazil (I.S.); and Department of Radiology, Ohio State University, Columbus, Ohio (L.M.P.)
- Igor Santos
- Department of Radiology, Duke University Health System, Duke University Hospital, Box 3808, 2301 Erwin Rd, Durham, NC 27710 (W.F.W.); Department of Diagnostic Imaging, Universidade Federal de São Paulo, Escola Paulista de Medicina, São Paulo, Brazil (F.K., I.S.); Head of AI, Diagnósticos da América SA (DASA), São Paulo, Brazil (F.K.); FIDI, NESS Health, São Paulo, Brazil (I.S.); and Department of Radiology, Ohio State University, Columbus, Ohio (L.M.P.)
- Luciano M Prevedello
- Department of Radiology, Duke University Health System, Duke University Hospital, Box 3808, 2301 Erwin Rd, Durham, NC 27710 (W.F.W.); Department of Diagnostic Imaging, Universidade Federal de São Paulo, Escola Paulista de Medicina, São Paulo, Brazil (F.K., I.S.); Head of AI, Diagnósticos da América SA (DASA), São Paulo, Brazil (F.K.); FIDI, NESS Health, São Paulo, Brazil (I.S.); and Department of Radiology, Ohio State University, Columbus, Ohio (L.M.P.)
|
41
|
Yu AYX, Liu ZA, Pou-Prom C, Lopes K, Kapral MK, Aviv RI, Mamdani M. Automating Stroke Data Extraction From Free-Text Radiology Reports Using Natural Language Processing: Instrument Validation Study. JMIR Med Inform 2021; 9:e24381. [PMID: 33944791 PMCID: PMC8132979 DOI: 10.2196/24381] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/16/2020] [Revised: 11/10/2020] [Accepted: 04/16/2021] [Indexed: 01/06/2023] Open
Abstract
BACKGROUND Diagnostic neurovascular imaging data are important in stroke research, but obtaining these data typically requires laborious manual chart reviews. OBJECTIVE We aimed to determine the accuracy of a natural language processing (NLP) approach to extract information on the presence and location of vascular occlusions as well as other stroke-related attributes based on free-text reports. METHODS From the full reports of 1320 consecutive computed tomography (CT), CT angiography, and CT perfusion scans of the head and neck performed at a tertiary stroke center between October 2017 and January 2019, we manually extracted data on the presence of proximal large vessel occlusion (primary outcome), as well as distal vessel occlusion, ischemia, hemorrhage, Alberta stroke program early CT score (ASPECTS), and collateral status (secondary outcomes). Reports were randomly split into training (n=921) and validation (n=399) sets, and attributes were extracted using rule-based NLP. We reported the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and the overall accuracy of the NLP approach relative to the manually extracted data. RESULTS The overall prevalence of large vessel occlusion was 12.2%. In the training sample, the NLP approach identified this attribute with an overall accuracy of 97.3% (95.5% sensitivity, 98.1% specificity, 84.1% PPV, and 99.4% NPV). In the validation set, the overall accuracy was 95.2% (90.0% sensitivity, 97.4% specificity, 76.3% PPV, and 98.5% NPV). The accuracy of identifying distal or basilar occlusion as well as hemorrhage was also high, but there were limitations in identifying cerebral ischemia, ASPECTS, and collateral status. CONCLUSIONS NLP may improve the efficiency of large-scale imaging data collection for stroke surveillance and research.
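The five measures reported in this abstract all derive from the four cells of a confusion matrix. The following is a minimal sketch of that arithmetic, not the authors' code; the function name `diagnostic_metrics` and the counts passed to it are hypothetical illustrations and are not data from the study. It assumes every denominator is non-zero.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard diagnostic accuracy measures from confusion-matrix counts.

    tp/fp/tn/fn are counts of true positives, false positives,
    true negatives and false negatives against a manual reference standard.
    """
    return {
        "sensitivity": tp / (tp + fn),   # share of true cases the NLP flagged
        "specificity": tn / (tn + fp),   # share of non-cases the NLP cleared
        "ppv": tp / (tp + fp),           # share of flagged reports that were true cases
        "npv": tn / (tn + fn),           # share of cleared reports that were non-cases
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts chosen for illustration only (not taken from the study).
metrics = diagnostic_metrics(tp=90, fp=20, tn=880, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With a low-prevalence attribute, PPV tends to fall well below sensitivity and specificity because even a few false positives are large relative to the number of true cases, which is the pattern the reported figures show.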
Affiliation(s)
- Amy Y X Yu
- Department of Medicine (Neurology), University of Toronto - Sunnybrook Health Sciences Centre, Toronto, ON, Canada
- Zhongyu A Liu
- Department of Medicine (Neurology), University of Toronto - Sunnybrook Health Sciences Centre, Toronto, ON, Canada
- Kaitlyn Lopes
- Department of Medicine (Neurology), University of Toronto - Sunnybrook Health Sciences Centre, Toronto, ON, Canada
- Moira K Kapral
- Department of Medicine (General Internal Medicine), University of Toronto - University Health Network, Toronto, ON, Canada
- Richard I Aviv
- Department of Radiology, Division of Neuroradiology, University of Ottawa, Ottawa, ON, Canada
- Muhammad Mamdani
- Department of Medicine, Unity Health Toronto, University of Toronto, Toronto, ON, Canada
|
42
|
Velagapudi L, Mouchtouris N, Baldassari MP, Nauheim D, Khanna O, Saiegh FA, Herial N, Gooch MR, Tjoumakaris S, Rosenwasser RH, Jabbour P. Discrepancies in Stroke Distribution and Dataset Origin in Machine Learning for Stroke. J Stroke Cerebrovasc Dis 2021; 30:105832. [PMID: 33940363 DOI: 10.1016/j.jstrokecerebrovasdis.2021.105832] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/28/2021] [Revised: 04/11/2021] [Accepted: 04/11/2021] [Indexed: 01/02/2023] Open
Abstract
BACKGROUND Machine learning algorithms depend on accurate and representative datasets for training in order to become valuable clinical tools that are widely generalizable to a varied population. We aim to conduct a review of machine learning uses in stroke literature to assess the geographic distribution of datasets and patient cohorts used to train these models and compare them to stroke distribution to evaluate for disparities. AIMS 582 studies were identified on initial searching of the PubMed database. Of these studies, 106 full texts were assessed after title and abstract screening which resulted in 489 papers excluded. Of these 106 studies, 79 were excluded due to using cohorts from outside the United States or being review articles or editorials. 27 studies were thus included in this analysis. SUMMARY OF REVIEW Of the 27 studies included, 7 (25.9%) used patient data from California, 6 (22.2%) were multicenter, 3 (11.1%) were in Massachusetts, 2 (7.4%) each in Illinois, Missouri, and New York, and 1 (3.7%) each from South Carolina, Washington, West Virginia, and Wisconsin. 1 (3.7%) study used data from Utah and Texas. These were qualitatively compared to a CDC study showing the highest distribution of stroke in Mississippi (4.3%) followed by Oklahoma (3.4%), Washington D.C. (3.4%), Louisiana (3.3%), and Alabama (3.2%) while the prevalence in California was 2.6%. CONCLUSIONS It is clear that a strong disconnect exists between the datasets and patient cohorts used in training machine learning algorithms in clinical research and the stroke distribution in which clinical tools using these algorithms will be implemented. In order to ensure a lack of bias and increase generalizability and accuracy in future machine learning studies, datasets using a varied patient population that reflects the unequal distribution of stroke risk factors would greatly benefit the usability of these tools and ensure accuracy on a nationwide scale.
Affiliation(s)
- Lohit Velagapudi
- Department of Neurosurgery, Thomas Jefferson University, Philadelphia, PA
- David Nauheim
- Department of Neurosurgery, Thomas Jefferson University, Philadelphia, PA
- Omaditya Khanna
- Department of Neurosurgery, Thomas Jefferson University, Philadelphia, PA
- Fadi Al Saiegh
- Department of Neurosurgery, Thomas Jefferson University, Philadelphia, PA
- Nabeel Herial
- Department of Neurosurgery, Thomas Jefferson University, Philadelphia, PA
- M Reid Gooch
- Department of Neurosurgery, Thomas Jefferson University, Philadelphia, PA
- Pascal Jabbour
- Department of Neurosurgery, Thomas Jefferson University, Philadelphia, PA.
|
43
|
De Silva K, Mathews N, Teede H, Forbes A, Jönsson D, Demmer RT, Enticott J. Clinical notes as prognostic markers of mortality associated with diabetes mellitus following critical care: A retrospective cohort analysis using machine learning and unstructured big data. Comput Biol Med 2021; 132:104305. [PMID: 33705995 DOI: 10.1016/j.compbiomed.2021.104305] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 12/24/2020] [Revised: 02/23/2021] [Accepted: 02/27/2021] [Indexed: 12/14/2022]
Abstract
BACKGROUND Clinical notes are ubiquitous resources offering potential value in optimizing critical care via data mining technologies. OBJECTIVE To determine the predictive value of clinical notes as prognostic markers of 1-year all-cause mortality among people with diabetes following critical care. MATERIALS AND METHODS Mortality of diabetes patients was predicted using three cohorts of clinical text in a critical care database, written by physicians (n = 45253), nurses (n = 159027), and both (n = 204280). Natural language processing was used to pre-process text documents, and LASSO-regularized logistic regression models were trained and tested. Confusion matrix metrics of each model were calculated and AUROC estimates between models were compared. All predictive words and corresponding coefficients were extracted. The outcome probability associated with each text document was estimated. RESULTS Models built on clinical text of physicians, nurses, and the combined cohort predicted mortality with AUROC of 0.996, 0.893, and 0.922, respectively. Predictive performance of the models differed significantly from one another, whereas inter-rater reliability ranged from substantial to almost perfect across them. The numbers of predictive words with non-zero coefficients were 3994, 8159, and 10579, respectively, in the models of physicians, nurses, and the combined cohort. Physicians' and nursing notes, both individually and when combined, strongly predicted 1-year all-cause mortality among people with diabetes following critical care. CONCLUSION Clinical notes of physicians and nurses are strong and novel prognostic markers of diabetes-associated mortality in critical care, offering potentially generalizable and scalable applications. Clinical text-derived personalized risk estimates of prognostic outcomes such as mortality could be used to optimize patient care.
Affiliation(s)
- Kushan De Silva
- Monash Centre for Health Research and Implementation, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Clayton, 3168, Australia.
- Noel Mathews
- Monash Centre for Health Research and Implementation, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Clayton, 3168, Australia
- Helena Teede
- Monash Centre for Health Research and Implementation, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Clayton, 3168, Australia
- Andrew Forbes
- Biostatistics Unit, Division of Research Methodology, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Melbourne, 3004, Australia
- Daniel Jönsson
- Department of Periodontology, Faculty of Odontology, Malmö University, Malmö, 21119, Sweden; Swedish Dental Service of Skane, Lund, 22647, Sweden
- Ryan T Demmer
- Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, MN, USA; Mailman School of Public Health, Columbia University, New York, USA
- Joanne Enticott
- Monash Centre for Health Research and Implementation, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Clayton, 3168, Australia
|
44
|
Gupta M, Bansal A, Jain B, Rochelle J, Oak A, Jalali MS. Whether the weather will help us weather the COVID-19 pandemic: Using machine learning to measure twitter users' perceptions. Int J Med Inform 2021; 145:104340. [PMID: 33242762 PMCID: PMC7654388 DOI: 10.1016/j.ijmedinf.2020.104340] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/24/2020] [Revised: 11/03/2020] [Accepted: 11/09/2020] [Indexed: 12/19/2022]
Abstract
OBJECTIVE The potential ability for weather to affect SARS-CoV-2 transmission has been an area of controversial discussion during the COVID-19 pandemic. Individuals' perceptions of the impact of weather can inform their adherence to public health guidelines; however, there is no measure of their perceptions. We quantified Twitter users' perceptions of the effect of weather and analyzed how they evolved with respect to real-world events and time. MATERIALS AND METHODS We collected 166,005 English tweets posted between January 23 and June 22, 2020 and employed machine learning/natural language processing techniques to filter for relevant tweets, classify them by the type of effect they claimed, and identify topics of discussion. RESULTS We identified 28,555 relevant tweets and estimate that 40.4 % indicate uncertainty about weather's impact, 33.5 % indicate no effect, and 26.1 % indicate some effect. We tracked changes in these proportions over time. Topic modeling revealed major latent areas of discussion. DISCUSSION There is no consensus among the public for weather's potential impact. Earlier months were characterized by tweets that were uncertain of weather's effect or claimed no effect; later, the portion of tweets claiming some effect of weather increased. Tweets claiming no effect of weather comprised the largest class by June. Major topics of discussion included comparisons to influenza's seasonality, President Trump's comments on weather's effect, and social distancing. CONCLUSION We exhibit a research approach that is effective in measuring population perceptions and identifying misconceptions, which can inform public health communications.
Affiliation(s)
- Marichi Gupta
- MGH Institute for Technology Assessment, Harvard Medical School, Boston, MA, USA; The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Aditya Bansal
- MGH Institute for Technology Assessment, Harvard Medical School, Boston, MA, USA; Indian Institute of Technology Delhi, New Delhi, Delhi, India
- Bhav Jain
- MGH Institute for Technology Assessment, Harvard Medical School, Boston, MA, USA; Massachusetts Institute of Technology, Cambridge, MA, USA
- Jillian Rochelle
- MGH Institute for Technology Assessment, Harvard Medical School, Boston, MA, USA; Northwestern University, Evanston, IL, USA
- Atharv Oak
- MGH Institute for Technology Assessment, Harvard Medical School, Boston, MA, USA; Massachusetts Institute of Technology, Cambridge, MA, USA
- Mohammad S Jalali
- MGH Institute for Technology Assessment, Harvard Medical School, Boston, MA, USA; Sloan School of Management, Massachusetts Institute of Technology, Cambridge, MA, USA.
|
45
|
Heo TS, Kim YS, Choi JM, Jeong YS, Seo SY, Lee JH, Jeon JP, Kim C. Prediction of Stroke Outcome Using Natural Language Processing-Based Machine Learning of Radiology Report of Brain MRI. J Pers Med 2020; 10:jpm10040286. [PMID: 33339385 PMCID: PMC7766032 DOI: 10.3390/jpm10040286] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 11/27/2020] [Revised: 12/09/2020] [Accepted: 12/15/2020] [Indexed: 01/28/2023] Open
Abstract
Brain magnetic resonance imaging (MRI) is useful for predicting the outcome of patients with acute ischemic stroke (AIS). Although deep learning (DL) using brain MRI with certain image biomarkers has shown satisfactory results in predicting poor outcomes, no study has assessed the usefulness of natural language processing (NLP)-based machine learning (ML) algorithms using brain MRI free-text reports of AIS patients. Therefore, we aimed to assess whether NLP-based ML algorithms using brain MRI text reports could predict poor outcomes in AIS patients. This study included only English text reports of brain MRIs examined during admission of AIS patients. Poor outcome was defined as a modified Rankin Scale score of 3-6, and the data were captured by trained nurses and physicians. We included only the text report of the first MRI scan during the admission. The text dataset was randomly divided into training and test datasets with a 7:3 ratio. Text was vectorized at the word, sentence, and document levels. In the word-level approach, which did not consider the sequence of words, the "bag-of-words" model was used to reflect the number of repetitions of each text token. The "sent2vec" method was used in the sentence-level approach, which considers the sequence of words, and word embedding was used in the document-level approach. In addition to conventional ML algorithms, DL algorithms such as the convolutional neural network (CNN), long short-term memory, and multilayer perceptron were used to predict poor outcomes using 5-fold cross-validation and grid search techniques. The performance of each ML classifier was compared using the area under the receiver operating characteristic curve (AUROC). Among 1840 subjects with AIS, 645 patients (35.1%) had a poor outcome 3 months after the stroke onset. Random forest was the best classifier (AUROC 0.782) using a word-level approach. Overall, the document-level approach exhibited better performance than did the word- or sentence-level approaches. Among all the ML classifiers, the multi-CNN algorithm demonstrated the best classification performance (0.805), followed by the CNN (0.799) algorithm. When predicting future clinical outcomes using NLP-based ML of radiology free-text reports of brain MRI, DL algorithms showed superior performance over the other ML algorithms. In particular, the prediction of poor outcomes in document-level NLP DL was improved more by multi-CNN and CNN than by recurrent neural network-based algorithms. NLP-based DL algorithms can be used as an important digital marker for unstructured electronic health record data DL prediction.
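The word-level "bag-of-words" representation mentioned in this abstract can be illustrated with a short standard-library sketch. This is an assumption-laden illustration rather than the authors' pipeline: the tokenizer, the function names `tokenize` and `bag_of_words`, and the two example report sentences are all hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bag_of_words(documents):
    """Return a sorted vocabulary and one token-count vector per document.

    Word order is discarded; each vector position holds how many times
    the corresponding vocabulary token occurs in that document.
    """
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({token for doc in tokenized for token in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)  # missing tokens count as 0
        vectors.append([counts[token] for token in vocabulary])
    return vocabulary, vectors


# Hypothetical report snippets, invented for illustration only.
reports = [
    "Acute infarct in the left MCA territory.",
    "No acute infarct. No hemorrhage.",
]
vocabulary, vectors = bag_of_words(reports)
print(vocabulary)
print(vectors)
```

Because the vectors ignore token order, "no acute infarct" and "acute infarct" differ only in the count for "no", which is one reason sequence-aware representations can behave differently on negated findings.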
Affiliation(s)
- Tak Sung Heo
- Department of Convergence Software, Hallym University, Chuncheon 24252, Korea; (T.S.H.); (Y.S.K.); (J.M.C.); (Y.S.J.); (S.Y.S.)
- Yu Seop Kim
- Department of Convergence Software, Hallym University, Chuncheon 24252, Korea; (T.S.H.); (Y.S.K.); (J.M.C.); (Y.S.J.); (S.Y.S.)
- Jeong Myeong Choi
- Department of Convergence Software, Hallym University, Chuncheon 24252, Korea; (T.S.H.); (Y.S.K.); (J.M.C.); (Y.S.J.); (S.Y.S.)
- Yeong Seok Jeong
- Department of Convergence Software, Hallym University, Chuncheon 24252, Korea; (T.S.H.); (Y.S.K.); (J.M.C.); (Y.S.J.); (S.Y.S.)
- Soo Young Seo
- Department of Convergence Software, Hallym University, Chuncheon 24252, Korea; (T.S.H.); (Y.S.K.); (J.M.C.); (Y.S.J.); (S.Y.S.)
- Jun Ho Lee
- Department of Otorhinolaryngology and Head and Neck Surgery, Chuncheon Sacred Heart Hospital, Chuncheon 24253, Korea
- Jin Pyeong Jeon
- Department of Neurosurgery, Chuncheon Sacred Heart Hospital, Chuncheon 24253, Korea
- Chulho Kim
- Department of Neurology, Chuncheon Sacred Heart Hospital, Chuncheon 24253, Korea
- Correspondence: ; Tel.: +82-332-405-255; Fax: +82-332-5562-44
|