1. Barlow J, Sragi Z, Rivera-Rivera G, Al-Awady A, Daşdöğen Ü, Courey MS, Kirke DN. The Use of Deep Learning Software in the Detection of Voice Disorders: A Systematic Review. Otolaryngol Head Neck Surg 2024; 170:1531-1543. [PMID: 38168017] [DOI: 10.1002/ohn.636]
Abstract
OBJECTIVE To summarize the use of deep learning in the detection of voice disorders using acoustic and laryngoscopic input, compare specific neural networks in terms of accuracy, and assess their effectiveness compared to expert clinical visual examination. DATA SOURCES Embase, MEDLINE, and Cochrane Central. REVIEW METHODS Databases were screened through November 11, 2023 for relevant studies. The inclusion criteria required studies to utilize a specified deep learning method, use laryngoscopy or acoustic input, and measure accuracy of binary classification between healthy patients and those with voice disorders. RESULTS Thirty-four studies met the inclusion criteria, with 18 focusing on voice analysis, 15 on imaging analysis, and 1 on both. Across the 18 acoustic studies, 21 programs were used for identification of organic and functional voice disorders. These technologies included 10 convolutional neural networks (CNNs), 6 multilayer perceptrons (MLPs), and 5 other neural networks. The binary classification systems yielded a mean accuracy of 89.0% overall, including 93.7% for MLP programs and 84.5% for CNNs. Among the 15 imaging analysis studies, a total of 23 programs were utilized, resulting in a mean accuracy of 91.3%. Specifically, the 20 CNNs achieved a mean accuracy of 92.6% compared to 83.0% for the 3 MLPs. CONCLUSION Deep learning models were shown to be highly accurate in the detection of voice pathology, with CNNs most effective for assessing laryngoscopy images and MLPs most effective for assessing acoustic input. While deep learning methods outperformed expert clinical exam in limited comparisons, further studies integrating external validation are necessary.
Affiliation(s)
- Joshua Barlow
- Department of Otolaryngology-Head and Neck Surgery, Icahn School of Medicine at Mount Sinai, New York City, New York, USA
- Zara Sragi
- Department of Otolaryngology-Head and Neck Surgery, Icahn School of Medicine at Mount Sinai, New York City, New York, USA
- Gabriel Rivera-Rivera
- Department of Otolaryngology-Head and Neck Surgery, Icahn School of Medicine at Mount Sinai, New York City, New York, USA
- Abdurrahman Al-Awady
- Department of Otolaryngology-Head and Neck Surgery, Icahn School of Medicine at Mount Sinai, New York City, New York, USA
- Ümit Daşdöğen
- Department of Otolaryngology-Head and Neck Surgery, Icahn School of Medicine at Mount Sinai, New York City, New York, USA
- Mark S Courey
- Department of Otolaryngology-Head and Neck Surgery, Icahn School of Medicine at Mount Sinai, New York City, New York, USA
- Diana N Kirke
- Department of Otolaryngology-Head and Neck Surgery, Icahn School of Medicine at Mount Sinai, New York City, New York, USA
2. Altahawi F, Owens A, Caruso CH, Wetzel JR, Strnad GJ, Chiunda AB, Spindler KP, Subhas N. Development and Operationalization of an Automated Workflow for Correlation of Knee MRI and Arthroscopy Findings. J Am Coll Radiol 2024; 21:609-616. [PMID: 37302680] [DOI: 10.1016/j.jacr.2023.04.010]
Abstract
OBJECTIVE In this study, we sought to establish and evaluate an automated workflow to prospectively capture and correlate knee MRI findings with surgical findings in a large medical center. METHODS This retrospective analysis included data from patients who had undergone knee MRI followed by arthroscopic knee surgery within 6 months during a 2-year period (2019-2020). Discrete data were automatically extracted from a structured knee MRI report template implementing pick lists. Operative findings were recorded discretely by surgeons using a custom-built web-based telephone application. MRI findings were classified as true-positive, true-negative, false-positive, or false-negative for medial meniscus (MM), lateral meniscus (LM), and anterior cruciate ligament (ACL) tears, with arthroscopy used as the reference standard. An automated dashboard displaying up-to-date concordance and individual and group accuracy was enabled for each radiologist. Manual correlation between MRI and operative reports was performed on a random sample of 10% of cases for comparison with automatically derived values. RESULTS Data from 3,187 patients (1,669 male; mean age, 47 years) were analyzed. Automatic correlation was available for 60% of cases, with an overall MRI diagnostic accuracy of 93% (MM, 92%; LM, 89%; ACL, 98%). In cases reviewed manually, the number of cases that could be correlated with surgery was higher (84%). Concordance between automated and manual review was 99% when both were available (MM, 98%; LM, 100%; ACL, 99%). CONCLUSION This automated system was able to accurately and continuously assess correlation between imaging and operative findings for a large number of MRI examinations.
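The automated correlation step reduces to confusion-matrix bookkeeping once MRI and operative findings are stored as discrete fields. A minimal sketch in Python (the data and field names are invented for illustration; this is not the authors' implementation):

```python
def concordance(cases, structure):
    """Classify each case's MRI call for one structure (e.g. 'MM' tear)
    against the arthroscopic reference standard."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for case in cases:
        mri, surgery = case["mri"][structure], case["surgery"][structure]
        if mri and surgery:
            counts["TP"] += 1
        elif not mri and not surgery:
            counts["TN"] += 1
        elif mri and not surgery:
            counts["FP"] += 1
        else:
            counts["FN"] += 1
    total = sum(counts.values())
    counts["accuracy"] = (counts["TP"] + counts["TN"]) / total if total else 0.0
    return counts

# Hypothetical discrete findings: True = tear reported (MRI) or found (surgery).
cases = [
    {"mri": {"MM": True},  "surgery": {"MM": True}},   # true positive
    {"mri": {"MM": False}, "surgery": {"MM": False}},  # true negative
    {"mri": {"MM": True},  "surgery": {"MM": False}},  # false positive
    {"mri": {"MM": False}, "surgery": {"MM": True}},   # false negative
]
result = concordance(cases, "MM")
print(result)  # accuracy 0.5 on this toy set
```

Per-structure sensitivity and specificity follow from the same counts, e.g. TP / (TP + FN) for sensitivity.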
Affiliation(s)
- Amirtha Owens
- Imaging Institute, Cleveland Clinic, Cleveland, Ohio
- Gregory J Strnad
- Orthopaedic and Rheumatologic Institute, Cleveland Clinic, Cleveland, Ohio
- Allan B Chiunda
- Imaging Institute, Cleveland Clinic, Cleveland, Ohio; Director of Clinical Effectiveness and Innovations and Brentwood Foundation Chair in Research and Data Analytics
- Kurt P Spindler
- Director of Clinical Research and Outcomes, Orthopaedic Surgery, Cleveland Clinic Florida, Weston, Florida
- Naveen Subhas
- Vice Chair of Clinical Effectiveness and Efficiency, Imaging Institute, Cleveland Clinic, Cleveland, Ohio
3. Pereira SC, Mendonça AM, Campilho A, Sousa P, Teixeira Lopes C. Automated image label extraction from radiology reports - A review. Artif Intell Med 2024; 149:102814. [PMID: 38462277] [DOI: 10.1016/j.artmed.2024.102814]
Abstract
Machine Learning models need large amounts of annotated data for training. In the field of medical imaging, labeled data is especially difficult to obtain because the annotations have to be performed by qualified physicians. Natural Language Processing (NLP) tools can be applied to radiology reports to extract labels for medical images automatically. Compared to manual labeling, this approach requires smaller annotation efforts and can therefore facilitate the creation of labeled medical image data sets. In this article, we summarize the literature on this topic spanning from 2013 to 2023, starting with a meta-analysis of the included articles, followed by a qualitative and quantitative systematization of the results. Overall, we found four types of studies on the extraction of labels from radiology reports: those describing systems based on symbolic NLP, statistical NLP, neural NLP, and those describing systems combining or comparing two or more of these approaches. Despite the large variety of existing approaches, there is still room for further improvement. This work can contribute to the development of new techniques or the improvement of existing ones.
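As a concrete illustration of the symbolic (rule-based) NLP family of systems surveyed here, a label extractor can be sketched in a few lines. The trigger patterns, negation rule, and report text below are invented for illustration and are far simpler than production systems such as NegEx:

```python
import re

# Toy rule set: finding -> trigger pattern. Real systems use much richer
# lexicons and more careful negation scoping.
RULES = {
    "pneumothorax": re.compile(r"\bpneumothorax\b", re.IGNORECASE),
    "effusion": re.compile(r"\b(pleural )?effusion\b", re.IGNORECASE),
}
# A negation cue and everything up to the next sentence boundary.
NEGATION = re.compile(r"\b(no|without|negative for)\b[^.]*", re.IGNORECASE)

def extract_labels(report: str) -> dict:
    """Return {finding: 0/1}; a finding inside a negated clause counts as absent."""
    labels = {}
    negated_spans = [m.span() for m in NEGATION.finditer(report)]
    for finding, pattern in RULES.items():
        label = 0
        for m in pattern.finditer(report):
            in_negated_clause = any(s <= m.start() < e for s, e in negated_spans)
            if not in_negated_clause:
                label = 1
        labels[finding] = label
    return labels

print(extract_labels("Small pleural effusion. No pneumothorax."))
# -> {'pneumothorax': 0, 'effusion': 1}
```

The same pattern scales to the multi-label report templates discussed in the review by growing the rule table rather than the code.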
Affiliation(s)
- Sofia C Pereira
- Institute for Systems and Computer Engineering, Technology and Science (INESC-TEC), Portugal; Faculty of Engineering of the University of Porto, Portugal.
- Ana Maria Mendonça
- Institute for Systems and Computer Engineering, Technology and Science (INESC-TEC), Portugal; Faculty of Engineering of the University of Porto, Portugal.
- Aurélio Campilho
- Institute for Systems and Computer Engineering, Technology and Science (INESC-TEC), Portugal; Faculty of Engineering of the University of Porto, Portugal.
- Pedro Sousa
- Hospital Center of Vila Nova de Gaia/Espinho, Portugal.
- Carla Teixeira Lopes
- Institute for Systems and Computer Engineering, Technology and Science (INESC-TEC), Portugal; Faculty of Engineering of the University of Porto, Portugal.
4. Lin H, Ni L, Phuong C, Hong JC. Natural Language Processing for Radiation Oncology: Personalizing Treatment Pathways. Pharmgenomics Pers Med 2024; 17:65-76. [PMID: 38370334] [PMCID: PMC10874185] [DOI: 10.2147/pgpm.s396971]
Abstract
Natural language processing (NLP), a technology that translates human language into machine-readable data, is revolutionizing numerous sectors, including cancer care. This review outlines the evolution of NLP and its potential for crafting personalized treatment pathways for cancer patients. Leveraging NLP's ability to transform unstructured medical data into structured learnable formats, researchers can tap into the potential of big data for clinical and research applications. Significant advancements in NLP have spurred interest in developing tools that automate information extraction from clinical text, potentially transforming medical research and clinical practices in radiation oncology. Applications discussed include symptom and toxicity monitoring, identification of social determinants of health, improving patient-physician communication, patient education, and predictive modeling. However, several challenges impede the full realization of NLP's benefits, such as privacy and security concerns, biases in NLP models, and the interpretability and generalizability of these models. Overcoming these challenges necessitates a collaborative effort between computer scientists and the radiation oncology community. This paper serves as a comprehensive guide to understanding the intricacies of NLP algorithms, their performance assessment, past research contributions, and the future of NLP in radiation oncology research and clinics.
Affiliation(s)
- Hui Lin
- Department of Radiation Oncology, University of California San Francisco, San Francisco, CA, USA
- UC Berkeley-UCSF Graduate Program in Bioengineering, University of California, Berkeley and San Francisco, San Francisco, CA, USA
- Lisa Ni
- Department of Radiation Oncology, University of California San Francisco, San Francisco, CA, USA
- Christina Phuong
- Department of Radiation Oncology, University of California San Francisco, San Francisco, CA, USA
- Julian C Hong
- Department of Radiation Oncology, University of California San Francisco, San Francisco, CA, USA
- Bakar Computational Health Sciences Institute, University of California, San Francisco, CA, USA
- Joint Program in Computational Precision Health, University of California, Berkeley and San Francisco, Berkeley, CA, USA
5. Benger M, Wood DA, Kafiabadi S, Al Busaidi A, Guilhem E, Lynch J, Townend M, Montvila A, Siddiqui J, Gadapa N, Barker G, Ourselin S, Cole JH, Booth TC. Factors affecting the labelling accuracy of brain MRI studies relevant for deep learning abnormality detection. Front Radiol 2023; 3:1251825. [PMID: 38089643] [PMCID: PMC10711054] [DOI: 10.3389/fradi.2023.1251825]
Abstract
Unlocking the vast potential of deep learning-based computer vision classification systems necessitates large data sets for model training. Natural Language Processing (NLP), involving automation of dataset labelling, represents a potential avenue to achieve this. However, many aspects of NLP for dataset labelling remain unvalidated. Expert radiologists manually labelled over 5,000 MRI head reports in order to develop a deep learning-based neuroradiology NLP report classifier. Our results demonstrate that binary labels (normal vs. abnormal) showed high rates of accuracy, even when only two MRI sequences (T2-weighted and those based on diffusion weighted imaging) were employed as opposed to all sequences in an examination. Meanwhile, the accuracy of more specific labelling for multiple disease categories was variable and dependent on the category. Finally, resultant model performance was shown to be dependent on the expertise of the original labeller, with worse performance seen with non-expert vs. expert labellers.
Affiliation(s)
- Matthew Benger
- Department of Neuroradiology, Kings College Hospital, London, United Kingdom
- David A. Wood
- School of Biomedical Engineering & Imaging Sciences, Kings College London, London, United Kingdom
- Sina Kafiabadi
- Department of Neuroradiology, Kings College Hospital, London, United Kingdom
- Aisha Al Busaidi
- Department of Neuroradiology, Kings College Hospital, London, United Kingdom
- Emily Guilhem
- Department of Neuroradiology, Kings College Hospital, London, United Kingdom
- Jeremy Lynch
- Department of Neuroradiology, Kings College Hospital, London, United Kingdom
- Matthew Townend
- School of Biomedical Engineering & Imaging Sciences, Kings College London, London, United Kingdom
- Antanas Montvila
- School of Biomedical Engineering & Imaging Sciences, Kings College London, London, United Kingdom
- Juveria Siddiqui
- Department of Neuroradiology, Kings College Hospital, London, United Kingdom
- Naveen Gadapa
- Department of Neuroradiology, Kings College Hospital, London, United Kingdom
- Gareth Barker
- Institute of Psychiatry, Psychology & Neuroscience, Kings College London, London, United Kingdom
- Sebastian Ourselin
- School of Biomedical Engineering & Imaging Sciences, Kings College London, London, United Kingdom
- James H. Cole
- Institute of Psychiatry, Psychology & Neuroscience, Kings College London, London, United Kingdom
- Centre for Medical Image Computing, Dementia Research, University College London, London, United Kingdom
- Thomas C. Booth
- Department of Neuroradiology, Kings College Hospital, London, United Kingdom
- School of Biomedical Engineering & Imaging Sciences, Kings College London, London, United Kingdom
6. Yang E, Li MD, Raghavan S, Deng F, Lang M, Succi MD, Huang AJ, Kalpathy-Cramer J. Transformer versus traditional natural language processing: how much data is enough for automated radiology report classification? Br J Radiol 2023; 96:20220769. [PMID: 37162253] [PMCID: PMC10461267] [DOI: 10.1259/bjr.20220769]
Abstract
OBJECTIVES Current state-of-the-art natural language processing (NLP) techniques use transformer deep-learning architectures, which depend on large training datasets. We hypothesized that traditional NLP techniques may outperform transformers for smaller radiology report datasets. METHODS We compared the performance of BioBERT, a deep-learning-based transformer model pre-trained on biomedical text, and three traditional machine-learning models (gradient boosted tree, random forest, and logistic regression) on seven classification tasks given free-text radiology reports. Tasks included detection of appendicitis, diverticulitis, bowel obstruction, and enteritis/colitis on abdomen/pelvis CT reports, ischemic infarct on brain CT/MRI reports, and medial and lateral meniscus tears on knee MRI reports (7,204 total annotated reports). The performance of NLP models on held-out test sets was compared after training using the full training set, and 2.5%, 10%, 25%, 50%, and 75% random subsets of the training data. RESULTS In all tested classification tasks, BioBERT performed poorly at smaller training sample sizes compared to non-deep-learning NLP models. Specifically, BioBERT required training on approximately 1,000 reports to perform similarly or better than non-deep-learning models. At around 1,250 to 1,500 training samples, the testing performance for all models began to plateau, where additional training data yielded minimal performance gain. CONCLUSIONS With larger sample sizes, transformer NLP models achieved superior performance in radiology report binary classification tasks. However, with smaller sizes (<1000) and more imbalanced training data, traditional NLP techniques performed better. ADVANCES IN KNOWLEDGE Our benchmarks can help guide clinical NLP researchers in selecting machine-learning models according to their dataset characteristics.
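The "traditional NLP" baselines in this comparison rely on sparse bag-of-words features rather than pre-trained transformers. A self-contained sketch of that idea, using a pure-Python TF-IDF representation with a nearest-centroid decision rule (a stand-in for the paper's gradient-boosted, random-forest, and logistic-regression models; the reports are invented):

```python
import math
from collections import Counter

def build_idf(docs):
    """Inverse document frequency with +1 smoothing, fit on the training corpus."""
    tokenized = [d.lower().split() for d in docs]
    df = Counter(t for toks in tokenized for t in set(toks))
    n = len(docs)
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def vectorize(doc, idf):
    """Sparse TF-IDF vector; terms unseen in training are ignored."""
    tf = Counter(doc.lower().split())
    return {t: tf[t] * idf[t] for t in tf if t in idf}

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vecs):
    out = {}
    for v in vecs:
        for t, w in v.items():
            out[t] = out.get(t, 0.0) + w / len(vecs)
    return out

# Invented miniature training set (1 = appendicitis, 0 = negative report).
train = [
    ("dilated appendix with periappendiceal fat stranding", 1),
    ("acute appendicitis with wall thickening", 1),
    ("normal appendix no acute findings", 0),
    ("unremarkable abdomen and pelvis", 0),
]
idf = build_idf([t for t, _ in train])
vecs = [vectorize(t, idf) for t, _ in train]
pos = centroid([v for v, (_, y) in zip(vecs, train) if y == 1])
neg = centroid([v for v, (_, y) in zip(vecs, train) if y == 0])

def predict(report):
    v = vectorize(report, idf)
    return 1 if cosine(v, pos) >= cosine(v, neg) else 0

print(predict("findings consistent with acute appendicitis"))  # -> 1
```

Because such models have few parameters, they can fit usefully on the small, imbalanced training sets where the paper found BioBERT underperforms.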
Affiliation(s)
- Matthew D Li
- Department of Radiology and Diagnostic Imaging, University of Alberta, Edmonton, Alberta, Canada
- Shruti Raghavan
- Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Francis Deng
- Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Min Lang
- Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Marc D Succi
- Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Ambrose J Huang
- Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
7. Zhang J, Mazurowski MA, Allen BC, Wildman-Tobriner B. Multistep Automated Data Labelling Procedure (MADLaP) for thyroid nodules on ultrasound: An artificial intelligence approach for automating image annotation. Artif Intell Med 2023; 141:102553. [PMID: 37295897] [DOI: 10.1016/j.artmed.2023.102553]
Abstract
Machine learning (ML) for diagnosis of thyroid nodules on ultrasound is an active area of research. However, ML tools require large, well-labeled datasets, the curation of which is time-consuming and labor-intensive. The purpose of our study was to develop and test a deep-learning-based tool to facilitate and automate the data annotation process for thyroid nodules; we named our tool Multistep Automated Data Labelling Procedure (MADLaP). MADLaP was designed to take multiple inputs including pathology reports, ultrasound images, and radiology reports. Using multiple step-wise 'modules' including rule-based natural language processing, deep-learning-based imaging segmentation, and optical character recognition, MADLaP automatically identified images of a specific thyroid nodule and correctly assigned a pathology label. The model was developed using a training set of 378 patients across our health system and tested on a separate set of 93 patients. Ground truths for both sets were selected by an experienced radiologist. Performance metrics including yield (how many labeled images the model produced) and accuracy (percentage correct) were measured using the test set. MADLaP achieved a yield of 63% and an accuracy of 83%. The yield progressively increased as the input data moved through each module, while accuracy peaked part way through. Error analysis showed that inputs from certain examination sites had lower accuracy (40%) than the other sites (90%, 100%). MADLaP successfully created curated datasets of labeled ultrasound images of thyroid nodules. While accurate, the relatively suboptimal yield of MADLaP exposed some challenges when trying to automatically label radiology images from heterogeneous sources. The complex task of image curation and annotation could be automated, allowing for enrichment of larger datasets for use in machine learning development.
Affiliation(s)
- Jikai Zhang
- Department of Electrical and Computer Engineering, Duke University, Room 10070, 2424 Erwin Rd, Durham, NC 27705, United States.
- Maciej A Mazurowski
- Department of Radiology, Duke University Medical Center, Durham, NC, United States; Department of Electrical and Computer Engineering, Department of Biostatistics and Bioinformatics, Department of Computer Science, Duke University, Room 9044, 2424 Erwin Rd, Durham, NC 27705, United States
- Brian C Allen
- Department of Radiology, Duke University Medical Center, Duke University, Dept of Radiology, Box 3808, Durham, NC 27710, United States
- Benjamin Wildman-Tobriner
- Department of Radiology, Duke University Medical Center, Duke University, Dept of Radiology, Box 3808, Durham, NC 27710, United States
8. Galbusera F, Cina A, Bassani T, Panico M, Sconfienza LM. Automatic Diagnosis of Spinal Disorders on Radiographic Images: Leveraging Existing Unstructured Datasets With Natural Language Processing. Global Spine J 2023; 13:1257-1266. [PMID: 34219477] [PMCID: PMC10416592] [DOI: 10.1177/21925682211026910]
Abstract
STUDY DESIGN Retrospective study. OBJECTIVES Huge amounts of images and medical reports are being generated in radiology departments. While these datasets can potentially be employed to train artificial intelligence tools to detect findings on radiological images, the unstructured nature of the reports limits the accessibility of information. In this study, we tested if natural language processing (NLP) can be useful to generate training data for deep learning models analyzing planar radiographs of the lumbar spine. METHODS NLP classifiers based on the Bidirectional Encoder Representations from Transformers (BERT) model able to extract structured information from radiological reports were developed and used to generate annotations for a large set of radiographic images of the lumbar spine (N = 10,287). Deep learning (ResNet-18) models aimed at detecting radiological findings directly from the images were then trained and tested on a set of 204 human-annotated images. RESULTS The NLP models had accuracies between 0.88 and 0.98 and specificities between 0.84 and 0.99; 7 out of 12 radiological findings had sensitivity >0.90. The ResNet-18 models showed performances dependent on the specific radiological findings with sensitivities and specificities between 0.53 and 0.93. CONCLUSIONS NLP generates valuable data to train deep learning models able to detect radiological findings in spine images. Despite the noisy nature of reports and NLP predictions, this approach effectively mitigates the difficulties associated with the manual annotation of large quantities of data and opens the way to the era of big data for artificial intelligence in musculoskeletal radiology.
Affiliation(s)
- Andrea Cina
- IRCCS Istituto Ortopedico Galeazzi, Milan, Italy
- Tito Bassani
- IRCCS Istituto Ortopedico Galeazzi, Milan, Italy
- Matteo Panico
- IRCCS Istituto Ortopedico Galeazzi, Milan, Italy
- Department of Chemistry, Materials and Chemical Engineering “Giulio Natta,” Politecnico di Milano, Milan, Italy
- Luca Maria Sconfienza
- IRCCS Istituto Ortopedico Galeazzi, Milan, Italy
- Department of Biomedical Sciences for Health, Università degli Studi di Milano, Milan, Italy
9. Yamada A, Kamagata K, Hirata K, Ito R, Nakaura T, Ueda D, Fujita S, Fushimi Y, Fujima N, Matsui Y, Tatsugami F, Nozaki T, Fujioka T, Yanagawa M, Tsuboyama T, Kawamura M, Naganawa S. Clinical applications of artificial intelligence in liver imaging. Radiol Med 2023. [PMID: 37165151] [DOI: 10.1007/s11547-023-01638-1]
Abstract
This review outlines the current status and challenges of the clinical applications of artificial intelligence in liver imaging using computed tomography or magnetic resonance imaging based on a topic analysis of PubMed search results using latent Dirichlet allocation (LDA). LDA revealed that "segmentation," "hepatocellular carcinoma and radiomics," "metastasis," "fibrosis," and "reconstruction" were current main topic keywords. Automatic liver segmentation technology using deep learning is beginning to assume new clinical significance as part of whole-body composition analysis. It has also been applied to the screening of large populations and the acquisition of training data for machine learning models and has resulted in the development of imaging biomarkers that have a significant impact on important clinical issues, such as the estimation of liver fibrosis, recurrence, and prognosis of malignant tumors. Deep learning reconstruction is expanding as a new technological clinical application of artificial intelligence and has shown results in reducing contrast and radiation doses. However, there is much missing evidence, such as external validation of machine learning models and the evaluation of the diagnostic performance of specific diseases using deep learning reconstruction, suggesting that the clinical application of these technologies is still in development.
Affiliation(s)
- Akira Yamada
- Department of Radiology, Shinshu University School of Medicine, Matsumoto, Nagano, Japan.
- Koji Kamagata
- Department of Radiology, Juntendo University Graduate School of Medicine, Bunkyo-Ku, Tokyo, Japan
- Kenji Hirata
- Department of Nuclear Medicine, Hokkaido University Hospital, Sapporo, Japan
- Rintaro Ito
- Department of Radiology, Nagoya University Graduate School of Medicine, Nagoya, Aichi, Japan
- Takeshi Nakaura
- Department of Diagnostic Radiology, Kumamoto University Graduate School of Medicine, Chuo-Ku, Kumamoto, Japan
- Daiju Ueda
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Abeno-Ku, Osaka, Japan
- Shohei Fujita
- Department of Radiology, University of Tokyo, Tokyo, Japan
- Yasutaka Fushimi
- Department of Diagnostic Imaging and Nuclear Medicine, Kyoto University Graduate School of Medicine, Sakyoku, Kyoto, Japan
- Noriyuki Fujima
- Department of Diagnostic and Interventional Radiology, Hokkaido University Hospital, Sapporo, Japan
- Yusuke Matsui
- Department of Radiology, Faculty of Medicine, Dentistry and Pharmaceutical Sciences, Okayama University, Kita-Ku, Okayama, Japan
- Fuminari Tatsugami
- Department of Diagnostic Radiology, Hiroshima University, Minami-Ku, Hiroshima City, Hiroshima, Japan
- Taiki Nozaki
- Department of Radiology, St. Luke's International Hospital, Tokyo, Japan
- Tomoyuki Fujioka
- Department of Diagnostic Radiology, Tokyo Medical and Dental University, Tokyo, Japan
- Masahiro Yanagawa
- Department of Radiology, Osaka University Graduate School of Medicine, Suita-City, Osaka, Japan
- Takahiro Tsuboyama
- Department of Radiology, Osaka University Graduate School of Medicine, Suita-City, Osaka, Japan
- Mariko Kawamura
- Department of Radiology, Nagoya University Graduate School of Medicine, Nagoya, Aichi, Japan
- Shinji Naganawa
- Department of Radiology, Nagoya University Graduate School of Medicine, Nagoya, Aichi, Japan
10. Mondal HS, Ahmed KA, Birbilis N, Hossain MZ. Machine learning for detecting DNA attachment on SPR biosensor. Sci Rep 2023; 13:3742. [PMID: 36879019] [PMCID: PMC9987359] [DOI: 10.1038/s41598-023-29395-1]
Abstract
Optoelectric biosensors measure the conformational changes of biomolecules and their molecular interactions, allowing researchers to use them in different biomedical diagnostics and analysis activities. Among different biosensors, surface plasmon resonance (SPR)-based biosensors utilize label-free, gold-based plasmonic principles with high precision and accuracy, making them one of the preferred methods. The datasets generated from these biosensors are being used in different machine learning (ML) models for disease diagnosis and prognosis, but there is a scarcity of models to develop or assess the accuracy of SPR-based biosensors and ensure a reliable dataset for downstream model development. The current study proposes innovative ML-based DNA detection and classification models based on the reflective light angles on different gold surfaces of biosensors and associated properties. We conducted several statistical analyses and applied different visualization techniques to evaluate the SPR-based dataset, using t-SNE feature extraction and min-max normalization to differentiate classes with low variance. We experimented with several ML classifiers, namely support vector machine (SVM), decision tree (DT), multi-layer perceptron (MLP), k-nearest neighbors (KNN), logistic regression (LR) and random forest (RF), and evaluated our findings in terms of different evaluation metrics. Our analysis showed the best accuracy of 0.94 by RF, DT and KNN for DNA classification and 0.96 by RF and KNN for DNA detection tasks. Considering area under the receiver operating characteristic curve (AUC) (0.97), precision (0.96) and F1-score (0.97), we found that RF performed best for both tasks. Our research shows the potential of ML models in the field of biosensor development, which can be expanded to develop novel disease diagnosis and prognosis tools in the future.
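Min-max normalization followed by a distance-based classifier such as KNN, two of the steps this abstract describes, can be sketched without external libraries. The (angle, intensity) readings below are invented and stand in for real SPR sensorgram features:

```python
def min_max_fit(rows):
    """Per-column (min, max) bounds learned from the training rows."""
    cols = list(zip(*rows))
    return [(min(c), max(c)) for c in cols]

def min_max_apply(row, bounds):
    """Scale each feature into [0, 1]; constant columns map to 0."""
    return [(x - lo) / (hi - lo) if hi > lo else 0.0
            for x, (lo, hi) in zip(row, bounds)]

def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k nearest training points (squared Euclidean)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(tx, x)), y)
        for tx, y in zip(train_x, train_y)
    )
    votes = [y for _, y in dists[:k]]
    return max(set(votes), key=votes.count)

# Invented (resonance angle, reflected intensity) readings:
# class 1 = DNA attached, class 0 = bare gold surface.
X = [(68.2, 0.31), (68.5, 0.29), (68.4, 0.33),
     (65.1, 0.72), (65.3, 0.70), (64.9, 0.75)]
y = [1, 1, 1, 0, 0, 0]
bounds = min_max_fit(X)
Xn = [min_max_apply(r, bounds) for r in X]
query = min_max_apply((68.0, 0.34), bounds)
print(knn_predict(Xn, y, query))  # -> 1 (attached)
```

Without the normalization step, the angle column (range of degrees) would dominate the intensity column (range of hundredths) in the distance computation, which is exactly why the study applies it.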
Affiliation(s)
- Himadri Shekhar Mondal
- ANU College of Engineering, Computing and Cybernetics, The Australian National University, Canberra, ACT, 2600, Australia.
- Data61, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Canberra, ACT, 2601, Australia.
- Khandaker Asif Ahmed
- Australian Centre for Disease Preparedness (ACDP), CSIRO, Geelong, VIC, 3220, Australia
- Nick Birbilis
- ANU College of Engineering, Computing and Cybernetics, The Australian National University, Canberra, ACT, 2600, Australia.
- Faculty of Science, Engineering and Built Environment, Deakin University, Burwood, VIC, 3125, Australia
- Md Zakir Hossain
- ANU College of Engineering, Computing and Cybernetics, The Australian National University, Canberra, ACT, 2600, Australia.
- Biological Data Science Institute, The Australian National University, Canberra, ACT, 2600, Australia.
- Data61, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Canberra, ACT, 2601, Australia.
- Faculty of Science and Engineering, Curtin University, Perth, WA, 6102, Australia.
11. Jantscher M, Gunzer F, Kern R, Hassler E, Tschauner S, Reishofer G. Information extraction from German radiological reports for general clinical text and language understanding. Sci Rep 2023; 13:2353. [PMID: 36759679] [PMCID: PMC9911592] [DOI: 10.1038/s41598-023-29323-3]
Abstract
Recent advances in deep learning and natural language processing (NLP) have opened many new opportunities for automatic text understanding and text processing in the medical field. This is of great benefit, as many clinical downstream tasks rely on information from unstructured clinical documents. However, for low-resource languages like German, the use of modern text processing applications that require a large amount of training data proves to be difficult, as only a few data sets are available, mainly due to legal restrictions. In this study, we present an information extraction framework that was initially pre-trained on real-world computed tomographic (CT) reports of head examinations, followed by domain-adaptive fine-tuning on reports from different imaging examinations. We show that in the pre-training phase, the semantic and contextual meaning of one clinical reporting domain can be captured and effectively transferred to foreign clinical imaging examinations. Moreover, we introduce an active learning approach with an intrinsic strategic sampling method to generate highly informative training data at low human annotation cost. We see that model performance can be significantly improved by an appropriate selection of the data to be annotated, without the need to train the model on a specific downstream task. With a general annotation scheme that can be used not only in the radiology field but also in a broader clinical setting, we contribute to a more consistent labeling and annotation process that also facilitates the verification and evaluation of language models in the German clinical setting.
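The abstract's "strategic sampling" for active learning can take many forms; one common choice is uncertainty sampling, sketched below as an illustrative stand-in (the study's intrinsic method is not specified in the abstract).

```python
# Hedged sketch: least-confidence uncertainty sampling over an unlabeled
# pool — pick for annotation the examples the current model is least sure
# about, which tends to maximize information gained per label.
import numpy as np

def select_for_annotation(probs: np.ndarray, k: int) -> np.ndarray:
    """Return indices of the k most uncertain pool samples.

    probs: (n_samples, n_classes) predicted class probabilities
    from the current model over the unlabeled pool.
    """
    uncertainty = 1.0 - probs.max(axis=1)  # least-confidence score
    return np.argsort(uncertainty)[::-1][:k]

pool_probs = np.array([[0.95, 0.05], [0.55, 0.45], [0.70, 0.30]])
picked = select_for_annotation(pool_probs, k=1)
# The 0.55/0.45 example is the least confident, so index 1 is chosen.
```

Each round, the selected reports are annotated, added to the training set, and the model is retrained before the next selection.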
Affiliation(s)
- Felix Gunzer
- Division of Neuroradiology, Vascular and Interventional Radiology, Department of Radiology, Medical University Graz, 8036, Graz, Austria
- Eva Hassler
- Division of Neuroradiology, Vascular and Interventional Radiology, Department of Radiology, Medical University Graz, 8036, Graz, Austria
- Sebastian Tschauner
- Division of Pediatric Radiology, Department of Radiology, Medical University Graz, 8036, Graz, Austria
- Gernot Reishofer
- Department of Radiology, Medical University Graz, 8036, Graz, Austria; BioTechMed-Graz, 8010, Graz, Austria

12
Moassefi M, Faghani S, Khosravi B, Rouzrokh P, Erickson BJ. Artificial Intelligence in Radiology: Overview of Application Types, Design, and Challenges. Semin Roentgenol 2023; 58:170-177. [PMID: 37087137] [DOI: 10.1053/j.ro.2023.01.005]
13
Nunez JJ, Leung B, Ho C, Bates AT, Ng RT. Predicting the Survival of Patients With Cancer From Their Initial Oncology Consultation Document Using Natural Language Processing. JAMA Netw Open 2023; 6:e230813. [PMID: 36848085] [PMCID: PMC9972192] [DOI: 10.1001/jamanetworkopen.2023.0813]
Abstract
IMPORTANCE Predicting short- and long-term survival of patients with cancer may improve their care. Prior predictive models either use data with limited availability or predict the outcome of only 1 type of cancer. OBJECTIVE To investigate whether natural language processing can predict survival of patients with general cancer from a patient's initial oncologist consultation document. DESIGN, SETTING, AND PARTICIPANTS This retrospective prognostic study used data from 47 625 of 59 800 patients who started cancer care at any of the 6 BC Cancer sites located in the province of British Columbia between April 1, 2011, and December 31, 2016. Mortality data were updated until April 6, 2022, and data were analyzed from update until September 30, 2022. All patients with a medical or radiation oncologist consultation document generated within 180 days of diagnosis were included; patients seen for multiple cancers were excluded. EXPOSURES Initial oncologist consultation documents were analyzed using traditional and neural language models. MAIN OUTCOMES AND MEASURES The primary outcome was the performance of the predictive models, including balanced accuracy and receiver operating characteristics area under the curve (AUC). The secondary outcome was investigating what words the models used. RESULTS Of the 47 625 patients in the sample, 25 428 (53.4%) were female and 22 197 (46.6%) were male, with a mean (SD) age of 64.9 (13.7) years. A total of 41 447 patients (87.0%) survived 6 months, 31 143 (65.4%) survived 36 months, and 27 880 (58.5%) survived 60 months, calculated from their initial oncologist consultation. The best models achieved a balanced accuracy of 0.856 (AUC, 0.928) for predicting 6-month survival, 0.842 (AUC, 0.918) for 36-month survival, and 0.837 (AUC, 0.918) for 60-month survival, on a holdout test set. Differences in what words were important for predicting 6- vs 60-month survival were found. 
CONCLUSIONS AND RELEVANCE These findings suggest that models performed comparably with or better than previous models predicting cancer survival and that they may be able to predict survival using readily available data without focusing on 1 cancer type.
Affiliation(s)
- John-Jose Nunez
- BC Cancer, Vancouver, British Columbia, Canada
- Department of Computer Science, University of British Columbia, Vancouver, British Columbia, Canada
- Department of Psychiatry, University of British Columbia, Vancouver, British Columbia, Canada
- Cheryl Ho
- BC Cancer, Vancouver, British Columbia, Canada
- Alan T. Bates
- BC Cancer, Vancouver, British Columbia, Canada
- Department of Psychiatry, University of British Columbia, Vancouver, British Columbia, Canada
- Raymond T. Ng
- Department of Computer Science, University of British Columbia, Vancouver, British Columbia, Canada

14
Cheung ATM, Nasir-Moin M, Fred Kwon YJ, Guan J, Liu C, Jiang L, Raimondo C, Chotai S, Chambless L, Ahmad HS, Chauhan D, Yoon JW, Hollon T, Buch V, Kondziolka D, Chen D, Al-Aswad LA, Aphinyanaphongs Y, Oermann EK. Methods and Impact for Using Federated Learning to Collaborate on Clinical Research. Neurosurgery 2023; 92:431-438. [PMID: 36399428] [DOI: 10.1227/neu.0000000000002198]
Abstract
BACKGROUND The development of accurate machine learning algorithms requires sufficient quantities of diverse data. This poses a challenge in health care because of the sensitive and siloed nature of biomedical information. Decentralized algorithms through federated learning (FL) avoid data aggregation by instead distributing algorithms to the data before centrally updating one global model. OBJECTIVE To establish a multicenter collaboration and assess the feasibility of using FL to train machine learning models for intracranial hemorrhage (ICH) detection without sharing data between sites. METHODS Five neurosurgery departments across the United States collaborated to establish a federated network and train a convolutional neural network to detect ICH on computed tomography scans. The global FL model was benchmarked against a standard, centrally trained model using a held-out data set and was compared against locally trained models using site data. RESULTS A federated network of practicing neurosurgeon scientists was successfully initiated to train a model for predicting ICH. The FL model achieved an area under the ROC curve of 0.9487 (95% CI 0.9471-0.9503) when predicting all subtypes of ICH compared with a benchmark (non-FL) area under the ROC curve of 0.9753 (95% CI 0.9742-0.9764), although performance varied by subtype. The FL model consistently achieved top three performance when validated on any site's data, suggesting improved generalizability. A qualitative survey described the experience of participants in the federated network. CONCLUSION This study demonstrates the feasibility of implementing a federated network for multi-institutional collaboration among clinicians and using FL to conduct machine learning research, thereby opening a new paradigm for neurosurgical collaboration.
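The "centrally updating one global model" step in federated learning is classically done by federated averaging (FedAvg): each site trains locally, and only parameter updates — never patient data — are combined. The study's exact aggregation details are not given in the abstract, so the sketch below shows the canonical size-weighted average as an illustration.

```python
# Hedged sketch of federated averaging (FedAvg): combine per-site model
# parameters, weighted by each site's local dataset size, to form the
# next global model. No raw data ever leaves a site.
import numpy as np

def fed_avg(site_weights, site_sizes):
    """Average per-site parameter arrays, weighted by local dataset size.

    site_weights: list of models, each a list of per-layer numpy arrays.
    site_sizes:   number of local training examples at each site.
    """
    total = sum(site_sizes)
    return [
        sum(n / total * w[layer] for w, n in zip(site_weights, site_sizes))
        for layer in range(len(site_weights[0]))
    ]

# Two hypothetical sites, each with one parameter layer.
site_a = [np.array([1.0, 3.0])]
site_b = [np.array([3.0, 5.0])]
global_weights = fed_avg([site_a, site_b], site_sizes=[100, 300])
# Weighted average: 0.25 * [1, 3] + 0.75 * [3, 5] = [2.5, 4.5]
```

In practice this loop repeats for many communication rounds: the global weights are redistributed to the sites, trained locally, and re-averaged.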
Affiliation(s)
- Chris Liu
- Department of Neurosurgery, NYU Langone Health, New York, New York, USA
- Lavender Jiang
- Department of Neurosurgery, NYU Langone Health, New York, New York, USA; Center for Data Science, New York University, New York, New York, USA
- Silky Chotai
- Department of Neurosurgery, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Lola Chambless
- Department of Neurosurgery, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Hasan S Ahmad
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Daksh Chauhan
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Jang W Yoon
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Todd Hollon
- Department of Neurosurgery, University of Michigan School of Medicine, Ann Arbor, Michigan, USA
- Vivek Buch
- Department of Neurosurgery, Stanford University School of Medicine, Stanford, California, USA
- Dinah Chen
- Department of Ophthalmology, NYU Langone Health, New York, New York, USA
- Lama A Al-Aswad
- Department of Ophthalmology, NYU Langone Health, New York, New York, USA
- Eric Karl Oermann
- Department of Neurosurgery, NYU Langone Health, New York, New York, USA; Center for Data Science, New York University, New York, New York, USA; Department of Radiology, NYU Langone Health, New York, New York, USA

15
Choe J, Lee SM, Hwang HJ, Lee SM, Yun J, Kim N, Seo JB. Artificial Intelligence in Lung Imaging. Semin Respir Crit Care Med 2022; 43:946-960. [PMID: 36174647] [DOI: 10.1055/s-0042-1755571]
Abstract
Recently, interest and advances in artificial intelligence (AI) including deep learning for medical images have surged. As imaging plays a major role in the assessment of pulmonary diseases, various AI algorithms have been developed for chest imaging. Some of these have been approved by governments and are now commercially available in the marketplace. In the field of chest radiology, there are various tasks and purposes that are suitable for AI: initial evaluation/triage of certain diseases, detection and diagnosis, quantitative assessment of disease severity and monitoring, and prediction for decision support. While AI is a powerful technology that can be applied to medical imaging and is expected to improve our current clinical practice, some obstacles must be addressed for the successful implementation of AI in workflows. Understanding and becoming familiar with the current status and potential clinical applications of AI in chest imaging, as well as remaining challenges, would be essential for radiologists and clinicians in the era of AI. This review introduces the potential clinical applications of AI in chest imaging and also discusses the challenges for the implementation of AI in daily clinical practice and future directions in chest imaging.
Affiliation(s)
- Jooae Choe
- Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Korea
- Sang Min Lee
- Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Korea
- Hye Jeon Hwang
- Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Korea
- Sang Min Lee
- Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Korea
- Jihye Yun
- Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Korea
- Namkug Kim
- Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Korea; Department of Convergence Medicine, Biomedical Engineering Research Center, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Korea
- Joon Beom Seo
- Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Korea

16
Liu W, Zhang X, Lv H, Li J, Liu Y, Yang Z, Weng X, Lin Y, Song H, Wang Z. Using a classification model for determining the value of liver radiological reports of patients with colorectal cancer. Front Oncol 2022; 12:913806. [DOI: 10.3389/fonc.2022.913806]
Abstract
BACKGROUND Medical imaging is critical in clinical practice, and high-value radiological reports can positively assist clinicians. However, there is a lack of methods for determining the value of reports. OBJECTIVE The purpose of this study was to establish an ensemble learning classification model using natural language processing (NLP), applied to the Chinese free text of radiological reports, to determine their value for liver lesion detection in patients with colorectal cancer (CRC). METHODS Radiological reports of upper abdominal computed tomography (CT) and magnetic resonance imaging (MRI) were divided into five categories according to the results of liver lesion detection in patients with CRC. NLP methods including word segmentation, stop-word removal, and n-gram language model construction were applied to each dataset. A bag-of-words model was then built, high-frequency words were selected as features, and an ensemble learning classification model was constructed. Several machine learning methods were applied, including logistic regression (LR), random forest (RF), and others. We compared the accuracy of an a priori search for pertinent word strings against our machine learning methods. RESULTS The dataset of 2790 patients included CT without contrast (10.2%), CT with/without contrast (73.3%), MRI without contrast (1.8%), and MRI with/without contrast (14.6%). The ensemble learning classification model determined the value of reports effectively, reaching 95.91% accuracy on the CT with/without contrast dataset using XGBoost. Logistic regression, random forest, and support vector machine also achieved good classification accuracy, reaching 95.89%, 95.04%, and 95.00%, respectively. The results of XGBoost were visualized using a confusion matrix; the numbers of errors in categories I, II, and V were very small. ELI5 was used to select important words for each category. Words such as "no abnormality", "suggest", "fatty liver", and "transfer" showed a relatively strong positive correlation with classification accuracy. The accuracy of the string-pattern search model was lower than that of machine learning. CONCLUSIONS The learning classification model based on NLP was an effective tool for determining the value of radiological reports focused on liver lesions. The study made it possible to analyze the value of medical imaging examinations on a large scale.
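The feature pipeline the abstract describes — n-gram bag-of-words features feeding a boosted-tree classifier — can be sketched as below. English toy reports stand in for the Chinese free text, toy value categories are an assumption, and scikit-learn's gradient boosting is used in place of XGBoost to keep the example dependency-free; none of this reproduces the paper's actual data or settings.

```python
# Hedged sketch: bag-of-words uni/bigram features + a boosted-tree
# classifier for assigning a radiology report to a value category.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import make_pipeline

# Toy English stand-ins for the Chinese report text (assumption).
reports = ["no abnormality in liver", "fatty liver noted",
           "suspected metastatic lesion suggest mri", "no abnormality seen"]
labels = [0, 1, 2, 0]  # hypothetical value categories

model = make_pipeline(
    CountVectorizer(ngram_range=(1, 2)),      # unigrams + bigrams
    GradientBoostingClassifier(random_state=0),
)
model.fit(reports, labels)
pred = model.predict(["no abnormality found"])[0]
```

In the real pipeline, Chinese word segmentation and stop-word removal would happen before vectorization, and high-frequency words would be selected as the feature set.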
17
Fink MA, Kades K, Bischoff A, Moll M, Schnell M, Küchler M, Köhler G, Sellner J, Heussel CP, Kauczor HU, Schlemmer HP, Maier-Hein K, Weber TF, Kleesiek J. Deep Learning-based Assessment of Oncologic Outcomes from Natural Language Processing of Structured Radiology Reports. Radiol Artif Intell 2022; 4:e220055. [PMID: 36204531] [PMCID: PMC9530771] [DOI: 10.1148/ryai.220055]
Abstract
PURPOSE To train a deep natural language processing (NLP) model, using data mined structured oncology reports (SOR), for rapid tumor response category (TRC) classification from free-text oncology reports (FTOR) and to compare its performance with human readers and conventional NLP algorithms. MATERIALS AND METHODS In this retrospective study, databases of three independent radiology departments were queried for SOR and FTOR dated from March 2018 to August 2021. An automated data mining and curation pipeline was developed to extract Response Evaluation Criteria in Solid Tumors-related TRCs for SOR for ground truth definition. The deep NLP bidirectional encoder representations from transformers (BERT) model and three feature-rich algorithms were trained on SOR to predict TRCs in FTOR. Models' F1 scores were compared against scores of radiologists, medical students, and radiology technologist students. Lexical and semantic analyses were conducted to investigate human and model performance on FTOR. RESULTS Oncologic findings and TRCs were accurately mined from 9653 of 12 833 (75.2%) queried SOR, yielding oncology reports from 10 455 patients (mean age, 60 years ± 14 [SD]; 5303 women) who met inclusion criteria. On 802 FTOR in the test set, BERT achieved better TRC classification results (F1, 0.70; 95% CI: 0.68, 0.73) than the best-performing reference linear support vector classifier (F1, 0.63; 95% CI: 0.61, 0.66) and technologist students (F1, 0.65; 95% CI: 0.63, 0.67), had similar performance to medical students (F1, 0.73; 95% CI: 0.72, 0.75), but was inferior to radiologists (F1, 0.79; 95% CI: 0.78, 0.81). Lexical complexity and semantic ambiguities in FTOR influenced human and model performance, revealing maximum F1 score drops of -0.17 and -0.19, respectively. 
CONCLUSION The developed deep NLP model reached the performance level of medical students but not radiologists in curating oncologic outcomes from radiology FTOR. Keywords: Neural Networks, Computer Applications-Detection/Diagnosis, Oncology, Research Design, Staging, Tumor Response, Comparative Studies, Decision Analysis, Experimental Investigations, Observer Performance, Outcomes Analysis. Supplemental material is available for this article. © RSNA, 2022.
18
Gunter D, Puac-Polanco P, Miguel O, Thornhill RE, Yu AYX, Liu ZA, Mamdani M, Pou-Prom C, Aviv RI. Rule-based natural language processing for automation of stroke data extraction: a validation study. Neuroradiology 2022; 64:2357-2362. [PMID: 35913525] [DOI: 10.1007/s00234-022-03029-1]
Abstract
PURPOSE Data extraction from radiology free-text reports is time consuming when performed manually. Recently, more automated extraction methods using natural language processing (NLP) have been proposed. A previously developed rule-based NLP algorithm showed promise in its ability to extract stroke-related data from radiology reports. We aimed to externally validate the accuracy of CHARTextract, a rule-based NLP algorithm, to extract stroke-related data from free-text radiology reports. METHODS Free-text reports of CT angiography (CTA) and perfusion (CTP) studies of consecutive patients with acute ischemic stroke admitted to a regional stroke center for endovascular thrombectomy were analyzed from January 2015 to 2021. Stroke-related variables were manually extracted as reference standard from clinical reports, including proximal and distal anterior circulation occlusion, posterior circulation occlusion, presence of ischemia or hemorrhage, Alberta stroke program early CT score (ASPECTS), and collateral status. These variables were simultaneously extracted using a rule-based NLP algorithm. The NLP algorithm's accuracy, specificity, sensitivity, positive predictive value (PPV), and negative predictive value (NPV) were assessed. RESULTS The NLP algorithm's accuracy was > 90% for identifying distal anterior occlusion, posterior circulation occlusion, hemorrhage, and ASPECTS. Accuracy was 85%, 74%, and 79% for proximal anterior circulation occlusion, presence of ischemia, and collateral status respectively. The algorithm confirmed the absence of variables from radiology reports with an 87-100% accuracy. CONCLUSIONS Rule-based NLP has a moderate to good performance for stroke-related data extraction from free-text imaging reports. The algorithm's accuracy was affected by inconsistent report styles and lexicon among reporting radiologists.
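A rule-based extractor of this kind applies hand-written patterns per variable. CHARTextract's actual rules are not published in the abstract, so the patterns below are illustrative assumptions, including a deliberately naive negation check of the sort whose failures on inconsistent report styles the study describes.

```python
# Hedged sketch of rule-based stroke-variable extraction: simple keyword
# and regex rules with a crude negation check. Illustrative only — not
# the study's actual rule set.
import re

def extract(report: str) -> dict:
    text = report.lower()

    def negated(term: str) -> bool:
        return bool(re.search(rf"\bno (evidence of )?{term}", text))

    out = {
        # Crude co-occurrence rule for a proximal anterior occlusion.
        "proximal_occlusion": "occlu" in text and ("m1" in text or "ica" in text),
        "hemorrhage": "hemorrhage" in text and not negated("hemorrhage"),
    }
    m = re.search(r"\baspects\s*(?:score)?\s*(?:of|:)?\s*(\d{1,2})\b", text)
    out["aspects"] = int(m.group(1)) if m else None
    return out

result = extract("CTA: occlusion of the left M1 segment. ASPECTS of 8. No hemorrhage.")
# → proximal_occlusion True, hemorrhage False, aspects 8
```

Rules like these are transparent and fast but brittle: a report phrased "hemorrhage is not identified" would defeat the negation pattern above, which is exactly the lexicon-sensitivity the study reports.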
Affiliation(s)
- Dane Gunter
- The Ottawa Hospital Research Institute, Ottawa, ON, Canada
- Paulo Puac-Polanco
- Department of Radiology, Radiation Oncology and Medical Physics, University of Ottawa, The Ottawa Hospital Civic Campus Room C110, 1053 Carling Ave, Ottawa, ON K1Y 4E9, Canada
- Olivier Miguel
- Department of Radiology, Radiation Oncology and Medical Physics, University of Ottawa, The Ottawa Hospital Civic Campus Room C110, 1053 Carling Ave, Ottawa, ON K1Y 4E9, Canada
- Rebecca E Thornhill
- Division of Medical Physics, Department of Radiology, Radiation Oncology and Medical Physics, University of Ottawa, Ottawa, ON, Canada
- Amy Y X Yu
- Department of Medicine (Neurology), University of Toronto, Sunnybrook Health Sciences Centre, Toronto, ON, Canada
- Zhongyu A Liu
- Department of Medicine (Neurology), University of Toronto, Sunnybrook Health Sciences Centre, Toronto, ON, Canada
- Muhammad Mamdani
- Department of Medicine, Unity Health Toronto, University of Toronto, Toronto, ON, Canada
- Richard I Aviv
- The Ottawa Hospital Research Institute, Ottawa, ON, Canada; Department of Radiology, Radiation Oncology and Medical Physics, University of Ottawa, The Ottawa Hospital Civic Campus Room C110, 1053 Carling Ave, Ottawa, ON K1Y 4E9, Canada

19
Miller MI, Orfanoudaki A, Cronin M, Saglam H, So Yeon Kim I, Balogun O, Tzalidi M, Vasilopoulos K, Fanaropoulou G, Fanaropoulou NM, Kalin J, Hutch M, Prescott BR, Brush B, Benjamin EJ, Shin M, Mian A, Greer DM, Smirnakis SM, Ong CJ. Natural Language Processing of Radiology Reports to Detect Complications of Ischemic Stroke. Neurocrit Care 2022; 37:291-302. [PMID: 35534660] [PMCID: PMC9986939] [DOI: 10.1007/s12028-022-01513-3]
Abstract
BACKGROUND Abstraction of critical data from unstructured radiologic reports using natural language processing (NLP) is a powerful tool to automate the detection of important clinical features and enhance research efforts. We present a set of NLP approaches to identify critical findings in patients with acute ischemic stroke from radiology reports of computed tomography (CT) and magnetic resonance imaging (MRI). METHODS We trained machine learning classifiers to identify categorical outcomes of edema, midline shift (MLS), hemorrhagic transformation, and parenchymal hematoma, as well as rule-based systems (RBS) to identify intraventricular hemorrhage (IVH) and continuous MLS measurements within CT/MRI reports. Using a derivation cohort of 2289 reports from 550 individuals with acute middle cerebral artery territory ischemic strokes, we externally validated our models on reports from a separate institution as well as from patients with ischemic strokes in any vascular territory. RESULTS In all data sets, a deep neural network with pretrained biomedical word embeddings (BioClinicalBERT) achieved the highest discrimination performance for binary prediction of edema (area under precision recall curve [AUPRC] > 0.94), MLS (AUPRC > 0.98), hemorrhagic conversion (AUPRC > 0.89), and parenchymal hematoma (AUPRC > 0.76). BioClinicalBERT outperformed lasso regression (p < 0.001) for all outcomes except parenchymal hematoma (p = 0.755). Tailored RBS for IVH and continuous MLS outperformed BioClinicalBERT (p < 0.001) and linear regression, respectively (p < 0.001). CONCLUSIONS Our study demonstrates robust performance and external validity of a core NLP tool kit for identifying both categorical and continuous outcomes of ischemic stroke from unstructured radiographic text data. 
Medically tailored NLP methods have multiple important big data applications, including scalable electronic phenotyping, augmentation of clinical risk prediction models, and facilitation of automatic alert systems in the hospital setting.
Affiliation(s)
- Matthew I Miller
- Department of Neurology, Boston University School of Medicine, 85 E. Concord St., Suite 1116, Boston, MA, 02118, USA
- Michael Cronin
- Department of Neurology, Boston University School of Medicine, 85 E. Concord St., Suite 1116, Boston, MA, 02118, USA
- Hanife Saglam
- Department of Neurology, West Virginia University School of Medicine, Morgantown, WV, USA
- Oluwafemi Balogun
- Boston Medical Center, Boston, MA, USA; Boston University School of Public Health, Boston, MA, USA
- Maria Tzalidi
- School of Medicine, University of Crete, Heraklion, Greece
- Nina M Fanaropoulou
- School of Medicine, Faculty of Health Sciences, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Jack Kalin
- Department of Neurology, Boston University School of Medicine, 85 E. Concord St., Suite 1116, Boston, MA, 02118, USA
- Meghan Hutch
- Department of Preventive Medicine, Northwestern University, Chicago, IL, USA; Department of Neurology, Brigham and Women's Hospital, Boston, MA, USA
- Benjamin Brush
- Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
- Emelia J Benjamin
- Department of Neurology, Boston University School of Medicine, 85 E. Concord St., Suite 1116, Boston, MA, 02118, USA; Boston University School of Public Health, Boston, MA, USA
- Min Shin
- Department of Computer Science, University of North Carolina at Charlotte, Charlotte, NC, USA
- Asim Mian
- Department of Radiology, Boston Medical Center, Boston, MA, USA
- David M Greer
- Department of Neurology, Boston University School of Medicine, 85 E. Concord St., Suite 1116, Boston, MA, 02118, USA; Boston Medical Center, Boston, MA, USA
- Stelios M Smirnakis
- Department of Neurology, Brigham and Women's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA; Jamaica Plain Veterans Administration Hospital, Boston, MA, USA
- Charlene J Ong
- Department of Neurology, Boston University School of Medicine, 85 E. Concord St., Suite 1116, Boston, MA, 02118, USA; Boston Medical Center, Boston, MA, USA; Department of Neurology, Brigham and Women's Hospital, Boston, MA, USA; Department of Neurology, Massachusetts General Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA

20
Automatic detection of actionable findings and communication mentions in radiology reports using natural language processing. Eur Radiol 2022; 32:3996-4002. [PMID: 34989840] [DOI: 10.1007/s00330-021-08467-8]
Abstract
OBJECTIVES To develop and validate classifiers for automatic detection of actionable findings and documentation of nonroutine communication in routinely delivered radiology reports. METHODS Two radiologists annotated all actionable findings and communication mentions in a training set of 1,306 radiology reports and a test set of 1,000 reports randomly selected from the electronic health record system of a large tertiary hospital. Various feature sets were constructed based on the impression section of the reports using different preprocessing steps (stemming, removal of stop words, negations, and previously known or stable findings) and n-grams. Random forest classifiers were trained to detect actionable findings, and a decision-rule classifier was trained to find communication mentions. Classifier performance was evaluated by the area under the receiver operating characteristic curve (AUC), sensitivity, and specificity. RESULTS On the training set, the actionable finding classifier with the highest cross-validated performance was obtained for a feature set of unigrams, after stemming and removal of negated, known, and stable findings. On the test set, this classifier achieved an AUC of 0.876 (95% CI 0.854-0.898). The classifier for communication detection was trained after negation removal, using unigrams as features. The resultant decision rule had a sensitivity of 0.841 (95% CI 0.706-0.921) and specificity of 0.990 (95% CI 0.981-0.994) on the test set. CONCLUSIONS Automatic detection of actionable findings and subsequent communication in routinely delivered radiology reports is possible. This can serve quality control purposes and may alert radiologists to the presence of actionable findings during reporting. KEY POINTS • Classifiers were developed for automatic detection of the broad spectrum of actionable findings and subsequent communication mentions in routinely delivered radiology reports. 
• Straightforward report preprocessing and simple feature sets can produce well-performing classifiers. • The resultant classifiers show good performance for detection of actionable findings and excellent performance for detection of communication mentions.
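The best-performing configuration described above — unigram features from preprocessed impression text, a random forest classifier, AUC evaluation — can be sketched as below. The toy impressions and labels are assumptions, and the richer preprocessing (stemming; removal of negated, known, and stable findings) is omitted for brevity.

```python
# Hedged sketch: unigram bag-of-words features + random forest for
# flagging actionable findings in radiology impressions. Toy data only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

impressions = ["new pulmonary nodule needs follow up", "stable appearance",
               "suspicious mass recommend biopsy", "normal study",
               "enlarging lesion urgent referral", "no acute findings"]
actionable = [1, 0, 1, 0, 1, 0]  # hypothetical labels

vec = CountVectorizer(ngram_range=(1, 1))  # unigrams, as in the best model
X = vec.fit_transform(impressions)
clf = RandomForestClassifier(random_state=0).fit(X, actionable)

# In-sample AUC shown for brevity; the paper evaluates on a held-out
# test set of 1,000 reports.
auc = roc_auc_score(actionable, clf.predict_proba(X)[:, 1])
```

The finding that a simple unigram representation beat richer n-gram sets after good preprocessing is a useful reminder that feature cleaning often matters more than feature complexity.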
21
Iorga M, Drakopoulos M, Naidech AM, Katsaggelos AK, Parrish TB, Hill VB. Labeling Noncontrast Head CT Reports for Common Findings Using Natural Language Processing. AJNR Am J Neuroradiol 2022; 43:721-726. [PMID: 35483905] [PMCID: PMC9089256] [DOI: 10.3174/ajnr.a7500]
Abstract
BACKGROUND AND PURPOSE Prioritizing reading of noncontrast head CT examinations through an automated triage system may improve time to care for patients with acute neuroradiologic findings. We present a natural language-processing approach for labeling findings in noncontrast head CT reports, which permits creation of a large, labeled dataset of head CT images for development of emergent-finding detection and reading-prioritization algorithms. MATERIALS AND METHODS In this retrospective study, 1002 clinical radiology reports from noncontrast head CTs collected between 2008 and 2013 were manually labeled across 12 common neuroradiologic finding categories. Each report was then encoded using an n-gram model of unigrams, bigrams, and trigrams. A logistic regression model was then trained to label each report for every common finding. Models were trained and assessed using a combination of L2 regularization and 5-fold cross-validation. RESULTS Model performance was strongest for the fracture, hemorrhage, herniation, mass effect, pneumocephalus, postoperative status, and volume loss models in which the area under the receiver operating characteristic curve exceeded 0.95. Performance was relatively weaker for the edema, hydrocephalus, infarct, tumor, and white-matter disease models (area under the receiver operating characteristic curve > 0.85). Analysis of coefficients revealed finding-specific words among the top coefficients in each model. Class output probabilities were found to be a useful indicator of predictive error on individual report examples in higher-performing models. CONCLUSIONS Combining logistic regression with n-gram encoding is a robust approach to labeling common findings in noncontrast head CT reports.
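The core recipe here — n-gram encoding (unigrams through trigrams) feeding an L2-regularized logistic regression, scored by cross-validation — can be sketched as below for one finding category. The toy reports and labels are assumptions, and 3-fold CV is used only because the toy set is tiny (the paper uses 5-fold on 1002 reports, one binary model per finding).

```python
# Hedged sketch: n-gram logistic regression for one head-CT finding
# category (hemorrhage), with cross-validated scoring. Toy data only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

reports = ["acute subdural hemorrhage with midline shift",
           "no acute intracranial abnormality",
           "hemorrhagic contusion in the right temporal lobe",
           "no hemorrhage or mass effect",
           "intraparenchymal hemorrhage left frontal lobe",
           "unremarkable noncontrast head ct"]
hemorrhage = [1, 0, 1, 0, 1, 0]  # hypothetical binary labels

model = make_pipeline(
    CountVectorizer(ngram_range=(1, 3)),          # uni-, bi-, trigrams
    LogisticRegression(penalty="l2", max_iter=1000),  # L2 regularization
)
scores = cross_val_score(model, reports, hemorrhage, cv=3)
```

A linear model also makes the coefficient analysis the paper performs straightforward: the learned weight on each n-gram directly indicates which words drive each finding label.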
Affiliation(s)
- M Iorga
  - From the Departments of Radiology (M.I., M.D., T.B.P., V.B.H.)
  - Departments of Biomedical Engineering (M.I., A.K.K., T.B.P.)
- M Drakopoulos
  - From the Departments of Radiology (M.I., M.D., T.B.P., V.B.H.)
- A M Naidech
  - Neurology (A.M.N.), Northwestern University Feinberg School of Medicine, Chicago, Illinois
- A K Katsaggelos
  - Departments of Biomedical Engineering (M.I., A.K.K., T.B.P.)
  - Electrical and Computer Engineering (A.K.K.)
  - Computer Science (A.K.K.), Northwestern University, Chicago, Illinois
- T B Parrish
  - From the Departments of Radiology (M.I., M.D., T.B.P., V.B.H.)
  - Departments of Biomedical Engineering (M.I., A.K.K., T.B.P.)
- V B Hill
  - From the Departments of Radiology (M.I., M.D., T.B.P., V.B.H.)
22
Overview of Deep Learning Models in Biomedical Domain with the Help of R Statistical Software. SERBIAN JOURNAL OF EXPERIMENTAL AND CLINICAL RESEARCH 2022. [DOI: 10.2478/sjecr-2018-0063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
With the increase in the volume of data and the presence of structured and unstructured data in the biomedical field, there is a need for models that can handle complex and non-linear relations in the data and predict and classify outcomes with high accuracy. Deep learning models are one such class of models: they can handle complex and nonlinear data and have been used increasingly in the biomedical field in recent years. Deep learning methodology evolved from artificial neural networks, which process the input data through multiple hidden layers at increasing levels of abstraction. Deep learning networks are used in various fields such as image processing, speech recognition, fraud detection, classification, and prediction. The objectives of this paper are to provide an overview of deep learning models and their application in the biomedical domain using the R statistical software. Deep learning concepts are illustrated using the R statistical software package, and X-ray images from NIH datasets are used to demonstrate the prediction accuracy of the deep learning models, which classified the outcomes under study with 91% accuracy. The paper provides an overview of deep learning models, their types, and their application in the biomedical domain, and shows the effect of a deep learning network in classifying images as normal or diseased with 91% accuracy with the help of the R statistical package.
23
Linna N, Kahn CE. Applications of Natural Language Processing in Radiology: A Systematic Review. Int J Med Inform 2022; 163:104779. [DOI: 10.1016/j.ijmedinf.2022.104779] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2022] [Revised: 03/28/2022] [Accepted: 04/21/2022] [Indexed: 12/27/2022]
24
Automated Radiology-Arthroscopy Correlation of Knee Meniscal Tears Using Natural Language Processing Algorithms. Acad Radiol 2022; 29:479-487. [PMID: 33583713 DOI: 10.1016/j.acra.2021.01.017] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2020] [Revised: 01/19/2021] [Accepted: 01/21/2021] [Indexed: 12/29/2022]
Abstract
RATIONALE AND OBJECTIVES Train and apply natural language processing (NLP) algorithms for automated radiology-arthroscopy correlation of meniscal tears. MATERIALS AND METHODS In this retrospective single-institution study, we trained supervised machine learning models (logistic regression, support vector machine, and random forest) to detect medial or lateral meniscus tears on free-text MRI reports. We trained and evaluated model performances with cross-validation using 3593 manually annotated knee MRI reports. To assess radiology-arthroscopy correlation, we then randomly partitioned this dataset 80:20 for training and testing, where 108 test set MRIs were followed by knee arthroscopy within 1 year. These free-text arthroscopy reports were also manually annotated. The NLP algorithms trained on the knee MRI training dataset were then evaluated on the MRI and arthroscopy report test datasets. We assessed radiology-arthroscopy agreement using the ensembled NLP-extracted findings versus manually annotated findings. RESULTS The NLP models showed high cross-validation performance for meniscal tear detection on knee MRI reports (medial meniscus F1 scores 0.93-0.94, lateral meniscus F1 scores 0.86-0.88). When these algorithms were evaluated on arthroscopy reports, despite never training on arthroscopy reports, performance was similar, though higher with model ensembling (medial meniscus F1 score 0.97, lateral meniscus F1 score 0.99). However, ensembling did not improve performance on knee MRI reports. In the radiology-arthroscopy test set, the ensembled NLP models were able to detect mismatches between MRI and arthroscopy reports with sensitivity 79% and specificity 87%. CONCLUSION Radiology-arthroscopy correlation can be automated for knee meniscal tears using NLP algorithms, which shows promise for education and quality improvement.
25
Crombé A, Seux M, Bratan F, Bergerot JF, Banaste N, Thomson V, Lecomte JC, Gorincour G. What Influences the Way Radiologists Express Themselves in Their Reports? A Quantitative Assessment Using Natural Language Processing. J Digit Imaging 2022; 35:993-1007. [PMID: 35318544 PMCID: PMC8939885 DOI: 10.1007/s10278-022-00619-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/14/2021] [Revised: 03/07/2022] [Accepted: 03/09/2022] [Indexed: 11/29/2022] Open
Abstract
Although using standardized reports is encouraged, most emergency radiological reports in France remain in free-text format that can be mined with natural language processing for epidemiological purposes, activity monitoring or data collection. These reports are obtained under various on-call conditions by radiologists with various backgrounds. Our aim was to investigate what influences the radiologists' written expressions. To do so, this retrospective multicentric study included 30,227 emergency radiological reports of computed tomography scans and magnetic resonance imaging involving exactly one body region, only with pathological findings, interpreted from 2019-09-01 to 2020-02-28 by 165 radiologists. After text pre-processing, one-word tokenization and use of dictionaries for stop words, polarity, sentiment and uncertainty, 11 variables depicting the structure and content of words and sentences in the reports were extracted and summarized to 3 principal components capturing 93.7% of the dataset variance. In multivariate analysis, the 1st principal component summarized the length and lexical diversity of the reports and was significantly influenced by the weekday, time slot, workload, number of examinations previously interpreted by the radiologist during the on-call period, type of examination, emergency level and radiologists' gender (P value range: < 0.0001-0.0029). The 2nd principal component summarized negative formulations, polarity and sentence length and was correlated with the number of examinations previously interpreted by the radiologist, type of examination, emergency level, imaging modality and radiologists' experience (P value range: < 0.0001-0.0032). The last principal component summarized questioning, uncertainty and polarity and was correlated with the type of examination and emergency level (all P values < 0.0001).
Thus, the length, structure and content of emergency radiological reports were significantly influenced by organizational, radiologist- and examination-related characteristics, highlighting the subjectivity and variability in the way radiologists express themselves during their clinical activity. These findings advocate for more homogeneous practices in radiological reporting and stress the need to consider these influential features when developing models based on natural language processing.
Affiliation(s)
- Amandine Crombé
  - IMADIS, 48 rue quivogne, 63002, Lyon, France
  - University of Bordeaux, 33000, Bordeaux, France
- Mylène Seux
  - IMADIS, 48 rue quivogne, 63002, Lyon, France
- Flavie Bratan
  - IMADIS, 48 rue quivogne, 63002, Lyon, France
  - Department of Diagnostic and Interventional Imaging, Centre Hospitalier Saint-Joseph Saint-Luc, 69007, Lyon, France
- Jean-François Bergerot
  - IMADIS, 48 rue quivogne, 63002, Lyon, France
  - Ramsay Générale de Santé, Clinique Convert, 01000, Bourg-en-Bresse, France
- Nathan Banaste
  - IMADIS, 48 rue quivogne, 63002, Lyon, France
  - Department of Radiology, Hôpital Nord-Ouest, 69400, Villefranche-sur-Saône, France
- Vivien Thomson
  - IMADIS, 48 rue quivogne, 63002, Lyon, France
  - Ramsay Générale de Santé, Clinique de la Sauvegarde, 69009, Lyon, France
- Jean-Christophe Lecomte
  - IMADIS, 48 rue quivogne, 63002, Lyon, France
  - Centre Hospitalier de Saintonge, 17100, Saintes, France
  - Centre Aquitain d'Imagerie, 33600, Pessac, France
- Guillaume Gorincour
  - IMADIS, 48 rue quivogne, 63002, Lyon, France
  - ELSAN, Clinique Bouchard, 13006, Marseille, France
26
Jujjavarapu C, Pejaver V, Cohen TA, Mooney SD, Heagerty PJ, Jarvik JG. A Comparison of Natural Language Processing Methods for the Classification of Lumbar Spine Imaging Findings Related to Lower Back Pain. Acad Radiol 2022; 29 Suppl 3:S188-S200. [PMID: 34862122 PMCID: PMC8917985 DOI: 10.1016/j.acra.2021.09.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2021] [Revised: 08/22/2021] [Accepted: 09/04/2021] [Indexed: 11/28/2022]
Abstract
RATIONALE AND OBJECTIVES The use of natural language processing (NLP) in radiology provides an opportunity to assist clinicians with phenotyping patients. However, the performance and generalizability of NLP across healthcare systems is uncertain. We assessed the performance within and generalizability across four healthcare systems of different NLP representational methods, coupled with elastic-net logistic regression to classify lower back pain-related findings from lumbar spine imaging reports. MATERIALS AND METHODS We used a dataset of 871 X-ray and magnetic resonance imaging reports sampled from a prospective study across four healthcare systems between October 2013 and September 2016. We annotated each report for 26 findings potentially related to lower back pain. Our framework applied four different NLP methods to convert text into feature sets (representations). For each representation, our framework used an elastic-net logistic regression model for each finding (i.e., 26 binary or "one-vs.-rest" classification models). For performance evaluation, we split data into training (80%, 697/871) and testing (20%, 174/871). In the training set, we used cross validation to identify the optimal hyperparameter value and then retrained on the full training set. We then assessed performance based on area under the curve (AUC) for the test set. We repeated this process 25 times with each repeat using a different random train/test split of the data, so that we could estimate 95% confidence intervals, and assess significant difference in performance between representations. For generalizability evaluation, we trained models on data from three healthcare systems with cross validation and then tested on the fourth. We repeated this process for each system, then calculated mean and standard deviation (SD) of AUC across the systems. RESULTS For individual representations, n-grams had the best average performance across all 26 findings (AUC: 0.960). 
For generalizability, document embeddings had the most consistent average performance across systems (SD: 0.010). Out of these 26 findings, we considered eight as potentially clinically important (any stenosis, central stenosis, lateral stenosis, foraminal stenosis, disc extrusion, nerve root displacement compression, endplate edema, and listhesis grade 2) since they have a relatively greater association with a history of lower back pain compared to the remaining 18 classes. We found a similar pattern for these eight in which n-grams and document embeddings had the best average performance (AUC: 0.954) and generalizability (SD: 0.007), respectively. CONCLUSION Based on performance assessment, we found that n-grams is the preferred method if classifier development and deployment occur at the same system. However, for deployment at multiple systems outside of the development system, or potentially if physician behavior changes within a system, one should consider document embeddings since embeddings appear to have the most consistent performance across systems.
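One arm of this framework, an n-gram representation coupled with an elastic-net logistic regression for a single binary ("one-vs.-rest") finding and scored by AUC over repeated random train/test splits, might look like the following sketch. The reports, labels, and hyperparameter values are invented placeholders; the study's full framework repeats this for 26 findings and four representations.

```python
# Sketch: n-gram representation + elastic-net logistic regression for one
# binary finding, AUC averaged over repeated random 80/20 splits.
# Reports, labels, and hyperparameters are invented placeholders.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

reports = (
    ["severe central canal stenosis at l4-l5"] * 10
    + ["no significant spinal canal narrowing"] * 10
)
labels = np.array([1] * 10 + [0] * 10)  # 1 = central stenosis present

aucs = []
for seed in range(5):  # the study used 25 repeats; 5 keeps the sketch fast
    x_tr, x_te, y_tr, y_te = train_test_split(
        reports, labels, test_size=0.2, stratify=labels, random_state=seed
    )
    clf = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2)),  # n-gram representation
        LogisticRegression(                   # elastic-net penalty
            penalty="elasticnet", solver="saga", l1_ratio=0.5, max_iter=5000
        ),
    )
    clf.fit(x_tr, y_tr)
    aucs.append(roc_auc_score(y_te, clf.predict_proba(x_te)[:, 1]))

mean_auc = float(np.mean(aucs))
```

Swapping the vectorizer for a document-embedding step would give the representation the study found most consistent across healthcare systems.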
Affiliation(s)
- Chethan Jujjavarapu
  - Department of Biomedical Informatics and Medical Education, School of Medicine, University of Washington, Seattle, Washington
- Vikas Pejaver
  - Department of Biomedical Informatics and Medical Education, School of Medicine, University of Washington, Seattle, Washington
- Trevor A. Cohen
  - Department of Biomedical Informatics and Medical Education, School of Medicine, University of Washington, Seattle, Washington
- Sean D. Mooney
  - Department of Biomedical Informatics and Medical Education, School of Medicine, University of Washington, Seattle, Washington
- Patrick J. Heagerty
  - Department of Biostatistics, University of Washington, Seattle, Washington
  - Center for Biomedical Statistics, University of Washington, Seattle, Washington
- Jeffrey G. Jarvik
  - Department of Radiology, University of Washington, 1959 NE Pacific Street, Seattle, WA 98195
  - Department of Neurological Surgery, University of Washington, Seattle, Washington
  - Department of Health Services, University of Washington, Seattle, Washington
  - Clinical Learning, Evidence And Research Center, University of Washington, Seattle, Washington
27
Tiwari M, Piech C, Baitemirova M, Prajna NV, Srinivasan M, Lalitha P, Villegas N, Balachandar N, Chua JT, Redd T, Lietman TM, Thrun S, Lin CC. Differentiation of Active Corneal Infections from Healed Scars Using Deep Learning. Ophthalmology 2022; 129:139-146. [PMID: 34352302 PMCID: PMC8792172 DOI: 10.1016/j.ophtha.2021.07.033] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/18/2021] [Revised: 07/16/2021] [Accepted: 07/26/2021] [Indexed: 02/03/2023] Open
Abstract
PURPOSE To develop and evaluate an automated, portable algorithm to differentiate active corneal ulcers from healed scars using only external photographs. DESIGN A convolutional neural network was trained and tested using photographs of corneal ulcers and scars. PARTICIPANTS De-identified photographs of corneal ulcers were obtained from the Steroids for Corneal Ulcers Trial (SCUT), Mycotic Ulcer Treatment Trial (MUTT), and Byers Eye Institute at Stanford University. METHODS Photographs of corneal ulcers (n = 1313) and scars (n = 1132) from the SCUT and MUTT were used to train a convolutional neural network (CNN). The CNN was tested on 2 different patient populations from eye clinics in India (n = 200) and the Byers Eye Institute at Stanford University (n = 101). Accuracy was evaluated against gold standard clinical classifications. Feature importances for the trained model were visualized using gradient-weighted class activation mapping. MAIN OUTCOME MEASURES Accuracy of the CNN was assessed via F1 score. The area under the receiver operating characteristic (ROC) curve (AUC) was used to measure the precision-recall trade-off. RESULTS The CNN correctly classified 115 of 123 active ulcers and 65 of 77 scars in patients with corneal ulcers from India (F1 score, 92.0% [95% confidence interval (CI), 88.2%-95.8%]; sensitivity, 93.5% [95% CI, 89.1%-97.9%]; specificity, 84.42% [95% CI, 79.42%-89.42%]; ROC: AUC, 0.9731). The CNN correctly classified 43 of 55 active ulcers and 42 of 46 scars in patients with corneal ulcers from Northern California (F1 score, 84.3% [95% CI, 77.2%-91.4%]; sensitivity, 78.2% [95% CI, 67.3%-89.1%]; specificity, 91.3% [95% CI, 85.8%-96.8%]; ROC: AUC, 0.9474). The CNN visualizations correlated with clinically relevant features such as corneal infiltrate, hypopyon, and conjunctival injection. CONCLUSIONS The CNN classified corneal ulcers and scars with high accuracy and generalized to patient populations outside of its training data.
The CNN focused on clinically relevant features when it made a diagnosis. The CNN demonstrated potential as an inexpensive diagnostic approach that may aid triage in communities with limited access to eye care.
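The India test-set counts reported above (115 of 123 active ulcers and 65 of 77 scars correctly classified) fix the confusion matrix, from which the stated sensitivity, specificity, and F1 score can be reproduced directly:

```python
# Confusion matrix implied by the reported India test-set counts:
# 115/123 active ulcers correct, 65/77 scars correct.
tp, fn = 115, 123 - 115   # active ulcers: detected / missed
tn, fp = 65, 77 - 65      # scars: correct / misread as ulcers

sensitivity = tp / (tp + fn)   # 115/123, ~93.5% as reported
specificity = tn / (tn + fp)   # 65/77, ~84.42% as reported
precision = tp / (tp + fp)
f1 = 2 * precision * sensitivity / (precision + sensitivity)  # 92.0%
```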
Affiliation(s)
- Mo Tiwari
  - Department of Computer Science, Stanford University, Stanford, California
- Chris Piech
  - Department of Computer Science, Stanford University, Stanford, California
- Medina Baitemirova
  - Department of Biomedical Informatics, Stanford University, Stanford, California
- Janice T Chua
  - School of Medicine, University of California, Irvine, Irvine, California
- Travis Redd
  - Department of Ophthalmology, Casey Eye Institute, Oregon Health and Science University, Portland, Oregon
- Thomas M Lietman
  - Francis I. Proctor Foundation, University of California San Francisco, San Francisco, California
- Sebastian Thrun
  - Department of Computer Science, Stanford University, Stanford, California
- Charles C Lin
  - Byers Eye Institute, Stanford University, Stanford, California
28
AI musculoskeletal clinical applications: how can AI increase my day-to-day efficiency? Skeletal Radiol 2022; 51:293-304. [PMID: 34341865 DOI: 10.1007/s00256-021-03876-8] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/28/2021] [Revised: 07/21/2021] [Accepted: 07/21/2021] [Indexed: 02/02/2023]
Abstract
Artificial intelligence (AI) is expected to bring greater efficiency in radiology by performing tasks that would otherwise require human intelligence, also at a much faster rate than human performance. In recent years, milestone deep learning models with unprecedented low error rates and high computational efficiency have shown remarkable performance for lesion detection, classification, and segmentation tasks. However, the growing field of AI has significant implications for radiology that are not limited to visual tasks. These are essential applications for optimizing imaging workflows and improving noninterpretive tasks. This article offers an overview of the recent literature on AI, focusing on the musculoskeletal imaging chain, including initial patient scheduling, optimized protocoling, magnetic resonance imaging reconstruction, image enhancement, medical image-to-image translation, and AI-aided image interpretation. The substantial developments of advanced algorithms, the emergence of massive quantities of medical data, and the interest of researchers and clinicians reveal the potential for the growing applications of AI to augment the day-to-day efficiency of musculoskeletal radiologists.
29
Swain S, Bhushan B, Dhiman G, Viriyasitavat W. Appositeness of Optimized and Reliable Machine Learning for Healthcare: A Survey. ARCHIVES OF COMPUTATIONAL METHODS IN ENGINEERING : STATE OF THE ART REVIEWS 2022; 29:3981-4003. [PMID: 35342282 PMCID: PMC8939887 DOI: 10.1007/s11831-022-09733-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/05/2021] [Accepted: 02/09/2022] [Indexed: 05/04/2023]
Abstract
Machine Learning (ML) has been categorized as a branch of Artificial Intelligence (AI) under the Computer Science domain wherein programmable machines imitate human learning behavior with the help of statistical methods and data. The Healthcare industry is one of the largest and busiest sectors in the world, functioning with an extensive amount of manual moderation at every stage. Most of the clinical documents concerning patient care are hand-written by experts, while only selected reports are machine-generated. This process elevates the chances of misdiagnosis, thereby imposing a risk to a patient's life. Recent technological adoptions for automating manual operations have witnessed extensive use of ML in its applications. The paper surveys the applicability of ML approaches in automating medical systems. The paper discusses most of the optimized statistical ML frameworks that encourage better service delivery in clinical aspects. The universal adoption of various Deep Learning (DL) and ML techniques as the underlying systems for a variety of wellness applications is delineated by challenges and elevated by myriads of security concerns. This work tries to recognize a variety of vulnerabilities occurring in medical procurement, admitting the concerns over its predictive performance from a privacy point of view. Finally, the paper provides possible risk-delimiting facts and directions for active challenges in the future.
Affiliation(s)
- Subhasmita Swain
  - Department of Computer Science and Engineering, School of Engineering and Technology, Sharda University, Greater Noida, India
- Bharat Bhushan
  - Department of Computer Science and Engineering, School of Engineering and Technology, Sharda University, Greater Noida, India
- Gaurav Dhiman
  - Department of Computer Science, Government Bikram College of Commerce, Patiala, India
  - University Centre for Research and Development, Department of Computer Science and Engineering, Chandigarh University, Gharuan, Mohali, India
  - Department of Computer Science and Engineering, Graphic Era Deemed to be University, Dehradun, India
- Wattana Viriyasitavat
  - Department of Statistics, Faculty of Commerce and Accountancy, Chulalongkorn Business School, Bangkok, Thailand
30
Buchlak QD, Esmaili N, Bennett C, Farrokhi F. Natural Language Processing Applications in the Clinical Neurosciences: A Machine Learning Augmented Systematic Review. ACTA NEUROCHIRURGICA. SUPPLEMENT 2022; 134:277-289. [PMID: 34862552 DOI: 10.1007/978-3-030-85292-4_32] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
Abstract
Natural language processing (NLP), a domain of artificial intelligence (AI) that models human language, has been used in medicine to automate diagnostics, detect adverse events, support decision making and predict clinical outcomes. However, applications to the clinical neurosciences appear to be limited. NLP has matured with the implementation of deep transformer models (e.g., XLNet, BERT, T5, and RoBERTa) and transfer learning. The objectives of this study were to (1) systematically review NLP applications in the clinical neurosciences, and (2) explore NLP analysis to facilitate literature synthesis, providing clear examples to demonstrate the potential capabilities of these technologies for a clinical audience. Our NLP analysis consisted of keyword identification, text summarization and document classification. A total of 48 articles met inclusion criteria. NLP has been applied in the clinical neurosciences to facilitate literature synthesis, data extraction, patient identification, automated clinical reporting and outcome prediction. The number of publications applying NLP has increased rapidly over the past five years. Document classifiers trained to differentiate included and excluded articles demonstrated moderate performance (XLNet AUC = 0.66, BERT AUC = 0.59, RoBERTa AUC = 0.62). The T5 transformer model generated acceptable abstract summaries. The application of NLP has the potential to enhance research and practice in the clinical neurosciences.
Affiliation(s)
- Quinlan D Buchlak
  - School of Medicine, The University of Notre Dame Australia, Sydney, NSW, Australia
- Nazanin Esmaili
  - School of Medicine, The University of Notre Dame Australia, Sydney, NSW, Australia
  - Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW, Australia
- Christine Bennett
  - School of Medicine, The University of Notre Dame Australia, Sydney, NSW, Australia
- Farrokh Farrokhi
  - Neuroscience Institute, Virginia Mason Medical Center, Seattle, WA, USA
31
Feghali J, Jimenez AE, Schilling AT, Azad TD. Overview of Algorithms for Natural Language Processing and Time Series Analyses. ACTA NEUROCHIRURGICA. SUPPLEMENT 2021; 134:221-242. [PMID: 34862546 DOI: 10.1007/978-3-030-85292-4_26] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Subscribe] [Scholar Register] [Indexed: 12/01/2022]
Abstract
A host of machine learning algorithms have been used to perform several different tasks in natural language processing (NLP) and time series analysis (TSA). Prior to implementing these algorithms, some degree of data preprocessing is required. Deep learning approaches utilizing multilayer perceptrons, recurrent neural networks (RNNs), and convolutional neural networks (CNNs) represent commonly used techniques. In supervised learning applications, all these models map inputs into a predicted output and then model the discrepancy between predicted values and the real output according to a loss function. The parameters of the mapping function are then optimized through the process of gradient descent and backward propagation in order to minimize this loss. This is the main premise behind many supervised learning algorithms. As experience with these algorithms grows, increased applications in the fields of medicine and neuroscience are anticipated.
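The supervised-learning premise summarized above (a model maps inputs to a predicted output, a loss function measures the discrepancy from the real output, and gradient descent with backward propagation adjusts the parameters to minimize that loss) reduces, in its smallest instance, to logistic regression trained by a hand-coded gradient loop. The toy AND-style task below is illustrative only.

```python
# Smallest instance of the supervised-learning loop: forward pass,
# cross-entropy loss, gradient computation, gradient-descent update.
# Toy AND-style task, invented for illustration.
import numpy as np

x = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 0., 0., 1.])

rng = np.random.default_rng(0)
w, b = rng.normal(size=2) * 0.1, 0.0
lr = 0.5

for _ in range(2000):
    z = x @ w + b
    p = 1.0 / (1.0 + np.exp(-z))          # forward pass: predicted output
    loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))  # cross-entropy
    grad_z = (p - y) / len(y)             # backward pass: dLoss/dz
    w -= lr * (x.T @ grad_z)              # gradient descent on weights
    b -= lr * grad_z.sum()                # ... and on the bias

preds = (p > 0.5).astype(int)
```

Deeper networks differ only in having more layers between input and output; the loss-gradient-update cycle is the same.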
Affiliation(s)
- James Feghali
  - Department of Neurosurgery, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Adrian E Jimenez
  - Department of Neurosurgery, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Andrew T Schilling
  - Department of Neurosurgery, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Tej D Azad
  - Department of Neurosurgery, Johns Hopkins University School of Medicine, Baltimore, MD, USA
32
Kulkarni V, Gawali M, Kharat A. Key Technology Considerations in Developing and Deploying Machine Learning Models in Clinical Radiology Practice. JMIR Med Inform 2021; 9:e28776. [PMID: 34499049 PMCID: PMC8461525 DOI: 10.2196/28776] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2021] [Revised: 06/29/2021] [Accepted: 07/10/2021] [Indexed: 12/29/2022] Open
Abstract
The use of machine learning to develop intelligent software tools for the interpretation of radiology images has gained widespread attention in recent years. The development, deployment, and eventual adoption of these models in clinical practice, however, remains fraught with challenges. In this paper, we propose a list of key considerations that machine learning researchers must recognize and address to make their models accurate, robust, and usable in practice. We discuss insufficient training data, decentralized data sets, high cost of annotations, ambiguous ground truth, imbalance in class representation, asymmetric misclassification costs, relevant performance metrics, generalization of models to unseen data sets, model decay, adversarial attacks, explainability, fairness and bias, and clinical validation. We describe each consideration and identify the techniques used to address it. Although these techniques have been discussed in prior research, by freshly examining them in the context of medical imaging and compiling them in the form of a laundry list, we hope to make them more accessible to researchers, software developers, radiologists, and other stakeholders.
Affiliation(s)
- Amit Kharat
  - DeepTek Inc, Pune, India
  - D Y Patil University, Pune, India
33
Olthof AW, van Ooijen PMA, Cornelissen LJ. Deep Learning-Based Natural Language Processing in Radiology: The Impact of Report Complexity, Disease Prevalence, Dataset Size, and Algorithm Type on Model Performance. J Med Syst 2021; 45:91. [PMID: 34480231 PMCID: PMC8416876 DOI: 10.1007/s10916-021-01761-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2021] [Accepted: 08/04/2021] [Indexed: 12/12/2022]
Abstract
In radiology, natural language processing (NLP) allows the extraction of valuable information from radiology reports. It can be used for various downstream tasks such as quality improvement, epidemiological research, and monitoring guideline adherence. Class imbalance, variation in dataset size, variation in report complexity, and algorithm type all influence NLP performance but have not yet been systematically and interrelatedly evaluated. In this study, we investigate these factors on the performance of four types [a fully connected neural network (Dense), a long short-term memory recurrent neural network (LSTM), a convolutional neural network (CNN), and a Bidirectional Encoder Representations from Transformers (BERT)] of deep learning-based NLP. Two datasets consisting of radiologist-annotated reports of both trauma radiographs (n = 2469) and chest radiographs and computed tomography (CT) studies (n = 2255) were split into training sets (80%) and testing sets (20%). The training data was used as a source to train all four model types in 84 experiments (Fracture-data) and 45 experiments (Chest-data) with variation in size and prevalence. The performance was evaluated on sensitivity, specificity, positive predictive value, negative predictive value, area under the curve, and F score. After the NLP of radiology reports, all four model architectures demonstrated high performance with metrics up to > 0.90. CNN, LSTM, and Dense were outperformed by the BERT algorithm because of its stable results despite variation in training size and prevalence. Awareness of variation in prevalence is warranted because it impacts sensitivity and specificity in opposite directions.
Affiliation(s)
- A W Olthof
  - Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Hanzeplein 1, Groningen, The Netherlands
  - Treant Health Care Group, Department of Radiology, Dr G.H. Amshoffweg 1, Hoogeveen, The Netherlands
  - Hospital Group Twente (ZGT), Department of Radiology, Almelo, The Netherlands
- P M A van Ooijen
  - Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Hanzeplein 1, Groningen, The Netherlands
  - Data Science Center in Health (DASH), University of Groningen, University Medical Center Groningen, Machine Learning Lab, L.J. Zielstraweg 2, Groningen, The Netherlands
- L J Cornelissen
  - Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Hanzeplein 1, Groningen, The Netherlands
  - COSMONiO Imaging BV, L.J. Zielstraweg 2, Groningen, The Netherlands
34
Cheng PM, Montagnon E, Yamashita R, Pan I, Cadrin-Chênevert A, Perdigón Romero F, Chartrand G, Kadoury S, Tang A. Deep Learning: An Update for Radiologists. Radiographics 2021; 41:1427-1445. [PMID: 34469211 DOI: 10.1148/rg.2021200210] [Citation(s) in RCA: 57] [Impact Index Per Article: 19.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022]
Abstract
Deep learning is a class of machine learning methods that has been successful in computer vision. Unlike traditional machine learning methods that require hand-engineered feature extraction from input images, deep learning methods learn the image features by which to classify data. Convolutional neural networks (CNNs), the core of deep learning methods for imaging, are multilayered artificial neural networks with weighted connections between neurons that are iteratively adjusted through repeated exposure to training data. These networks have numerous applications in radiology, particularly in image classification, object detection, semantic segmentation, and instance segmentation. The authors provide an update on a recent primer on deep learning for radiologists, and they review terminology, data requirements, and recent trends in the design of CNNs; illustrate building blocks and architectures adapted to computer vision tasks, including generative architectures; and discuss training and validation, performance metrics, visualization, and future directions. Familiarity with the key concepts described will help radiologists understand advances of deep learning in medical imaging and facilitate clinical adoption of these techniques. Online supplemental material is available for this article. ©RSNA, 2021.
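The kernel convolution at the heart of a CNN can be sketched in a few lines. This is a toy illustration (no padding, stride, or learned weights), not code from the article; the edge-detecting kernel and tiny image are invented:

```python
# Core operation of a CNN layer: a small kernel of weights slides over the
# image, producing a feature map; training iteratively adjusts the weights.

def conv2d(image, kernel):
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

# A hand-set vertical-edge kernel responding to left-to-right intensity change:
edge = [[1, -1],
        [1, -1]]
img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 0, 1, 1]]
fmap = conv2d(img, edge)  # strongest response where the dark/bright edge sits
```

In a trained network the kernel values are learned from data rather than hand-set, and many such feature maps are stacked layer upon layer.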
Affiliation(s)
- Phillip M Cheng
- From the Department of Radiology, Keck School of Medicine of the University of Southern California, Los Angeles, Calif (P.M.C.); Research Center (E.M., F.P.R., S.K., A.T.) and Department of Radiology (A.T.), Centre Hospitalier de l'Université de Montréal, 1058-2117 rue Saint-Denis, Montréal, QC, Canada H2X 3J4; Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, Calif (R.Y.); Warren Alpert Medical School, Brown University, Providence, RI (I.P.); Department of Medical Imaging, CISSS Lanaudière, Université Laval, Joliette, Québec, Canada (A.C.C., S.K.); École Polytechnique, Montréal, Québec, Canada (F.P.R.); and AFX Medical, Montréal, Québec, Canada (G.C.)
- Emmanuel Montagnon
- Rikiya Yamashita
- Ian Pan
- Alexandre Cadrin-Chênevert
- Francisco Perdigón Romero
- Gabriel Chartrand
- Samuel Kadoury
- An Tang
35
Mozayan A, Fabbri AR, Maneevese M, Tocino I, Chheang S. Practical Guide to Natural Language Processing for Radiology. Radiographics 2021; 41:1446-1453. [PMID: 34469212] [DOI: 10.1148/rg.2021200113]
Abstract
Natural language processing (NLP) is the subset of artificial intelligence focused on the computer interpretation of human language. It is an invaluable tool in the analysis, aggregation, and simplification of free text. It has already demonstrated significant potential in the analysis of radiology reports. There are abundant open-source libraries and tools available that facilitate its application to the benefit of radiology. Radiologists who understand its limitations and potential will be better positioned to evaluate NLP models, understand how they can improve clinical workflow, and facilitate research endeavors involving large amounts of human language. The advent of increasingly affordable and powerful computer processing, the large quantities of medical and radiologic data, and advances in machine learning algorithms have contributed to the large potential of NLP. In turn, radiology has significant potential to benefit from the ability of NLP to convert relatively standardized radiology reports to machine-readable data. NLP benefits from standardized reporting, but because of its ability to interpret free text by using context clues, NLP does not necessarily depend on it. An overview and practical approach to NLP is featured, with specific emphasis on its applications to radiology. A brief history of NLP, the strengths and challenges inherent to its use, and freely available resources and tools are covered to guide further exploration and study within the field. Particular attention is devoted to the recent development of the Word2Vec and BERT (Bidirectional Encoder Representations from Transformers) language models, which have exponentially increased the power and utility of NLP for a variety of applications. Online supplemental material is available for this article. ©RSNA, 2021.
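The language models mentioned above (Word2Vec, BERT) represent words as dense vectors whose closeness is typically measured by cosine similarity. A minimal sketch with invented 3-dimensional vectors (real embeddings have hundreds of dimensions and are learned from text):

```python
import math

# Cosine similarity: dot product of two vectors divided by their norms.
def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Toy embeddings: related radiology terms get similar (invented) vectors.
effusion = [0.9, 0.1, 0.2]
fluid    = [0.8, 0.2, 0.3]
normal   = [0.1, 0.9, 0.1]

sim_related   = cosine(effusion, fluid)
sim_unrelated = cosine(effusion, normal)
```

The point of learned embeddings is that this geometric closeness tracks semantic relatedness, which is what lets NLP models exploit context clues in free text.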
Affiliation(s)
- Ali Mozayan
- From the Department of Radiology and Biomedical Imaging, Yale School of Medicine, PO Box 208042, Tompkins East 2, New Haven, CT 06520 (A.M., M.M., I.T., S.C.); and Department of Computer Science, Yale University, New Haven, Conn (A.R.F.)
- Alexander R Fabbri
- Michelle Maneevese
- Irena Tocino
- Sophie Chheang
36
Mitsopoulos K, Somers S, Schooler J, Lebiere C, Pirolli P, Thomson R. Toward a Psychology of Deep Reinforcement Learning Agents Using a Cognitive Architecture. Top Cogn Sci 2021; 14:756-779. [PMID: 34467649] [DOI: 10.1111/tops.12573]
Abstract
We argue that cognitive models can provide a common ground between human users and deep reinforcement learning (Deep RL) algorithms for purposes of explainable artificial intelligence (AI). Casting both the human and learner as cognitive models provides common mechanisms to compare and understand their underlying decision-making processes. This common grounding allows us to identify divergences and explain the learner's behavior in human understandable terms. We present novel salience techniques that highlight the most relevant features in each model's decision-making, as well as examples of this technique in common training environments such as StarCraft II and an OpenAI gridworld.
Affiliation(s)
- Joel Schooler
- Institute for Human and Machine Cognition, Pensacola
- Peter Pirolli
- Institute for Human and Machine Cognition, Pensacola
- Robert Thomson
- Psychology Department, Carnegie Mellon University; Army Cyber Institute, United States Military Academy
37
Juluru K, Shih HH, Keshava Murthy KN, Elnajjar P. Bag-of-Words Technique in Natural Language Processing: A Primer for Radiologists. Radiographics 2021; 41:1420-1426. [PMID: 34388050] [DOI: 10.1148/rg.2021210025]
Abstract
Natural language processing (NLP) is a methodology designed to extract concepts and meaning from human-generated unstructured (free-form) text. It is intended to be implemented by using computer algorithms so that it can be run on a corpus of documents quickly and reliably. To enable machine learning (ML) techniques in NLP, free-form text must be converted to a numerical representation. After several stages of preprocessing including tokenization, removal of stop words, token normalization, and creation of a master dictionary, the bag-of-words (BOW) technique can be used to represent each remaining word as a feature of the document. The preprocessing steps simplify the documents but also potentially degrade meaning. The values of the features in BOW can be modified by using techniques such as term count, term frequency, and term frequency-inverse document frequency. Experience and experimentation will guide decisions on which specific techniques will optimize ML performance. These and other NLP techniques are being applied in radiology. Radiologists' understanding of the strengths and limitations of these techniques will help in communication with data scientists and in implementation for specific tasks. Online supplemental material is available for this article. ©RSNA, 2021.
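The preprocessing and weighting steps described in this abstract (tokenization, stop-word removal, dictionary creation, then TF-IDF weighting of the bag-of-words features) can be sketched in pure Python. The stop-word list and example reports are invented for illustration:

```python
import math
import re

# A tiny invented stop-word list; real lists are much longer.
STOP = {"the", "is", "a", "of", "no", "in"}

def tokenize(text):
    # Lowercase, split into alphabetic tokens, drop stop words.
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP]

def tfidf(docs):
    tokenized = [tokenize(d) for d in docs]
    # Master dictionary: every remaining word becomes a feature.
    vocab = sorted(set(t for toks in tokenized for t in toks))
    n = len(docs)
    df = {w: sum(1 for toks in tokenized if w in toks) for w in vocab}
    vectors = []
    for toks in tokenized:
        tf = {w: toks.count(w) / len(toks) for w in vocab}  # term frequency
        # TF-IDF: frequent-in-document but rare-across-corpus terms score high.
        vectors.append([tf[w] * math.log(n / df[w]) for w in vocab])
    return vocab, vectors

reports = ["No acute fracture.", "Acute fracture of the distal radius."]
vocab, vecs = tfidf(reports)
```

In this two-document toy corpus, terms shared by every document ("acute", "fracture") get an inverse document frequency of log(1) = 0 and thus zero weight, which illustrates both the discriminative power of TF-IDF and how preprocessing can degrade meaning (here, the negation "no" was discarded as a stop word).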
Affiliation(s)
- Krishna Juluru
- From the Department of Radiology, Memorial Sloan Kettering Cancer Center, 1275 York Ave, Box 29, New York, NY 10065
- Hao-Hsin Shih
- Krishna Nand Keshava Murthy
- Pierre Elnajjar
38
Wood DA, Kafiabadi S, Al Busaidi A, Guilhem EL, Lynch J, Townend MK, Montvila A, Kiik M, Siddiqui J, Gadapa N, Benger MD, Mazumder A, Barker G, Ourselin S, Cole JH, Booth TC. Deep learning to automate the labelling of head MRI datasets for computer vision applications. Eur Radiol 2021; 32:725-736. [PMID: 34286375] [PMCID: PMC8660736] [DOI: 10.1007/s00330-021-08132-0]
Abstract
Objectives The purpose of this study was to build a deep learning model to derive labels from neuroradiology reports and assign these to the corresponding examinations, overcoming a bottleneck to computer vision model development. Methods Reference-standard labels were generated by a team of neuroradiologists for model training and evaluation. Three thousand examinations were labelled for the presence or absence of any abnormality by manually scrutinising the corresponding radiology reports (‘reference-standard report labels’); a subset of these examinations (n = 250) were assigned ‘reference-standard image labels’ by interrogating the actual images. Separately, 2000 reports were labelled for the presence or absence of 7 specialised categories of abnormality (acute stroke, mass, atrophy, vascular abnormality, small vessel disease, white matter inflammation, encephalomalacia), with a subset of these examinations (n = 700) also assigned reference-standard image labels. A deep learning model was trained using labelled reports and validated in two ways: comparing predicted labels to (i) reference-standard report labels and (ii) reference-standard image labels. The area under the receiver operating characteristic curve (AUC-ROC) was used to quantify model performance. Accuracy, sensitivity, specificity, and F1 score were also calculated. Results Accurate classification (AUC-ROC > 0.95) was achieved for all categories when tested against reference-standard report labels. A drop in performance (ΔAUC-ROC > 0.02) was seen for three categories (atrophy, encephalomalacia, vascular) when tested against reference-standard image labels, highlighting discrepancies in the original reports. Once trained, the model assigned labels to 121,556 examinations in under 30 min. Conclusions Our model accurately classifies head MRI examinations, enabling automated dataset labelling for downstream computer vision applications. 
Key Points • Deep learning is poised to revolutionise image recognition tasks in radiology; however, a barrier to clinical adoption is the difficulty of obtaining large labelled datasets for model training. • We demonstrate a deep learning model which can derive labels from neuroradiology reports and assign these to the corresponding examinations at scale, facilitating the development of downstream computer vision models. • We rigorously tested our model by comparing labels predicted on the basis of neuroradiology reports with two sets of reference-standard labels: (1) labels derived by manually scrutinising each radiology report and (2) labels derived by interrogating the actual images. Supplementary Information The online version contains supplementary material available at 10.1007/s00330-021-08132-0.
Affiliation(s)
- David A Wood
- School of Biomedical Engineering & Imaging Sciences, Kings College London, Rayne Institute, 4th Floor, Lambeth Wing, London, SE1 7EH, UK
- Sina Kafiabadi
- Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK
- Aisha Al Busaidi
- Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK
- Emily L Guilhem
- Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK
- Jeremy Lynch
- Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK
- Antanas Montvila
- Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK; Hospital of Lithuanian University of Health Sciences, Kaunas Clinics, Kaunas, Lithuania
- Martin Kiik
- School of Biomedical Engineering & Imaging Sciences, Kings College London, Rayne Institute, 4th Floor, Lambeth Wing, London, SE1 7EH, UK
- Juveria Siddiqui
- Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK
- Naveen Gadapa
- Department of Neurology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK
- Matthew D Benger
- Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK
- Asif Mazumder
- Guy's and St Thomas' NHS Foundation Trust, Westminster Bridge Road, London, SE1 7EH, UK
- Gareth Barker
- Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, SE5 8AF, UK
- Sebastian Ourselin
- School of Biomedical Engineering & Imaging Sciences, Kings College London, Rayne Institute, 4th Floor, Lambeth Wing, London, SE1 7EH, UK
- James H Cole
- Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, SE5 8AF, UK; Centre for Medical Image Computing, Department of Computer Science, University College London, London, WC1V 6LJ, UK; Dementia Research Centre, University College London, London, WC1N 3BG, UK
- Thomas C Booth
- School of Biomedical Engineering & Imaging Sciences, Kings College London, Rayne Institute, 4th Floor, Lambeth Wing, London, SE1 7EH, UK; Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK
39
Automatic Prediction of Recurrence of Major Cardiovascular Events: A Text Mining Study Using Chest X-Ray Reports. Journal of Healthcare Engineering 2021; 2021:6663884. [PMID: 34306597] [PMCID: PMC8285182] [DOI: 10.1155/2021/6663884]
Abstract
Methods We used EHR data of patients included in the Second Manifestations of ARTerial disease (SMART) study. We propose a deep learning-based multimodal architecture for our text mining pipeline that integrates neural text representation with preprocessed clinical predictors for the prediction of recurrence of major cardiovascular events in cardiovascular patients. Text preprocessing, including cleaning and stemming, was first applied to filter out the unwanted texts from X-ray radiology reports. Thereafter, text representation methods were used to numerically represent unstructured radiology reports with vectors. Subsequently, these text representation methods were added to prediction models to assess their clinical relevance. In this step, we applied logistic regression, support vector machine (SVM), multilayer perceptron neural network, convolutional neural network, long short-term memory (LSTM), and bidirectional LSTM deep neural network (BiLSTM). Results We performed various experiments to evaluate the added value of the text in the prediction of major cardiovascular events. The two main scenarios were the integration of radiology reports (1) with classical clinical predictors and (2) with only age and sex in the case of unavailable clinical predictors. In total, data of 5603 patients were used with 5-fold cross-validation to train the models. In the first scenario, the multimodal BiLSTM (MI-BiLSTM) model achieved an area under the curve (AUC) of 84.7%, misclassification rate of 14.3%, and F1 score of 83.8%. In this scenario, the SVM model, trained on clinical variables and bag-of-words representation, achieved the lowest misclassification rate of 12.2%. In the case of unavailable clinical predictors, the MI-BiLSTM model trained on radiology reports and demographic (age and sex) variables reached an AUC, F1 score, and misclassification rate of 74.5%, 70.8%, and 20.4%, respectively. 
Conclusions Using the case study of routine care chest X-ray radiology reports, we demonstrated the clinical relevance of integrating text features and classical predictors in our text mining pipeline for cardiovascular risk prediction. The MI-BiLSTM model with word embedding representation appeared to have a desirable performance when trained on text data integrated with the clinical variables from the SMART study. Our results mined from chest X-ray reports showed that models using text data in addition to laboratory values outperform those using only known clinical predictors.
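The multimodal integration step described in this abstract, a text representation combined with structured clinical predictors before classification, reduces at its simplest to vector concatenation. A hypothetical sketch; predictor names and values are invented for illustration and do not come from the SMART study:

```python
# Fuse a numeric text representation (e.g. an averaged word-embedding of a
# chest X-ray report) with preprocessed clinical predictors into one
# feature vector for a downstream classifier.

def fuse(text_vector, clinical):
    # Fixed predictor order so every patient yields the same feature layout.
    order = ["age", "sex", "systolic_bp", "ldl"]
    return list(text_vector) + [clinical[k] for k in order]

report_vec = [0.12, -0.40, 0.88]   # stands in for a learned text embedding
patient = {"age": 63, "sex": 1, "systolic_bp": 142, "ldl": 3.1}
features = fuse(report_vec, patient)   # 3 text + 4 clinical = 7 features
```

In the deep multimodal models of the study (e.g. MI-BiLSTM) the fusion happens inside the network rather than as a flat concatenation, but the principle, joining text-derived and tabular features into one input, is the same.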
40
Casey A, Davidson E, Poon M, Dong H, Duma D, Grivas A, Grover C, Suárez-Paniagua V, Tobin R, Whiteley W, Wu H, Alex B. A systematic review of natural language processing applied to radiology reports. BMC Med Inform Decis Mak 2021; 21:179. [PMID: 34082729] [PMCID: PMC8176715] [DOI: 10.1186/s12911-021-01533-7]
Abstract
BACKGROUND Natural language processing (NLP) has a significant role in advancing healthcare and has been found to be key in extracting structured information from radiology reports. Understanding recent developments in NLP application to radiology is of significance but recent reviews on this are limited. This study systematically assesses and quantifies recent literature in NLP applied to radiology reports. METHODS We conduct an automated literature search yielding 4836 results using automated filtering, metadata enriching steps and citation search combined with manual review. Our analysis is based on 21 variables including radiology characteristics, NLP methodology, performance, study, and clinical application characteristics. RESULTS We present a comprehensive analysis of the 164 publications retrieved with publications in 2019 almost triple those in 2015. Each publication is categorised into one of 6 clinical application categories. Deep learning use increases in the period but conventional machine learning approaches are still prevalent. Deep learning remains challenged when data is scarce and there is little evidence of adoption into clinical practice. Despite 17% of studies reporting greater than 0.85 F1 scores, it is hard to comparatively evaluate these approaches given that most of them use different datasets. Only 14 studies made their data and 15 their code available with 10 externally validating results. CONCLUSIONS Automated understanding of clinical narratives of the radiology reports has the potential to enhance the healthcare process and we show that research in this field continues to grow. Reproducibility and explainability of models are important if the domain is to move applications into clinical use. More could be done to share code enabling validation of methods on different institutional data and to reduce heterogeneity in reporting of study properties allowing inter-study comparisons. 
Our results have significance for researchers in the field providing a systematic synthesis of existing work to build on, identify gaps, opportunities for collaboration and avoid duplication.
Affiliation(s)
- Arlene Casey
- School of Literatures, Languages and Cultures (LLC), University of Edinburgh, Edinburgh, Scotland
- Emma Davidson
- Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, Scotland
- Michael Poon
- Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, Scotland
- Hang Dong
- Centre for Medical Informatics, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, Scotland; Health Data Research UK, London, UK
- Daniel Duma
- School of Literatures, Languages and Cultures (LLC), University of Edinburgh, Edinburgh, Scotland
- Andreas Grivas
- Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh, Edinburgh, Scotland
- Claire Grover
- Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh, Edinburgh, Scotland
- Víctor Suárez-Paniagua
- Centre for Medical Informatics, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, Scotland; Health Data Research UK, London, UK
- Richard Tobin
- Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh, Edinburgh, Scotland
- William Whiteley
- Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, Scotland; Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Honghan Wu
- Health Data Research UK, London, UK; Institute of Health Informatics, University College London, London, UK
- Beatrice Alex
- School of Literatures, Languages and Cultures (LLC), University of Edinburgh, Edinburgh, Scotland; Edinburgh Futures Institute, University of Edinburgh, Edinburgh, Scotland
41
Templated Text Synthesis for Expert-Guided Multi-Label Extraction from Radiology Reports. Machine Learning and Knowledge Extraction 2021. [DOI: 10.3390/make3020015]
Abstract
Training medical image analysis models traditionally requires large amounts of expertly annotated imaging data which is time-consuming and expensive to obtain. One solution is to automatically extract scan-level labels from radiology reports. Previously, we showed that, by extending BERT with a per-label attention mechanism, we can train a single model to perform automatic extraction of many labels in parallel. However, if we rely on pure data-driven learning, the model sometimes fails to learn critical features or learns the correct answer via simplistic heuristics (e.g., that “likely” indicates positivity), and thus fails to generalise to rarer cases which have not been learned or where the heuristics break down (e.g., “likely represents prominent VR space or lacunar infarct” which indicates uncertainty over two differential diagnoses). In this work, we propose template creation for data synthesis, which enables us to inject expert knowledge about unseen entities from medical ontologies, and to teach the model rules on how to label difficult cases, by producing relevant training examples. Using this technique alongside domain-specific pre-training for our underlying BERT architecture i.e., PubMedBERT, we improve F1 micro from 0.903 to 0.939 and F1 macro from 0.512 to 0.737 on an independent test set for 33 labels in head CT reports for stroke patients. Our methodology offers a practical way to combine domain knowledge with machine learning for text classification tasks.
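The template-filling idea described above can be sketched directly: slot ontology entities into expert-written templates whose attached labels encode the rule the model should learn. The templates, entities, and labels below are invented toy examples, not the authors' actual data:

```python
import itertools

# Each template pairs a sentence pattern with the label(s) it should teach,
# e.g. that a differential ("X or Y") indicates uncertainty.
templates = [
    ("likely represents {a} or {b}", {"uncertain"}),
    ("appearances are diagnostic of {a}", {"positive"}),
]

# Entities drawn (hypothetically) from a medical ontology, including ones
# that may be rare or absent in the real training reports.
ontology = ["prominent VR space", "lacunar infarct", "meningioma"]

def synthesise():
    examples = []
    for text, labels in templates:
        slots = text.count("{")
        for ents in itertools.permutations(ontology, slots):
            filled = text
            for name, ent in zip("ab", ents):
                filled = filled.replace("{" + name + "}", ent, 1)
            examples.append((filled, labels))
    return examples

data = synthesise()  # synthetic labelled sentences to mix into training
```

Mixing such synthetic sentences into the training set injects expert knowledge about unseen entities and hard cases without requiring more manually annotated reports.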
42
Senders JT, Cho LD, Calvachi P, McNulty JJ, Ashby JL, Schulte IS, Almekkawi AK, Mehrtash A, Gormley WB, Smith TR, Broekman MLD, Arnaout O. Automating Clinical Chart Review: An Open-Source Natural Language Processing Pipeline Developed on Free-Text Radiology Reports From Patients With Glioblastoma. JCO Clin Cancer Inform 2021; 4:25-34. [PMID: 31977252] [DOI: 10.1200/cci.19.00060]
Abstract
PURPOSE The aim of this study was to develop an open-source natural language processing (NLP) pipeline for text mining of medical information from clinical reports. We also aimed to provide insight into why certain variables or reports are more suitable for clinical text mining than others. MATERIALS AND METHODS Various NLP models were developed to extract 15 radiologic characteristics from free-text radiology reports for patients with glioblastoma. Ten-fold cross-validation was used to optimize the hyperparameter settings and estimate model performance. We examined how model performance was associated with quantitative attributes of the radiologic characteristics and reports. RESULTS In total, 562 unique brain magnetic resonance imaging reports were retrieved. NLP extracted 15 radiologic characteristics with high to excellent discrimination (area under the curve, 0.82 to 0.98) and accuracy (78.6% to 96.6%). Model performance was correlated with the inter-rater agreement of the manually provided labels (ρ = 0.904; P < .001) but not with the frequency distribution of the variables of interest (ρ = 0.179; P = .52). All variables labeled with a near perfect inter-rater agreement were classified with excellent performance (area under the curve > 0.95). Excellent performance could be achieved for variables with only 50 to 100 observations in the minority group and class imbalances up to a 9:1 ratio. Report-level classification accuracy was not associated with the number of words or the vocabulary size in the distinct text documents. CONCLUSION This study provides an open-source NLP pipeline that allows for text mining of narratively written clinical reports. Small sample sizes and class imbalance should not be considered as absolute contraindications for text mining in clinical research. 
However, future studies should report measures of inter-rater agreement whenever ground truth is based on a consensus label and use this measure to identify clinical variables eligible for text mining.
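As an illustrative aside (not code from the study): the inter-rater agreement measure the authors recommend reporting is commonly quantified with Cohen's kappa. A minimal stdlib sketch, with hypothetical rater labels:

```python
# Illustrative sketch: Cohen's kappa, a chance-corrected measure of
# inter-rater agreement of the kind correlated with NLP performance above.
# The rater labels below are hypothetical.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Observed agreement corrected for the agreement expected by chance."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    expected = sum((freq_a[l] / n) * (freq_b[l] / n) for l in labels)
    return (observed - expected) / (1 - expected)

a = ["yes", "yes", "no", "yes", "no", "no", "yes", "no"]
b = ["yes", "no",  "no", "yes", "no", "yes", "yes", "no"]
print(round(cohens_kappa(a, b), 3))  # 0.5 for these hypothetical raters
```

A kappa near 1 indicates near-perfect agreement; the study found that variables labeled at that level were the ones classified with excellent NLP performance.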
Affiliation(s)
- Joeky T Senders
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Department of Neurosurgery, Leiden University Medical Center, Leiden, the Netherlands
- Logan D Cho
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Department of Neuroscience, Brown University, Providence, RI
- Paola Calvachi
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- John J McNulty
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Vagelos College of Physicians and Surgeons, Columbia University, New York, NY
- Joanna L Ashby
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Isabelle S Schulte
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Ahmad Kareem Almekkawi
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Alireza Mehrtash
- Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- William B Gormley
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Timothy R Smith
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Marike L D Broekman
- Department of Neurosurgery, Leiden University Medical Center, Leiden, the Netherlands; Department of Neurosurgery, Haaglanden Medical Center, The Hague, the Netherlands
- Omar Arnaout
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
43
Chan HP, Hadjiiski LM, Samala RK. Computer-aided diagnosis in the era of deep learning. Med Phys 2021; 47:e218-e227. [PMID: 32418340 DOI: 10.1002/mp.13764] [Citation(s) in RCA: 99] [Impact Index Per Article: 33.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2019] [Revised: 05/13/2019] [Accepted: 05/13/2019] [Indexed: 12/15/2022] Open
Abstract
Computer-aided diagnosis (CAD) has been a major field of research for the past few decades. CAD uses machine learning methods to analyze imaging and/or nonimaging patient data and makes an assessment of the patient's condition, which can then be used to assist clinicians in their decision-making process. The recent success of deep learning in machine learning has spurred new research and development efforts to improve CAD performance and to develop CAD for many other complex clinical tasks. In this paper, we discuss the potential and challenges in developing CAD tools using deep learning technology or artificial intelligence (AI) in general, the pitfalls and lessons learned from CAD in screening mammography, and considerations needed for future implementation of CAD or AI in clinical use. It is hoped that these past experiences and the deep learning technology will lead to successful advancement and lasting growth in this new era of CAD, thereby enabling CAD to deliver intelligent aids to improve health care.
Affiliation(s)
- Heang-Ping Chan
- Department of Radiology, University of Michigan, Ann Arbor, MI, 48109-5842, USA
- Lubomir M Hadjiiski
- Department of Radiology, University of Michigan, Ann Arbor, MI, 48109-5842, USA
- Ravi K Samala
- Department of Radiology, University of Michigan, Ann Arbor, MI, 48109-5842, USA
44
Goyal S. An Overview of Current Trends, Techniques, Prospects, and Pitfalls of Artificial Intelligence in Breast Imaging. Reports in Medical Imaging 2021. [DOI: 10.2147/rmi.s295205] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022] Open
45
Ahmad R. Reviewing the relationship between machines and radiology: the application of artificial intelligence. Acta Radiol Open 2021; 10:2058460121990296. [PMID: 33623711 PMCID: PMC7876935 DOI: 10.1177/2058460121990296] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/08/2020] [Accepted: 01/07/2021] [Indexed: 12/13/2022] Open
Abstract
Background The scope and productivity of artificial intelligence applications in health science and medicine, particularly in medical imaging, are rapidly progressing, with relatively recent developments in big data and deep learning and increasingly powerful computer algorithms. Accordingly, there are a number of opportunities and challenges for the radiological community. Purpose To provide a review of the challenges and barriers experienced in diagnostic radiology on the basis of the key clinical applications of machine learning techniques. Material and Methods Studies published in 2010–2019 that report on the efficacy of machine learning models were selected. A single contingency table was selected for each study to report the highest accuracy of radiology professionals and machine learning algorithms, and a meta-analysis of studies was conducted based on contingency tables. Results The specificity for all the deep learning models ranged from 39% to 100%, whereas sensitivity ranged from 85% to 100%. The pooled sensitivity and specificity were 89% and 85% for the deep learning algorithms for detecting abnormalities, compared to 75% and 91% for radiology experts, respectively. The pooled specificity and sensitivity for comparison between radiology professionals and deep learning algorithms were 91% and 81% for deep learning models and 85% and 73% for radiology professionals (p < 0.000), respectively. The pooled sensitivity detection was 82% for health-care professionals and 83% for deep learning algorithms (p < 0.005). Conclusion Radiomic information extracted through machine learning programs from images may not be discernible through visual examination, and may thus improve the prognostic and diagnostic value of data sets.
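For readers unfamiliar with the pooled estimates above, the sketch below shows how sensitivity and specificity are derived from 2x2 contingency tables and combined across studies. The tables are hypothetical, and simple sample-size weighting stands in for the study's actual meta-analytic method:

```python
# Illustrative sketch (hypothetical data, not the review's): sensitivity and
# specificity from a (TP, FN, TN, FP) contingency table, pooled across
# studies by weighting each study's estimate by its number of subjects.

def sens_spec(tp, fn, tn, fp):
    return tp / (tp + fn), tn / (tn + fp)

def pooled(tables):
    """Sample-size-weighted average of per-study sensitivity/specificity."""
    total = sum(sum(t) for t in tables)
    sens = sum(sens_spec(*t)[0] * sum(t) for t in tables) / total
    spec = sum(sens_spec(*t)[1] * sum(t) for t in tables) / total
    return sens, spec

tables = [(45, 5, 40, 10), (90, 10, 85, 15)]  # hypothetical (TP, FN, TN, FP)
s, sp = pooled(tables)
print(round(s, 3), round(sp, 3))  # 0.9 0.833 for these made-up tables
```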
Affiliation(s)
- Rani Ahmad
- King Abdulaziz University, King Abdulaziz University Hospital, Jeddah, Saudi Arabia
46
Qayyum A, Qadir J, Bilal M, Al-Fuqaha A. Secure and Robust Machine Learning for Healthcare: A Survey. IEEE Rev Biomed Eng 2021; 14:156-180. [PMID: 32746371 DOI: 10.1109/rbme.2020.3013489] [Citation(s) in RCA: 81] [Impact Index Per Article: 27.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
Recent years have witnessed widespread adoption of machine learning (ML)/deep learning (DL) techniques due to their superior performance for a variety of healthcare applications ranging from the prediction of cardiac arrest from one-dimensional heart signals to computer-aided diagnosis (CADx) using multi-dimensional medical images. Notwithstanding the impressive performance of ML/DL, there are still lingering doubts regarding the robustness of ML/DL in healthcare settings (which is traditionally considered quite challenging due to the myriad security and privacy issues involved), especially in light of recent results that have shown that ML/DL are vulnerable to adversarial attacks. In this paper, we present an overview of various application areas in healthcare that leverage such techniques from security and privacy point of view and present associated challenges. In addition, we present potential methods to ensure secure and privacy-preserving ML for healthcare applications. Finally, we provide insight into the current research challenges and promising directions for future research.
47
Sun L, Zhu W, Chen X, Jiang J, Ji Y, Liu N, Xu Y, Zhuang Y, Sun Z, Wang Q, Zhang F. Machine Learning to Predict Contrast-Induced Acute Kidney Injury in Patients With Acute Myocardial Infarction. Front Med (Lausanne) 2020; 7:592007. [PMID: 33282893 PMCID: PMC7691423 DOI: 10.3389/fmed.2020.592007] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2020] [Accepted: 10/27/2020] [Indexed: 11/30/2022] Open
Abstract
Objective: To develop predictive models for contrast-induced acute kidney injury (CI-AKI) among acute myocardial infarction (AMI) patients treated invasively. Methods: Patients with AMI who underwent angiography were enrolled and randomly divided into a training cohort (75%) and a validation cohort (25%). Machine learning algorithms were used to construct predictive models for CI-AKI, which were then tested in the validation cohort. Results: A total of 1,495 patients with AMI were included, of whom 226 (15.1%) developed CI-AKI. In the validation cohort, the Random Forest (RF) model with the top 15 variables reached an area under the curve (AUC) of 0.82 (95% CI: 0.76–0.87), while the best logistic model had an AUC of 0.69 (95% CI: 0.62–0.76) and the ACEF (age, creatinine, and ejection fraction) model an AUC of 0.62 (95% CI: 0.53–0.71). The RF model with the top 15 variables achieved a high recall rate of 71.9% and an accuracy of 73.5% in the validation group, and significantly outperformed logistic regression in every comparison. Conclusions: Machine learning algorithms, especially the Random Forest algorithm, improve the accuracy of risk stratification of patients with AMI and should be used to accurately identify the risk of CI-AKI in AMI patients.
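The AUC values reported above can be computed directly from model scores as the probability that a randomly chosen case outranks a randomly chosen non-case (the Mann-Whitney formulation). A minimal sketch with made-up scores, not the study's data or model:

```python
# Illustrative sketch: AUC via the Mann-Whitney formulation, i.e. the
# fraction of (positive, negative) score pairs the model ranks correctly,
# counting ties as half. Scores below are hypothetical.

def auc(pos_scores, neg_scores):
    wins = sum((p > n) + 0.5 * (p == n)
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

pos = [0.9, 0.8, 0.7, 0.6]  # hypothetical scores for patients with CI-AKI
neg = [0.5, 0.4, 0.8, 0.3]  # hypothetical scores for patients without
print(auc(pos, neg))  # 0.84375
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why the RF model's 0.82 represents a meaningful gain over the logistic model's 0.69.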
Affiliation(s)
- Ling Sun
- Department of Cardiology, The Affiliated Changzhou No. 2 People's Hospital of Nanjing Medical University, Changzhou, China
- Wenwu Zhu
- Section of Pacing and Electrophysiology, Division of Cardiology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Xin Chen
- Department of Cardiology, The Affiliated Changzhou No. 2 People's Hospital of Nanjing Medical University, Changzhou, China
- Jianguang Jiang
- Department of Cardiology, The Affiliated Changzhou No. 2 People's Hospital of Nanjing Medical University, Changzhou, China
- Yuan Ji
- Department of Cardiology, The Affiliated Changzhou No. 2 People's Hospital of Nanjing Medical University, Changzhou, China
- Nan Liu
- Department of DSA, The Affiliated Changzhou No. 2 People's Hospital of Nanjing Medical University, Changzhou, China
- Yajing Xu
- Department of Cardiology, The Affiliated Changzhou No. 2 People's Hospital of Nanjing Medical University, Changzhou, China
- Yi Zhuang
- Department of Cardiology, The Affiliated Changzhou No. 2 People's Hospital of Nanjing Medical University, Changzhou, China
- Zhiqin Sun
- School of Clinical Medicine, The Affiliated Changzhou No. 2 People's Hospital of Nanjing Medical University, Changzhou, China
- Qingjie Wang
- Department of Cardiology, The Affiliated Changzhou No. 2 People's Hospital of Nanjing Medical University, Changzhou, China
- Fengxiang Zhang
- Section of Pacing and Electrophysiology, Division of Cardiology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
48
Spasic I, Button K. Patient Triage by Topic Modeling of Referral Letters: Feasibility Study. JMIR Med Inform 2020; 8:e21252. [PMID: 33155985 PMCID: PMC7679210 DOI: 10.2196/21252] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2020] [Revised: 09/17/2020] [Accepted: 10/05/2020] [Indexed: 01/22/2023] Open
Abstract
Background Musculoskeletal conditions are managed within primary care, but patients can be referred to secondary care if a specialist opinion is required. The ever-increasing demand for health care resources emphasizes the need to streamline care pathways with the ultimate aim of ensuring that patients receive timely and optimal care. Information contained in referral letters underpins the referral decision-making process but is yet to be explored systematically for the purposes of treatment prioritization for musculoskeletal conditions. Objective This study aims to explore the feasibility of using natural language processing and machine learning to automate the triage of patients with musculoskeletal conditions by analyzing information from referral letters. Specifically, we aim to determine whether referral letters can be automatically assorted into latent topics that are clinically relevant, that is, considered relevant when prescribing treatments. Here, clinical relevance is assessed by posing 2 research questions. Can latent topics be used to automatically predict treatment? Can clinicians interpret latent topics as cohorts of patients who share common characteristics or experiences such as medical history, demographics, and possible treatments? Methods We used latent Dirichlet allocation to model each referral letter as a finite mixture over an underlying set of topics and model each topic as an infinite mixture over an underlying set of topic probabilities. The topic model was evaluated in the context of automating patient triage. Given a set of treatment outcomes, a binary classifier was trained for each outcome using previously extracted topics as the input features of the machine learning algorithm. In addition, a qualitative evaluation was performed to assess the human interpretability of topics. 
Results The prediction accuracy of the binary classifiers exceeded that of a stratified random classifier by a large margin, indicating that topic modeling can be used to predict treatment and thus effectively support patient triage. The qualitative evaluation confirmed the high clinical interpretability of the topic model. Conclusions The results established the feasibility of using natural language processing and machine learning to automate the triage of patients with knee or hip pain by analyzing information from their referral letters.
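The stratified random baseline used above predicts each class with its empirical training prevalence. For a binary outcome with prevalence p, its expected accuracy is p² + (1−p)², a useful floor when judging whether a topic-based classifier adds real signal. A short sketch (illustrative, not the study's code):

```python
# Illustrative sketch: expected accuracy of a stratified random classifier
# for a binary outcome with positive-class prevalence p. It guesses
# "positive" with probability p, so it is correct with probability
# p*p + (1-p)*(1-p).

def stratified_baseline_accuracy(p):
    return p * p + (1 - p) * (1 - p)

for p in (0.5, 0.7, 0.9):
    print(p, stratified_baseline_accuracy(p))
```

Note that the baseline rises with class imbalance (0.82 at p = 0.9), so a classifier on an imbalanced treatment outcome must clear a higher accuracy bar to demonstrate genuine predictive value.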
Affiliation(s)
- Irena Spasic
- School of Computer Science & Informatics, Cardiff University, Cardiff, United Kingdom
- Kate Button
- School of Healthcare Sciences, Cardiff University, Cardiff, United Kingdom
49
Draelos RL, Dov D, Mazurowski MA, Lo JY, Henao R, Rubin GD, Carin L. Machine-learning-based multiple abnormality prediction with large-scale chest computed tomography volumes. Med Image Anal 2020; 67:101857. [PMID: 33129142 DOI: 10.1016/j.media.2020.101857] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/17/2020] [Revised: 09/15/2020] [Accepted: 09/18/2020] [Indexed: 12/11/2022]
Abstract
Machine learning models for radiology benefit from large-scale data sets with high quality labels for abnormalities. We curated and analyzed a chest computed tomography (CT) data set of 36,316 volumes from 19,993 unique patients. This is the largest multiply-annotated volumetric medical imaging data set reported. To annotate this data set, we developed a rule-based method for automatically extracting abnormality labels from free-text radiology reports with an average F-score of 0.976 (min 0.941, max 1.0). We also developed a model for multi-organ, multi-disease classification of chest CT volumes that uses a deep convolutional neural network (CNN). This model reached a classification performance of AUROC >0.90 for 18 abnormalities, with an average AUROC of 0.773 for all 83 abnormalities, demonstrating the feasibility of learning from unfiltered whole volume CT data. We show that training on more labels improves performance significantly: for a subset of 9 labels - nodule, opacity, atelectasis, pleural effusion, consolidation, mass, pericardial effusion, cardiomegaly, and pneumothorax - the model's average AUROC increased by 10% when the number of training labels was increased from 9 to all 83. All code for volume preprocessing, automated label extraction, and the volume abnormality prediction model is publicly available. The 36,316 CT volumes and labels will also be made publicly available pending institutional approval.
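A toy sketch of rule-based label extraction in the spirit of the pipeline described above. The keyword and negation lists are hypothetical and far simpler than the study's method; the idea is to assert an abnormality label only when its keyword appears in a sentence without a preceding negation cue:

```python
# Illustrative sketch of negation-aware, rule-based label extraction from
# free-text radiology reports. Keyword and negation patterns are hypothetical.
import re

NEGATIONS = re.compile(r"\b(no|without|negative for|absent)\b", re.IGNORECASE)
KEYWORDS = {
    "nodule": re.compile(r"\bnodules?\b", re.IGNORECASE),
    "pleural_effusion": re.compile(r"\bpleural effusions?\b", re.IGNORECASE),
}

def extract_labels(report):
    labels = set()
    for sentence in re.split(r"[.;]", report):
        for label, pattern in KEYWORDS.items():
            m = pattern.search(sentence)
            if not m:
                continue
            # Negated if a negation cue appears earlier in the same sentence.
            if not NEGATIONS.search(sentence[:m.start()]):
                labels.add(label)
    return labels

report = "There is a 4 mm nodule in the left lobe. No pleural effusion."
print(sorted(extract_labels(report)))  # ['nodule']
```

Against manually labeled reports, such a rule set can be scored with precision, recall, and F-score per label, which is how the study reports its average F-score of 0.976.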
Affiliation(s)
- Rachel Lea Draelos
- Computer Science Department, Duke University, LSRC Building D101, 308 Research Drive, Duke Box 90129, Durham, North Carolina 27708-0129, United States of America; School of Medicine, Duke University, DUMC 3710, Durham, North Carolina 27710, United States of America
- David Dov
- Electrical and Computer Engineering Department, Edmund T. Pratt Jr. School of Engineering, Duke University, Box 90291, Durham, North Carolina 27708, United States of America
- Maciej A Mazurowski
- Electrical and Computer Engineering Department, Edmund T. Pratt Jr. School of Engineering, Duke University, Box 90291, Durham, North Carolina 27708, United States of America; Radiology Department, Duke University, Box 3808 DUMC, Durham, North Carolina 27710, United States of America; Biostatistics and Bioinformatics Department, Duke University, DUMC 2424 Erwin Road, Suite 1102 Hock Plaza, Box 2721, Durham, North Carolina 27710, United States of America
- Joseph Y Lo
- Electrical and Computer Engineering Department, Edmund T. Pratt Jr. School of Engineering, Duke University, Box 90291, Durham, North Carolina 27708, United States of America; Radiology Department, Duke University, Box 3808 DUMC, Durham, North Carolina 27710, United States of America; Biomedical Engineering Department, Edmund T. Pratt Jr. School of Engineering, Duke University, Room 1427, Fitzpatrick Center (FCIEMAS), 101 Science Drive, Campus Box 90281, Durham, North Carolina 27708-0281, United States of America
- Ricardo Henao
- Electrical and Computer Engineering Department, Edmund T. Pratt Jr. School of Engineering, Duke University, Box 90291, Durham, North Carolina 27708, United States of America; Biostatistics and Bioinformatics Department, Duke University, DUMC 2424 Erwin Road, Suite 1102 Hock Plaza, Box 2721, Durham, North Carolina 27710, United States of America
- Geoffrey D Rubin
- Radiology Department, Duke University, Box 3808 DUMC, Durham, North Carolina 27710, United States of America
- Lawrence Carin
- Computer Science Department, Duke University, LSRC Building D101, 308 Research Drive, Duke Box 90129, Durham, North Carolina 27708-0129, United States of America; Electrical and Computer Engineering Department, Edmund T. Pratt Jr. School of Engineering, Duke University, Box 90291, Durham, North Carolina 27708, United States of America; Statistical Science Department, Duke University, Box 90251, Durham, North Carolina 27708-0251, United States of America
50
Affiliation(s)
- David Z Wang
- Neurovascular Division, Department of Neurology, Barrow Neurological Institute, St Joseph Hospital and Medical Center, Phoenix, AZ, USA
- Lee H Schwamm
- Comprehensive Stroke Center, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Tianyi Qian
- Department of Public Health, School of Medicine, Tsinghua University, Beijing; Tencent Healthcare, Tencent AIMIS, Shenzhen, China
- Qionghai Dai
- Tencent Healthcare, Tencent AIMIS, Shenzhen, China