Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Senders JT, Karhade AV, Cote DJ, Mehrtash A, Lamba N, DiRisio A, Muskens IS, Gormley WB, Smith TR, Broekman MLD, Arnaout O. Natural Language Processing for Automated Quantification of Brain Metastases Reported in Free-Text Radiology Reports. JCO Clin Cancer Inform 2019;3:1-9. [PMID: 31002562 DOI: 10.1200/cci.18.00138] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 01/22/2023]

Cited by Other Article(s)

Hicks-Courant K, Ko EM, Matsuo K, Melamed A, Nasioudis D, Rauh-Hain JA, Uppal S, Wright JD, Ramirez PT. Secondary databases in gynecologic cancer research. Int J Gynecol Cancer 2024;34:1619-1629. [PMID: 39043573 DOI: 10.1136/ijgc-2024-005677] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/25/2024]

Hamdoune M, Jounaidi K, Ammari N, Gantare A. Digital health for cancer symptom management in palliative medicine: systematic review. BMJ Support Palliat Care 2024:spcare-2024-005107. [PMID: 39317426 DOI: 10.1136/spcare-2024-005107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2024] [Accepted: 09/08/2024] [Indexed: 09/26/2024]

Abstract

BACKGROUND

Digital health technologies (DHTs) play a crucial role in symptom management, particularly in palliative care, by providing patients with accessible tools to monitor and manage their symptoms effectively. The aim of this systematic review was to examine and synthesise the scientific literature on DHTs for symptom management in palliative oncology care.

METHODS

A systematic review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines from 2 June to 20 June 2024. Databases including Scopus, Web of Science, ScienceDirect, PubMed and the Cochrane Library were searched. Data were extracted using a standardised form based on the PICOTT (Population, Intervention, Comparison, Outcome, Type and Technology) framework. The quality of the included studies was assessed using the Appraisal of Guidelines for Research & Evaluation (AGREE) II tool during the selection process.

RESULTS

The systematic review included seven articles describing six DHTs from five countries: the UK, Kenya, Tanzania, the Netherlands and the USA. The findings of this comprehensive literature review elucidate four principal themes: the specific types of DHTs used for symptom management in palliative cancer care, their roles and advantages, as well as the factors that limit or promote their adoption by patients and healthcare professionals.

CONCLUSION

The findings of this review give valuable insights into the ongoing discourse on integrating digital health solutions into palliative care practices, highlighting their potential role in enhancing symptom management within palliative cancer care and showcasing their possible benefits, while also identifying key factors influencing their adoption among patients and healthcare professionals.

Mapundu MT, Kabudula CW, Musenge E, Olago V, Celik T. Text mining of verbal autopsy narratives to extract mortality causes and most prevalent diseases using natural language processing. PLoS One 2024;19:e0308452. [PMID: 39298425 DOI: 10.1371/journal.pone.0308452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2023] [Accepted: 07/24/2024] [Indexed: 09/21/2024]

Abstract

Verbal autopsy (VA) narratives play a crucial role in understanding and documenting the causes of mortality, especially in regions lacking robust medical infrastructure. In this study, we propose a comprehensive approach to extract mortality causes and identify prevalent diseases from VA narratives utilizing advanced text mining techniques, so as to better understand the underlying health issues leading to mortality. Our methodology integrates n-gram-based language processing, Latent Dirichlet Allocation (LDA), and BERTopic, offering a multi-faceted analysis to enhance the accuracy and depth of information extraction. This is a retrospective study that uses secondary data analysis. We used data from the Agincourt Health and Demographic Surveillance Site (HDSS), which had 16338 observations collected between 1993 and 2015. Our text mining steps entailed data acquisition, pre-processing, feature extraction, topic segmentation, and discovered knowledge. The results suggest that the HDSS population may have died from mortality causes such as vomiting, chest/stomach pain, fever, coughing, loss of weight, low energy, headache. Additionally, we discovered that the most prevalent diseases entailed human immunodeficiency virus (HIV), tuberculosis (TB), diarrhoea, cancer, neurological disorders, malaria, diabetes, high blood pressure, chronic ailments (kidney, heart, lung, liver), maternal and accident related deaths. This study is relevant in that it avails valuable insights regarding mortality causes and most prevalent diseases using novel text mining approaches. These results can be integrated in the diagnosis pipeline for ease of human annotation and interpretation. As such, this will help with effective informed intervention programmes that can improve primary health care systems and chronic based delivery, thus increasing life expectancy.

Martín-Noguerol T, López-Úbeda P, Pons-Escoda A, Luna A. Natural language processing deep learning models for the differential between high-grade gliomas and metastasis: what if the key is how we report them? Eur Radiol 2024;34:2113-2120. [PMID: 37665389 DOI: 10.1007/s00330-023-10202-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2023] [Revised: 07/10/2023] [Accepted: 07/20/2023] [Indexed: 09/05/2023]

Abstract

OBJECTIVES

The differential between high-grade glioma (HGG) and metastasis remains challenging in common radiological practice. We compare different natural language processing (NLP)-based deep learning models to assist radiologists based on data contained in radiology reports.

METHODS

This retrospective study included 185 MRI reports between 2010 and 2022 from two different institutions. A total of 117 reports were used for the training and 21 were reserved for the validation set, while the rest were used as a test set. A comparison of the performance of different deep learning models for HGG and metastasis classification has been carried out. Specifically, Convolutional Neural Network (CNN), Bidirectional Long Short-Term Memory (BiLSTM), a hybrid version of BiLSTM and CNN, and a radiology-specific Bidirectional Encoder Representations from Transformers (RadBERT) model were used.

RESULTS

For the classification of MRI reports, the CNN network provided the best results among all tested, showing a macro-avg precision of 87.32%, a sensitivity of 87.45%, and an F1 score of 87.23%. In addition, our NLP algorithm detected keywords such as tumor, temporal, and lobe to positively classify a radiological report as HGG or metastasis group.

CONCLUSIONS

A deep learning model based on CNN enables radiologists to discriminate between HGG and metastasis based on MRI reports with high-precision values. This approach should be considered an additional tool in diagnosing these central nervous system lesions.

CLINICAL RELEVANCE STATEMENT

The use of our NLP model enables radiologists to differentiate between patients with high-grade glioma and metastasis based on their MRI reports and can be used as an additional tool to the conventional image-based approach for this challenging task.

KEY POINTS

• Differential between high-grade glioma and metastasis is still challenging in common radiological practice.
• Natural language processing (NLP)-based deep learning models can assist radiologists based on data contained in radiology reports.
• We have developed and tested a natural language processing model for discriminating between high-grade glioma and metastasis based on MRI reports that show high precision for this task.
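
The macro-averaged precision, sensitivity, and F1 score quoted in the Results above can be illustrated with a minimal sketch. It assumes macro-averaging in the usual sense (compute each metric per class, then take the unweighted mean); the abstract does not spell out its exact computation. The function name `macro_metrics` and the six toy labels are hypothetical and are not data from the study.

```python
from typing import List, Tuple


def macro_metrics(y_true: List[str], y_pred: List[str]) -> Tuple[float, float, float]:
    """Return macro-averaged (precision, recall, F1) over all observed classes."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0  # recall is the same quantity as sensitivity
        f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical toy labels (HGG = high-grade glioma, MET = metastasis).
y_true = ["HGG", "HGG", "HGG", "MET", "MET", "MET"]
y_pred = ["HGG", "HGG", "MET", "MET", "MET", "HGG"]
macro_p, macro_r, macro_f1 = macro_metrics(y_true, y_pred)
print(f"macro precision={macro_p:.4f} recall={macro_r:.4f} F1={macro_f1:.4f}")
```

Because every class carries equal weight in a macro average, a poorly classified minority class lowers the score as much as a poorly classified majority class would.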

Yang E, Li MD, Raghavan S, Deng F, Lang M, Succi MD, Huang AJ, Kalpathy-Cramer J. Transformer versus traditional natural language processing: how much data is enough for automated radiology report classification? Br J Radiol 2023;96:20220769. [PMID: 37162253 PMCID: PMC10461267 DOI: 10.1259/bjr.20220769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2022] [Revised: 04/21/2023] [Accepted: 04/26/2023] [Indexed: 05/11/2023]

Abstract

OBJECTIVES

Current state-of-the-art natural language processing (NLP) techniques use transformer deep-learning architectures, which depend on large training datasets. We hypothesized that traditional NLP techniques may outperform transformers for smaller radiology report datasets.

METHODS

We compared the performance of BioBERT, a deep-learning-based transformer model pre-trained on biomedical text, and three traditional machine-learning models (gradient boosted tree, random forest, and logistic regression) on seven classification tasks given free-text radiology reports. Tasks included detection of appendicitis, diverticulitis, bowel obstruction, and enteritis/colitis on abdomen/pelvis CT reports, ischemic infarct on brain CT/MRI reports, and medial and lateral meniscus tears on knee MRI reports (7,204 total annotated reports). The performance of NLP models on held-out test sets was compared after training using the full training set, and 2.5%, 10%, 25%, 50%, and 75% random subsets of the training data.

RESULTS

In all tested classification tasks, BioBERT performed poorly at smaller training sample sizes compared to non-deep-learning NLP models. Specifically, BioBERT required training on approximately 1,000 reports to perform similarly or better than non-deep-learning models. At around 1,250 to 1,500 training samples, the testing performance for all models began to plateau, where additional training data yielded minimal performance gain.

CONCLUSIONS

With larger sample sizes, transformer NLP models achieved superior performance in radiology report binary classification tasks. However, with smaller sizes (<1000) and more imbalanced training data, traditional NLP techniques performed better.

ADVANCES IN KNOWLEDGE

Our benchmarks can help guide clinical NLP researchers in selecting machine-learning models according to their dataset characteristics.
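
The subset-based comparison described in the Methods above can be sketched as follows. This only illustrates drawing random training subsets at the stated fractions; it is not the authors' code. The helper `training_subsets`, the fixed seed, and the choice to nest each smaller subset inside the larger ones are assumptions made here for illustration.

```python
import random
from typing import Dict, Iterable, List, Sequence

# Fractions quoted in the abstract, plus 1.0 for the full training set.
FRACTIONS = [0.025, 0.10, 0.25, 0.50, 0.75, 1.0]


def training_subsets(report_ids: Iterable[int],
                     fractions: Sequence[float],
                     seed: int = 0) -> Dict[float, List[int]]:
    """Shuffle once, then take a prefix per fraction (assumption: nested subsets)."""
    rng = random.Random(seed)
    shuffled = list(report_ids)
    rng.shuffle(shuffled)
    subsets = {}
    for fraction in fractions:
        size = max(1, round(fraction * len(shuffled)))
        subsets[fraction] = shuffled[:size]
    return subsets


# Hypothetical training pool of 1,000 report identifiers.
subsets = training_subsets(range(1000), FRACTIONS)
for fraction, ids in subsets.items():
    print(f"{fraction:.1%} of training data: {len(ids)} reports")
```

Each model would then be trained on every subset and scored on the same held-out test set, so that performance can be plotted against training-set size.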

Laurent G, Craynest F, Thobois M, Hajjaji N. Automatic Classification of Tumor Response From Radiology Reports With Rule-Based Natural Language Processing Integrated Into the Clinical Oncology Workflow. JCO Clin Cancer Inform 2023;7:e2200139. [PMID: 36780606 DOI: 10.1200/cci.22.00139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/15/2023]

Abstract

PURPOSE

Imaging reports in oncology provide critical information about the disease evolution that should be timely shared to tailor the clinical decision making and care coordination of patients with advanced cancer. However, tumor response stays unstructured in free-text and underexploited. Natural language processing (NLP) methods can help provide this critical information into the electronic health records (EHR) in real time to assist health care workers.

METHODS

A rule-based algorithm was developed using SAS tools to automatically extract and categorize tumor response within progression or no progression categories. 2,970 magnetic resonance imaging, computed tomography scan, and positron emission tomography French reports were extracted from the EHR of a large comprehensive cancer center to build a 2,637-document training set and a 603-document validation set. The model was also tested on 189 imaging reports from 46 different radiology centers. A tumor dashboard was created in the EHR using the Timeline tool of the vis.js javascript library.

RESULTS

An NLP methodology was applied to create an ontology of radiographic terms defining tumor response, mapping text to five main concepts, and application decision rules on the basis of clinical practice RECIST guidelines. The model achieved an overall accuracy of 0.88 (ranging from 0.87 to 0.94), with similar performance on both progression and no progression classification. The overall accuracy was 0.82 on reports from different radiology centers. Data were visualized and organized in a dynamic tumor response timeline. This tool was deployed successfully at our institution both retrospectively and prospectively as part of an automatic pipeline to screen reports and classify tumor response in real time for all metastatic patients.

CONCLUSION

Our approach provides an NLP-based framework to structure and classify tumor response from the EHR and integrate tumor response classification into the clinical oncology workflow.

Binsfeld Gonçalves L, Nesic I, Obradovic M, Stieltjes B, Weikert T, Bremerich J. Natural Language Processing and Graph Theory: Making Sense of Imaging Records in a Novel Representation Frame. JMIR Med Inform 2022;10:e40534. [PMID: 36542426 PMCID: PMC9813822 DOI: 10.2196/40534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2022] [Revised: 09/13/2022] [Accepted: 11/30/2022] [Indexed: 12/24/2022]

Abstract

BACKGROUND

A concise visualization framework of related reports would increase readability and improve patient management. To this end, temporal referrals to prior comparative exams are an essential connection to previous exams in written reports. Due to unstructured narrative texts' variable structure and content, their extraction is hampered by poor computer readability. Natural language processing (NLP) permits the extraction of structured information from unstructured texts automatically and can serve as an essential input for such a novel visualization framework.

OBJECTIVE

This study proposes and evaluates an NLP-based algorithm capable of extracting the temporal referrals in written radiology reports, applies it to all the radiology reports generated for 10 years, introduces a graphical representation of imaging reports, and investigates its benefits for clinical and research purposes.

METHODS

In this single-center, university hospital, retrospective study, we developed a convolutional neural network capable of extracting the date of referrals from imaging reports. The model's performance was assessed by calculating precision, recall, and F1-score using an independent test set of 149 reports. Next, the algorithm was applied to our department's radiology reports generated from 2011 to 2021. Finally, the reports and their metadata were represented in a modulable graph.

RESULTS

For extracting the date of referrals, the named-entity recognition (NER) model had a high precision of 0.93, a recall of 0.95, and an F1-score of 0.94. A total of 1,684,635 reports were included in the analysis. Temporal reference was mentioned in 53.3% (656,852/1,684,635), explicitly stated as not available in 21.0% (258,386/1,684,635), and omitted in 25.7% (317,059/1,684,635) of the reports. Imaging records can be visualized in a directed and modulable graph, in which the referring links represent the connecting arrows.

CONCLUSIONS

Automatically extracting the date of referrals from unstructured radiology reports using deep learning NLP algorithms is feasible. Graphs refined the selection of distinct pathology pathways, facilitated the revelation of missing comparisons, and enabled the query of specific referring exam sequences. Further work is needed to evaluate its benefits in clinics, research, and resource planning.
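
As a quick consistency check on the Results above, the F1-score is the harmonic mean of precision and recall. The sketch below recomputes it from the reported precision (0.93) and recall (0.95); the function name `f1_score` is a local helper defined here, not a library call.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the named-entity recognition model.
recomputed_f1 = f1_score(0.93, 0.95)
print(round(recomputed_f1, 2))  # 0.94, matching the reported F1-score
```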

Nandish S, R J P, N M N. Natural Language Processing Approaches for Automated Multilevel and Multiclass Classification of Breast Lesions on Free-Text Cytopathology Reports. JCO Clin Cancer Inform 2022;6:e2200036. [PMID: 36103641 DOI: 10.1200/cci.22.00036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]

Abstract

PURPOSE

The extensive growth and use of electronic health records (EHRs) and extending medical literature have led to huge opportunities to automate the extraction of relevant clinical information that helps in concise and effective clinical decision support. However, processing such information has traditionally been dependent on labor-intensive processes with human errors such as fatigue, oversight, and interobserver variability. Hence, this study aims at the processing of EHRs and performing multilevel and multiclass classification by fetching dominant characteristic features that are sufficient to detect and differentiate various types of breast lesions.

PATIENTS AND METHODS

In this study, unstructured EHRs on breast lesions obtained through fine-needle aspiration cytology technique are considered. The raw text was normalized into structured tabular form and converted to scores by performing sentiment analysis that helps to decide the total polarity or class label of the EHR. Supervised machine learning approaches, namely random forest and feed-forward neural network trained using Levenberg-Marquardt training function, are used for classification of the collected EHR data set containing 2,879 records that are split in the ratio of 80:20 as training and testing data sets, respectively.

RESULTS

Random forest and feed-forward neural network classifiers gave the best performance with an accuracy of 99.36%, an overall receiver operating characteristic-area under the curve of 99.2%, a correlation with ground truth of 98.3%, and a histopathologic correlation of 98.6%.

CONCLUSION

Natural language processing has huge potential to automate the extraction of clinical features from breast lesions. The proposed multilevel and multiclass classification approach is used to classify 13 different types of breast lesions with 20 different labels into five classes to decide the type of treatment that should be given to patients by a physician or oncologist.

Wang L, Fu S, Wen A, Ruan X, He H, Liu S, Moon S, Mai M, Riaz IB, Wang N, Yang P, Xu H, Warner JL, Liu H. Assessment of Electronic Health Record for Cancer Research and Patient Care Through a Scoping Review of Cancer Natural Language Processing. JCO Clin Cancer Inform 2022;6:e2200006. [PMID: 35917480 PMCID: PMC9470142 DOI: 10.1200/cci.22.00006] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 01/19/2022] [Revised: 03/18/2022] [Accepted: 06/15/2022] [Indexed: 11/20/2022]

Batch KE, Yue J, Darcovich A, Lupton K, Liu CC, Woodlock DP, El Amine MAK, Causa-Andrieu PI, Gazit L, Nguyen GH, Zulkernine F, Do RKG, Simpson AL. Developing a Cancer Digital Twin: Supervised Metastases Detection From Consecutive Structured Radiology Reports. Front Artif Intell 2022;5:826402. [PMID: 35310959 PMCID: PMC8924403 DOI: 10.3389/frai.2022.826402] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 11/30/2021] [Accepted: 01/27/2022] [Indexed: 11/13/2022]

Abstract

The development of digital cancer twins relies on the capture of high-resolution representations of individual cancer patients throughout the course of their treatment. Our research aims to improve the detection of metastatic disease over time from structured radiology reports by exposing prediction models to historical information. We demonstrate that Natural language processing (NLP) can generate better weak labels for semi-supervised classification of computed tomography (CT) reports when it is exposed to consecutive reports through a patient's treatment history. Around 714,454 structured radiology reports from Memorial Sloan Kettering Cancer Center adhering to a standardized departmental structured template were used for model development with a subset of the reports included for validation. To develop the models, a subset of the reports was curated for ground-truth: 7,732 total reports in the lung metastases dataset from 867 individual patients; 2,777 reports in the liver metastases dataset from 315 patients; and 4,107 reports in the adrenal metastases dataset from 404 patients. We use NLP to extract and encode important features from the structured text reports, which are then used to develop, train, and validate models. Three models—a simple convolutional neural network (CNN), a CNN augmented with an attention layer, and a recurrent neural network (RNN)—were developed to classify the type of metastatic disease and validated against the ground truth labels. The models use features from consecutive structured text radiology reports of a patient to predict the presence of metastatic disease in the reports. A single-report model, previously developed to analyze one report instead of multiple past reports, is included and the results from all four models are compared based on accuracy, precision, recall, and F1-score. The best model is used to label all 714,454 reports to generate metastases maps. 
Our results suggest that NLP models can extract cancer progression patterns from multiple consecutive reports and predict the presence of metastatic disease in multiple organs with higher performance when compared with a single-report-based prediction. It demonstrates a promising automated approach to label large numbers of radiology reports without involving human experts in a time- and cost-effective manner and enables tracking of cancer progression over time.

Lin S, Lin Y, Wu K, Wang Y, Feng Z, Duan M, Liu S, Fan Y, Huang L, Zhou F. FeCO3, constructing the network biomarkers using the inter-feature correlation coefficients and its application in detecting high-order breast cancer biomarkers. Curr Bioinform 2022. [DOI: 10.2174/1574893617666220124123303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]

Abstract

AIMS

This study aims to formulate the inter-feature correlation as the engineered features.

BACKGROUND

Modern biotechnologies tend to generate a huge number of characteristics of a sample, while an OMIC dataset usually has a few dozens or hundreds of samples due to the high costs of generating the OMIC data. So many bio-OMIC studies assumed the inter-feature independence and selected a feature with a high phenotype-association.

OBJECTIVE

However, many features are closely associated with each other due to their physical or functional interactions, which may be utilized as a new view of features.

METHOD

This study proposed a feature engineering algorithm based on the correlation coefficients (FeCO3) by utilizing the correlations between a given sample and a few reference samples. A comprehensive evaluation was carried out for the proposed FeCO3 network features using 24 bio-OMIC datasets.

RESULT

The experimental data suggested that the newly calculated FeCO3 network features tended to achieve better classification performances than the original features, using the same popular feature selection and classification algorithms. The FeCO3 network features were also consistently supported by the literature. FeCO3 was utilized to investigate the high-order engineered biomarkers of breast cancer, and detected the PBX2 gene (Pre-B-Cell Leukemia Transcription Factor 2) as one of the candidate breast cancer biomarkers. Although the two methylated residues cg14851325 (Pvalue=8.06e-2) and cg16602460 (Pvalue=1.19e-1) within PBX2 did not have statistically significant association with breast cancers, the high-order inter-feature correlations showed a significant association with breast cancers.

CONCLUSION

The proposed FeCO3 network features calculated the high-order inter-feature correlations as novel features, and may facilitate the investigations of complex diseases from this new perspective. The source code is available in FigShare at 10.6084/m9.figshare.13550051 or the web site http://www.healthinformaticslab.org/supp/.

Affiliation(s)

Shenggeng Lin: College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China; State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
Yuqi Lin: College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China
Kexin Wu: College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China
Yueying Wang: College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China; Department of Epidemiology and Biostatistics, School of Public Health, Jilin University, Changchun, Jilin Province, China
Zixuan Feng: College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China
Meiyu Duan: College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China
Shuai Liu: College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China
Yusi Fan: College of Software, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China
Lan Huang: College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China
Fengfeng Zhou: College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China

Foundations of Machine Learning-Based Clinical Prediction Modeling: Part V-A Practical Approach to Regression Problems. ACTA NEUROCHIRURGICA. SUPPLEMENT 2021;134:43-50. [PMID: 34862526 DOI: 10.1007/978-3-030-85292-4_6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/19/2022]

Foundations of Machine Learning-Based Clinical Prediction Modeling: Part IV-A Practical Approach to Binary Classification Problems. ACTA NEUROCHIRURGICA. SUPPLEMENT 2021;134:33-41. [PMID: 34862525 DOI: 10.1007/978-3-030-85292-4_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]

Feghali J, Jimenez AE, Schilling AT, Azad TD. Overview of Algorithms for Natural Language Processing and Time Series Analyses. ACTA NEUROCHIRURGICA. SUPPLEMENT 2021;134:221-242. [PMID: 34862546 DOI: 10.1007/978-3-030-85292-4_26] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2022]

Staartjes VE, Regli L, Serra C. Machine Intelligence in Clinical Neuroscience: Taming the Unchained Prometheus. ACTA NEUROCHIRURGICA. SUPPLEMENT 2021;134:1-4. [PMID: 34862521 DOI: 10.1007/978-3-030-85292-4_1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 10/19/2022]

Do RKG, Lupton K, Causa Andrieu PI, Luthra A, Taya M, Batch K, Nguyen H, Rahurkar P, Gazit L, Nicholas K, Fong CJ, Gangai N, Schultz N, Zulkernine F, Sevilimedu V, Juluru K, Simpson A, Hricak H. Patterns of Metastatic Disease in Patients with Cancer Derived from Natural Language Processing of Structured CT Radiology Reports over a 10-year Period. Radiology 2021;301:115-122. [PMID: 34342503 DOI: 10.1148/radiol.2021210043] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Indexed: 11/11/2022]

Abstract

BACKGROUND

Patterns of metastasis in cancer are increasingly relevant to prognostication and treatment planning but have historically been documented by means of autopsy series.

PURPOSE

To show the feasibility of using natural language processing (NLP) to gather accurate data from radiology reports for assessing spatial and temporal patterns of metastatic spread in a large patient cohort.

MATERIALS AND METHODS

In this retrospective longitudinal study, consecutive patients who underwent CT from July 2009 to April 2019 and whose CT reports followed a departmental structured template were included. Three radiologists manually curated a sample of 2219 reports for the presence or absence of metastases across 13 organs; these manually curated reports were used to develop three NLP models with an 80%-20% split for training and test sets. A separate random sample of 448 manually curated reports was used for validation. Model performance was measured by accuracy, precision, and recall for each organ. The best-performing NLP model was used to generate a final database of metastatic disease across all patients. For each cancer type, statistical descriptive reports were provided by analyzing the frequencies of metastatic disease at the report and patient levels.

RESULTS

In 91 665 patients (mean age ± standard deviation, 61 years ± 15; 46 939 women), 387 359 reports were labeled. The best-performing NLP model achieved accuracies from 90% to 99% across all organs. Metastases were most frequently reported in abdominopelvic (23.6% of all reports) and thoracic (17.6%) nodes, followed by lungs (14.7%), liver (13.7%), and bones (9.9%). Metastatic disease tropism is distinct among common cancers, with the most common first site being bones in prostate and breast cancers and liver among pancreatic and colorectal cancers.

CONCLUSION

Natural language processing may be applied to cancer patients' CT reports to generate a large database of metastatic phenotypes. Such a database could be combined with genomic studies and used to explore prognostic imaging phenotypes with relevance to treatment planning.

© RSNA, 2021. Online supplemental material is available for this article.

Affiliation(s)

Richard K G Do, Kaelan Lupton, Pamela I Causa Andrieu, Anisha Luthra, Michio Taya, Karen Batch, Huy Nguyen, Prachi Rahurkar, Lior Gazit, Kevin Nicholas, Christopher J Fong, Natalie Gangai, Nikolaus Schultz, Farhana Zulkernine, Varadan Sevilimedu, Krishna Juluru, Amber Simpson, Hedvig Hricak
From the Department of Radiology (R.K.G.D., P.I.C.A., M.T., N.G., K.J., H.H.), Human Pathology and Pathogenesis Program, Center for Molecular Oncology (A.L.), Department of Strategy and Innovation (H.N., P.R., L.G., K.N.), and Biostatistics Service, Department of Epidemiology and Biostatistics (C.J.F., N.S., V.S.), Memorial Sloan Kettering Cancer Center, 1275 York Ave, New York, NY 10065; and School of Computing, Queen's University, Kingston, Canada (K.L., K.B., F.Z., A.S.)

Wood DA, Kafiabadi S, Al Busaidi A, Guilhem EL, Lynch J, Townend MK, Montvila A, Kiik M, Siddiqui J, Gadapa N, Benger MD, Mazumder A, Barker G, Ourselin S, Cole JH, Booth TC. Deep learning to automate the labelling of head MRI datasets for computer vision applications. Eur Radiol 2021;32:725-736. [PMID: 34286375 PMCID: PMC8660736 DOI: 10.1007/s00330-021-08132-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/10/2021] [Revised: 06/02/2021] [Accepted: 06/14/2021] [Indexed: 02/07/2023]

Abstract

Objectives

The purpose of this study was to build a deep learning model to derive labels from neuroradiology reports and assign these to the corresponding examinations, overcoming a bottleneck to computer vision model development.

Methods

Reference-standard labels were generated by a team of neuroradiologists for model training and evaluation. Three thousand examinations were labelled for the presence or absence of any abnormality by manually scrutinising the corresponding radiology reports (‘reference-standard report labels’); a subset of these examinations (n = 250) were assigned ‘reference-standard image labels’ by interrogating the actual images. Separately, 2000 reports were labelled for the presence or absence of 7 specialised categories of abnormality (acute stroke, mass, atrophy, vascular abnormality, small vessel disease, white matter inflammation, encephalomalacia), with a subset of these examinations (n = 700) also assigned reference-standard image labels. A deep learning model was trained using labelled reports and validated in two ways: comparing predicted labels to (i) reference-standard report labels and (ii) reference-standard image labels. The area under the receiver operating characteristic curve (AUC-ROC) was used to quantify model performance. Accuracy, sensitivity, specificity, and F1 score were also calculated.

Results

Accurate classification (AUC-ROC > 0.95) was achieved for all categories when tested against reference-standard report labels. A drop in performance (ΔAUC-ROC > 0.02) was seen for three categories (atrophy, encephalomalacia, vascular) when tested against reference-standard image labels, highlighting discrepancies in the original reports. Once trained, the model assigned labels to 121,556 examinations in under 30 min.

Conclusions

Our model accurately classifies head MRI examinations, enabling automated dataset labelling for downstream computer vision applications.

Key Points

• Deep learning is poised to revolutionise image recognition tasks in radiology; however, a barrier to clinical adoption is the difficulty of obtaining large labelled datasets for model training.

• We demonstrate a deep learning model which can derive labels from neuroradiology reports and assign these to the corresponding examinations at scale, facilitating the development of downstream computer vision models.

• We rigorously tested our model by comparing labels predicted on the basis of neuroradiology reports with two sets of reference-standard labels: (1) labels derived by manually scrutinising each radiology report and (2) labels derived by interrogating the actual images.

Supplementary Information

The online version contains supplementary material available at 10.1007/s00330-021-08132-0.

Affiliation(s)

David A Wood School of Biomedical Engineering & Imaging Sciences, King's College London, Rayne Institute, 4th Floor, Lambeth Wing, London, SE1 7EH, UK
Sina Kafiabadi Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK
Aisha Al Busaidi Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK
Emily L Guilhem Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK
Jeremy Lynch Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK
Matthew K Townend Wrightington, Wigan & Leigh NHSFT, Wigan, WN1 2NN, UK
Antanas Montvila Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK; Hospital of Lithuanian University of Health Sciences, Kaunas Clinics, Kaunas, Lithuania
Martin Kiik School of Biomedical Engineering & Imaging Sciences, King's College London, Rayne Institute, 4th Floor, Lambeth Wing, London, SE1 7EH, UK
Juveria Siddiqui Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK
Naveen Gadapa Department of Neurology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK
Matthew D Benger Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK
Asif Mazumder Guy's and St Thomas' NHS Foundation Trust, Westminster Bridge Road, London, SE1 7EH, UK
Gareth Barker Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, SE5 8AF, UK
Sebastian Ourselin School of Biomedical Engineering & Imaging Sciences, King's College London, Rayne Institute, 4th Floor, Lambeth Wing, London, SE1 7EH, UK
James H Cole Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, SE5 8AF, UK; Centre for Medical Image Computing, Department of Computer Science, University College London, London, WC1V 6LJ, UK; Dementia Research Centre, University College London, London, WC1N 3BG, UK
Thomas C Booth School of Biomedical Engineering & Imaging Sciences, King's College London, Rayne Institute, 4th Floor, Lambeth Wing, London, SE1 7EH, UK; Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, SE5 9RS, UK

Senders JT, Cho LD, Calvachi P, McNulty JJ, Ashby JL, Schulte IS, Almekkawi AK, Mehrtash A, Gormley WB, Smith TR, Broekman MLD, Arnaout O. Automating Clinical Chart Review: An Open-Source Natural Language Processing Pipeline Developed on Free-Text Radiology Reports From Patients With Glioblastoma. JCO Clin Cancer Inform 2021;4:25-34. [PMID: 31977252 DOI: 10.1200/cci.19.00060] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 01/05/2023]

Abstract

PURPOSE

The aim of this study was to develop an open-source natural language processing (NLP) pipeline for text mining of medical information from clinical reports. We also aimed to provide insight into why certain variables or reports are more suitable for clinical text mining than others.

MATERIALS AND METHODS

Various NLP models were developed to extract 15 radiologic characteristics from free-text radiology reports for patients with glioblastoma. Ten-fold cross-validation was used to optimize the hyperparameter settings and estimate model performance. We examined how model performance was associated with quantitative attributes of the radiologic characteristics and reports.

RESULTS

In total, 562 unique brain magnetic resonance imaging reports were retrieved. NLP extracted 15 radiologic characteristics with high to excellent discrimination (area under the curve, 0.82 to 0.98) and accuracy (78.6% to 96.6%). Model performance was correlated with the inter-rater agreement of the manually provided labels (ρ = 0.904; P < .001) but not with the frequency distribution of the variables of interest (ρ = 0.179; P = .52). All variables labeled with a near perfect inter-rater agreement were classified with excellent performance (area under the curve > 0.95). Excellent performance could be achieved for variables with only 50 to 100 observations in the minority group and class imbalances up to a 9:1 ratio. Report-level classification accuracy was not associated with the number of words or the vocabulary size in the distinct text documents.

CONCLUSION

This study provides an open-source NLP pipeline that allows for text mining of narratively written clinical reports. Small sample sizes and class imbalance should not be considered as absolute contraindications for text mining in clinical research. However, future studies should report measures of inter-rater agreement whenever ground truth is based on a consensus label and use this measure to identify clinical variables eligible for text mining.

Affiliation(s)

Joeky T Senders Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Department of Neurosurgery, Leiden University Medical Center, Leiden, the Netherlands
Logan D Cho Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Department of Neuroscience, Brown University, Providence, RI
Paola Calvachi Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
John J McNulty Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Vagelos College of Physicians and Surgeons, Columbia University, New York, NY
Joanna L Ashby Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
Isabelle S Schulte Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
Ahmad Kareem Almekkawi Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
Alireza Mehrtash Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
William B Gormley Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
Timothy R Smith Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
Marike L D Broekman Department of Neurosurgery, Leiden University Medical Center, Leiden, the Netherlands; Department of Neurosurgery, Haaglanden Medical Center, The Hague, the Netherlands
Omar Arnaout Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA

Decker BM, Hill CE, Baldassano SN, Khankhanian P. Can antiepileptic efficacy and epilepsy variables be studied from electronic health records? A review of current approaches. Seizure 2021;85:138-144. [PMID: 33461032 DOI: 10.1016/j.seizure.2020.11.011] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/30/2020] [Revised: 11/16/2020] [Accepted: 11/17/2020] [Indexed: 12/16/2022]

Heo TS, Kim YS, Choi JM, Jeong YS, Seo SY, Lee JH, Jeon JP, Kim C. Prediction of Stroke Outcome Using Natural Language Processing-Based Machine Learning of Radiology Report of Brain MRI. J Pers Med 2020;10:jpm10040286. [PMID: 33339385 PMCID: PMC7766032 DOI: 10.3390/jpm10040286] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 11/27/2020] [Revised: 12/09/2020] [Accepted: 12/15/2020] [Indexed: 01/28/2023]

Abstract

Brain magnetic resonance imaging (MRI) is useful for predicting the outcome of patients with acute ischemic stroke (AIS). Although deep learning (DL) using brain MRI with certain image biomarkers has shown satisfactory results in predicting poor outcomes, no study has assessed the usefulness of natural language processing (NLP)-based machine learning (ML) algorithms using brain MRI free-text reports of AIS patients. Therefore, we aimed to assess whether NLP-based ML algorithms using brain MRI text reports could predict poor outcomes in AIS patients. This study included only English text reports of brain MRIs examined during admission of AIS patients. Poor outcome was defined as a modified Rankin Scale score of 3-6, and the data were captured by trained nurses and physicians. We included only the text report of the first MRI scan during the admission. The text dataset was randomly divided into a training and test dataset with a 7:3 ratio. Text was vectorized to word, sentence, and document levels. In the word-level approach, which did not consider the sequence of words, the "bag-of-words" model was used to reflect the number of repetitions of each text token. The "sent2vec" method was used in the sentence-level approach, which considered the sequence of words, and word embedding was used in the document-level approach. In addition to conventional ML algorithms, DL algorithms such as the convolutional neural network (CNN), long short-term memory, and multilayer perceptron were used to predict poor outcomes using 5-fold cross-validation and grid search techniques. The performance of each ML classifier was compared using the area under the receiver operating characteristic curve (AUROC). Among 1840 subjects with AIS, 645 patients (35.1%) had a poor outcome 3 months after the stroke onset. Random forest was the best classifier (AUROC, 0.782) using a word-level approach. Overall, the document-level approach exhibited better performance than did the word- or sentence-level approaches. Among all the ML classifiers, the multi-CNN algorithm demonstrated the best classification performance (0.805), followed by the CNN (0.799) algorithm. When predicting future clinical outcomes using NLP-based ML of radiology free-text reports of brain MRI, DL algorithms showed superior performance over the other ML algorithms. In particular, the prediction of poor outcomes in document-level NLP DL was improved more by multi-CNN and CNN than by recurrent neural network-based algorithms. NLP-based DL algorithms can be used as an important digital marker for unstructured electronic health record data DL prediction.

Applications of artificial intelligence in neuro-oncology. Curr Opin Neurol 2020;32:850-856. [PMID: 31609739 DOI: 10.1097/wco.0000000000000761] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 12/29/2022]

Tsai CC, Lin YC, Ng SH, Chen YL, Cheng JS, Lu CS, Weng YH, Lin SH, Chen PY, Wu YM, Wang JJ. A Method for the Prediction of Clinical Outcome Using Diffusion Magnetic Resonance Imaging: Application on Parkinson's Disease. J Clin Med 2020;9:jcm9030647. [PMID: 32121190 PMCID: PMC7141247 DOI: 10.3390/jcm9030647] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/06/2020] [Revised: 02/10/2020] [Accepted: 02/18/2020] [Indexed: 01/06/2023]

Abstract

Robust early prediction of clinical outcomes in Parkinson's disease (PD) is paramount for implementing appropriate management interventions. We propose a method that uses the baseline MRI, measuring diffusion parameters from multiple parcellated brain regions, to predict the 2-year clinical outcome in Parkinson's disease. Diffusion tensor imaging was obtained from 82 patients (males/females = 45/37, mean age: 60.9 ± 7.3 years, baseline and after 23.7 ± 0.7 months) using a 3T MR scanner, which was normalized and parcellated according to the Automated Anatomical Labelling template. All patients were diagnosed with probable Parkinson's disease by the National Institute of Neurological Disorders and Stroke criteria. Clinical outcome was graded using disease severity (Unified Parkinson's Disease Rating Scale and Modified Hoehn and Yahr staging), drug administration (levodopa equivalent daily dose), and quality of life (39-item PD Questionnaire). Selection and regularization of diffusion parameters, the mean diffusivity and fractional anisotropy, were performed using least absolute shrinkage and selection operator (LASSO) between baseline diffusion index and clinical outcome over 2 years. Identified features were entered into a stepwise multivariate regression model, followed by a leave-one-out/5-fold cross validation and additional blind validation using an independent dataset. The predicted Unified Parkinson's Disease Rating Scale for each individual was consistent with the observed values at blind validation (adjusted R² 0.76) by using 13 features, such as mean diffusivity in lingual, nodule lobule of cerebellum vermis and fractional anisotropy in rolandic operculum, and quadrangular lobule of cerebellum. We conclude that baseline diffusion MRI is potentially capable of predicting 2-year clinical outcomes in patients with Parkinson's disease on an individual basis.

Affiliation(s)

Chih-Chien Tsai Healthy Aging Research Center, Chang Gung University, Taoyuan 33302, Taiwan
Yu-Chun Lin Department of Medical Imaging and Intervention, Chang Gung Memorial Hospital, Linkou, Taoyuan 33375, Taiwan; Department of Medical Imaging and Radiological Sciences, Chang Gung University, Taoyuan 33302, Taiwan
Shu-Hang Ng Department of Medical Imaging and Intervention, Chang Gung Memorial Hospital, Linkou, Taoyuan 33375, Taiwan; Department of Medical Imaging and Radiological Sciences, Chang Gung University, Taoyuan 33302, Taiwan
Yao-Liang Chen Department of Medical Imaging and Intervention, Chang Gung Memorial Hospital, Linkou, Taoyuan 33375, Taiwan; Department of Diagnostic Radiology, Chang Gung Memorial Hospital, Keelung City 20401, Taiwan
Jur-Shan Cheng Clinical Informatics and Medical Statistics Research Center, College of Medicine, Chang Gung University, Taoyuan 33302, Taiwan; Department of Emergency Medicine, Chang Gung Memorial Hospital, Keelung City 20401, Taiwan
Chin-Song Lu Professor Lu Neurological Clinic, Taoyuan 33375, Taiwan; Division of Movement Disorders, Department of Neurology, Chang Gung Memorial Hospital, Linkou, Taoyuan 33375, Taiwan; Neuroscience Research Center, Chang Gung Memorial Hospital, Linkou, Taoyuan 33375, Taiwan
Yi-Hsin Weng Division of Movement Disorders, Department of Neurology, Chang Gung Memorial Hospital, Linkou, Taoyuan 33375, Taiwan; Neuroscience Research Center, Chang Gung Memorial Hospital, Linkou, Taoyuan 33375, Taiwan; School of Medicine, Chang Gung University, Taoyuan 33302, Taiwan
Sung-Han Lin Department of Medical Imaging and Radiological Sciences, Chang Gung University, Taoyuan 33302, Taiwan
Po-Yuan Chen Department of Medical Imaging and Radiological Sciences, Chang Gung University, Taoyuan 33302, Taiwan
Yi-Ming Wu Department of Medical Imaging and Intervention, Chang Gung Memorial Hospital, Linkou, Taoyuan 33375, Taiwan; Department of Medical Imaging and Radiological Sciences, Chang Gung University, Taoyuan 33302, Taiwan
Jiun-Jie Wang Healthy Aging Research Center, Chang Gung University, Taoyuan 33302, Taiwan; Department of Medical Imaging and Radiological Sciences, Chang Gung University, Taoyuan 33302, Taiwan; Department of Diagnostic Radiology, Chang Gung Memorial Hospital, Keelung City 20401, Taiwan; Medical Imaging Research Center, Institute for Radiological Research, Chang Gung University/Chang Gung Memorial Hospital, Linkou 33375, Taoyuan, Taiwan. Tel.: +886-3-211-8800 (ext. 5391); Fax: +886-3-397-1936

Chen Z, Pang M, Zhao Z, Li S, Miao R, Zhang Y, Feng X, Feng X, Zhang Y, Duan M, Huang L, Zhou F. Feature selection may improve deep neural networks for the bioinformatics problems. Bioinformatics 2019;36:1542-1552. [DOI: 10.1093/bioinformatics/btz763] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 05/24/2019] [Revised: 09/03/2019] [Accepted: 10/02/2019] [Indexed: 12/22/2022]

Affiliation(s)

Zheng Chen, Meng Pang, Zixin Zhao, Shuainan Li, Rui Miao, Yifan Zhang, Xiaoyue Feng, Xin Feng, Yexian Zhang, Meiyu Duan, Lan Huang, Fengfeng Zhou
BioKnow Health Informatics Lab, College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China
