Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Leyh-Bannurah SR, Tian Z, Karakiewicz PI, Wolffgang U, Sauter G, Fisch M, Pehrke D, Huland H, Graefen M, Budäus L. Deep Learning for Natural Language Processing in Urology: State-of-the-Art Automated Extraction of Detailed Pathologic Prostate Cancer Data From Narratively Written Electronic Health Records. JCO Clin Cancer Inform 2019;2:1-9. [PMID: 30652616 DOI: 10.1200/cci.18.00080] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 11/20/2022] Open

Cited by Other Article(s)

Kaufmann B, Busby D, Das CK, Tillu N, Menon M, Tewari AK, Gorin MA. Validation of a Zero-shot Learning Natural Language Processing Tool to Facilitate Data Abstraction for Urologic Research. Eur Urol Focus 2024;10:279-287. [PMID: 38278710 DOI: 10.1016/j.euf.2024.01.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2023] [Revised: 12/18/2023] [Accepted: 01/15/2024] [Indexed: 01/28/2024]

Abstract

BACKGROUND

Urologic research often requires data abstraction from unstructured text contained within the electronic health record. A number of natural language processing (NLP) tools have been developed to aid with this time-consuming task; however, the generalizability of these tools is typically limited by the need for task-specific training.

OBJECTIVE

To describe the development and validation of a zero-shot learning NLP tool to facilitate data abstraction from unstructured text for use in downstream urologic research.

DESIGN, SETTING, AND PARTICIPANTS

An NLP tool based on the GPT-3.5 model from OpenAI was developed and compared with three physicians for time to task completion and accuracy for abstracting 14 unique variables from a set of 199 deidentified radical prostatectomy pathology reports. The reports were processed in vectorized and scanned formats to establish the impact of optical character recognition on data abstraction.

INTERVENTION

A zero-shot learning NLP tool for data abstraction.

OUTCOME MEASUREMENTS AND STATISTICAL ANALYSIS

The tool was compared with the human abstractors in terms of superiority for data abstraction speed and noninferiority for accuracy.

RESULTS AND LIMITATIONS

The human abstractors required a median (interquartile range) of 93 s (72-122 s) per report for data abstraction, whereas the software required a median of 12 s (10-15 s) for the vectorized reports and 15 s (13-17 s) for the scanned reports (p < 0.001 for all paired comparisons). The accuracies of the three human abstractors were 94.7% (95% confidence interval [CI], 93.8-95.5%), 97.8% (95% CI, 97.2-98.3%), and 96.4% (95% CI, 95.6-97%) for the combined set of 2786 data points. The tool had accuracy of 94.2% (95% CI, 93.3-94.9%) for the vectorized reports and was noninferior to the human abstractors at a margin of -10% (α = 0.025). The tool had slightly lower accuracy of 88.7% (95% CI 87.5-89.9%) for the scanned reports, making it noninferior to two of three human abstractors.

CONCLUSIONS

The developed zero-shot learning NLP tool offers urologic researchers a highly generalizable and accurate method for data abstraction from unstructured text. An open access version of the tool is available for immediate use by the urologic community.

PATIENT SUMMARY

In this report, we describe the design and validation of an artificial intelligence tool for abstracting discrete data from unstructured notes contained within the electronic medical record. This freely available tool, which is based on the GPT-3.5 technology from OpenAI, is intended to facilitate research and scientific discovery by the urologic community.
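The noninferiority comparison reported in this abstract can be illustrated with a short calculation. The sketch below is not the authors' analysis: the abstract does not say which test was used, so it assumes an unpaired Wald interval for the difference of two proportions, treats the 2786 data points as independent, and uses the rounded accuracies quoted in the abstract. The function name `is_noninferior` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

N_POINTS = 2786   # combined data points reported in the abstract
MARGIN = -0.10    # noninferiority margin reported in the abstract
ALPHA = 0.025     # one-sided significance level reported in the abstract


def is_noninferior(acc_tool, acc_human, n=N_POINTS, margin=MARGIN, alpha=ALPHA):
    """Return True if the lower bound of (tool - human) lies above the margin.

    Assumption: unpaired Wald interval with n observations per arm;
    the study's actual statistical test may differ.
    """
    z = NormalDist().inv_cdf(1 - alpha)
    diff = acc_tool - acc_human
    se = sqrt(acc_tool * (1 - acc_tool) / n + acc_human * (1 - acc_human) / n)
    return diff - z * se > margin


# Rounded accuracies of the three human abstractors, from the abstract.
humans = [0.947, 0.978, 0.964]

# Tool accuracy: 94.2% on vectorized reports, 88.7% on scanned reports.
vectorized = [is_noninferior(0.942, h) for h in humans]
scanned = [is_noninferior(0.887, h) for h in humans]
print(vectorized, scanned)
```

Under these assumptions the sketch agrees with the abstract's summary: the tool is noninferior to all three abstractors on the vectorized reports and to two of the three on the scanned reports.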

Truhn D, Loeffler CM, Müller-Franzes G, Nebelung S, Hewitt KJ, Brandner S, Bressem KK, Foersch S, Kather JN. Extracting structured information from unstructured histopathology reports using generative pre-trained transformer 4 (GPT-4). J Pathol 2024;262:310-319. [PMID: 38098169 DOI: 10.1002/path.6232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Revised: 09/16/2023] [Accepted: 11/03/2023] [Indexed: 02/06/2024]

Bozkurt S, Magnani CJ, Seneviratne MG, Brooks JD, Hernandez-Boussard T. Expanding the Secondary Use of Prostate Cancer Real World Data: Automated Classifiers for Clinical and Pathological Stage. Front Digit Health 2022;4:793316. [PMID: 35721793 PMCID: PMC9201076 DOI: 10.3389/fdgth.2022.793316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2021] [Accepted: 05/12/2022] [Indexed: 11/30/2022] Open

Abstract

Background

Explicit documentation of stage is an endorsed quality metric by the National Quality Forum. Clinical and pathological cancer staging is inconsistently recorded within clinical narratives but can be derived from text in the Electronic Health Record (EHR). To address this need, we developed a Natural Language Processing (NLP) solution for extraction of clinical and pathological TNM stages from the clinical notes in prostate cancer patients.

Methods

Data for patients diagnosed with prostate cancer between 2010 and 2018 were collected from a tertiary care academic healthcare system's EHR records in the United States. This system is linked to the California Cancer Registry, and contains data on diagnosis, histology, cancer stage, treatment and outcomes. A randomly selected sample of patients were manually annotated for stage to establish the ground truth for training and validating the NLP methods. For each patient, a vector representation of clinical text (written in English) was used to train a machine learning model alongside a rule-based model and compared with the ground truth.

Results

A total of 5,461 prostate cancer patients were identified in the clinical data warehouse and over 30% were missing stage information. Thirty-three to thirty-six percent of patients were missing a clinical stage and the models accurately imputed the stage in 21-32% of cases. Twenty-one percent had a missing pathological stage and using NLP 71% of missing T stages and 56% of missing N stages were imputed. For both clinical and pathological T and N stages, the rule-based NLP approach out-performed the ML approach with a minimum F1 score of 0.71 and 0.40, respectively. For clinical M stage the ML approach out-performed the rule-based model with a minimum F1 score of 0.79 and 0.88, respectively.

Conclusions

We developed an NLP pipeline to successfully extract clinical and pathological staging information from clinical narratives. Our results can serve as a proof of concept for using NLP to augment clinical and pathological stage reporting in cancer registries and EHRs to enhance the secondary use of these data.

Artificial Intelligence Applications in Urology: Reporting Standards to Achieve Fluency for Urologists. Urol Clin North Am 2022;49:65-117. [PMID: 34776055 PMCID: PMC9147289 DOI: 10.1016/j.ucl.2021.07.009] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 02/03/2023]

Qiu M, Li Y, Na K, Qi Z, Ma S, Zhou H, Xu X, Li J, Xu K, Wang X, Han Y. A Novel Multiple Risk Score Model for Prediction of Long-Term Ischemic Risk in Patients With Coronary Artery Disease Undergoing Percutaneous Coronary Intervention: Insights From the I-LOVE-IT 2 Trial. Front Cardiovasc Med 2022;8:756379. [PMID: 35096990 PMCID: PMC8793781 DOI: 10.3389/fcvm.2021.756379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2021] [Accepted: 12/06/2021] [Indexed: 11/13/2022] Open

Abstract

Background: A plug-and-play standardized algorithm to identify the ischemic risk in patients with coronary artery disease (CAD) undergoing percutaneous coronary intervention (PCI) could be a valuable step toward helping a wide spectrum of clinic workers. This study intended to investigate the ability to use the accumulation of multiple clinical routine risk scores to predict long-term ischemic events in patients with CAD undergoing PCI.

Methods: This was a secondary analysis of the I-LOVE-IT 2 (Evaluate Safety and Effectiveness of the Tivoli drug-eluting stent (DES) and the Firebird DES for Treatment of Coronary Revascularization) trial, which was a prospective, multicenter, and randomized study. The Global Registry for Acute Coronary Events (GRACE), baseline Synergy Between Percutaneous Coronary Intervention with Taxus and Cardiac Surgery (SYNTAX), residual SYNTAX, and age, creatinine, and ejection fraction (ACEF) score were calculated in all patients. Risk stratification was based on the number of these four scores that met the established thresholds for the ischemic risk. The primary end point was ischemic events at 48 months, defined as the composite of cardiac death, nonfatal myocardial infarction, stroke, or definite/probable stent thrombosis (ST).

Results: The 48-month ischemic events had a significant trend for higher event rates (from 6.61 to 16.93%) with an incremental number of risk scores presenting the higher ischemic risk from 0 to ≥3 (p trend < 0.001). In addition, the categories were associated with increased risk for all components of ischemic events, including cardiac death (from 1.36 to 3.15%), myocardial infarction (MI) (from 3.31 to 9.84%), stroke (3.31 to 6.10%), definite/probable ST (from 0.58 to 1.97%), and all-cause mortality (from 2.14 to 6.30%) (all p trend < 0.05). The net reclassification index after combining the four risk scores was 12.5% (5.3–20.0%), 9.4% (2.0–16.8%), 12.1% (4.5–19.7%), and 10.7% (3.3–18.1%), which offered statistically significant improvement in the performance, compared with SYNTAX, residual SYNTAX, ACEF, and GRACE score, respectively.

Conclusion: The novel multiple risk score model was significantly associated with the risk of long-term ischemic events in these patients with an increment of scores. Applying multiple risk scores to risk stratification yielded a meaningful improvement in predicting adverse outcomes.

Stenzl A, Sternberg CN, Ghith J, Serfass L, Schijvenaars BJA, Sboner A. Application of Artificial Intelligence to Overcome Clinical Information Overload in Urologic Cancer. BJU Int 2021;130:291-300. [PMID: 34846775 DOI: 10.1111/bju.15662] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/29/2022]

Abstract

OBJECTIVE

To describe the use of artificial intelligence (AI) in medical literature and trial data extraction, and its applications in uro-oncology. This bridging review, which consolidates information from the diverse applications of AI, highlights how AI users can investigate more sophisticated queries than with traditional methods, leading to synthesis of raw data and complex outputs into more actionable and personalized results, particularly in the field of uro-oncology.

METHODS

Literature and clinical trial searches were performed in PubMed, Dimensions, Embase and Google (1999-2020). The searches focused on the use of AI and its various forms to facilitate literature searches, clinical guidelines development, and clinical trial data extraction in uro-oncology. To illustrate how AI can be applied to address questions about optimizing therapeutic decision making and individualizing treatment regimens, the Dimensions-linked information platform was searched for "prostate cancer" keywords (76 publications were identified; 48 were included).

RESULTS

AI offers the promise of transforming raw data and complex outputs into actionable insights. Literature and clinical trial searches can be automated, enabling clinicians to develop and analyze publications expeditiously on complex issues such as therapeutic sequencing and to obtain updates on documents that evolve at the pace and scope of the landscape. An AI-based platform inclusive of 12 trial databases and >100 scientific literature sources enabled the creation of an interactive visualization.

CONCLUSION

As the literature and clinical trial landscape continues to grow in complexity and with increasing speed, the ability to pull the right information at the right time from different search engines and resources while excluding social media bias becomes more challenging. This review demonstrates that by applying natural language processing and machine learning algorithms, validated and optimized AI leads to a speedier, more personalized, efficient and focused search compared with traditional methods.

Abedian S, Sholle ET, Adekkanattu PM, Cusick MM, Weiner SE, Shoag JE, Hu JC, Campion TR. Automated Extraction of Tumor Staging and Diagnosis Information From Surgical Pathology Reports. JCO Clin Cancer Inform 2021;5:1054-1061. [PMID: 34694896 PMCID: PMC8812635 DOI: 10.1200/cci.21.00065] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 04/14/2021] [Revised: 08/25/2021] [Accepted: 09/29/2021] [Indexed: 11/20/2022] Open

Shin D, Kam HJ, Jeon MS, Kim HY. Automatic Classification of Thyroid Findings Using Static and Contextualized Ensemble Natural Language Processing Systems: Development Study. JMIR Med Inform 2021;9:e30223. [PMID: 34546183 PMCID: PMC8493453 DOI: 10.2196/30223] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/06/2021] [Revised: 07/15/2021] [Accepted: 08/02/2021] [Indexed: 11/30/2022] Open

Abstract

Background

In the case of Korean institutions and enterprises that collect nonstandardized and nonunified formats of electronic medical examination results from multiple medical institutions, a group of experienced nurses who can understand the results and related contexts initially classified the reports manually. The classification guidelines were established by years of workers’ clinical experiences and there were attempts to automate the classification work. However, there have been problems in which rule-based algorithms or human labor–intensive efforts can be time-consuming or limited owing to high potential errors. We investigated natural language processing (NLP) architectures and proposed ensemble models to create automated classifiers.

Objective

This study aimed to develop practical deep learning models with electronic medical records from 284 health care institutions and open-source corpus data sets for automatically classifying 3 thyroid conditions: healthy, caution required, and critical. The primary goal is to increase the overall accuracy of the classification, yet there are practical and industrial needs to correctly predict healthy (negative) thyroid condition data, which are mostly medical examination results, and minimize false-negative rates under the prediction of healthy thyroid conditions.

Methods

The data sets included thyroid and comprehensive medical examination reports. The textual data are not only documented in fully complete sentences but also written in lists of words or phrases. Therefore, we propose static and contextualized ensemble NLP network (SCENT) systems to successfully reflect static and contextual information and handle incomplete sentences. We prepared each convolution neural network (CNN)-, long short-term memory (LSTM)-, and efficiently learning an encoder that classifies token replacements accurately (ELECTRA)-based ensemble model by training or fine-tuning them multiple times. Through comprehensive experiments, we propose 2 versions of ensemble models, SCENT-v1 and SCENT-v2, with the single-architecture–based CNN, LSTM, and ELECTRA ensemble models for the best classification performance and practical use, respectively. SCENT-v1 is an ensemble of CNN and ELECTRA ensemble models, and SCENT-v2 is a hierarchical ensemble of CNN, LSTM, and ELECTRA ensemble models. SCENT-v2 first classifies the 3 labels using an ELECTRA ensemble model and then reclassifies them using an ensemble model of CNN and LSTM if the ELECTRA ensemble model predicted them as “healthy” labels.

Results

SCENT-v1 outperformed all the suggested models, with the highest F1 score (92.56%). SCENT-v2 had the second-highest recall value (94.44%) and the fewest misclassifications for caution-required thyroid condition while maintaining 0 classification error for the critical thyroid condition under the prediction of the healthy thyroid condition.

Conclusions

The proposed SCENT demonstrates good classification performance despite the unique characteristics of the Korean language and problems of data lack and imbalance, especially for the extremely low amount of critical condition data. The result of SCENT-v1 indicates that different perspectives of static and contextual input token representations can enhance classification performance. SCENT-v2 has a strong impact on the prediction of healthy thyroid conditions.
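The two-stage routing that the Methods section describes for SCENT-v2 can be sketched as follows. This is a minimal illustration of the control flow only, not the authors' implementation: the function `scent_v2_predict` and the keyword-based stand-in classifiers are hypothetical, and the real system uses trained ELECTRA, CNN, and LSTM ensembles.

```python
from typing import Callable

Label = str  # "healthy", "caution required", or "critical"
Classifier = Callable[[str], Label]


def scent_v2_predict(report: str,
                     electra_ensemble: Classifier,
                     cnn_lstm_ensemble: Classifier) -> Label:
    """Hierarchical routing as described in the abstract.

    Stage 1: the ELECTRA ensemble assigns one of the three labels.
    Stage 2: only reports predicted "healthy" are reclassified by the
    CNN + LSTM ensemble, to reduce false negatives.
    """
    first_pass = electra_ensemble(report)
    if first_pass == "healthy":
        return cnn_lstm_ensemble(report)
    return first_pass


# Hypothetical stand-in classifiers, used only to exercise the routing.
def toy_electra(text: str) -> Label:
    return "critical" if "malignant" in text else "healthy"


def toy_cnn_lstm(text: str) -> Label:
    return "caution required" if "nodule" in text else "healthy"


for text in ["malignant nodule", "small nodule", "normal thyroid"]:
    print(text, "->", scent_v2_predict(text, toy_electra, toy_cnn_lstm))
```

The second stage can only change a first-stage "healthy" prediction, which matches the stated goal of minimizing false negatives among reports predicted as healthy.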

Zeng J, Banerjee I, Henry AS, Wood DJ, Shachter RD, Gensheimer MF, Rubin DL. Natural Language Processing to Identify Cancer Treatments With Electronic Medical Records. JCO Clin Cancer Inform 2021;5:379-393. [PMID: 33822653 DOI: 10.1200/cci.20.00173] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/20/2022] Open

Abstract

PURPOSE

Knowing the treatments administered to patients with cancer is important for treatment planning and correlating treatment patterns with outcomes for personalized medicine study. However, existing methods to identify treatments are often lacking. We develop a natural language processing approach with structured electronic medical records and unstructured clinical notes to identify the initial treatment administered to patients with cancer.

METHODS

We used a total of 4,412 patients with 483,782 clinical notes from the Stanford Cancer Institute Research Database containing patients with nonmetastatic prostate, oropharynx, and esophagus cancer. We trained treatment identification models for each cancer type separately and compared performance of using only structured, only unstructured (bag-of-words, doc2vec, fasttext), and combinations of both (structured + bow, structured + doc2vec, structured + fasttext). We optimized the identification model among five machine learning methods (logistic regression, multilayer perceptrons, random forest, support vector machines, and stochastic gradient boosting). The treatment information recorded in the cancer registry served as the gold standard, and we compared our methods with an identification baseline based on billing codes.

RESULTS

For prostate cancer, we achieved an f1-score of 0.99 (95% CI, 0.97 to 1.00) for radiation and 1.00 (95% CI, 0.99 to 1.00) for surgery using structured + doc2vec. For oropharynx cancer, we achieved an f1-score of 0.78 (95% CI, 0.58 to 0.93) for chemoradiation and 0.83 (95% CI, 0.69 to 0.95) for surgery using doc2vec. For esophagus cancer, we achieved an f1-score of 1.0 (95% CI, 1.0 to 1.0) for both chemoradiation and surgery using all combinations of structured and unstructured data. We found that employing the free-text clinical notes outperforms using the billing codes or only structured data for all three cancer types.

CONCLUSION

Our results show that treatment identification using free-text clinical notes greatly improves upon the performance using billing codes and simple structured data. The approach can be used for treatment cohort identification and adapted for longitudinal cancer treatment identification.

Accurate pattern-based extraction of complex Gleason score expressions from pathology reports. J Biomed Inform 2021;120:103850. [PMID: 34182148 DOI: 10.1016/j.jbi.2021.103850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2020] [Revised: 04/25/2021] [Accepted: 06/19/2021] [Indexed: 11/20/2022]

López-Úbeda P, Pomares-Quimbaya A, Díaz-Galiano MC, Schulz S. Collecting specialty-related medical terms: Development and evaluation of a resource for Spanish. BMC Med Inform Decis Mak 2021;21:145. [PMID: 33947365 PMCID: PMC8094531 DOI: 10.1186/s12911-021-01495-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2020] [Accepted: 04/03/2021] [Indexed: 11/20/2022] Open

Validation of deep learning natural language processing algorithm for keyword extraction from pathology reports in electronic health records. Sci Rep 2020;10:20265. [PMID: 33219276 PMCID: PMC7679382 DOI: 10.1038/s41598-020-77258-w] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/24/2020] [Accepted: 11/05/2020] [Indexed: 11/20/2022] Open

Cassim N, Ahmad A, Wadee R, George JA, Glencross DK. Using Systematized Nomenclature of Medicine clinical term codes to assign histological findings for prostate biopsies in the Gauteng province, South Africa: Lessons learnt. Afr J Lab Med 2020;9:909. [PMID: 33102166 PMCID: PMC7565135 DOI: 10.4102/ajlm.v9i1.909] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/11/2018] [Accepted: 06/24/2020] [Indexed: 12/25/2022] Open

Oliwa T, Maron SB, Chase LM, Lomnicki S, Catenacci DVT, Furner B, Volchenboum SL. Obtaining Knowledge in Pathology Reports Through a Natural Language Processing Approach With Classification, Named-Entity Recognition, and Relation-Extraction Heuristics. JCO Clin Cancer Inform 2020;3:1-8. [PMID: 31365274 DOI: 10.1200/cci.19.00008] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 11/20/2022] Open

Abstract

PURPOSE

Robust institutional tumor banks depend on continuous sample curation or else subsequent biopsy or resection specimens are overlooked after initial enrollment. Curation automation is hindered by semistructured free-text clinical pathology notes, which complicate data abstraction. Our motivation is to develop a natural language processing method that dynamically identifies existing pathology specimen elements necessary for locating specimens for future use in a manner that can be re-implemented by other institutions.

PATIENTS AND METHODS

Pathology reports from patients with gastroesophageal cancer enrolled in The University of Chicago GI oncology tumor bank were used to train and validate a novel composite natural language processing-based pipeline with a supervised machine learning classification step to separate notes into internal (primary review) and external (consultation) reports; a named-entity recognition step to obtain label (accession number), location, date, and sublabels (block identifiers); and a results proofreading step.

RESULTS

We analyzed 188 pathology reports, including 82 internal reports and 106 external consult reports, and successfully extracted named entities grouped as sample information (label, date, location). Our approach identified up to 24 additional unique samples in external consult notes that could have been overlooked. Our classification model obtained 100% accuracy on the basis of 10-fold cross-validation. Precision, recall, and F1 for class-specific named-entity recognition models show strong performance.

CONCLUSION

Through a combination of natural language processing and machine learning, we devised a re-implementable and automated approach that can accurately extract specimen attributes from semistructured pathology notes to dynamically populate a tumor registry.

Baid U, Talbar S, Rane S, Gupta S, Thakur MH, Moiyadi A, Sable N, Akolkar M, Mahajan A. A Novel Approach for Fully Automatic Intra-Tumor Segmentation With 3D U-Net Architecture for Gliomas. Front Comput Neurosci 2020;14:10. [PMID: 32132913 PMCID: PMC7041417 DOI: 10.3389/fncom.2020.00010] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Received: 08/31/2019] [Accepted: 01/27/2020] [Indexed: 02/05/2023] Open

Abstract

Purpose: Gliomas are the most common primary brain malignancies, with varying degrees of aggressiveness and prognosis. Understanding of tumor biology and intra-tumor heterogeneity is necessary for planning personalized therapy and predicting response to therapy. Accurate tumoral and intra-tumoral segmentation on MRI is the first step toward understanding the tumor biology through computational methods. The purpose of this study was to design a segmentation algorithm and evaluate its performance on pre-treatment brain MRIs obtained from patients with gliomas.

Materials and Methods: In this study, we have designed a novel 3D U-Net architecture that segments various radiologically identifiable sub-regions like edema, enhancing tumor, and necrosis. Weighted patch extraction scheme from the tumor border regions is proposed to address the problem of class imbalance between tumor and non-tumorous patches. The architecture consists of a contracting path to capture context and the symmetric expanding path that enables precise localization. The Deep Convolutional Neural Network (DCNN) based architecture is trained on 285 patients, validated on 66 patients and tested on 191 patients with Glioma from Brain Tumor Segmentation (BraTS) 2018 challenge dataset. Three dimensional patches are extracted from multi-channel BraTS training dataset to train 3D U-Net architecture. The efficacy of the proposed approach is also tested on an independent dataset of 40 patients with High Grade Glioma from our tertiary cancer center. Segmentation results are assessed in terms of Dice Score, Sensitivity, Specificity, and Hausdorff 95 distance (ITCN intra-tumoral classification network).

Result: Our proposed architecture achieved Dice scores of 0.88, 0.83, and 0.75 for the whole tumor, tumor core and enhancing tumor, respectively, on BraTS validation dataset and 0.85, 0.77, 0.67 on test dataset. The results were similar on the independent patients' dataset from our hospital, achieving Dice scores of 0.92, 0.90, and 0.81 for the whole tumor, tumor core and enhancing tumor, respectively.

Conclusion: The results of this study show the potential of patch-based 3D U-Net for the accurate intra-tumor segmentation. From experiments, it is observed that the weighted patch-based segmentation approach gives comparable performance with the pixel-based approach when there is a thin boundary between tumor subparts.
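For readers unfamiliar with the Dice score used to report these segmentation results, the following is a minimal sketch of the standard definition, Dice = 2|A ∩ B| / (|A| + |B|), applied to flattened binary masks. It is not the authors' evaluation code; the function name `dice_score`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
def dice_score(pred, truth):
    """Dice overlap between two equal-length binary masks (flattened voxels)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled as tumor in both the prediction and the ground truth.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2 * intersection / total


# Toy example: 3 predicted voxels, 3 true voxels, 2 shared.
predicted_mask = [0, 1, 1, 1, 0, 0]
reference_mask = [0, 1, 1, 0, 1, 0]
score = dice_score(predicted_mask, reference_mask)
print(round(score, 3))
```

A score of 1 means the predicted and reference regions coincide exactly and 0 means they do not overlap at all, which is how the whole tumor, tumor core, and enhancing tumor values above should be read.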
