Reference Citation Analysis

For: Schroeck FR, Patterson OV, Alba PR, Pattison EA, Seigne JD, DuVall SL, Robertson DJ, Sirovich B, Goodney PP. Development of a Natural Language Processing Engine to Generate Bladder Cancer Pathology Data for Health Services Research. Urology 2017;110:84-91. [PMID: 28916254 PMCID: PMC5696035 DOI: 10.1016/j.urology.2017.07.056] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 05/02/2017] [Revised: 07/13/2017] [Accepted: 07/25/2017] [Indexed: 11/16/2022]

Cited by Other Article(s)

McGonagle K, Dematt EJ, Mi Z, Biswas K, Schroeck FR. Non-Muscle Invasive Bladder Cancer: Many More Patients Die With It Than Of It. Bladder Cancer 2024;10:113-117. [PMID: 39131873 PMCID: PMC11308635 DOI: 10.3233/blc-230099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Accepted: 04/06/2024] [Indexed: 08/13/2024]

Hashemi Gheinani A, Kim J, You S, Adam RM. Bioinformatics in urology - molecular characterization of pathophysiology and response to treatment. Nat Rev Urol 2024;21:214-242. [PMID: 37604982 DOI: 10.1038/s41585-023-00805-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/13/2023] [Indexed: 08/23/2023]

Narayan VM, Siolas D, Meadows ES, Turzhitsky V, Sillah A, Imai K, McMurry AJ, Li H. Evaluation of a Natural Language Processing Model to Identify and Characterize Patients in the United States With High-Risk Non-Muscle-Invasive Bladder Cancer. JCO Clin Cancer Inform 2023;7:e2300096. [PMID: 37906722 PMCID: PMC10642898 DOI: 10.1200/cci.23.00096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Revised: 08/08/2023] [Accepted: 09/14/2023] [Indexed: 11/02/2023] Open

Abstract

PURPOSE

Treatment of non-muscle-invasive bladder cancer (NMIBC) is guided by risk stratification using clinical and pathologic criteria. This study aimed to develop a natural language processing (NLP) model for identifying patients with high-risk NMIBC retrospectively from unstructured electronic medical records (EMRs) and to apply the model to describe patient and tumor characteristics.

METHODS

We used three independent EMR-derived data sets including adult patients with a bladder cancer diagnosis in 2011-2020 for NLP model development and training (n = 140), validation (n = 697), and application for the retrospective cohort analysis (n = 4,402). Deep learning methods were used to train NLP recognition of medical chart terminology to identify seven high-risk NMIBC criteria; model performance was assessed using the F1 score, weighted across features. An algorithm was then used to classify each patient as high-risk NMIBC (yes/no). Manually reviewed records served as the gold standard.

RESULTS

The F1 scores after model training were >0.7 for all but one uncommon feature (prostatic urethral involvement). The highest area under the receiver operating characteristic curve (AUC) was observed for Ta (0.897) and T1 (0.897); the lowest AUC was for carcinoma in situ (CIS; 0.617). For high-risk NMIBC classification, positive predictive value was 79.4%, negative predictive value was 93.2%, and false-positive rate was 8.9%. Sensitivity and specificity were 83.7% and 91.1%, respectively. Of 748 patients manually confirmed as having high-risk NMIBC, 196 (26%) had CIS (of whom 19% also had T1 and 23% also had Ta disease); 552 tumors (74%) had no associated CIS.

CONCLUSION

The NLP model, combined with a rule-based algorithm, identified high-risk NMIBC with good performance and will enable future work to study real-world treatment patterns and clinical outcomes for high-risk NMIBC.
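The classification measures reported in this abstract (positive and negative predictive value, sensitivity, specificity, false-positive rate, F1) all derive from one 2x2 comparison of model output against the manually reviewed gold standard. The sketch below shows those definitions. The abstract does not report the underlying counts, so the counts used here are hypothetical and the names `ConfusionCounts` and `binary_metrics` are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing model output with the manual gold standard."""
    tp: int  # model positive, gold standard positive
    fp: int  # model positive, gold standard negative
    fn: int  # model negative, gold standard positive
    tn: int  # model negative, gold standard negative


def binary_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Standard binary classification measures from a 2x2 table."""
    sensitivity = c.tp / (c.tp + c.fn)  # also called recall
    specificity = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)          # also called precision
    npv = c.tn / (c.tn + c.fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "false_positive_rate": c.fp / (c.fp + c.tn),
        "f1": 2 * ppv * sensitivity / (ppv + sensitivity),
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = binary_metrics(ConfusionCounts(tp=80, fp=20, fn=20, tn=180))
```

Because the false-positive rate is the complement of specificity, the reported 8.9% false-positive rate and 91.1% specificity are consistent with each other.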

Wang L, Fu S, Wen A, Ruan X, He H, Liu S, Moon S, Mai M, Riaz IB, Wang N, Yang P, Xu H, Warner JL, Liu H. Assessment of Electronic Health Record for Cancer Research and Patient Care Through a Scoping Review of Cancer Natural Language Processing. JCO Clin Cancer Inform 2022;6:e2200006. [PMID: 35917480 PMCID: PMC9470142 DOI: 10.1200/cci.22.00006] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 01/19/2022] [Revised: 03/18/2022] [Accepted: 06/15/2022] [Indexed: 11/20/2022] Open

Santos T, Tariq A, Gichoya JW, Trivedi H, Banerjee I. Automatic Classification of Cancer Pathology Reports: A Systematic Review. J Pathol Inform 2022;13:100003. [PMID: 35242443 PMCID: PMC8860734 DOI: 10.1016/j.jpi.2022.100003] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/10/2021] [Accepted: 11/12/2021] [Indexed: 11/30/2022] Open

Yang R, Zhu D, Howard LE, De Hoedt A, Schroeck FR, Klaassen Z, Freedland SJ, Williams SB. Context-Based Identification of Muscle Invasion Status in Patients With Bladder Cancer Using Natural Language Processing. JCO Clin Cancer Inform 2022;6:e2100097. [PMID: 35073149 DOI: 10.1200/cci.21.00097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open

Park B, Altieri N, DeNero J, Odisho AY, Yu B. Improving natural language information extraction from cancer pathology reports using transfer learning and zero-shot string similarity. JAMIA Open 2021;4:ooab085. [PMID: 34604711 PMCID: PMC8484934 DOI: 10.1093/jamiaopen/ooab085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2021] [Revised: 09/06/2021] [Accepted: 09/22/2021] [Indexed: 11/16/2022] Open

Chang RW, Tucker LY, Rothenberg KA, Lancaster EM, Avins AL, Kuang HC, Faruqi RM, Nguyen-Huynh MN. Establishing a carotid artery stenosis disease cohort for comparative effectiveness research using natural language processing. J Vasc Surg 2021;74:1937-1947.e3. [PMID: 34182027 DOI: 10.1016/j.jvs.2021.05.054] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/03/2020] [Accepted: 05/19/2021] [Indexed: 11/24/2022]

Abstract

OBJECTIVE

Investigation of asymptomatic carotid stenosis treatment is hindered by the lack of a contemporary population-based disease cohort. We describe the use of natural language processing (NLP) to identify stenosis in patients undergoing carotid imaging.

METHODS

Adult patients with carotid imaging between 2008 and 2012 in a large integrated health care system were identified and followed through 2017. An NLP process was developed to characterize carotid stenosis according to the Society of Radiologists in Ultrasound (for ultrasounds) and North American Symptomatic Carotid Endarterectomy Trial (NASCET) (for axial imaging) guidelines. The resulting algorithm assessed text descriptors to categorize normal/non-hemodynamically significant stenosis, moderate or severe stenosis as well as occlusion in both carotid ultrasound (US) and axial imaging (computed tomography and magnetic resonance angiography [CTA/MRA]). For US reports, internal carotid artery systolic and diastolic velocities and velocity ratios were assessed and matched for laterality to supplement accuracy. To validate the NLP algorithm, positive predictive value (PPV or precision) and sensitivity (recall) were calculated from simple random samples from the population of all imaging studies. Lastly, all non-normal studies were manually reviewed for confirmation for prevalence estimates and disease cohort assembly.

RESULTS

A total of 95,896 qualifying index studies (76,276 US and 19,620 CTA/MRA) were identified among 94,822 patients including 1059 patients who underwent multiple studies on the same day. For studies of normal/non-hemodynamically significant stenosis arteries, the NLP algorithm showed excellent performance with a PPV of 99% for US and 96.5% for CTA/MRA. PPV/sensitivity to identify a non-normal artery with correct laterality in the CTA/MRA and US samples were 76.9% (95% confidence interval [CI], 74.1%-79.5%)/93.1% (95% CI, 91.1%-94.8%) and 74.7% (95% CI, 69.3%-79.5%)/94% (95% CI, 90.2%-96.7%), respectively. Regarding cohort assembly, 15,522 patients were identified with diseased carotid artery, including 2674 exhibiting equal bilateral disease. This resulted in a laterality-specific cohort with 12,828 moderate, 5283 severe, and 1895 occluded arteries and 326 diseased arteries with unknown stenosis. During follow-up, 30.1% of these patients underwent 61,107 additional studies.

CONCLUSIONS

Use of NLP to detect carotid stenosis or occlusion can result in accurate exclusion of normal/non-hemodynamically significant stenosis disease states, with more moderate precision for lesion identification, which can substantially reduce the need for manual review. The resulting cohort allows for efficient research and holds promise for similar reporting in other vascular diseases.

Senders JT, Cho LD, Calvachi P, McNulty JJ, Ashby JL, Schulte IS, Almekkawi AK, Mehrtash A, Gormley WB, Smith TR, Broekman MLD, Arnaout O. Automating Clinical Chart Review: An Open-Source Natural Language Processing Pipeline Developed on Free-Text Radiology Reports From Patients With Glioblastoma. JCO Clin Cancer Inform 2021;4:25-34. [PMID: 31977252 DOI: 10.1200/cci.19.00060] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 01/05/2023] Open

Abstract

PURPOSE

The aim of this study was to develop an open-source natural language processing (NLP) pipeline for text mining of medical information from clinical reports. We also aimed to provide insight into why certain variables or reports are more suitable for clinical text mining than others.

MATERIALS AND METHODS

Various NLP models were developed to extract 15 radiologic characteristics from free-text radiology reports for patients with glioblastoma. Ten-fold cross-validation was used to optimize the hyperparameter settings and estimate model performance. We examined how model performance was associated with quantitative attributes of the radiologic characteristics and reports.

RESULTS

In total, 562 unique brain magnetic resonance imaging reports were retrieved. NLP extracted 15 radiologic characteristics with high to excellent discrimination (area under the curve, 0.82 to 0.98) and accuracy (78.6% to 96.6%). Model performance was correlated with the inter-rater agreement of the manually provided labels (ρ = 0.904; P < .001) but not with the frequency distribution of the variables of interest (ρ = 0.179; P = .52). All variables labeled with a near perfect inter-rater agreement were classified with excellent performance (area under the curve > 0.95). Excellent performance could be achieved for variables with only 50 to 100 observations in the minority group and class imbalances up to a 9:1 ratio. Report-level classification accuracy was not associated with the number of words or the vocabulary size in the distinct text documents.

CONCLUSION

This study provides an open-source NLP pipeline that allows for text mining of narratively written clinical reports. Small sample sizes and class imbalance should not be considered as absolute contraindications for text mining in clinical research. However, future studies should report measures of inter-rater agreement whenever ground truth is based on a consensus label and use this measure to identify clinical variables eligible for text mining.
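The ten-fold cross-validation mentioned in the methods can be illustrated with a small index splitter. This is a sketch of plain k-fold splitting only. The abstract does not say whether the authors shuffled or stratified their folds, so the shuffling, the seed, and the function name `k_fold_indices` are assumptions made for illustration.

```python
import random
from typing import Iterator


def k_fold_indices(
    n_samples: int, k: int = 10, seed: int = 0
) -> Iterator[tuple[list[int], list[int]]]:
    """Yield (train, test) index lists so each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    # Deal the shuffled indices into k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 562 is the number of reports given in the abstract; the split itself is illustrative.
splits = list(k_fold_indices(562, k=10))
```

Each model configuration would be fitted on `train` and scored on `test` in every fold, and the fold scores averaged to estimate performance.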

Affiliation(s)

Joeky T Senders Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Department of Neurosurgery, Leiden University Medical Center, Leiden, the Netherlands
Logan D Cho Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Department of Neuroscience, Brown University, Providence, RI
Paola Calvachi Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
John J McNulty Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Vagelos College of Physicians and Surgeons, Columbia University, New York, NY
Joanna L Ashby Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
Isabelle S Schulte Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
Ahmad Kareem Almekkawi Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
Alireza Mehrtash Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
William B Gormley Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
Timothy R Smith Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
Marike L D Broekman Department of Neurosurgery, Leiden University Medical Center, Leiden, the Netherlands; Department of Neurosurgery, Haaglanden Medical Center, The Hague, the Netherlands
Omar Arnaout Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA

Rezaee ME, Ismail AAO, Okorie CL, Seigne JD, Lynch KE, Schroeck FR. Partial Versus Complete Bacillus Calmette-Guérin Intravesical Therapy and Bladder Cancer Outcomes in High-risk Non-muscle-invasive Bladder Cancer: Is NIMBUS the Full Story? EUR UROL SUPPL 2021;26:35-43. [PMID: 34337506 PMCID: PMC8317819 DOI: 10.1016/j.euros.2021.01.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 01/25/2021] [Indexed: 01/09/2023] Open

Abstract

Background

It is important to understand the implications of reduced bacillus Calmette-Guérin (BCG) treatment intensity, given global shortages and early termination of the NIMBUS trial.

Objective

To assess the association of partial versus complete BCG induction with outcomes.

Design, setting, and participants

This is a retrospective cohort study of veterans diagnosed with high-risk non–muscle-invasive bladder cancer (NMIBC; high grade [HG] Ta, T1, or carcinoma in situ) between 2005 and 2011 with follow-up through 2014.

Intervention

Patients were categorized into partial versus complete BCG induction (one to five vs six or more instillations). Partial BCG induction subgroups were defined for comparison with the NIMBUS trial.

Outcome measurements and statistical analysis

Propensity score–adjusted regression models were used to assess the association of partial BCG induction with risk of recurrence and bladder cancer death.

Results and limitations

Among 540 patients, 114 (21.1%) underwent partial BCG induction. Partial versus complete BCG induction was not significantly associated with the risk of recurrence in HG Ta (cumulative incidence [CIn] 46.6% vs 53.9% at 5 yr, p = 0.38) or T1 (CIn 47.1% vs 56.7% at 5 yr, p = 0.19) disease. Similarly, we found no increased risk of bladder cancer death (HG Ta: CIn 4.7% vs 5.4% at 5 yr, p = 0.87; T1: CIn 10.0% vs 11.4% at 5 yr, p = 0.77). NIMBUS-like induction was associated with an increased risk of recurrence in patients with HG Ta disease, although not statistically significant. Unmeasured confounding is a limitation.

Conclusions

Cancer outcomes were similar among high-risk NMIBC patients who underwent partial versus complete BCG induction, suggesting that future research is needed to determine how to optimize BCG delivery for the greatest number of patients, especially during global shortages.

Patient summary

Outcomes were similar between patients receiving partial and complete courses of bacillus Calmette-Guérin (BCG) therapy. Future research is needed to determine how to best deliver BCG to the greatest number of patients, particularly during medication shortages.

Oliveira CR, Niccolai P, Ortiz AM, Sheth SS, Shapiro ED, Niccolai LM, Brandt CA. Natural Language Processing for Surveillance of Cervical and Anal Cancer and Precancer: Algorithm Development and Split-Validation Study. JMIR Med Inform 2020;8:e20826. [PMID: 32469840 PMCID: PMC7671846 DOI: 10.2196/20826] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 05/29/2020] [Revised: 09/18/2020] [Accepted: 10/04/2020] [Indexed: 12/13/2022] Open

Abstract

Background

Accurate identification of new diagnoses of human papillomavirus–associated cancers and precancers is an important step toward the development of strategies that optimize the use of human papillomavirus vaccines. The diagnosis of human papillomavirus cancers hinges on a histopathologic report, which is typically stored in electronic medical records as free-form, or unstructured, narrative text. Previous efforts to perform surveillance for human papillomavirus cancers have relied on the manual review of pathology reports to extract diagnostic information, a process that is both labor- and resource-intensive. Natural language processing can be used to automate the structuring and extraction of clinical data from unstructured narrative text in medical records and may provide a practical and effective method for identifying patients with vaccine-preventable human papillomavirus disease for surveillance and research.

Objective

This study's objective was to develop and assess the accuracy of a natural language processing algorithm for the identification of individuals with cancer or precancer of the cervix and anus.

Methods

A pipeline-based natural language processing algorithm was developed, which incorporated machine learning and rule-based methods to extract diagnostic elements from the narrative pathology reports. To test the algorithm’s classification accuracy, we used a split-validation study design. Full-length cervical and anal pathology reports were randomly selected from 4 clinical pathology laboratories. Two study team members, blinded to the classifications produced by the natural language processing algorithm, manually and independently reviewed all reports and classified them at the document level according to 2 domains (diagnosis and human papillomavirus testing results). Using the manual review as the gold standard, the algorithm’s performance was evaluated using standard measurements of accuracy, recall, precision, and F-measure.

Results

The natural language processing algorithm’s performance was validated on 949 pathology reports. The algorithm demonstrated accurate identification of abnormal cytology, histology, and positive human papillomavirus tests with accuracies greater than 0.91. Precision was lowest for anal histology reports (0.87, 95% CI 0.59-0.98) and highest for cervical cytology (0.98, 95% CI 0.95-0.99). The natural language processing algorithm missed 2 out of the 15 abnormal anal histology reports, which led to a relatively low recall (0.68, 95% CI 0.43-0.87).

Conclusions

This study outlines the development and validation of a freely available and easily implementable natural language processing algorithm that can automate the extraction and classification of clinical data from cervical and anal cytology and histology.

Odisho AY, Park B, Altieri N, DeNero J, Cooperberg MR, Carroll PR, Yu B. Natural language processing systems for pathology parsing in limited data environments with uncertainty estimation. JAMIA Open 2020;3:431-438. [PMID: 33381748 PMCID: PMC7751177 DOI: 10.1093/jamiaopen/ooaa029] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/21/2020] [Revised: 06/09/2020] [Accepted: 07/13/2020] [Indexed: 12/05/2022] Open

Rezaee ME, Lynch KE, Li Z, MacKenzie TA, Seigne JD, Robertson DJ, Sirovich B, Goodney PP, Schroeck FR. The impact of low- versus high-intensity surveillance cystoscopy on surgical care and cancer outcomes in patients with high-risk non-muscle-invasive bladder cancer (NMIBC). PLoS One 2020;15:e0230417. [PMID: 32203532 PMCID: PMC7089561 DOI: 10.1371/journal.pone.0230417] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 01/21/2020] [Accepted: 02/28/2020] [Indexed: 11/18/2022] Open

Abstract

Purpose

To assess the association of low- vs. guideline-recommended high-intensity cystoscopic surveillance with outcomes among patients with high-risk non-muscle invasive bladder cancer (NMIBC).

Materials & methods

A retrospective cohort study of Veterans Affairs patients diagnosed with high-risk NMIBC between 2005 and 2011 with follow-up through 2014. Patients were categorized by number of surveillance cystoscopies over two years following diagnosis: low- (1–5) vs. high-intensity (6 or more) surveillance. Propensity score adjusted regression models were used to assess the association of low-intensity cystoscopic surveillance with frequency of transurethral resections, and risk of progression to invasive disease and bladder cancer death.

Results

Among 1,542 patients, 520 (33.7%) underwent low-intensity cystoscopic surveillance. Patients undergoing low-intensity surveillance had fewer transurethral resections (37 vs. 99 per 100 person-years; p<0.001). Risk of death from bladder cancer did not differ significantly by low (cumulative incidence [CIn] 8.4% [95% CI 6.5–10.9] at 5 years) vs. high-intensity surveillance (CIn 9.1% [95% CI 7.4–11.2] at 5 years, p = 0.61). Low vs. high-intensity surveillance was not associated with increased risk of bladder cancer death among patients with Ta (CIn 5.7% vs. 8.2% at 5 years, p = 0.24) or T1 disease at diagnosis (CIn 10.2% vs. 9.1% at 5 years, p = 0.58). Among patients with Ta disease, low-intensity surveillance was associated with decreased risk of progression to invasive disease (T1 or T2) or bladder cancer death (CIn 19.3% vs. 31.3% at 5 years, p = 0.002).

Conclusions

Patients with high-risk NMIBC undergoing low- vs. high-intensity cystoscopic surveillance underwent fewer transurethral resections, but did not experience an increased risk of progression or bladder cancer death. These findings provide a strong rationale for a clinical trial to determine whether low-intensity surveillance is comparable to high-intensity surveillance for cancer control in high-risk NMIBC.
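The resection frequencies in the results (37 vs. 99 per 100 person-years) are event rates scaled to follow-up time. As a minimal sketch of that arithmetic: the abstract does not give event counts or person-time, so the inputs below are hypothetical and the function name is illustrative.

```python
def rate_per_100_person_years(events: int, person_years: float) -> float:
    """Events divided by total follow-up time, scaled to 100 person-years."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return 100 * events / person_years


# Hypothetical inputs: 74 resections observed over 200 person-years of follow-up.
example_rate = rate_per_100_person_years(events=74, person_years=200.0)

# Ratio of the two rates reported in the abstract (low- vs. high-intensity surveillance).
reported_rate_ratio = 37 / 99
```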

Affiliation(s)

Michael E. Rezaee White River Junction VA Medical Center, White River Junction, VT, United States of America; Section of Urology, Dartmouth Hitchcock Medical Center, Lebanon, NH, United States of America
Kristine E. Lynch VA Salt Lake City Health Care System and University of Utah, Salt Lake City, UT, United States of America
Zhongze Li Biomedical Data Science Department, Geisel School of Medicine at Dartmouth College, Lebanon, NH, United States of America
Todd A. MacKenzie Biomedical Data Science Department, Geisel School of Medicine at Dartmouth College, Lebanon, NH, United States of America; The Dartmouth Institute for Health Policy and Clinical Practice, Geisel School of Medicine at Dartmouth College, Lebanon, NH, United States of America
John D. Seigne White River Junction VA Medical Center, White River Junction, VT, United States of America; Norris Cotton Cancer Center, Dartmouth Hitchcock Medical Center, Lebanon, NH, United States of America
Douglas J. Robertson White River Junction VA Medical Center, White River Junction, VT, United States of America; The Dartmouth Institute for Health Policy and Clinical Practice, Geisel School of Medicine at Dartmouth College, Lebanon, NH, United States of America
Brenda Sirovich White River Junction VA Medical Center, White River Junction, VT, United States of America; The Dartmouth Institute for Health Policy and Clinical Practice, Geisel School of Medicine at Dartmouth College, Lebanon, NH, United States of America
Philip P. Goodney White River Junction VA Medical Center, White River Junction, VT, United States of America; The Dartmouth Institute for Health Policy and Clinical Practice, Geisel School of Medicine at Dartmouth College, Lebanon, NH, United States of America
Florian R. Schroeck White River Junction VA Medical Center, White River Junction, VT, United States of America; Section of Urology, Dartmouth Hitchcock Medical Center, Lebanon, NH, United States of America; The Dartmouth Institute for Health Policy and Clinical Practice, Geisel School of Medicine at Dartmouth College, Lebanon, NH, United States of America; Norris Cotton Cancer Center, Dartmouth Hitchcock Medical Center, Lebanon, NH, United States of America

Levine MN, Alexander G, Sathiyapalan A, Agrawal A, Pond G. Learning Health System for Breast Cancer: Pilot Project Experience. JCO Clin Cancer Inform 2019;3:1-11. [DOI: 10.1200/cci.19.00032] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/20/2022] Open

Abstract

PURPOSE

Clinicians need accurate and timely information on the impact of treatments on patient outcomes. The electronic health record (EHR) offers the potential for insight into real-world patient experiences and outcomes, but it is difficult to tap into. Our goal was to apply artificial intelligence technology to the EHR to characterize the clinical course of patients with stage III breast cancer.

PATIENTS AND METHODS

Data from patients with stage III breast cancer who presented between 2013 and 2015 were extracted from the EHR, de-identified, and imported into the IBM Cloud. Specialized natural language processing (NLP) annotators were developed to extract medical concepts from unstructured clinical text and transform them to structured attributes. In the validation phase, these annotators were applied to 19 additional patients with stage III breast cancer from the same period. The resulting data were compared with that in the medical chart (gold standard) for nine key indicators.

RESULTS

Information was extracted for 50 patients, including tumor stage (94% stage IIIA, 6% stage IIIB), age (28% 50 years or younger, 52% between 51 and 70 years, and 20% older than 70 years), receptor status (84% estrogen receptor positive, 74% progesterone receptor positive), and first treatment (72% surgery, 26% chemotherapy, 2% endocrine). Events in the patient’s journey were compiled to create a timeline. For 171 data elements, NLP and the chart disagreed for 41 (24%; 95% CI, 17.8% to 31.1%). With additional manipulation using simple logic, the disagreement was reduced to six elements (3.5%; 95% CI, 1.3% to 7.5%; F1 statistic, 0.9694).

CONCLUSION

It is possible to extract, read, and combine data from the EHR to view the patient journey. The agreement between NLP and the gold standard was high, which supports validity.
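The disagreement proportions in this abstract (41 of 171 and 6 of 171 data elements) come with 95% confidence intervals, but the abstract does not say how they were computed. One common choice for a binomial proportion is the exact (Clopper-Pearson) interval. The sketch below assumes that method and solves for the bounds by bisection using only the standard library, so its output can be compared against the reported intervals.

```python
from math import comb
from typing import Callable


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve(f: Callable[[float], float], target: float, increasing: bool) -> float:
    """Bisection on [0, 1] for f(p) == target, where f is monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided (1 - alpha) interval for k successes out of n trials."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _solve(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2, increasing=True
    )
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _solve(
        lambda p: binom_cdf(k, n, p), alpha / 2, increasing=False
    )
    return lower, upper


initial_disagreement = clopper_pearson(41, 171)  # before the simple-logic step
final_disagreement = clopper_pearson(6, 171)     # after the simple-logic step
```

The 171 data elements correspond to nine indicators for each of the 19 validation patients.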

A frame semantic overview of NLP-based information extraction for cancer-related EHR notes. J Biomed Inform 2019;100:103301. [PMID: 31589927 DOI: 10.1016/j.jbi.2019.103301] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 04/02/2019] [Revised: 09/04/2019] [Accepted: 10/03/2019] [Indexed: 02/07/2023]

Abstract

OBJECTIVE

There is a lot of information about cancer in Electronic Health Record (EHR) notes that can be useful for biomedical research provided natural language processing (NLP) methods are available to extract and structure this information. In this paper, we present a scoping review of existing clinical NLP literature for cancer.

METHODS

We identified studies describing an NLP method to extract specific cancer-related information from EHR sources from PubMed, Google Scholar, ACL Anthology, and existing reviews. Two exclusion criteria were used in this study. We excluded articles where the extraction techniques used were too broad to be represented as frames (e.g., document classification) and also where very low-level extraction methods were used (e.g. simply identifying clinical concepts). 78 articles were included in the final review. We organized this information according to frame semantic principles to help identify common areas of overlap and potential gaps.

RESULTS

Frames were created from the reviewed articles pertaining to cancer information such as cancer diagnosis, tumor description, cancer procedure, breast cancer diagnosis, prostate cancer diagnosis and pain in prostate cancer patients. These frames included both a definition as well as specific frame elements (i.e. extractable attributes). We found that cancer diagnosis was the most common frame among the reviewed papers (36 out of 78), with recent work focusing on extracting information related to treatment and breast cancer diagnosis.

CONCLUSION

The list of common frames described in this paper identifies important cancer-related information extracted by existing NLP techniques and serves as a useful resource for future researchers requiring cancer information extracted from EHR notes. We also argue, due to the heavy duplication of cancer NLP systems, that a general purpose resource of annotated cancer frames and corresponding NLP tools would be valuable.

Odisho AY, Bridge M, Webb M, Ameli N, Eapen RS, Stauf F, Cowan JE, Washington SL, Herlemann A, Carroll PR, Cooperberg MR. Automating the Capture of Structured Pathology Data for Prostate Cancer Clinical Care and Research. JCO Clin Cancer Inform 2019;3:1-8. [PMID: 31314550 PMCID: PMC6874052 DOI: 10.1200/cci.18.00084] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Accepted: 05/17/2019] [Indexed: 01/19/2023] Open

Jain NM, Culley A, Knoop T, Micheel C, Osterman T, Levy M. Conceptual Framework to Support Clinical Trial Optimization and End-to-End Enrollment Workflow. JCO Clin Cancer Inform 2019;3:1-10. [PMID: 31225983 PMCID: PMC6873934 DOI: 10.1200/cci.19.00033] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Accepted: 05/02/2019] [Indexed: 12/19/2022]

Han DS, Lynch KE, Chang JW, Sirovich B, Robertson DJ, Swanton AR, Seigne JD, Goodney PP, Schroeck FR. Overuse of Cystoscopic Surveillance Among Patients With Low-risk Non-Muscle-invasive Bladder Cancer - A National Study of Patient, Provider, and Facility Factors. Urology 2019;131:112-119. [PMID: 31145947 DOI: 10.1016/j.urology.2019.04.036] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 12/14/2018] [Revised: 03/05/2019] [Accepted: 04/06/2019] [Indexed: 11/24/2022]

Schroeck FR, Lynch KE, Li Z, MacKenzie TA, Han DS, Seigne JD, Robertson DJ, Sirovich B, Goodney PP. The impact of frequent cystoscopy on surgical care and cancer outcomes among patients with low-risk, non-muscle-invasive bladder cancer. Cancer 2019;125:3147-3154. [PMID: 31120559 DOI: 10.1002/cncr.32185] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 02/11/2019] [Revised: 03/21/2019] [Accepted: 04/29/2019] [Indexed: 01/23/2023]

Affiliation(s)

Florian R Schroeck: Department of Veterans Affairs (VA) Outcomes Group, White River Junction VA Medical Center, White River Junction, Vermont; Section of Urology, Dartmouth Hitchcock Medical Center, Lebanon, New Hampshire; Norris Cotton Cancer Center, Dartmouth Hitchcock Medical Center, Lebanon, New Hampshire; The Dartmouth Institute for Health Policy and Clinical Practice, Geisel School of Medicine at Dartmouth College, Lebanon, New Hampshire
Kristine E Lynch: VA Salt Lake City Health Care System and Division of Epidemiology, University of Utah, Salt Lake City, Utah
Zhongze Li: Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth College, Lebanon, New Hampshire
Todd A MacKenzie: The Dartmouth Institute for Health Policy and Clinical Practice, Geisel School of Medicine at Dartmouth College, Lebanon, New Hampshire; Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth College, Lebanon, New Hampshire
David S Han: Section of Urology, Dartmouth Hitchcock Medical Center, Lebanon, New Hampshire; The Dartmouth Institute for Health Policy and Clinical Practice, Geisel School of Medicine at Dartmouth College, Lebanon, New Hampshire
John D Seigne: Section of Urology, Dartmouth Hitchcock Medical Center, Lebanon, New Hampshire; Norris Cotton Cancer Center, Dartmouth Hitchcock Medical Center, Lebanon, New Hampshire
Douglas J Robertson: Department of Veterans Affairs (VA) Outcomes Group, White River Junction VA Medical Center, White River Junction, Vermont; The Dartmouth Institute for Health Policy and Clinical Practice, Geisel School of Medicine at Dartmouth College, Lebanon, New Hampshire
Brenda Sirovich: Department of Veterans Affairs (VA) Outcomes Group, White River Junction VA Medical Center, White River Junction, Vermont; The Dartmouth Institute for Health Policy and Clinical Practice, Geisel School of Medicine at Dartmouth College, Lebanon, New Hampshire
Philip P Goodney: Department of Veterans Affairs (VA) Outcomes Group, White River Junction VA Medical Center, White River Junction, Vermont; The Dartmouth Institute for Health Policy and Clinical Practice, Geisel School of Medicine at Dartmouth College, Lebanon, New Hampshire


Graham LA. Databases for surgical health services research: Veterans Health Administration data. Surgery 2018;165:876-878. [PMID: 30177251 DOI: 10.1016/j.surg.2018.07.029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2018] [Accepted: 07/25/2018] [Indexed: 11/20/2022]

Schroeck FR, Lynch KE, Chang JW, MacKenzie TA, Seigne JD, Robertson DJ, Goodney PP, Sirovich B. Extent of Risk-Aligned Surveillance for Cancer Recurrence Among Patients With Early-Stage Bladder Cancer. JAMA Netw Open 2018;1:e183442. [PMID: 30465041 PMCID: PMC6241521 DOI: 10.1001/jamanetworkopen.2018.3442] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 04/26/2018] [Accepted: 08/12/2018] [Indexed: 12/23/2022]

Abstract

IMPORTANCE

Cancer care guidelines recommend aligning surveillance frequency with underlying cancer risk, ie, more frequent surveillance for patients at high vs low risk of cancer recurrence.

OBJECTIVE

To assess the extent to which such risk-aligned surveillance is practiced within US Department of Veterans Affairs facilities by classifying surveillance patterns for low- vs high-risk patients with early-stage bladder cancer.

DESIGN, SETTING, AND PARTICIPANTS

US national retrospective cohort study of a population-based sample of patients diagnosed with low-risk or high-risk early-stage bladder cancer between January 1, 2005, and December 31, 2011, with follow-up through December 31, 2014. Analyses were performed March 2017 to April 2018. The study included all Veterans Affairs facilities (n = 85) where both low- and high-risk patients were treated.

EXPOSURES

Low-risk vs high-risk cancer status, based on definitions from the European Association of Urology risk stratification guidelines and on data extracted from diagnostic pathology reports via validated natural language processing algorithms.

MAIN OUTCOMES AND MEASURES

Adjusted cystoscopy frequency for low-risk and high-risk patients for each facility, estimated using multilevel modeling.

RESULTS

The study included 1278 low-risk and 2115 high-risk patients (median [interquartile range] age, 77 [71-82] years; 99% [3368 of 3393] male). Across facilities, the adjusted frequency of surveillance cystoscopy ranged from 3.7 to 6.2 (mean, 4.8) procedures over 2 years per patient for low-risk patients and from 4.6 to 6.0 (mean, 5.4) procedures over 2 years per patient for high-risk patients. In 70 of 85 facilities, surveillance was performed at a comparable frequency for low- and high-risk patients, differing by less than 1 cystoscopy over 2 years. Surveillance frequency among high-risk patients statistically significantly exceeded surveillance among low-risk patients at only 4 facilities. Across all facilities, surveillance frequencies for low- vs high-risk patients were moderately strongly correlated (r = 0.52; P < .001).

CONCLUSIONS AND RELEVANCE

Patients with early-stage bladder cancer undergo cystoscopic surveillance at comparable frequencies regardless of risk. This finding highlights the need to understand barriers to risk-aligned surveillance with the goal of making it easier for clinicians to deliver it in routine practice.
