1
Hanauer DA, Barnholtz-Sloan JS, Beno MF, Del Fiol G, Durbin EB, Gologorskaya O, Harris D, Harnett B, Kawamoto K, May B, Meeks E, Pfaff E, Weiss J, Zheng K. Electronic Medical Record Search Engine (EMERSE): An Information Retrieval Tool for Supporting Cancer Research. JCO Clin Cancer Inform 2021; 4:454-463. [PMID: 32412846 PMCID: PMC7265780 DOI: 10.1200/cci.19.00134] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 01/08/2023]
Abstract
PURPOSE The Electronic Medical Record Search Engine (EMERSE) is a software tool built to aid research spanning cohort discovery, population health, and data abstraction for clinical trials. EMERSE is now live at three academic medical centers, with additional sites currently working on implementation. In this report, we describe how EMERSE has been used to support cancer research based on a variety of metrics. METHODS We identified peer-reviewed publications that used EMERSE through online searches as well as through direct e-mails to users based on audit logs. These logs were also used to summarize use at each of the three sites. Search terms for two of the sites were characterized using the natural language processing tool MetaMap to determine to which semantic types the terms could be mapped. RESULTS We identified a total of 326 peer-reviewed publications that used EMERSE through August 2019, although this is likely an underestimation of the true total based on the use log analysis. Oncology-related research comprised nearly one third (n = 105; 32.2%) of all research output. The use logs showed that EMERSE had been used by multiple people at each site (nearly 3,500 across all three) who had collectively logged into the system > 100,000 times. Many user-entered search queries could not be mapped to a semantic type, but the most common semantic type for terms that did match was “disease or syndrome,” followed by “pharmacologic substance.” CONCLUSION EMERSE has been shown to be a valuable tool for supporting cancer research. It has been successfully deployed at other sites, despite some implementation challenges unique to each deployment environment.
Affiliation(s)
- David A Hanauer
  - Department of Pediatrics, University of Michigan Medical School, Ann Arbor, MI
- Jill S Barnholtz-Sloan
  - Case Western Reserve University School of Medicine, Cleveland, OH
  - Cleveland Institute for Computational Biology, Cleveland, OH
- Mark F Beno
  - Case Western Reserve University School of Medicine, Cleveland, OH
  - Cleveland Institute for Computational Biology, Cleveland, OH
- Guilherme Del Fiol
  - Department of Biomedical Informatics, University of Utah, Salt Lake City, UT
- Eric B Durbin
  - Markey Cancer Center, UK HealthCare, Lexington, KY
  - Division of Biomedical Informatics, University of Kentucky, Lexington, KY
- Oksana Gologorskaya
  - Clinical and Translational Science Institute, University of California San Francisco, San Francisco, CA
- Daniel Harris
  - Markey Cancer Center, UK HealthCare, Lexington, KY
  - Division of Biomedical Informatics, University of Kentucky, Lexington, KY
- Brett Harnett
  - Department of Biomedical Informatics, University of Cincinnati College of Medicine, Cincinnati, OH
- Kensaku Kawamoto
  - Department of Biomedical Informatics, University of Utah, Salt Lake City, UT
- Benjamin May
  - Herbert Irving Comprehensive Cancer Center, Columbia University, New York, NY
- Eric Meeks
  - Clinical and Translational Science Institute, University of California San Francisco, San Francisco, CA
- Emily Pfaff
  - North Carolina Translational and Clinical Sciences Institute, University of North Carolina School of Medicine, Chapel Hill, NC
- Janie Weiss
  - Herbert Irving Comprehensive Cancer Center, Columbia University, New York, NY
- Kai Zheng
  - Department of Informatics, University of California, Irvine, CA

2
Giori NJ, Radin J, Callahan A, Fries JA, Halilaj E, Ré C, Delp SL, Shah NH, Harris AHS. Assessment of Extractability and Accuracy of Electronic Health Record Data for Joint Implant Registries. JAMA Netw Open 2021; 4:e211728. [PMID: 33720372 PMCID: PMC7961313 DOI: 10.1001/jamanetworkopen.2021.1728] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 11/15/2022]
Abstract
IMPORTANCE Implant registries provide valuable information on the performance of implants in a real-world setting, yet they have traditionally been expensive to establish and maintain. Electronic health records (EHRs) are widely used and may include the information needed to generate clinically meaningful reports similar to a formal implant registry. OBJECTIVES To quantify the extractability and accuracy of registry-relevant data from the EHR and to assess the ability of these data to track trends in implant use and the durability of implants (hereafter referred to as implant survivorship), using data stored since 2000 in the EHR of the largest integrated health care system in the United States. DESIGN, SETTING, AND PARTICIPANTS Retrospective cohort study of a large EHR of veterans who had 45 351 total hip arthroplasty procedures in Veterans Health Administration hospitals from 2000 to 2017. Data analysis was performed from January 1, 2000, to December 31, 2017. EXPOSURES Total hip arthroplasty. MAIN OUTCOMES AND MEASURES Number of total hip arthroplasty procedures extracted from the EHR, trends in implant use, and relative survivorship of implants. RESULTS A total of 45 351 total hip arthroplasty procedures were identified from 2000 to 2017 with 192 805 implant parts. Data completeness improved over time. After 2014, 85% of prosthetic heads, 91% of shells, 81% of stems, and 85% of liners used in the Veterans Health Administration health care system were identified by part number. Revision burden and trends in metal vs ceramic prosthetic femoral head use were found to reflect data from the American Joint Replacement Registry. Recalled implants were obvious negative outliers in implant survivorship using Kaplan-Meier curves.
CONCLUSIONS AND RELEVANCE Although loss to follow-up remains a challenge that requires additional attention to improve the quantitative nature of calculated implant survivorship, we conclude that data collected during routine clinical care and stored in the EHR of a large health system over 18 years were sufficient to provide clinically meaningful data on trends in implant use and to identify poor implants that were subsequently recalled. This automated approach was low cost and had no reporting burden. This low-cost, low-overhead method to assess implant use and performance within a large health care setting may be useful to internal quality assurance programs and, on a larger scale, to postmarket surveillance of implant performance.
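The implant survivorship curves mentioned in this abstract rest on the Kaplan-Meier product-limit estimator. The following is a minimal sketch of that estimator in plain Python, not the authors' pipeline: the function name `kaplan_meier`, the input layout, and the six follow-up records are assumptions made up for illustration.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each time with at least one event.

    durations: follow-up time per implant (for example, years since surgery)
    events:    1 if the implant was revised at that time, 0 if censored
               (lost to follow-up or still in place at last contact)
    """
    revised = Counter(t for t, e in zip(durations, events) if e)
    leaving = Counter(durations)  # revisions and censorings both leave the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = revised.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving this time point.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical follow-up data for six implants (illustrative only).
durations = [1, 2, 2, 3, 4, 5]
events = [1, 0, 1, 0, 1, 0]
curve = kaplan_meier(durations, events)
for time, prob in curve:
    print(f"t={time}: S(t)={prob:.3f}")
```

Censored implants shrink the risk set without lowering the curve, which is why the loss to follow-up noted in the conclusions limits how quantitatively such curves can be read.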
Affiliation(s)
- Nicholas J. Giori
  - Center for Innovation to Implementation, VA Palo Alto Health Care System, Palo Alto, California
  - Department of Orthopedic Surgery, Stanford University, Stanford, California
- John Radin
  - Center for Innovation to Implementation, VA Palo Alto Health Care System, Palo Alto, California
- Alison Callahan
  - Center for Biomedical Informatics Research, Stanford University, Stanford, California
- Jason A. Fries
  - Center for Biomedical Informatics Research, Stanford University, Stanford, California
  - Department of Computer Science, Stanford University, Stanford, California
- Eni Halilaj
  - Department of Bioengineering, Stanford University, Stanford, California
- Christopher Ré
  - Department of Computer Science, Stanford University, Stanford, California
- Scott L. Delp
  - Department of Bioengineering, Stanford University, Stanford, California
- Nigam H. Shah
  - Center for Biomedical Informatics Research, Stanford University, Stanford, California
- Alex H. S. Harris
  - Center for Innovation to Implementation, VA Palo Alto Health Care System, Palo Alto, California
  - Department of Surgery, Stanford University, Stanford, California

3
Hameed BMZ, S. Dhavileswarapu AVL, Naik N, Karimi H, Hegde P, Rai BP, Somani BK. Big Data Analytics in urology: the story so far and the road ahead. Ther Adv Urol 2021; 13:1756287221998134. [PMID: 33747134 PMCID: PMC7940776 DOI: 10.1177/1756287221998134] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/27/2021] [Accepted: 02/04/2021] [Indexed: 12/25/2022]
Abstract
Artificial intelligence (AI) has a proven record of application in the field of medicine and is used in various areas of urology such as oncology, urolithiasis, paediatric urology, urogynaecology, infertility and reconstruction. Data is the driving force of AI and the past decades have undoubtedly witnessed an upsurge in healthcare data. Urology is a specialty that has always been at the forefront of innovation and research and has rapidly embraced technologies to improve patient outcomes and experience. Advances in Big Data Analytics have raised expectations about the future of urology. This review investigates the role of big data, and its combination with AI, in urology. We explore the different sources of big data in urology and explain their current and future applications. The advent and implementation of AI in urology, with data available from several databases, has shown a positive trend. The extensive use of big data for the diagnosis and treatment of urological disorders is still in its early stage and under validation. In the future, however, big data will no doubt play a major role in the management of urological conditions.
Affiliation(s)
- B. M. Zeeshan Hameed
  - Department of Urology, Kasturba Medical College Manipal, Manipal Academy of Higher Education, Manipal, India
  - KMC Innovation Centre, Manipal Academy of Higher Education, Manipal, India
  - iTRUE (International Training and Research in Uro-Oncology and Endourology) Group
- Nithesh Naik
  - Department of Mechanical and Manufacturing Engineering, Faculty of Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka 576104, India
  - iTRUE (International Training and Research in Uro-Oncology and Endourology) Group
- Hadis Karimi
  - Department of Pharmacy, Manipal College of Pharmaceutical Sciences, Manipal Academy of Higher Education, Manipal, India
- Padmaraj Hegde
  - Department of Urology, Kasturba Medical College Manipal, Manipal Academy of Higher Education, Manipal, India
- Bhavan Prasad Rai
  - iTRUE (International Training and Research in Uro-Oncology and Endourology) Group
  - Department of Urology, Freeman Hospital, Newcastle, UK
- Bhaskar K. Somani
  - Department of Urology, Kasturba Medical College Manipal, Manipal Academy of Higher Education, Manipal, India
  - iTRUE (International Training and Research in Uro-oncology and Endourology) Group
  - Department of Urology, University Hospital Southampton NHS Trust, Southampton, UK

4
Bozkurt S, Paul R, Coquet J, Sun R, Banerjee I, Brooks JD, Hernandez-Boussard T. Phenotyping severity of patient-centered outcomes using clinical notes: A prostate cancer use case. Learn Health Syst 2020; 4:e10237. [PMID: 33083539 PMCID: PMC7556418 DOI: 10.1002/lrh2.10237] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/15/2020] [Revised: 06/15/2020] [Accepted: 06/23/2020] [Indexed: 01/12/2023]
Abstract
Introduction A learning health system (LHS) must improve care in ways that are meaningful to patients, integrating patient‐centered outcomes (PCOs) into core infrastructure. PCOs are common following cancer treatment, such as urinary incontinence (UI) following prostatectomy. However, PCOs are not systematically recorded because they can only be described by the patient, are subjective and captured as unstructured text in the electronic health record (EHR). Therefore, PCOs pose significant challenges for phenotyping patients. Here, we present a natural language processing (NLP) approach for phenotyping patients with UI to classify their disease into severity subtypes, which can increase opportunities to provide precision‐based therapy and promote a value‐based delivery system. Methods Patients undergoing prostate cancer treatment from 2008 to 2018 were identified at an academic medical center. Using a hybrid NLP pipeline that combines rule‐based and deep learning methodologies, we classified positive UI cases as mild, moderate, and severe by mining clinical notes. Results The rule‐based model accurately classified UI into disease severity categories (accuracy: 0.86), which outperformed the deep learning model (accuracy: 0.73). In the deep learning model, the recall rates for the mild and moderate groups were higher than the precision rates (0.78 and 0.79, respectively). A hybrid model that combined both methods did not improve the accuracy of the rule‐based model but did outperform the deep learning model (accuracy: 0.75). Conclusion Phenotyping patients based on indication and severity of PCOs is essential to advance a patient centered LHS. EHRs contain valuable information on PCOs and by using NLP methods, it is feasible to accurately and efficiently phenotype PCO severity. Phenotyping must extend beyond the identification of disease to provide classification of disease severity that can be used to guide treatment and inform shared decision‐making.
Our methods demonstrate a path to a patient centered LHS that could advance precision medicine.
Affiliation(s)
- Selen Bozkurt
  - Department of Medicine, Biomedical Informatics Research, Stanford University, Stanford, California, USA
- Rohan Paul
  - Department of Biomedical Data Sciences, Stanford University, Stanford, California, USA
- Jean Coquet
  - Department of Medicine, Biomedical Informatics Research, Stanford University, Stanford, California, USA
- Ran Sun
  - Department of Medicine, Biomedical Informatics Research, Stanford University, Stanford, California, USA
- Imon Banerjee
  - Department of Biomedical Data Sciences, Stanford University, Stanford, California, USA
  - Department of Radiology, Stanford University, Stanford, California, USA
- James D Brooks
  - Department of Urology, Stanford University, Stanford, California, USA
- Tina Hernandez-Boussard
  - Department of Medicine, Biomedical Informatics Research, Stanford University, Stanford, California, USA
  - Department of Biomedical Data Sciences, Stanford University, Stanford, California, USA
  - Department of Surgery, Stanford University, Stanford, California, USA

5
Hernandez-Boussard T, Blayney DW, Brooks JD. Leveraging Digital Data to Inform and Improve Quality Cancer Care. Cancer Epidemiol Biomarkers Prev 2020; 29:816-822. [PMID: 32066619 DOI: 10.1158/1055-9965.epi-19-0873] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 08/05/2019] [Revised: 10/03/2019] [Accepted: 02/12/2020] [Indexed: 12/20/2022]
Abstract
BACKGROUND Efficient capture of routine clinical care and patient outcomes is needed at a population-level, as is evidence on important treatment-related side effects and their effect on well-being and clinical outcomes. The increasing availability of electronic health records (EHR) offers new opportunities to generate population-level patient-centered evidence on oncologic care that can better guide treatment decisions and patient-valued care. METHODS This study includes patients seeking care at an academic medical center, 2008 to 2018. Digital data sources are combined to address missingness, inaccuracy, and noise common to EHR data. Clinical concepts were identified and extracted from EHR unstructured data using natural language processing (NLP) and machine/deep learning techniques. All models are trained, tested, and validated on independent data samples using standard metrics. RESULTS We provide use cases for using EHR data to assess guideline adherence and quality measurements among patients with cancer. Pretreatment assessment was evaluated by guideline adherence and quality metrics for cancer staging metrics. Our studies in perioperative quality focused on medications administered and guideline adherence. Patient outcomes included treatment-related side effects and patient-reported outcomes. CONCLUSIONS Advanced technologies applied to EHRs present opportunities to advance population-level quality assessment, to learn from routinely collected clinical data for personalized treatment guidelines, and to augment epidemiologic and population health studies. The effective use of digital data can inform patient-valued care, quality initiatives, and policy guidelines. IMPACT A comprehensive set of health data analyzed with advanced technologies results in a unique resource that facilitates wide-ranging, innovative, and impactful research on prostate cancer. 
This work demonstrates new ways to use EHRs and technology to advance epidemiologic studies and benefit oncologic care. See all articles in this CEBP Focus section, "Modernizing Population Science."
Affiliation(s)
- Tina Hernandez-Boussard
  - Department of Medicine, Stanford University, Stanford, California
  - Department of Biomedical Data Science, Stanford University, Stanford, California
  - Department of Surgery, Stanford University School of Medicine, Stanford, California
- Douglas W Blayney
  - Department of Medicine, Stanford University, Stanford, California
  - Stanford Cancer Institute, Stanford University School of Medicine, Stanford, California
- James D Brooks
  - Stanford Cancer Institute, Stanford University School of Medicine, Stanford, California
  - Department of Urology, Stanford University School of Medicine, Stanford, California

6
Li K, Banerjee I, Magnani CJ, Blayney DW, Brooks JD, Hernandez-Boussard T. Clinical Documentation to Predict Factors Associated with Urinary Incontinence Following Prostatectomy for Prostate Cancer. Res Rep Urol 2020; 12:7-14. [PMID: 32158720 PMCID: PMC6986242 DOI: 10.2147/rru.s234178] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/10/2019] [Accepted: 12/11/2019] [Indexed: 02/01/2023]
Abstract
Background Advances in data collection provide opportunities to use population samples in identifying risk factors for urinary incontinence (UI), which occurs in up to 71% of men with prostate cancer following prostatectomy. Most studies on patient-centered outcomes use surveys or manual chart abstraction for data collection, which can be costly and difficult to scale. We sought to evaluate rates of and risk factors for UI following prostatectomy using natural language processing on electronic health record (EHR) data. Methods We conducted a retrospective analysis of patients undergoing prostatectomy for prostate cancer between January 2008 and August 2018 using EHR data from an academic medical center. UI incidence for each patient in the cohort was assessed using natural language processing from clinical notes generated pre- and postoperatively. Multivariable logistic regression was used to evaluate potential risk factors for postoperative UI at various time points within 2 years following surgery. Results We identified 3792 patients who underwent prostatectomy for prostate cancer. We found a significant association between preoperative UI and UI in the first (odds ratio [OR], 2.30; 95% confidence interval [CI], 1.24–4.28) and second (OR 2.24, 95% CI 1.04–4.83) years following surgery. Preoperative body mass index was also associated with UI in the second postoperative year (OR 1.11, 95% CI 1.02–1.21). Conclusion We show that a natural language processing approach using clinical narratives can be used to assess risk for UI in prostate cancer patients. Unstructured clinical narrative text can help advance future population-level research in patient-centered outcomes and quality of care.
Affiliation(s)
- Kevin Li
  - Stanford University School of Medicine, Stanford, CA, USA
- Imon Banerjee
  - Department of Biomedical Informatics, Emory School of Medicine, Atlanta, GA, USA
- Douglas W Blayney
  - Department of Medicine (Oncology), Stanford University School of Medicine, Stanford, CA, USA
- James D Brooks
  - Department of Urology (Urologic Oncology), Stanford University School of Medicine, Stanford, CA, USA
- Tina Hernandez-Boussard
  - Department of Medicine (Biomedical Informatics), Biomedical Data Sciences, and Surgery, Stanford University School of Medicine, Stanford, CA, USA

7
Chan L, Beers K, Yau AA, Chauhan K, Duffy Á, Chaudhary K, Debnath N, Saha A, Pattharanitima P, Cho J, Kotanko P, Federman A, Coca SG, Van Vleck T, Nadkarni GN. Natural language processing of electronic health records is superior to billing codes to identify symptom burden in hemodialysis patients. Kidney Int 2019; 97:383-392. [PMID: 31883805 DOI: 10.1016/j.kint.2019.10.023] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 04/30/2019] [Revised: 09/27/2019] [Accepted: 10/18/2019] [Indexed: 02/07/2023]
Abstract
Symptoms are common in patients on maintenance hemodialysis but identification is challenging. New informatics approaches including natural language processing (NLP) can be utilized to identify symptoms from narrative clinical documentation. Here we utilized NLP to identify seven patient symptoms from notes of maintenance hemodialysis patients of the BioMe Biobank and validated our findings using a separate cohort and the MIMIC-III database. NLP performance was compared for symptom detection with International Classification of Diseases (ICD)-9/10 codes and the performance of both methods was validated against manual chart review. From 1034 and 519 hemodialysis patients within BioMe and MIMIC-III databases, respectively, the most frequently identified symptoms by NLP were fatigue, pain, and nausea/vomiting. In BioMe, sensitivity for NLP (0.85 - 0.99) was higher than for ICD codes (0.09 - 0.59) for all symptoms with similar results in the BioMe validation cohort and MIMIC-III. ICD codes were significantly more specific for nausea/vomiting in BioMe and more specific for fatigue, depression, and pain in the MIMIC-III database. A majority of patients in both cohorts had four or more symptoms. Patients with more symptoms identified by NLP, ICD, and chart review had more clinical encounters. NLP had higher specificity in inpatient notes but higher sensitivity in outpatient notes and performed similarly across pain severity subgroups. Thus, NLP had higher sensitivity compared to ICD codes for identification of seven common hemodialysis-related symptoms, with comparable specificity between the two methods. Hence, NLP may be useful for the high-throughput identification of patient-centered outcomes when using electronic health records.
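The comparison in this abstract scores each detection method against manual chart review using sensitivity and specificity. A minimal sketch of that calculation follows, assuming binary per-patient labels for one symptom. The function name `sensitivity_specificity` and the eight example patients are hypothetical and are not data from the study.

```python
def sensitivity_specificity(predicted, reference):
    """Score binary predictions against a reference standard such as chart review.

    predicted, reference: equal-length sequences of 0/1 flags, one per patient.
    Returns (sensitivity, specificity).
    """
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)          # symptom found and truly present
    fn = sum(1 for p, r in pairs if not p and r)      # symptom missed
    tn = sum(1 for p, r in pairs if not p and not r)  # correctly ruled out
    fp = sum(1 for p, r in pairs if p and not r)      # false alarm
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference needs at least one positive and one negative")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels for eight patients and one symptom (illustrative only).
chart_review = [1, 1, 1, 1, 0, 0, 0, 0]
nlp_flags = [1, 1, 1, 0, 0, 0, 0, 1]
icd_flags = [1, 0, 0, 0, 0, 0, 0, 0]

nlp_scores = sensitivity_specificity(nlp_flags, chart_review)
icd_scores = sensitivity_specificity(icd_flags, chart_review)
print("NLP sensitivity, specificity:", nlp_scores)
print("ICD sensitivity, specificity:", icd_scores)
```

In this toy example the billing-code flags are perfectly specific but miss most true cases, the same qualitative pattern the abstract reports for ICD codes.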
Affiliation(s)
- Lili Chan
  - Division of Nephrology, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - The Charles Bronfman Institute for Personalized Medicine, Department of Genetics and Genomics Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Kelly Beers
  - Division of Nephrology, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Amy A Yau
  - Division of Nephrology, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Kinsuk Chauhan
  - Division of Nephrology, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Áine Duffy
  - The Charles Bronfman Institute for Personalized Medicine, Department of Genetics and Genomics Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Kumardeep Chaudhary
  - The Charles Bronfman Institute for Personalized Medicine, Department of Genetics and Genomics Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Neha Debnath
  - Division of Nephrology, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Aparna Saha
  - The Charles Bronfman Institute for Personalized Medicine, Department of Genetics and Genomics Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Pattharawin Pattharanitima
  - Division of Nephrology, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Judy Cho
  - The Charles Bronfman Institute for Personalized Medicine, Department of Genetics and Genomics Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Peter Kotanko
  - Division of Nephrology, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Renal Research Institute, New York, New York, USA
- Alex Federman
  - Division of General Internal Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Steven G Coca
  - Division of Nephrology, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Tielman Van Vleck
  - The Charles Bronfman Institute for Personalized Medicine, Department of Genetics and Genomics Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Girish N Nadkarni
  - Division of Nephrology, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - The Charles Bronfman Institute for Personalized Medicine, Department of Genetics and Genomics Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA

8
Savova GK, Danciu I, Alamudun F, Miller T, Lin C, Bitterman DS, Tourassi G, Warner JL. Use of Natural Language Processing to Extract Clinical Cancer Phenotypes from Electronic Medical Records. Cancer Res 2019; 79:5463-5470. [PMID: 31395609 PMCID: PMC7227798 DOI: 10.1158/0008-5472.can-19-0579] [Citation(s) in RCA: 76] [Impact Index Per Article: 15.2] [Received: 02/15/2019] [Revised: 06/17/2019] [Accepted: 07/29/2019] [Indexed: 12/12/2022]
Abstract
Current models for correlating electronic medical records with -omics data largely ignore clinical text, which is an important source of phenotype information for patients with cancer. This data convergence has the potential to reveal new insights about cancer initiation, progression, metastasis, and response to treatment. Insights from this real-world data will catalyze clinical care, research, and regulatory activities. Natural language processing (NLP) methods are needed to extract these rich cancer phenotypes from clinical text. Here, we review the advances of NLP and information extraction methods relevant to oncology based on publications from PubMed as well as NLP and machine learning conference proceedings in the last 3 years. Given the interdisciplinary nature of the fields of oncology and information extraction, this analysis serves as a critical trail marker on the path to higher fidelity oncology phenotypes from real-world data.
Affiliation(s)
- Guergana K Savova
  - Computational Health Informatics Program, Boston Children's Hospital, Boston, Massachusetts
  - Harvard Medical School, Boston, Massachusetts
- Timothy Miller
  - Computational Health Informatics Program, Boston Children's Hospital, Boston, Massachusetts
  - Harvard Medical School, Boston, Massachusetts
- Chen Lin
  - Computational Health Informatics Program, Boston Children's Hospital, Boston, Massachusetts
- Danielle S Bitterman
  - Harvard Medical School, Boston, Massachusetts
  - Dana Farber Cancer Institute, Boston, Massachusetts

9
Extracting Patient-Centered Outcomes from Clinical Notes in Electronic Health Records: Assessment of Urinary Incontinence After Radical Prostatectomy. EGEMS 2019; 7:43. [PMID: 31497615 PMCID: PMC6706996 DOI: 10.5334/egems.297] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 01/07/2023]
Abstract
Objective: To assess documentation of urinary incontinence (UI) in prostatectomy patients using unstructured clinical notes from Electronic Health Records (EHRs). Methods: We developed a weakly-supervised natural language processing tool to extract assessments, as recorded in unstructured text notes, of UI before and after radical prostatectomy in a single academic practice across multiple clinicians. Validation was carried out using a subset of patients who completed EPIC-26 surveys before and after surgery. The prevalence of UI as assessed by EHR and EPIC-26 was compared using repeated-measures ANOVA. The agreement of reported UI between EHR and EPIC-26 was evaluated using Cohen’s Kappa coefficient. Results: A total of 4870 patients and 716 surveys were included. Preoperative prevalence of UI was 12.7 percent. Postoperative prevalence was 71.8 percent at 3 months, 50.2 percent at 6 months, and 34.4 percent and 41.8 percent at 12 and 24 months, respectively. Similar rates were recorded by physicians in the EHR, particularly for early follow-up. For all time points, the agreement between EPIC-26 and the EHR was moderate (all p < 0.001) and ranged from 86.7 percent agreement at baseline (Kappa = 0.48) to 76.4 percent agreement at 24 months postoperative (Kappa = 0.047). Conclusions: We have developed a tool to assess documentation of UI after prostatectomy using EHR clinical notes. Our results suggest such a tool can facilitate unbiased measurement of important PCOs using real-world data, which are routinely recorded in EHR unstructured clinician notes. Integrating PCO information into clinical decision support can help guide shared treatment decisions and promote patient-valued care.
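The abstract reports both percent agreement and Cohen's Kappa between EHR-derived and survey-derived UI status. A minimal sketch of how the two quantities relate is given below, using the standard definition kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and p_e is the agreement expected by chance. The function name `percent_agreement_and_kappa` and the ten paired labels are made up for illustration and are not study data.

```python
from collections import Counter


def percent_agreement_and_kappa(rater_a, rater_b):
    """Return (observed agreement, Cohen's kappa) for two paired label sequences."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("inputs must be non-empty and the same length")
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability both sources pick the same label independently.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical UI status (1 = incontinent) for ten patients from two sources.
ehr_labels = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
survey_labels = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
agreement, kappa = percent_agreement_and_kappa(ehr_labels, survey_labels)
print(f"agreement={agreement:.2f}, kappa={kappa:.3f}")
```

Because kappa subtracts chance agreement, it is always lower than raw percent agreement whenever the two sources disagree at all, and the gap widens as one label becomes dominant.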
10
Banerjee I, Li K, Seneviratne M, Ferrari M, Seto T, Brooks JD, Rubin DL, Hernandez-Boussard T. Weakly supervised natural language processing for assessing patient-centered outcome following prostate cancer treatment. JAMIA Open 2019; 2:150-159. [PMID: 31032481 PMCID: PMC6482003 DOI: 10.1093/jamiaopen/ooy057] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 08/14/2018] [Revised: 11/14/2018] [Accepted: 11/28/2018] [Indexed: 11/13/2022]
Abstract
Background The population-based assessment of patient-centered outcomes (PCOs) has been limited by the difficulty of collecting these data efficiently and accurately. Natural language processing (NLP) pipelines can determine whether a clinical note within an electronic medical record contains evidence on these data. We present and demonstrate the accuracy of an NLP pipeline that aims to assess the presence, absence, or risk discussion of two important PCOs following prostate cancer treatment: urinary incontinence (UI) and bowel dysfunction (BD). Methods We propose a weakly supervised NLP approach which annotates electronic medical record clinical notes without requiring manual chart review. A weighted function of neural word embedding was used to create a sentence-level vector representation of relevant expressions extracted from the clinical notes. Sentence vectors were used as input for a multinomial logistic model, with output being either presence, absence or risk discussion of UI/BD. The classifier was trained based on automated sentence annotation depending only on domain-specific dictionaries (weak supervision). Results The model achieved an average F1 score of 0.86 for the sentence-level, three-tier classification task (presence/absence/risk) in both UI and BD. The model also outperformed a pre-existing rule-based model for note-level annotation of UI by a significant margin. Conclusions We demonstrate a machine learning method to categorize clinical notes based on important PCOs that trains a classifier on sentence vector representations labeled with a domain-specific dictionary, which eliminates the need for manual engineering of linguistic rules or manual chart review for extracting the PCOs. The weakly supervised NLP pipeline showed promising sensitivity and specificity for identifying important PCOs in unstructured clinical text notes compared to rule-based algorithms.
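The first two steps described in the Methods, dictionary-driven sentence labeling and a weighted average of word embeddings, can be sketched as follows. This is an illustrative sketch only: the term lists, the tiny two-dimensional embeddings, the weights, and the function names `weak_label` and `sentence_vector` are all hypothetical, since the abstract does not give the paper's dictionaries or weighting function. The multinomial logistic classifier that would consume these vectors is omitted.

```python
# Hypothetical domain dictionaries; the paper's actual term lists are not in the abstract.
AFFIRMED_TERMS = {"incontinence", "leakage"}
NEGATION_TERMS = {"no", "denies", "without"}
RISK_TERMS = {"risk", "risks", "discussed"}


def tokenize(sentence):
    """Lowercase and split on whitespace after dropping basic punctuation."""
    return sentence.lower().replace(".", " ").replace(",", " ").split()


def weak_label(sentence):
    """Assign presence / absence / risk from dictionaries alone (weak supervision)."""
    tokens = set(tokenize(sentence))
    if not tokens & AFFIRMED_TERMS:
        return None  # sentence does not mention the outcome at all
    if tokens & RISK_TERMS:
        return "risk"
    if tokens & NEGATION_TERMS:
        return "absence"
    return "presence"


def sentence_vector(tokens, embeddings, weights):
    """Weighted average of word vectors; tokens without an embedding are skipped."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for token in tokens:
        if token in embeddings:
            w = weights.get(token, 1.0)
            for i, value in enumerate(embeddings[token]):
                total[i] += w * value
            weight_sum += w
    if weight_sum == 0.0:
        return total
    return [value / weight_sum for value in total]


# Toy embeddings and weights, made up for illustration.
embeddings = {"incontinence": [1.0, 0.0], "denies": [0.0, 1.0]}
weights = {"incontinence": 3.0, "denies": 1.0}

sentence = "Patient denies incontinence."
label = weak_label(sentence)
vector = sentence_vector(tokenize(sentence), embeddings, weights)
print(label, vector)
```

Each (vector, label) pair produced this way would serve as one training example for the classifier, so no manually annotated notes are needed to train it.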
Affiliation(s)
- Imon Banerjee
- Department of Biomedical Data Science, Stanford University School of Medicine, Medical School Office Building (MSOB), 1265 Welch Road, Stanford, California 94305-5479, USA
- Kevin Li
- Stanford University School of Medicine, 291 Campus Drive, Stanford, California 94305-5479, USA
- Martin Seneviratne
- Department of Biomedical Data Science, Stanford University School of Medicine, Medical School Office Building (MSOB), 1265 Welch Road, Stanford, California 94305-5479, USA
- Department of Biomedical Informatics, Stanford University School of Medicine, Medical School Office Building (MSOB), 1265 Welch Road, Stanford, California 94305-5479, USA
- Michelle Ferrari
- Department of Urology - Divisions, Stanford University School of Medicine, 875 Blake Wilbur, Stanford, California 94305-5479, USA
- Tina Seto
- IRT Research Technology, Stanford University School of Medicine, Stanford, California 94305-5479, USA
- James D Brooks
- Department of Urology - Divisions, Stanford University School of Medicine, 875 Blake Wilbur, Stanford, California 94305-5479, USA
- Daniel L Rubin
- Department of Biomedical Data Science, Stanford University School of Medicine, Medical School Office Building (MSOB), 1265 Welch Road, Stanford, California 94305-5479, USA
- Department of Radiology, Stanford University School of Medicine, Stanford, California 94305-5479, USA
- Department of Medicine (Biomedical Informatics), Stanford University School of Medicine, Medical School Office Building (MSOB), 1265 Welch Road, Stanford, California 94305-5479, USA
- Tina Hernandez-Boussard
- Department of Biomedical Data Science, Stanford University School of Medicine, Medical School Office Building (MSOB), 1265 Welch Road, Stanford, California 94305-5479, USA
- Department of Medicine (Biomedical Informatics), Stanford University School of Medicine, Medical School Office Building (MSOB), 1265 Welch Road, Stanford, California 94305-5479, USA
- Department of Surgery, Stanford University School of Medicine, 300 Pasteur Drive, Stanford, California 94305-2200, USA
|
11
|
Liu LH, Choden S, Yazdany J. Quality improvement initiatives in rheumatology: an integrative review of the last 5 years. Curr Opin Rheumatol 2019; 31:98-108. [PMID: 30608250 PMCID: PMC7391997 DOI: 10.1097/bor.0000000000000586] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/21/2022]
Abstract
PURPOSE OF REVIEW We reviewed recent quality improvement initiatives in the field of rheumatology to identify common strategies and themes leading to measurable change. RECENT FINDINGS Efforts to improve quality of care in rheumatology have accelerated in the last 5 years. Most studies in this area have focused on interventions to improve process measures such as increasing the collection of patient-reported outcomes and vaccination rates, but some studies have examined interventions to improve health outcomes. Increasingly, researchers are studying electronic health record (EHR)-based interventions, such as standardized templates, flowsheets, best practice alerts and order sets. EHR-based interventions were most successful when reinforced with provider education, reminders and performance feedback. Most studies also redesigned workflows, distributing tasks among clinical staff. Given the common challenges and solutions facing rheumatology clinics under new value-based payment models, there are important opportunities to accelerate quality improvement by building on the successful efforts to date. Structured quality improvement models such as the learning collaborative may help to disseminate successful initiatives across practices. SUMMARY Review of recent quality improvement initiatives in rheumatology demonstrated common solutions, particularly involving leveraging health IT and workflow redesign.
Affiliation(s)
- Lucy H Liu
- Division of Rheumatology, Department of Medicine, University of California, San Francisco, California, USA
|
12
|
Utilization of Prostate Cancer Quality Metrics for Research and Quality Improvement: A Structured Review. Jt Comm J Qual Patient Saf 2018; 45:217-226. [PMID: 30236510 DOI: 10.1016/j.jcjq.2018.06.004] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 02/14/2018] [Revised: 06/21/2018] [Accepted: 06/26/2018] [Indexed: 02/07/2023]
Abstract
BACKGROUND The shift toward value-based care in the United States emphasizes the role of quality measures in payment models. Many diseases, such as prostate cancer, have a proliferation of quality measures, resulting in resource burden and physician burnout. This study aimed to identify and summarize proposed prostate cancer quality measures and describe their frequency and use in peer-reviewed literature. METHODS The PubMed database was used to identify quality measures relevant to prostate cancer care, and included articles in English through April 2018. A gray literature search for other documents was also conducted. After selection of the pertinent articles, measure characteristics were abstracted, and uses were summarized for the 10 most frequently utilized measures in the literature. RESULTS A total of 26 articles were identified for review. Of the 71 proposed prostate cancer quality measures, only 47 were used, and fewer than 10% of these were endorsed by the National Quality Forum. Process measures were most frequently reported (84.5%). Only 6 outcome measures (8.5%) were proposed, none of which were among the most frequently utilized. CONCLUSION Although a high number of proposed prostate cancer quality measures are reported in the literature, few were assessed, and the majority of these were non-endorsed process measures. Process measures were most commonly assessed; outcome measures were rarely evaluated. In a step to close the quality chasm, a "top 5" core set of quality measures for prostate cancer care, including structure, process, and outcomes measures, is suggested. Future studies should consider this comprehensive set of quality measures.
|
13
|
Abstract
Background Electronic health record (EHR)-based research in oncology can be limited by missing data and a lack of structured data elements. Clinical research data warehouses for specific cancer types can enable the creation of more robust research cohorts. Methods We linked data from the Stanford University EHR with the Stanford Cancer Institute Research Database (SCIRDB) and the California Cancer Registry (CCR) to create a research data warehouse for prostate cancer. The database was supplemented with information from clinical trials, natural language processing of clinical notes and surveys on patient-reported outcomes. Results A total of 11,898 unique prostate cancer patients were identified in the Stanford EHR, of whom 3,936 were matched to the Stanford cancer registry and 6,153 to the CCR. A total of 7,158 patients with EHR data and at least one of SCIRDB and CCR data were initially included in the warehouse. Conclusions A disease-specific clinical research data warehouse combining multiple data sources can facilitate secondary data use and enhance observational research in oncology.
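The inclusion rule in this abstract (EHR patients matched to at least one of the two registries) amounts to a set union over linked identifiers. The sketch below shows that logic on invented patient IDs; it assumes a shared identifier already exists across sources, which real registry linkage generally has to establish through a separate matching step that the abstract does not describe.

```python
# Toy patient identifiers; every value here is invented for illustration.
ehr = {"p01", "p02", "p03", "p04", "p05", "p06"}
scirdb = {"p01", "p02", "p09"}         # hypothetical institutional registry IDs
ccr = {"p02", "p03", "p04", "p10"}     # hypothetical state registry IDs

in_scirdb = ehr & scirdb               # EHR patients matched to the institutional registry
in_ccr = ehr & ccr                     # EHR patients matched to the state registry
warehouse = in_scirdb | in_ccr         # EHR data plus at least one registry match
in_both = in_scirdb & in_ccr           # matched to both registries

# Inclusion-exclusion ties the three counts together.
assert len(warehouse) == len(in_scirdb) + len(in_ccr) - len(in_both)
```

Because patients matched to both registries are counted once, the warehouse size is smaller than the sum of the two match counts.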
|
14
|
Hernandez-Boussard T, Kourdis PD, Seto T, Ferrari M, Blayney DW, Rubin D, Brooks JD. Mining Electronic Health Records to Extract Patient-Centered Outcomes Following Prostate Cancer Treatment. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2018; 2017:876-882. [PMID: 29854154 PMCID: PMC5977629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
The granular clinical data in electronic health record (EHR) systems provide opportunities to improve patient care using information retrieval methods. However, it is well known that many methodological obstacles exist in accessing data within EHRs. In particular, clinical notes routinely stored in EHRs are composed of narrative, highly unstructured and heterogeneous biomedical text. This inherent complexity hinders the ability to perform automated large-scale medical knowledge extraction tasks without the use of computational linguistics methods. The aim of this work was to develop and validate a Natural Language Processing (NLP) pipeline to detect important patient-centered outcomes (PCOs) as interpreted and documented by clinicians in their dictated notes for male patients receiving treatment for localized prostate cancer at an academic medical center.
Affiliation(s)
- Tina Seto
- Stanford University, School of Medicine, Stanford, CA
- Daniel Rubin
- Stanford University, School of Medicine, Stanford, CA
|
15
|
Leveraging the electronic health record to improve quality and safety in rheumatology. Rheumatol Int 2017; 37:1603-1610. [PMID: 28852846 DOI: 10.1007/s00296-017-3804-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 08/02/2017] [Accepted: 08/17/2017] [Indexed: 12/13/2022]
Abstract
During the last two decades, improving the quality and safety of healthcare has become a focus in rheumatology. Widespread use of electronic health records (EHRs) and the availability of digital data have the potential to drive quality improvement, improve patient outcomes, and prevent adverse events. In the coming years, developing and leveraging tools within the EHR will be the key to making the next big strides in improving the health of patients with rheumatoid arthritis and other rheumatic diseases, including building EHR infrastructure to capture patient outcomes and developing automated methods to retrieve information from free text of clinical notes.
|
16
|
Holve E, Weiss S. Concordium 2015: Strategic Uses of Evidence to Transform Delivery Systems. EGEMS 2016; 4:1275. [PMID: 27683671 PMCID: PMC5019304 DOI: 10.13063/2327-9214.1275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
Abstract
In September 2015 the EDM Forum hosted AcademyHealth’s newest national conference, Concordium. The 11 papers featured in the eGEMs “Concordium 2015” special issue successfully reflect the major themes and issues discussed at the meeting. Many of the papers address informatics or methodological approaches to natural language processing (NLP) or text analysis, which is indicative of the importance of analyzing text data to gain insights into care coordination and patient-centered outcomes. Perspectives on the tools and infrastructure requirements that are needed to build learning health systems were also recurrent themes.
|