1
Poulsen MN, Freda PJ, Troiani V, Mowery DL. Developing a Framework to Infer Opioid Use Disorder Severity From Clinical Notes to Inform Natural Language Processing Methods: Characterization Study. JMIR Ment Health 2024; 11:e53366. [PMID: 38224481 PMCID: PMC10825772 DOI: 10.2196/53366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Revised: 11/30/2023] [Accepted: 12/02/2023] [Indexed: 01/16/2024]
Abstract
BACKGROUND Information regarding opioid use disorder (OUD) status and severity is important for patient care. Clinical notes provide valuable information for detecting and characterizing problematic opioid use, necessitating development of natural language processing (NLP) tools, which in turn requires reliably labeled OUD-relevant text and understanding of documentation patterns. OBJECTIVE To inform automated NLP methods, we aimed to develop and evaluate an annotation schema for characterizing OUD and its severity, and to document patterns of OUD-relevant information within clinical notes of heterogeneous patient cohorts. METHODS We developed an annotation schema to characterize OUD severity based on criteria from the Diagnostic and Statistical Manual of Mental Disorders, 5th edition. In total, 2 annotators reviewed clinical notes from key encounters of 100 adult patients with varied evidence of OUD, including patients with and those without chronic pain, with and without medication treatment for OUD, and a control group. We completed annotations at the sentence level. We calculated severity scores based on annotation of note text with 18 classes aligned with criteria for OUD severity and determined positive predictive values for OUD severity. RESULTS The annotation schema contained 27 classes. We annotated 1436 sentences from 82 patients; notes of 18 patients (11 of whom were controls) contained no relevant information. Interannotator agreement was above 70% for 11 of 15 batches of reviewed notes. Severity scores for control group patients were all 0. Among noncontrol patients, the mean severity score was 5.1 (SD 3.2), indicating moderate OUD, and the positive predictive value for detecting moderate or severe OUD was 0.71. Progress notes and notes from emergency department and outpatient settings contained the most and greatest diversity of information. 
Substance misuse and psychiatric classes were most prevalent and highly correlated across note types with high co-occurrence across patients. CONCLUSIONS Implementation of the annotation schema demonstrated strong potential for inferring OUD severity based on key information in a small set of clinical notes and highlighting where such information is documented. These advancements will facilitate NLP tool development to improve OUD prevention, diagnosis, and treatment.
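The severity scoring and positive predictive value described in this abstract can be illustrated with a minimal sketch. The DSM-5 thresholds used here (2-3 criteria mild, 4-5 moderate, 6 or more severe) are the standard ones; the criterion label names and the rule of one point per distinct criterion are assumptions made for illustration and are not the authors' mapping of 18 annotation classes.

```python
from typing import Iterable, Sequence


def severity_score(criterion_labels: Iterable[str]) -> int:
    """Count distinct criteria found across a patient's annotated sentences.

    Assumption: each distinct label contributes one point, however many
    sentences mention it.
    """
    return len(set(criterion_labels))


def severity_category(score: int) -> str:
    """Map a criteria count to a DSM-5 severity label."""
    if score >= 6:
        return "severe"
    if score >= 4:
        return "moderate"
    if score >= 2:
        return "mild"
    return "none"


def positive_predictive_value(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """PPV = true positives / (true positives + false positives)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    return tp / (tp + fp) if (tp + fp) else float("nan")


# Hypothetical labels for one patient's annotated sentences.
labels = ["craving", "tolerance", "craving", "withdrawal", "hazardous_use"]
score = severity_score(labels)
category = severity_category(score)
```

With these hypothetical labels the patient has four distinct criteria, which falls in the moderate band.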
Affiliation(s)
- Melissa N Poulsen
- Department of Population Health Sciences, Geisinger, Danville, PA, United States
- Philip J Freda
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, West Hollywood, CA, United States
- Vanessa Troiani
- Department of Autism and Developmental Medicine, Geisinger, Danville, PA, United States
- Danielle L Mowery
- Department of Biostatistics, Epidemiology and Informatics, Institute of Biomedical Informatics, University of Pennsylvania, Philadelphia, PA, United States
2
Freda PJ, Kranzler HR, Moore JH. Novel digital approaches to the assessment of problematic opioid use. BioData Min 2022; 15:14. [PMID: 35840990 PMCID: PMC9284824 DOI: 10.1186/s13040-022-00301-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2021] [Accepted: 06/30/2022] [Indexed: 11/16/2022]
Abstract
The opioid epidemic continues to contribute to loss of life through overdose and significant social and economic burdens. Many individuals who develop problematic opioid use (POU) do so after being exposed to prescribed opioid analgesics. Therefore, it is important to accurately identify and classify risk factors for POU. In this review, we discuss the etiology of POU and highlight novel approaches to identifying its risk factors. These approaches include the application of polygenic risk scores (PRS) and diverse machine learning (ML) algorithms used in tandem with data from electronic health records (EHR), clinical notes, patient demographics, and digital footprints. The implementation and synergy of these types of data and approaches can greatly assist in reducing the incidence of POU and opioid-related mortality by increasing the knowledge base of patient-related risk factors, which can help to improve prescribing practices for opioid analgesics.
Affiliation(s)
- Philip J Freda
- Cedars-Sinai Medical Center, Department of Computational Biomedicine, 700 N. San Vicente Blvd., Pacific Design Center Suite G540, West Hollywood, CA, 90069, USA
- Henry R Kranzler
- University of Pennsylvania, Center for Studies of Addiction, 3535 Market St., Suite 500 and Crescenz VAMC, 3800 Woodland Ave., Philadelphia, PA, 19104, USA
- Jason H Moore
- Cedars-Sinai Medical Center, Department of Computational Biomedicine, 700 N. San Vicente Blvd., Pacific Design Center Suite G540, West Hollywood, CA, 90069, USA
3
Poulsen MN, Freda PJ, Troiani V, Davoudi A, Mowery DL. Classifying Characteristics of Opioid Use Disorder From Hospital Discharge Summaries Using Natural Language Processing. Front Public Health 2022; 10:850619. [PMID: 35615042 PMCID: PMC9124945 DOI: 10.3389/fpubh.2022.850619] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 01/07/2022] [Accepted: 04/19/2022] [Indexed: 11/25/2022]
Abstract
Background Opioid use disorder (OUD) is underdiagnosed in health system settings, limiting research on OUD using electronic health records (EHRs). Medical encounter notes can enrich structured EHR data with documented signs and symptoms of OUD and social risks and behaviors. To capture this information at scale, natural language processing (NLP) tools must be developed and evaluated. We developed and applied an annotation schema to deeply characterize OUD and related clinical, behavioral, and environmental factors, and automated the annotation schema using machine learning and deep learning-based approaches. Methods Using the MIMIC-III Critical Care Database, we queried hospital discharge summaries of patients with International Classification of Diseases (ICD-9) OUD diagnostic codes. We developed an annotation schema to characterize problematic opioid use, identify individuals with potential OUD, and provide psychosocial context. Two annotators reviewed discharge summaries from 100 patients. We randomly sampled patients with their associated annotated sentences and divided them into training (66 patients; 2,127 annotated sentences) and testing (29 patients; 1,149 annotated sentences) sets. We used the training set to generate features, employing three NLP algorithms/knowledge sources. We trained and tested prediction models for classification with a traditional machine learner (logistic regression) and deep learning approach (Autogluon based on ELECTRA's replaced token detection model). We applied a five-fold cross-validation approach to reduce bias in performance estimates. Results The resulting annotation schema contained 32 classes. We achieved moderate inter-annotator agreement, with F1-scores across all classes increasing from 48 to 66%. 
Five classes had a sufficient number of annotations for automation; of these, we observed consistently high performance (F1-scores) across training and testing sets for drug screening (training: 91-96; testing: 91-94) and opioid type (training: 86-96; testing: 86-99). Performance dropped from training to testing sets for other drug use (training: 52-65; testing: 40-48), pain management (training: 72-78; testing: 61-78) and psychiatric (training: 73-80; testing: 72). Autogluon achieved the highest performance. Conclusion This pilot study demonstrated that rich information regarding problematic opioid use can be manually identified by annotators. However, more training samples and features would improve our ability to reliably identify less common classes from clinical text, including text from outpatient settings.
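As a rough illustration of the per-class F1-scores reported above, the sketch below computes F1 for each class from paired gold and predicted sentence labels. The class names come from the abstract, but the label sequences are invented for illustration, and the function is a generic F1 calculation rather than the authors' evaluation code.

```python
from typing import Dict, Sequence


def f1_per_class(gold: Sequence[str], predicted: Sequence[str]) -> Dict[str, float]:
    """Compute F1 = 2*TP / (2*TP + FP + FN) separately for each class."""
    scores: Dict[str, float] = {}
    for cls in sorted(set(gold) | set(predicted)):
        tp = sum(1 for g, p in zip(gold, predicted) if g == cls and p == cls)
        fp = sum(1 for g, p in zip(gold, predicted) if g != cls and p == cls)
        fn = sum(1 for g, p in zip(gold, predicted) if g == cls and p != cls)
        denom = 2 * tp + fp + fn
        scores[cls] = 2 * tp / denom if denom else 0.0
    return scores


# Hypothetical sentence-level labels (one label per sentence).
gold = ["opioid_type", "drug_screening", "opioid_type", "psychiatric"]
pred = ["opioid_type", "drug_screening", "psychiatric", "psychiatric"]
f1 = f1_per_class(gold, pred)
```

Reporting F1 per class, as the abstract does, makes it visible when a rare class performs much worse than a frequent one, which a single pooled score would hide.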
Affiliation(s)
- Melissa N. Poulsen
- Department of Population Health Sciences, Geisinger, Danville, PA, United States
- Philip J. Freda
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, United States
- Vanessa Troiani
- Autism and Developmental Medicine Institute, Geisinger, Danville, PA, United States
- Anahita Davoudi
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, United States
- Danielle L. Mowery
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, United States
- Department of Biostatistics, Epidemiology and Informatics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA, United States
4
Lin EJD, Schroeder M, Huang Y, Linwood SL. Digital Health for the Opioid Crisis: A Historical Analysis of NIH Funding from 2013 to 2017. Digit Health 2022. [DOI: 10.36255/exon-publications-digital-health-opioid-crisis] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
5
Decker BM, Hill CE, Baldassano SN, Khankhanian P. Can antiepileptic efficacy and epilepsy variables be studied from electronic health records? A review of current approaches. Seizure 2021; 85:138-144. [PMID: 33461032 DOI: 10.1016/j.seizure.2020.11.011] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/30/2020] [Revised: 11/16/2020] [Accepted: 11/17/2020] [Indexed: 12/16/2022]
Abstract
As automated data extraction and natural language processing (NLP) are rapidly evolving, improving healthcare delivery by harnessing large data is garnering great interest. Assessing antiepileptic drug (AED) efficacy and other epilepsy variables pertinent to healthcare delivery remain a critical barrier to improving patient care. In this systematic review, we examined automatic electronic health record (EHR) extraction methodologies pertinent to epilepsy. We also reviewed more generalizable NLP pipelines to extract other critical patient variables. Our review found varying reports of performance measures. Whereas automated data extraction pipelines are a crucial advancement, this review calls attention to standardizing NLP methodology and accuracy reporting for greater generalizability. Moreover, the use of crowdsourcing competitions to spur innovative NLP pipelines would further advance this field.
Affiliation(s)
- Barbara M Decker
- Center for Neuroengineering and Therapeutics, Department of Neurology, University of Pennsylvania, 3400 Spruce Street, Philadelphia, PA, 19104, United States
- Chloé E Hill
- Department of Neurology, University of Michigan, 1500 East Medical Center Drive, Ann Arbor, MI, 48109, United States
- Steven N Baldassano
- Center for Neuroengineering and Therapeutics, Department of Neurology, University of Pennsylvania, 3400 Spruce Street, Philadelphia, PA, 19104, United States
- Pouya Khankhanian
- Center for Neuroengineering and Therapeutics, Department of Neurology, University of Pennsylvania, 3400 Spruce Street, Philadelphia, PA, 19104, United States
6
Hazlehurst B, Green CA, Perrin NA, Brandes J, Carrell DS, Baer A, DeVeaugh-Geiss A, Coplan PM. Using natural language processing of clinical text to enhance identification of opioid-related overdoses in electronic health records data. Pharmacoepidemiol Drug Saf 2019; 28:1143-1151. [PMID: 31218780 PMCID: PMC6772185 DOI: 10.1002/pds.4810] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 09/14/2018] [Revised: 04/24/2019] [Accepted: 05/08/2019] [Indexed: 01/04/2023]
Abstract
Purpose To enhance automated methods for accurately identifying opioid‐related overdoses and classifying types of overdose using electronic health record (EHR) databases. Methods We developed a natural language processing (NLP) software application to code clinical text documentation of overdose, including identification of intention for self‐harm, substances involved, substance abuse, and error in medication usage. Using datasets balanced with cases of suspected overdose and records of individuals at elevated risk for overdose, we developed and validated the application using Kaiser Permanente Northwest data, then tested portability of the application using Kaiser Permanente Washington data. Datasets were chart‐reviewed to provide a gold standard for comparison and evaluation of the automated method. Results The method performed well in identifying overdose (sensitivity = 0.80, specificity = 0.93), intentional overdose (sensitivity = 0.81, specificity = 0.98), and involvement of opioids (excluding heroin, sensitivity = 0.72, specificity = 0.96) and heroin (sensitivity = 0.84, specificity = 1.0). The method performed poorly at identifying adverse drug reactions and overdose due to patient error and fairly at identifying substance abuse in opioid‐related unintentional overdose (sensitivity = 0.67, specificity = 0.96). Evaluation using validation datasets yielded significant reductions, in specificity and negative predictive values only, for many classifications mentioned above. However, these measures remained above 0.80, thus, performance observed during development was largely maintained during validation. Similar results were obtained when evaluating portability, although there was a significant reduction in sensitivity for unintentional overdose that was attributed to missing text clinical notes in the database. 
Conclusions Methods that process text clinical notes show promise for improving accuracy and fidelity at identifying and classifying overdoses according to type using EHR data.
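The sensitivity and specificity figures in this abstract compare automated classifications against a chart-reviewed gold standard. A minimal sketch of that calculation follows; the example labels are invented for illustration and do not reproduce the study's data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(predicted: Sequence[bool], gold: Sequence[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) against a gold standard.

    sensitivity = TP / (TP + FN): share of true cases the method flags.
    specificity = TN / (TN + FP): share of non-cases the method leaves unflagged.
    """
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)
    fn = sum(1 for p, g in zip(predicted, gold) if not p and g)
    tn = sum(1 for p, g in zip(predicted, gold) if not p and not g)
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Hypothetical chart-review labels (gold) and NLP output (predicted) for 10 records.
gold = [True, True, True, True, True, False, False, False, False, False]
predicted = [True, True, True, True, False, False, False, False, False, True]
sens, spec = sensitivity_specificity(predicted, gold)
```

In this made-up example one overdose is missed and one non-case is flagged, so both measures come out at 4 of 5.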
Affiliation(s)
- Brian Hazlehurst
- Center for Health Research, Kaiser Permanente Northwest, Portland, OR
- Carla A Green
- Center for Health Research, Kaiser Permanente Northwest, Portland, OR
- Nancy A Perrin
- Center for Health Research, Kaiser Permanente Northwest, Portland, OR
- John Brandes
- Center for Health Research, Kaiser Permanente Northwest, Portland, OR
- David S Carrell
- Health Research Institute, Kaiser Permanente Washington, Seattle, WA
- Andrew Baer
- Group Health Research Institute, Group Health Cooperative, Seattle, WA
- Paul M Coplan
- Epidemiology, Medical Affairs, Purdue Pharma, LP, Stamford, CT; Adjunct, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
7
Clark MR, Hurley RW, Adams MCB. Re-assessing the Validity of the Opioid Risk Tool in a Tertiary Academic Pain Management Center Population. Pain Medicine (Malden, Mass.) 2018; 19:1382-1395. [PMID: 29408996 PMCID: PMC7191882 DOI: 10.1093/pm/pnx332] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 12/12/2022]
Abstract
OBJECTIVE To analyze the validity of the Opioid Risk Tool (ORT) in a large, diverse population. DESIGN A cross-sectional descriptive study. SETTING Academic tertiary pain management center. SUBJECTS A total of 225 consecutive new patients, aged 18 years or older. METHODS Data collection included demographics, ORT scores, aberrant behaviors, pain intensity scores, opioid type and dose, smoking status, employment, and marital status. RESULTS In this population, we were not able to replicate the findings of the initial ORT study. Self-report was no better than chance in predicting those who would have an opioid aberrant behavior. The ORT risk variables did not predict aberrant behaviors in either gender group. There was significant disparity in the scores between self-reported ORT and the ORT supplemented with medical record data (enhanced ORT). Using the enhanced ORT, high-risk patients were 2.5 times more likely to have an aberrant behavior than the low-risk group. The only risk variable associated with aberrant behavior was personal history of prescription drug misuse. CONCLUSIONS The self-report ORT was not a valid test for the prediction of future aberrant behaviors in this academic pain management population. The original risk categories (low, medium, high) were not supported in either the self-reported version or the enhanced version; however, the enhanced data were able to differentiate between high- and low-risk patients. Unfortunately, without technological automation, the enhanced ORT suffers from practical limitations. The self-report ORT may not be a valid tool in current pain populations; however, modification into a binary (high/low) score system needs further study.
Affiliation(s)
- Meredith R Clark
- Division of Pain Medicine, Department of Anesthesiology, Medical College of Wisconsin, Wauwatosa, Wisconsin
- Robert W Hurley
- Section of Pain Medicine, Department of Anesthesiology, Wake Forest School of Medicine, Medical Center Drive, Winston-Salem, North Carolina, USA
- Meredith C B Adams
- Section of Pain Medicine, Department of Anesthesiology, Wake Forest School of Medicine, Medical Center Drive, Winston-Salem, North Carolina, USA
8
Névéol A, Zweigenbaum P. Making Sense of Big Textual Data for Health Care: Findings from the Section on Clinical Natural Language Processing. Yearb Med Inform 2017; 26:228-234. [PMID: 29063569 PMCID: PMC6239234 DOI: 10.15265/iy-2017-027] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 08/18/2017] [Indexed: 02/01/2023]
Abstract
Objectives: To summarize recent research and present a selection of the best papers published in 2016 in the field of clinical Natural Language Processing (NLP). Method: A survey of the literature was performed by the two section editors of the IMIA Yearbook NLP section. Bibliographic databases were searched for papers with a focus on NLP efforts applied to clinical texts or aimed at a clinical outcome. Papers were automatically ranked and then manually reviewed based on titles and abstracts. A shortlist of candidate best papers was first selected by the section editors before being peer-reviewed by independent external reviewers. Results: The five clinical NLP best papers provide a contribution that ranges from emerging original foundational methods to transitioning solid established research results to a practical clinical setting. They offer a framework for abbreviation disambiguation and coreference resolution, a classification method to identify clinically useful sentences, an analysis of counseling conversations to improve support to patients with mental disorder and grounding of gradable adjectives. Conclusions: Clinical NLP continued to thrive in 2016, with an increasing number of contributions towards applications compared to fundamental methods. Fundamental work addresses increasingly complex problems such as lexical semantics, coreference resolution, and discourse analysis. Research results translate into freely available tools, mainly for English.
Affiliation(s)
- A. Névéol
- LIMSI, CNRS, Université Paris Saclay, Orsay, France