1
Adekkanattu P, Furmanchuk A, Wu Y, Pathak A, Patra BG, Bost S, Morrow D, Wang GHM, Yang Y, Forrest NJ, Luo Y, Walunas TL, Jenny WHLC, Gelad W, Bian J, Bao Y, Weiner M, Oslin D, Pathak J. Detection of Personal and Family History of Suicidal Thoughts and Behaviors using Deep Learning and Natural Language Processing: A Multi-Site Study. RESEARCH SQUARE 2024:rs.3.rs-4014472. [PMID: 38559051 PMCID: PMC10980141 DOI: 10.21203/rs.3.rs-4014472/v1] [Indexed: 04/04/2024]
Abstract
Objective Personal and family history of suicidal thoughts and behaviors (PSH and FSH, respectively) are significant risk factors associated with future suicide events. These are often captured in narrative clinical notes in electronic health records (EHRs). Collaboratively, Weill Cornell Medicine (WCM), Northwestern Medicine (NM), and the University of Florida (UF) developed and validated deep learning (DL)-based natural language processing (NLP) tools to detect PSH and FSH from such notes. The tools' performance was further benchmarked against a method relying exclusively on ICD-9/10 diagnosis codes.
Materials and Methods We developed DL-based NLP tools utilizing the pre-trained transformer models Bio_ClinicalBERT and GatorTron, and compared them with expert-informed, rule-based methods. The tools were initially developed and validated using manually annotated clinical notes at WCM. Their portability and performance were further evaluated using clinical notes at NM and UF.
Results The DL tools outperformed the rule-based NLP tool in identifying PSH and FSH. For detecting PSH, the rule-based system obtained an F1-score of 0.75 ± 0.07, while the Bio_ClinicalBERT and GatorTron DL tools scored 0.83 ± 0.09 and 0.84 ± 0.07, respectively. For detecting FSH, the rule-based NLP tool's F1-score was 0.69 ± 0.11, compared to 0.89 ± 0.10 for Bio_ClinicalBERT and 0.92 ± 0.07 for GatorTron. For the gold standard corpora across the three sites, only 2.2% (WCM), 9.3% (NM), and 7.8% (UF) of patients were reported to have an ICD-9/10 diagnosis code for suicidal thoughts and behaviors prior to the clinical note's report date. The best-performing GatorTron DL tool identified 93.0% (WCM), 80.4% (NM), and 89.0% (UF) of patients with documented PSH, and 85.0% (WCM), 89.5% (NM), and 100% (UF) of patients with documented FSH in their notes.
Discussion While PSH and FSH are significant risk factors for future suicide events, little effort has previously been made to identify individuals with such a history. To address this, we developed a transformer-based DL method and compared it with a conventional rule-based NLP approach. The varying effectiveness of the rule-based tool across sites suggests that its dictionary-based approach needs improvement. In contrast, the performances of the DL tools were higher and comparable across sites. Furthermore, the DL tools were fine-tuned using only a small number of annotated notes at each site, which underscores their greater adaptability to local documentation practices and lexical variations.
Conclusion Variations in local documentation practices across health care systems pose challenges to rule-based NLP tools. In contrast, the developed DL tools can effectively extract PSH and FSH information from unstructured clinical notes. These tools will provide clinicians with crucial information for assessing and treating patients at elevated risk for suicide who have rarely been diagnosed.
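As a rough illustration of the dictionary-based baseline and the F1-score reported in the abstract, the following minimal sketch labels sentences as PSH, FSH, or neither, and scores the labels against a gold standard. The term lists, example sentences, and function names are hypothetical and chosen only for illustration; they are not the study's lexicon, data, or implementation.

```python
import re

# Hypothetical term lists for illustration only; not the study's lexicon.
HISTORY_TERMS = [r"suicide attempt", r"suicidal ideation", r"attempted suicide"]
FAMILY_CUES = [r"mother", r"father", r"brother", r"sister", r"family history"]


def classify_sentence(sentence: str) -> str:
    """Return 'PSH', 'FSH', or 'none' using simple dictionary matching."""
    text = sentence.lower()
    if not any(re.search(term, text) for term in HISTORY_TERMS):
        return "none"
    # A family cue in the same sentence shifts the label from personal to family.
    if any(re.search(r"\b" + cue + r"\b", text) for cue in FAMILY_CUES):
        return "FSH"
    return "PSH"


def f1_score(gold, predicted, label):
    """F1 for one label: harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == label and p == label for g, p in pairs)
    fp = sum(g != label and p == label for g, p in pairs)
    fn = sum(g == label and p != label for g, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented example sentences with invented gold labels.
sentences = [
    "Patient reports a suicide attempt in 2015.",
    "Mother attempted suicide when patient was a child.",
    "Denies chest pain.",
    "History of SI with plan.",  # abbreviation missing from the dictionary
]
gold = ["PSH", "FSH", "none", "PSH"]
predicted = [classify_sentence(s) for s in sentences]
```

The last sentence uses an abbreviation that the hypothetical dictionary does not contain, so the rule misses it. This is the kind of lexical variation across sites that the abstract describes as a challenge for rule-based tools.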
Affiliation(s)
- Al'ona Furmanchuk
- Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Yonghui Wu
- University of Florida College of Medicine, Gainesville, FL, USA
- Aman Pathak
- University of Florida College of Medicine, Gainesville, FL, USA
- Sarah Bost
- University of Florida College of Medicine, Gainesville, FL, USA
- Yuyang Yang
- Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Yuan Luo
- Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Theresa L Walunas
- Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Wei-Hsuan Lo-Ciganic Jenny
- University of Florida College of Medicine, Gainesville, FL, USA
- University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Walid Gelad
- University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Jiang Bian
- University of Florida College of Medicine, Gainesville, FL, USA
- Yuhua Bao
- Weill Cornell Medicine, New York, NY, USA
- David Oslin
- Corporal Michael J Crescenz Veterans Affairs Medical Center, Philadelphia, PA, USA
2
Meerwijk EL, Jones GA, Shotqara AS, Reyes S, Tamang SR, Eddington HS, Reeves RM, Finlay AK, Harris AHS. Development of a 3-Step theory of suicide ontology to facilitate 3ST factor extraction from clinical progress notes. J Biomed Inform 2024; 150:104582. [PMID: 38160758 DOI: 10.1016/j.jbi.2023.104582] [Received: 09/18/2023] [Revised: 11/21/2023] [Accepted: 12/22/2023] [Indexed: 01/03/2024]
Abstract
OBJECTIVE Suicide risk prediction algorithms at the Veterans Health Administration (VHA) do not include predictors based on the 3-Step Theory of suicide (3ST), which builds on hopelessness, psychological pain, connectedness, and capacity for suicide. These four factors are not available from structured fields in VHA electronic health records, but they are found in unstructured clinical text. An ontology and controlled vocabulary that maps psychosocial and behavioral terms to these factors does not exist. The objectives of this study were 1) to develop an ontology with a controlled vocabulary of terms that map onto classes that represent the 3ST factors as identified within electronic clinical progress notes, and 2) to determine the accuracy of automated extractions based on terms in the controlled vocabulary.
METHODS A team of four annotators performed linguistic annotation of 30,000 clinical progress notes, drawn from the VHA electronic health records of 231 Veterans who attempted suicide or died by suicide, for terms relating to the 3ST factors. Annotation involved manually assigning a label to words or phrases that indicated presence or absence of the factor (polarity). These words and phrases were entered into a controlled vocabulary that was then used by our computational system to tag 14 million clinical progress notes from Veterans who attempted or died by suicide after 2013. Tagged text was extracted and machine-labelled for presence or absence of the 3ST factors. Accuracy of these machine labels was determined for 1000 randomly selected extractions for each factor against a ground truth created by our annotators.
RESULTS Linguistic annotation identified 8486 terms that related to 33 subclasses across the four factors and polarities. Precision of machine-labelled extractions ranged from 0.73 to 1.00 for most factor-polarity combinations, whereas recall was somewhat lower (0.65-0.91).
CONCLUSION The ontology that was developed consists of classes that represent each of the four 3ST factors, subclasses, relationships, and terms that map onto those classes, which are stored in a controlled vocabulary (https://bioportal.bioontology.org/ontologies/THREE-ST). The use case that we present shows how scores based on clinical notes tagged for terms in the controlled vocabulary capture meaningful change in the 3ST factors during the weeks preceding a suicidal event.
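The tagging step described in the abstract (matching controlled-vocabulary terms in note text and labelling each match with a factor and polarity) can be sketched roughly as below, together with a precision and recall check against annotator labels. The vocabulary entries, the example note, and the function names are hypothetical stand-ins for illustration; they are not taken from the published ontology or the authors' computational system.

```python
import re

# Hypothetical vocabulary entries: term -> (3ST factor, polarity).
# Illustration only; not drawn from the published controlled vocabulary.
VOCABULARY = {
    "feels hopeless": ("hopelessness", "present"),
    "hopeful about": ("hopelessness", "absent"),
    "lives alone": ("connectedness", "absent"),
    "supportive family": ("connectedness", "present"),
}


def tag_note(note: str):
    """Return (offset, term, factor, polarity) tuples sorted by position."""
    text = note.lower()
    tags = []
    for term, (factor, polarity) in VOCABULARY.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text):
            tags.append((match.start(), term, factor, polarity))
    return sorted(tags)


def precision_recall(machine_labels, truth_labels):
    """Compare two collections of (factor, polarity) labels as sets."""
    machine, truth = set(machine_labels), set(truth_labels)
    true_positives = len(machine & truth)
    precision = true_positives / len(machine) if machine else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Invented note text and invented annotator labels.
note = "Veteran feels hopeless and lives alone, but has a supportive family."
machine = [(factor, polarity) for _, _, factor, polarity in tag_note(note)]
truth = [
    ("hopelessness", "present"),
    ("connectedness", "present"),
    ("psychological pain", "present"),
]
precision, recall = precision_recall(machine, truth)
```

Plain term matching of this kind has no notion of negation or context beyond what is encoded in the vocabulary entries themselves, which is why, in this sketch, polarity is carried by the term rather than inferred from the sentence.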
Affiliation(s)
- Esther L Meerwijk
- VA Health Services Research & Development, Center for Innovation to Implementation (Ci2i), VA Palo Alto Health Care System, Menlo Park, CA, USA
- Gabrielle A Jones
- VA Health Services Research & Development, Center for Innovation to Implementation (Ci2i), VA Palo Alto Health Care System, Menlo Park, CA, USA
- Asqar S Shotqara
- VA Health Services Research & Development, Center for Innovation to Implementation (Ci2i), VA Palo Alto Health Care System, Menlo Park, CA, USA
- Sofia Reyes
- VA Health Services Research & Development, Center for Innovation to Implementation (Ci2i), VA Palo Alto Health Care System, Menlo Park, CA, USA
- Suzanne R Tamang
- VA Health Services Research & Development, Center for Innovation to Implementation (Ci2i), VA Palo Alto Health Care System, Menlo Park, CA, USA; Department of Medicine, Stanford University, Stanford, CA, USA
- Hyrum S Eddington
- VA Health Services Research & Development, Center for Innovation to Implementation (Ci2i), VA Palo Alto Health Care System, Menlo Park, CA, USA; Department of Surgery, Stanford University, Stanford, CA, USA
- Ruth M Reeves
- VA Tennessee Valley Healthcare System, Nashville, TN, USA; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Andrea K Finlay
- VA Health Services Research & Development, Center for Innovation to Implementation (Ci2i), VA Palo Alto Health Care System, Menlo Park, CA, USA; VA National Center on Homelessness Among Veterans, USA; Schar School of Policy and Government, George Mason University, Arlington, VA, USA
- Alex H S Harris
- VA Health Services Research & Development, Center for Innovation to Implementation (Ci2i), VA Palo Alto Health Care System, Menlo Park, CA, USA; Department of Surgery, Stanford University, Stanford, CA, USA