1
Tsui FR, Shi L, Ruiz V, Ryan ND, Biernesser C, Iyengar S, Walsh CG, Brent DA. Natural language processing and machine learning of electronic health records for prediction of first-time suicide attempts. JAMIA Open 2021; 4:ooab011. [PMID: 33758800 PMCID: PMC7966858 DOI: 10.1093/jamiaopen/ooab011] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 12/22/2020] [Revised: 02/02/2021] [Accepted: 02/10/2021] [Indexed: 11/12/2022]
Abstract
OBJECTIVE Limited research exists in predicting first-time suicide attempts that account for two-thirds of suicide decedents. We aimed to predict first-time suicide attempts using a large data-driven approach that applies natural language processing (NLP) and machine learning (ML) to unstructured (narrative) clinical notes and structured electronic health record (EHR) data.
METHODS This case-control study included patients aged 10-75 years who were seen between 2007 and 2016 from emergency departments and inpatient units. Cases were first-time suicide attempts from coded diagnosis; controls were randomly selected without suicide attempts regardless of demographics, following a ratio of nine controls per case. Four data-driven ML models were evaluated using 2-year historical EHR data prior to suicide attempt or control index visits, with prediction windows from 7 to 730 days. Patients without any historical notes were excluded. Model evaluation on accuracy and robustness was performed on a blind dataset (30% cohort).
RESULTS The study cohort included 45 238 patients (5099 cases, 40 139 controls) comprising 54 651 variables from 5.7 million structured records and 798 665 notes. Using both unstructured and structured data resulted in significantly greater accuracy compared to structured data alone (area-under-the-curve [AUC]: 0.932 vs. 0.901, P < .001). The best-predicting model utilized 1726 variables with AUC = 0.932 (95% CI, 0.922-0.941). The model was robust across multiple prediction windows and subgroups by demographics, points of historical most recent clinical contact, and depression diagnosis history.
CONCLUSIONS Our large data-driven approach using both structured and unstructured EHR data demonstrated accurate and robust first-time suicide attempt prediction, and has the potential to be deployed across various populations and clinical settings.
Affiliation(s)
- Fuchiang R Tsui
  - Tsui Laboratory, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
  - Department of Anesthesiology and Critical Care Medicine, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
  - Department of Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
  - Department of Anesthesiology and Critical Care, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Lingyun Shi
  - Tsui Laboratory, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
  - Department of Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Victor Ruiz
  - Tsui Laboratory, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
  - Department of Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Neal D Ryan
  - Department of Psychiatry, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Candice Biernesser
  - Department of Psychiatry, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Satish Iyengar
  - Department of Statistics, School of Arts and Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Colin G Walsh
  - Department of Biomedical Informatics, School of Medicine, Vanderbilt University, Nashville, Tennessee, USA
- David A Brent
  - Department of Psychiatry, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
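The cohort figures reported in the abstract of entry 1 can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts stated there (5099 cases, 40 139 controls, a 30% blind set). The variable names are illustrative, and the size of the held-out set is an assumption, since the abstract does not say how the 30% split was rounded.

```python
# Counts taken from the abstract of entry 1; variable names are illustrative.
cases = 5099
controls = 40139

cohort = cases + controls                 # total patients in the study cohort
case_share = cases / cohort               # fraction of the cohort that are cases
controls_per_case = controls / cases      # observed control-to-case ratio
approx_blind_set = round(0.30 * cohort)   # assumption: simple rounding of 30%

print(cohort)                      # 45238
print(f"{case_share:.1%}")         # 11.3%
print(f"{controls_per_case:.2f}")  # 7.87
print(approx_blind_set)            # 13571
```

The analysed cohort holds roughly 7.9 controls per case rather than the nine-to-one sampling ratio. The abstract does not give the reason; the exclusion of patients without historical notes may contribute.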
2
Huppertz HI. Differenzialdiagnose der Schmerzen am Bewegungsapparat bei Kindern und Jugendlichen [Differential diagnosis of musculoskeletal pain in children and adolescents]. Monatsschr Kinderheilkd 2020. [DOI: 10.1007/s00112-020-00984-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
3
Burke PC, Shirley RB, Raciniewski J, Simon JF, Wyllie R, Fraser TG. Development and Evaluation of a Fully Automated Surveillance System for Influenza-Associated Hospitalization at a Multihospital Health System in Northeast Ohio. Appl Clin Inform 2020; 11:564-569. [PMID: 32851617 DOI: 10.1055/s-0040-1715651] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
Abstract
BACKGROUND Performing high-quality surveillance for influenza-associated hospitalization (IAH) is challenging, time-consuming, and essential.
OBJECTIVES Our objectives were to develop a fully automated surveillance system for laboratory-confirmed IAH at our multihospital health system, to evaluate the performance of the automated system during the 2018 to 2019 influenza season at eight hospitals by comparing its sensitivity and positive predictive value to those of manual surveillance, and to estimate the time and cost savings associated with reliance on the automated surveillance system.
METHODS Infection preventionists (IPs) perform manual surveillance for IAH by reviewing laboratory records and making a determination about each result. For automated surveillance, we programmed a query against our Enterprise Data Vault (EDV) for cases of IAH. The EDV query was established as a dynamic data source to feed our data visualization software, automatically updating every 24 hours. To establish a gold standard of cases of IAH against which to evaluate the performance of manual and automated surveillance systems, we generated a master list of possible IAH by querying four independent information systems. We reviewed medical records and adjudicated whether each possible case represented a true case of IAH.
RESULTS We found 844 true cases of IAH, 577 (68.4%) of which were detected by the manual system and 774 (91.7%) of which were detected by the automated system. The positive predictive values of the manual and automated systems were 89.3% and 88.3%, respectively. Relying on the automated surveillance system for IAH resulted in an average recoup of 82 minutes per day for each IP and an estimated system-wide payroll redirection of $32,880 over the four heaviest weeks of influenza activity.
CONCLUSION Surveillance for IAH can be entirely automated at multihospital health systems, saving time and money while improving case detection.
Affiliation(s)
- Patrick C Burke
  - Department of Infection Prevention, Enterprise Quality and Patient Safety, Cleveland Clinic, Cleveland, Ohio, United States
- Rachel Benish Shirley
  - Enterprise Quality and Patient Safety, Cleveland Clinic, Cleveland, Ohio, United States
- Jacob Raciniewski
  - Department of Enterprise Analytics, Cleveland Clinic, Cleveland, Ohio, United States
- James F Simon
  - Medical Operations Department, Cleveland Clinic, Cleveland, Ohio, United States
- Robert Wyllie
  - Medical Operations Department, Cleveland Clinic, Cleveland, Ohio, United States
- Thomas G Fraser
  - Department of Infectious Diseases, Cleveland Clinic, Cleveland, Ohio, United States
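The sensitivity figures quoted in the abstract of entry 3 follow directly from the reported counts, and the positive predictive values imply an approximate number of flagged cases. The sketch below is a minimal check under those assumptions: the function names are hypothetical, and the implied totals are only estimates because the published PPVs are rounded to one decimal place.

```python
def sensitivity(detected_true_cases: int, all_true_cases: int) -> float:
    """Share of gold-standard cases that a surveillance system detected."""
    return detected_true_cases / all_true_cases


def implied_flagged(detected_true_cases: int, ppv: float) -> float:
    """Total cases a system flagged, solved from PPV = TP / (TP + FP)."""
    return detected_true_cases / ppv


TRUE_CASES = 844                     # gold-standard IAH cases from the abstract
MANUAL_TP, AUTOMATED_TP = 577, 774   # true cases detected by each system

print(f"{sensitivity(MANUAL_TP, TRUE_CASES):.1%}")     # 68.4%
print(f"{sensitivity(AUTOMATED_TP, TRUE_CASES):.1%}")  # 91.7%

# Estimates only: the PPVs (89.3% and 88.3%) are rounded in the source.
print(round(implied_flagged(MANUAL_TP, 0.893)))        # 646
print(round(implied_flagged(AUTOMATED_TP, 0.883)))     # 877
```

On these numbers the automated system detects 197 more true cases than manual review while its PPV is one percentage point lower.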
4
Aronis JM, Ferraro JP, Gesteland PH, Tsui F, Ye Y, Wagner MM, Cooper GF. A Bayesian approach for detecting a disease that is not being modeled. PLoS One 2020; 15:e0229658. [PMID: 32109254 PMCID: PMC7048291 DOI: 10.1371/journal.pone.0229658] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/04/2019] [Accepted: 02/12/2020] [Indexed: 11/19/2022]
Abstract
Over the past decade, outbreaks of new or reemergent viruses such as severe acute respiratory syndrome (SARS) virus, Middle East respiratory syndrome (MERS) virus, and Zika have claimed thousands of lives and cost governments and healthcare systems billions of dollars. Because the appearance of new or transformed diseases is likely to continue, the detection and characterization of emergent diseases is an important problem. We describe a Bayesian statistical model that can detect and characterize previously unknown and unmodeled diseases from patient-care reports and evaluate its performance on historical data.
Affiliation(s)
- John M. Aronis
  - Real-time Outbreak and Disease Surveillance (RODS) Laboratory, Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Jeffrey P. Ferraro
  - Department of Biomedical Informatics, University of Utah, Salt Lake City, Utah, United States of America
- Per H. Gesteland
  - Department of Biomedical Informatics, University of Utah, Salt Lake City, Utah, United States of America
- Fuchiang Tsui
  - Real-time Outbreak and Disease Surveillance (RODS) Laboratory, Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Ye Ye
  - Real-time Outbreak and Disease Surveillance (RODS) Laboratory, Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Michael M. Wagner
  - Real-time Outbreak and Disease Surveillance (RODS) Laboratory, Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Gregory F. Cooper
  - Real-time Outbreak and Disease Surveillance (RODS) Laboratory, Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America