1
Chafjiri FMA, Reece L, Voke L, Landschaft A, Clark J, Kimia AA, Loddenkemper T. Natural language processing for identification of refractory status epilepticus in children. Epilepsia 2023; 64:3227-3237. [PMID: 37804085 DOI: 10.1111/epi.17789] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2023] [Revised: 10/03/2023] [Accepted: 10/03/2023] [Indexed: 10/08/2023]
Abstract
OBJECTIVE: Pediatric status epilepticus is one of the most frequent pediatric emergencies, with high mortality and morbidity. Utilizing electronic health records (EHRs) permits analysis of care approaches and disease outcomes at a lower cost than prospective research. However, reviewing EHR manually is time intensive. We aimed to compare refractory status epilepticus (rSE) cases identified by human EHR review with a natural language processing (NLP)-assisted rSE screen followed by a manual review.
METHODS: We used the NLP screening tool Document Review Tool (DrT) to generate regular expressions, trained a bag-of-words NLP classifier on EHRs from 2017 to 2019, and then tested our algorithm on data from February to December 2012. We compared results from manual review to NLP-assisted search followed by manual review.
RESULTS: Our algorithm identified 1528 notes in the test set. After removing notes pertaining to the same event by DrT, the user reviewed a total number of 400 notes to find patients with rSE. Within these 400 notes, we identified 31 rSE cases, including 12 new cases not found in manual review, and 19 of the 20 previously identified cases. The NLP-assisted model found 31 of 32 cases, with a sensitivity of 96.88% (95% CI = 82%-99.84%), whereas manual review identified 20 of 32 cases, with a sensitivity of 62.5% (95% CI = 43.75%-78.34%).
SIGNIFICANCE: DrT provided a highly sensitive model compared to human review and an increase in patient identification through EHRs. The use of DrT is a suitable application of NLP for identifying patients with a history of recent rSE, which ultimately contributes to the implementation of monitoring techniques and treatments in near real time.
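The sensitivities reported in this abstract can be reproduced from the stated case counts. The sketch below is illustrative only: the case IDs are placeholders, and it assumes that the 32-case reference standard is the union of cases found by either approach, which is consistent with the reported counts (20 manual cases plus 12 new NLP-assisted cases) but is not spelled out in the abstract. The function name `sensitivity` is hypothetical and is not part of DrT.

```python
def sensitivity(found: set, reference: set) -> float:
    """Fraction of reference cases that a method identified."""
    if not reference:
        raise ValueError("reference set is empty")
    return len(found & reference) / len(reference)


# Placeholder case IDs; only the counts come from the abstract.
manual = set(range(0, 20))  # 20 cases found by manual review
# 19 of the 20 manually found cases plus 12 new cases = 31 cases
nlp_assisted = set(range(1, 20)) | set(range(20, 32))
# Assumption: the reference standard is every case found by either approach.
reference = manual | nlp_assisted

nlp_sens = sensitivity(nlp_assisted, reference)
manual_sens = sensitivity(manual, reference)
print(f"NLP-assisted review: {len(nlp_assisted)} of {len(reference)} cases, {nlp_sens:.2%}")
print(f"Manual review:       {len(manual)} of {len(reference)} cases, {manual_sens:.2%}")
```

The confidence intervals quoted in the abstract are not recomputed here because the abstract does not state which interval method was used.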
Affiliation(s)
- Fatemeh Mohammad Alizadeh Chafjiri
  - Department of Neurology, Division of Epilepsy and Clinical Neurophysiology, Boston Children's Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Latania Reece
  - Department of Neurology, Division of Epilepsy and Clinical Neurophysiology, Boston Children's Hospital, Harvard Medical School, Boston, Massachusetts, USA
  - Nexamp, Boston, Massachusetts, USA
- Lillian Voke
  - Department of Neurology, Division of Epilepsy and Clinical Neurophysiology, Boston Children's Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Justice Clark
  - Department of Neurology, Division of Epilepsy and Clinical Neurophysiology, Boston Children's Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Amir A Kimia
  - Department of Medicine, Division of Emergency Medicine, Boston Children's Hospital, Harvard Medical School, Boston, Massachusetts, USA
  - Connecticut Children's Hospital, Hartford, Connecticut, USA
- Tobias Loddenkemper
  - Department of Neurology, Division of Epilepsy and Clinical Neurophysiology, Boston Children's Hospital, Harvard Medical School, Boston, Massachusetts, USA
2
Cloud Services for Patient Cohort Identification Using the Informatics for Integrating Biology and the Bedside Platform. BioMed Research International 2020; 2020:2851713. [PMID: 32724799 PMCID: PMC7366204 DOI: 10.1155/2020/2851713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2019] [Revised: 06/08/2020] [Accepted: 06/15/2020] [Indexed: 11/17/2022]
Abstract
Despite the widespread use of the “Informatics for Integrating Biology and the Bedside” (i2b2) platform, there are substantial challenges for loading electronic health records (EHR) into i2b2 and for querying i2b2. We have previously presented a simplified framework for semantic abstraction of EHR records into i2b2. Building on our previous work, we have created a proof-of-concept implementation of cloud services on an i2b2 data store for cohort identification. Specifically, we have implemented a graphical user interface (GUI) that declares the key components for data import, transformation, and query of EHR data. The GUI integrates with Azure cloud services to create data pipelines for importing EHR data into i2b2, creation of derived facts, and querying for generating Sankey-like flow diagrams that characterize the patient cohorts. We have evaluated the implementation using the real-world MIMIC-III dataset. We discuss the key features of this implementation and direction for future work, which will advance the efforts of the research community for patient cohort identification.
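A Sankey-like cohort diagram is driven by counts of patients moving between consecutive stages. The sketch below shows one way such link counts could be derived; it is not the authors' implementation and does not use i2b2 or Azure. The stage names, patient IDs, and the function `flow_links` are all hypothetical.

```python
from collections import Counter


def flow_links(paths):
    """Count patients moving between consecutive stages as (source, target) links."""
    links = Counter()
    for stages in paths.values():
        # Pair each stage with the one that follows it in the patient's path.
        for src, dst in zip(stages, stages[1:]):
            links[(src, dst)] += 1
    return links


# Hypothetical patient paths through successive cohort criteria.
paths = {
    "p1": ["admitted", "diabetes", "on_insulin"],
    "p2": ["admitted", "diabetes", "no_insulin"],
    "p3": ["admitted", "no_diabetes"],
}

links = flow_links(paths)
for (src, dst), n in sorted(links.items()):
    print(f"{src} -> {dst}: {n}")
```

Each `(source, target)` count corresponds to the width of one band in the diagram, so the same table can be passed to any plotting library that draws Sankey charts.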
3
Spengler H, Lang C, Mahapatra T, Gatz I, Kuhn KA, Prasser F. Enabling Agile Clinical and Translational Data Warehousing: Platform Development and Evaluation. JMIR Med Inform 2020; 8:e15918. [PMID: 32706673 PMCID: PMC7404007 DOI: 10.2196/15918] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/17/2019] [Revised: 02/16/2020] [Accepted: 05/06/2020] [Indexed: 01/16/2023]
Abstract
Background: Modern data-driven medical research provides new insights into the development and course of diseases and enables novel methods of clinical decision support. Clinical and translational data warehouses, such as Informatics for Integrating Biology and the Bedside (i2b2) and tranSMART, are important infrastructure components that provide users with unified access to the large heterogeneous data sets needed to realize this and support use cases such as cohort selection, hypothesis generation, and ad hoc data analysis.
Objective: Often, different warehousing platforms are needed to support different use cases and different types of data. Moreover, to achieve an optimal data representation within the target systems, specific domain knowledge is needed when designing data-loading processes. Consequently, informaticians need to work closely with clinicians and researchers in short iterations. This is a challenging task as installing and maintaining warehousing platforms can be complex and time consuming. Furthermore, data loading typically requires significant effort in terms of data preprocessing, cleansing, and restructuring. The platform described in this study aims to address these challenges.
Methods: We formulated system requirements to achieve agility in terms of platform management and data loading. The derived system architecture includes a cloud infrastructure with unified management interfaces for multiple warehouse platforms and a data-loading pipeline with a declarative configuration paradigm and meta-loading approach. The latter compiles data and configuration files into forms required by existing loading tools, thereby automating a wide range of data restructuring and cleansing tasks. We demonstrated the fulfillment of the requirements and the originality of our approach by an experimental evaluation and a comparison with previous work.
Results: The platform supports both i2b2 and tranSMART with built-in security. Our experiments showed that the loading pipeline accepts input data that cannot be loaded with existing tools without preprocessing. Moreover, it lowered efforts significantly, reducing the size of configuration files required by factors of up to 22 for tranSMART and 1135 for i2b2. The time required to perform the compilation process was roughly equivalent to the time required for actual data loading. Comparison with other tools showed that our solution was the only tool fulfilling all requirements.
Conclusions: Our platform significantly reduces the efforts required for managing clinical and translational warehouses and for loading data in various formats and structures, such as complex entity-attribute-value structures often found in laboratory data. Moreover, it facilitates the iterative refinement of data representations in the target platforms, as the required configuration files are very compact. The quantitative measurements presented are consistent with our experiences of significantly reduced efforts for building warehousing platforms in close cooperation with medical researchers. Both the cloud-based hosting infrastructure and the data-loading pipeline are available to the community as open source software with comprehensive documentation.
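The idea of compiling a compact declarative configuration into the verbose form that an existing loading tool expects can be illustrated with a small sketch. This is not the platform's actual configuration format or API: the rule syntax, the concept paths, and the function `compile_mapping` are hypothetical, chosen only to show how a few pattern rules could expand into one explicit mapping entry per column.

```python
from fnmatch import fnmatch


def compile_mapping(columns, rules):
    """Expand compact (pattern, concept_path) rules into one explicit entry per column."""
    mapping = []
    for col in columns:
        for pattern, concept_path in rules:
            if fnmatch(col, pattern):
                mapping.append({"column": col, "concept": f"{concept_path}/{col}"})
                break  # the first matching rule wins
    return mapping


# Hypothetical input columns and a two-rule declarative configuration.
columns = ["lab_hb", "lab_crp", "age"]
rules = [
    ("lab_*", "Labs"),      # every laboratory column goes under Labs
    ("*", "Demographics"),  # fallback for everything else
]

mapping = compile_mapping(columns, rules)
for entry in mapping:
    print(entry)
```

Two rules expand into three explicit entries here; with hundreds of columns the same two rules would still suffice, which is the kind of configuration-size reduction the abstract describes.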
Affiliation(s)
- Helmut Spengler
  - Institute of Medical Informatics, Statistics and Epidemiology, University Medical Center rechts der Isar, School of Medicine, Technical University of Munich, Munich, Germany
- Claudia Lang
  - Institute of Medical Informatics, Statistics and Epidemiology, University Medical Center rechts der Isar, School of Medicine, Technical University of Munich, Munich, Germany
- Tanmaya Mahapatra
  - Institute of Medical Informatics, Statistics and Epidemiology, University Medical Center rechts der Isar, School of Medicine, Technical University of Munich, Munich, Germany
- Ingrid Gatz
  - Institute of Medical Informatics, Statistics and Epidemiology, University Medical Center rechts der Isar, School of Medicine, Technical University of Munich, Munich, Germany
- Klaus A Kuhn
  - Institute of Medical Informatics, Statistics and Epidemiology, University Medical Center rechts der Isar, School of Medicine, Technical University of Munich, Munich, Germany
- Fabian Prasser
  - Charité - Universitätsmedizin Berlin, Berlin, Germany
  - Berlin Institute of Health, Berlin, Germany
4
Weiss RJ, Bates SV, Song Y, Zhang Y, Herzberg EM, Chen YC, Gong M, Chien I, Zhang L, Murphy SN, Gollub RL, Grant PE, Ou Y. Mining multi-site clinical data to develop machine learning MRI biomarkers: application to neonatal hypoxic ischemic encephalopathy. J Transl Med 2019; 17:385. [PMID: 31752923 PMCID: PMC6873573 DOI: 10.1186/s12967-019-2119-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 07/24/2019] [Accepted: 10/31/2019] [Indexed: 12/13/2022]
Abstract
BACKGROUND: Secondary and retrospective use of hospital-hosted clinical data provides a time- and cost-efficient alternative to prospective clinical trials for biomarker development. This study aims to create a retrospective clinical dataset of Magnetic Resonance Images (MRI) and clinical records of neonatal hypoxic ischemic encephalopathy (HIE), from which clinically relevant analytic algorithms can be developed for MRI-based HIE lesion detection and outcome prediction.
METHODS: This retrospective study will use clinical registries and big data informatics tools to build a multi-site dataset that contains structural and diffusion MRI, clinical information including hospital course, short-term outcomes (during infancy), and long-term outcomes (~2 years of age) for at least 300 patients from multiple hospitals.
DISCUSSION: Within machine learning frameworks, we will test whether the quantified deviation from our recently developed normative brain atlases can detect abnormal regions and predict outcomes for individual patients as accurately as, or even more accurately than, human experts.
Trial Registration: Not applicable. This study protocol mines existing clinical data and thus does not meet the ICMJE definition of a clinical trial that requires registration.
Affiliation(s)
- Rebecca J Weiss
  - Division of Newborn Medicine, Department of Pediatrics, Massachusetts General Hospital, Harvard Medical School, Boston, MA, 02114, USA
- Sara V Bates
  - Division of Newborn Medicine, Department of Pediatrics, Massachusetts General Hospital, Harvard Medical School, Boston, MA, 02114, USA
- Ya'nan Song
  - Fetal Neonatal Neuroimaging and Developmental Science Center (FNNDSC), Boston Children's Hospital, Harvard Medical School, 401 Park Drive, Landmark Center 7022, Boston, MA, 02115, USA
- Yue Zhang
  - Fetal Neonatal Neuroimaging and Developmental Science Center (FNNDSC), Boston Children's Hospital, Harvard Medical School, 401 Park Drive, Landmark Center 7022, Boston, MA, 02115, USA
- Emily M Herzberg
  - Division of Newborn Medicine, Department of Pediatrics, Massachusetts General Hospital, Harvard Medical School, Boston, MA, 02114, USA
- Yih-Chieh Chen
  - Division of Newborn Medicine, Department of Pediatrics, Massachusetts General Hospital, Harvard Medical School, Boston, MA, 02114, USA
- Maryann Gong
  - Computer Science & Artificial Intelligence Lab (CSAIL), Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Isabel Chien
  - Computer Science & Artificial Intelligence Lab (CSAIL), Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Lily Zhang
  - Computer Science & Artificial Intelligence Lab (CSAIL), Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Shawn N Murphy
  - Laboratory of Computer Science, Massachusetts General Hospital, Harvard Medical School, Boston, MA, 02114, USA
- Randy L Gollub
  - Department of Psychiatry and Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, 02114, USA
- P Ellen Grant
  - Fetal Neonatal Neuroimaging and Developmental Science Center (FNNDSC), Boston Children's Hospital, Harvard Medical School, 401 Park Drive, Landmark Center 7022, Boston, MA, 02115, USA
  - Neuroradiology Division, Department of Radiology, Boston Children's Hospital, Harvard Medical School, Boston, MA, 02115, USA
- Yangming Ou
  - Fetal Neonatal Neuroimaging and Developmental Science Center (FNNDSC), Boston Children's Hospital, Harvard Medical School, 401 Park Drive, Landmark Center 7022, Boston, MA, 02115, USA
  - Neuroradiology Division, Department of Radiology, Boston Children's Hospital, Harvard Medical School, Boston, MA, 02115, USA
  - Computational Health Informatics Program (CHIP), Boston Children's Hospital, Harvard Medical School, Boston, MA, 02115, USA
5
Wagholikar KB, Fischer CM, Goodson AP, Herrick CD, Maclean TE, Smith KV, Fera L, Gaziano TA, Dunning JR, Bosque-Hamilton J, Matta L, Toscano E, Richter B, Ainsworth L, Oates MF, Aronson S, MacRae CA, Scirica BM, Desai AS, Murphy SN. Phenotyping to Facilitate Accrual for a Cardiovascular Intervention. J Clin Med Res 2019; 11:458-463. [PMID: 31143314 PMCID: PMC6522233 DOI: 10.14740/jocmr3830] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/02/2019] [Accepted: 04/30/2019] [Indexed: 01/29/2023]
Abstract
Background: The conventional approach for clinical studies is to identify a cohort of potentially eligible patients and then screen for enrollment. In an effort to reduce the cost and manual effort involved in the screening process, several studies have leveraged electronic health records (EHR) to refine cohorts to better match the eligibility criteria, which is referred to as phenotyping. We extend this approach to dynamically identify a cohort by repeating phenotyping in alternation with manual screening.
Methods: Our approach consists of multiple screen cycles. At the start of each cycle, the phenotyping algorithm is used to identify eligible patients from the EHR, creating an ordered list such that patients that are most likely eligible are listed first. This list is then manually screened, and the results are analyzed to improve the phenotyping for the next cycle. We describe the preliminary results and challenges in the implementation of this approach for an intervention study on heart failure.
Results: A total of 1,022 patients were screened, with 223 (23%) of patients being found eligible for enrollment into the intervention study. The iterative approach improved the phenotyping in each screening cycle. Without an iterative approach, the positive screening rate (PSR) was expected to dip below the 20% measured in the first cycle; however, the cyclical approach increased the PSR to 23%.
Conclusions: Our study demonstrates that dynamic phenotyping can facilitate recruitment for prospective clinical study. Future directions include improved informatics infrastructure and governance policies to enable real-time updates to research repositories, tooling for EHR annotation, and methodologies to reduce human annotation.
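The screen cycle described in the Methods (rank candidates, manually screen the top of the list, use the results to improve the ranking) can be sketched as a simple loop. Everything below is an assumption made for illustration: the abstract does not say how the phenotyping algorithm scores patients or how it was refined, so the feature names, the weight-update rule, the toy patient pool, and the `manual_screen` stand-in for human review are all hypothetical.

```python
from collections import defaultdict


def rank(candidates, weights):
    """Order candidates so the highest-scoring (most likely eligible) come first."""
    return sorted(candidates, key=lambda p: -sum(weights.get(f, 0.0) for f in p["features"]))


def update_weights(screened):
    """Assumed rule: a feature's weight is the eligibility rate among screened patients with it."""
    seen, eligible = defaultdict(int), defaultdict(int)
    for patient, is_eligible in screened:
        for f in patient["features"]:
            seen[f] += 1
            eligible[f] += int(is_eligible)
    return {f: eligible[f] / seen[f] for f in seen}


def run_cycles(pool, manual_screen, batch_size, n_cycles):
    """Alternate phenotyping and manual screening; return the positive screening rate per cycle."""
    weights, screened, rates = {}, [], []
    remaining = list(pool)
    for _ in range(n_cycles):
        batch = rank(remaining, weights)[:batch_size]
        if not batch:
            break
        results = [(p, manual_screen(p)) for p in batch]
        screened.extend(results)
        batch_ids = {p["id"] for p in batch}
        remaining = [p for p in remaining if p["id"] not in batch_ids]
        rates.append(sum(int(ok) for _, ok in results) / len(results))
        weights = update_weights(screened)  # refine the phenotyping for the next cycle
    return rates


# Toy pool; "low_ef" and "hf_code" are made-up EHR features.
pool = [
    {"id": 1, "features": {"hf_code"}},
    {"id": 2, "features": {"hf_code", "low_ef"}},
    {"id": 3, "features": {"hf_code"}},
    {"id": 4, "features": {"low_ef"}},
    {"id": 5, "features": {"hf_code", "low_ef"}},
    {"id": 6, "features": {"hf_code"}},
]


def manual_screen(patient):
    """Stand-in for human review: in this toy data, eligibility tracks low_ef."""
    return "low_ef" in patient["features"]


rates = run_cycles(pool, manual_screen, batch_size=2, n_cycles=2)
print(rates)
```

In this toy run the first cycle screens patients in pool order because no weights exist yet, and the second cycle screens the patients that the updated weights rank highest.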
Affiliation(s)
- Kavishwar B Wagholikar
  - Harvard Medical School, Boston, MA, USA
  - Massachusetts General Hospital, Boston, MA, USA
- Lina Matta
  - Brigham and Women's Hospital, Boston, MA, USA
- Calum A MacRae
  - Harvard Medical School, Boston, MA, USA
  - Brigham and Women's Hospital, Boston, MA, USA
- Benjamin M Scirica
  - Harvard Medical School, Boston, MA, USA
  - Brigham and Women's Hospital, Boston, MA, USA
- Akshay S Desai
  - Harvard Medical School, Boston, MA, USA
  - Brigham and Women's Hospital, Boston, MA, USA
- Shawn N Murphy
  - Harvard Medical School, Boston, MA, USA
  - Massachusetts General Hospital, Boston, MA, USA
6
Wagholikar KB, Ainsworth L, Vernekar VP, Pathak A, Glynn C, Zelle D, Zagade A, Karipineni N, Herrick CD, McPartlin M, Bui TV, Mendis M, Klann J, Oates M, Gordon W, Cannon C, Patel R, Aronson SJ, MacRae CA, Scirica BM, Murphy SN. Extending i2b2 into a framework for semantic abstraction of EHR to facilitate rapid development and portability of Health IT applications. AMIA Joint Summits on Translational Science Proceedings 2019; 2019:370-378. [PMID: 31258990 PMCID: PMC6568124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
The wide gap between a care provider's conceptualization of the electronic health record (EHR) and the structures for EHR data storage and transmission presents a multitude of obstacles for the development of innovative Health IT applications. While developers model the EHR view of the clinicians at one end, they work with a different data view to construct Health IT applications. Although there has been considerable progress in bridging this gap through the evolution of developer-friendly standards and tools for terminology mapping and data warehousing, there is a need for a simplified framework to facilitate development of interoperable applications. To this end, we propose a framework for creating a layer of semantic abstraction on the EHR and describe preliminary work on the implementation of this framework for management of hyperlipidemia and hypertension. Our goal is to facilitate the rapid development and portability of Health IT applications.
Affiliation(s)
- Kavishwar B Wagholikar
  - Harvard Medical School, Boston, MA
  - Massachusetts General Hospital, Boston, MA
  - Partners Healthcare, Boston, MA
- Neelima Karipineni
  - Harvard Medical School, Boston, MA
  - Brigham and Women's Hospital, Boston, MA
- Marian McPartlin
  - Harvard Medical School, Boston, MA
  - Brigham and Women's Hospital, Boston, MA
  - Massachusetts General Hospital, Boston, MA
  - Persistent Systems, Pune, India
  - Partners Healthcare, Boston, MA
- Tiffany V Bui
  - Harvard Medical School, Boston, MA
  - Brigham and Women's Hospital, Boston, MA
  - Massachusetts General Hospital, Boston, MA
  - Persistent Systems, Pune, India
  - Partners Healthcare, Boston, MA
- Jeffery Klann
  - Harvard Medical School, Boston, MA
  - Massachusetts General Hospital, Boston, MA
  - Partners Healthcare, Boston, MA
- Christopher Cannon
  - Harvard Medical School, Boston, MA
  - Brigham and Women's Hospital, Boston, MA
  - Massachusetts General Hospital, Boston, MA
  - Persistent Systems, Pune, India
  - Partners Healthcare, Boston, MA
- Calum A MacRae
  - Harvard Medical School, Boston, MA
  - Brigham and Women's Hospital, Boston, MA
- Benjamin M Scirica
  - Harvard Medical School, Boston, MA
  - Brigham and Women's Hospital, Boston, MA
- Shawn N Murphy
  - Harvard Medical School, Boston, MA
  - Massachusetts General Hospital, Boston, MA