1
Peet ED, Schultz D, Lovejoy S, Tsui FR. The infant health effects of doulas: Leveraging big data and machine learning to inform cost-effective targeting. Health Econ 2024; 33:1387-1411. [PMID: 38462670 DOI: 10.1002/hec.4821] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2022] [Revised: 12/14/2023] [Accepted: 02/19/2024] [Indexed: 03/12/2024]
Abstract
Doula services represent an underutilized maternal and child health intervention with the potential to improve outcomes through the provision of physical, emotional, and informational support. However, there is limited evidence of the infant health effects of doulas despite well-established connections between maternal and infant health. Moreover, because the availability of doulas is limited and often not covered by insurers, existing evidence leaves unclear if or how doula services should be allocated to achieve the greatest improvements in outcomes. We use unique data and machine learning to develop accurate predictive models of infant health and doula service participation. We then combine these predictive models within the double machine learning method to estimate the effects of doula services. We show that while doula services reduce risk on average, the benefits of doula services increase as the risk of negative infant health outcomes increases. We compare these benefits to the costs of doula services under alternative allocation schemes and show that leveraging the risk predictions dramatically increases the cost effectiveness of doula services. Our results show the potential of big data and novel analytic methods to provide cost-effective support to those at greatest risk of poor outcomes.
Affiliation(s)
- Evan D Peet
  - RAND Corporation, Pittsburgh, Pennsylvania, USA
- Fuchiang Rich Tsui
  - Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
2
Chen CC, Massey SL, Kirschen MP, Yuan I, Padiyath A, Simpao AF, Tsui FR. Electroencephalogram-based machine learning models to predict neurologic outcome after cardiac arrest: A systematic review. Resuscitation 2024; 194:110049. [PMID: 37972682 PMCID: PMC11023717 DOI: 10.1016/j.resuscitation.2023.110049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 11/07/2023] [Accepted: 11/09/2023] [Indexed: 11/19/2023]
Abstract
AIM OF THE REVIEW The primary aim of this systematic review was to investigate the most common electroencephalogram (EEG)-based machine learning (ML) model with the highest Area Under Receiver Operating Characteristic Curve (AUC) in two ML categories, conventional ML and Deep Neural Network (DNN), to predict the neurologic outcomes after cardiac arrest; the secondary aim was to investigate common EEG features applied to ML models. METHODS Systematic search of medical literature from PubMed and engineering literature from Compendex up to June 2, 2023. One reviewer screened studies that used EEG-based ML models to predict the neurologic outcomes after cardiac arrest. Four reviewers validated that the studies met selection criteria. Nine variables were manually extracted. The top-five common EEG features were calculated. We evaluated each study's risk of bias using the Quality in Prognosis Studies guideline. RESULTS Out of 351 identified studies, 17 studies met the inclusion criteria. Random Forest (RF) (n = 7) was the most common ML model in the conventional ML category (n = 11), followed by Convolutional Neural Network (CNN) (n = 4) in the DNN category (n = 6). The AUCs for RF ranged between 0.8 and 0.97, while CNN had AUCs between 0.7 and 0.92. The top-three commonly used EEG features were band power (n = 12), Shannon's Entropy (n = 11), burst-suppression ratio (n = 9). CONCLUSIONS RF and CNN were the two most common ML models with the highest AUCs for predicting the neurologic outcomes after cardiac arrest. Using a multimodal model that combines EEG features and electronic health record data may further improve prognostic performance.
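The review above compares models by AUC. As a reminder of what that number measures, here is a minimal sketch that computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name, labels, and scores are hypothetical and are not taken from any of the reviewed studies.

```python
def auc(labels, scores):
    """AUC via pairwise comparison: P(score of positive > score of negative), ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = poor neurologic outcome, scores are model outputs.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
print(round(auc(labels, scores), 3))  # 8 of 9 positive/negative pairs are ordered correctly
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; library implementations typically use a rank-based computation instead.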
Affiliation(s)
- Chao-Chen Chen
  - Tsui Laboratory, Children's Hospital of Philadelphia, 734 Schuylkill Ave, Philadelphia, PA 19146, United States; Department of Bioengineering, University of Pennsylvania, 240 Skirkanich Hall, 210 S 33rd St, Philadelphia, PA 19104, United States
- Shavonne L Massey
  - Department of Neurology and Pediatrics, Children's Hospital of Philadelphia, 3401 Civic Center Blvd, Philadelphia, PA 19104, United States; Perelman School of Medicine, University of Pennsylvania, 3400 Civic Center Blvd, Philadelphia, PA 19104, United States
- Matthew P Kirschen
  - Department of Neurology and Pediatrics, Children's Hospital of Philadelphia, 3401 Civic Center Blvd, Philadelphia, PA 19104, United States; Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, 3401 Civic Center Blvd, Philadelphia, PA 19104, United States; Perelman School of Medicine, University of Pennsylvania, 3400 Civic Center Blvd, Philadelphia, PA 19104, United States
- Ian Yuan
  - Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, 3401 Civic Center Blvd, Philadelphia, PA 19104, United States; Perelman School of Medicine, University of Pennsylvania, 3400 Civic Center Blvd, Philadelphia, PA 19104, United States
- Asif Padiyath
  - Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, 3401 Civic Center Blvd, Philadelphia, PA 19104, United States; Perelman School of Medicine, University of Pennsylvania, 3400 Civic Center Blvd, Philadelphia, PA 19104, United States
- Allan F Simpao
  - Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, 3401 Civic Center Blvd, Philadelphia, PA 19104, United States; Perelman School of Medicine, University of Pennsylvania, 3400 Civic Center Blvd, Philadelphia, PA 19104, United States
- Fuchiang Rich Tsui
  - Tsui Laboratory, Children's Hospital of Philadelphia, 734 Schuylkill Ave, Philadelphia, PA 19146, United States; Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, 3401 Civic Center Blvd, Philadelphia, PA 19104, United States; Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, 423 Guardian Dr, Philadelphia, PA 19104, United States; Perelman School of Medicine, University of Pennsylvania, 3400 Civic Center Blvd, Philadelphia, PA 19104, United States.
3
van de Kamp E, Ma J, Monangi N, Tsui FR, Jani SG, Kim JH, Kahn RS, Wang CJ. Addressing Health-Related Social Needs and Mental Health Needs in the Neonatal Intensive Care Unit: Exploring Challenges and the Potential of Technology. Int J Environ Res Public Health 2023; 20:7161. [PMID: 38131713 PMCID: PMC10742453 DOI: 10.3390/ijerph20247161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Revised: 11/21/2023] [Accepted: 12/06/2023] [Indexed: 12/23/2023]
Abstract
Unaddressed health-related social needs (HRSNs) and parental mental health needs in an infant's environment can negatively affect their health outcomes. This study examines the challenges and potential technological solutions for addressing these needs in the neonatal intensive care unit (NICU) setting and beyond. In all, 22 semistructured interviews were conducted with members of the NICU care team and other relevant stakeholders, based on an interpretive description approach. The participants were selected from three safety net hospitals in the U.S. with level IV NICUs. The challenges identified include navigating the multitude of burdens families in the NICU experience, resource constraints within and beyond the health system, a lack of streamlined or consistent processes, no closed-loop referrals to track status and outcomes, and gaps in support postdischarge. Opportunities for leveraging technology to facilitate screening and referral include automating screening, initiating risk-based referrals, using remote check-ins, facilitating resource navigation, tracking referrals, and providing language support. However, technological implementations should avoid perpetuating disparities and consider potential privacy or data-sharing concerns. Although advances in technological health tools alone cannot address all the challenges, they have the potential to offer dynamic tools to support the healthcare setting in identifying and addressing the unique needs and circumstances of each family in the NICU.
Affiliation(s)
- Eline van de Kamp
  - Athena Institute, Faculty of Science, Vrije Universiteit Amsterdam, 1081 HV Amsterdam, The Netherlands
- Jasmin Ma
  - Center for Policy, Outcomes, and Prevention, Department of Pediatrics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Nagendra Monangi
  - Division of Neonatology, Perinatal Institute, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229, USA
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
- Fuchiang Rich Tsui
  - Tsui Laboratory, Department of Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, PA 19146, USA
  - Department of Anesthesiology and Critical Care Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Shilpa G. Jani
  - Center for Policy, Outcomes, and Prevention, Department of Pediatrics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Jae H. Kim
  - Division of Neonatology, Perinatal Institute, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229, USA
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
- Robert S. Kahn
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
  - Michael Fisher Child Health Equity Center, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229, USA
- C. Jason Wang
  - Center for Policy, Outcomes, and Prevention, Department of Pediatrics, Stanford University School of Medicine, Stanford, CA 94305, USA
  - Department of Pediatrics and Department of Health Policy, Stanford University School of Medicine, Stanford, CA 94305, USA
4
Isserman RS, Yuan I, Elliott EM, Muhly WT, Iyer RS, Farrell HA, Varallo DA, Georgostathi G, Richter AG, Stiso J, Tsui FR, Feldman JM. Reducing the environmental impact of mask inductions in children: A quality improvement report. Paediatr Anaesth 2023; 33:728-735. [PMID: 37203788 DOI: 10.1111/pan.14695] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/09/2023] [Revised: 05/04/2023] [Accepted: 05/08/2023] [Indexed: 05/20/2023]
Abstract
BACKGROUND Inhalational anesthetic agents are potent greenhouse gases with global warming potential that far exceeds that of carbon dioxide. Traditionally, pediatric inhalation inductions are achieved with a volatile anesthetic delivered to the patient in oxygen and nitrous oxide at high fresh gas flows. While contemporary volatile anesthetics and anesthesia machines allow for a more environmentally conscious induction, practice has not changed. We aimed to reduce the environmental impact of our inhalation inductions by decreasing the use of nitrous oxide and fresh gas flows. METHODS Through a series of four plan-do-study-act cycles, the improvement team used content experts to demonstrate the environmental impact of the current inductions and to provide practical ways to reduce this, by focusing on nitrous oxide use and fresh gas flows, with visual reminders introduced at point of delivery. The primary measures were the percentage of inhalation inductions that used nitrous oxide and the maximum fresh gas flows/kg during the induction period. Statistical process control charts were used to measure improvement over time. RESULTS 33 285 inhalation inductions were included over a 20-month period. Nitrous oxide use decreased from 80% to <20% and maximum fresh gas flows/kg decreased from a rate of 0.53 L/min/kg to 0.38 L/min/kg, an overall reduction of 28%. Reduction in fresh gas flows was greatest in the lightest weight groups. Induction times and behaviors remained unchanged over the duration of this project. CONCLUSIONS Our quality improvement group decreased the environmental impact of inhalation inductions and created cultural change within our department to sustain change and foster the pursuit of future environmental efforts.
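The reported 28% figure follows directly from the two fresh gas flow rates in the abstract. The short check below reproduces the arithmetic; the function and variable names are illustrative only.

```python
def relative_reduction(before, after):
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before


flow_before = 0.53  # maximum fresh gas flow, L/min/kg, before the project (from the abstract)
flow_after = 0.38   # maximum fresh gas flow, L/min/kg, after the project (from the abstract)

print(f"{relative_reduction(flow_before, flow_after):.1%}")  # 28.3%
```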
Affiliation(s)
- Rebecca S Isserman
  - Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
  - Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Ian Yuan
  - Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
  - Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Elizabeth M Elliott
  - Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
  - Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Wallis T Muhly
  - Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
  - Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Rajeev S Iyer
  - Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
  - Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Heather A Farrell
  - Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Domonique A Varallo
  - Center for Healthcare Quality and Analytics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Georgia Georgostathi
  - School of Engineering and Applied Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Adam G Richter
  - Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Jennifer Stiso
  - Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Fuchiang Rich Tsui
  - Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
  - Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Jeffrey M Feldman
  - Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
  - Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
5
Ruiz VM, Goldsmith MP, Shi L, Simpao AF, Gálvez JA, Naim MY, Nadkarni V, Gaynor JW, Tsui FR. Early prediction of clinical deterioration using data-driven machine-learning modeling of electronic health records. J Thorac Cardiovasc Surg 2021; 164:211-222.e3. [PMID: 34949457 DOI: 10.1016/j.jtcvs.2021.10.060] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 04/19/2021] [Revised: 10/13/2021] [Accepted: 10/28/2021] [Indexed: 12/23/2022]
Abstract
OBJECTIVES To develop and evaluate a high-dimensional, data-driven model to identify patients at high risk of clinical deterioration from routinely collected electronic health record (EHR) data. MATERIALS AND METHODS In this single-center, retrospective cohort study, 488 patients with single-ventricle and shunt-dependent congenital heart disease <6 months old were admitted to the cardiac intensive care unit before stage 2 palliation between 2014 and 2019. Using machine-learning techniques, we developed the Intensive care Warning Index (I-WIN), which systematically assessed 1028 regularly collected EHR variables (vital signs, medications, laboratory tests, and diagnoses) to identify patients in the cardiac intensive care unit at elevated risk of clinical deterioration. An ensemble of 5 extreme gradient boosting models was developed and validated on 203 cases (130 emergent endotracheal intubations, 34 cardiac arrests requiring cardiopulmonary resuscitation, 10 extracorporeal membrane oxygenation cannulations, and 29 cardiac arrests requiring cardiopulmonary resuscitation onto extracorporeal membrane oxygenation) and 378 control periods from 446 patients. RESULTS At 4 hours before deterioration, the model achieved an area under the receiver operating characteristic curve of 0.92 (95% confidence interval, 0.84-0.98), 0.881 sensitivity, 0.776 positive predictive value, 0.862 specificity, and 0.571 Brier skill score. Performance remained high at 8 hours before deterioration with 0.815 (0.688-0.921) area under the receiver operating characteristic curve. CONCLUSIONS I-WIN accurately predicted deterioration events in critically-ill infants with high-risk congenital heart disease up to 8 hours before deterioration, potentially allowing clinicians to target interventions. We propose a paradigm shift from conventional expert consensus-based selection of risk factors to a data-driven, machine-learning methodology for risk prediction. 
With the increased availability of data capture in EHRs, I-WIN can be extended to broader applications in data-rich environments in critical care.
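Among the metrics reported above, the Brier skill score is the least familiar. A minimal sketch follows, assuming the common definition 1 − BS/BS_ref, where the reference forecast always predicts the observed event prevalence; the abstract does not state which reference the authors used, and the labels and probabilities below are hypothetical.

```python
def brier_score(labels, probs):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


def brier_skill_score(labels, probs):
    """1 - BS/BS_ref; the reference forecast is the event prevalence (an assumption)."""
    prevalence = sum(labels) / len(labels)
    reference = brier_score(labels, [prevalence] * len(labels))
    if reference == 0:
        raise ValueError("reference score is zero: all labels are identical")
    return 1.0 - brier_score(labels, probs) / reference


# Hypothetical data: 1 = deterioration event, 0 = control period.
labels = [1, 1, 0, 0]
probs = [0.9, 0.7, 0.2, 0.1]
print(round(brier_skill_score(labels, probs), 2))  # 0.85
```

A score of 1 means perfect probabilistic predictions, 0 means no better than always predicting the prevalence, and negative values mean worse than that baseline.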
Affiliation(s)
- Victor M Ruiz
  - Tsui Laboratory, Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pa
- Michael P Goldsmith
  - Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, Philadelphia, Pa; Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pa
- Lingyun Shi
  - Tsui Laboratory, Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pa
- Allan F Simpao
  - Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, Philadelphia, Pa; Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pa
- Jorge A Gálvez
  - Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, Philadelphia, Pa; Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pa
- Maryam Y Naim
  - Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, Philadelphia, Pa; Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pa
- Vinay Nadkarni
  - Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, Philadelphia, Pa; Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pa
- J William Gaynor
  - Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, Philadelphia, Pa; Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pa
- Fuchiang Rich Tsui
  - Tsui Laboratory, Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pa; Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, Philadelphia, Pa; Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pa.
6
Chen CC, Tsui FR. Comparing different wavelet transforms on removing electrocardiogram baseline wanders and special trends. BMC Med Inform Decis Mak 2020; 20:343. [PMID: 33380333 PMCID: PMC7772919 DOI: 10.1186/s12911-020-01349-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 11/16/2020] [Accepted: 11/23/2020] [Indexed: 11/21/2022] [Open access]
Abstract
Background Electrocardiogram (ECG) signal, an important indicator for heart problems, is commonly corrupted by a low-frequency baseline wander (BW) artifact, which may cause interpretation difficulty or inaccurate analysis. Unlike current state-of-the-art approach using band-pass filters, wavelet transforms can accurately capture both time and frequency information of a signal. However, extant literature is limited in applying wavelet transforms (WTs) for baseline wander removal. In this study, we aimed to evaluate 5 wavelet families with a total of 14 wavelets for removing ECG baseline wanders from a semi-synthetic dataset. Methods We created a semi-synthetic ECG dataset based on a public QT Database on Physionet repository with ECG data from 105 patients. The semi-synthetic ECG dataset comprised ECG excerpts from the QT database superimposed with artificial baseline wanders. We extracted one ECG excerpt from each of 105 patients, and the ECG excerpt comprised 14 s of randomly selected ECG data. Twelve baseline wanders were manually generated, including sinusoidal waves, spikes and step functions. We implemented and evaluated 14 commonly used wavelets up to 12 WT levels. The evaluation metric was mean-square-error (MSE) between the original ECG excerpt and the processed signal with artificial BW removed. Results Among the 14 wavelets, Daubechies-3 wavelet and Symlets-3 wavelet with 7 levels of WT had best performance, MSE = 0.0044. The average MSEs for sinusoidal waves, step, and spike functions were 0.0271, 0.0304, 0.0199 respectively. For artificial baseline wanders with spikes or step functions, wavelet transforms in general had lower performance in removing the BW; however, WTs accurately located the temporal position of an impulse edge. Conclusions We found wavelet transforms in general accurately removed various baseline wanders. Daubechies-3 and Symlets-3 wavelets performed best. 
The study could facilitate future real-time processing of streaming ECG signals for clinical decision support systems.
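The evaluation metric in this study is the mean-square-error between the original ECG excerpt and the signal after baseline wander removal. The sketch below illustrates only that metric on a synthetic signal; it does not implement any wavelet transform. The sampling rate, the sine wave standing in for an ECG, and the wander amplitude and frequency are all assumptions chosen for illustration.

```python
import math


def mse(reference, processed):
    """Mean-square-error between two equal-length signals."""
    if len(reference) != len(processed):
        raise ValueError("signals must have the same length")
    return sum((r - p) ** 2 for r, p in zip(reference, processed)) / len(reference)


fs = 250      # assumed sampling rate in Hz
n = 14 * fs   # 14 s excerpt length, as in the abstract

# Stand-in for a clean ECG excerpt (a real excerpt would come from the QT Database).
clean = [math.sin(2 * math.pi * 1.2 * t / fs) for t in range(n)]
# Artificial sinusoidal baseline wander: 0.25 Hz, amplitude 0.5 (assumed values).
wander = [0.5 * math.sin(2 * math.pi * 0.25 * t / fs) for t in range(n)]
corrupted = [c + w for c, w in zip(clean, wander)]

print(mse(clean, clean))                # 0.0, the score of a perfect removal
print(round(mse(clean, corrupted), 3))  # 0.125, the score if no wander is removed
```

Any removal method can be scored the same way by passing its output as `processed`; lower values mean the processed signal is closer to the original excerpt.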
Affiliation(s)
- Chao-Chen Chen
  - Tsui Laboratory, Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA, USA; Department of Biomedical Engineering, National Cheng-Kung University, Tainan, Taiwan
- Fuchiang Rich Tsui
  - Tsui Laboratory, Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA, USA; Department of Anesthesiology and Critical Care, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
7
Barda AJ, Ruiz VM, Gigliotti T, Tsui FR. An argument for reporting data standardization procedures in multi-site predictive modeling: case study on the impact of LOINC standardization on model performance. JAMIA Open 2019; 2:197-204. [PMID: 30944914 PMCID: PMC6435008 DOI: 10.1093/jamiaopen/ooy063] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 09/08/2018] [Revised: 11/22/2018] [Accepted: 12/20/2018] [Indexed: 11/13/2022] [Open access]
Abstract
Objectives We aimed to gain a better understanding of how standardization of laboratory data can impact predictive model performance in multi-site datasets. We hypothesized that standardizing local laboratory codes to logical observation identifiers names and codes (LOINC) would produce predictive models that significantly outperform those learned utilizing local laboratory codes. Materials and Methods We predicted 30-day hospital readmission for a set of heart failure-specific visits to 13 hospitals from 2008 to 2012. Laboratory test results were extracted and then manually cleaned and mapped to LOINC. We extracted features to summarize laboratory data for each patient and used a training dataset (2008–2011) to learn models using a variety of feature selection techniques and classifiers. We evaluated our hypothesis by comparing model performance on an independent test dataset (2012). Results Models that utilized LOINC performed significantly better than models that utilized local laboratory test codes, regardless of the feature selection technique and classifier approach used. Discussion and Conclusion We quantitatively demonstrated the positive impact of standardizing multi-site laboratory data to LOINC prior to use in predictive models. We used our findings to argue for the need for detailed reporting of data standardization procedures in predictive modeling, especially in studies leveraging multi-site datasets extracted from electronic health records.
Affiliation(s)
- Amie J Barda
  - Tsui Laboratory, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA; Department of Biomedical Informatics, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Victor M Ruiz
  - Tsui Laboratory, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA; Department of Biomedical Informatics, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Tony Gigliotti
  - Information Services Division, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania, USA
- Fuchiang Rich Tsui
  - Tsui Laboratory, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA; Department of Biomedical Informatics, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, USA; Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA; Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA; Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA; School of Computing Information, University of Pittsburgh, Pittsburgh, Pennsylvania, USA; Department of Bioengineering, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
8
Ferraro JP, Ye Y, Gesteland PH, Haug PJ, Tsui FR, Cooper GF, Van Bree R, Ginter T, Nowalk AJ, Wagner M. The effects of natural language processing on cross-institutional portability of influenza case detection for disease surveillance. Appl Clin Inform 2017; 8:560-580. [PMID: 28561130 PMCID: PMC6241736 DOI: 10.4338/aci-2016-12-ra-0211] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 12/31/2016] [Accepted: 03/11/2017] [Indexed: 11/23/2022] [Open access]
Abstract
OBJECTIVES This study evaluates the accuracy and portability of a natural language processing (NLP) tool for extracting clinical findings of influenza from clinical notes across two large healthcare systems. Effectiveness is evaluated on how well NLP supports downstream influenza case-detection for disease surveillance. METHODS We independently developed two NLP parsers, one at Intermountain Healthcare (IH) in Utah and the other at University of Pittsburgh Medical Center (UPMC) using local clinical notes from emergency department (ED) encounters of influenza. We measured NLP parser performance for the presence and absence of 70 clinical findings indicative of influenza. We then developed Bayesian network models from NLP processed reports and tested their ability to discriminate among cases of (1) influenza, (2) non-influenza influenza-like illness (NI-ILI), and (3) 'other' diagnosis. RESULTS On Intermountain Healthcare reports, recall and precision of the IH NLP parser were 0.71 and 0.75, respectively, and UPMC NLP parser, 0.67 and 0.79. On University of Pittsburgh Medical Center reports, recall and precision of the UPMC NLP parser were 0.73 and 0.80, respectively, and IH NLP parser, 0.53 and 0.80. Bayesian case-detection performance measured by AUROC for influenza versus non-influenza on Intermountain Healthcare cases was 0.93 (using IH NLP parser) and 0.93 (using UPMC NLP parser). Case-detection on University of Pittsburgh Medical Center cases was 0.95 (using UPMC NLP parser) and 0.83 (using IH NLP parser). For influenza versus NI-ILI on Intermountain Healthcare cases performance was 0.70 (using IH NLP parser) and 0.76 (using UPMC NLP parser). On University of Pittsburgh Medical Center cases, 0.76 (using UPMC NLP parser) and 0.65 (using IH NLP parser). CONCLUSION In all but one instance (influenza versus NI-ILI using IH cases), local parsers were more effective at supporting case-detection although performances of non-local parsers were reasonable.
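Parser performance above is reported as recall and precision over extracted clinical findings. For reference, the sketch below shows how the two quantities are derived from counts of true positives, false positives, and false negatives. The counts are hypothetical and do not correspond to either parser in the study.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall


# Hypothetical counts of findings a parser extracted correctly (tp),
# extracted wrongly (fp), and failed to extract (fn).
precision, recall = precision_recall(tp=60, fp=20, fn=40)
print(precision, recall)  # 0.75 0.6
```

A parser moved to another institution can keep its precision while losing recall, as with the IH parser on UPMC reports (0.80 precision, 0.53 recall): what it extracts is still mostly right, but it misses more findings that are phrased in unfamiliar ways.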
Affiliation(s)
- Jeffrey P Ferraro
  - Homer Warner Center, Intermountain Healthcare, 5171 South Cottonwood St, Suite 220, Murray, Utah 84107, Tel: 801-244-6570
9
Posada JD, Barda AJ, Shi L, Xue D, Ruiz V, Kuan PH, Ryan ND, Tsui FR. Predictive modeling for classification of positive valence system symptom severity from initial psychiatric evaluation records. J Biomed Inform 2017; 75S:S94-S104. [PMID: 28571784 DOI: 10.1016/j.jbi.2017.05.019] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 02/01/2017] [Revised: 04/21/2017] [Accepted: 05/26/2017] [Indexed: 01/13/2023]
Abstract
In response to the challenges set forth by the CEGS N-GRID 2016 Shared Task in Clinical Natural Language Processing, we describe a framework to automatically classify initial psychiatric evaluation records to one of four positive valence system severities: absent, mild, moderate, or severe. We used a dataset provided by the event organizers to develop a framework comprised of natural language processing (NLP) modules and 3 predictive models (two decision tree models and one Bayesian network model) used in the competition. We also developed two additional predictive models for comparison purpose. To evaluate our framework, we employed a blind test dataset provided by the 2016 CEGS N-GRID. The predictive scores, measured by the macro averaged-inverse normalized mean absolute error score, from the two decision trees and Naïve Bayes models were 82.56%, 82.18%, and 80.56%, respectively. The proposed framework in this paper can potentially be applied to other predictive tasks for processing initial psychiatric evaluation records, such as predicting 30-day psychiatric readmissions.
Affiliation(s)
- Jose D Posada
  - Department of Biomedical Informatics, University of Pittsburgh, 5607 Baum Blvd., Pittsburgh, PA 15206, United States; Electronics and Telecommunications Engineer Program, Universidad Autónoma del Caribe, CI. 90 #46-112, Barranquilla, Atlántico, Colombia
- Amie J Barda
  - Department of Biomedical Informatics, University of Pittsburgh, 5607 Baum Blvd., Pittsburgh, PA 15206, United States
- Lingyun Shi
  - Department of Biomedical Informatics, University of Pittsburgh, 5607 Baum Blvd., Pittsburgh, PA 15206, United States
- Diyang Xue
  - Department of Biomedical Informatics, University of Pittsburgh, 5607 Baum Blvd., Pittsburgh, PA 15206, United States
- Victor Ruiz
  - Department of Biomedical Informatics, University of Pittsburgh, 5607 Baum Blvd., Pittsburgh, PA 15206, United States
- Pei-Han Kuan
  - Institute of Manufacturing Information and System, National Cheng-Kung University, Tainan, Taiwan
- Neal D Ryan
  - Department of Psychiatry, University of Pittsburgh, 3811 O'Hara St., Pittsburgh, PA 15213, United States
- Fuchiang Rich Tsui
  - Department of Biomedical Informatics, University of Pittsburgh, 5607 Baum Blvd., Pittsburgh, PA 15206, United States.
| |
Collapse
|
10
|
López Pineda A, Ye Y, Visweswaran S, Cooper GF, Wagner MM, Tsui FR. Comparison of machine learning classifiers for influenza detection from emergency department free-text reports. J Biomed Inform 2015; 58:60-69. [PMID: 26385375 PMCID: PMC4684714 DOI: 10.1016/j.jbi.2015.08.019] [Citation(s) in RCA: 56] [Impact Index Per Article: 6.2] [Received: 03/02/2015] [Revised: 05/28/2015] [Accepted: 08/21/2015] [Indexed: 12/31/2022]
Abstract
Influenza is a yearly recurrent disease that has the potential to become a pandemic. An effective biosurveillance system is required for early detection of the disease. In our previous studies, we have shown that electronic Emergency Department (ED) free-text reports can be of value in improving influenza detection in real time. This paper studies seven machine learning (ML) classifiers for influenza detection, compares their diagnostic capabilities against an expert-built influenza Bayesian classifier, and evaluates different ways of handling missing clinical information from the free-text reports. We identified 31,268 ED reports from 4 hospitals between 2008 and 2011 to form two different datasets: training (468 cases, 29,004 controls) and test (176 cases, 1,620 controls). We employed Topaz, a natural language processing (NLP) tool, to extract influenza-related findings and to encode them into one of three values: Acute, Non-acute, and Missing. Results show that all ML classifiers had areas under the ROC curve (AUC) ranging from 0.88 to 0.93, and performed significantly better than the expert-built Bayesian model. Explicitly marking missing clinical information with a value of Missing (not missing at random) consistently improved performance in 3 of 4 ML classifiers compared with the configuration that did not assign a value of Missing (missing completely at random). The case/control ratios did not affect the classification performance given the large number of training cases. Our study demonstrates that ED reports, used in conjunction with ML, NLP, and explicit handling of missing values, have great potential for the detection of infectious diseases.
Affiliation(s)
- Arturo López Pineda
  - Department of Biomedical Informatics, University of Pittsburgh School of Medicine, 5607 Baum Boulevard, Pittsburgh, PA, United States
- Ye Ye
  - Department of Biomedical Informatics, University of Pittsburgh School of Medicine, 5607 Baum Boulevard, Pittsburgh, PA, United States; Intelligent System Program, University of Pittsburgh Dietrich School of Arts and Sciences, 210 South Bouquet Street, Pittsburgh, PA, United States
- Shyam Visweswaran
  - Department of Biomedical Informatics, University of Pittsburgh School of Medicine, 5607 Baum Boulevard, Pittsburgh, PA, United States; Intelligent System Program, University of Pittsburgh Dietrich School of Arts and Sciences, 210 South Bouquet Street, Pittsburgh, PA, United States
- Gregory F Cooper
  - Department of Biomedical Informatics, University of Pittsburgh School of Medicine, 5607 Baum Boulevard, Pittsburgh, PA, United States; Intelligent System Program, University of Pittsburgh Dietrich School of Arts and Sciences, 210 South Bouquet Street, Pittsburgh, PA, United States
- Michael M Wagner
  - Department of Biomedical Informatics, University of Pittsburgh School of Medicine, 5607 Baum Boulevard, Pittsburgh, PA, United States; Intelligent System Program, University of Pittsburgh Dietrich School of Arts and Sciences, 210 South Bouquet Street, Pittsburgh, PA, United States
- Fuchiang Rich Tsui
  - Department of Biomedical Informatics, University of Pittsburgh School of Medicine, 5607 Baum Boulevard, Pittsburgh, PA, United States; Intelligent System Program, University of Pittsburgh Dietrich School of Arts and Sciences, 210 South Bouquet Street, Pittsburgh, PA, United States
|
11
|
Amin W, Tsui FR, Borromeo C, Chuang CH, Espino JU, Ford D, Hwang W, Kapoor W, Lehmann H, Martich GD, Morton S, Paranjape A, Shirey W, Sorensen A, Becich MJ, Hess R. PaTH: towards a learning health system in the Mid-Atlantic region. J Am Med Inform Assoc 2014; 21:633-6. [PMID: 24821745 PMCID: PMC4078296 DOI: 10.1136/amiajnl-2014-002759] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 02/27/2014] [Revised: 03/19/2014] [Accepted: 03/25/2014] [Indexed: 12/02/2022]
Abstract
The PaTH (University of Pittsburgh/UPMC, Penn State College of Medicine, Temple University Hospital, and Johns Hopkins University) clinical data research network initiative is a collaborative effort among four academic health centers in the Mid-Atlantic region. PaTH will provide robust infrastructure to conduct research, explore clinical outcomes, link with biospecimens, and improve methods for sharing and analyzing data across our diverse populations. Our disease foci are idiopathic pulmonary fibrosis, atrial fibrillation, and obesity. The four network sites have extensive experience in using data from electronic health records and have devised robust methods for patient outreach and recruitment. The network will adopt best practices by using the open-source data-sharing tool, Informatics for Integrating Biology and the Bedside (i2b2), at each site to enhance data sharing using centrally defined common data elements, and will use the Shared Health Research Information Network (SHRINE) for distributed queries across the network.
Affiliation(s)
- Waqas Amin
  - Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
- Fuchiang Rich Tsui
  - Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
- Charles Borromeo
  - Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
- Cynthia H Chuang
  - Department of Medicine and Public Health Sciences, Penn State College of Medicine, Hershey, Pennsylvania, USA
- Jeremy U Espino
  - Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
- Daniel Ford
  - Department of Medicine, Division of Health Science Informatics, Johns Hopkins School of Medicine, Baltimore, Maryland, USA
- Wenke Hwang
  - Department of Public Health Sciences, Division of Health Services Research, Penn State College of Medicine, Hershey, Pennsylvania, USA
- Wishwa Kapoor
  - Department of Medicine, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
- Harold Lehmann
  - Department of Medicine, Division of Health Science Informatics, Johns Hopkins School of Medicine, Baltimore, Maryland, USA
- G Daniel Martich
  - Department of Critical Care Medicine, UPMC, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
- Sally Morton
  - Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Anuradha Paranjape
  - Department of Medicine, Temple University School of Medicine, Philadelphia, Pennsylvania, USA
- William Shirey
  - Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
- Aaron Sorensen
  - Department of Medicine, Temple University School of Medicine, Philadelphia, Pennsylvania, USA
- Michael J Becich
  - Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
- Rachel Hess
  - Department of Medicine, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
|