1
Porto BM, Fogliatto FS. Enhanced forecasting of emergency department patient arrivals using feature engineering approach and machine learning. BMC Med Inform Decis Mak 2024; 24:377. [PMID: 39696224 DOI: 10.1186/s12911-024-02788-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Accepted: 11/26/2024] [Indexed: 12/20/2024] Open
Abstract
BACKGROUND Emergency department (ED) overcrowding is an important problem in many countries. Accurate predictions of ED patient arrivals can help management to better allocate staff and medical resources. In this study, we investigate the use of calendar and meteorological predictors, as well as feature-engineered variables, to predict daily patient arrivals using datasets from eleven different EDs across three countries. METHODS Six machine learning (ML) algorithms were tested on forecasting horizons of 7 and 45 days. Three of them - Light Gradient Boosting Machine (LightGBM), Support Vector Machine with Radial Basis Function (SVM-RBF), and Neural Network Autoregression (NNAR) - were never before reported for predicting ED patient arrivals. Algorithms' hyperparameters were tuned through a grid-search with cross-validation. Prediction performance was assessed using fivefold cross-validation and four performance metrics. RESULTS The eXtreme Gradient Boosting (XGBoost) was the best-performing model on both prediction horizons, also outperforming results reported in past studies on ED arrival prediction. XGBoost and NNAR achieved the best performance in nine out of the eleven analyzed datasets, with MAPE values ranging from 5.03% to 14.1%. Feature engineering (FE) improved the performance of the ML algorithms. CONCLUSION Accuracy in predicting ED arrivals, achieved through the FE approach, is key for managing human and material resources, as well as reducing patient waiting times and lengths of stay.
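As a minimal sketch of two ideas named in this abstract, the Python below computes MAPE and derives simple calendar and lagged-arrival features. The arrival counts, the feature names, and the 7-day lag are illustrative assumptions, not the features or data used by the authors.

```python
from datetime import date, timedelta


def mape(actual, predicted):
    """Mean absolute percentage error in percent (assumes no zero actuals)."""
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def engineer_features(days, arrivals, lag=7):
    """Build one row per day from calendar fields plus a lagged arrival count."""
    rows = []
    for i in range(lag, len(days)):
        rows.append({
            "day_of_week": days[i].weekday(),      # 0 = Monday
            "month": days[i].month,
            "is_weekend": days[i].weekday() >= 5,
            "arrivals_lag": arrivals[i - lag],     # hypothetical engineered feature
            "target": arrivals[i],
        })
    return rows


start = date(2024, 1, 1)  # a Monday
days = [start + timedelta(days=i) for i in range(14)]
# Made-up daily arrival counts, for illustration only.
arrivals = [100, 90, 95, 92, 98, 120, 125, 110, 85, 100, 90, 100, 115, 130]
rows = engineer_features(days, arrivals)

# Naive seasonal baseline: predict each day with the count from one week earlier.
baseline_mape = mape([r["target"] for r in rows], [r["arrivals_lag"] for r in rows])
```

A learned model would be expected to beat this kind of week-ago baseline; comparing against it is one simple way to judge whether engineered features add value.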
Affiliation(s)
- Bruno Matos Porto
  - Industrial Engineering Department, Federal University of Rio Grande do Sul, Av. Osvaldo Aranha, 99, 5th floor, Porto Alegre, RS, 90020-035, Brazil.
  - Industrial Engineering Department, Federal University of Rio Grande do Sul, Av. Osvaldo Aranha, 99, 5th floor, Porto Alegre, 90035-190, Brazil.
- Flavio Sanson Fogliatto
  - Industrial Engineering Department, Federal University of Rio Grande do Sul, Av. Osvaldo Aranha, 99, 5th floor, Porto Alegre, RS, 90020-035, Brazil.
  - Industrial Engineering Department, Federal University of Rio Grande do Sul, Av. Osvaldo Aranha, 99, 5th floor, Porto Alegre, 90035-190, Brazil.
2
Huang TY, Chong CF, Lin HY, Chen TY, Chang YC, Lin MC. A pre-trained language model for emergency department intervention prediction using routine physiological data and clinical narratives. Int J Med Inform 2024; 191:105564. [PMID: 39121529 DOI: 10.1016/j.ijmedinf.2024.105564] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Revised: 07/15/2024] [Accepted: 07/20/2024] [Indexed: 08/12/2024]
Abstract
INTRODUCTION The urgency and complexity of emergency room (ER) settings require precise and swift decision-making processes for patient care. Ensuring the timely execution of critical examinations and interventions is vital for reducing diagnostic errors, but the literature highlights a need for innovative approaches to optimize diagnostic accuracy and patient outcomes. In response, our study endeavors to create predictive models for timely examinations and interventions by leveraging the patient's symptoms and vital signs recorded during triage, and in so doing, augment traditional diagnostic methodologies. METHODS Focusing on four key areas-medication dispensing, vital interventions, laboratory testing, and emergency radiology exams, the study employed Natural Language Processing (NLP) and seven advanced machine learning techniques. The research was centered around the innovative use of BioClinicalBERT, a state-of-the-art NLP framework. RESULTS BioClinicalBERT emerged as the superior model, outperforming others in predictive accuracy. The integration of physiological data with patient narrative symptoms demonstrated greater effectiveness compared to models based solely on textual data. The robustness of our approach was confirmed by an Area Under the Receiver Operating Characteristic curve (AUROC) score of 0.9. CONCLUSION The findings of our study underscore the feasibility of establishing a decision support system for emergency patients, targeting timely interventions and examinations based on a nuanced analysis of symptoms. By using an advanced natural language processing technique, our approach shows promise for enhancing diagnostic accuracy. However, the current model is not yet fully mature for direct implementation into daily clinical practice. Recognizing the imperative nature of precision in the ER environment, future research endeavors must focus on refining and expanding predictive models to include detailed timely examinations and interventions. 
Although the progress achieved in this study represents an encouraging step towards a more innovative and technology-driven paradigm in emergency care, full clinical integration warrants further exploration and validation.
Affiliation(s)
- Ting-Yun Huang
  - Emergency Department, Shuang-Ho Hospital, Taipei Medical University, Taipei, Taiwan.
- Chee-Fah Chong
  - Emergency Department, Shin-Kong Wu Ho-Su Memorial Hospital, Taipei, Taiwan.
- Heng-Yu Lin
  - Graduate Institute of Data Science, Taipei Medical University, Taipei, Taiwan.
- Tzu-Ying Chen
  - Graduate Institute of Data Science, Taipei Medical University, Taipei, Taiwan.
- Yung-Chun Chang
  - Graduate Institute of Data Science, Taipei Medical University, Taipei, Taiwan; Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei, Taiwan.
- Ming-Chin Lin
  - Graduate Institute of Biomedical Informatics, Taipei Medical University, Taipei, Taiwan; Department of Neurosurgery, Shuang-Ho Hospital, Taipei Medical University, Taipei, Taiwan; Department of Neurosurgery, Taipei Municipal Wanfang Hospital, Taipei Medical University, Taipei, Taiwan.
3
Ropponen A, Hirvonen M, Kuusi T, Härmä M. Concurrent Trajectories of Objectively Measured Insufficient Recovery and Workload Among a Cohort of Shift Working Hospital Employees: Quantitative Empirical Research. Nurs Open 2024; 11:e70101. [PMID: 39571045 PMCID: PMC11580809 DOI: 10.1002/nop2.70101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Revised: 10/13/2024] [Accepted: 11/02/2024] [Indexed: 11/24/2024] Open
Abstract
AIM To investigate concurrent changes in short shift intervals (< 11 h) and workload among hospital employees. DESIGN AND DATA SOURCES This cohort study of 1904 employees in one hospital district in Finland utilised data on employees' working hours for short shift intervals and workload based on the patient classifications aggregated to a 3-week period level across 2 years, 2018-2019. The data were analysed by group-based trajectory modelling and multinomial regression models. RESULTS The seven-trajectory model had the best fit to the data: Group 1: very few short shift intervals that are decreasing and low workload (15.0%); Group 2: a low amount of short shift intervals that are decreasing and stable low workload (14.2%); Group 3: moderate amount of short shift intervals that are slightly increasing and low workload (25.1%); Group 4: a low amount of short shift intervals that are slightly decreasing and stable low workload that is slightly increasing (12.1%); Group 5: a moderate amount of both short shift intervals and workload (19.8%); Group 6: short shift intervals that are clearly decreasing, with higher than the average workload decreasing (5.6%); Group 7: moderate amount of short shift intervals and very high workload (8.3%). CONCLUSIONS Only a minority of hospital employees were found to have both high workloads and insufficient recovery possibilities, but the time-related increases in objective workload were not compensated by better recovery possibilities in working hours. For shift scheduling, it is noteworthy that older employees might seek to work at units in which the workload is lower, which could be considered to support workability. REPORTING METHOD Record. PATIENT OR PUBLIC CONTRIBUTION No Patient or Public Contribution.
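The notion of a short shift interval (less than 11 h of rest between consecutive shifts) can be sketched in Python as follows. The shift times and the function name are hypothetical, and the study's aggregation to 3-week periods is not reproduced here.

```python
from datetime import datetime


def count_short_intervals(shifts, min_rest_hours=11):
    """Count rest periods shorter than min_rest_hours between consecutive shifts.

    shifts: list of (start, end) datetime pairs for one employee.
    """
    ordered = sorted(shifts)
    short = 0
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        rest_hours = (next_start - prev_end).total_seconds() / 3600
        if rest_hours < min_rest_hours:
            short += 1
    return short


# Invented roster for one employee.
shifts = [
    (datetime(2018, 1, 1, 14), datetime(2018, 1, 1, 22)),  # evening shift
    (datetime(2018, 1, 2, 7), datetime(2018, 1, 2, 15)),   # morning shift after 9 h rest
    (datetime(2018, 1, 3, 7), datetime(2018, 1, 3, 15)),   # morning shift after 16 h rest
]
n_short = count_short_intervals(shifts)
```

Only the evening-to-morning transition falls under the 11 h threshold in this roster.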
Affiliation(s)
- Annina Ropponen
  - Finnish Institute of Occupational Health, Helsinki, Finland
  - Division of Insurance Medicine, Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden
- Mikko Härmä
  - Finnish Institute of Occupational Health, Helsinki, Finland
4
Mapundu MT, Kabudula CW, Musenge E, Olago V, Celik T. Text mining of verbal autopsy narratives to extract mortality causes and most prevalent diseases using natural language processing. PLoS One 2024; 19:e0308452. [PMID: 39298425 DOI: 10.1371/journal.pone.0308452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2023] [Accepted: 07/24/2024] [Indexed: 09/21/2024] Open
Abstract
Verbal autopsy (VA) narratives play a crucial role in understanding and documenting the causes of mortality, especially in regions lacking robust medical infrastructure. In this study, we propose a comprehensive approach to extract mortality causes and identify prevalent diseases from VA narratives utilizing advanced text mining techniques, so as to better understand the underlying health issues leading to mortality. Our methodology integrates n-gram-based language processing, Latent Dirichlet Allocation (LDA), and BERTopic, offering a multi-faceted analysis to enhance the accuracy and depth of information extraction. This is a retrospective study that uses secondary data analysis. We used data from the Agincourt Health and Demographic Surveillance Site (HDSS), which had 16338 observations collected between 1993 and 2015. Our text mining steps entailed data acquisition, pre-processing, feature extraction, topic segmentation, and discovered knowledge. The results suggest that the HDSS population may have died from mortality causes such as vomiting, chest/stomach pain, fever, coughing, loss of weight, low energy, headache. Additionally, we discovered that the most prevalent diseases entailed human immunodeficiency virus (HIV), tuberculosis (TB), diarrhoea, cancer, neurological disorders, malaria, diabetes, high blood pressure, chronic ailments (kidney, heart, lung, liver), maternal and accident related deaths. This study is relevant in that it avails valuable insights regarding mortality causes and most prevalent diseases using novel text mining approaches. These results can be integrated in the diagnosis pipeline for ease of human annotation and interpretation. As such, this will help with effective informed intervention programmes that can improve primary health care systems and chronic based delivery, thus increasing life expectancy.
Affiliation(s)
- Michael Tonderai Mapundu
  - Department of Epidemiology and Biostatistics, School of Public Health, University of the Witwatersrand, Johannesburg, South Africa
- Chodziwadziwa Whiteson Kabudula
  - Department of Epidemiology and Biostatistics, School of Public Health, University of the Witwatersrand, Johannesburg, South Africa
  - MRC/Wits Rural Public Health and Health Transitions Research Unit (Agincourt), Johannesburg, South Africa
- Eustasius Musenge
  - Department of Epidemiology and Biostatistics, School of Public Health, University of the Witwatersrand, Johannesburg, South Africa
- Victor Olago
  - National Health Laboratory Service (NHLS), National Cancer Registry, Johannesburg, South Africa
- Turgay Celik
  - Wits Institute of Data Science, University of The Witwatersrand, Johannesburg, South Africa
  - School of Electrical and Information Engineering, University of The Witwatersrand, Johannesburg, South Africa
5
Askar M, Tafavvoghi M, Småbrekke L, Bongo LA, Svendsen K. Using machine learning methods to predict all-cause somatic hospitalizations in adults: A systematic review. PLoS One 2024; 19:e0309175. [PMID: 39178283 PMCID: PMC11343463 DOI: 10.1371/journal.pone.0309175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 08/06/2024] [Indexed: 08/25/2024] Open
Abstract
AIM In this review, we investigated how Machine Learning (ML) was utilized to predict all-cause somatic hospital admissions and readmissions in adults. METHODS We searched eight databases (PubMed, Embase, Web of Science, CINAHL, ProQuest, OpenGrey, WorldCat, and MedNar) from their inception date to October 2023, and included records that predicted all-cause somatic hospital admissions and readmissions of adults using ML methodology. We used the CHARMS checklist for data extraction, PROBAST for bias and applicability assessment, and TRIPOD for reporting quality. RESULTS We screened 7,543 studies of which 163 full-text records were read and 116 met the review inclusion criteria. Among these, 45 predicted admission, 70 predicted readmission, and one study predicted both. There was a substantial variety in the types of datasets, algorithms, features, data preprocessing steps, evaluation, and validation methods. The most used types of features were demographics, diagnoses, vital signs, and laboratory tests. Area Under the ROC curve (AUC) was the most used evaluation metric. Models trained using boosting tree-based algorithms often performed better compared to others. ML algorithms commonly outperformed traditional regression techniques. Sixteen studies used Natural language processing (NLP) of clinical notes for prediction, all studies yielded good results. The overall adherence to reporting quality was poor in the review studies. Only five percent of models were implemented in clinical practice. The most frequently inadequately addressed methodological aspects were: providing model interpretations on the individual patient level, full code availability, performing external validation, calibrating models, and handling class imbalance. CONCLUSION This review has identified considerable concerns regarding methodological issues and reporting quality in studies investigating ML to predict hospitalizations. 
To ensure the acceptability of these models in clinical settings, it is crucial to improve the quality of future studies.
Affiliation(s)
- Mohsen Askar
  - Faculty of Health Sciences, Department of Pharmacy, UiT-The Arctic University of Norway, Tromsø, Norway
- Masoud Tafavvoghi
  - Faculty of Science and Technology, Department of Computer Science, UiT-The Arctic University of Norway, Tromsø, Norway
- Lars Småbrekke
  - Faculty of Health Sciences, Department of Pharmacy, UiT-The Arctic University of Norway, Tromsø, Norway
- Lars Ailo Bongo
  - Faculty of Science and Technology, Department of Computer Science, UiT-The Arctic University of Norway, Tromsø, Norway
- Kristian Svendsen
  - Faculty of Health Sciences, Department of Pharmacy, UiT-The Arctic University of Norway, Tromsø, Norway
6
Kuo KM, Lin YL, Chang CS, Kuo TJ. An ensemble model for predicting dispositions of emergency department patients. BMC Med Inform Decis Mak 2024; 24:105. [PMID: 38649949 PMCID: PMC11036695 DOI: 10.1186/s12911-024-02503-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Accepted: 04/09/2024] [Indexed: 04/25/2024] Open
Abstract
OBJECTIVE The healthcare challenge driven by an aging population and rising demand is one of the most pressing issues leading to emergency department (ED) overcrowding. An emerging solution lies in machine learning's potential to predict ED dispositions, thus leading to promising substantial benefits. This study's objective is to create a predictive model for ED patient dispositions by employing ensemble learning. It harnesses diverse data types, including structured and unstructured information gathered during ED visits to address the evolving needs of localized healthcare systems. METHODS In this cross-sectional study, 80,073 ED patient records were amassed from a major southern Taiwan hospital in 2018-2019. An ensemble model incorporated structured (demographics, vital signs) and pre-processed unstructured data (chief complaints, preliminary diagnoses) using bag-of-words (BOW) and term frequency-inverse document frequency (TF-IDF). Two random forest base-learners for structured and unstructured data were employed and then complemented by a multi-layer perceptron meta-learner. RESULTS The ensemble model demonstrates strong predictive performance for ED dispositions, achieving an area under the receiver operating characteristic curve of 0.94. The models based on unstructured data encoded with BOW and TF-IDF yield similar performance results. Among the structured features, the top five most crucial factors are age, pulse rate, systolic blood pressure, temperature, and acuity level. In contrast, the top five most important unstructured features are pneumonia, fracture, failure, suspect, and sepsis. CONCLUSIONS Findings indicate that utilizing ensemble learning with a blend of structured and unstructured data proves to be a predictive method for determining ED dispositions.
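The two text encodings mentioned in this abstract, bag-of-words and TF-IDF, can be illustrated with a small pure-Python sketch. The chief complaints are invented, and the weighting shown (raw count times log(N / document frequency)) is only one common TF-IDF variant; library implementations often use smoothed versions, and the authors' exact preprocessing is not described here.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return the sorted vocabulary and one term-count vector per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(vectors):
    """Weight counts by log(N / document frequency); one of several TF-IDF variants."""
    n_docs = len(vectors)
    n_terms = len(vectors[0])
    doc_freq = [sum(1 for vec in vectors if vec[j] > 0) for j in range(n_terms)]
    return [
        [vec[j] * math.log(n_docs / doc_freq[j]) for j in range(n_terms)]
        for vec in vectors
    ]


complaints = ["chest pain", "fever cough", "chest pain fever"]  # invented examples
vocab, bow = bag_of_words(complaints)
weighted = tf_idf(bow)
```

Terms that appear in fewer documents (here "cough") receive a larger weight than terms shared across documents, which is the main difference from plain counts.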
Affiliation(s)
- Kuang-Ming Kuo
  - Department of Business Management, National United University, No.1, 360301, Lienda, Miaoli, Taiwan
- Yih-Lon Lin
  - Department of Computer Science and Information Engineering, National Yunlin University of Science and Technology, No. 123, University Road, Section 3, 64002, Douliou, Yunlin, Taiwan
- Chao Sheng Chang
  - Department of Emergency Medicine, E-Da Hospital, Kaohsiung City, Taiwan.
  - Department of Occupational Therapy, I-Shou University, Kaohsiung City, Taiwan.
- Tin Ju Kuo
  - Department of Computer Science and Information Engineering, National Taitung University, 369, Sec. 2, University Rd, Taitung, Taiwan
7
Chen S, Lan X, Yu H. A social network analysis: mental health scales used during the COVID-19 pandemic. Front Psychiatry 2023; 14:1199906. [PMID: 37706038 PMCID: PMC10495585 DOI: 10.3389/fpsyt.2023.1199906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2023] [Accepted: 08/11/2023] [Indexed: 09/15/2023] Open
Abstract
Introduction The focus on psychological issues during COVID-19 has led to the development of large surveys that involve the use of mental health scales. Numerous mental health measurements are available; choosing the appropriate measurement is crucial. Methods A rule-based named entity recognition was used to recognize entities of mental health scales that occur in the articles from PubMed. The co-occurrence networks of mental health scales and Medical Subject Headings (MeSH) terms were constructed by Gephi. Results Five types of MeSH terms were filtered, including research objects, research topics, research methods, countries/regions, and factors. Seventy-eight mental health scales were discovered. Discussion The findings provide insights on the scales used most often during the pandemic, the key instruments used to measure healthcare workers' physical and mental health, the scales most often utilized for assessing maternal mental health, the tools used most commonly for assessing older adults' psychological resilience and loneliness, and new COVID-19 mental health scales. Future studies may use these findings as a guiding reference and compass.
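A rule-based recognizer feeding a co-occurrence count can be sketched roughly as below. The three patterns and the sample sentences are hypothetical stand-ins, and the sketch counts scale-to-scale pairs only; the study itself linked scales with MeSH terms and built its networks in Gephi.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical rule set; a real one would cover many more scales and spellings.
SCALE_PATTERNS = {
    "PHQ-9": re.compile(r"\bPHQ-?9\b", re.IGNORECASE),
    "GAD-7": re.compile(r"\bGAD-?7\b", re.IGNORECASE),
    "PSS": re.compile(r"\bPerceived Stress Scale\b|\bPSS\b", re.IGNORECASE),
}


def find_scales(text):
    """Return the sorted names of all scales whose pattern matches the text."""
    return sorted(name for name, pat in SCALE_PATTERNS.items() if pat.search(text))


def cooccurrence(texts):
    """Count how often each pair of scales is mentioned in the same text."""
    edges = Counter()
    for text in texts:
        edges.update(combinations(find_scales(text), 2))
    return edges


abstracts = [  # invented sentences
    "Depression was measured with the PHQ-9 and anxiety with the GAD-7.",
    "We used the PHQ9, GAD-7 and Perceived Stress Scale.",
]
edges = cooccurrence(abstracts)
```

The resulting pair counts are the edge weights of an undirected co-occurrence network, which a tool such as Gephi can then lay out.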
Affiliation(s)
- Xue Lan
  - Department of Health Management, China Medical University, Shenyang, China
8
Sax DR, Warton EM, Sofrygin O, Mark DG, Ballard DW, Kene MV, Vinson DR, Reed ME. Automated analysis of unstructured clinical assessments improves emergency department triage performance: A retrospective deep learning analysis. J Am Coll Emerg Physicians Open 2023; 4:e13003. [PMID: 37448487 PMCID: PMC10337523 DOI: 10.1002/emp2.13003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Revised: 05/11/2023] [Accepted: 06/20/2023] [Indexed: 07/15/2023] Open
Abstract
Objectives Efficient and accurate emergency department (ED) triage is critical to prioritize the sickest patients and manage department flow. We explored the use of electronic health record data and advanced predictive analytics to improve triage performance. Methods Using a data set of over 5 million ED encounters of patients 18 years and older across 21 EDs from 2016 to 2020, we derived triage models using deep learning to predict 2 outcomes: hospitalization (primary outcome) and fast-track eligibility (exploratory outcome), defined as ED discharge with <2 resource types used (eg, laboratory or imaging studies) and no critical events (eg, resuscitative medications use or intensive care unit [ICU] admission). We report area under the receiver operator characteristic curve (AUC) and 95% confidence intervals (CI) for models using (1) triage variables alone (demographics and vital signs), (2) triage nurse clinical assessment alone (unstructured notes), and (3) triage variables plus clinical assessment for each prediction target. Results We found 12.7% of patients were hospitalized (n = 673,659) and 37.0% were fast-track eligible (n = 1,966,615). The AUC was lowest for models using triage variables alone: AUC 0.77 (95% CI 0.77-0.78) and 0.70 (95% CI 0.70-0.71) for hospitalization and fast-track eligibility, respectively, and highest for models incorporating clinical assessment with triage variables for both hospitalization and fast-track eligibility: AUC 0.87 (95% CI 0.87-0.87) for both prediction targets. Conclusion Our findings highlight the potential to use advanced predictive analytics to accurately predict key ED triage outcomes. Predictive accuracy was optimized when clinical assessments were added to models using simple structured variables alone.
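The fast-track definition and the AUC metric in this abstract can be made concrete with a short Python sketch. The function names, labels, and scores are illustrative assumptions; the AUC is computed by direct pairwise comparison, which is equivalent to the area under the ROC curve but too slow for millions of encounters.

```python
def fast_track_eligible(discharged, n_resource_types, critical_event):
    """Eligibility as defined in the abstract: ED discharge, fewer than 2
    resource types used, and no critical event."""
    return discharged and n_resource_types < 2 and not critical_event


def auc(labels, scores):
    """AUC as the probability that a random positive case outscores a random
    negative case, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented outcomes (1 = hospitalized) and model scores.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.3, 0.4, 0.5, 0.1]
example_auc = auc(labels, scores)
```

In this toy data five of the six positive-negative pairs are ranked correctly, so the AUC is 5/6.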
Affiliation(s)
- Dana R. Sax
  - Department of Emergency Medicine, Kaiser East Bay, and Kaiser Permanente Northern California Division of Research, Oakland, California, USA
- E. Margaret Warton
  - Kaiser Permanente Northern California Division of Research, Oakland, California, USA
- Dustin G. Mark
  - Department of Emergency Medicine, Kaiser East Bay, and Kaiser Permanente Northern California Division of Research, Oakland, California, USA
- Dustin W. Ballard
  - Department of Emergency Medicine, Kaiser San Rafael, and Kaiser Permanente Northern California Division of Research, Oakland, California, USA
- Mamata V. Kene
  - Department of Emergency Medicine, Kaiser San Rafael, and Kaiser Permanente Northern California Division of Research, Oakland, California, USA
- David R. Vinson
  - Department of Emergency Medicine, Roseville, and Kaiser Permanente Northern California Division of Research, Oakland, California, USA
- Mary E. Reed
  - Kaiser Permanente Northern California Division of Research, Oakland, California, USA
9
Hatachi T, Hashizume T, Taniguchi M, Inata Y, Aoki Y, Kawamura A, Takeuchi M. Machine Learning-Based Prediction of Hospital Admission Among Children in an Emergency Care Center. Pediatr Emerg Care 2023; 39:80-86. [PMID: 36719388 DOI: 10.1097/pec.0000000000002648] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2023]
Abstract
OBJECTIVES Machine learning-based prediction of hospital admissions may have the potential to optimize patient disposition and improve clinical outcomes by minimizing both undertriage and overtriage in crowded emergency care. We developed and validated the predictive abilities of machine learning-based predictions of hospital admissions in a pediatric emergency care center. METHODS A prognostic study was performed using retrospectively collected data of children younger than 16 years who visited a single pediatric emergency care center in Osaka, Japan, between August 1, 2016, and October 15, 2019. Generally, the center treated walk-in children and did not treat trauma injuries. The main outcome was hospital admission as determined by the physician. The 83 potential predictors available at presentation were selected from the following categories: demographic characteristics, triage level, physiological parameters, and symptoms. To identify predictive abilities for hospital admission, maximize the area under the precision-recall curve, and address imbalanced outcome classes, we developed the following models for the preperiod training cohort (67% of the samples) and also used them in the 1-year postperiod validation cohort (33% of the samples): (1) logistic regression, (2) support vector machine, (3) random forest, and (4) extreme gradient boosting. RESULTS Among 88,283 children who were enrolled, the median age was 3.9 years, with 47,931 (54.3%) boys and 1985 (2.2%) requiring hospital admission. Among the models, extreme gradient boosting achieved the highest predictive abilities (eg, area under the precision-recall curve, 0.26; 95% confidence interval, 0.25-0.27; area under the receiver operating characteristic curve, 0.86; 95% confidence interval, 0.84-0.88; sensitivity, 0.77; and specificity, 0.82). With an optimal threshold, the positive and negative likelihood ratios were 4.22, and 0.28, respectively. 
CONCLUSIONS Machine learning-based prediction of hospital admissions may support physicians' decision-making for hospital admissions. However, further improvements are required before implementing these models in real clinical settings.
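For readers checking the likelihood ratios, the standard definitions are LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The sketch below applies them to the rounded sensitivity (0.77) and specificity (0.82) quoted in the abstract; it gives roughly 4.28 and 0.28. The small gap from the reported 4.22 presumably comes from the authors using unrounded values, which is an assumption here rather than something the abstract states.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Rounded values as quoted in the abstract.
lr_pos, lr_neg = likelihood_ratios(0.77, 0.82)
```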
Affiliation(s)
- Takeshi Hatachi
  - Department of Intensive Care Medicine, Osaka Women's and Children's Hospital
- Takao Hashizume
  - Department of Pediatrics, SAKAI Children's Emergency Medical Center, Osaka
- Masashi Taniguchi
  - Department of Intensive Care Medicine, Osaka Women's and Children's Hospital
- Yu Inata
  - Department of Intensive Care Medicine, Osaka Women's and Children's Hospital
- Atsushi Kawamura
  - Department of Intensive Care Medicine, Osaka Women's and Children's Hospital
- Muneyuki Takeuchi
  - Department of Intensive Care Medicine, Osaka Women's and Children's Hospital
10
PToPI: A Comprehensive Review, Analysis, and Knowledge Representation of Binary Classification Performance Measures/Metrics. SN Computer Science 2023; 4:13. [PMID: 36267467 PMCID: PMC9569243 DOI: 10.1007/s42979-022-01409-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/29/2021] [Accepted: 09/13/2022] [Indexed: 11/06/2022]
Abstract
Although few performance evaluation instruments have been used conventionally in different machine learning-based classification problem domains, there are numerous ones defined in the literature. This study reviews and describes performance instruments via formally defined novel concepts and clarifies the terminology. The study first highlights the issues in performance evaluation via a survey of 78 mobile-malware classification studies and reviews terminology. Based on three research questions, it proposes novel concepts to identify characteristics, similarities, and differences of instruments that are categorized into 'performance measures' and 'performance metrics' in the classification context for the first time. The concepts reflecting the intrinsic properties of instruments such as canonical form, geometry, duality, complementation, dependency, and leveling, aim to reveal similarities and differences of numerous instruments, such as redundancy and ground-truth versus prediction focuses. As an application of knowledge representation, we introduced a new exploratory table called PToPI (Periodic Table of Performance Instruments) for 29 measures and 28 metrics (69 instruments including variant and parametric ones). Visualizing proposed concepts, PToPI provides a new relational structure for the instruments including graphical, probabilistic, and entropic ones to see their properties and dependencies all in one place. Applications of the exploratory table in six examples from different domains in the literature have shown that PToPI aids overall instrument analysis and selection of the proper performance metrics according to the specific requirements of a classification problem. We expect that the proposed concepts and PToPI will help researchers comprehend and use the instruments and follow a systematic approach to classification performance evaluation and publication.
11
Patel D, Cheetirala SN, Raut G, Tamegue J, Kia A, Glicksberg B, Freeman R, Levin MA, Timsina P, Klang E. Predicting Adult Hospital Admission from Emergency Department Using Machine Learning: An Inclusive Gradient Boosting Model. J Clin Med 2022; 11:jcm11236888. [PMID: 36498463 PMCID: PMC9740100 DOI: 10.3390/jcm11236888] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2022] [Revised: 10/24/2022] [Accepted: 11/15/2022] [Indexed: 11/24/2022] Open
Abstract
BACKGROUND AND AIM We analyzed an inclusive gradient boosting model to predict hospital admission from the emergency department (ED) at different time points. We compared its results to multiple models built exclusively at each time point. METHODS This retrospective multisite study utilized ED data from the Mount Sinai Health System, NY, during 2015-2019. Data included tabular clinical features and free-text triage notes represented using bag-of-words. A full gradient boosting model, trained on data available at different time points (30, 60, 90, 120, and 150 min), was compared to single models trained exclusively on data available at each time point. This was conducted by concatenating the rows of data available at each time point to one data matrix for the full model, where each row is considered a separate case. RESULTS The cohort included 1,043,345 ED visits. The full model showed comparable results to the single models at all time points (AUCs 0.84-0.88 for different time points for both the full and single models). CONCLUSION A full model trained on data concatenated from different time points showed similar results to single models trained at each time point. An ML-based prediction model can be used for identifying hospital admission.
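The row concatenation described in the methods can be sketched as follows: each visit contributes one row per time point, and all rows are stacked into a single training table. The feature names and values are invented, and recording the time point as a column is an assumption of this sketch rather than a detail given in the abstract.

```python
def stack_time_points(snapshots):
    """Stack per-time-point snapshots into one list of rows.

    snapshots: dict mapping minutes since arrival to a list of
    (visit_id, features_dict, admitted) tuples available at that time.
    """
    rows = []
    for minutes in sorted(snapshots):
        for visit_id, features, admitted in snapshots[minutes]:
            row = dict(features)
            row["visit_id"] = visit_id
            row["minutes_since_arrival"] = minutes  # assumed extra column
            row["admitted"] = admitted
            rows.append(row)
    return rows


# Invented data: two visits observed at 30 and 60 minutes.
snapshots = {
    30: [("v1", {"heart_rate": 110, "labs_ordered": 0}, 1),
         ("v2", {"heart_rate": 80, "labs_ordered": 0}, 0)],
    60: [("v1", {"heart_rate": 105, "labs_ordered": 2}, 1),
         ("v2", {"heart_rate": 78, "labs_ordered": 1}, 0)],
}
full_matrix = stack_time_points(snapshots)
```

Because the same visit appears in several rows, any train/test split for such a table should keep all rows of a visit on the same side to avoid leakage; this is a general precaution, not a procedure reported in the abstract.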
Affiliation(s)
- Dhavalkumar Patel
  - Mount Sinai Health System, New York, NY 10017, USA
  - Correspondence: (D.P.); (E.K.)
- Ganesh Raut
  - Mount Sinai Health System, New York, NY 10017, USA
- Arash Kia
  - Mount Sinai Health System, New York, NY 10017, USA
- Matthew A. Levin
  - Mount Sinai Health System, New York, NY 10017, USA
  - Department of Anesthesiology, Perioperative and Pain Management, Mount Sinai Hospital, New York, NY 10017, USA
- Prem Timsina
  - Mount Sinai Health System, New York, NY 10017, USA
- Eyal Klang
  - Mount Sinai Health System, New York, NY 10017, USA
  - Correspondence: (D.P.); (E.K.)
12
Hacking C, Verbeek H, Hamers JPH, Sion K, Aarts S. Text mining in long-term care: Exploring the usefulness of artificial intelligence in a nursing home setting. PLoS One 2022; 17:e0268281. [PMID: 36006921 PMCID: PMC9409502 DOI: 10.1371/journal.pone.0268281] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/10/2021] [Accepted: 04/27/2022] [Indexed: 11/19/2022]
Abstract
Objectives In nursing homes, narrative data are collected to evaluate quality of care as perceived by residents or their family members. This results in a large amount of textual data. However, as the volume of data increases, it becomes beyond the capability of humans to analyze it. This study aims to explore the usefulness of text mining approaches regarding narrative data gathered in a nursing home setting. Design Exploratory study showing a variety of text mining approaches. Setting and participants Data has been collected as part of the project ‘Connecting Conversations’: assessing experienced quality of care by conducting individual interviews with residents of nursing homes (n = 39), family members (n = 37) and care professionals (n = 49). Methods Several pre-processing steps were applied. A variety of text mining analyses were conducted: individual word frequencies, bigram frequencies, a correlation analysis and a sentiment analysis. A survey was conducted to establish a sentiment analysis model tailored to text collected in long-term care for older adults. Results Residents, family members and care professionals uttered respectively 285, 362 and 549 words per interview. Word frequency analysis showed that words that occurred most frequently in the interviews are often positive. Despite some differences in word usage, correlation analysis displayed that similar words are used by all three groups to describe quality of care. Most interviews displayed a neutral sentiment. Care professionals expressed a more diverse sentiment compared to residents and family members. A topic clustering analysis showed a total of 12 topics including ‘relations’ and ‘care environment’. Conclusions and implications This study demonstrates the usefulness of text mining to extend our knowledge regarding quality of care in a nursing home setting. With the rise of textual (narrative) data, text mining can lead to valuable new insights for long-term care for older adults.
Affiliation(s)
- Coen Hacking
  - Faculty of Health Medicine and Life Sciences, Department of Health Services Research, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, The Netherlands
  - The Living Lab in Ageing & Long-Term Care, Maastricht, The Netherlands
- Hilde Verbeek
  - Faculty of Health Medicine and Life Sciences, Department of Health Services Research, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, The Netherlands
  - The Living Lab in Ageing & Long-Term Care, Maastricht, The Netherlands
- Jan P. H. Hamers
  - Faculty of Health Medicine and Life Sciences, Department of Health Services Research, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, The Netherlands
  - The Living Lab in Ageing & Long-Term Care, Maastricht, The Netherlands
- Katya Sion
  - Faculty of Health Medicine and Life Sciences, Department of Health Services Research, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, The Netherlands
  - The Living Lab in Ageing & Long-Term Care, Maastricht, The Netherlands
- Sil Aarts
  - Faculty of Health Medicine and Life Sciences, Department of Health Services Research, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, The Netherlands
  - The Living Lab in Ageing & Long-Term Care, Maastricht, The Netherlands
13
Zhao X, Lai JW, Wah Ho AF, Liu N, Hock Ong ME, Cheong KH. Predicting hospital emergency department visits with deep learning approaches. Biocybern Biomed Eng 2022. [DOI: 10.1016/j.bbe.2022.07.008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
14
Leonard F, Gilligan J, Barrett MJ. Development of a low-dimensional model to predict admissions from triage at a pediatric emergency department. J Am Coll Emerg Physicians Open 2022; 3:e12779. [PMID: 35859857 PMCID: PMC9286530 DOI: 10.1002/emp2.12779] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/06/2022] [Revised: 05/24/2022] [Accepted: 06/17/2022] [Indexed: 11/26/2022]
Abstract
Objectives This study aims to develop and internally validate a low-dimensional model to predict outcomes (admission or discharge) using commonly entered data up to the post-triage process to improve patient flow in the pediatric emergency department (ED). In hospital settings where electronic data are limited, a low-dimensional model with fewer variables may be easier to implement. Methods This prognostic study included ED attendances in 2017 and 2018. The Cross Industry Standard Process for Data Mining methodology was followed. Eligibility criteria were applied to the data set, which was then split into 70% train and 30% test. Sampling techniques were compared. Gradient boosting machine (GBM), logistic regression, and naïve Bayes models were created. Variables of importance were obtained from the model with the highest area under the curve (AUC) and used to create a low-dimensional model. Results Eligible attendances totaled 72,229 (15% admission rate). The AUC was 0.853 (95% confidence interval [CI], 0.846-0.859) for GBM, 0.845 (95% CI, 0.838-0.852) for logistic regression and 0.813 (95% CI, 0.806-0.821) for naïve Bayes. Important predictors in the GBM model used to create a low-dimensional model were presenting complaint, triage category, referral source, registration month, location type (resuscitation/other), distance traveled, admission history, and weekday (AUC 0.835 [95% CI, 0.829-0.842]). Conclusions Admission and discharge probability can be predicted early in a pediatric ED using 8 variables. Future work could analyze the false positives and false negatives to gain an understanding of the implementation of these predictions.
Affiliation(s)
- Fiona Leonard
  - Business Intelligence Unit, Children's Health Ireland at Crumlin, Dublin, Ireland
- John Gilligan
  - School of Computer Science, Technological University Dublin, Dublin, Ireland
- Michael J. Barrett
  - Department of Paediatric Emergency Medicine, Children's Health Ireland at Crumlin, Dublin, Ireland
  - Women's and Children's Health, School of Medicine, University College Dublin, Dublin, Ireland
15
Chin KC, Cheng YC, Sun JT, Ou CY, Hu CH, Tsai MC, Ma MHM, Chiang WC, Chen AY. Machine Learning-Based Text Analysis to Predict Severely Injured Patients in Emergency Medical Dispatch: Model Development and Validation. J Med Internet Res 2022; 24:e30210. [PMID: 35687393 PMCID: PMC9233260 DOI: 10.2196/30210] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/05/2021] [Revised: 09/28/2021] [Accepted: 04/22/2022] [Indexed: 11/30/2022]
Abstract
Background Early recognition of severely injured patients in prehospital settings is of paramount importance for timely treatment and transportation of patients to further treatment facilities. The dispatching accuracy has seldom been addressed in previous studies. Objective In this study, we aimed to build a machine learning–based model through text mining of emergency calls for the automated identification of severely injured patients after a road accident. Methods Audio recordings of road accidents in Taipei City, Taiwan, in 2018 were obtained and randomly sampled. Data on call transfers or non-Mandarin speeches were excluded. To predict cases of severe trauma identified on-site by emergency medical technicians, all included cases were evaluated by both humans (6 dispatchers) and a machine learning model, that is, a prehospital-activated major trauma (PAMT) model. The PAMT model was developed using term frequency–inverse document frequency, rule-based classification, and a Bernoulli naïve Bayes classifier. Repeated random subsampling cross-validation was applied to evaluate the robustness of the model. The prediction performance of dispatchers and the PAMT model, in severe cases, was compared. Performance was indicated by sensitivity, specificity, positive predictive value, negative predictive value, and accuracy. Results Although the mean sensitivity and negative predictive value obtained by the PAMT model were higher than those of dispatchers, the dispatchers obtained higher mean specificity, positive predictive value, and accuracy. The mean accuracy of the PAMT model, from certainty level 0 (lowest certainty) to level 6 (highest certainty), was higher except for levels 5 and 6. The overall performances of the dispatchers and the PAMT model were similar; however, the PAMT model had higher accuracy in cases where the dispatchers were less certain of their judgments.
Conclusions A machine learning–based model, called the PAMT model, was developed to predict severe road accident trauma. The results of our study suggest that the accuracy of the PAMT model is not superior to that of the participating dispatchers; however, it may assist dispatchers when they lack confidence while making a judgment.
Affiliation(s)
- Kuan-Chen Chin
  - Department of Emergency Medicine, Taipei Hospital, Ministry of Health and Welfare, New Taipei City, Taiwan
- Yu-Chia Cheng
  - Department of Civil Engineering, National Taiwan University, Taipei City, Taiwan
- Jen-Tang Sun
  - Department of Emergency Medicine, Far Eastern Memorial Hospital, New Taipei City, Taiwan
- Chih-Yen Ou
  - Department of Civil Engineering, National Taiwan University, Taipei City, Taiwan
- Chun-Hua Hu
  - Emergency Medical Service Division, Taipei City Fire Department, Taipei City, Taiwan
- Ming-Chi Tsai
  - Emergency Medical Service Division, Taipei City Fire Department, Taipei City, Taiwan
- Matthew Huei-Ming Ma
  - Department of Emergency Medicine, National Taiwan University Hospital, Taipei City, Taiwan
  - Department of Emergency Medicine, National Taiwan University Hospital, Yun-Lin Branch, Yunlin County, Taiwan
- Wen-Chu Chiang
  - Department of Emergency Medicine, National Taiwan University Hospital, Taipei City, Taiwan
  - Department of Emergency Medicine, National Taiwan University Hospital, Yun-Lin Branch, Yunlin County, Taiwan
- Albert Y Chen
  - Department of Civil Engineering, National Taiwan University, Taipei City, Taiwan
16
Ryu AJ, Romero-Brufau S, Qian R, Heaton HA, Nestler DM, Ayanian S, Kingsley TC. Assessing the Generalizability of a Clinical Machine Learning Model Across Multiple Emergency Departments. Mayo Clin Proc Innov Qual Outcomes 2022; 6:193-199. [PMID: 35517246 PMCID: PMC9062323 DOI: 10.1016/j.mayocpiqo.2022.03.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/14/2023]
Abstract
Objective To assess the generalizability of a clinical machine learning algorithm across multiple emergency departments (EDs). Patients and Methods We obtained data on all ED visits at our health care system's largest ED from May 5, 2018, to December 31, 2019. We also obtained data from 3 satellite EDs and 1 distant-hub ED from May 1, 2018, to December 31, 2018. A gradient-boosted machine model was trained on pooled data from the included EDs. To prevent the effect of differing training set sizes, the data were randomly downsampled to match those of our smallest ED. A second model was trained on this downsampled, pooled data. The model's performance was compared using area under the receiver operating characteristic (AUC). Finally, site-specific models were trained and tested across all the sites, and the importance of features was examined to understand the reasons for differing generalizability. Results The training data sets contained 1918-64,161 ED visits. The AUC for the pooled model ranged from 0.84 to 0.94 across the sites; the performance decreased slightly when Ns were downsampled to match those of our smallest ED site. When site-specific models were trained and tested across all the sites, the AUCs ranged more widely from 0.71 to 0.93. Within a single ED site, the performance of the 5 site-specific models was most variable for our largest and smallest EDs. Finally, when the importance of features was examined, several features were common to all site-specific models; however, the weight of these features differed. Conclusion A machine learning model for predicting hospital admission from the ED will generalize fairly well within the health care system but will still have significant differences in AUC performance across sites because of site-specific factors.
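The downsampling step in this abstract (randomly reducing each site's data to the size of the smallest ED before pooling) might look roughly like the sketch below. The function names, site labels, and visit counts are hypothetical, and uniform sampling without replacement is an assumption; the abstract only says the data were "randomly downsampled".

```python
import random
from typing import Dict, List


def downsample_to_smallest(sites: Dict[str, List[dict]], seed: int = 0) -> Dict[str, List[dict]]:
    """Randomly downsample every site to the size of the smallest site.

    Assumption: uniform sampling without replacement.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_min = min(len(visits) for visits in sites.values())
    return {name: rng.sample(visits, n_min) for name, visits in sites.items()}


def pool(sites: Dict[str, List[dict]]) -> List[dict]:
    """Concatenate all sites into one pooled training set."""
    return [visit for visits in sites.values() for visit in visits]


# Hypothetical toy data: one large site and one small satellite site.
sites = {
    "largest": [{"site": "largest", "i": i} for i in range(50)],
    "satellite": [{"site": "satellite", "i": i} for i in range(8)],
}

balanced = downsample_to_smallest(sites)
pooled = pool(balanced)  # each site now contributes the same number of visits
```

Balancing the sites this way removes training-set size as an explanation when pooled-model performance differs between sites, which is the motivation the abstract gives.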
Affiliation(s)
- Alexander J. Ryu
  - Division of Hospital Internal Medicine, Mayo Clinic, Rochester, MN
- Ray Qian
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN
- Shant Ayanian
  - Division of Hospital Internal Medicine, Mayo Clinic, Rochester, MN
17
Cheng X, Cao Q, Liao SS. An overview of literature on COVID-19, MERS and SARS: Using text mining and latent Dirichlet allocation. J Inf Sci 2022; 48:304-320. [PMID: 38603038 PMCID: PMC7464068 DOI: 10.1177/0165551520954674] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Indexed: 02/06/2023]
Abstract
The unprecedented outbreak of COVID-19 is one of the most serious global threats to public health in this century. During this crisis, specialists in information science could play key roles to support the efforts of scientists in the health and medical community for combatting COVID-19. In this article, we demonstrate that information specialists can support health and medical community by applying text mining technique with latent Dirichlet allocation procedure to perform an overview of a mass of coronavirus literature. This overview presents the generic research themes of the coronavirus diseases: COVID-19, MERS and SARS, reveals the representative literature per main research theme and displays a network visualisation to explore the overlapping, similarity and difference among these themes. The overview can help the health and medical communities to extract useful information and interrelationships from coronavirus-related studies.
Affiliation(s)
- Xian Cheng
  - Business School, Sichuan University, China
- Qiang Cao
  - Department of Information Systems, City University of Hong Kong, China
18
Penfold RB, Carrell DS, Cronkite DJ, Pabiniak C, Dodd T, Glass AM, Johnson E, Thompson E, Arrighi HM, Stang PE. Development of a machine learning model to predict mild cognitive impairment using natural language processing in the absence of screening. BMC Med Inform Decis Mak 2022; 22:129. [PMID: 35549702 PMCID: PMC9097352 DOI: 10.1186/s12911-022-01864-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/01/2019] [Accepted: 04/24/2022] [Indexed: 11/16/2022]
Abstract
BACKGROUND Patients and their loved ones often report symptoms or complaints of cognitive decline that clinicians note in free clinical text, but no structured screening or diagnostic data are recorded. These symptoms/complaints may be signals that predict who will go on to be diagnosed with mild cognitive impairment (MCI) and ultimately develop Alzheimer's Disease or related dementias. Our objective was to develop a natural language processing system and prediction model for identification of MCI from clinical text in the absence of screening or other structured diagnostic information. METHODS There were two populations of patients: 1794 participants in the Adult Changes in Thought (ACT) study and 2391 patients in the general population of Kaiser Permanente Washington. All individuals had standardized cognitive assessment scores. We excluded patients with a diagnosis of Alzheimer's Disease, Dementia or use of donepezil. We manually annotated 10,391 clinic notes to train the NLP model. Standard Python code was used to extract phrases from notes and map each phrase to a cognitive functioning concept. Concepts derived from the NLP system were used to predict future MCI. The prediction model was trained on the ACT cohort and 60% of the general population cohort with 40% withheld for validation. We used a least absolute shrinkage and selection operator logistic regression approach (LASSO) to fit a prediction model with MCI as the prediction target. Using the predicted case status from the LASSO model and known MCI from standardized scores, we constructed receiver operating curves to measure model performance. RESULTS Chart abstraction identified 42 MCI concepts. Prediction model performance in the validation data set was modest with an area under the curve of 0.67. Setting the cutoff for correct classification at 0.60, the classifier yielded sensitivity of 1.7%, specificity of 99.7%, PPV of 70% and NPV of 70.5% in the validation cohort. 
DISCUSSION AND CONCLUSION Although the sensitivity of the machine learning model was poor, negative predictive value was high, an important characteristic of models used for population-based screening. While an AUC of 0.67 is generally considered moderate performance, it is also comparable to several tests that are widely used in clinical practice.
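The four screening metrics reported in this abstract all derive from the same 2×2 confusion matrix. The sketch below shows the standard definitions. The counts are hypothetical, chosen only to illustrate how very low sensitivity can coexist with very high specificity when a classifier rarely predicts the positive class; they are not the study's data.

```python
from typing import Dict


def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Standard screening metrics from confusion-matrix counts."""

    def ratio(num: int, den: int) -> float:
        # Return NaN rather than raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # true positives among actual positives
        "specificity": ratio(tn, tn + fp),  # true negatives among actual negatives
        "ppv": ratio(tp, tp + fp),          # true positives among predicted positives
        "npv": ratio(tn, tn + fn),          # true negatives among predicted negatives
    }


# Hypothetical counts (NOT from the study): only 10 of 1360 cases are flagged.
metrics = screening_metrics(tp=7, fp=3, fn=393, tn=957)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because NPV depends on how common the condition is in the cohort, it can stay well above 50% even when the classifier misses almost every true case, which is worth keeping in mind when reading the sensitivity and NPV figures side by side.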
Affiliation(s)
- Robert B Penfold
  - Kaiser Permanente Washington Health Research Institute, 1730 Minor Ave., Suite 1600, Seattle, WA, 98101, USA
- David S Carrell
  - Kaiser Permanente Washington Health Research Institute, 1730 Minor Ave., Suite 1600, Seattle, WA, 98101, USA
- David J Cronkite
  - Kaiser Permanente Washington Health Research Institute, 1730 Minor Ave., Suite 1600, Seattle, WA, 98101, USA
- Chester Pabiniak
  - Kaiser Permanente Washington Health Research Institute, 1730 Minor Ave., Suite 1600, Seattle, WA, 98101, USA
- Tammy Dodd
  - Kaiser Permanente Washington Health Research Institute, 1730 Minor Ave., Suite 1600, Seattle, WA, 98101, USA
- Ashley Mh Glass
  - Kaiser Permanente Washington Health Research Institute, 1730 Minor Ave., Suite 1600, Seattle, WA, 98101, USA
- Eric Johnson
  - Kaiser Permanente Washington Health Research Institute, 1730 Minor Ave., Suite 1600, Seattle, WA, 98101, USA
- Ella Thompson
  - Kaiser Permanente Washington Health Research Institute, 1730 Minor Ave., Suite 1600, Seattle, WA, 98101, USA
- Paul E Stang
  - Janssen Research and Development, LLC, Raritan, USA
19
Dai LL, Jiang TC, Li PF, Shao H, Wang X, Wang Y, Jia LQ, Liu M, An L, Jing XG, Cheng Z. Predictors of Maternal Death Among Women With Pulmonary Hypertension in China From 2012 to 2020: A Retrospective Single-Center Study. Front Cardiovasc Med 2022; 9:814557. [PMID: 35509273 PMCID: PMC9058072 DOI: 10.3389/fcvm.2022.814557] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/20/2021] [Accepted: 03/29/2022] [Indexed: 12/04/2022]
Abstract
Background Previous studies have suggested that pregnant women with pulmonary hypertension (PH) have high maternal mortality. However, indexes or factors that can predict maternal death are lacking. Methods We retrospectively reviewed pregnant women with PH admitted for delivery from 2012 to 2020 and followed them for over 6 months. The patients were divided into two groups according to 10-day survival status after delivery. Predictive models and predictors for maternal death were identified using four machine learning algorithms: naïve Bayes, random forest, gradient boosting decision tree (GBDT), and support vector machine. Results A total of 299 patients were included. The most frequent PH classifications were Group 1 PH (73.9%) and Group 2 PH (23.7%). The mortality within 10 days after delivery was 9.4% and higher in Group 1 PH than in the other PH groups (11.7 vs. 2.6%, P = 0.016). We identified 17 predictors, each with a P-value < 0.05 by univariable analysis, that were associated with an increased risk of death, and the most notable were pulmonary artery systolic pressure (PASP), platelet count, red cell distribution width, N-terminal brain natriuretic peptide (NT-proBNP), and albumin (all P < 0.01). Four prediction models were established using the candidate variables, and the GBDT model showed the best performance (F1-score = 66.7%, area under the curve = 0.93). Feature importance showed that the three most important predictors were NT-proBNP, PASP, and albumin. Conclusion Mortality remained high, particularly in Group 1 PH. Our study shows that NT-proBNP, PASP, and albumin are the most important predictors of maternal death in the GBDT model. These findings may help clinicians provide better advice regarding fertility for women with PH.
Affiliation(s)
- Ling-Ling Dai
  - Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Tian-Ci Jiang
  - Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Peng-Fei Li
  - Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Hua Shao
  - Department of Anaesthesiology, Pain and Perioperative Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Xi Wang
  - Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Yu Wang
  - Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Liu-Qun Jia
  - Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Meng Liu
  - Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Lin An
  - Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Xiao-Gang Jing
  - Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Zhe Cheng
  - Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
  - Correspondence: Zhe Cheng
20
Chintalapudi N, Angeloni U, Battineni G, di Canio M, Marotta C, Rezza G, Sagaro GG, Silenzi A, Amenta F. LASSO Regression Modeling on Prediction of Medical Terms among Seafarers’ Health Documents Using Tidy Text Mining. Bioengineering (Basel) 2022; 9:bioengineering9030124. [PMID: 35324813 PMCID: PMC8945331 DOI: 10.3390/bioengineering9030124] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 02/07/2022] [Revised: 03/02/2022] [Accepted: 03/16/2022] [Indexed: 12/31/2022]
Abstract
Generally, seafarers face a higher risk of illnesses and accidents than land workers. In most cases, there are no medical professionals on board seagoing vessels, which makes disease diagnosis even more difficult. When this occurs, onshore doctors may be able to provide medical advice through telemedicine by receiving better symptomatic and clinical details in the health abstracts of seafarers. The adoption of text mining techniques can assist in extracting diagnostic information from clinical texts. We applied lexicon sentimental analysis to explore the automatic labeling of positive and negative healthcare terms to seafarers’ text healthcare documents. This was due to the lack of experimental evaluations using computational techniques. In order to classify diseases and their associated symptoms, the LASSO regression algorithm is applied to analyze these text documents. A visualization of symptomatic data frequency for each disease can be achieved by analyzing TF-IDF values. The proposed approach allows for the classification of text documents with 93.8% accuracy by using a machine learning model called LASSO regression. It is possible to classify text documents effectively with tidy text mining libraries. In addition to delivering health assistance, this method can be used to classify diseases and establish health observatories. Knowledge developed in the present work will be applied to establish an Epidemiological Observatory of Seafarers’ Pathologies and Injuries. This Observatory will be a collaborative initiative of the Italian Ministry of Health, University of Camerino, and International Radio Medical Centre (C.I.R.M.), the Italian TMAS.
Affiliation(s)
- Nalini Chintalapudi
  - Clinical Research Centre, School of Medicinal and Health Products Sciences, University of Camerino, 62032 Camerino, Italy
  - Correspondence: Tel.: +39-35-33776704
- Ulrico Angeloni
  - General Directorate of Health Prevention, Ministry of Health, 00144 Rome, Italy
- Gopi Battineni
  - Clinical Research Centre, School of Medicinal and Health Products Sciences, University of Camerino, 62032 Camerino, Italy
- Marzio di Canio
  - Clinical Research Centre, School of Medicinal and Health Products Sciences, University of Camerino, 62032 Camerino, Italy
  - Research Department, International Radio Medical Centre (C.I.R.M.), 00144 Rome, Italy
- Claudia Marotta
  - General Directorate of Health Prevention, Ministry of Health, 00144 Rome, Italy
- Giovanni Rezza
  - General Directorate of Health Prevention, Ministry of Health, 00144 Rome, Italy
- Getu Gamo Sagaro
  - Clinical Research Centre, School of Medicinal and Health Products Sciences, University of Camerino, 62032 Camerino, Italy
- Andrea Silenzi
  - General Directorate of Health Prevention, Ministry of Health, 00144 Rome, Italy
- Francesco Amenta
  - Clinical Research Centre, School of Medicinal and Health Products Sciences, University of Camerino, 62032 Camerino, Italy
  - Research Department, International Radio Medical Centre (C.I.R.M.), 00144 Rome, Italy
21
Hwang S, Lee B. Machine learning-based prediction of critical illness in children visiting the emergency department. PLoS One 2022; 17:e0264184. [PMID: 35176113 PMCID: PMC8853514 DOI: 10.1371/journal.pone.0264184] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 06/22/2021] [Accepted: 02/04/2022] [Indexed: 12/23/2022]
Abstract
OBJECTIVES Triage is an essential emergency department (ED) process designed to provide timely management depending on acuity and severity; however, the process may be inconsistent with clinical and hospitalization outcomes. Therefore, studies have attempted to augment this process with machine learning models, showing advantages in predicting critical conditions and hospitalization outcomes. The aim of this study was to utilize nationwide registry data to develop a machine learning-based classification model to predict the clinical course of pediatric ED visits. METHODS This cross-sectional observational study used data from the National Emergency Department Information System on emergency visits of children under 15 years of age from January 1, 2016, to December 31, 2017. The primary and secondary outcomes were to identify critically ill children and predict hospitalization from triage data, respectively. We developed and tested a random forest model with the under sampled dataset and validated the model using the entire dataset. We compared the model's performance with that of the conventional triage system. RESULTS A total of 2,621,710 children were eligible for the analysis and included 12,951 (0.5%) critical outcomes and 303,808 (11.6%) hospitalizations. After validation, the area under the receiver operating characteristic curve was 0.991 (95% confidence interval [CI] 0.991-0.992) for critical outcomes and 0.943 (95% CI 0.943-0.944) for hospitalization, which were higher than those of the conventional triage system. CONCLUSIONS The machine learning-based model using structured triage data from a nationwide database can effectively predict critical illness and hospitalizations among children visiting the ED.
Affiliation(s)
- Soyun Hwang
  - Department of Pediatrics, Severance Children’s Hospital, Yonsei University College of Medicine, Seoul, Korea
- Bongjin Lee
  - Department of Pediatrics, Seoul National University Hospital, Seoul, Korea
22
Rule-Based Information Extraction from Free-Text Pathology Reports Reveals Trends in South African Female Breast Cancer Molecular Subtypes and Ki67 Expression. BIOMED RESEARCH INTERNATIONAL 2022; 2022:6157861. [PMID: 35355821 PMCID: PMC8960023 DOI: 10.1155/2022/6157861] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/08/2021] [Accepted: 12/29/2021] [Indexed: 12/23/2022]
Abstract
Clinical information on molecular subtypes and the Ki67 index is critical for breast cancer (BC) prognosis and personalised treatment plan. Extracting such information into structured data is essential for research, auditing, and cancer incidence reporting and underpins the potential for automated decision support. Herewith, we developed a rule-based natural language processing algorithm that retrieved and extracted important BC parameters from free-text pathology reports towards exploring molecular subtypes and Ki67-proliferation trends. We considered malignant BC pathology reports with different free-text narrative attributes from the South African National Health Laboratory Service. The reports were preprocessed and parsed through the algorithm. Parameters extracted by the algorithm were validated against manually extracted parameters. For all parameters extracted, we obtained accurate annotations of 83-100%, 93-100%, 91-100%, and 92-100% precision, recall, F1-score, and kappa, respectively. There was a significant trend in the proportion of each molecular subtype by patient age, histologic type, grade, Ki67, and race. The findings also showed significant association in the Ki67 trend with hormone receptors, human epidermal growth factors, age, grade, and race. Our approach bridges the gap between data availability and actionable knowledge and provides a framework that could be adapted and reused in other cancers and beyond cancer studies. Information extracted from these reports showed interesting trends that may be exploited for BC screening and treatment resources in South Africa. Finally, this study strongly encourages the implementation of a synoptic style pathology report in South Africa.
|
23
|
|
24
|
Gour A, Kumari S. A 360-Degree View of a Hospital by Analysing Patient’s Online Reviews Using Fuzzy Sentiment Analysis. JOURNAL OF HEALTH MANAGEMENT 2021. [DOI: 10.1177/09720634211032017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Millions of people use the Internet for developing new skills, booking online tickets, socialising, etc. Among these activities, posting online reviews has become customary for customers and is the fastest way to make one’s voice heard. With the advent of analytics, and more specifically text mining, customers' online reviews have made a huge difference in shaping companies' future strategies and have also helped them study customer responses to their rivals. In an effort to help hospitals analyse patient reviews posted on various social media platforms, this paper analyses 659 reviews from people across the nation of one of the best medical colleges and hospitals in India, the All India Institute of Medical Sciences, New Delhi. The article attempts to develop a fuzzy sentiment analysis model integrated with a naïve Bayes classifier, which can help hospitals analyse reviews of different hospitals and develop their own social media competitive analysis strategy. The results reveal the value text mining can bring to any hospital and the immense business value that it holds.
Affiliation(s)
- Alekh Gour
- Department of Healthcare Management and Big Data Analytics, Goa Institute of Management, Ribandar, Goa, India
| | - Sony Kumari
- Business Analytics Program, Santa Clara University, California, United States
| |
|
25
|
Zeng J, Banerjee I, Henry AS, Wood DJ, Shachter RD, Gensheimer MF, Rubin DL. Natural Language Processing to Identify Cancer Treatments With Electronic Medical Records. JCO Clin Cancer Inform 2021; 5:379-393. [PMID: 33822653 DOI: 10.1200/cci.20.00173] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/20/2022] Open
Abstract
PURPOSE Knowing the treatments administered to patients with cancer is important for treatment planning and correlating treatment patterns with outcomes for personalized medicine study. However, existing methods to identify treatments are often lacking. We develop a natural language processing approach with structured electronic medical records and unstructured clinical notes to identify the initial treatment administered to patients with cancer. METHODS We used data from 4,412 patients with 483,782 clinical notes from the Stanford Cancer Institute Research Database containing patients with nonmetastatic prostate, oropharynx, and esophagus cancer. We trained treatment identification models for each cancer type separately and compared the performance of using only structured data, only unstructured data (bag-of-words, doc2vec, fasttext), and combinations of both (structured + bow, structured + doc2vec, structured + fasttext). We optimized the identification model among five machine learning methods (logistic regression, multilayer perceptrons, random forest, support vector machines, and stochastic gradient boosting). We used the treatment information recorded in the cancer registry as the gold standard and compared our methods to an identification baseline based on billing codes. RESULTS For prostate cancer, we achieved an f1-score of 0.99 (95% CI, 0.97 to 1.00) for radiation and 1.00 (95% CI, 0.99 to 1.00) for surgery using structured + doc2vec. For oropharynx cancer, we achieved an f1-score of 0.78 (95% CI, 0.58 to 0.93) for chemoradiation and 0.83 (95% CI, 0.69 to 0.95) for surgery using doc2vec. For esophagus cancer, we achieved an f1-score of 1.0 (95% CI, 1.0 to 1.0) for both chemoradiation and surgery using all combinations of structured and unstructured data. We found that employing the free-text clinical notes outperforms using the billing codes or only structured data for all three cancer types.
CONCLUSION Our results show that treatment identification using free-text clinical notes greatly improves upon the performance using billing codes and simple structured data. The approach can be used for treatment cohort identification and adapted for longitudinal cancer treatment identification.
Affiliation(s)
- Jiaming Zeng
- Department of Management Science and Engineering, Huang Engineering Center, Stanford, CA
| | - Imon Banerjee
- Department of Biomedical Informatics, Department of Radiology, Emory University School of Medicine, Atlanta, GA
| | - A Solomon Henry
- Research Informatics Center, Stanford University, Stanford, CA
| | - Douglas J Wood
- Research Informatics Center, Stanford University, Stanford, CA
| | - Ross D Shachter
- Department of Management Science and Engineering, Stanford University School of Engineering, Stanford, CA
| | - Michael F Gensheimer
- Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA
| | - Daniel L Rubin
- Department of Biomedical Data Science, Radiology, and Medicine (Biomedical Informatics), Stanford University School of Medicine, Stanford, CA
| |
|
26
|
Lenivtceva ID, Kopanitsa G. The Pipeline for Standardizing Russian Unstructured Allergy Anamnesis Using FHIR AllergyIntolerance Resource. Methods Inf Med 2021; 60:95-103. [PMID: 34425626 DOI: 10.1055/s-0041-1733945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Abstract
BACKGROUND Most essential medical knowledge is stored as free text, which is difficult to process. Standardization of medical narratives is an important task for data exchange, integration, and semantic interoperability. OBJECTIVES The article aims to develop an end-to-end pipeline for structuring Russian free-text allergy anamnesis using international standards. METHODS The pipeline for free-text data standardization is based on FHIR (Fast Healthcare Interoperability Resources) and SNOMED CT (Systematized Nomenclature of Medicine Clinical Terms) to ensure semantic interoperability. The pipeline solves common tasks such as data preprocessing, classification, categorization, entity extraction, and semantic code assignment. Machine learning, rule-based, and dictionary-based approaches were used to compose the pipeline. The pipeline was evaluated on 166 randomly chosen medical records. RESULTS The AllergyIntolerance resource was used to represent allergy anamnesis. The data preprocessing module included a dictionary with over 90,000 words, including specific medication terms, and more than 20 regular expressions for error correction. The classification and categorization modules resulted in four dictionaries with allergy terms (2,675 terms in total), which were mapped to SNOMED CT concepts. F-scores for the different steps are 0.945 for filtering, 0.90 to 0.96 for allergy categorization, and 0.90 and 0.93 for allergen and reaction extraction, respectively. The allergy terminology coverage is more than 95%. CONCLUSION The proposed pipeline is a step towards ensuring semantic interoperability of Russian free-text medical records and could be effective in standardization systems for further data exchange and integration.
Affiliation(s)
- Iuliia D Lenivtceva
- National Center for Cognitive Research, ITMO University, Saint-Petersburg, Russia
| | - Georgy Kopanitsa
- National Center for Cognitive Research, ITMO University, Saint-Petersburg, Russia
| |
|
27
|
Gero Z, Ho JC. CATAN: Chart-aware temporal attention network for adverse outcome prediction. IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS 2021; 2021:83-92. [PMID: 35079697 PMCID: PMC8785859 DOI: 10.1109/ichi52183.2021.00024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Electronic health record systems are increasingly being adopted by a variety of hospitals and medical centers. This provides an opportunity to leverage automated computer systems in assisting healthcare workers. One of the least utilized but rich sources of patient information is the unstructured clinical text. In this work, we develop CATAN, a chart-aware temporal attention network for learning patient representations from clinical notes. We introduce a novel representation where each note is considered a single unit, like a sentence, and composed of attention-weighted words. The notes in turn are aggregated into a patient representation using a second weighting unit, note attention. Unlike standard attention computations, which focus only on the content of the note, we incorporate the chart-time for each note as a constraint for attention calculation. This allows our model to focus on notes closer to the prediction time. Using the MIMIC-III dataset, we empirically show that our patient representation and attention calculation achieve the best performance in comparison with various state-of-the-art baselines for one-year mortality prediction and 30-day hospital readmission. Moreover, the attention weights can be used to offer transparency into our model's predictions.
Affiliation(s)
- Zelalem Gero
- Department of Computer Science, Emory University, Atlanta, USA
| | - Joyce C Ho
- Department of Computer Science, Emory University, Atlanta, USA
| |
|
28
|
Abstract
Electronic health records (EHRs) are becoming a vital source of data for healthcare quality improvement, research, and operations. However, much of the most valuable information contained in EHRs remains buried in unstructured text. The field of clinical text mining has advanced rapidly in recent years, transitioning from rule-based approaches to machine learning and, more recently, deep learning. With new methods come new challenges, however, especially for those new to the field. This review provides an overview of clinical text mining for those who are encountering it for the first time (e.g., physician researchers, operational analytics teams, machine learning scientists from other domains). While not a comprehensive survey, this review describes the state of the art, with a particular focus on new tasks and methods developed over the past few years. It also identifies key barriers between these remarkable technical advances and the practical realities of implementation in health systems and in industry.
Affiliation(s)
- Bethany Percha
- Department of Medicine and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10025, USA;
| |
|
29
|
Development and Validation of Machine Learning Models to Predict Admission From Emergency Department to Inpatient and Intensive Care Units. Ann Emerg Med 2021; 78:290-302. [PMID: 33972128 DOI: 10.1016/j.annemergmed.2021.02.029] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/24/2020] [Revised: 02/10/2021] [Accepted: 02/25/2021] [Indexed: 12/23/2022]
Abstract
STUDY OBJECTIVE This study aimed to develop and validate 2 machine learning models that use historical and current-visit patient data from electronic health records to predict the probability of patient admission to either an inpatient unit or ICU at each hour (up to 24 hours) of an emergency department (ED) encounter. The secondary goal was to provide a framework for the operational implementation of these machine learning models. METHODS Data were curated from 468,167 adult patient encounters in 3 EDs (1 academic and 2 community-based EDs) of a large academic health system from August 1, 2015, to October 31, 2018. The models were validated using encounter data from January 1, 2019, to December 31, 2019. An operational user dashboard was developed, and the models were run on real-time encounter data. RESULTS For the intermediate admission model, the area under the receiver operating characteristic curve was 0.873 and the area under the precision-recall curve was 0.636. For the ICU admission model, the area under the receiver operating characteristic curve was 0.951 and the area under the precision-recall curve was 0.461. The models had similar performance in both the academic- and community-based settings as well as across the 2019 and real-time encounter data. CONCLUSION Machine learning models were developed to accurately make predictions regarding the probability of inpatient or ICU admission throughout the entire duration of a patient's encounter in ED and not just at the time of triage. These models remained accurate for a patient cohort beyond the time period of the initial training data and were integrated to run on live electronic health record data, with similar performance.
|
30
|
Heyming TW, Knudsen-Robbins C, Feaster W, Ehwerhemuepha L. Criticality index conducted in pediatric emergency department triage. Am J Emerg Med 2021; 48:209-217. [PMID: 33975133 DOI: 10.1016/j.ajem.2021.05.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2021] [Revised: 04/27/2021] [Accepted: 05/02/2021] [Indexed: 10/21/2022] Open
Abstract
OBJECTIVE To develop and analyze the performance of a machine learning model capable of predicting the disposition of patients presenting to a pediatric emergency department (ED) based on triage assessment and historical information mined from electronic health records. METHODS We retrospectively reviewed data from 585,142 ED visits at a pediatric quaternary care institution between 2013 and 2020. An extreme gradient boosting machine learning model was trained on a randomly selected training data set (50%) to stratify patients into 3 classes: (1) high criticality (patients requiring intensive care unit [ICU] care within 4 h of hospital admission, patients who died within 4 h of admission, and patients who died in the ED); (2) moderate criticality (patients requiring hospitalization without the need for ICU care); and (3) low criticality (patients discharged home). Variables considered during model development included triage vital signs, aspects of triage nursing assessment, demographics, and historical information (diagnoses, medication use, and healthcare utilization). Historical factors were limited to the 6 months preceding the index ED visit. The model was tested on a previously withheld test data set (40%), and its performance analyzed. RESULTS The distribution of criticality among high, moderate, and low was 1.5%, 7.1%, and 91.4%, respectively. The one-versus-all area under the receiver operating characteristic (AUROC) curve for high and moderate criticality was 0.982 (95% CI 0.980, 0.983) and 0.968 (0.967, 0.969). The multi-class macro average AUROC and area under the precision-recall curve were 0.976 and 0.754. The features most integral to model performance included history of intravenous medications, capillary refill, emergency severity index level, history of hospitalization, use of a supplemental oxygen device, age, and history of admission to the ICU.
CONCLUSION Pediatric ED disposition can be accurately predicted using information available at triage, providing an opportunity to improve quality of care and patient outcomes.
Affiliation(s)
- Theodore W Heyming
- Children's Hospital of Orange County, Orange, CA, United States; Department of Emergency Medicine, University of California, Irvine, United States.
| | | | - William Feaster
- Children's Hospital of Orange County, Orange, CA, United States
| | | |
|
31
|
Yang Z, Xu W, Chen R. A deep learning-based multi-turn conversation modeling for diagnostic Q&A document recommendation. Inf Process Manag 2021. [DOI: 10.1016/j.ipm.2020.102485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
|
32
|
Leonard F, Gilligan J, Barrett MJ. Predicting Admissions From a Paediatric Emergency Department - Protocol for Developing and Validating a Low-Dimensional Machine Learning Prediction Model. Front Big Data 2021; 4:643558. [PMID: 33937750 PMCID: PMC8085432 DOI: 10.3389/fdata.2021.643558] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/18/2020] [Accepted: 03/22/2021] [Indexed: 12/02/2022] Open
Abstract
Introduction: Patients boarding in the Emergency Department can contribute to overcrowding, leading to longer waiting times and patients leaving without being seen or completing their treatment. The early identification of potential admissions could act as an additional decision support tool to alert clinicians that a patient needs to be reviewed for admission and would also benefit bed managers in advance bed planning for the patient. We aim to create a low-dimensional model predicting admissions early from the paediatric Emergency Department. Methods and Analysis: The Cross Industry Standard Process for Data Mining (CRISP-DM) methodology will be followed. The dataset will comprise 2 years of data, ~76,000 records. Potential predictors were identified from previous research, comprising demographics, registration details, triage assessment, hospital usage and past medical history. Fifteen models will be developed from 3 machine learning algorithms (logistic regression, naïve Bayes and gradient boosting machine) and 5 sampling methods, 4 of which are aimed at addressing class imbalance (undersampling, oversampling, and synthetic oversampling techniques). The variables of importance will then be identified from the optimal model (selected based on the highest area under the curve) and used to develop an additional low-dimensional model for deployment. Discussion: A low-dimensional model built from routinely collected data captured up to post-triage assessment would benefit many hospitals that lack data-rich platforms for developing models with a high number of predictors. Novel to the planned study is the use of data from the Republic of Ireland and the application of sampling techniques aimed at improving model performance impacted by an imbalance between admissions and discharges in the outcome variable.
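The protocol above pairs each learning algorithm with sampling methods that address class imbalance. As a rough illustration of the two simplest ones, random undersampling and random oversampling, here is a minimal sketch in plain Python; the function name `resample`, its parameters, and the toy data in the usage note are hypothetical, and the synthetic oversampling techniques mentioned in the protocol are not shown.

```python
import random


def resample(records, labels, method="undersample", seed=0):
    """Balance classes by random undersampling or oversampling.

    Illustrative sketch only: the protocol does not specify its exact
    procedure, and synthetic oversampling is not covered here.
    Returns a shuffled list of (record, label) pairs.
    """
    if method not in ("undersample", "oversample"):
        raise ValueError("method must be 'undersample' or 'oversample'")
    rng = random.Random(seed)

    # Group records by class label.
    by_class = {}
    for record, label in zip(records, labels):
        by_class.setdefault(label, []).append(record)

    sizes = [len(group) for group in by_class.values()]
    target = min(sizes) if method == "undersample" else max(sizes)

    balanced = []
    for label, group in by_class.items():
        if method == "undersample":
            # Keep a random subset the size of the smallest class.
            chosen = rng.sample(group, target)
        else:
            # Keep every record and draw extras with replacement.
            chosen = group + rng.choices(group, k=target - len(group))
        balanced.extend((record, label) for record in chosen)

    rng.shuffle(balanced)
    return balanced
```

With 2 admissions and 8 discharges, `resample(..., method="undersample")` returns 4 pairs and `method="oversample"` returns 16. Resampling would normally be applied to the training split only, so that the evaluation data keep the real admission rate.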
Affiliation(s)
- Fiona Leonard
- Business Intelligence Unit, Children's Health Ireland at Crumlin, Dublin, Ireland
| | - John Gilligan
- School of Computer Science, Technological University Dublin, Dublin, Ireland
| | - Michael J Barrett
- Department of Emergency Medicine, Children's Health Ireland at Crumlin, Dublin, Ireland.,School of Medicine, University College Dublin, Dublin, Ireland
| |
|
33
|
Na HJ, Lee KC, Kim ST. Integrating Text-Mining and Balanced Scorecard Techniques to Investigate the Association between CEO Message of Homepage Words and Financial Status: Emphasis on Hospitals. Healthcare (Basel) 2021; 9:healthcare9040408. [PMID: 33916303 PMCID: PMC8067190 DOI: 10.3390/healthcare9040408] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2021] [Revised: 03/19/2021] [Accepted: 03/22/2021] [Indexed: 11/16/2022] Open
Abstract
(1) Background: The Chief Executive Officer’s (CEO’s) message on a hospital’s homepage on the Internet contains various components, such as the hospital’s future vision, promises to customers, availability of upgraded services and public activities. This statement usually includes non-financial information as well as financial information about the corporate entity owning/operating the hospital. In addition, it provides useful information about not only the company’s goals and vision, but also firm performance targets and strategies for the future. This study aims to investigate associations between the CEO’s message and the financial status of the institution. We used the balanced scorecard framework to analyze what content on the hospital’s homepage is related to the hospital’s various financial ratios. (2) Methods: We adopted a text-mining method to extract significantly repeated keywords from the CEO’s message on the hospital’s website. Then, we classified these keywords using a balanced scorecard approach. To examine the relationship between keywords in the CEO’s message and the hospital’s financial ratios, a t-test was conducted on the difference in the mean term frequency-inverse document frequency (TF-IDF) of the homepage contents and its relationship with the views of the balanced scorecard framework. (3) Results: According to our empirical results on 65 samples collected from local hospitals, there are some significant relationships between the qualitative content of the hospital’s homepage and the quantitative financial ratios that indicate profitability, activity, leverage, liquidity, and accumulating reserves for proper business purposes. (4) Conclusions: The introduction section of a homepage is the part most accessible to customers, containing the aims and ideals of the hospital and reflecting the institution’s values and visions.
In addition, in the coverage of financial status, the organization can either emphasize financial strength or focus on other areas to divert attention from any weakness shown in the financial information. This study reminds us of the importance of the hospital website’s disclosure, and what can be inferred from the financial status of the hospital. It also highlights the need for reconciliation and harmony between the quantitative data, financial statements, and qualitative data in the CEO’s message. (5) Implications: To the best of our knowledge, this paper is the first research attempting to investigate the relationship between text on the hospital’s homepage and the hospital’s financial ratios using text-mining techniques and the balanced scorecard framework. Hospitals play a crucial role in a country’s welfare and healthcare industry. Nevertheless, in many countries, hospital organizations tend to remain a source of critical fiscal deficits due to ineffective and sloppy management. We expect that the result of this paper can provide hospital managers with useful information to address that situation.
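The keyword weighting used in this study is TF-IDF, which is conventionally the product of a term's frequency within one document and the logarithm of its inverse document frequency across the collection. The sketch below is a minimal illustration of that convention with whitespace tokenization; the function name `tf_idf` and the example messages are hypothetical, and the authors' exact weighting variant is not stated in the abstract.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    weight = (count of term in doc / tokens in doc) * ln(N / docs containing term)
    This is one common TF-IDF variant, assumed here for illustration.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: number of documents in which each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights
```

For the two toy messages `"patient care quality"` and `"patient revenue growth"`, the shared word `patient` receives weight 0 in both, while `care` receives (1/3) × ln 2 ≈ 0.231 in the first. Terms that appear in every message therefore carry no discriminating weight.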
Affiliation(s)
- Hyung Jong Na
- School of Global Business Administration, Semyung University, Jecheon 27136, Korea;
| | - Kun Chang Lee
- SKK Business School, Sungkyunkwan University, Seoul 03063, Korea
| | - Seong Tae Kim
- School of Management, Kyung Hee University, Seoul 02447, Korea;
| |
|
34
|
Predicting adult neuroscience intensive care unit admission from emergency department triage using a retrospective, tabular-free text machine learning approach. Sci Rep 2021; 11:1381. [PMID: 33446890 PMCID: PMC7809037 DOI: 10.1038/s41598-021-80985-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 09/29/2020] [Accepted: 12/28/2020] [Indexed: 12/23/2022] Open
Abstract
Early admission to the neurosciences intensive care unit (NSICU) is associated with improved patient outcomes. Natural language processing offers new possibilities for mining free text in electronic health record data. We sought to develop a machine learning model using both tabular and free text data to identify patients requiring NSICU admission shortly after arrival to the emergency department (ED). We conducted a single-center, retrospective cohort study of adult patients at the Mount Sinai Hospital, an academic medical center in New York City. All patients presenting to our institutional ED between January 2014 and December 2018 were included. Structured (tabular) demographic, clinical, bed movement record data, and free text data from triage notes were extracted from our institutional data warehouse. A machine learning model was trained to predict likelihood of NSICU admission at 30 min from arrival to the ED. We identified 412,858 patients presenting to the ED over the study period, of whom 1900 (0.5%) were admitted to the NSICU. The daily median number of ED presentations was 231 (IQR 200–256) and the median time from ED presentation to the decision for NSICU admission was 169 min (IQR 80–324). A model trained only with text data had an area under the receiver-operating curve (AUC) of 0.90 (95% confidence interval (CI) 0.87–0.91). A structured data-only model had an AUC of 0.92 (95% CI 0.91–0.94). A combined model trained on structured and text data had an AUC of 0.93 (95% CI 0.92–0.95). At a false positive rate of 1:100 (99% specificity), the combined model was 58% sensitive for identifying NSICU admission. A machine learning model using structured and free text data can predict NSICU admission soon after ED arrival. This may potentially improve ED and NSICU resource allocation. Further studies should validate our findings.
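The abstract reports sensitivity at a fixed false positive rate (99% specificity). One way to read such a figure is sketched below: choose the score threshold that keeps false positives within the allowed share of negatives, then measure how many true positives exceed it. The function name `sensitivity_at_specificity` and the toy scores are hypothetical and do not reproduce the study's model or data.

```python
import math


def sensitivity_at_specificity(scores, labels, target_specificity=0.99):
    """Sensitivity at the loosest threshold meeting a specificity target.

    scores: model outputs, higher means more likely positive.
    labels: 1 for positive (e.g. admitted), 0 for negative.
    A case is flagged when its score is strictly above the threshold.
    Returns (sensitivity, threshold). Illustrative sketch only.
    """
    negatives = sorted(s for s, y in zip(scores, labels) if y == 0)
    positives = [s for s, y in zip(scores, labels) if y == 1]
    if not negatives or not positives:
        raise ValueError("need at least one positive and one negative case")

    # Number of negatives that may be flagged while staying on target.
    # The small epsilon guards against floating-point round-off.
    allowed_fp = math.floor((1 - target_specificity) * len(negatives) + 1e-9)

    if allowed_fp >= len(negatives):
        threshold = float("-inf")  # every case may be flagged
    else:
        # Score of the highest-ranked negative that must stay unflagged.
        threshold = negatives[len(negatives) - allowed_fp - 1]

    flagged = sum(1 for s in positives if s > threshold)
    return flagged / len(positives), threshold
```

Because flagging is strict (`score > threshold`), tied negative scores can only raise the achieved specificity above the target, never lower it.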
|
35
|
Sensitivity and Specificity of Computer-Based Neurocognitive Tests in Sport-Related Concussion: Findings from the NCAA-DoD CARE Consortium. Sports Med 2020; 51:351-365. [PMID: 33315231 DOI: 10.1007/s40279-020-01393-7] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Accepted: 11/19/2020] [Indexed: 11/25/2022]
Abstract
BACKGROUND To optimally care for concussed individuals, a multi-dimensional approach is critical and a key component of this assessment in the athletic environment is computer-based neurocognitive testing. However, there continue to be concerns about the reliability and validity of these testing tools. The purpose of this study was to determine the sensitivity and specificity of three common computer-based neurocognitive tests (Immediate Post-Concussion Assessment and Cognitive Testing [ImPACT], CNS Vital Signs, and CogState Computerized Assessment Tool [CCAT]), to provide guidance on their clinical utility. METHODS This study analyzed assessments from a cohort of collegiate athletes and non-varsity cadets from the NCAA-DoD CARE Consortium. The data were collected from 2014-2018. Study participants were divided into two testing groups [concussed, n = 1414 (baseline/24-48 h) and healthy, n = 8305 (baseline/baseline)]. For each test type, change scores were calculated for the components of interest. Then, the Normative Change method, which used normative data published in a similar cohort, and the Reliable Change Index (RCI) method were used to determine if the change scores were significant. RESULTS Using the Normative Change method, ImPACT performed best with an 87.5%-confidence interval and 1 number of components failed (NCF; sensitivity = 0.583, specificity = 0.625, F1 = 0.308). CNS Vital Signs performed best with a 90%-confidence interval and 1 NCF (sensitivity = 0.587, specificity = 0.532, F1 = 0.314). CCAT performed best when using a 75%-confidence interval and 2 NCF (sensitivity = 0.513, specificity = 0.715, F1 = 0.290). When using the RCI method, ImPACT performed best with an 87.5%-confidence interval and 1 NCF (sensitivity = 0.626, specificity = 0.559, F1 = 0.297).
CONCLUSION When considering all three computer-based neurocognitive tests, the overall low sensitivity and specificity results provide additional evidence for the use of a multi-dimensional assessment for concussion diagnosis, including symptom evaluation, postural control assessment, neuropsychological status, and other functional assessments.
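The Reliable Change Index named in this abstract is commonly computed with the Jacobson and Truax formulation, in which the score change is divided by the standard error of the difference. The sketch below assumes that formulation, a two-sided confidence interval, and that lower scores mean worse performance; the abstract confirms none of these details, and the function names and example numbers are hypothetical.

```python
import math
from statistics import NormalDist


def reliable_change_index(baseline, follow_up, sd_baseline, test_retest_r):
    """RCI = (follow_up - baseline) / S_diff (Jacobson-Truax form, assumed).

    SEM    = SD * sqrt(1 - r)   standard error of measurement
    S_diff = sqrt(2) * SEM      standard error of the difference
    """
    sem = sd_baseline * math.sqrt(1 - test_retest_r)
    s_diff = math.sqrt(2) * sem
    return (follow_up - baseline) / s_diff


def is_reliable_decline(rci, confidence=0.90):
    """Flag a decline when the RCI falls below the lower critical z value.

    Assumes a two-sided interval and that lower scores are worse.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return rci < -z
```

With a baseline of 100, a follow-up of 90, a baseline SD of 10, and a test-retest reliability of 0.8, the RCI is about -1.58. That decline is flagged at an 87.5% interval but not at a 90% interval, which illustrates why the choice of interval width affects the sensitivity and specificity trade-off reported above.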
|
36
|
Peng Z, Xu G, Zhou H, Yao Y, Ren H, Zhu J, Liu H, Liu W. Early warning of nursing risk based on patient electronic medical record information. J Infect Public Health 2020; 13:1562-1566. [DOI: 10.1016/j.jiph.2019.07.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2019] [Revised: 07/22/2019] [Accepted: 07/23/2019] [Indexed: 11/26/2022] Open
|
37
|
Lucini FR, dos Reis MA, da Silveira GJC, Fogliatto FS, Anzanello MJ, Andrioli GG, Nicolaidis R, Beltrame RCF, Neyeloff JL, Schaan BD. Man vs. machine: Predicting hospital bed demand from an emergency department. PLoS One 2020; 15:e0237937. [PMID: 32853217 PMCID: PMC7451657 DOI: 10.1371/journal.pone.0237937] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/07/2020] [Accepted: 08/05/2020] [Indexed: 11/19/2022] Open
Abstract
Background The recent literature reports promising results from using intelligent systems to support decision making in healthcare operations. Using these systems may lead to improved diagnostic and treatment protocols and help predict hospital bed demand. Predicting hospital bed demand in emergency department (ED) attendances could help resource allocation and reduce pressure on busy hospitals. However, there is still limited knowledge on whether intelligent systems can operate as fully autonomous, user-independent systems. Objective Compare the performance of a computer-based algorithm and humans in predicting hospital bed demand (admissions and discharges) based on the initial SOAP (Subjective, Objective, Assessment, Plan) records of the ED. Methods This was a retrospective cohort study that compared the performance of humans and machines in predicting hospital bed demand from an ED. It considered electronic medical records (EMR) of 9030 patients (230 used as a testing set, and hence evaluated both by humans and by an algorithm, and 8800 used as a training set exclusively by the algorithm) who visited the ED of a tertiary care and teaching public hospital located in Porto Alegre, Brazil between January and December 2014. The machine role was played by a Support Vector Machine Classifier and the human prediction was performed by four ED physicians. Predictions were compared in terms of sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve (AUROC). Results All graders achieved similar accuracies. The accuracy by AUROC for the testing set was 0.82 [95% confidence interval (CI) of 0.77–0.87], 0.80 (95% CI: 0.75–0.85), and 0.76 (95% CI: 0.71–0.81) for novice physicians, the machine, and experienced physicians, respectively. Processing time per test EMR was 0.00812±0.0009 seconds. In contrast, novice physicians took on average 156.80 seconds per test EMR, while experienced physicians took on average 56.40 seconds per test EMR.
Conclusions Our data indicated that the system could predict patient admission or discharge states with 80% accuracy, which was similar to the performance of novice and experienced physicians. These results suggested that the algorithm could operate as an autonomous and independent system to complete this task.
Affiliation(s)
- Filipe Rissieri Lucini
- Department of Critical Care Medicine, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Data Intelligence for Health Lab, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
| | - Mateus Augusto dos Reis
- Hospital de Clínicas de Porto Alegre, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
- Department of Internal Medicine, Faculty of Medicine, Postgraduate Program in Medical Sciences: Endocrinology, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
- * E-mail:
| | | | - Flavio Sanson Fogliatto
- Industrial Engineering Department, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
| | - Michel José Anzanello
- Industrial Engineering Department, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
| | - Giordanna Guerra Andrioli
- Hospital de Clínicas de Porto Alegre, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
| | - Rafael Nicolaidis
- Hospital de Clínicas de Porto Alegre, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
| | | | - Jeruza Lavanholi Neyeloff
- Hospital de Clínicas de Porto Alegre, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
| | - Beatriz D'Agord Schaan
- Hospital de Clínicas de Porto Alegre, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
- Department of Internal Medicine, Faculty of Medicine, Postgraduate Program in Medical Sciences: Endocrinology, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
| |
Collapse
|
38
|
Rundo L, Pirrone R, Vitabile S, Sala E, Gambino O. Recent advances of HCI in decision-making tasks for optimized clinical workflows and precision medicine. J Biomed Inform 2020; 108:103479. [DOI: 10.1016/j.jbi.2020.103479] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 01/21/2020] [Revised: 04/27/2020] [Accepted: 06/06/2020] [Indexed: 12/28/2022]
|
39
|
Tollinton L, Metcalf AM, Velupillai S. Enhancing predictions of patient conveyance using emergency call handler free text notes for unconscious and fainting incidents reported to the London Ambulance Service. Int J Med Inform 2020; 141:104179. [PMID: 32663739 DOI: 10.1016/j.ijmedinf.2020.104179] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 02/10/2020] [Revised: 04/28/2020] [Accepted: 05/13/2020] [Indexed: 11/29/2022]
Abstract
OBJECTIVE Pre-hospital emergency medical services use clinical decision support systems (CDSS) to triage calls. Call handlers often supplement this by making free text notes covering key incident information. We investigate whether machine learning approaches using features from such free text notes can improve prediction of unconscious patients who require conveyance. MATERIALS AND METHODS We analysed a subset of all London Ambulance Service calls that were triaged through the Medical Priority Dispatch System (MPDS) as involving an unconscious or fainting patient in 2018. We use and compare two machine learning algorithms: random forest (RF) and gradient boosting machine (GBM). For each incident, we predict whether the patient will be conveyed to a hospital emergency department or equivalent using as features 1) the MPDS code, 2) the free text notes and 3) the two together. We evaluate model performance using the area under the curve (AUC) metric. Given the imbalance of outcomes (patient conveyed 71 %, not conveyed 29 %), we also consider sensitivity and specificity. RESULTS Using only the MPDS code resulted in an AUC of 0.57. Using the text notes gave an improved AUC score of 0.63 and combining the two gave an AUC score of 0.64 (scores were similar for RF and GBM). GBM models scored better on sensitivity (0.93 vs 0.62 for RF in the combined model), but specificity was lower (0.17 vs. 0.56 for RF in the combined model). CONCLUSIONS Using information contained in the free text notes made by call handlers in combination with MPDS improves prediction of unconscious and fainting patients requiring conveyance to a hospital emergency department (or equivalent) when compared with machine learning models using MPDS codes only. This suggests there is some useful information in unstructured data captured by emergency call handlers that complements MPDS codes. 
Quantifying this gain can help inform emergency medical service policy when evaluating the decision to expand or augment existing CDSS.
Affiliation(s)
- Liam Tollinton
  - Centre for Urban Science and Progress Studies, King's College London, UK
- Sumithra Velupillai
  - Centre for Urban Science and Progress Studies, King's College London, UK; Institute for Psychiatry, Psychology & Neuroscience, King's College London, UK
|
40
|
Mowbray F, Zargoush M, Jones A, de Wit K, Costa A. Predicting hospital admission for older emergency department patients: Insights from machine learning. Int J Med Inform 2020; 140:104163. [PMID: 32474393 DOI: 10.1016/j.ijmedinf.2020.104163] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 12/24/2019] [Revised: 04/26/2020] [Accepted: 04/28/2020] [Indexed: 11/17/2022]
Abstract
BACKGROUND Emergency departments (ED) are a portal of entry into the hospital and are uniquely positioned to influence the health care trajectories of older adults seeking medical attention. Older adults present to the ED with distinct needs and complex medical histories, which can make disposition planning more challenging. Machine learning (ML) approaches have been previously used to inform decision-making surrounding ED disposition in the general population. However, little is known about the performance and utility of ML methods in predicting hospital admission among older ED patients. We applied a series of ML algorithms to predict ED admission in older adults and discuss their clinical and policy implications. MATERIALS AND METHODS We analyzed the Canadian data from the interRAI multinational ED study, the largest prospective cohort study of older ED patients to date. The data included 2274 ED patients 75 years of age and older from eight ED sites across Canada between November 2009 and April 2012. Data were extracted from the interRAI ED Contact Assessment, with predictors including a series of geriatric syndromes, functional assessments, and baseline care needs. We applied a total of five ML algorithms. Models were trained, assessed, and analyzed using 10-fold cross-validation. The performance of predictive models was measured using the area under the receiver operating characteristic curve (AUC). We also report the accuracy, sensitivity, and specificity of each model to supplement performance interpretation. RESULTS Gradient boosted trees was the most accurate model to predict older ED patients who would require hospitalization (AUC = 0.80). The five most informative features include home intravenous therapy, time of ED presentation, a requirement for formal support services, independence in walking, and the presence of an unstable medical condition. 
CONCLUSION To the best of our knowledge, this is the first study to predict hospital admission in older ED patients using a series of geriatric syndromes and functional assessments. We were able to predict hospital admission in older ED patients with good accuracy using the items available in the interRAI ED Contact Assessment. This information can be used to inform decision-making about ED disposition and may expedite admission processes and proactive discharge planning.
Affiliation(s)
- Fabrice Mowbray
  - Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada; Big Data and Geriatric Models of Care (BDG) Cluster, McMaster University, Hamilton, Ontario, Canada
- Manaf Zargoush
  - Health Policy and Management, DeGroote School of Business, McMaster University, Hamilton, Ontario, Canada
- Aaron Jones
  - Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada; Big Data and Geriatric Models of Care (BDG) Cluster, McMaster University, Hamilton, Ontario, Canada
- Kerstin de Wit
  - Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada; Division of Emergency Medicine, Department of Medicine, McMaster University, Hamilton, Ontario, Canada
- Andrew Costa
  - Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada; Big Data and Geriatric Models of Care (BDG) Cluster, McMaster University, Hamilton, Ontario, Canada
|
41
|
Prediction of admission in pediatric emergency department with deep neural networks and triage textual data. Neural Netw 2020; 126:170-177. [DOI: 10.1016/j.neunet.2020.03.012] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 04/22/2019] [Revised: 01/11/2020] [Accepted: 03/12/2020] [Indexed: 11/16/2022]
|
42
|
Chen CH, Hsieh JG, Cheng SL, Lin YL, Lin PH, Jeng JH. Emergency department disposition prediction using a deep neural network with integrated clinical narratives and structured data. Int J Med Inform 2020; 139:104146. [PMID: 32387818 DOI: 10.1016/j.ijmedinf.2020.104146] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 01/12/2020] [Revised: 03/30/2020] [Accepted: 04/14/2020] [Indexed: 12/23/2022]
Abstract
BACKGROUND Emergency department (ED) overcrowding has been a serious issue and demands effective clinical decision-making of patient disposition. In previous studies, emergency clinical narratives provide a rich context for clinical decisions. We aimed to develop the disposition prediction model using deep learning modeling strategy with the heterogeneous data, including the physicians' narratives. METHODS We constructed a retrospective cohort of all 104,083 ED visits of non-trauma adults during 2017-18 from an academically affiliated ED in Taiwan. 18,308 visits were excluded based on the completeness of each record and the unpredictable dispositions, such as out-of-hospital cardiac arrest, against-advice discharge, and escapes. We integrated subjective section of the first physicians' clinical narratives and structured data (e.g., demographics, triage vital signs, etc.) as available predictors at the first physician-patient encounter. To predict final patient disposition (i.e., hospitalization or discharge), a deep neural network (DNN) model was developed with word embedding, a common natural language processing method. We compared the proposed model to a reference model using the Rapid Emergency Medicine Score, a logistic regression model with structured data, and a DNN model with paragraph vectors. F1 score was used to measure the predictive performance for each model. RESULTS The F1 score (with 95 % CI) for the proposed model, the reference model, the logistic regression model with structured data, and the DNN model with paragraph vectors were 0.674 (0.669-0.679), 0.474 (0.469-0.479), 0.547 (0.543-0.551), and 0.602 (0.596-0.607), respectively. While analyzing the relationship between context length and predictive performance under the proposed model, the F1 score at 95th percentile of the word counts was higher than that at 25th percentile of the word counts in chief complaint [0.634 (0.629-0.640) vs. 0.624 (0.620-0.628)] and in present illness [0.671 (0.667-0.674) vs. 0.654 (0.651-0.658)], but not in past medical history [0.674 (0.669-0.679) vs. 0.673 (0.666-0.679)]. CONCLUSIONS The proposed deep learning model with the usage of the first physicians' clinical narratives and structured data based on natural language processing outperformed the commonly used ones in terms of F1 score. It also evidenced the importance of the subjective section of clinical narratives, which serve as vital predictors for ED clinical decision-making.
Affiliation(s)
- Chien-Hua Chen
  - Department of Electrical Engineering, I-Shou University, Kaohsiung, Taiwan; Department of Emergency Medicine, Taichung Veterans General Hospital Chiayi Branch, Chia-Yi, Taiwan
- Jer-Guang Hsieh
  - Department of Electrical Engineering, I-Shou University, Kaohsiung, Taiwan
- Shu-Ling Cheng
  - Department of Multimedia and Game Developing Management, Far East University, Tainan, Taiwan
- Yih-Lon Lin
  - Department of Information Engineering, I-Shou University, Kaohsiung, Taiwan
- Po-Hsiang Lin
  - Department of Electrical Engineering, I-Shou University, Kaohsiung, Taiwan; Department of Emergency Medicine, Kaohsiung Veterans General Hospital, Kaohsiung, Taiwan
- Jyh-Horng Jeng
  - Department of Information Engineering, I-Shou University, Kaohsiung, Taiwan
|
43
|
Krsnik I, Glavaš G, Krsnik M, Miletić D, Štajduhar I. Automatic Annotation of Narrative Radiology Reports. Diagnostics (Basel) 2020; 10:E196. [PMID: 32244833 PMCID: PMC7235892 DOI: 10.3390/diagnostics10040196] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 02/26/2020] [Revised: 03/27/2020] [Accepted: 03/27/2020] [Indexed: 12/04/2022] Open
Abstract
Narrative texts in electronic health records can be efficiently utilized for building decision support systems in the clinic, only if they are correctly interpreted automatically in accordance with a specified standard. This paper tackles the problem of developing an automated method of labeling free-form radiology reports, as a precursor for building query-capable report databases in hospitals. The analyzed dataset consists of 1295 radiology reports concerning the condition of a knee, retrospectively gathered at the Clinical Hospital Centre Rijeka, Croatia. Reports were manually labeled with one or more labels from a set of 10 most commonly occurring clinical conditions. After primary preprocessing of the texts, two sets of text classification methods were compared: (1) traditional classification models, namely Naive Bayes (NB), Logistic Regression (LR), Support Vector Machine (SVM), and Random Forests (RF), coupled with Bag-of-Words (BoW) features (i.e., symbolic text representation) and (2) Convolutional Neural Network (CNN) coupled with dense word vectors (i.e., word embeddings as a semantic text representation) as input features. We resorted to nested 10-fold cross-validation to evaluate the performance of competing methods using accuracy, precision, recall, and F1 score. The CNN with semantic word representations as input yielded the overall best performance, having a micro-averaged F1 score of 86.7%. The CNN classifier yielded particularly encouraging results for the most represented conditions: degenerative disease (95.9%), arthrosis (93.3%), and injury (89.2%). As a data-hungry deep learning model, the CNN, however, performed notably worse than the competing models on underrepresented classes with fewer training instances such as multicausal disease or metabolic disease. LR, RF, and SVM performed comparably well, with the obtained micro-averaged F1 scores of 84.6%, 82.2%, and 82.1%, respectively.
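To make the micro-averaged F1 score mentioned above concrete, here is a minimal sketch for multi-label predictions. It pools true positives, false positives, and false negatives over all reports and labels before computing F1. The function name `micro_f1` and the toy label sets are hypothetical; the label names are borrowed from the abstract, but the data are made up and this is not the authors' evaluation code.

```python
from typing import List, Set


def micro_f1(true_labels: List[Set[str]], pred_labels: List[Set[str]]) -> float:
    """Micro-averaged F1: pool TP/FP/FN over every (report, label) pair."""
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, pred_labels):
        tp += len(truth & pred)   # labels correctly assigned
        fp += len(pred - truth)   # labels assigned but not in the ground truth
        fn += len(truth - pred)   # ground-truth labels that were missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Made-up multi-label annotations for three reports (illustration only).
truth = [{"arthrosis"}, {"injury", "degenerative disease"}, {"injury"}]
pred = [{"arthrosis"}, {"injury"}, {"degenerative disease"}]
score = micro_f1(truth, pred)
print(round(score, 4))
```

Because counts are pooled before averaging, frequent labels dominate the micro-averaged score, which is consistent with the abstract reporting a strong overall score alongside weaker results on underrepresented classes.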
Affiliation(s)
- Ivan Krsnik
  - Department of Computer Engineering, Faculty of Engineering, University of Rijeka, Vukovarska 58, 51000 Rijeka, Croatia
- Goran Glavaš
  - School of Business Informatics and Mathematics, University of Mannheim, 68159 Mannheim, Germany
- Marina Krsnik
  - Faculty of Veterinary Medicine, University of Zagreb, Heinzelova 55, 10000 Zagreb, Croatia
- Damir Miletić
  - Clinical Hospital Centre Rijeka, University of Rijeka, Krešimirova 42, 51000 Rijeka, Croatia
- Ivan Štajduhar
  - Department of Computer Engineering, Faculty of Engineering, University of Rijeka, Vukovarska 58, 51000 Rijeka, Croatia
  - Center for Artificial Intelligence and Cybersecurity, University of Rijeka, Radmile Matejčić 2, 51000 Rijeka, Croatia
|
44
|
Abstract
Text mining in big data analytics is emerging as a powerful tool for harnessing the power of unstructured textual data by analyzing it to extract new knowledge and to identify significant patterns and correlations hidden in the data. This study seeks to determine the state of text mining research by examining the developments within published literature over past years and provide valuable insights for practitioners and researchers on the predominant trends, methods, and applications of text mining research. In accordance with this, more than 200 academic journal articles on the subject are included and discussed in this review; the state-of-the-art text mining approaches and techniques used for analyzing transcripts and speeches, meeting transcripts, and academic journal articles, as well as websites, emails, blogs, and social media platforms, across a broad range of application areas are also investigated. Additionally, the benefits and challenges related to text mining are also briefly outlined.
|
45
|
da Silva DA, ten Caten CS, dos Santos RP, Fogliatto FS, Hsuan J. Predicting the occurrence of surgical site infections using text mining and machine learning. PLoS One 2019; 14:e0226272. [PMID: 31834905 PMCID: PMC6910696 DOI: 10.1371/journal.pone.0226272] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 04/29/2019] [Accepted: 11/22/2019] [Indexed: 12/11/2022] Open
Abstract
In this study we propose the use of text mining and machine learning methods to predict and detect Surgical Site Infections (SSIs) using textual descriptions of surgeries and post-operative patients’ records, mined from the database of a high complexity University hospital. SSIs are among the most common adverse events experienced by hospitalized patients; preventing such events is fundamental to ensure patients’ safety. Knowledge on SSI occurrence rates may also be useful in preventing future episodes. We analyzed 15,479 surgery descriptions and post-operative records testing different preprocessing strategies and the following machine learning algorithms: Linear SVC, Logistic Regression, Multinomial Naive Bayes, Nearest Centroid, Random Forest, Stochastic Gradient Descent, and Support Vector Classification (SVC). For prediction purposes, the best result was obtained using the Stochastic Gradient Descent method (79.7% ROC-AUC); for detection, Logistic Regression yielded the best performance (80.6% ROC-AUC).
Affiliation(s)
- Daniel A. da Silva
  - Industrial Engineering Department, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Carla S. ten Caten
  - Industrial Engineering Department, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Flavio S. Fogliatto
  - Industrial Engineering Department, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
|
46
|
Emergency Department Capacity Planning: A Recurrent Neural Network and Simulation Approach. Computational and Mathematical Methods in Medicine 2019; 2019:4359719. [PMID: 31827585 PMCID: PMC6881773 DOI: 10.1155/2019/4359719] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 04/16/2019] [Accepted: 10/28/2019] [Indexed: 11/18/2022]
Abstract
Emergency departments (EDs) play a vital role in the whole healthcare system as they are the first point of care in hospitals for urgent and critically ill patients. Therefore, effective management of a hospital's ED is crucial in improving the quality of the healthcare service. The effectiveness depends on how efficiently the hospital resources are used, particularly under budget constraints. Simulation modeling is one of the best methods to optimize resources and needs inputs such as patients' arrival times, patients' length of stay (LOS), and the route of patients in the ED. This study develops a simulation model to determine the optimum number of beds in an ED by minimizing the patients' LOS. The hospital data are analyzed, and patients' LOS and the route of patients in the ED are determined. To determine patients' arrival times, the features associated with patients' arrivals at the ED are identified. Mean arrival rate is used as a feature in addition to climatic and temporal variables. The exhaustive feature-selection method has been used to determine the best subset of the features, and the mean arrival rate is determined as one of the most significant features. This study is executed using the one-year ED arrival data together with five-year (43,824 study hours) ED arrival data to improve the accuracy of predictions. Furthermore, ten different machine learning (ML) algorithms are used utilizing the same best subset of these features. After a tenfold cross-validation experiment, based on mean absolute percentage error (MAPE), the stateful long short-term memory (LSTM) model performed better than other models with an accuracy of 47%, followed by the decision tree and random forest methods. Using the simulation method, the LOS has been minimized by 7% and the number of beds at the ED has been optimized.
|
47
|
Kim L, Ju J. Can media forecast technological progress?: A text-mining approach to the on-line newspaper and blog's representation of prospective industrial technologies. Inf Process Manag 2019. [DOI: 10.1016/j.ipm.2018.10.017] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 11/25/2022]
|
48
|
Prediction of emergency department patient disposition based on natural language processing of triage notes. Int J Med Inform 2019; 129:184-188. [PMID: 31445253 DOI: 10.1016/j.ijmedinf.2019.06.008] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 04/03/2019] [Revised: 05/21/2019] [Accepted: 06/10/2019] [Indexed: 12/23/2022]
Abstract
BACKGROUND Nursing triage documentation is the first free-form text data created at the start of an emergency department (ED) visit. These 1-3 unstructured sentences reflect the clinical impression of an experienced nurse and are key in gauging a patient's illness. We aimed to predict final ED disposition using three commonly employed natural language processing (NLP) techniques applied to nursing triage notes in isolation from other data. METHODS We constructed a retrospective cohort of all 260,842 consecutive ED encounters in 2015-16, from three clinically heterogeneous academically-affiliated EDs. After exclusion of 3964 encounters based on completeness of triage and disposition data, we included 256,878 encounters. We defined the outcome as: 1) admission, transfer, or in-ED death [68,092 encounters] vs. 2) discharge, "left without being seen," and "left against medical advice" [188,786 encounters]. The dataset was divided into training and testing subsets. Neural network regression models were trained using bag-of-words, paragraph vectors, and topic distributions to predict disposition and were evaluated using the testing dataset. RESULTS Area under the curve for disposition using triage notes as bag-of-words, paragraph vectors, and topic distributions were 0.737 (95% CI: 0.734 - 0.740), 0.785 (95% CI: 0.782 - 0.788), and 0.687 (95% CI: 0.684 - 0.690), respectively. CONCLUSIONS Nursing triage notes can be used to predict final ED patient disposition, even when used separately from other clinical information. These findings have substantial implications for future studies, suggesting that free text from medical records may be considered as a critical predictor in research of patient outcomes.
|
49
|
Chang JR, Chen MY, Chen LS, Chien WT. Recognizing important factors of influencing trust in O2O models: an example of OpenTable. Soft comput 2019. [DOI: 10.1007/s00500-019-04019-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 10/26/2022]
|
50
|
A Robust Framework for Self-Care Problem Identification for Children with Disability. Symmetry (Basel) 2019. [DOI: 10.3390/sym11010089] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 12/23/2022] Open
Abstract
Recently, a standard dataset namely SCADI (Self-Care Activities Dataset) based on the International Classification of Functioning, Disability, and Health for Children and Youth framework for self-care problems identification of children with physical and motor disabilities was introduced. This is a very interesting, important and challenging topic due to its usefulness in medical diagnosis. This study proposes a robust framework using a sampling technique and extreme gradient boosting (FSX) to improve the prediction performance for the SCADI dataset. The proposed framework first converts the original dataset to a new dataset with a smaller number of dimensions. Then, our proposed framework balances the new dataset in the previous step using oversampling techniques with different ratios. Next, extreme gradient boosting was used to diagnose the problems. The experiments in terms of prediction performance and feature importance were conducted to show the effectiveness of FSX as well as to analyse the results. The experimental results show that FSX that uses the Synthetic Minority Over-sampling Technique (SMOTE) for the oversampling module outperforms the ANN (Artificial Neural Network) -based approach, Support vector machine (SVM) and Random Forest for the SCADI dataset. The overall accuracy of the proposed framework reaches 85.4%, a pretty high performance, which can be used for self-care problem classification in medical diagnosis.
|