1
Sezgin E, Hussain SA, Rust S, Huang Y. Extracting Medical Information From Free-Text and Unstructured Patient-Generated Health Data Using Natural Language Processing Methods: Feasibility Study With Real-world Data. JMIR Form Res 2023;7:e43014. [PMID: 36881467] [PMCID: PMC10031450] [DOI: 10.2196/43014]
Abstract
BACKGROUND Patient-generated health data (PGHD) captured via smart devices or digital health technologies can reflect an individual's health journey. PGHD enable tracking and monitoring of personal health conditions, symptoms, and medications outside the clinic, which is crucial for self-care and shared clinical decisions. In addition to self-reported measures and structured PGHD (eg, self-screening, sensor-based biometric data), free-text and unstructured PGHD (eg, patient care notes, medical diaries) can provide a broader view of a patient's journey and health condition. Natural language processing (NLP) is used to process and analyze unstructured data to create meaningful summaries and insights, and shows promise for improving the utilization of PGHD. OBJECTIVE Our aim is to understand and demonstrate the feasibility of an NLP pipeline for extracting medication and symptom information from real-world patient and caregiver data. METHODS We report a secondary data analysis of a data set collected from 24 parents of children with special health care needs (CSHCN) who were recruited via a nonrandom sampling approach. Participants used a voice-interactive app for 2 weeks, generating free-text patient notes (audio transcriptions or text entries). We built an NLP pipeline using a zero-shot approach (adaptive to low-resource settings). We used named entity recognition (NER) and medical ontologies (RxNorm and SNOMED CT [Systematized Nomenclature of Medicine Clinical Terms]) to identify medications and symptoms. Sentence-level dependency parse trees and part-of-speech tags were used to extract additional entity information from the syntactic properties of a note. We assessed the data; evaluated the pipeline with the patient notes; and report the precision, recall, and F1 scores. RESULTS In total, 87 patient notes were included (audio transcriptions n=78 and text entries n=9) from 24 parents who have at least one CSHCN. The participants were between the ages of 26 and 59 years. The majority were White (n=22, 92%), had more than one child (n=16, 67%), lived in Ohio (n=22, 92%), had a mid- or upper-mid household income (n=15, 62.5%), and had a higher level of education (n=14, 58%). Of the 87 notes, 30 were drug and medication related, and 46 were symptom related. We captured medication instances (medication, unit, quantity, and date) and symptoms satisfactorily (precision >0.65, recall >0.77, F1 >0.72). These results indicate the potential of using NER and dependency parsing in an NLP pipeline for information extraction from unstructured PGHD. CONCLUSIONS The proposed NLP pipeline was found to be feasible for use with real-world unstructured PGHD to accomplish medication and symptom extraction. Unstructured PGHD can be leveraged to inform clinical decision-making, remote monitoring, and self-care, including medication adherence and chronic disease management. With customizable information extraction methods using NER and medical ontologies, NLP models can feasibly extract a broad range of clinical information from unstructured PGHD in low-resource settings (eg, a limited number of patient notes or training data).
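To make the dependency-parsing step concrete, the sketch below walks a spaCy parse tree to attach a quantity and unit to a detected medication mention. It is a minimal sketch, not the authors' pipeline: the toy MEDICATIONS set is a hypothetical stand-in for the RxNorm/SNOMED CT ontology lookup, and en_core_web_sm is an assumed general-purpose model rather than a clinical NER model.

```python
# Minimal sketch of dependency-based medication extraction, assuming spaCy.
# MEDICATIONS is a hypothetical stand-in for an RxNorm/SNOMED CT lookup.
import spacy

nlp = spacy.load("en_core_web_sm")
MEDICATIONS = {"ibuprofen", "amoxicillin", "melatonin"}

def extract_medication_mentions(note: str):
    """Find medication tokens, then use the dependency tree to recover
    the quantity (a nummod child) and its unit (the nummod's head)."""
    doc = nlp(note)
    mentions = []
    for token in doc:
        if token.lemma_.lower() not in MEDICATIONS:
            continue
        quantity = unit = None
        # In "200 mg of ibuprofen", the unit "mg" is an ancestor of the
        # drug token and carries the number "200" as a nummod child.
        for ancestor in token.ancestors:
            numbers = [c for c in ancestor.children if c.dep_ == "nummod"]
            if numbers:
                quantity, unit = numbers[0].text, ancestor.text
                break
        mentions.append({"medication": token.text, "quantity": quantity, "unit": unit})
    return mentions

print(extract_medication_mentions("I gave him 200 mg of ibuprofen after dinner."))
# e.g. [{'medication': 'ibuprofen', 'quantity': '200', 'unit': 'mg'}]
```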
Affiliation(s)
- Emre Sezgin
- The Abigail Wexner Research Institute at Nationwide Children's Hospital, Columbus, OH, United States
- The Ohio State University College of Medicine, Columbus, OH, United States
- Syed-Amad Hussain
- The Abigail Wexner Research Institute at Nationwide Children's Hospital, Columbus, OH, United States
- Steve Rust
- The Abigail Wexner Research Institute at Nationwide Children's Hospital, Columbus, OH, United States
- Yungui Huang
- The Abigail Wexner Research Institute at Nationwide Children's Hospital, Columbus, OH, United States
2
Vohra A, Garg R. Deep learning based sentiment analysis of public perception of working from home through tweets. J Intell Inf Syst 2023;60:255-274. [PMID: 36034686] [PMCID: PMC9399597] [DOI: 10.1007/s10844-022-00736-2]
Abstract
Nowadays, we are witnessing a paradigm shift from the conventional approach of working from office spaces to the emerging culture of working virtually from home. During the COVID-19 pandemic, many organisations were forced to allow employees to work from their homes, which led to worldwide discussions of this trend on Twitter. The analysis of these data has immense potential to change the way we work, but extracting useful information from such a volume of text is a challenge. Hence, in this study, the microblogging website Twitter is used to gather more than 450,000 English-language tweets from 22nd January 2022 to 12th March 2022, consisting of keywords related to working from home. A state-of-the-art pre-processing technique is used to convert all emojis into text; remove duplicate tweets, retweets, username tags, URLs, hashtags, etc.; and convert the text to lowercase. This reduces the number of tweets to 358,823. In this paper, we propose a fine-tuned Convolutional Neural Network (CNN) model to analyse the Twitter data. The input to our deep learning model is an annotated set of tweets labelled into three sentiment classes, viz. positive, negative, and neutral, using VADER (Valence Aware Dictionary for sEntiment Reasoning). We also vary the input vector to the embedding layer by using FastText embeddings with our model to train supervised word representations for our text corpus of more than 450,000 tweets. The proposed model uses multiple convolution and max-pooling layers, dropout, and dense layers with ReLU and sigmoid activations to achieve strong results on our dataset. Further, the performance of our model is compared with standard classifiers such as Support Vector Machine (SVM), Naive Bayes, Decision Tree, and Random Forest. The results show that, on the given dataset, the proposed CNN with FastText word embeddings outperforms the other classifiers with an accuracy of 0.925969. As a result of this classification, 54.41% of the tweets are found to show affirmation, 24.50% a negative disposition, and 21.09% neutral sentiment towards working from home.
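The VADER labelling step the study relies on is a small amount of code. Below is a minimal sketch assuming the vaderSentiment package; the ±0.05 compound-score cut-offs are VADER's conventional defaults, not thresholds reported in the paper.

```python
# Minimal sketch of VADER-based sentiment labelling, assuming the
# vaderSentiment package. The +/-0.05 compound-score thresholds are the
# conventional VADER defaults, not values taken from the paper.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

def label_tweet(text: str) -> str:
    compound = analyzer.polarity_scores(text)["compound"]
    if compound >= 0.05:
        return "positive"
    if compound <= -0.05:
        return "negative"
    return "neutral"

print(label_tweet("Working from home saves me two hours of commuting every day!"))
```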
Affiliation(s)
- Aarushi Vohra
- Department of Computer Engineering, National Institute of Technology Kurukshetra, Kurukshetra 136119, Haryana, India
- Ritu Garg
- Department of Computer Engineering, National Institute of Technology Kurukshetra, Kurukshetra 136119, Haryana, India
3
Lu H, Rui X, Gemechu GF, Li R. Quantitative Evaluation of Psychological Tolerance under the Haze: A Case Study of Typical Provinces and Cities in China with Severe Haze. Int J Environ Res Public Health 2022;19:6574. [PMID: 35682158] [PMCID: PMC9180424] [DOI: 10.3390/ijerph19116574]
Abstract
Haze results from the interplay of specific weather conditions and human activity. When haze arrives, individuals use microblogs to communicate their concerns and feelings. If the emotions of netizens can be gauged, it will be easier for municipal administrators to adjust public communication and resource allocation under haze conditions. Psychological tolerance is the ability to cope with and adjust to psychological stress and unpleasant emotions brought on by adversity, and it can guide human conduct to some extent. Although haze has a significant impact on human health, the environment, transportation, and other factors, its impact on human mental health is concealed, indirect, and frequently underestimated. In this study, psychological tolerance was developed as a psychological impact evaluation index to quantify the impact of haze on human mental health. First, microblog data from China's severely haze-affected districts were collected from 2013 to 2019. Emotion scores were then calculated using SnowNLP, and a subject index was calculated using the co-word network approach; both were used as social media evaluation indicators. Finally, using ecological and socioeconomic factors, psychological tolerance was assessed at the provincial and prefecture levels. The findings suggest that psychological tolerance differs greatly between areas and also follows a spatio-temporal trajectory in the time series. The findings offer a fresh viewpoint on the mental health effects of haze.
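The SnowNLP scoring step is a one-call API. Below is a minimal sketch assuming the snownlp package is installed; the example posts are invented for illustration, not data from the study.

```python
# Minimal sketch of SnowNLP emotion scoring, assuming the snownlp package.
# SnowNLP(text).sentiments returns the probability (0-1) that a Chinese text
# is positive; the example posts below are hypothetical, not study data.
from snownlp import SnowNLP

posts = [
    "今天雾霾太严重了，出门都喘不过气。",  # "The haze is terrible today; I can barely breathe outside."
    "空气终于好转，心情也跟着变好了。",    # "The air has finally improved, and my mood has lifted."
]

for post in posts:
    score = SnowNLP(post).sentiments  # closer to 1.0 means more positive
    print(f"{score:.3f}  {post}")
```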
Affiliation(s)
- Haiyue Lu
- College of Hydrology and Water Resources, Hohai University, Nanjing 210003, China; (H.L.); (G.F.G.)
- Xiaoping Rui
- College of Earth and Engineering, Hohai University, Nanjing 211100, China
- Correspondence: (X.R.); (R.L.)
- Gadisa Fayera Gemechu
- College of Hydrology and Water Resources, Hohai University, Nanjing 210003, China; (H.L.); (G.F.G.)
- Faculty of Natural Sciences, Salale University, Fiche 245, Ethiopia
- Runkui Li
- College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100049, China
- Correspondence: (X.R.); (R.L.)
4
Retta EA, Almekhlafi E, Sutcliffe R, Mhamed M, Ali H, Feng J. A New Amharic Speech Emotion Dataset and Classification Benchmark. ACM Trans Asian Low-Resour Lang Inf Process 2022. [DOI: 10.1145/3529759]
Abstract
In this paper we present the Amharic Speech Emotion Dataset (ASED), which covers four dialects (Gojjam, Wollo, Shewa, and Gonder) and five emotions (neutral, fearful, happy, sad, and angry). We believe it is the first Speech Emotion Recognition (SER) dataset for the Amharic language. Sixty-five volunteer participants, all native speakers of Amharic, recorded 2,474 sound samples, two to four seconds in length. Eight judges (two per dialect) assigned emotions to the samples with a high level of agreement (Fleiss' kappa = 0.8). The resulting dataset is freely available for download. Next, we developed a four-layer variant of the well-known VGG model, which we call VGGb. Three experiments were then carried out using VGGb for SER on ASED. First, we investigated which features work best for Amharic: FilterBank, Mel Spectrogram, or Mel-frequency Cepstral Coefficients (MFCC). This was done by training three VGGb SER models on ASED, using FilterBank, Mel Spectrogram, and MFCC features, respectively. Four forms of training were tried: standard cross-validation and three variants based on sentences, dialects, and speaker groups, so that a sentence used for training would not be used for testing, and likewise for a dialect or speaker group. MFCC features were superior under all four training schemes and were therefore adopted for Experiment 2, in which VGGb was compared with three well-known existing models on ASED: ResNet50, AlexNet, and LSTM. VGGb was found to have very good accuracy (90.73%) as well as the fastest training time. In Experiment 3, the performance of VGGb was compared when trained on two existing SER datasets, RAVDESS (English) and EMO-DB (German), as well as on ASED (Amharic). Results are comparable across these languages, with ASED being the highest, suggesting that VGGb can be successfully applied to other languages. We hope that ASED will encourage researchers to explore the Amharic language and to experiment with other models for Amharic SER.
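The MFCC features that won Experiment 1 are straightforward to compute. Below is a minimal sketch assuming librosa; n_mfcc=40 and 16 kHz resampling are illustrative choices, not the paper's reported configuration.

```python
# Minimal sketch of MFCC feature extraction, assuming librosa. The choices
# n_mfcc=40 and 16 kHz resampling are illustrative, not the paper's settings.
import librosa
import numpy as np

def mfcc_features(path: str, n_mfcc: int = 40) -> np.ndarray:
    y, sr = librosa.load(path, sr=16000)                     # load and resample
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)   # (n_mfcc, frames)
    return mfcc.mean(axis=1)                                 # clip-level summary vector

# features = mfcc_features("ased_clip.wav")  # hypothetical ASED sample file
```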
Affiliation(s)
- Ephrem Afele Retta
- School of Information Science and Technology, Northwest University, China
- Eiad Almekhlafi
- School of Information Science and Technology, Northwest University, China
- Richard Sutcliffe
- School of Information Science and Technology, Northwest University, China and School of Computer Science and Electronic Engineering, University of Essex, UK
- Mustafa Mhamed
- School of Information Science and Technology, Northwest University, China
- Haider Ali
- School of Information Science and Technology, Northwest University, China
- Jun Feng
- School of Information Science and Technology, Northwest University, China
5
A Proposal for Multimodal Emotion Recognition Using Aural Transformers and Action Units on RAVDESS Dataset. Appl Sci (Basel) 2021. [DOI: 10.3390/app12010327]
Abstract
Emotion recognition is attracting the attention of the research community due to its multiple applications in different fields, such as medicine or autonomous driving. In this paper, we propose an automatic emotion recognition system that consists of a speech emotion recognizer (SER) and a facial emotion recognizer (FER). For the SER, we evaluated a pre-trained xlsr-Wav2Vec2.0 transformer using two transfer-learning techniques: embedding extraction and fine-tuning. The best accuracy was achieved when we fine-tuned the whole model with a multilayer perceptron appended on top of it, confirming that training was more robust when it did not start from scratch and the network's prior knowledge was similar to the target task. For the facial emotion recognizer, we extracted the Action Units of the videos and compared the performance of static models against sequential models. Results showed that sequential models beat static models by a narrow margin. Error analysis indicated that the visual systems could be improved with a detector of high-emotional-load frames, which opens a new line of research into ways to learn from videos. Finally, combining the two modalities with a late fusion strategy, we achieved 86.70% accuracy on the RAVDESS dataset in a subject-wise 5-CV evaluation, classifying eight emotions. The results demonstrate that these modalities carry relevant information for detecting users' emotional state and that their combination improves the final system performance.
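The embedding-extraction variant of the transfer learning described here can be sketched with Hugging Face Transformers. A minimal sketch: the public facebook/wav2vec2-large-xlsr-53 checkpoint is assumed, and mean-pooling hidden states into one utterance vector is a common choice, not necessarily the authors' exact setup.

```python
# Minimal sketch of xlsr-Wav2Vec2.0 embedding extraction, assuming Hugging
# Face Transformers and the public facebook/wav2vec2-large-xlsr-53 checkpoint.
# Mean-pooling the hidden states is one common pooling choice, not
# necessarily the configuration used in the paper.
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

name = "facebook/wav2vec2-large-xlsr-53"
extractor = Wav2Vec2FeatureExtractor.from_pretrained(name)
model = Wav2Vec2Model.from_pretrained(name).eval()

def utterance_embedding(waveform, sampling_rate=16000):
    """Return one fixed-size vector per utterance, suitable as input to a
    downstream classifier such as a multilayer perceptron."""
    inputs = extractor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, frames, 1024)
    return hidden.mean(dim=1).squeeze(0)            # (1024,)
```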
6
Luna-Jiménez C, Griol D, Callejas Z, Kleinlein R, Montero JM, Fernández-Martínez F. Multimodal Emotion Recognition on RAVDESS Dataset Using Transfer Learning. Sensors (Basel) 2021;21:7665. [PMID: 34833739] [PMCID: PMC8618559] [DOI: 10.3390/s21227665]
Abstract
Emotion recognition is attracting the attention of the research community due to the multiple areas where it can be applied, such as healthcare or road safety systems. In this paper, we propose a multimodal emotion recognition system that relies on speech and facial information. For the speech-based modality, we evaluated several transfer-learning techniques, more specifically, embedding extraction and fine-tuning. The best accuracy results were achieved when we fine-tuned the CNN-14 of the PANNs framework, confirming that training was more robust when it did not start from scratch and the tasks were similar. For the facial emotion recognizer, we propose a framework that consists of a pre-trained Spatial Transformer Network on saliency maps and facial images, followed by a bi-LSTM with an attention mechanism. Error analysis showed that frame-based systems can present problems when used directly on a video-based task, even after domain adaptation, which opens a new line of research into ways to correct this mismatch and take advantage of the embedded knowledge of these pre-trained models. Finally, combining these two modalities with a late fusion strategy, we achieved 80.08% accuracy on the RAVDESS dataset in a subject-wise 5-CV evaluation, classifying eight emotions. The results reveal that these modalities carry relevant information for detecting users' emotional state and that their combination improves system performance.
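The late-fusion step at the end is the simplest piece to make concrete: each modality produces per-class posteriors, and a weighted sum decides the label. A minimal sketch follows; the equal 0.5/0.5 weighting and the example posteriors are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of late fusion over the eight RAVDESS emotion classes.
# The equal 0.5/0.5 modality weighting is an illustrative assumption, not
# the weighting reported by the authors.
import numpy as np

EMOTIONS = ["neutral", "calm", "happy", "sad",
            "angry", "fearful", "disgust", "surprised"]

def fuse(p_speech: np.ndarray, p_face: np.ndarray, w_speech: float = 0.5) -> str:
    """Combine per-class posteriors from the two recognizers and
    return the label with the highest fused score."""
    fused = w_speech * p_speech + (1.0 - w_speech) * p_face
    return EMOTIONS[int(np.argmax(fused))]

# Hypothetical posteriors from the speech and facial models:
p_s = np.array([0.05, 0.05, 0.60, 0.05, 0.10, 0.05, 0.05, 0.05])
p_f = np.array([0.10, 0.05, 0.40, 0.10, 0.15, 0.10, 0.05, 0.05])
print(fuse(p_s, p_f))  # "happy"
```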
Grants
- TIN2017-85854-C4-4-R Ministerio de Economía, Industria y Competitividad, Gobierno de España
- PID2020-118112RB-C22 Ministerio de Economía, Industria y Competitividad, Gobierno de España
- PRE2018-083225 Ministerio de Educación, Cultura y Deporte
- Horizon2020 - grant agreement Nº 823907 European Commission
- FEDER, UE Agencia Estatal de Investigación
- PID2020-118112RB-C21 Ministerio de Economía, Industria y Competitividad, Gobierno de España
- TEC2017-84593-C2-1-R Ministerio de Economía, Industria y Competitividad, Gobierno de España
Affiliation(s)
- Cristina Luna-Jiménez
- Grupo de Tecnología del Habla y Aprendizaje Automático (THAU Group), Information Processing and Telecommunications Center, E.T.S.I. de Telecomunicación, Universidad Politécnica de Madrid, Avda. Complutense 30, 28040 Madrid, Spain; (R.K.); (J.M.M.); (F.F.-M.)
- David Griol
- Department of Software Engineering, CITIC-UGR, University of Granada, Periodista Daniel Saucedo Aranda S/N, 18071 Granada, Spain; (D.G.); (Z.C.)
- Zoraida Callejas
- Department of Software Engineering, CITIC-UGR, University of Granada, Periodista Daniel Saucedo Aranda S/N, 18071 Granada, Spain; (D.G.); (Z.C.)
- Ricardo Kleinlein
- Grupo de Tecnología del Habla y Aprendizaje Automático (THAU Group), Information Processing and Telecommunications Center, E.T.S.I. de Telecomunicación, Universidad Politécnica de Madrid, Avda. Complutense 30, 28040 Madrid, Spain; (R.K.); (J.M.M.); (F.F.-M.)
- Juan M. Montero
- Grupo de Tecnología del Habla y Aprendizaje Automático (THAU Group), Information Processing and Telecommunications Center, E.T.S.I. de Telecomunicación, Universidad Politécnica de Madrid, Avda. Complutense 30, 28040 Madrid, Spain; (R.K.); (J.M.M.); (F.F.-M.)
- Fernando Fernández-Martínez
- Grupo de Tecnología del Habla y Aprendizaje Automático (THAU Group), Information Processing and Telecommunications Center, E.T.S.I. de Telecomunicación, Universidad Politécnica de Madrid, Avda. Complutense 30, 28040 Madrid, Spain; (R.K.); (J.M.M.); (F.F.-M.)