1
Akinpelu S, Viriri S, Adegun A. An enhanced speech emotion recognition using vision transformer. Sci Rep 2024; 14:13126. [PMID: 38849422] [PMCID: PMC11161461] [DOI: 10.1038/s41598-024-63776-4]
Abstract
In human-computer interaction systems, speech emotion recognition (SER) plays a crucial role because it enables computers to understand and react to users' emotions. In the past, SER placed significant emphasis on acoustic properties extracted from speech signals. Recent developments in deep learning and computer vision, however, have made it possible to use visual representations to enhance SER performance. This work proposes a novel method for improving speech emotion recognition based on a lightweight Vision Transformer (ViT) model. We leverage the ViT model's capability to capture spatial dependencies and high-level features, which are adequate indicators of emotional state, from mel spectrograms fed into the model as images. To determine the efficiency of the proposed approach, we conduct comprehensive experiments on two benchmark speech emotion datasets, the Toronto English Speech Set (TESS) and the Berlin Emotional Database (EMODB). The results demonstrate a considerable improvement in speech emotion recognition accuracy and attest to the method's generalizability: it achieves 98% on TESS, 91% on EMODB, and 93% on the combined TESS-EMODB set. The comparative experiments show that the non-overlapping patch-based feature extraction method substantially improves speech emotion recognition performance. Our research indicates the potential of integrating vision transformer models into SER systems, with advantages over other state-of-the-art techniques, opening up fresh opportunities for real-world applications that require accurate emotion recognition from speech.
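As a rough sketch of the pipeline this abstract describes (a mel spectrogram rendered as an image and encoded through non-overlapping ViT patches), the following Python example uses librosa and PyTorch. The patch size, embedding width, depth, seven-class head, and the file name `utterance.wav` are illustrative assumptions, not the authors' configuration.

```python
import librosa
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

def mel_spectrogram_image(path, n_mels=128, size=224):
    """Render an utterance as a normalized log-mel 'image' for the ViT."""
    y, sr = librosa.load(path, sr=16000)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    img = torch.tensor(librosa.power_to_db(mel, ref=np.max))[None, None].float()
    img = F.interpolate(img, size=(size, size), mode="bilinear")  # (1, 1, 224, 224)
    return (img - img.min()) / (img.max() - img.min() + 1e-8)    # scale to [0, 1]

class TinyViT(nn.Module):
    def __init__(self, n_classes=7, patch=16, dim=192, depth=4, heads=4):
        super().__init__()
        # A strided convolution implements non-overlapping patch embedding.
        self.patch_embed = nn.Conv2d(1, dim, kernel_size=patch, stride=patch)
        self.pos = nn.Parameter(torch.zeros(1, (224 // patch) ** 2, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, x):                                   # x: (B, 1, 224, 224)
        t = self.patch_embed(x).flatten(2).transpose(1, 2)  # (B, n_patches, dim)
        t = self.encoder(t + self.pos)                      # self-attention over patches
        return self.head(t.mean(dim=1))                     # mean-pool, then classify

logits = TinyViT()(mel_spectrogram_image("utterance.wav"))
```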
Affiliation(s)
- Samson Akinpelu
- School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Durban, 4001, South Africa
- Serestina Viriri
- School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Durban, 4001, South Africa
- Adekanmi Adegun
- School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Durban, 4001, South Africa
2
Ke X, Mak MW, Meng HM. Automatic selection of spoken language biomarkers for dementia detection. Neural Netw 2024; 169:191-204. [PMID: 37898051] [DOI: 10.1016/j.neunet.2023.10.018]
Abstract
This paper analyzes diverse features extracted from spoken language to select the most discriminative ones for dementia detection. We present a two-step feature selection (FS) approach: Step 1 utilizes filter methods to pre-screen features, and Step 2 uses a novel feature ranking (FR) method, referred to as dual dropout ranking (DDR), to rank the screened features and select spoken language biomarkers. The proposed DDR is based on a dual-net architecture that separates FS and dementia detection into two neural networks (namely, the operator and the selector). The operator is trained on features obtained from the selector to reduce classification or regression loss. The selector is optimized to predict the operator's performance based on automatic regularization. Results show that the approach significantly reduces feature dimensionality while identifying small feature subsets that achieve comparable or superior performance relative to the full, default feature set. The Python code is available at https://github.com/kexquan/dual-dropout-ranking.
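The authors' actual DDR code is in the repository above; as a much-simplified, hedged sketch of the underlying idea (a "selector" holds learnable per-feature dropout keep-probabilities while an "operator" network trains on the masked features, and features are ranked by their learned keep-probability), one might write:

```python
import torch
import torch.nn as nn

class DropoutGate(nn.Module):
    """Learnable per-feature keep-probabilities (a stand-in for the selector net)."""
    def __init__(self, n_features):
        super().__init__()
        self.logit_p = nn.Parameter(torch.zeros(n_features))

    def forward(self, x, temp=0.1):
        p = torch.sigmoid(self.logit_p)
        if self.training:  # relaxed-Bernoulli ("concrete") dropout mask
            u = torch.rand_like(x)
            noise = torch.log(u) - torch.log(1 - u)
            mask = torch.sigmoid((noise + torch.log(p) - torch.log(1 - p)) / temp)
        else:
            mask = p
        return x * mask, p

n_features = 64                                # illustrative dimensionality
gate = DropoutGate(n_features)                 # "selector"
operator = nn.Sequential(nn.Linear(n_features, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.Adam(list(gate.parameters()) + list(operator.parameters()), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def train_step(x, y, sparsity=1e-2):
    gated, p = gate(x)
    # Classification loss plus a penalty for keeping many features.
    loss = loss_fn(operator(gated), y) + sparsity * p.sum()
    opt.zero_grad()
    loss.backward()
    opt.step()

# After training, rank features by learned keep-probability:
# ranking = torch.argsort(torch.sigmoid(gate.logit_p), descending=True)
```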
Affiliation(s)
- Xiaoquan Ke
- Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hong Kong Special Administrative Region.
- Man Wai Mak
- Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hong Kong Special Administrative Region.
- Helen M Meng
- Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong Special Administrative Region.
3
Daneshfar F, Jamshidi MB. An octonion-based nonlinear echo state network for speech emotion recognition in Metaverse. Neural Netw 2023; 163:108-121. [PMID: 37030275] [DOI: 10.1016/j.neunet.2023.03.026]
Abstract
While the Metaverse is becoming a popular trend and drawing much attention from academia, society, and business, the processing cores used in its infrastructure need to be improved, particularly for signal processing and pattern recognition. Speech emotion recognition (SER) therefore plays a crucial role in making Metaverse platforms more usable and enjoyable for their users. However, existing SER methods face two significant problems in the online environment. The first is the shortage of adequate engagement and customization between avatars and users; the second is the complexity of SER in the Metaverse, where we face both people and their digital twins or avatars. Developing efficient machine learning (ML) techniques tailored to hypercomplex signal processing is therefore essential to enhance the impressiveness and tangibility of Metaverse platforms. Echo state networks (ESNs), a powerful ML tool for SER, are an appropriate technique for strengthening the Metaverse's foundations in this area. Nevertheless, ESNs have technical issues that prevent precise and reliable analysis, especially for high-dimensional data; their most significant limitation is the high memory consumption caused by the reservoir structure when processing high-dimensional signals. To address these problems, we propose a novel ESN structure empowered by octonion algebra, called NO2GESNet. Octonion numbers have eight dimensions, represent high-dimensional data compactly, and improve the network's precision and performance compared with conventional ESNs. The proposed network also overcomes the weakness of ESNs in presenting higher-order statistics to the output layer by equipping it with a multidimensional bilinear filter. Three comprehensive scenarios for using the proposed network in the Metaverse are designed and analyzed; they demonstrate the accuracy and performance of the proposed approach and illustrate how SER can be employed in Metaverse platforms.
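The octonion-valued reservoir itself is beyond a short example, but the classical echo state network recurrence that NO2GESNet generalizes can be sketched in a few lines of NumPy; the reservoir size, leak rate, and spectral radius below are illustrative defaults, not the paper's settings.

```python
import numpy as np

class EchoStateNetwork:
    """Minimal real-valued ESN; the paper lifts this recurrence to octonion algebra."""
    def __init__(self, n_in, n_res=500, spectral_radius=0.9, leak=0.3, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
        W = rng.uniform(-0.5, 0.5, (n_res, n_res))
        # Rescale so the spectral radius is below 1 (echo state property).
        W *= spectral_radius / max(abs(np.linalg.eigvals(W)))
        self.W, self.leak = W, leak

    def states(self, U):                       # U: (T, n_in) input sequence
        x = np.zeros(self.W.shape[0])
        collected = []
        for u in U:                            # leaky-integrator reservoir update
            x = (1 - self.leak) * x + self.leak * np.tanh(self.W_in @ u + self.W @ x)
            collected.append(x.copy())
        return np.array(collected)             # (T, n_res) reservoir trajectory

# The readout is typically fit by ridge regression on the collected states X:
# W_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(X.shape[1]), X.T @ Y).T
```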
Affiliation(s)
- Fatemeh Daneshfar
- Department of Computer Engineering, University of Kurdistan, Sanandaj, Iran.
4
Doğdu C, Kessler T, Schneider D, Shadaydeh M, Schweinberger SR. A comparison of machine learning algorithms and feature sets for automatic vocal emotion recognition in speech. Sensors (Basel) 2022; 22:7561. [PMID: 36236658] [PMCID: PMC9571288] [DOI: 10.3390/s22197561]
Abstract
Vocal emotion recognition (VER) in natural speech, often referred to as speech emotion recognition (SER), remains challenging for both humans and computers. Applied fields including clinical diagnosis and intervention, social interaction research, and human-computer interaction (HCI) increasingly benefit from efficient VER algorithms. Several feature sets have been used with machine learning (ML) algorithms for discrete emotion classification, but there is no consensus on which low-level descriptors and classifiers are optimal. Therefore, we aimed to compare the performance of ML algorithms across several different feature sets. Concretely, seven ML algorithms were compared on the Berlin Database of Emotional Speech: Multilayer Perceptron Neural Network (MLP), J48 Decision Tree (DT), Support Vector Machine with Sequential Minimal Optimization (SMO), Random Forest (RF), k-Nearest Neighbor (KNN), Simple Logistic Regression (LOG), and Multinomial Logistic Regression (MLR), with 10-fold cross-validation using four openSMILE feature sets (IS-09, emobase, GeMAPS, and eGeMAPS). Results indicated that SMO, MLP, and LOG perform better (reaching 87.85%, 84.00%, and 83.74% accuracy, respectively) than RF, DT, MLR, and KNN (with minimum accuracies of 73.46%, 53.08%, 70.65%, and 58.69%, respectively). Overall, the emobase feature set performed best. We discuss the implications of these findings for applications in diagnosis, intervention, and HCI.
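For readers wanting to reproduce this kind of comparison outside Weka, audEERING's opensmile Python package exposes the same feature families; a minimal scikit-learn analogue (a linear SVC standing in for SMO, with `wav_paths` and `labels` as hypothetical lists of file paths and emotion labels) might look like:

```python
import numpy as np
import opensmile
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Extract eGeMAPS functionals: one fixed-length vector per recording.
smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.eGeMAPSv02,
    feature_level=opensmile.FeatureLevel.Functionals,
)
X = np.vstack([smile.process_file(p).to_numpy().ravel() for p in wav_paths])

# 10-fold cross-validation, as in the study.
clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
scores = cross_val_score(clf, X, labels, cv=10)
print(f"mean accuracy: {scores.mean():.4f}")
```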
Affiliation(s)
- Cem Doğdu
- Department of Social Psychology, Institute of Psychology, Friedrich Schiller University Jena, Humboldtstraße 26, 07743 Jena, Germany
- Michael Stifel Center Jena for Data-Driven and Simulation Science, Friedrich Schiller University Jena, 07743 Jena, Germany
- Social Potential in Autism Research Unit, Friedrich Schiller University Jena, 07743 Jena, Germany
- Thomas Kessler
- Department of Social Psychology, Institute of Psychology, Friedrich Schiller University Jena, Humboldtstraße 26, 07743 Jena, Germany
- Dana Schneider
- Department of Social Psychology, Institute of Psychology, Friedrich Schiller University Jena, Humboldtstraße 26, 07743 Jena, Germany
- Michael Stifel Center Jena for Data-Driven and Simulation Science, Friedrich Schiller University Jena, 07743 Jena, Germany
- Social Potential in Autism Research Unit, Friedrich Schiller University Jena, 07743 Jena, Germany
- DFG Scientific Network “Understanding Others”, 10117 Berlin, Germany
- Maha Shadaydeh
- Michael Stifel Center Jena for Data-Driven and Simulation Science, Friedrich Schiller University Jena, 07743 Jena, Germany
- Computer Vision Group, Department of Mathematics and Computer Science, Friedrich Schiller University Jena, 07743 Jena, Germany
- Stefan R. Schweinberger
- Michael Stifel Center Jena for Data-Driven and Simulation Science, Friedrich Schiller University Jena, 07743 Jena, Germany
- Social Potential in Autism Research Unit, Friedrich Schiller University Jena, 07743 Jena, Germany
- Department of General Psychology and Cognitive Neuroscience, Friedrich Schiller University Jena, Am Steiger 3/Haus 1, 07743 Jena, Germany
- German Center for Mental Health (DZPG), Site Jena-Magdeburg-Halle, 07743 Jena, Germany
5
User identity protection in automatic emotion recognition through disguised speech. AI 2021. [DOI: 10.3390/ai2040038]
Abstract
Ambient Assisted Living (AAL) technologies are being developed which could assist elderly people to live healthy and active lives. These technologies have been used to monitor people's daily exercise, calorie consumption, and sleep patterns, and to provide coaching interventions that foster positive behaviour. Speech and audio processing can complement such AAL technologies by informing interventions for healthy ageing through analysis of speech data captured in the user's home. However, collecting data in home settings presents challenges, one of the most pressing being how to manage privacy and data protection. To address this issue, we propose a low-cost system for recording disguised speech signals that protects user identity through pitch shifting. The disguised speech so recorded can then be used to train machine learning models for affective behaviour monitoring. Affective behaviour could provide an indicator of the onset of mental health issues such as depression and cognitive impairment, and help develop clinical tools for automatically detecting and monitoring disease progression. In this article, acoustic features extracted from non-disguised and disguised speech are evaluated in an affect recognition task using six different machine learning classification methods. The results of transfer learning from non-disguised to disguised speech are also demonstrated. We identify sets of acoustic features that are not affected by the pitch-shifting algorithm and evaluate them in affect recognition. We found that, while the non-disguised speech signal gives the best Unweighted Average Recall (UAR) of 80.01%, the disguised speech signal causes only a slight degradation in performance, reaching 76.29%. Transfer learning from non-disguised to disguised speech results in a reduced UAR (65.13%); however, feature selection improves it (68.32%). This approach forms part of a larger project that includes health and wellbeing monitoring and coaching.
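The pitch-shifting disguise step can be sketched in a few lines with librosa; the four-semitone shift and file names below are placeholders, not necessarily the parameters used in the article.

```python
import librosa
import soundfile as sf

def disguise(path, out_path, n_steps=-4):
    """Pitch-shift a recording to mask speaker identity before feature extraction."""
    y, sr = librosa.load(path, sr=None)       # keep the native sample rate
    y_shifted = librosa.effects.pitch_shift(y, sr=sr, n_steps=n_steps)
    sf.write(out_path, y_shifted, sr)

disguise("session.wav", "session_disguised.wav")  # shift down four semitones
```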
6
Amjad A, Khan L, Chang HT. Effect on speech emotion classification of a feature selection approach using a convolutional neural network. PeerJ Comput Sci 2021; 7:e766. [PMID: 34805511] [PMCID: PMC8576551] [DOI: 10.7717/peerj-cs.766]
Abstract
Speech emotion recognition (SER) is a challenging problem because it is not clear which features are effective for classification. Emotion-related features are typically extracted from speech signals, and handcrafted features in particular are widely used for emotion identification from audio. However, such features are not sufficient to correctly identify the speaker's emotional state. The proposed work investigates the advantages of a deep convolutional neural network (DCNN): a pretrained framework is used to extract features from speech emotion databases, and we adopt a feature selection (FS) approach to find the most discriminative and important features for SER. We use random forest (RF), decision tree (DT), support vector machine (SVM), multilayer perceptron (MLP), and k-nearest neighbors (KNN) classifiers to classify seven emotions. All experiments are performed on four publicly accessible databases. Our method obtains accuracies of 92.02%, 88.77%, 93.61%, and 77.23% on Emo-DB, SAVEE, RAVDESS, and IEMOCAP, respectively, for speaker-dependent (SD) recognition with the feature selection method. Furthermore, compared with current handcrafted-feature-based SER methods, the proposed method shows the best results for speaker-independent SER. For Emo-DB, all classifiers attain an accuracy of more than 80% with or without the feature selection technique.
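A hedged sketch of the pipeline outlined here: a pretrained CNN as a fixed feature extractor, a univariate filter for feature selection, and a conventional classifier on the reduced set. ResNet-18, the ANOVA F-score criterion, k=128, and the `train_images`/`y_train` variables are illustrative assumptions, not the paper's exact choices.

```python
import torch
import torchvision.models as models
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC

# Pretrained CNN as a fixed feature extractor.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()              # drop the ImageNet classification head
backbone.eval()

@torch.no_grad()
def deep_features(images):                     # images: (B, 3, 224, 224) spectrograms
    return backbone(images).numpy()            # (B, 512) pooled deep features

X_train = deep_features(train_images)          # hypothetical spectrogram-image tensors
X_test = deep_features(test_images)

# Keep the k most discriminative features (ANOVA F-score), then classify.
selector = SelectKBest(f_classif, k=128).fit(X_train, y_train)
clf = SVC().fit(selector.transform(X_train), y_train)
print("accuracy:", clf.score(selector.transform(X_test), y_test))
```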
Affiliation(s)
- Ammar Amjad
- Department of Computer Science and Information Engineering, Chang Gung University, Taoyuan, Taiwan
- Lal Khan
- Department of Computer Science and Information Engineering, Chang Gung University, Taoyuan, Taiwan
- Hsien-Tsung Chang
- Department of Computer Science and Information Engineering, Chang Gung University, Taoyuan, Taiwan
- Department of Physical Medicine and Rehabilitation, Chang Gung Memorial Hospital, Taoyuan, Taiwan
- Artificial Intelligence Research Center, Chang Gung University, Taoyuan, Taiwan
- Bachelor Program in Artificial Intelligence, Chang Gung University, Taoyuan, Taiwan
7
de la Fuente Garcia S, Haider F, Luz S. COVID-19: Affect recognition through voice analysis during the winter lockdown in Scotland. Annu Int Conf IEEE Eng Med Biol Soc 2021; 2021:2326-2329. [PMID: 34890322] [DOI: 10.1109/embc46164.2021.9630833]
Abstract
The COVID-19 pandemic has led to unprecedented restrictions on people's lifestyles, which have affected their psychological wellbeing. In this context, this paper investigates the use of social signal processing techniques for remote assessment of emotions. It presents a machine learning method for affect recognition applied to recordings taken during the COVID-19 winter lockdown in Scotland (UK). The method is based exclusively on acoustic features extracted from voice recordings collected through home and mobile devices (i.e. phones and tablets), thus providing insight into the feasibility of monitoring people's psychological wellbeing remotely, automatically, and at scale. The proposed model predicts affect with a concordance correlation coefficient of 0.4230 (using Random Forest) for arousal and 0.3354 (using Decision Trees) for valence. Clinical relevance: In 2018/2019, 12% and 14% of Scottish adults reported depression and anxiety symptoms, respectively. Remote emotion recognition through home devices would support the detection of these difficulties, which are often underdiagnosed and, if untreated, may lead to temporary or chronic disability.
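The concordance correlation coefficient (CCC) used above measures agreement, penalizing both poor correlation and systematic bias. A short sketch of one affect dimension's evaluation, with `X_train`, `arousal_train`, and related arrays as hypothetical acoustic feature data and the hyperparameters as illustrative values:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def ccc(y_true, y_pred):
    """Lin's concordance correlation coefficient."""
    mu_t, mu_p = y_true.mean(), y_pred.mean()
    cov = ((y_true - mu_t) * (y_pred - mu_p)).mean()
    return 2 * cov / (y_true.var() + y_pred.var() + (mu_t - mu_p) ** 2)

# One regressor per affect dimension (arousal shown; valence is analogous).
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, arousal_train)
print("arousal CCC:", ccc(arousal_test, model.predict(X_test)))
```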
8
Farooq M, Hussain F, Baloch NK, Raja FR, Yu H, Zikria YB. Impact of feature selection algorithm on speech emotion recognition using deep convolutional neural network. Sensors (Basel) 2020; 20:6008. [PMID: 33113907] [PMCID: PMC7660211] [DOI: 10.3390/s20216008]
Abstract
Speech emotion recognition (SER) plays a significant role in human–machine interaction. Recognizing emotion from speech and classifying it precisely is a challenging task because a machine cannot understand its context. For accurate emotion classification, emotionally relevant features must be extracted from the speech data. Traditionally, handcrafted features were used for emotional classification from speech signals; however, they are not efficient enough to accurately depict the emotional states of the speaker. In this study, the benefits of a deep convolutional neural network (DCNN) for SER are explored. For this purpose, a pretrained network is used to extract features from state-of-the-art speech emotion datasets. Subsequently, a correlation-based feature selection technique is applied to the extracted features to select the most appropriate and discriminative features for SER. For the classification of emotions, we utilize support vector machines, random forests, the k-nearest neighbors algorithm, and neural network classifiers. Experiments are performed for speaker-dependent and speaker-independent SER using four publicly available datasets: the Berlin Dataset of Emotional Speech (Emo-DB), Surrey Audio Visual Expressed Emotion (SAVEE), Interactive Emotional Dyadic Motion Capture (IEMOCAP), and the Ryerson Audio Visual Dataset of Emotional Speech and Song (RAVDESS). Our proposed method achieves accuracies of 95.10% for Emo-DB, 82.10% for SAVEE, 83.80% for IEMOCAP, and 81.30% for RAVDESS in speaker-dependent SER experiments. Moreover, our method yields the best results for speaker-independent SER compared with existing handcrafted-feature-based SER approaches.
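A correlation-based filter of the kind mentioned here keeps features that correlate strongly with the class label but weakly with each other. The greedy strategy and thresholds below are assumptions for illustration, not the paper's exact criterion:

```python
import numpy as np
import pandas as pd

def correlation_select(X, y, label_min=0.2, redundancy_max=0.8):
    """Greedy correlation-based feature selection (CFS-style filter)."""
    df = pd.DataFrame(X)
    relevance = df.apply(lambda col: abs(np.corrcoef(col, y)[0, 1]))
    selected = []
    for f in relevance.sort_values(ascending=False).index:
        if relevance[f] < label_min:
            break  # remaining features are too weakly related to the label
        # Reject features that are redundant with an already selected one.
        if all(abs(df[f].corr(df[g])) < redundancy_max for g in selected):
            selected.append(f)
    return selected

# kept = correlation_select(deep_features, emotion_labels)  # indices of retained features
```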
Affiliation(s)
- Misbah Farooq
- Department of Computer Engineering, University of Engineering and Technology, Taxila 47050, Pakistan; (M.F.); (F.H.); (N.K.B.)
- Fawad Hussain
- Department of Computer Engineering, University of Engineering and Technology, Taxila 47050, Pakistan; (M.F.); (F.H.); (N.K.B.)
- Naveed Khan Baloch
- Department of Computer Engineering, University of Engineering and Technology, Taxila 47050, Pakistan; (M.F.); (F.H.); (N.K.B.)
- Fawad Riasat Raja
- Department of Software Engineering, University of Engineering and Technology, Taxila 47050, Pakistan;
- Machine Intelligence and Pattern Analysis Laboratory, Griffith University, Nathan QLD 4111, Australia
- Heejung Yu
- Department of Electronics and Information Engineering, Korea University, Sejong 30019, Korea
- Correspondence: (H.Y.); (Y.B.Z.)
- Yousaf Bin Zikria
- Department of Information and Communication Engineering, Yeungnam University, Gyeongsan 38541, Korea
- Correspondence: (H.Y.); (Y.B.Z.)