Daneshfar F, Jamshidi MB. An octonion-based nonlinear echo state network for speech emotion recognition in Metaverse.
Neural Netw 2023;
163:108-121. [PMID:
37030275 DOI:
10.1016/j.neunet.2023.03.026]
[Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/29/2022] [Revised: 01/18/2023] [Accepted: 03/19/2023] [Indexed: 03/29/2023]
Abstract
While the Metaverse is becoming a popular trend and drawing much attention from academia, society, and businesses, processing cores used in its infrastructures need to be improved, particularly in terms of signal processing and pattern recognition. Accordingly, the speech emotion recognition (SER) method plays a crucial role in creating the Metaverse platforms more usable and enjoyable for its users. However, existing SER methods continue to be plagued by two significant problems in the online environment. The shortage of adequate engagement and customization between avatars and users is recognized as the first issue and the second problem is related to the complexity of SER problems in the Metaverse as we face people and their digital twins or avatars. This is why developing efficient machine learning (ML) techniques specified for hypercomplex signal processing is essential to enhance the impressiveness and tangibility of the Metaverse platforms. As a solution, echo state networks (ESNs), which are an ML powerful tool for SER, can be an appropriate technique to enhance the Metaverse's foundations in this area. Nevertheless, ESNs have some technical issues restricting them from a precise and reliable analysis, especially in the aspect of high-dimensional data. The most significant limitation of these networks is the high memory consumption caused by their reservoir structure in face of high-dimensional signals. To solve all problems associated with ESNs and their application in the Metaverse, we have come up with a novel structure for ESNs empowered by octonion algebra called NO2GESNet. Octonion numbers have eight dimensions, compactly display high-dimensional data, and improve the network precision and performance in comparison to conventional ESNs. The proposed network also solves the weaknesses of the ESNs in the presentation of the higher-order statistics to the output layer by equipping it with a multidimensional bilinear filter. Three comprehensive scenarios to use the proposed network in the Metaverse have been designed and analyzed, not only do they show the accuracy and performance of the proposed approach, but also the ways how SER can be employed in the Metaverse platforms.
Collapse