1. Avola D, Cinque L, Mambro AD, Fagioli A, Marini MR, Pannone D, Fanini B, Foresti GL. Spatio-Temporal Image-Based Encoded Atlases for EEG Emotion Recognition. Int J Neural Syst 2024; 34:2450024. [PMID: 38533631 DOI: 10.1142/s0129065724500242] [Indexed: 03/28/2024]
Abstract
Emotion recognition plays an essential role in human-human interaction since it is a key to understanding the emotional states and reactions of human beings when they are subject to events and engagements in everyday life. Moving towards human-computer interaction, the study of emotions becomes fundamental because it is at the basis of the design of advanced systems to support a broad spectrum of application areas, including forensic, rehabilitative, educational, and many others. An effective method for discriminating emotions is based on ElectroEncephaloGraphy (EEG) data analysis, which is used as input for classification systems. Collecting brain signals on several channels and for a wide range of emotions produces cumbersome datasets that are hard to manage, transmit, and use in varied applications. In this context, the paper introduces the Empátheia system, which explores a different EEG representation by encoding EEG signals into images prior to their classification. In particular, the proposed system extracts spatio-temporal image encodings, or atlases, from EEG data through the Processing and transfeR of Interaction States and Mappings through Image-based eNcoding (PRISMIN) framework, thus obtaining a compact representation of the input signals. The atlases are then classified through the Empátheia architecture, which comprises branches based on convolutional, recurrent, and transformer models designed and tuned to capture the spatial and temporal aspects of emotions. Extensive experiments were conducted on the Shanghai Jiao Tong University (SJTU) Emotion EEG Dataset (SEED) public dataset, where the proposed system significantly reduced its size while retaining high performance. The results obtained highlight the effectiveness of the proposed approach and suggest new avenues for data representation in emotion recognition from EEG signals.
Affiliation(s)
- Danilo Avola
  - Department of Computer Science, Sapienza University of Rome, Via Salaria 113, Rome 00198, Italy
- Luigi Cinque
  - Department of Computer Science, Sapienza University of Rome, Via Salaria 113, Rome 00198, Italy
- Angelo Di Mambro
  - Department of Computer Science, Sapienza University of Rome, Via Salaria 113, Rome 00198, Italy
- Alessio Fagioli
  - Department of Computer Science, Sapienza University of Rome, Via Salaria 113, Rome 00198, Italy
- Marco Raoul Marini
  - Department of Computer Science, Sapienza University of Rome, Via Salaria 113, Rome 00198, Italy
- Daniele Pannone
  - Department of Computer Science, Sapienza University of Rome, Via Salaria 113, Rome 00198, Italy
- Bruno Fanini
  - Institute of Heritage Science, National Research Council, Area della Ricerca Roma 1, SP35d, 9, Montelibretti 00010, Italy
- Gian Luca Foresti
  - Department of Computer Science, Mathematics and Physics, University of Udine, Via delle Scienze 206, Udine 33100, Italy
2. Choi SH, Park JK, An D, Kim CH, Park G, Lee I, Lee S. Fault Diagnosis Method for Human Coexistence Robots Based on Convolutional Neural Networks Using Time-Series Data Generation and Image Encoding. Sensors (Basel) 2023; 23:9753. [PMID: 38139599 PMCID: PMC10748154 DOI: 10.3390/s23249753] [Received: 11/03/2023] [Revised: 12/01/2023] [Accepted: 12/07/2023] [Indexed: 12/24/2023]
Abstract
This paper proposes fault diagnosis methods aimed at proactively preventing potential safety issues in robot systems, particularly human coexistence robots (HCRs) used in industrial environments. The data were collected from durability tests of the driving module for HCRs, gathering time-series vibration data until the module failed. In this study, to apply classification methods in the absence of post-failure data, the initial 50% of the collected data were designated as the normal section, and the data from the 10 h immediately preceding the failure were selected as the fault section. To generate additional data for the limited fault dataset, the Wasserstein generative adversarial networks with gradient penalty (WGAN-GP) model was utilized and residual connections were added to the generator to maintain the basic structure while preventing the loss of key features of the data. Considering that the performance of image encoding techniques varies depending on the dataset type, this study applied and compared five image encoding methods and four CNN models to facilitate the selection of the most suitable algorithm. The time-series data were converted into image data using image encoding techniques including recurrence plot, Gramian angular field, Markov transition field, spectrogram, and scalogram. These images were then applied to CNN models, including VGGNet, GoogleNet, ResNet, and DenseNet, to calculate the accuracy of fault diagnosis and compare the performance of each model. The experimental results demonstrated significant improvements in diagnostic accuracy when employing the WGAN-GP model to generate fault data, and among the image encoding techniques and convolutional neural network models, spectrogram and DenseNet exhibited superior performance, respectively.
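One of the encodings named in the abstract, the Gramian angular field, can be illustrated with a minimal sketch. This is the textbook summation variant (GASF) written in plain Python, not the authors' implementation; the function name and the sample `vibration` values are hypothetical.

```python
import math


def gramian_angular_summation_field(series):
    """Encode a 1-D series as a GASF matrix: G[i][j] = cos(phi_i + phi_j)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Rescale to [-1, 1] so that arccos is defined for every sample.
    scaled = [2.0 * (x - lo) / (hi - lo) - 1.0 for x in series]
    # Clamp to guard against floating-point drift just outside [-1, 1].
    scaled = [max(-1.0, min(1.0, s)) for s in scaled]
    # Map each sample to an angle in polar coordinates.
    phi = [math.acos(s) for s in scaled]
    # Pairwise angular sums give a symmetric image of size n x n.
    return [[math.cos(a + b) for b in phi] for a in phi]


# Hypothetical vibration samples, for illustration only.
vibration = [0.0, 0.5, 1.0, 0.5, 0.0]
gasf = gramian_angular_summation_field(vibration)
```

The resulting square matrix can be saved as a grayscale image and passed to a CNN; a series of length n yields an n x n image, so long recordings are usually windowed or downsampled first.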
Affiliation(s)
- Seung-Hwan Choi
  - Advanced Mechatronics Research Group, Daegyeong Division, Korea Institute of Industrial Technology, Daegu 42994, Republic of Korea
- Jun-Kyu Park
  - Renewable Energy Solution Group, Korea Electric Power Research Institute (KEPRI), Naju 58277, Republic of Korea
- Dawn An
  - Advanced Mechatronics Research Group, Daegyeong Division, Korea Institute of Industrial Technology, Daegu 42994, Republic of Korea
- Chang-Hyun Kim
  - Advanced Mechatronics Research Group, Daegyeong Division, Korea Institute of Industrial Technology, Daegu 42994, Republic of Korea
- Gunseok Park
  - Advanced Mechatronics Research Group, Daegyeong Division, Korea Institute of Industrial Technology, Daegu 42994, Republic of Korea
- Inho Lee
  - Department of Electronics Engineering, Pusan National University, Busan 46241, Republic of Korea
- Suwoong Lee
  - Advanced Mechatronics Research Group, Daegyeong Division, Korea Institute of Industrial Technology, Daegu 42994, Republic of Korea
3. Wang J, Tan FS, Yuan Y. Random Matrix Transformation and Its Application in Image Hiding. Sensors (Basel) 2023; 23:1017. [PMID: 36679814 PMCID: PMC9867056 DOI: 10.3390/s23021017] [Received: 11/28/2022] [Revised: 01/03/2023] [Accepted: 01/05/2023] [Indexed: 06/17/2023]
Abstract
Image coding technology has become indispensable in the field of modern information. With the vigorous development of the big data era, information security has received more attention. Image steganography is an important method of image encoding and hiding, and how to protect information security with this technology is worth studying. Building on mathematical modeling, this paper makes innovations not only in improving the theoretical system of the kernel function but also in constructing a random matrix to establish an information-hiding scheme. When the random matrix is used as the reference matrix for secret-information steganography, its characteristics keep the secret information set to be retrieved very small, which reduces the modification range of the steganography image and improves steganography image quality and efficiency. This scheme can maintain the steganography image quality with a PSNR of 49.95 dB at a steganography capacity of 1.5 bits per pixel and can ensure that the steganography efficiency is improved by reducing the steganography set. In order to adapt to different steganography requirements and improve the steganography ability of the steganography schemes, this paper also proposes an adaptive large-capacity information-hiding scheme based on the random matrix. In this scheme, a method of expanding the random matrix is proposed, which can generate a corresponding random matrix according to different steganography capacity requirements to achieve the corresponding secret-information steganography. Two schemes are demonstrated through simulation experiments as well as an analysis of the steganography efficiency, steganography image quality, and steganography capacity and security. The experimental results show that the latter two schemes are better than the first two in terms of steganography capacity and steganography image quality.
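The PSNR figure quoted in the abstract follows the standard definition, PSNR = 10 log10(MAX^2 / MSE). The sketch below shows that definition for 8-bit grayscale images stored as nested lists; it is not taken from the paper, and the function name and the tiny `cover` and `stego` arrays are hypothetical.

```python
import math


def psnr(cover, stego, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    flat_cover = [p for row in cover for p in row]
    flat_stego = [p for row in stego for p in row]
    if len(flat_cover) != len(flat_stego):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(flat_cover, flat_stego)) / len(flat_cover)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2 x 2 cover image and a stego image with two pixels changed by 1.
cover = [[100, 101], [102, 103]]
stego = [[100, 102], [102, 102]]
quality_db = psnr(cover, stego)
```

Higher values mean the stego image is closer to the cover image, so a scheme that modifies fewer pixels, or modifies them by smaller amounts, scores higher at the same payload.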
Affiliation(s)
- Jijun Wang
  - Faculty of Computing and Informatics, Universiti Malaysia Sabah (UMS), Kota Kinabalu 88400, Malaysia
  - Guangxi Key Laboratory of Big Data in Finance and Economics, Nanning 530003, China
  - Faculty of Big Data and Artificial Intelligence, Guangxi University of Finance and Economics, Nanning 530003, China
- Fun Soo Tan
  - Faculty of Computing and Informatics, Universiti Malaysia Sabah (UMS), Kota Kinabalu 88400, Malaysia
- Yi Yuan
  - Faculty of Computing and Informatics, Universiti Malaysia Sabah (UMS), Kota Kinabalu 88400, Malaysia
4. Babu EK, Mistry K, Anwar MN, Zhang L. Facial Feature Extraction Using a Symmetric Inline Matrix-LBP Variant for Emotion Recognition. Sensors (Basel) 2022; 22:8635. [PMID: 36433232 PMCID: PMC9696972 DOI: 10.3390/s22228635] [Received: 09/30/2022] [Revised: 11/02/2022] [Accepted: 11/04/2022] [Indexed: 06/16/2023]
Abstract
With a large number of Local Binary Patterns (LBP) variants currently in use, the significance and importance of visual descriptors in computer vision applications are evident. This paper presents a novel visual descriptor, i.e., SIM-LBP. It employs a new matrix technique called the Symmetric Inline Matrix generator method, which acts as a new variant of LBP. The key feature that separates our variant from existing counterparts is that it is very efficient in extracting facial expression features such as the eyes, eyebrows, nose, and mouth in a wide range of lighting conditions. To test our model, we applied SIM-LBP to the JAFFE dataset to convert all the images to their corresponding SIM-LBP transformed variants. These transformed images are then used to train a Convolutional Neural Network (CNN) based deep learning model for facial expression recognition (FER). Several performance evaluation metrics, i.e., recognition accuracy rate, precision, recall, and F1-score, were used to test model efficiency in comparison with those using the traditional LBP descriptor and other LBP variants. With the proposed SIM-LBP transformation applied to the input images, our model outperformed the baseline methods on all four metrics. A comparative analysis with other state-of-the-art methods further shows the usefulness of the proposed SIM-LBP model. The proposed SIM-LBP transformation can also be applied to facial images to identify a person's mental states and predict mood variations.
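For context, the classic 3 x 3 LBP descriptor that SIM-LBP is described as a variant of can be sketched as follows. This shows only the baseline operator; the Symmetric Inline Matrix generator is not specified in the abstract and is not reproduced here. The function names, the clockwise bit order, and the sample `patch` are assumptions made for illustration.

```python
def lbp_code(patch):
    """Classic 8-neighbour LBP code for the centre pixel of a 3 x 3 patch."""
    center = patch[1][1]
    # Neighbours visited clockwise from the top-left corner (one common convention).
    neighbours = [
        patch[0][0], patch[0][1], patch[0][2],
        patch[1][2], patch[2][2], patch[2][1],
        patch[2][0], patch[1][0],
    ]
    code = 0
    for bit, value in enumerate(neighbours):
        # A neighbour at least as bright as the centre sets its bit.
        if value >= center:
            code |= 1 << bit
    return code


def lbp_image(image):
    """Apply lbp_code to every interior pixel of a grayscale image (nested lists)."""
    height, width = len(image), len(image[0])
    return [
        [lbp_code([row[x - 1:x + 2] for row in image[y - 1:y + 2]])
         for x in range(1, width - 1)]
        for y in range(1, height - 1)
    ]


# Hypothetical 3 x 3 grayscale patch.
patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
code = lbp_code(patch)
```

Because each code depends only on whether neighbours are brighter or darker than the centre, adding a constant brightness offset to the whole patch leaves the code unchanged, which is why LBP-style descriptors tolerate varying lighting.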
Affiliation(s)
- Eaby Kollonoor Babu
  - Faculty of Engineering and Environment, Department of Computer and Information Sciences, Northumbria University, Newcastle upon Tyne NE1 8ST, UK
- Kamlesh Mistry
  - Faculty of Engineering and Environment, Department of Computer and Information Sciences, Northumbria University, Newcastle upon Tyne NE1 8ST, UK
- Muhammad Naveed Anwar
  - Faculty of Engineering and Environment, Department of Computer and Information Sciences, Northumbria University, Newcastle upon Tyne NE1 8ST, UK
- Li Zhang
  - Department of Computer Science, Royal Holloway, University of London, Surrey TW20 0EX, UK
5. Ortis A, Grisanti M, Rundo F, Battiato S. A Benchmark Evaluation of Adaptive Image Compression for Multi Picture Object Stereoscopic Images. J Imaging 2021; 7:160. [PMID: 34460796 DOI: 10.3390/jimaging7080160] [Received: 07/05/2021] [Revised: 08/03/2021] [Accepted: 08/18/2021] [Indexed: 11/17/2022]
Abstract
A stereopair consists of two pictures of the same subject taken from two different points of view. Since the two images contain a high amount of redundant information, new compression approaches and data formats are continuously proposed, which aim to reduce the space needed to store a stereoscopic image while preserving its quality. A standard for multi-picture image encoding is represented by the MPO format (Multi-Picture Object). The classic stereoscopic image compression approaches compute a disparity map between the two views, which is stored with one of the two views together with a residual image. An alternative approach, named adaptive stereoscopic image compression, encodes just the two views independently with different quality factors. Then, the redundancy between the two views is exploited to enhance the low-quality image. In this paper, the problem of stereoscopic image compression is presented, with a focus on the adaptive stereoscopic compression approach, which allows us to obtain a standardized format of the compressed data. The paper presents a benchmark evaluation on large and standardized datasets including 60 stereopairs that differ by resolution and acquisition technique. The method is evaluated by varying the amount of compression, as well as the matching and optimization methods, resulting in 16 different settings. The adaptive approach is also compared with other MPO-compliant methods. The paper also presents a Human Visual System (HVS)-based assessment experiment, which involved 116 people, in order to verify the perceived quality of the decoded images.