1.
Ben Haj Amor A, El Ghoul O, Jemni M. Sign Language Recognition Using the Electromyographic Signal: A Systematic Literature Review. Sensors (Basel) 2023; 23:8343. [PMID: 37837173; PMCID: PMC10574929; DOI: 10.3390/s23198343]
Abstract
The analysis and recognition of sign languages are currently active fields of research. Approaches differ in their analysis methods and in the devices used for sign acquisition: traditional methods rely on video analysis or on spatial positioning data captured with motion capture tools. In contrast to these conventional approaches, electromyogram (EMG) signals, which measure muscle electrical activity, offer a promising technology for gesture detection, and EMG-based approaches have recently gained attention because of their advantages. This prompted us to conduct a comprehensive study of the methods, approaches, and projects that use EMG sensors for sign language handshape recognition. In this paper, we provide an overview of the sign language recognition field through a literature review, with the objective of offering an in-depth review of the most significant techniques, categorized by their respective methodologies. The survey discusses the progress and challenges of sign language recognition systems based on surface electromyography (sEMG) signals. These systems have shown promise but face issues such as sEMG data variability and sensor placement; using multiple sensors enhances reliability and accuracy. Machine learning, including deep learning, is used to address these challenges. Common classifiers in sEMG-based sign language recognition include SVM, ANN, CNN, KNN, HMM, and LSTM. While SVM and ANN are the most widely used, random forest and KNN have shown better performance in some cases, and a multilayer perceptron neural network achieved perfect accuracy in one study. CNN, often paired with LSTM, ranks as the third most popular classifier and can achieve exceptional accuracy, reaching up to 99.6% when both EMG and IMU data are used.
LSTM is highly regarded for handling sequential dependencies in EMG signals, making it a critical component of sign language recognition systems. In summary, the survey highlights the prevalence of SVM and ANN classifiers but also the effectiveness of alternatives such as random forests and KNN. LSTM emerges as the most suitable algorithm for capturing sequential dependencies and improving gesture recognition in EMG-based sign language recognition systems.
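The classifier families surveyed above can be tried on toy data with very little code. The sketch below is a minimal illustration with synthetic, hypothetical data (numpy only; `mav` and `knn_predict` are illustrative helpers, not from any cited study): windowed two-channel "EMG" is reduced to mean-absolute-value features and classified with a plain k-nearest-neighbour vote, one of the simpler classifiers the survey mentions.

```python
import numpy as np

rng = np.random.default_rng(0)

def mav(window):
    """Mean absolute value, a standard sEMG time-domain feature."""
    return np.mean(np.abs(window), axis=-1)

def knn_predict(train_X, train_y, X, k=3):
    """Plain k-nearest-neighbour majority vote in feature space."""
    preds = []
    for x in X:
        d = np.linalg.norm(train_X - x, axis=1)
        nearest = train_y[np.argsort(d)[:k]]
        vals, counts = np.unique(nearest, return_counts=True)
        preds.append(vals[np.argmax(counts)])
    return np.array(preds)

# Synthetic 2-channel "EMG" windows for two gestures: gesture 0
# activates channel 0 more strongly, gesture 1 channel 1.
def make_windows(n, gains):
    return np.stack([g * rng.standard_normal((n, 200)) for g in gains], axis=1)

X0 = make_windows(40, (1.0, 0.2))  # gesture 0
X1 = make_windows(40, (0.2, 1.0))  # gesture 1
X = np.concatenate([X0, X1])       # shape (80, 2, 200)
y = np.array([0] * 40 + [1] * 40)

feats = mav(X)                     # (80, 2): one MAV per channel
train_X, test_X = feats[::2], feats[1::2]
train_y, test_y = y[::2], y[1::2]

acc = np.mean(knn_predict(train_X, train_y, test_X) == test_y)
print(f"KNN accuracy on synthetic gestures: {acc:.2f}")
```

On such cleanly separated synthetic features almost any of the surveyed classifiers would do well; the differences the survey reports only emerge on real, variable sEMG data.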
Affiliation(s)
- Oussama El Ghoul
- Mada—Assistive Technology Center Qatar, Doha P.O. Box 24230, Qatar
- Mohamed Jemni
- Arab League Educational, Cultural, and Scientific Organization, Tunis 1003, Tunisia
2.
Freitas MLB, Mendes JJA, Dias TS, Siqueira HV, Stevan SL. Surgical Instrument Signaling Gesture Recognition Using Surface Electromyography Signals. Sensors (Basel) 2023; 23:6233. [PMID: 37448082; DOI: 10.3390/s23136233]
Abstract
Surgical Instrument Signaling (SIS) comprises specific hand gestures used for communication between the surgeon and the surgical instrumentator. With SIS, the surgeon performs signals representing particular instruments in order to avoid errors and communication failures. This work demonstrates the feasibility of an SIS gesture recognition system using surface electromyographic (sEMG) signals acquired from the Myo armband, aiming to build a processing routine that aids telesurgery or robotic surgery applications. Unlike other works that use up to 10 gestures to represent and classify SIS gestures, a database with 14 selected SIS gestures was recorded from 10 volunteers, with 30 repetitions per user. Segmentation, feature extraction, feature selection, and classification were performed, and several parameters were evaluated. These steps were designed with a wearable application in mind, for which the complexity of the pattern recognition algorithms is crucial. The system was tested offline, both on the full database and for each volunteer individually. An automatic segmentation algorithm was applied to identify muscle activation; 13 feature sets and 6 classifiers were then tested, and 2 ensemble techniques aided in separating the sEMG signals into the 14 SIS gestures. The Support Vector Machine classifier obtained an accuracy of 76% on the full database and 88% when the volunteers were analyzed individually. The system was demonstrated to be suitable for SIS gesture recognition using sEMG signals in wearable applications.
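Threshold-based onset detection of the kind used for automatic segmentation can be sketched as follows. This is a generic illustration, not the paper's algorithm: the smoothing window length, the multiplier `k`, and the baseline duration are all assumed parameters.

```python
import numpy as np

def detect_activation(emg, fs, win_ms=50, k=5.0, baseline_s=0.5):
    """Flag samples where the smoothed rectified EMG exceeds a
    threshold derived from a resting baseline (mean + k * std).
    Generic sketch of threshold-based muscle-activation detection."""
    win = max(1, int(fs * win_ms / 1000))
    # Moving-average envelope of the rectified signal.
    envelope = np.convolve(np.abs(emg), np.ones(win) / win, mode="same")
    base = envelope[: int(fs * baseline_s)]      # assumed resting segment
    thr = base.mean() + k * base.std()
    return envelope > thr

fs = 1000
t = np.arange(3 * fs) / fs
rng = np.random.default_rng(1)
emg = 0.05 * rng.standard_normal(t.size)          # resting noise
emg[fs:2 * fs] += 0.5 * rng.standard_normal(fs)   # 1 s burst of "activity"

active = detect_activation(emg, fs)
onset = np.argmax(active) / fs
print(f"activation detected around t = {onset:.2f} s")
```

Each contiguous run of `True` samples then becomes one candidate gesture segment to pass on to feature extraction.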
Affiliation(s)
- Melissa La Banca Freitas
- Graduate Program in Electrical Engineering (PPGEE), Federal University of Technology-Paraná (UTFPR), Ponta Grossa 84017-220, PR, Brazil
- José Jair Alves Mendes
- Graduate Program in Electrical Engineering and Industrial Informatics (CPGEI), Federal University of Technology-Paraná (UTFPR), Curitiba 80230-901, PR, Brazil
- Thiago Simões Dias
- Graduate Program in Electrical Engineering and Industrial Informatics (CPGEI), Federal University of Technology-Paraná (UTFPR), Curitiba 80230-901, PR, Brazil
- Hugo Valadares Siqueira
- Graduate Program in Electrical Engineering (PPGEE), Federal University of Technology-Paraná (UTFPR), Ponta Grossa 84017-220, PR, Brazil
- Sergio Luiz Stevan
- Graduate Program in Electrical Engineering (PPGEE), Federal University of Technology-Paraná (UTFPR), Ponta Grossa 84017-220, PR, Brazil
3.
Wu J, Zhang Y, Xie L, Yan Y, Zhang X, Liu S, An X, Yin E, Ming D. A novel silent speech recognition approach based on parallel inception convolutional neural network and Mel frequency spectral coefficient. Front Neurorobot 2022; 16:971446. [PMID: 36119717; PMCID: PMC9478652; DOI: 10.3389/fnbot.2022.971446]
Abstract
Silent speech recognition breaks the limitations of automatic speech recognition when acoustic signals cannot be produced or captured clearly, but it still has a long way to go before being ready for real-life applications. To address this issue, we propose a novel silent speech recognition framework based on surface electromyography (sEMG) signals. In our approach, a new deep learning architecture, the Parallel Inception Convolutional Neural Network (PICNN), is proposed and implemented in our silent speech recognition system, with six inception modules processing six channels of sEMG data separately and simultaneously. Meanwhile, Mel Frequency Spectral Coefficients (MFSCs) are employed to extract speech-related sEMG features for the first time. We further design and generate a 100-class dataset containing daily-life assistance demands for elderly and disabled individuals. The experimental results obtained from 28 subjects confirm that our silent speech recognition method outperforms state-of-the-art machine learning algorithms and deep learning architectures, achieving a best recognition accuracy of 90.76%. With sEMG data collected from four new subjects, efficient subject-based transfer learning is conducted to further improve the cross-subject recognition ability of the proposed model. These promising results show that our sEMG-based silent speech recognition system can achieve high recognition accuracy and steady performance in practical applications.
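MFSC features are log mel-filterbank energies of the framed signal. The following is a generic numpy sketch of the feature computation; the frame size, hop, and filter count are illustrative values, not the paper's settings.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters spaced evenly on the mel scale."""
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(fs / 2), n_filters + 2)
    hz_pts = mel_to_hz(mel_pts)
    bins = np.floor((n_fft + 1) * hz_pts / fs).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for j in range(l, c):
            fb[i - 1, j] = (j - l) / max(c - l, 1)   # rising edge
        for j in range(c, r):
            fb[i - 1, j] = (r - j) / max(r - c, 1)   # falling edge
    return fb

def mfsc(signal, fs, n_fft=256, hop=64, n_filters=26):
    """Log mel-filterbank energies (MFSC) of a 1-D signal."""
    window = np.hanning(n_fft)
    frames = [signal[s:s + n_fft] * window
              for s in range(0, len(signal) - n_fft + 1, hop)]
    spec = np.abs(np.fft.rfft(np.array(frames), axis=1)) ** 2  # power spectrum
    fb = mel_filterbank(n_filters, n_fft, fs)
    return np.log(spec @ fb.T + 1e-10)

fs = 1000
sig = np.sin(2 * np.pi * 80 * np.arange(2 * fs) / fs)
features = mfsc(sig, fs)
print(features.shape)   # (n_frames, n_filters)
```

In the paper's pipeline one such time-frequency map is computed per sEMG channel, and each map feeds one of the six parallel inception branches.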
Affiliation(s)
- Jinghan Wu
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Tianjin Artificial Intelligence Innovation Center (TAIIC), Tianjin, China
- Yakun Zhang
- Tianjin Artificial Intelligence Innovation Center (TAIIC), Tianjin, China
- Defense Innovation Institute, Academy of Military Sciences (AMS), Beijing, China
- Liang Xie
- Tianjin Artificial Intelligence Innovation Center (TAIIC), Tianjin, China
- Defense Innovation Institute, Academy of Military Sciences (AMS), Beijing, China
- Ye Yan
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Tianjin Artificial Intelligence Innovation Center (TAIIC), Tianjin, China
- Defense Innovation Institute, Academy of Military Sciences (AMS), Beijing, China
- Xu Zhang
- Department of Electronic Science and Technology, University of Science and Technology of China, Hefei, China
- Shuang Liu
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Xingwei An
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Correspondence: Xingwei An
- Erwei Yin
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Tianjin Artificial Intelligence Innovation Center (TAIIC), Tianjin, China
- Defense Innovation Institute, Academy of Military Sciences (AMS), Beijing, China
- Dong Ming
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
4.
Jiang Y, Song L, Zhang J, Song Y, Yan M. Multi-Category Gesture Recognition Modeling Based on sEMG and IMU Signals. Sensors (Basel) 2022; 22:5855. [PMID: 35957417; PMCID: PMC9371015; DOI: 10.3390/s22155855]
Abstract
Gesture recognition based on wearable devices is one of the vital components of human-computer interaction systems. Compared with skeleton-based recognition in computer vision, gesture recognition using wearable sensors has attracted wide attention for its robustness and convenience. Recently, many studies have proposed deep learning methods based on surface electromyography (sEMG) signals for gesture classification; however, most existing datasets contain only surface EMG signals, and datasets for multi-category gestures are lacking. Due to model limitations and inadequate classification data, the recognition accuracy of these methods cannot satisfy multi-gesture interaction scenarios. In this paper, a multi-category dataset containing 20 gestures is recorded with the help of a wearable device that can acquire surface electromyographic and inertial (IMU) signals. Various two-stream deep learning models are established and then further improved. Basic convolutional neural network (CNN), recurrent neural network (RNN), and Transformer models are evaluated as classifiers on our dataset. The CNN and RNN models reach test accuracies over 95%, whereas the Transformer model has a lower test accuracy of 71.68%. After further improvement, the CNN model is augmented with residual connections to form the CNN-Res model, achieving 98.24% accuracy with the shortest training and testing time. Combining the RNN model with the CNN-Res model, the long short-term memory (LSTM)-Res and gate recurrent unit (GRU)-Res models achieve the highest classification accuracies of 99.67% and 99.49%, respectively. Finally, the fusion of the Transformer and CNN models yields the Transformer-CNN model. This improvement dramatically boosts the performance of the Transformer module, increasing its recognition accuracy from 71.68% to 98.96%.
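The two-stream idea, one branch per modality with the branch outputs fused before classification, can be illustrated in miniature with hand-crafted features standing in for the learned ones (all names, shapes, and features below are hypothetical, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(2)

def emg_stream(emg_windows):
    """sEMG branch: per-channel RMS features."""
    return np.sqrt(np.mean(emg_windows ** 2, axis=-1))

def imu_stream(imu_windows):
    """IMU branch: mean and standard deviation per axis."""
    return np.concatenate([imu_windows.mean(-1), imu_windows.std(-1)], axis=-1)

def fuse(emg_windows, imu_windows):
    """Late fusion: concatenate the two streams' feature vectors
    before a shared classifier head (mirroring, in spirit, the
    two-stream models described above)."""
    return np.concatenate(
        [emg_stream(emg_windows), imu_stream(imu_windows)], axis=-1
    )

emg = rng.standard_normal((32, 8, 200))   # 32 windows, 8 sEMG channels
imu = rng.standard_normal((32, 6, 200))   # 6 IMU axes (3 acc + 3 gyro)
X = fuse(emg, imu)
print(X.shape)  # 8 EMG features + 12 IMU features per window
```

In the deep learning models of the paper, each branch is a learned sub-network (CNN, RNN, or Transformer) rather than a fixed feature extractor, but the fusion point plays the same role.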
Affiliation(s)
- Yujian Jiang
- State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing 100024, China
- Key Laboratory of Acoustic Visual Technology and Intelligent Control System, Ministry of Culture and Tourism, Communication University of China, Beijing 100024, China
- Beijing Key Laboratory of Modern Entertainment Technology, Communication University of China, Beijing 100024, China
- School of Information and Communication Engineering, Communication University of China, Beijing 100024, China
- Lin Song
- State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing 100024, China
- Key Laboratory of Acoustic Visual Technology and Intelligent Control System, Ministry of Culture and Tourism, Communication University of China, Beijing 100024, China
- Beijing Key Laboratory of Modern Entertainment Technology, Communication University of China, Beijing 100024, China
- School of Information and Communication Engineering, Communication University of China, Beijing 100024, China
- Junming Zhang
- State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing 100024, China
- Key Laboratory of Acoustic Visual Technology and Intelligent Control System, Ministry of Culture and Tourism, Communication University of China, Beijing 100024, China
- Beijing Key Laboratory of Modern Entertainment Technology, Communication University of China, Beijing 100024, China
- School of Information and Communication Engineering, Communication University of China, Beijing 100024, China
- Yang Song
- State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing 100024, China
- Key Laboratory of Acoustic Visual Technology and Intelligent Control System, Ministry of Culture and Tourism, Communication University of China, Beijing 100024, China
- Beijing Key Laboratory of Modern Entertainment Technology, Communication University of China, Beijing 100024, China
- School of Information and Communication Engineering, Communication University of China, Beijing 100024, China
- Ming Yan
- State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing 100024, China
- Key Laboratory of Acoustic Visual Technology and Intelligent Control System, Ministry of Culture and Tourism, Communication University of China, Beijing 100024, China
- Beijing Key Laboratory of Modern Entertainment Technology, Communication University of China, Beijing 100024, China
- School of Information and Communication Engineering, Communication University of China, Beijing 100024, China
5.
Li J, Wei L, Wen Y, Liu X, Wang H. Hand gesture recognition based improved multi-channels CNN architecture using EMG sensors. Journal of Intelligent & Fuzzy Systems 2022. [DOI: 10.3233/jifs-212390]
Abstract
With the continuous development of sensor and computer technology, human-computer interaction technology is also improving. Gesture recognition has become a research hotspot in human-computer interaction, sign language recognition, rehabilitation training, and sports medicine. This paper proposes a hand gesture recognition method that extracts time-domain and frequency-domain features from surface electromyography (sEMG) using an improved multi-channel convolutional neural network (IMC-CNN). The 10 most commonly used hand gestures are recognized using the spectral features of the sEMG signals, which form the input of the IMC-CNN model. First, third-order Butterworth low-pass and high-pass filters are used to denoise the sEMG signal. Second, effective sEMG signal segments are extracted from the denoised signal. Third, the spectrogram features of the different channels' sEMG signals are merged into a comprehensive improved spectrogram feature, which is used as the input of the IMC-CNN to classify the hand gestures. Finally, the recognition accuracy of the IMC-CNN model is compared with the three single-channel CNNs of the IMC-CNN model, SVM, LDA, LCNN, and EMGNET, with all experiments carried out on the same dataset and the same computer. The experimental results showed that the recognition accuracy, sensitivity, and precision of the proposed model reached 97.5%, 97.25%, and 96.25%, respectively. The proposed method achieves high average recognition accuracy not only on the Myo-collected dataset but also on the NinaPro DB5 dataset. Overall, the proposed model has more advantages in accuracy and efficiency than the comparison models.
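The Butterworth denoising step can be sketched with `scipy.signal`. The 20 Hz and 450 Hz cutoffs below are typical sEMG choices assumed for illustration, not taken from the paper.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def denoise_semg(x, fs, low=20.0, high=450.0, order=3):
    """Band-limit an sEMG channel with third-order Butterworth
    high-pass (motion artifact) and low-pass (high-frequency noise)
    filters, applied zero-phase with filtfilt."""
    b_hp, a_hp = butter(order, low, btype="highpass", fs=fs)
    b_lp, a_lp = butter(order, high, btype="lowpass", fs=fs)
    return filtfilt(b_lp, a_lp, filtfilt(b_hp, a_hp, x))

fs = 2000
t = np.arange(fs) / fs
drift = 0.5 * np.sin(2 * np.pi * 1.0 * t)    # low-frequency motion artifact
emg_band = np.sin(2 * np.pi * 100.0 * t)     # in-band component
y = denoise_semg(drift + emg_band, fs)

# The 1 Hz drift should be strongly attenuated, the 100 Hz content kept.
drift_power = np.mean(denoise_semg(drift, fs) ** 2)
print(f"residual drift power: {drift_power:.4f}")
```

The cleaned signal would then be segmented and converted to per-channel spectrograms for the IMC-CNN.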
Affiliation(s)
- Jun Li
- College of Electronic and Information Engineering, Yanshan University, Qinhuangdao, China
- College of Electronic and Information Engineering, Hebei University, Baoding, China
- Lixin Wei
- College of Electronic and Information Engineering, Yanshan University, Qinhuangdao, China
- Yintang Wen
- College of Electronic and Information Engineering, Yanshan University, Qinhuangdao, China
- Xiaoguang Liu
- College of Electronic and Information Engineering, Hebei University, Baoding, China
- Hongrui Wang
- College of Electronic and Information Engineering, Yanshan University, Qinhuangdao, China
- College of Electronic and Information Engineering, Hebei University, Baoding, China
6.
Fang Y, Yang J, Zhou D, Ju Z. Modelling EMG driven wrist movements using a bio-inspired neural network. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.10.104]
7.
Electromyogram-Based Classification of Hand and Finger Gestures Using Artificial Neural Networks. Sensors (Basel) 2021; 22:225. [PMID: 35009768; PMCID: PMC8749583; DOI: 10.3390/s22010225]
Abstract
Electromyogram (EMG) signals have been increasingly used for hand and finger gesture recognition. However, most studies have focused on the wrist and whole-hand gestures and not on individual finger (IF) gestures, which are considered more challenging. In this study, we develop EMG-based hand/finger gesture classifiers based on fixed electrode placement using machine learning methods. Ten healthy subjects performed ten hand/finger gestures, including seven IF gestures. EMG signals were measured from three channels, and six time-domain (TD) features were extracted from each channel. A total of 18 features was used to build personalized classifiers for ten gestures with an artificial neural network (ANN), a support vector machine (SVM), a random forest (RF), and a logistic regression (LR). The ANN, SVM, RF, and LR achieved mean accuracies of 0.940, 0.876, 0.831, and 0.539, respectively. One-way analyses of variance and F-tests showed that the ANN achieved the highest mean accuracy and the lowest inter-subject variance in the accuracy, respectively, suggesting that it was the least affected by individual variability in EMG signals. Using only TD features, we achieved a higher ratio of gestures to channels than other similar studies, suggesting that the proposed method can improve the system usability and reduce the computational burden.
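A set of six standard time-domain features per channel, yielding an 18-dimensional vector over three channels, might look like the sketch below. The exact six features used in the study are not specified here; MAV, RMS, WL, ZC, SSC, and VAR are common choices, and the `thresh` parameter is an assumption.

```python
import numpy as np

def td_features(window, thresh=0.01):
    """Six common time-domain sEMG features for one channel:
    MAV, RMS, waveform length, zero crossings, slope sign
    changes, and variance."""
    d = np.diff(window)
    mav = np.mean(np.abs(window))
    rms = np.sqrt(np.mean(window ** 2))
    wl = np.sum(np.abs(d))                          # waveform length
    zc = np.sum((window[:-1] * window[1:] < 0) &
                (np.abs(d) > thresh))               # zero crossings
    ssc = np.sum((d[:-1] * d[1:] < 0) &
                 ((np.abs(d[:-1]) > thresh) |
                  (np.abs(d[1:]) > thresh)))        # slope sign changes
    var = np.var(window)
    return np.array([mav, rms, wl, zc, ssc, var])

rng = np.random.default_rng(3)
channels = rng.standard_normal((3, 400))     # 3 electrode channels
feature_vector = np.concatenate([td_features(ch) for ch in channels])
print(feature_vector.shape)  # 6 features x 3 channels
```

Such an 18-dimensional vector per analysis window is what the study's ANN, SVM, RF, and LR classifiers would consume.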
8.
Wu J, Zhao T, Zhang Y, Xie L, Yan Y, Yin E. Parallel-Inception CNN Approach for Facial sEMG based Silent Speech Recognition. Annu Int Conf IEEE Eng Med Biol Soc 2021; 2021:554-557. [PMID: 34891354; DOI: 10.1109/embc46164.2021.9630373]
Abstract
To provide an external human-machine interaction platform for the elderly in need, a novel silent speech recognition system based on facial surface electromyography was developed. In this study, we propose a deep learning architecture named the Parallel-Inception Convolutional Neural Network (PICNN) and employ an up-to-date feature extraction method, log Mel frequency spectral coefficients (MFSC). To better meet the requirements of our target users, a 100-class dataset containing daily life-related demands was designed and generated for the comparative experiments. According to the experimental results, the highest recognition accuracy of 88.44% was achieved by the proposed recognition framework based on MFSC and PICNN, exceeding the performance of state-of-the-art deep learning architectures such as CNN, VGGNet, and Inception CNN by 3.22%, 4.09%, and 1.19%, respectively. These findings suggest that the newly developed silent speech approach holds promise to provide a more reliable communication channel and, at the same time, expands the application scenarios of speech recognition technology.