1. Achenbach P, Laux S, Purdack D, Müller PN, Göbel S. Give Me a Sign: Using Data Gloves for Static Hand-Shape Recognition. Sensors (Basel) 2023; 23:9847. [PMID: 38139692] [PMCID: PMC10747392] [DOI: 10.3390/s23249847]
Abstract
Human-to-human communication via the computer is mainly carried out using a keyboard or microphone. In the field of virtual reality (VR), where the most immersive experience possible is desired, the use of a keyboard contradicts this goal, while the use of a microphone is not always desirable (e.g., silent commands during task-force training) or simply not possible (e.g., if the user has hearing loss). Data gloves help to increase immersion within VR, as they correspond to our natural interaction. At the same time, they offer the possibility of accurately capturing hand shapes, such as those used in non-verbal communication (e.g., thumbs up, okay gesture, …) and in sign language. In this paper, we present a hand-shape recognition system using Manus Prime X data gloves, including data acquisition, data preprocessing, and data classification, to enable non-verbal communication within VR. We investigate the impact of outlier detection and feature selection in our data preprocessing on accuracy and classification time. To obtain a more generalized approach, we also studied the impact of artificial data augmentation, i.e., we created new artificial data from the recorded and filtered data to augment the training data set. With our approach, 56 different hand shapes could be distinguished with an accuracy of up to 93.28%. With a reduced set of 27 hand shapes, an accuracy of up to 95.55% could be achieved. The voting meta-classifier (VL2) proved to be the most accurate, albeit slowest, classifier. A good alternative is random forest (RF), which achieved even better accuracy in a few cases and was generally somewhat faster. Outlier detection proved to be an effective approach, especially for improving classification time. Overall, we have shown that our hand-shape recognition system using data gloves is suitable for communication within VR.
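The pipeline this abstract describes (outlier filtering of the training data, then a random-forest versus voting-meta-classifier comparison) can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' code: the data is synthetic, and the choice of IsolationForest and of the voting ensemble's base learners are assumptions.

```python
# Hedged sketch of an outlier-filtered hand-shape classification pipeline.
# Synthetic stand-in data; all parameter choices are assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import IsolationForest, RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in for glove features (flex/joint angles): 20 features, 5 "hand shapes".
X, y = make_classification(n_samples=600, n_features=20, n_informative=12,
                           n_classes=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Outlier detection on the training set; the paper reports this mainly
# improves classification time (it shrinks the training data).
mask = IsolationForest(random_state=0).fit_predict(X_tr) == 1
X_tr, y_tr = X_tr[mask], y_tr[mask]

rf = RandomForestClassifier(n_estimators=100, random_state=0)
voter = VotingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("lr", LogisticRegression(max_iter=1000)),
                ("knn", KNeighborsClassifier())],
    voting="soft")  # soft voting averages class probabilities

for name, clf in [("RF", rf), ("Voting", voter)]:
    clf.fit(X_tr, y_tr)
    print(name, round(clf.score(X_te, y_te), 3))
```

On real glove data, the trade-off the abstract reports would show up here as the voting ensemble being slower to fit and predict than the single random forest.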
Affiliation(s)
- Philipp Achenbach
- Serious Games Group, Technical University of Darmstadt, 64289 Darmstadt, Germany

2. Dias TS, Mendes JJA, Pichorim SF. Comparison between handcraft feature extraction and methods based on Recurrent Neural Network models for gesture recognition by instrumented gloves: A case for Brazilian Sign Language Alphabet. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104201]

3. LiST: A Lightweight Framework for Continuous Indian Sign Language Translation. Information 2023. [DOI: 10.3390/info14020079]
Abstract
Sign language is a natural, structured, and complete form of communication for exchanging information. Non-verbal communicators, also referred to as hearing impaired and hard of hearing (HI&HH), consider sign language an elemental mode of communication for conveying information. As this language is unfamiliar to a large percentage of the population, an automatic sign language translator that can act as an interpreter and remove the language barrier is essential. The advent of deep learning has resulted in the availability of several sign language translation (SLT) models. However, SLT models are complex, which increases translation latency. Furthermore, SLT models consider only hand gestures, which can lead to the misinterpretation of ambiguous sign language words. In this paper, we propose a lightweight SLT framework, LiST (Lightweight Sign language Translation), that simultaneously considers multiple modalities, such as hand gestures, facial expressions, and hand orientation, from an Indian sign video. The Inception V3 architecture handles the features associated with the different signer modalities, generating a feature map that is processed by a two-layer long short-term memory (LSTM) architecture. This sequence enables sentence-by-sentence recognition and the translation of sign language into text and audio. The model was tested with continuous Indian Sign Language (ISL) sentences taken from the INCLUDE dataset. The experimental results show that the LiST framework achieved a high translation accuracy of 91.2% and a prediction accuracy of 95.9% while maintaining a low word-level translation error compared to other existing models.

4. Xia K, Lu W, Fan H, Zhao Q. A Sign Language Recognition System Applied to Deaf-Mute Medical Consultation. Sensors (Basel) 2022; 22:9107. [PMID: 36501809] [PMCID: PMC9739223] [DOI: 10.3390/s22239107]
Abstract
Deaf-mute people objectively face difficulty seeking medical treatment. Due to the lack of sign language interpreters, most hospitals in China currently cannot interpret sign language, and normal medical treatment is a luxury for deaf people. In this paper, we propose a sign language recognition system, Heart-Speaker, applied to the deaf-mute consultation scenario. The system provides a low-cost solution to the difficult problem of treating deaf-mute patients. The doctor only needs to point the Heart-Speaker at the deaf patient, and the system automatically captures the sign language movements and translates their semantics. When a doctor issues a diagnosis or asks the patient a question, the system displays the corresponding sign language video and subtitles, meeting the needs of two-way communication between doctors and patients. The system uses the MobileNet-YOLOv3 model to recognize sign language; it meets the requirements of running on embedded terminals and provides favorable recognition accuracy. We performed experiments to verify the accuracy of the measurements. The experimental results show that the accuracy rate of Heart-Speaker in recognizing sign language can reach 90.77%.
Affiliation(s)
- Weiwei Lu (correspondence; Tel.: +86-13671637275)

5. Amangeldy N, Kudubayeva S, Kassymova A, Karipzhanova A, Razakhova B, Kuralov S. Sign Language Recognition Method Based on Palm Definition Model and Multiple Classification. Sensors (Basel) 2022; 22:6621. [PMID: 36081076] [PMCID: PMC9460639] [DOI: 10.3390/s22176621]
Abstract
Technologies for pattern recognition are used in various fields. One of the most relevant and important directions is the use of pattern-recognition technology, such as gesture recognition, in socially significant tasks, namely the development of real-time automatic sign language interpretation systems. More than 5% of the world's population (about 430 million people, including 34 million children) are deaf-mute and not always able to use the services of a live sign language interpreter. Almost 80% of people with disabling hearing loss live in low- and middle-income countries. The development of low-cost automatic sign language interpretation systems, without expensive sensors and unique cameras, would improve the lives of people with disabilities, contributing to their unhindered integration into society. To this end, this article analyzes gesture-recognition methods in the context of their use in automatic gesture-recognition systems in order to determine the most suitable ones. From this analysis, an algorithm based on a palm definition model and linear models for recognizing the shapes of the numbers and letters of the Kazakh sign language is proposed. The advantage of the proposed algorithm is that it recognizes 41 of the 42 letters in the Kazakh sign alphabet; previously, only the Russian letters of the Kazakh alphabet had been recognized. In addition, a unified function for configuring the frame depth-map mode has been integrated into the system, which improved recognition performance and can be used to create a multimodal database of gesture-word videos for the gesture recognition system.
Affiliation(s)
- Nurzada Amangeldy
- Faculty of Information Technologies, L.N. Gumilyov Eurasian National University, Nur-Sultan 010008, Kazakhstan
- Saule Kudubayeva
- Faculty of Information Technologies, L.N. Gumilyov Eurasian National University, Nur-Sultan 010008, Kazakhstan
- Akmaral Kassymova
- Institute of Economics, Information Technologies and Professional Education, Zhangir Khan West Kazakhstan Agrarian-Technical University, Uralsk 090000, Kazakhstan
- Ardak Karipzhanova
- Department of Information and Technical Sciences, Faculty of Information Technologies and Economics, Kazakh Humanitarian Law Innovative University, East Kazakhstan Region, Semey 701400, Kazakhstan
- Bibigul Razakhova
- Faculty of Information Technologies, L.N. Gumilyov Eurasian National University, Nur-Sultan 010008, Kazakhstan
- Serikbay Kuralov
- Faculty of Information Technologies, L.N. Gumilyov Eurasian National University, Nur-Sultan 010008, Kazakhstan

6. Reducing the Number of Sensors in the Data Glove for Recognition of Static Hand Gestures. Applied Sciences (Basel) 2022. [DOI: 10.3390/app12157388]
Abstract
Data glove devices, apart from being widely used in industry and entertainment, can also serve as a means of communication with the environment. This is possible thanks to advances in electronic technology and machine learning algorithms. In this paper, the results of a study using a designed data glove equipped with 10 piezoelectric sensors are reported, and the glove is validated on the task of recognizing 16 static signs of the Polish Sign Language (PSL) alphabet. The main result of the study is that recognition of the 16 PSL static gestures is possible with a reduced number of piezoelectric sensors. This was achieved by applying a decision tree classifier, which can rank the importance of the sensors for recognition performance. Other machine learning algorithms were also tested, and it was shown that the Support Vector Machine, k-NN, and Bagged Trees classifiers can achieve a sign recognition rate exceeding 90% with just three preselected sensors. Such a result allows a reduction in the design complexity and cost of the data glove while sustaining the reliability of the device.
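The sensor-reduction idea described here (rank sensors by decision-tree importance, then classify on only the top-ranked ones) can be sketched as follows. This is not the paper's code: the data is synthetic, and the choice of k-NN as the downstream classifier and all parameter values are assumptions.

```python
# Hedged sketch: rank 10 glove "sensors" by decision-tree importance,
# keep the top 3, and classify on the reduced input. Synthetic data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Stand-in for 10 piezoelectric sensor readings over 16 static signs.
X, y = make_classification(n_samples=800, n_features=10, n_informative=4,
                           n_classes=16, n_clusters_per_class=1, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

# The tree's feature importances give a per-sensor relevance ranking.
tree = DecisionTreeClassifier(random_state=1).fit(X_tr, y_tr)
top3 = np.argsort(tree.feature_importances_)[-3:]  # indices of best 3 sensors

# Retrain a simple classifier on only the selected sensor columns.
knn = KNeighborsClassifier().fit(X_tr[:, top3], y_tr)
print("kept sensors:", sorted(top3.tolist()),
      "accuracy:", round(knn.score(X_te[:, top3], y_te), 3))
```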

7. Sultan A, Makram W, Kayed M, Ali AA. Sign language identification and recognition: A comparative study. Open Computer Science 2022. [DOI: 10.1515/comp-2022-0240]
Abstract
Sign Language (SL) is the main language for deaf and disabled people. Each country has its own SL, different from those of other countries. Each sign in a language is represented with varying hand gestures, body movements, and facial expressions. Researchers in this field aim to remove obstacles that prevent communication with deaf people by replacing device-based techniques with vision-based techniques using Artificial Intelligence (AI) and deep learning. This article highlights two main SL processing tasks: Sign Language Recognition (SLR) and Sign Language Identification (SLID). The latter identifies the signer's language, while the former translates the signed conversation into tokens (signs). The article addresses the most common datasets used in the literature for the two tasks (static and dynamic datasets collected from different corpora), with contents including numerals, alphabets, words, and sentences from different SLs. It also discusses the devices required to build these datasets, as well as the preprocessing steps applied before training and testing. The article compares the different approaches and techniques applied to these datasets. It discusses both vision-based and data-glove-based approaches, aiming to analyze and focus on the main methods used in vision-based approaches, such as hybrid methods and deep learning algorithms. Furthermore, the article presents a graphical depiction and a tabular representation of various SLR approaches.
Affiliation(s)
- Ahmed Sultan
- Faculty of Computers and Artificial Intelligence, Computer Science Department, Beni-Suef University, Egypt
- Walied Makram
- Faculty of Computers and Information, Information System Department, Minia University, Egypt
- Mohammed Kayed
- Faculty of Computers and Artificial Intelligence, Computer Science Department, Beni-Suef University, Egypt
- Abdelmaged Amin Ali
- Faculty of Computers and Information, Information System Department, Minia University, Egypt

8. Shin S, Yoon HU, Yoo B. Hand Gesture Recognition Using EGaIn-Silicone Soft Sensors. Sensors (Basel) 2021; 21:3204. [PMID: 34063055] [PMCID: PMC8125695] [DOI: 10.3390/s21093204]
Abstract
Exploiting hand gestures for non-verbal communication has extraordinary potential in HCI. A data glove is an apparatus widely used to recognize hand gestures. To improve the functionality of the data glove, a highly stretchable sensor with a reliable signal-to-noise ratio is indispensable. To this end, this study focused on the development of soft silicone microchannel sensors using a eutectic gallium-indium (EGaIn) liquid metal alloy and on a hand gesture recognition system via a data glove using the proposed soft sensor. The EGaIn-silicone sensor was uniquely designed to include two sensing channels to monitor finger joint movements and to facilitate EGaIn alloy injection into the meander-type microchannels. We recruited 15 participants to collect a hand gesture dataset covering 12 static hand gestures. The dataset was used to estimate the performance of the proposed data glove in hand gesture recognition, and six traditional classification algorithms were studied. Random forest showed the highest classification accuracy of 97.3%, and linear discriminant analysis (LDA) the lowest at 87.4%. The non-linearity of the proposed sensor deteriorated the accuracy of LDA; however, the other classifiers adequately overcame it and achieved high accuracies (>90%).
Affiliation(s)
- Sungtae Shin
- Department of Mechanical Engineering, Dong-A University, Busan 49315, Korea
- Department of Mechanical Engineering, University of Maryland, College Park, MD 20742, USA
- Han Ul Yoon
- Division of Computer and Telecommunication Engineering, Yonsei University, Wonju 26493, Korea
- Byungseok Yoo
- Department of Aerospace Engineering, University of Maryland, College Park, MD 20742, USA

9. Mendes Junior JJA, Freitas MLB, Campos DP, Farinelli FA, Stevan SL, Pichorim SF. Analysis of Influence of Segmentation, Features, and Classification in sEMG Processing: A Case Study of Recognition of Brazilian Sign Language Alphabet. Sensors (Basel) 2020; 20:4359. [PMID: 32764286] [PMCID: PMC7471999] [DOI: 10.3390/s20164359]
Abstract
Sign language recognition systems aid communication among deaf people, hearing-impaired people, and speakers. One type of signal that has seen increased study and that can be used as input for these systems is surface electromyography (sEMG). This work presents the recognition of a set of alphabet gestures from Brazilian Sign Language (Libras) using sEMG acquired from an armband, with sEMG signals as the only input. Signals from 12 subjects were acquired using a Myo™ armband for the 26 signs of the Libras alphabet. Additionally, as sEMG processing has several parameters, the influence of segmentation, feature extraction, and classification was considered at each step of the pattern recognition pipeline. In segmentation, window length and four levels of overlap rate were analyzed, as were the contribution of each feature, feature sets from the literature, and new feature sets proposed for different classifiers. We found that the overlap rate had a strong influence on this task. Accuracies on the order of 99% were achieved with the following factors: segments of 1.75 s with a 12.5% overlap rate; the proposed set of four features; and random forest (RF) classifiers.
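The segmentation step analyzed in this entry (fixed-length windows with a chosen overlap rate, followed by per-window features) can be sketched as follows. The 1.75 s window and 12.5% overlap come from the abstract; the 200 Hz Myo EMG sampling rate is general knowledge about that device, and the random signal and RMS feature are illustrative assumptions, not the paper's feature set.

```python
# Hedged sketch of sEMG segmentation: cut a multi-channel stream into
# overlapping fixed-length windows, then extract a simple RMS feature.
import numpy as np

def segment(signal, fs, win_s=1.75, overlap=0.125):
    """Split a (samples, channels) array into overlapping windows."""
    win = int(win_s * fs)                # samples per window
    step = int(win * (1 - overlap))      # 12.5% overlap -> advance by 87.5%
    starts = range(0, signal.shape[0] - win + 1, step)
    return np.stack([signal[s:s + win] for s in starts])

fs = 200                                  # Myo EMG sample rate in Hz (assumed)
emg = np.random.randn(fs * 10, 8)         # 10 s of 8-channel stand-in sEMG
windows = segment(emg, fs)                # shape: (n_windows, 350, 8)
rms = np.sqrt((windows ** 2).mean(axis=1))  # one RMS value per window, channel
print(windows.shape, rms.shape)
```

Each row of `rms` would then be one feature vector fed to the classifier (RF in the paper's best configuration).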
Affiliation(s)
- José Jair Alves Mendes Junior
- Graduate Program in Electrical Engineering and Industrial Informatics (CPGEI), Federal University of Technology–Paraná (UTFPR), Curitiba (PR) 80230-901, Brazil
- Melissa La Banca Freitas
- Graduate Program in Electrical Engineering (PPGEE), Federal University of Technology–Paraná (UTFPR), Ponta Grossa (PR) 84017-220, Brazil
- Daniel Prado Campos
- Graduate Program in Biomedical Engineering (PPGEB), Federal University of Technology–Paraná (UTFPR), Curitiba (PR) 80230-901, Brazil
- Felipe Adalberto Farinelli
- Graduate Program in Electrical Engineering and Industrial Informatics (CPGEI), Federal University of Technology–Paraná (UTFPR), Curitiba (PR) 80230-901, Brazil
- Sergio Luiz Stevan
- Graduate Program in Electrical Engineering (PPGEE), Federal University of Technology–Paraná (UTFPR), Ponta Grossa (PR) 84017-220, Brazil
- Sérgio Francisco Pichorim
- Graduate Program in Electrical Engineering and Industrial Informatics (CPGEI), Federal University of Technology–Paraná (UTFPR), Curitiba (PR) 80230-901, Brazil