1
Che Z, Wan X, Xu J, Duan C, Zheng T, Chen J. Speaking without vocal folds using a machine-learning-assisted wearable sensing-actuation system. Nat Commun 2024; 15:1873. [PMID: 38472193 PMCID: PMC10933441 DOI: 10.1038/s41467-024-45915-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2023] [Accepted: 02/06/2024] [Indexed: 03/14/2024] Open
Abstract
Voice disorders resulting from various pathological vocal fold conditions or postoperative recovery of laryngeal cancer surgeries, are common causes of dysphonia. Here, we present a self-powered wearable sensing-actuation system based on soft magnetoelasticity that enables assisted speaking without relying on the vocal folds. It holds a lightweighted mass of approximately 7.2 g, skin-alike modulus of 7.83 × 10^5 Pa, stability against skin perspiration, and a maximum stretchability of 164%. The wearable sensing component can effectively capture extrinsic laryngeal muscle movement and convert them into high-fidelity and analyzable electrical signals, which can be translated into speech signals with the assistance of machine learning algorithms with an accuracy of 94.68%. Then, with the wearable actuation component, the speech could be expressed as voice signals while circumventing vocal fold vibration. We expect this approach could facilitate the restoration of normal voice function and significantly enhance the quality of life for patients with dysfunctional vocal folds.
Affiliation(s)
- Ziyuan Che
- Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Xiao Wan
- Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Jing Xu
- Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Chrystal Duan
- Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Tianqi Zheng
- Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Jun Chen
- Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
2
Fu J, Deng Z, Liu C, Liu C, Luo J, Wu J, Peng S, Song L, Li X, Peng M, Liu H, Zhou J, Qiao Y. Intelligent, Flexible Artificial Throats with Sound Emitting, Detecting, and Recognizing Abilities. Sensors (Basel, Switzerland) 2024; 24:1493. [PMID: 38475029 DOI: 10.3390/s24051493] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 02/22/2024] [Accepted: 02/22/2024] [Indexed: 03/14/2024]
Abstract
In recent years, there has been a notable rise in the number of patients afflicted with laryngeal diseases, including cancer, trauma, and other ailments leading to voice loss. Currently, the market is witnessing a pressing demand for medical and healthcare products designed to assist individuals with voice defects, prompting the invention of the artificial throat (AT). This user-friendly device eliminates the need for complex procedures like phonation reconstruction surgery. Therefore, in this review, we will initially give a careful introduction to the intelligent AT, which can act not only as a sound sensor but also as a thin-film sound emitter. Then, the sensing principle to detect sound will be discussed carefully, including capacitive, piezoelectric, electromagnetic, and piezoresistive components employed in the realm of sound sensing. Following this, the development of thermoacoustic theory and different materials made of sound emitters will also be analyzed. After that, various algorithms utilized by the intelligent AT for speech pattern recognition will be reviewed, including some classical algorithms and neural network algorithms. Finally, the outlook, challenge, and conclusion of the intelligent AT will be stated. The intelligent AT presents clear advantages for patients with voice impairments, demonstrating significant social values.
Affiliation(s)
- Junxin Fu
- School of Biomedical Engineering, Shenzhen Campus of Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, Shenzhen 518107, China
- Key Laboratory of Sensing Technology and Biomedical Instruments of Guangdong Province, School of Biomedical Engineering, Sun Yat-sen University, Guangzhou 510275, China
- Zhikang Deng
- School of Biomedical Engineering, Shenzhen Campus of Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, Shenzhen 518107, China
- Key Laboratory of Sensing Technology and Biomedical Instruments of Guangdong Province, School of Biomedical Engineering, Sun Yat-sen University, Guangzhou 510275, China
- Chang Liu
- School of Biomedical Engineering, Shenzhen Campus of Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, Shenzhen 518107, China
- Key Laboratory of Sensing Technology and Biomedical Instruments of Guangdong Province, School of Biomedical Engineering, Sun Yat-sen University, Guangzhou 510275, China
- Chuting Liu
- School of Biomedical Engineering, Shenzhen Campus of Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, Shenzhen 518107, China
- Key Laboratory of Sensing Technology and Biomedical Instruments of Guangdong Province, School of Biomedical Engineering, Sun Yat-sen University, Guangzhou 510275, China
- Jinan Luo
- School of Biomedical Engineering, Shenzhen Campus of Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, Shenzhen 518107, China
- Key Laboratory of Sensing Technology and Biomedical Instruments of Guangdong Province, School of Biomedical Engineering, Sun Yat-sen University, Guangzhou 510275, China
- Jingzhi Wu
- School of Biomedical Engineering, Shenzhen Campus of Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, Shenzhen 518107, China
- Key Laboratory of Sensing Technology and Biomedical Instruments of Guangdong Province, School of Biomedical Engineering, Sun Yat-sen University, Guangzhou 510275, China
- Shiqi Peng
- School of Biomedical Engineering, Shenzhen Campus of Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, Shenzhen 518107, China
- Key Laboratory of Sensing Technology and Biomedical Instruments of Guangdong Province, School of Biomedical Engineering, Sun Yat-sen University, Guangzhou 510275, China
- Lei Song
- School of Biomedical Engineering, Shenzhen Campus of Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, Shenzhen 518107, China
- Key Laboratory of Sensing Technology and Biomedical Instruments of Guangdong Province, School of Biomedical Engineering, Sun Yat-sen University, Guangzhou 510275, China
- Xinyi Li
- School of Biomedical Engineering, Shenzhen Campus of Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, Shenzhen 518107, China
- Key Laboratory of Sensing Technology and Biomedical Instruments of Guangdong Province, School of Biomedical Engineering, Sun Yat-sen University, Guangzhou 510275, China
- Minli Peng
- School of Biomedical Engineering, Shenzhen Campus of Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, Shenzhen 518107, China
- Key Laboratory of Sensing Technology and Biomedical Instruments of Guangdong Province, School of Biomedical Engineering, Sun Yat-sen University, Guangzhou 510275, China
- Houfang Liu
- School of Integrated Circuits and Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing 100084, China
- Jianhua Zhou
- School of Biomedical Engineering, Shenzhen Campus of Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, Shenzhen 518107, China
- Key Laboratory of Sensing Technology and Biomedical Instruments of Guangdong Province, School of Biomedical Engineering, Sun Yat-sen University, Guangzhou 510275, China
- Yancong Qiao
- School of Biomedical Engineering, Shenzhen Campus of Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, Shenzhen 518107, China
- Key Laboratory of Sensing Technology and Biomedical Instruments of Guangdong Province, School of Biomedical Engineering, Sun Yat-sen University, Guangzhou 510275, China
3
Yamada T, Yamaguchi K, Horike A, Takahashi K, Amornsuradech S, Nakagawa K, Yoshimi K, Tohara H. Development and evaluation of a new intraoral voice assist device called the voice retriever. Laryngoscope Investig Otolaryngol 2024; 9:e1204. [PMID: 38362198 PMCID: PMC10866604 DOI: 10.1002/lio2.1204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Revised: 11/17/2023] [Accepted: 12/13/2023] [Indexed: 02/17/2024] Open
Abstract
Objective: Patients lose their voice after laryngectomy for laryngeal cancer or aspiration prevention surgery for severe dysphagia. To assist such patients, we developed and verified the utility of a novel vocalization method using a device termed the voice retriever (VR), in which the sound source is placed in the mouth. Methods: We investigated the effectiveness of the VR in patients. The VR consists of a mouthpiece with a built-in speaker and a dedicated application that serves as the sound source. We compared the speech intelligibility and naturalness in normal participants using VR and an electrolarynx (EL) for the first time as well as the voice-related quality of life (V-RQOL) in patients with dysphonia before and after using the VR. Results: The VR produced significantly higher 100-syllable test scores as well as fluency, amount of additional noise, intonation, intelligibility and overall long reading test ratings in the first-time VR and EL users. Furthermore, the VR use significantly improved the V-RQOL of participants with dysphonia. Conclusion: Compared to EL, VR allows more effective speech improvement in participants without experience using an alternative vocalization method and improves the V-RQOL in patients with dysphonia. Level of Evidence: Step 4.
Affiliation(s)
- Taishi Yamada
- Department of Dysphagia Rehabilitation, Division of Gerontology and Gerodontology, Tokyo Medical and Dental University, Tokyo, Japan
- Kohei Yamaguchi
- Department of Dysphagia Rehabilitation, Division of Gerontology and Gerodontology, Tokyo Medical and Dental University, Tokyo, Japan
- Ayane Horike
- Department of Dysphagia Rehabilitation, Division of Gerontology and Gerodontology, Tokyo Medical and Dental University, Tokyo, Japan
- Kohei Takahashi
- Department of Rehabilitation, Japanese Red Cross Osaka Hospital, Osaka, Japan
- Sirinthip Amornsuradech
- Department of Dysphagia Rehabilitation, Division of Gerontology and Gerodontology, Tokyo Medical and Dental University, Tokyo, Japan
- Department of Community Dentistry, Mahidol University, Bangkok, Thailand
- Kazuharu Nakagawa
- Department of Dysphagia Rehabilitation, Division of Gerontology and Gerodontology, Tokyo Medical and Dental University, Tokyo, Japan
- Kanako Yoshimi
- Department of Dysphagia Rehabilitation, Division of Gerontology and Gerodontology, Tokyo Medical and Dental University, Tokyo, Japan
- Haruka Tohara
- Department of Dysphagia Rehabilitation, Division of Gerontology and Gerodontology, Tokyo Medical and Dental University, Tokyo, Japan
4
Pan C, Andrews LIB, Johnson E, Bhatt NK, Rizvi ZH. Factors associated with successful electrolarynx use after total laryngectomy, a multi-institutional study. Laryngoscope Investig Otolaryngol 2024; 9:e1212. [PMID: 38362175 PMCID: PMC10866577 DOI: 10.1002/lio2.1212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Revised: 12/11/2023] [Accepted: 12/21/2023] [Indexed: 02/17/2024] Open
Abstract
Objective: To identify characteristics associated with successful electrolarynx (EL) use after total laryngectomy (TL). Methods: Records of 196 adults who underwent TL from 03/15/2012 to 03/15/2022 at the University of Washington and Puget Sound Veterans Affairs were reviewed. Characteristics included age, Charlson Comorbidity Index, social support, pre-operative radiation (RT) and chemoradiation (CRT), and 6-month post-TL swallow status. EL success was evaluated using pre-defined criteria of intelligibility, reliability, and independence with use. Poisson regressions and robust standard error estimates were used to estimate unadjusted risk ratios for each characteristic. Statistically significant characteristics were included in multivariate analysis (MVA) to estimate adjusted risk ratios. Results: Median age was 64, median Charlson Comorbidity Index was 5, 170 (87%) were male, 159 (81%) had high social support, and 159 (81%) attained post-TL full-oral diet. Pre-operatively, 110 (56%) had RT, including 55 (28%) with CRT. Ninety-three (47%) met our criteria for EL success. Characteristics significantly associated with EL success included social support (p = .037) and post-TL full-oral diet (p = .037); both approached significance on MVA. EL success varied by pre-operative treatment on univariate (p = .005) and MVA (p = .014). Compared to no prior RT or CRT, the probability of EL success was 29% higher with prior RT and 29% lower with prior CRT in MVA, although these associations did not reach significance. Conclusions: In this retrospective review, EL success correlated with high social support, post-TL full-oral diet, and pre-operative treatment history. These results warrant validation in a larger prospective study to help guide the choice of voice rehabilitation modalities or intensified speech therapy. Level of Evidence: 4.
Affiliation(s)
- Cassie Pan
- Department of Otolaryngology-Head and Neck Surgery, University of Washington, Seattle, Washington, USA
- Leah I. B. Andrews
- Department of Biostatistics, School of Public Health, University of Washington, Seattle, Washington, USA
- Emily Johnson
- Department of Veterans Affairs, Puget Sound Health Care System, Seattle, Washington, USA
- Neel K. Bhatt
- Department of Otolaryngology-Head and Neck Surgery, University of Washington, Seattle, Washington, USA
- Zain H. Rizvi
- Department of Otolaryngology-Head and Neck Surgery, University of Washington, Seattle, Washington, USA
- Department of Veterans Affairs, Puget Sound Health Care System, Seattle, Washington, USA
5
Cao B, Ravi S, Sebkhi N, Bhavsar A, Inan OT, Xu W, Wang J. MagTrack: A Wearable Tongue Motion Tracking System for Silent Speech Interfaces. Journal of Speech, Language, and Hearing Research: JSLHR 2023; 66:3206-3221. [PMID: 37146629 PMCID: PMC10555459 DOI: 10.1044/2023_jslhr-22-00319] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/01/2022] [Revised: 09/06/2022] [Accepted: 02/20/2023] [Indexed: 05/07/2023]
Abstract
Purpose: Current electromagnetic tongue tracking devices are not amenable for daily use and thus not suitable for silent speech interface and other applications. We have recently developed MagTrack, a novel wearable electromagnetic articulograph tongue tracking device. This study aimed to validate MagTrack for potential silent speech interface applications. Method: We conducted two experiments: (a) classification of eight isolated vowels in consonant-vowel-consonant form and (b) continuous silent speech recognition. In these experiments, we used data from healthy adult speakers collected with MagTrack. The performance of vowel classification was measured by accuracies. The continuous silent speech recognition was measured by phoneme error rates. The performance was then compared with results using data collected with commercial electromagnetic articulograph in a prior study. Results: The isolated vowel classification using MagTrack achieved an average accuracy of 89.74% when leveraging all MagTrack signals (x, y, z coordinates; orientation; and magnetic signals), which outperformed the accuracy using commercial electromagnetic articulograph data (only y, z coordinates) in our previous study. The continuous speech recognition from two subjects using MagTrack achieved phoneme error rates of 73.92% and 66.73%, respectively. The commercial electromagnetic articulograph achieved 64.53% from the same subject (66.73% using MagTrack data). Conclusions: MagTrack showed comparable results with the commercial electromagnetic articulograph when using the same localized information. Adding raw magnetic signals would improve the performance of MagTrack. Our preliminary testing demonstrated the potential for silent speech interface as a lightweight wearable device. This work also lays the foundation to support MagTrack's potential for other applications including visual feedback-based speech therapy and second language learning.
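The phoneme error rates quoted in this abstract are error measures, so lower is better. As a minimal sketch, assuming the conventional definition (edit distance between reference and recognized phoneme sequences, divided by reference length), the metric can be computed as below. The function names and the toy phoneme labels are illustrative assumptions and are not taken from the paper, whose exact scoring procedure is not given in the abstract.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phoneme sequences (lists of labels)."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference phonemes to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref, hyp):
    """Phoneme error rate in percent: edit distance over reference length."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# Hypothetical example: one substituted phoneme out of three.
per = phoneme_error_rate(["k", "ae", "t"], ["k", "ah", "t"])
```

Because insertions count as errors, a rate computed this way can exceed 100% when the recognizer outputs many spurious phonemes.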
Affiliation(s)
- Beiming Cao
- Department of Electrical and Computer Engineering, The University of Texas at Austin
- Department of Speech, Language, and Hearing Sciences, The University of Texas at Austin
- Shravan Ravi
- Department of Computer Science, The University of Texas at Austin
- Nordine Sebkhi
- School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta
- Arpan Bhavsar
- School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta
- Omer T. Inan
- School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta
- Wen Xu
- Division of Computer Science, Texas Woman's University, Denton
- Jun Wang
- Department of Speech, Language, and Hearing Sciences, The University of Texas at Austin
- Department of Neurology, The University of Texas at Austin
6
Mesolella M, Allosso S, D’aniello R, Pappalardo E, Catalano V, Quaremba G, Motta G, Salerno G. Subjective Perception and Psychoacoustic Aspects of the Laryngectomee Voice: The Impact on Quality of Life. J Pers Med 2023; 13:570. [PMID: 36983751 PMCID: PMC10057772 DOI: 10.3390/jpm13030570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2022] [Revised: 02/17/2023] [Accepted: 03/21/2023] [Indexed: 03/29/2023] Open
Abstract
Purpose: A retrospective study is presented to correlate the inter-judge consistency for the different psycho-perceptual parameters of the recently proposed Impression Noise Fluency Voicing (INFVo) perceptual rating scale for substitution voices, and the vocal function as perceived by the patient. Methods: The scale Voice-Related Quality of Life (V-RQoL) and the Self Evaluation of Communication Experiences After Laryngectomy scale (SECEL)—a self-evaluation questionnaire of communicative experience after laryngectomy surgery—were administered to 89 total laryngectomees, subdivided in four groups depending on their type of alaryngeal voice (i.e., tracheoesophageal and esophageal speakers, electro larynx users, voiceless patients), in order to evaluate the impact of the impairment of the phonatory function on the quality of life. Results: No significant differences exist among the various groups on their perception of QoL using subjective questionnaires, whereas the INFVo scale has proven to be a useful tool for the description and analysis of the psychoacoustic characteristics of the vocal signal and a reliable instrument to correctly classify the patients. It is also notable that the judgement of the patients on their own voice and those of the referees are highly significant. Conclusion: Although speech rehabilitation for the acquisition of a substitution voice offers a new way of communication for the laryngectomized patients, nonetheless, their QoL is not significantly related to the type of substitution voice. Therefore, improving the patient’s adaptation to the new phonatory condition is mandatory.
7
Mialland A, Atallah I, Bonvilain A. Toward a robust swallowing detection for an implantable active artificial larynx: a survey. Med Biol Eng Comput 2023; 61:1299-1327. [PMID: 36792845 DOI: 10.1007/s11517-023-02772-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2022] [Accepted: 01/04/2023] [Indexed: 02/17/2023]
Abstract
Total laryngectomy consists in the removal of the larynx and is intended as a curative treatment for laryngeal cancer, but it leaves the patient with no possibility to breathe, talk, and swallow normally anymore. A tracheostomy is created to restore breathing through the throat, but the aero-digestive tracts are permanently separated and the air no longer passes through the nasal tracts, which allowed filtration, warming, humidification, olfaction, and acceleration of the air for better tissue oxygenation. As for phonation restoration, various techniques allow the patient to talk again. The main one consists of a tracheo-esophageal valve prosthesis that makes the air passes from the esophagus to the pharynx, and makes the air vibrate to allow speech through articulation. Finally, swallowing is possible through the original tract as it is now isolated from the trachea. Yet, many methods exist to detect and assess a swallowing, but none is intended as a definitive restoration technique of the natural airway, which would permanently close the tracheostomy and avoid its adverse effects. In addition, these methods are non-invasive and lack detection accuracy. The feasibility of an effective early detection of swallowing would allow to further develop an implantable active artificial larynx and therefore restore the aero-digestive tracts. A previous attempt has been made on an artificial larynx implanted in 2012, but no active detection was included and the system was completely mechanic. This led to residues in the airway because of the imperfect sealing of the mechanism. An active swallowing detection coupled with indwelling measurements would thus likely add a significant reliability on such a system as it would allow to actively close an artificial larynx. So, after a brief explanation of the swallowing mechanism, this survey intends to first provide a detailed consideration of the anatomical region involved in swallowing, with a detection perspective. 
Second, the swallowing mechanism following total laryngectomy surgery is detailed. Third, the current non-invasive swallowing detection technique and their limitations are discussed. Finally, the previous points are explored with regard to the inherent requirements for the feasibility of an effective swallowing detection for an artificial larynx.
Affiliation(s)
- Adrien Mialland
- Institute of Engineering and Management Univ. Grenoble Alpes, Univ. Grenoble Alpes, CNRS, Grenoble INP, Gipsa-lab, 38000, Grenoble, France.
- Ihab Atallah
- Institute of Engineering and Management Univ. Grenoble Alpes, Otorhinolaryngology, CHU Grenoble Alpes, 38700, La Tronche, France
- Agnès Bonvilain
- Institute of Engineering and Management Univ. Grenoble Alpes, Univ. Grenoble Alpes, CNRS, Grenoble INP, Gipsa-lab, 38000, Grenoble, France
8
Petrosyan A, Voskoboinikov A, Sukhinin D, Makarova A, Skalnaya A, Arkhipova N, Sinkin M, Ossadtchi A. Speech decoding from a small set of spatially segregated minimally invasive intracranial EEG electrodes with a compact and interpretable neural network. J Neural Eng 2022; 19. [PMID: 36356309 DOI: 10.1088/1741-2552/aca1e1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Accepted: 11/10/2022] [Indexed: 11/12/2022]
Abstract
Objective. Speech decoding, one of the most intriguing brain-computer interface applications, opens up plentiful opportunities from rehabilitation of patients to direct and seamless communication between human species. Typical solutions rely on invasive recordings with a large number of distributed electrodes implanted through craniotomy. Here we explored the possibility of creating speech prosthesis in a minimally invasive setting with a small number of spatially segregated intracranial electrodes. Approach. We collected one hour of data (from two sessions) in two patients implanted with invasive electrodes. We then used only the contacts that pertained to a single stereotactic electroencephalographic (sEEG) shaft or an electrocorticographic (ECoG) stripe to decode neural activity into 26 words and one silence class. We employed a compact convolutional network-based architecture whose spatial and temporal filter weights allow for a physiologically plausible interpretation. Main results. We achieved on average 55% accuracy using only six channels of data recorded with a single minimally invasive sEEG electrode in the first patient and 70% accuracy using only eight channels of data recorded for a single ECoG strip in the second patient in classifying 26+1 overtly pronounced words. Our compact architecture did not require the use of pre-engineered features, learned fast and resulted in a stable, interpretable and physiologically meaningful decision rule successfully operating over a contiguous dataset collected during a different time interval than that used for training. Spatial characteristics of the pivotal neuronal populations corroborate with active and passive speech mapping results and exhibit the inverse space-frequency relationship characteristic of neural activity. Compared to other architectures our compact solution performed on par or better than those recently featured in neural speech decoding literature. Significance. We showcase the possibility of building a speech prosthesis with a small number of electrodes and based on a compact feature engineering free decoder derived from a small amount of training data.
Affiliation(s)
- Artur Petrosyan
- Center for Bioelectric Interfaces, Higher School of Economics, Moscow, Russia
- Dmitrii Sukhinin
- Center for Bioelectric Interfaces, Higher School of Economics, Moscow, Russia
- Anna Makarova
- Center for Bioelectric Interfaces, Higher School of Economics, Moscow, Russia
- Mikhail Sinkin
- Moscow State University of Medicine and Dentistry, Scientific Research Institute of First Aid named after N.V. Sklifosovsky, Moscow, Russia
- Alexei Ossadtchi
- Center for Bioelectric Interfaces, Higher School of Economics, Moscow, Russia
- Artificial Intelligence Research Institute, AIRI, Moscow, Russia
9
Cao B, Wisler A, Wang J. Speaker Adaptation on Articulation and Acoustics for Articulation-to-Speech Synthesis. Sensors (Basel, Switzerland) 2022; 22:6056. [PMID: 36015817 PMCID: PMC9416444 DOI: 10.3390/s22166056] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/30/2022] [Revised: 08/03/2022] [Accepted: 08/08/2022] [Indexed: 05/23/2023]
Abstract
Silent speech interfaces (SSIs) convert non-audio bio-signals, such as articulatory movement, to speech. This technology has the potential to recover the speech ability of individuals who have lost their voice but can still articulate (e.g., laryngectomees). Articulation-to-speech (ATS) synthesis is an algorithm design of SSI that has the advantages of easy-implementation and low-latency, and therefore is becoming more popular. Current ATS studies focus on speaker-dependent (SD) models to avoid large variations of articulatory patterns and acoustic features across speakers. However, these designs are limited by the small data size from individual speakers. Speaker adaptation designs that include multiple speakers' data have the potential to address the issue of limited data size from single speakers; however, few prior studies have investigated their performance in ATS. In this paper, we investigated speaker adaptation on both the input articulation and the output acoustic signals (with or without direct inclusion of data from test speakers) using the publicly available electromagnetic articulatory (EMA) dataset. We used Procrustes matching and voice conversion for articulation and voice adaptation, respectively. The performance of the ATS models was measured objectively by the mel-cepstral distortions (MCDs). The synthetic speech samples were generated and are provided in the supplementary material. The results demonstrated the improvement brought by both Procrustes matching and voice conversion on speaker-independent ATS. With the direct inclusion of target speaker data in the training process, the speaker-adaptive ATS achieved a comparable performance to speaker-dependent ATS. To our knowledge, this is the first study that has demonstrated that speaker-adaptive ATS can achieve a non-statistically different performance to speaker-dependent ATS.
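The abstract above reports mel-cepstral distortion (MCD) as its objective measure without defining it. A minimal sketch of the commonly used form is given here, assuming the two utterances are already time-aligned frame by frame and that the 0th (energy) coefficient is excluded; both are common conventions, not details confirmed by the abstract, and the function name is hypothetical.

```python
import math


def mel_cepstral_distortion(ref_frames, syn_frames):
    """Mean MCD in dB over time-aligned frames of mel-cepstral coefficients.

    Each frame is a sequence [c0, c1, ..., cD]; c0 is skipped (assumed convention).
    """
    if not ref_frames or len(ref_frames) != len(syn_frames):
        raise ValueError("need the same non-zero number of frames in both inputs")
    scale = 10.0 / math.log(10.0) * math.sqrt(2.0)
    total = 0.0
    for ref, syn in zip(ref_frames, syn_frames):
        # Squared Euclidean distance over coefficients 1..D of this frame.
        squared = sum((r - s) ** 2 for r, s in zip(ref[1:], syn[1:]))
        total += scale * math.sqrt(squared)
    return total / len(ref_frames)


# Hypothetical two-frame example with three coefficients per frame.
mcd = mel_cepstral_distortion([[1.0, 0.5, 0.2], [1.1, 0.4, 0.1]],
                              [[0.9, 0.4, 0.2], [1.0, 0.4, 0.3]])
```

Lower values indicate synthetic speech whose spectral envelope is closer to the reference recording.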
Affiliation(s)
- Beiming Cao
- Department of Electrical and Computer Engineering, University of Texas at Austin, Austin, TX 78712, USA
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX 78712, USA
- Alan Wisler
- Department of Mathematics and Statistics, Utah State University, Logan, UT 84322, USA
- Jun Wang
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX 78712, USA
- Department of Neurology, Dell Medical School, University of Texas at Austin, Austin, TX 78712, USA
10
Abstract
Since the first total laryngectomy was performed in the late 18th century, several improvements and variations in surgical techniques have been proposed for this procedure. The surgical techniques employed in total laryngectomy have not been comprehensively discussed to date. Thus, the main objective of this article was to address controversial aspects related to this procedure and compare different surgical techniques used for a total laryngectomy procedure from the beginning to the end. Although the management paradigms in laryngeal and hypopharyngeal squamous cell carcinomas have shifted to organ-preserving chemoradiotherapy protocols, total laryngectomy still plays a prominent role in the treatment of advanced and recurrent tumors. The increased incidence of complications associated with salvage total laryngectomy has driven efforts to improve the surgical techniques in various aspects of the operation. Loss of voice and impaired swallowing are the most difficult challenges to be overcome in laryngectomies, and the introduction of tracheoesophageal voice prostheses has made an enormous difference in postoperative rehabilitation and quality of life. Advancements in reconstruction techniques, tumor control, and metastatic management, such as prophylactic neck treatments and paratracheal nodal dissection (PTND), as well as the use of thyroid gland-preserving total laryngectomy in selected patients have all led to the increasing success of modern total laryngectomy. Several conclusions regarding the benchmarking of surgical techniques cannot be drawn. Issues regarding total laryngectomy are still open for discussion, and the technique will continue to require improvement in the near future.
Affiliation(s)
- Adit Chotipanich
- Otolaryngology Department, Chonburi Cancer Hospital, Ministry of Public Health, Chonburi, THA
11
|
Sato K, Genda J, Minabe R, Taniguchi T. Characteristics of Japanese Electrolaryngeal Speech Produced by Untrained Speakers: An Observational Study Involving Healthy Volunteers. Journal of Speech, Language, and Hearing Research 2021; 64:3786-3793. [PMID: 34546765 DOI: 10.1044/2021_jslhr-21-00069] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]
Abstract
Purpose The aim of this study was to investigate the characteristics of electrolaryngeal (EL) speech among untrained speakers to aid in its effective introduction and to identify syllables and words that are easy or difficult to pronounce. Method A total of 21 healthy individuals who had never used an EL were included. The participants were briefed, and tests comprising 100 Japanese syllables and 50 single words were conducted to evaluate EL speech intelligibility. A trained speaker was defined as a certified speech-language pathologist who underwent EL training for 3 months. A 5-point electrolarynx effectivity score (EES) was used for the subjective assessment of EL. Results The median (interquartile range) intelligibility scores of the untrained and trained groups were 24.0% (20.0%-34.0%) and 40.0% (36.0%-45.0%) for syllables and 48.0% (38.0%-60.0%) and 88.0% (82.0%-90.0%) for words, respectively. The intelligibility scores for syllables and words were higher in the trained group than those in the untrained group. Only two syllable subgroups (/m/ and /w/) had > 80% correct answers among untrained speakers. A total of 14 syllable subgroups (/k, kʲ, s, ɕ, t, t͡ɕ, ts, ɲ, h, ç, ɸ, p, pʲ, and a/), a number of which contained voiceless consonants, had < 40% correct answers among both speaker groups. A greater number of morae were associated with higher intelligibility scores. An EES of 4, indicating that the EL was effective, was the most frequent score. Conclusions It was difficult for untrained speakers to produce intelligible speech using an EL. Syllables, including voiceless consonants, were difficult to pronounce using an EL. Longer words with a greater number of morae were more intelligible, even for untrained EL speakers. Supplemental Material https://doi.org/10.23641/asha.16632622.
Affiliation(s)
- Koji Sato
- Intensive Care Unit, Kanazawa University Hospital, Japan
| | - Junji Genda
- Department of Rehabilitation, Kanazawa University Hospital, Japan
| | - Ryoya Minabe
- Department of Rehabilitation, Kanazawa University Hospital, Japan
| | - Takumi Taniguchi
- Intensive Care Unit, Kanazawa University Hospital, Japan
- Department of Anesthesiology and Intensive Care Medicine, Graduate School of Medical Sciences, Kanazawa University, Japan
| |
|
12
|
Knollhoff SM, Borrie SA, Barrett TS, Searl JP. Listener impressions of alaryngeal communication modalities. International Journal of Speech-Language Pathology 2021; 23:540-547. [PMID: 33501872 DOI: 10.1080/17549507.2020.1849400] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Purpose: Following a total laryngectomy in which the larynx is completely removed, individuals in the USA have three primary options for alaryngeal verbal communication: tracheoesophageal speech (TES), oesophageal speech (ES) and electrolarynx (EL). Using a large sample of participants from across the USA, this study investigated listener impressions of each primary type of alaryngeal communication. As these are the individuals more likely to be participating in social interactions and in positions of hiring for employment, the general public's impressions of TES, ES and EL may be a vital consideration during the treatment process. Method: A total of 381 individuals rated eight speech samples, including samples from speakers of each alaryngeal communication modality as well as samples from age- and sex-matched laryngeal speakers, with regard to three outcome measures: intelligence, likeability and employability. Result: Listener impressions of alaryngeal speech samples were modulated by the type of communication mode. Further, the patterns of results differed by speaker sex, with ES rated consistently more favourably for female speakers across all outcome measures and TES rated consistently more favourably for male speakers across all outcome measures. Conclusion: An overall preference for laryngeal speech was noted, particularly with male speakers. Interestingly, the female ES stimuli were the highest-rated alaryngeal communication modality. Regardless of speaker sex, all alaryngeal modes greatly affected impressions of employability relative to impressions of likeability and intelligence.
Affiliation(s)
| | - Stephanie A Borrie
- Department of Communicative Disorders and Deaf Education, Utah State University, Logan, UT, USA
| | - Tyson S Barrett
- Department of Psychology, Utah State University, Logan, UT, USA, and
| | - Jeff P Searl
- Department of Communication Sciences and Disorders, Michigan State University, East Lansing, MI, USA
| |
|
13
|
An automatic water-occluding device to enable laryngectomee participation in water activities. PLoS One 2021; 16:e0257463. [PMID: 34516593 PMCID: PMC8437266 DOI: 10.1371/journal.pone.0257463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2021] [Accepted: 09/01/2021] [Indexed: 11/19/2022] Open
Abstract
Individuals with a laryngectomy face a host of challenges ranging from restricted vocal communication to significant lifestyle modifications associated with breathing through a stoma. Although returning to recreational pursuits that were enjoyed pre-surgery brings significant mental and physical health benefits, there can be significant obstacles to doing so. One particular challenge arises during participation in water activities (e.g., fishing and boating), where accidental submersion poses a significant risk of drowning. This manuscript describes a proof-of-concept device that protects the airway from accidental incursion of water during unanticipated submersion, thereby allowing laryngectomees to return to water activities. The device is designed to be worn comfortably for long periods of time without interfering with the common methods of replacement speech that are used post-laryngectomy.
|
14
|
Zhu Y, Chen D, Jiang L, Yu L. Assessing the applications of transitional care and its impact on the quality of life in patients after total laryngectomy. Am J Transl Res 2021; 13:7349-7355. [PMID: 34306504 PMCID: PMC8290806] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2021] [Accepted: 02/08/2021] [Indexed: 06/13/2023]
Abstract
OBJECTIVE To explore the effect of transitional care and its impact on quality of life (QoL) in patients who underwent total laryngectomy. METHODS The study enrolled 68 patients who were admitted to our hospital and underwent total laryngectomy from January 2017 to January 2019. The subjects were randomly divided into an observation group and a control group. Conventional care was given to the control group (34 cases), while conventional and transitional care was given to the observation group (34 cases). The study compared self-care ability, health knowledge, satisfaction with nursing, and QoL between the two groups at discharge and 6 months after discharge. RESULTS Compared with the control group, the observation group showed higher scores in self-care ability, more extensive health knowledge, greater satisfaction, and better QoL at 6 months after discharge from the hospital. The differences were statistically significant (P < 0.05). CONCLUSION Transitional care can effectively improve patients' self-care ability after hospital discharge, health knowledge, satisfaction with care, medication adherence, and QoL. Transitional care can be considered for broader application.
Affiliation(s)
- Yingchao Zhu
- Department of Otolaryngology-Head and Neck Surgery, Shanghai Ninth People’s Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200011, China
- Ear Institute, Shanghai Jiao Tong University School of Medicine, Shanghai 200011, China
- Shanghai Key Laboratory of Translational Medicine on Ear and Nose Diseases, Shanghai 200011, China
| | - Dong Chen
- Department of Otolaryngology-Head and Neck Surgery, Shanghai Ninth People’s Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200011, China
- Ear Institute, Shanghai Jiao Tong University School of Medicine, Shanghai 200011, China
- Shanghai Key Laboratory of Translational Medicine on Ear and Nose Diseases, Shanghai 200011, China
| | - Lili Jiang
- Department of Otolaryngology-Head and Neck Surgery, Shanghai Ninth People’s Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200011, China
- Ear Institute, Shanghai Jiao Tong University School of Medicine, Shanghai 200011, China
- Shanghai Key Laboratory of Translational Medicine on Ear and Nose Diseases, Shanghai 200011, China
| | - Leilei Yu
- Department of Oral Surgery, Shanghai Ninth People’s Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200011, China
| |
|
15
|
Repova B, Zabrodsky M, Plzak J, Kalfert D, Matousek J, Betka J. Text-to-speech synthesis as an alternative communication means after total laryngectomy. Biomed Pap Med Fac Univ Palacky Olomouc Czech Repub 2020; 165:192-197. [PMID: 32367081 DOI: 10.5507/bp.2020.016] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/12/2019] [Accepted: 04/06/2020] [Indexed: 11/23/2022] Open
Abstract
AIMS Total laryngectomy still plays an essential part in the treatment of laryngeal cancer, and loss of voice is the most feared consequence of the surgery. Commonly used rehabilitation methods include esophageal voice, electrolarynx, and implantation of a voice prosthesis. In this paper we focus on a new perspective of vocal rehabilitation utilizing alternative and augmentative communication (AAC) methods. METHODS AND PATIENTS 61 consecutive patients treated by means of total laryngectomy with or without voice prosthesis implantation were included in the study. All were offered voice banking and personalized speech synthesis (PSS). They had to voluntarily express their willingness to participate and to demonstrate the ability to use modern electronic communication devices. RESULTS Of 30 patients fulfilling the study criteria, only 18 completed voice recording sufficient for voice reconstruction and synthesis. Eventually, only 7 patients started to use this AAC technology during the early postoperative period. The frequency and total usage time of the device gradually decreased. Currently, only 6 patients are active users of the technology. CONCLUSION The influence of communication with the surrounding world on the quality of life of patients after total laryngectomy is unquestionable. The possibility of using the spoken word with the patient's personalized voice is an indisputable advantage. Such a form of voice rehabilitation should be offered to all patients who are deemed eligible.
Affiliation(s)
- Barbora Repova
- Department of Otorhinolaryngology, Head and Neck Surgery, First Faculty of Medicine, Charles University in Prague and University Hospital Motol
| | - Michal Zabrodsky
- Department of Otorhinolaryngology, Head and Neck Surgery, First Faculty of Medicine, Charles University in Prague and University Hospital Motol
| | - Jan Plzak
- Department of Otorhinolaryngology, Head and Neck Surgery, First Faculty of Medicine, Charles University in Prague and University Hospital Motol
| | - David Kalfert
- Department of Otorhinolaryngology, Head and Neck Surgery, First Faculty of Medicine, Charles University in Prague and University Hospital Motol
| | - Jindrich Matousek
- Department of Cybernetics, University of West Bohemia in Pilsen, Pilsen, Czech Republic
| | - Jan Betka
- Department of Otorhinolaryngology, Head and Neck Surgery, First Faculty of Medicine, Charles University in Prague and University Hospital Motol
| |
|
16
|
Moors T, Silva S, Maraschin D, Young D, Quinn JM, de Carpentier J, Allouche J, Himonides E. Using Beatboxing for Creative Rehabilitation After Laryngectomy: Experiences From a Public Engagement Project. Front Psychol 2020; 10:2854. [PMID: 32082203 PMCID: PMC7001741 DOI: 10.3389/fpsyg.2019.02854] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/01/2019] [Accepted: 12/02/2019] [Indexed: 11/29/2022] Open
Abstract
Laryngectomy is the surgical removal of the larynx (voice box), usually performed in patients with advanced stages of throat cancer. The psychosocial impact of losing the voice is significant, affecting a person’s professional and social life in a devastating way, and a proportion of this patient group subsequently must overcome depression (22–30%) and social isolation (40%). The profound changes to anatomical structures involved in voicing and articulation, as a result of surgery, radiotherapy or chemotherapy (separately or in combination with one another), introduce challenges faced in speech rehabilitation and voice production that complicate social reintegration and quality of life. After laryngectomy, breathing, voicing, articulation and tongue movement are major components in restoring communication. Regular exercise of the chest, neck and oropharyngeal muscles, in particular, is important in controlling these components and keeping the involved structures supple. It is, however, a difficult task for a speech therapist to keep the patient engaged and motivated to practice these exercises. We have adopted a multidisciplinary approach to explore the use of basic beatboxing techniques to create a wide variety of exercises that are seen as fun and interactive and that maximize the use of the structures important in alaryngeal phonation. We herein report on our empirical work in developing patients’ skills, particularly relating to voiced and unvoiced consonants to improve intelligibility. In collaboration with a professional beatboxing performer, we produced instructional online video materials to support patients working on their own and/or with support from speech therapists. 
Although the present paper is focused predominantly on introducing the structure of the conducted workshops, the rationale for their design and the final public engagement performance, we also include feedback from participants to commence the critical discourse about whether this type of activity could lead to systematic underlying research and robustly assessed interventions in the future. Based on this exploratory work, we conclude that the innovative approach that we employed was found to be engaging, useful, informative and motivating. We conclude by offering our views regarding the limitations of our work and the implications for future empirical research.
Affiliation(s)
| | | | - Donatella Maraschin
- School of Arts and Creative Industries, London South Bank University, London, United Kingdom
| | - David Young
- School of Science and Engineering, University of Dundee, Dundee, United Kingdom
| | - John M Quinn
- First Faculty of Medicine, Institute of Hygiene and Epidemiology, Charles University, Prague, Czechia
| | | | | | - Evangelos Himonides
- UCL Institute of Education, University College London, London, United Kingdom
| |
|
17
|
Lee JH, Ba D, Liu G, Leslie D, Zacharia BE, Goyal N. Association of Head and Neck Cancer With Mental Health Disorders in a Large Insurance Claims Database. JAMA Otolaryngol Head Neck Surg 2020; 145:339-344. [PMID: 30816930 DOI: 10.1001/jamaoto.2018.4512] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 01/14/2023]
Abstract
Importance Although a few studies have shown that mental health disorders (MHDs) are strongly associated with the 5-year survival and recurrence rates in patients with head and neck cancer (HNC), none have been replicated in a large-scale study. Objective To describe the prevalence of MHDs in patients with HNC and the potential associations with survival and recurrence using a large insurance claims database. Design, Setting, and Participants This retrospective cohort study assessed data queried from the MarketScan database from January 1, 2005, through December 31, 2014, for 52 641 patients with a diagnosis of HNC. To exclude patients with a preexisting HNC diagnosis or those with incomplete data, patients were included if they were in the database for at least 12 months before the index diagnosis and continuously enrolled. Data were analyzed from February 20, 2017, through January 22, 2019. Main Outcomes and Measures To compare the frequency of MHDs before and after diagnosis of HNC, χ² tests for independence were used. Adjusted odds ratios (aORs) were obtained using multivariable logistic regression by comparing the prevalence of MHDs in patients with oral cavity cancer and those with other cancer sites in the head and neck. Results Among the 52 641 patients included in the analysis (mean [SD] age, 51.31 [9.79] years), men (58.5%), patients aged 55 to 64 years (46.6%), and those from the South (40.3%) were most commonly affected by HNC. Oral cavity cancers (40.4%) were the most common type, followed by cancers of the oropharynx (19.2%) and larynx (15.5%). Of the various cancer sites, the OR for MHD prevalence was significantly increased in patients with cancers of the trachea compared with the oral cavity (2.11; 95% CI, 1.87-2.38). The prevalence of MHDs in patients with HNC increased to 29.9% compared with 20.6% before the cancer diagnosis. 
Specifically, women (adjusted OR, 1.58; 95% CI, 1.49-1.67) and patients with a history of tobacco use (adjusted OR, 1.42; 95% CI, 1.34-1.50) and alcohol use (adjusted OR, 1.56; 95% CI, 1.38-1.76) had significantly higher odds of MHDs after the diagnosis of HNC. Conclusions and Relevance Although the baseline MHD prevalence of 20.6% before the cancer diagnosis was close to the national average (17.9% according to the National Survey on Drug Use and Health), results of this study showed that it increased to 29.9% after the cancer diagnosis. Women and patients with a history of tobacco and alcohol use were most susceptible to being diagnosed with an MHD. There is an association between patients with HNC and an increased prevalence of MHDs after treatment compared with the general population.
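As a reading aid for the odds ratios and 95% confidence intervals quoted in this abstract, the following is a minimal sketch of how an unadjusted odds ratio and its Wald interval can be computed from a 2×2 table. It is not the study's method: the study reports adjusted odds ratios from multivariable logistic regression, which also account for covariates. The function name `odds_ratio_ci` and the cell counts are hypothetical and are not taken from the paper.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval for a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only; NOT data from the study.
odds_ratio, lower, upper = odds_ratio_ci(a=30, b=70, c=20, d=80)
print(f"OR = {odds_ratio:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

An interval that excludes 1 corresponds to a statistically significant association at roughly the 5% level, which is how the intervals in the abstract can be read.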
Affiliation(s)
- Ji Hyae Lee
- Medical student at College of Medicine, Pennsylvania State University, Hershey
| | - Djibril Ba
- Department of Public Health Sciences, College of Medicine, Pennsylvania State University, Hershey
| | - Guodong Liu
- Department of Public Health Sciences, College of Medicine, Pennsylvania State University, Hershey
| | - Douglas Leslie
- Department of Public Health Sciences, College of Medicine, Pennsylvania State University, Hershey
| | - Brad E Zacharia
- Department of Neurosurgery, College of Medicine, Pennsylvania State University, Hershey
| | - Neerav Goyal
- Division of Otolaryngology-Head and Neck Surgery, Department of Surgery, College of Medicine, Pennsylvania State University, Hershey
| |
|
18
|
Rameau A. Pilot study for a novel and personalized voice restoration device for patients with laryngectomy. Head Neck 2019; 42:839-845. [PMID: 31876090 DOI: 10.1002/hed.26057] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/07/2019] [Revised: 11/06/2019] [Accepted: 12/10/2019] [Indexed: 01/04/2023] Open
Abstract
BACKGROUND The main modalities for voice restoration after laryngectomy are the electrolarynx, and the tracheoesophageal puncture [Correction added on 30 January 2020 after first online publication: The preceding sentence has been revised. It originally read "The main modalities for voice restoration after laryngectomy are the electrolarynx and the tracheoesophageal puncture."]. All have limitations and new technologies may offer innovative alternatives via silent speech. OBJECTIVE To describe a novel and personalized method of voice restoration using machine learning applied to electromyographic signal from articulatory muscles for the recognition of silent speech in a patient with total laryngectomy. METHODS Surface electromyographic (sEMG) signals of articulatory muscles were recorded from the face and neck of a patient with total laryngectomy who was articulating words silently. These sEMG signals were then used for automatic speech recognition via machine learning. Sensor placement was tailored to the patient's unique anatomy, following radiation and surgery. A personalized wearable mask covering the sensors was designed using 3D scanning and 3D printing. RESULTS Using seven sEMG sensors on the patient's face and neck and two grounding electrodes, we recorded EMG data while he was mouthing "Tedd" and "Ed." With data from 75 utterances for each of these words, we discriminated the sEMG signal with 86.4% accuracy using an XGBoost machine-learning model. CONCLUSIONS This pilot study demonstrates the feasibility of sEMG-based alaryngeal speech recognition, using tailored sensor placement and a personalized wearable device. Further refinement of this approach could allow translation of silently articulated speech into a synthesized voiced speech via portable devices.
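The pipeline described in this abstract (multi-channel sEMG recordings, per-utterance features, a two-word classifier) can be sketched roughly as follows. This is an illustration under stated assumptions, not the author's implementation: the signals are synthetic noise rather than real sEMG, the single RMS feature per channel is an assumption, and a nearest-centroid classifier is swapped in for the XGBoost model used in the study so that the example needs only the standard library. Only the seven channels and 75 utterances per word follow the abstract; the labels `word_a` and `word_b` and the gain patterns are hypothetical.

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of one channel."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def features(recording):
    """One RMS value per channel; a recording is a list of channels."""
    return [rms(channel) for channel in recording]


def synth_recording(rng, gains, n_samples=200):
    """Synthetic stand-in for sEMG: zero-mean noise scaled per channel."""
    return [[rng.gauss(0.0, g) for _ in range(n_samples)] for g in gains]


def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [sum(col) / len(col) for col in zip(*rows)]


def predict(centroids, x):
    """Label of the centroid closest to x in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], x))


rng = random.Random(0)
# Hypothetical per-channel activation patterns for two mouthed words (7 channels).
patterns = {
    "word_a": [1.0, 0.8, 1.2, 0.5, 0.9, 1.1, 0.7],
    "word_b": [0.6, 1.3, 0.7, 1.0, 0.5, 0.8, 1.4],
}
# 75 synthetic utterances per word.
data = [(features(synth_recording(rng, gains)), label)
        for label, gains in patterns.items() for _ in range(75)]
rng.shuffle(data)
split = int(0.8 * len(data))
train, held_out = data[:split], data[split:]

centroids = {label: centroid([x for x, y in train if y == label])
             for label in patterns}
accuracy = sum(predict(centroids, x) == y for x, y in held_out) / len(held_out)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

Because the synthetic classes are well separated by construction, the printed accuracy says nothing about real sEMG performance; the sketch only shows the shape of the record, featurize, train, and evaluate loop.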
Affiliation(s)
- Anaïs Rameau
- Department of Otolaryngology-Head and Neck Surgery, Weill Cornell Medical College, Sean Parker Institute for the Voice, New York, New York
| |
|
19
|
Influence of Collective Esophageal Speech Training on Self-efficacy in Chinese Laryngectomees: A Pretest-posttest Group Study. Curr Med Sci 2019; 39:810-815. [PMID: 31612400 DOI: 10.1007/s11596-019-2109-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2019] [Revised: 04/17/2019] [Indexed: 10/25/2022]
Abstract
Total laryngectomy affects the speaking functions of many patients. Speech deprivation greatly affects patients' quality of life, especially their self-efficacy. Learning esophageal speech is one way to help laryngectomees speak again. The purpose of this study was to determine the influence of collective esophageal speech training on the self-efficacy of laryngectomees. In this study, 28 patients and 30 family members were included. The participants received information about training via telephone or a WeChat group. Collective esophageal speech training was used to educate laryngectomees on esophageal speech. Before and after collective esophageal speech training, all participants completed the General Self-Efficacy Scale (GSES) to assess their perceived self-efficacy. Through the training, laryngectomees recovered their speech. After the training, the self-efficacy scores of laryngectomees were significantly higher than before the training (P<0.05). However, family members' scores did not change significantly. In conclusion, collective esophageal speech training is not only convenient and economical, but also improves the self-efficacy and confidence of laryngectomees. Greater self-efficacy helps laryngectomees master esophageal speech and improve their quality of life. In addition, more attention should be paid to improving the self-efficacy of family members so that they can contribute fully to laryngectomees' voice rehabilitation.
|