1
Al-Zanoon N, Parsa V, Doyle PC. Using visual feedback to enhance intonation control with a variable pitch electrolarynx. The Journal of the Acoustical Society of America 2020; 147:1802. [PMID: 32237840 DOI: 10.1121/10.0000936] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2019] [Accepted: 03/03/2020] [Indexed: 06/11/2023]
Abstract
This study evaluated the effectiveness of visual feedback in facilitating pitch control by a speaker using a pressure-sensitive onset-controlled electrolarynx (EL). This proof-of-concept study was conducted with one healthy adult. The participant-speaker was provided with computer-generated visual feedback over five sessions within a period of three consecutive weeks. Changes in force control accuracy were gathered and analyzed. An improvement in finger (thumb) force control accuracy from the first to the last training session was documented. The results of this study provide data toward the development of a clinical training protocol for the use of a pressure-sensitive onset-controlled EL by laryngectomized speakers. Further, these results highlight the importance of developing a relevant multimodality training protocol for the improvement of postlaryngectomy EL speech production.
Affiliation(s)
- Noor Al-Zanoon
  - Department of Communication Sciences and Disorders, University of Alberta, 116 Street and 85 Avenue, Edmonton, Alberta T6G 2R3, Canada
- Vijay Parsa
  - School of Communication Sciences and Disorders, Elborn College, Western University, London, Ontario N6A 3K7, Canada
- Philip C Doyle
  - School of Communication Sciences and Disorders, Elborn College, Western University, London, Ontario N6A 3K7, Canada
2
Patel RR, Lulich SM, Verdi A. Vocal tract shape and acoustic adjustments of children during phonation into narrow flow-resistant tubes. The Journal of the Acoustical Society of America 2019; 146:352. [PMID: 31370566 DOI: 10.1121/1.5116681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2018] [Accepted: 06/25/2019] [Indexed: 06/10/2023]
Abstract
The goal of the study is to quantify the salient vocal tract acoustic, subglottal acoustic, and vocal tract physiological characteristics during phonation into a narrow flow-resistant tube with 2.53 mm inner diameter and 124 mm length in typically developing vocally healthy children using simultaneous microphone, accelerometer, and 3D/4D ultrasound recordings. Acoustic measurements included fundamental frequency (fo), first formant frequency (F1), second formant frequency (F2), first subglottal resonance (FSg1), and peak-to-peak amplitude ratio (Pvt:Psg). Physiological measurements included posterior tongue height (D1), tongue dorsum height (D2), tongue tip height (D3), tongue length (D4), oral cavity width (D5), hyoid elevation (D6), and pharynx width (D7). All measurements were made on eight boys and ten girls (6-9 years) during sustained /o:/ production at typical pitch and loudness, with and without the flow-resistant tube. Phonation with the flow-resistant tube resulted in a significant decrease in F1, F2, and Pvt:Psg and a significant increase in D2, D3, and FSg1. A statistically significant gender effect was observed for D1, with D1 higher in boys. These findings agree well with reported findings from adults, suggesting common acoustic and articulatory mechanisms for narrow flow-resistant tube phonation. Theoretical implications of the findings are discussed.
Affiliation(s)
- Rita R Patel
  - Department of Speech and Hearing Sciences, Indiana University, 200 South Jordan Avenue, Bloomington, Indiana 47405-7002, USA
- Steven M Lulich
  - Department of Speech and Hearing Sciences, Indiana University, 200 South Jordan Avenue, Bloomington, Indiana 47405-7002, USA
- Alessandra Verdi
  - Department of Speech and Hearing Sciences, Indiana University, 200 South Jordan Avenue, Bloomington, Indiana 47405-7002, USA
3
Tuttle TG, Erath BD. Design and Evaluation of a Mechanically Driven Artificial Speech Device. J Med Device 2017. [DOI: 10.1115/1.4038222] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/08/2022] Open
Abstract
This paper presents the design of a mechanically driven artificial speech device to be used by laryngectomees as an affordable alternative to an electrolarynx (EL). Design objectives were based on feedback from potential end users. The device implements a mainspring powered gear train that drives a striker. The striker impacts a suspended drum-like head, producing sound. When pressed against the neck, the head transmits sound into the oral cavity, allowing the user to produce intelligible speech. The dynamics of the vibrating head and sound pressure levels (SPLs) produced at the mouth were measured to compare performance between the device and a common, commercially available EL. The results showed comparable performance and sound output.
Affiliation(s)
- Tyler G. Tuttle
  - Department of Mechanical Engineering, Michigan State University, 428 South Shaw Ln, East Lansing, MI 48824
- Byron D. Erath
  - Department of Mechanical and Aeronautical Engineering, Clarkson University, 8 Clarkson Ave, Potsdam, NY 13699
4
Mostafa SS, Awal MA, Ahmad M, Rashid MA. Voiceless Bangla vowel recognition using sEMG signal. SpringerPlus 2016; 5:1522. [PMID: 27652095 PMCID: PMC5017969 DOI: 10.1186/s40064-016-3170-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 05/04/2016] [Accepted: 08/30/2016] [Indexed: 11/10/2022]
Abstract
Some people cannot produce sound, even though their facial muscles work properly, because of problems with their vocal cords. Recognizing the letters and sentences uttered by these voiceless people is therefore a complex task. This paper proposes a novel method to solve this problem using the non-invasive surface electromyogram (sEMG). First, eleven Bangla vowels are pronounced while sEMG signals are recorded. Different features are extracted, and the mRMR feature selection algorithm is then applied to select a prominent feature subset from the large feature vector. This prominent feature subset is then fed to an artificial neural network for vowel classification. This novel Bangla vowel classification method can offer a significant contribution to voice synthesis as well as to speech communication. The result of this experiment shows an overall accuracy of 82.3% with fewer features compared to other studies in different languages.
Affiliation(s)
- S S Mostafa
  - Khulna University of Engineering and Technology, Khulna, 9203 Bangladesh
- M A Awal
  - Centre for Clinical Research, The University of Queensland, Brisbane, QLD Australia
- M Ahmad
  - Khulna University of Engineering and Technology, Khulna, 9203 Bangladesh
- M A Rashid
  - Universiti Sultan Zainal Abidin, 21300 Kuala Terengganu, Malaysia
5
Hanna N, Smith J, Wolfe J. Frequencies, bandwidths and magnitudes of vocal tract and surrounding tissue resonances, measured through the lips during phonation. The Journal of the Acoustical Society of America 2016; 139:2924. [PMID: 27250184 DOI: 10.1121/1.4948754] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 05/09/2023]
Abstract
The frequencies, magnitudes, and bandwidths of vocal tract resonances are all important in understanding and synthesizing speech. High precision acoustic impedance spectra of the vocal tracts of 10 subjects were measured from 10 Hz to 4.2 kHz by injecting a broadband acoustic signal through the lips. Between 300 Hz and 4 kHz the acoustic resonances R (impedance minima measured through the lips) and anti-resonances R̄ (impedance maxima) associated with the first three voice formants, have bandwidths of ∼50 to 90 Hz for men and ∼70 to 90 Hz for women. These acoustic resonances approximate those of a smooth, dry, rigid cylinder of similar dimensions, except that their bandwidths indicate higher losses in the vocal tract. The lossy, inertive load and airflow caused by opening the glottis further increase the bandwidths observed during phonation. The vocal tract walls are not rigid and measurements show an acousto-mechanical resonance R0 ∼ 20 Hz and anti-resonance R̄0 ∼ 200 Hz. These give an estimate of wall inertance consistent with an effective thickness of 1-2 cm and a wall stiffness of 2-4 kN m⁻¹. The non-rigidity of the tract imposes a lower limit of the frequency of the first acoustic resonance fR1 and the first formant F1.
Affiliation(s)
- Noel Hanna
  - School of Physics, University of New South Wales, Sydney, New South Wales 2052, Australia
- John Smith
  - School of Physics, University of New South Wales, Sydney, New South Wales 2052, Australia
- Joe Wolfe
  - School of Physics, University of New South Wales, Sydney, New South Wales 2052, Australia
6
Mehta DD, Van Stan JH, Hillman RE. Relationships between vocal function measures derived from an acoustic microphone and a subglottal neck-surface accelerometer. IEEE/ACM Transactions on Audio, Speech, and Language Processing 2016; 24:659-668. [PMID: 27066520 PMCID: PMC4826073 DOI: 10.1109/taslp.2016.2516647] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Indexed: 05/10/2023]
Abstract
Monitoring subglottal neck-surface acceleration has received renewed attention due to the ability of low-profile accelerometers to confidentially and noninvasively track properties related to normal and disordered voice characteristics and behavior. This study investigated the ability of subglottal neck-surface acceleration to yield vocal function measures traditionally derived from the acoustic voice signal and help guide the development of clinically functional accelerometer-based measures from a physiological perspective. Results are reported for 82 adult speakers with voice disorders and 52 adult speakers with normal voices who produced the sustained vowels /a/, /i/, and /u/ at a comfortable pitch and loudness during the simultaneous recording of radiated acoustic pressure and subglottal neck-surface acceleration. As expected, timing-related measures of jitter exhibited the strongest correlation between acoustic and neck-surface acceleration waveforms (r ≤ 0.99), whereas amplitude-based measures of shimmer correlated less strongly (r ≤ 0.74). Additionally, weaker correlations were exhibited by spectral measures of harmonics-to-noise ratio (r ≤ 0.69) and tilt (r ≤ 0.57), whereas the cepstral peak prominence correlated more strongly (r ≤ 0.90). These empirical relationships provide evidence to support the use of accelerometers as effective complements to acoustic recordings in the assessment and monitoring of vocal function in the laboratory, clinic, and during an individual's daily activities.
Affiliation(s)
- Daryush D Mehta
  - Center for Laryngeal Surgery & Voice Rehabilitation, Massachusetts General Hospital, Boston MA 02114 USA, Department of Surgery, Harvard Medical School, Boston, MA 02115 USA, and the Institute of Health Professions, Massachusetts General Hospital, Boston, Massachusetts 02129 USA
- Jarrad H Van Stan
  - Center for Laryngeal Surgery & Voice Rehabilitation, Massachusetts General Hospital, Boston MA 02114 USA and the Institute of Health Professions, Massachusetts General Hospital, Boston, Massachusetts 02129 USA
- Robert E Hillman
  - Center for Laryngeal Surgery & Voice Rehabilitation and Institute of Health Professions, Massachusetts General Hospital, Boston MA 02114 USA and Surgery and Health Sciences & Technology, Harvard Medical School, Boston, MA 02115
7
Perrachione TK, Stepp CE, Hillman RE, Wong PCM. Talker identification across source mechanisms: experiments with laryngeal and electrolarynx speech. Journal of Speech, Language, and Hearing Research: JSLHR 2014; 57:1651-1665. [PMID: 24801962 PMCID: PMC4655826 DOI: 10.1044/2014_jslhr-s-13-0161] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/25/2013] [Accepted: 03/12/2014] [Indexed: 05/29/2023]
Abstract
PURPOSE: The purpose of this study was to determine listeners' ability to learn talker identity from speech produced with an electrolarynx, explore source and filter differentiation in talker identification, and describe acoustic-phonetic changes associated with electrolarynx use. METHOD: Healthy adult control listeners learned to identify talkers from speech recordings produced using talkers' normal laryngeal vocal source or an electrolarynx. Listeners' abilities to identify talkers from the trained vocal source (Experiment 1) and generalize this knowledge to the untrained source (Experiment 2) were assessed. Acoustic-phonetic measurements of spectral differences between source mechanisms were performed. Additional listeners attempted to match recordings from different source mechanisms to a single talker (Experiment 3). RESULTS: Listeners successfully learned talker identity from electrolarynx speech but less accurately than from laryngeal speech. Listeners were unable to generalize talker identity to the untrained source mechanism. Electrolarynx use resulted in vowels with higher F1 frequencies compared with laryngeal speech. Listeners matched recordings from different sources to a single talker better than chance. CONCLUSIONS: Electrolarynx speech, although lacking individual differences in voice quality, nevertheless conveys sufficient indexical information related to the vocal filter and articulation for listeners to identify individual talkers. Psychologically, perception of talker identity arises from a "gestalt" of the vocal source and filter.
8
Wu L, Xiao K, Dong J, Wang S, Wan M. Measurement of the sound transmission characteristics of normal neck tissue using a reflectionless uniform tube. The Journal of the Acoustical Society of America 2014; 136:350-6. [PMID: 24993219 DOI: 10.1121/1.4883355] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 05/25/2023]
Abstract
Understanding the sound transmission of the neck tissue is necessary and important in areas such as vocal function evaluation and electrolarynx improvement. In this paper, a simple method using a reflectionless tube was proposed to measure the neck frequency response function (NFRF) of ten normal subjects (five males and five females) during Mandarin vowel production. The NFRFs across different subjects producing different vowels were measured at different neck positions and compared to confirm the effectiveness of the method, and determine the NFRF variations in normal subjects. The results showed that the proposed method offered an easy and effective way to obtain an accurate NFRF. For normal subjects, the neck tissue can be treated as a low-pass filter, with a maximum gain at 310 Hz and a roll-off at a slope of -8.4 dB/octave, flattening out above 2000 Hz. The measurement position on the neck did not influence the shape of the NFRF, but did change the overall gains of the NFRF. In addition, there was a significant gender difference in NFRFs at the low frequencies. Finally, some potential applications of this method and the results are suggested.
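As a rough illustration of the low-pass description in the abstract above, the following sketch encodes the reported figures (maximum gain at 310 Hz, a roll-off of -8.4 dB/octave, flattening above 2000 Hz) as a piecewise function. The function name `nfrf_gain_db`, the 0 dB reference at the peak, and the flat response assumed below 310 Hz are illustrative assumptions; the abstract does not give absolute gains or the shape below the peak.

```python
import math

# Values quoted in the abstract.
PEAK_HZ = 310.0               # frequency of maximum gain
SLOPE_DB_PER_OCTAVE = -8.4    # roll-off slope above the peak
FLAT_ABOVE_HZ = 2000.0        # response flattens out above this frequency


def nfrf_gain_db(freq_hz: float) -> float:
    """Gain in dB relative to the 310 Hz peak (hypothetical piecewise model).

    Assumptions not taken from the abstract: the peak is the 0 dB reference,
    and the response is treated as flat below the peak.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # Clamp into the roll-off region so both ends of the curve are flat.
    clamped = min(max(freq_hz, PEAK_HZ), FLAT_ABOVE_HZ)
    octaves_above_peak = math.log2(clamped / PEAK_HZ)
    return SLOPE_DB_PER_OCTAVE * octaves_above_peak


if __name__ == "__main__":
    for f in (100.0, 310.0, 620.0, 1240.0, 2000.0, 4000.0):
        print(f"{f:7.0f} Hz -> {nfrf_gain_db(f):6.1f} dB re peak")
```

Under these assumptions, one octave above the peak (620 Hz) sits 8.4 dB below it, and every frequency above 2000 Hz shares the 2000 Hz value.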
Affiliation(s)
- Liang Wu
  - The Key Laboratory of Biomedical Information Engineering of the Ministry of Education, Department of Biomedical Engineering, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an 710049, People's Republic of China
- Ke Xiao
  - The Key Laboratory of Biomedical Information Engineering of the Ministry of Education, Department of Biomedical Engineering, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an 710049, People's Republic of China
- Jiaqi Dong
  - The Key Laboratory of Biomedical Information Engineering of the Ministry of Education, Department of Biomedical Engineering, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an 710049, People's Republic of China
- Supin Wang
  - The Key Laboratory of Biomedical Information Engineering of the Ministry of Education, Department of Biomedical Engineering, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an 710049, People's Republic of China
- Mingxi Wan
  - The Key Laboratory of Biomedical Information Engineering of the Ministry of Education, Department of Biomedical Engineering, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an 710049, People's Republic of China
9
Wu L, Wan C, Wang S, Wan M. Improvement of Electrolaryngeal Speech Quality Using a Supraglottal Voice Source With Compensation of Vocal Tract Characteristics. IEEE Trans Biomed Eng 2013; 60:1965-74. [DOI: 10.1109/tbme.2013.2246789] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/07/2022]
10
Nagle KF, Eadie TL, Wright DR, Sumida YA. Effect of fundamental frequency on judgments of electrolaryngeal speech. American Journal of Speech-Language Pathology 2012; 21:154-166. [PMID: 22355005 DOI: 10.1044/1058-0360(2012/11-0050)] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 05/31/2023]
Abstract
PURPOSE: To determine (a) the effect of fundamental frequency (f₀) on speech intelligibility, acceptability, and perceived gender in electrolaryngeal (EL) speakers, and (b) the effect of known gender on speech acceptability in EL speakers. METHOD: A 2-part study was conducted. In Part 1, 34 healthy adults provided speech recordings using electrolarynges set at 75 Hz, 130 Hz, and 175 Hz, and 36 listeners transcribed the recordings. In Part 2, 22 speech samples were presented to 16 listeners. First, listeners identified the gender of each speaker and judged his or her speech acceptability using rating scales. Second, listeners judged the same samples for speech acceptability when gender information was provided. RESULTS: In Part 1, speakers were significantly more intelligible when using 75-Hz devices. In Part 2, the f₀ of the speech signal significantly impacted listeners' accuracy in perceiving the speaker's gender: In gender-incongruent conditions (males using 175-Hz devices, females using 75-Hz devices), listeners were unable to identify female speakers. Speech acceptability judgments were directly related to intelligibility. Finally, listeners differentially penalized female speakers who used 75-Hz devices when gender information was known. CONCLUSION: Low f₀ facilitated speech intelligibility. However, at low f₀, listeners were unable to identify females as female, and females were differentially penalized for speech acceptability. Results may have implications for rehabilitation.
11
Ng ML, Liu H, Zhao Q, Lam PKY. Long-term average spectral characteristics of Cantonese alaryngeal speech. Auris Nasus Larynx 2009; 36:571-7. [PMID: 19261410 DOI: 10.1016/j.anl.2008.12.005] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 07/15/2008] [Revised: 11/28/2008] [Accepted: 12/18/2008] [Indexed: 11/26/2022]
Abstract
OBJECTIVE: In Hong Kong, esophageal (SE), tracheoesophageal (TE), electrolaryngeal (EL), and pneumatic artificial laryngeal (PA) speech are commonly used by laryngectomees as a means to regain verbal communication after total laryngectomy. While SE and TE speech has been studied to some extent, little is known regarding the EL and PA sound quality. The present study examined the sound quality associated with SE, TE, EL, and PA speech, and compared it with that associated with laryngeal (NL) speech by using long-term average speech spectra (LTAS). METHODS: Continuous speech samples of reading a 136-word passage were obtained from NL, SE, TE, EL, and PA speakers of Cantonese. The alaryngeal speakers were all superior speakers selected from the New Voice Club of Hong Kong, which is a self-help organization for laryngectomees in Hong Kong. TE speakers were fitted with a Provox valve, and EL speakers used a Servox-type electrolarynx. Speech samples were digitized at 20 kHz and 16 bits/sample by using Praat, based on which LTAS contours were developed. First spectral peak (FSP), mean spectral energy (MSE), and spectral tilt (ST) derived from the LTAS contours associated with different speaker groups were compared. RESULTS: Data revealed all speakers generally exhibited similar LTAS contours. However, PA speakers exhibited the lowest average FSP value and the greatest average MSE value. NL phonation was associated with a significantly greater ST value than alaryngeal speech of Cantonese. CONCLUSION: The differences in FSP, MSE, and ST values in different speaker groups may be related to the different sound sources being used by the laryngectomees, and the difference in the way the sound source is coupled with the vocal tract system.
Affiliation(s)
- Manwa L Ng
  - The University of Hong Kong, Prince Philip Dental Hospital, Sai Ying Pun, Hong Kong, SAR, China.
12
Munger JB, Thomson SL. Frequency response of the skin on the head and neck during production of selected speech sounds. The Journal of the Acoustical Society of America 2008; 124:4001-4012. [PMID: 19206823 DOI: 10.1121/1.3001703] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 05/27/2023]
Abstract
Vibrations within the vocal tract during speech are transmitted through tissue to the skin surface and can be used to transmit speech. Achieving quality speech signals using skin vibration is desirable but problematic, primarily due to the several sound production locations along the vocal tract. The objective of this study was to characterize the frequency content of speech signals on various locations of the head and neck. Signals were recorded using a microphone and accelerometers attached to 15 locations on the heads and necks of 14 males and 10 females. The subjects voiced various phonemes and one phrase. The power spectral densities (PSD) of the phonemes were used to determine a quality ranking for each location and sound. Spectrograms were used to examine signal frequency content for selected locations. A perceptual listening test was conducted and compared to the PSD rankings. The signal-to-noise ratio was found for each location with and without background noise. These results are presented and discussed. Notably, while high-frequency content is attenuated at the throat, it is shown to be detectable at some other locations. The best locations for speech transmission were found to be generally common to males and females.
Affiliation(s)
- Jacob B Munger
  - Department of Mechanical Engineering, Brigham Young University, Provo, Utah 84602, USA
13
Stepp CE, Heaton JT, Hillman RE. Post-laryngectomy speech respiration patterns. Ann Otol Rhinol Laryngol 2008; 117:557-63. [PMID: 18771069 DOI: 10.1177/000348940811700801] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/15/2022]
Abstract
OBJECTIVES: The goal of this study was to determine whether speech breathing changes over time in laryngectomy patients who use an electrolarynx, to explore the potential of using respiratory signals to control an artificial voice source. METHODS: Respiratory patterns during serial speech tasks (counting, days of the week) with an electrolarynx were prospectively studied by inductance plethysmography in 6 individuals across their first 1 to 2 years after total laryngectomy, as well as in an additional 8 individuals who had had a laryngectomy at least 1 year earlier. RESULTS: In contrast to normal speech that is only produced during exhalation, all individuals were found to engage in inhalation during speech production, and those studied longitudinally displayed increased occurrences of inhalation during speech production with time after laryngectomy. These trends appear to be stronger for individuals who used an electrolarynx as their primary means of oral communication rather than tracheoesophageal speech, possibly because of continued dependence on respiratory support for the production of tracheoesophageal speech. CONCLUSIONS: Our results indicate that there are post-laryngectomy changes in the speech breathing behaviors of electrolarynx users. This has implications for designing improved electrolarynx communication systems, which could use signals derived from respiratory function as one of many potential physiologically based sources for more natural control of electrolarynx speech.
Affiliation(s)
- Cara E Stepp
  - Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
14
Meltzner GS, Hillman RE. Impact of aberrant acoustic properties on the perception of sound quality in electrolarynx speech. Journal of Speech, Language, and Hearing Research: JSLHR 2005; 48:766-79. [PMID: 16378472 DOI: 10.1044/1092-4388(2005/053)] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 05/27/2004] [Accepted: 01/14/2005] [Indexed: 05/05/2023]
Abstract
A large percentage of patients who have undergone laryngectomy to treat advanced laryngeal cancer rely on an electrolarynx (EL) to communicate verbally. Although serviceable, EL speech is plagued by shortcomings in both sound quality and intelligibility. This study sought to better quantify the relative contributions of previously identified acoustic abnormalities to the perception of degraded quality in EL speech. Ten normal listeners evaluated the sound quality of EL speech tokens that had been acoustically enhanced by (a) increased low-frequency energy, (b) EL-noise reduction, and (c) fundamental frequency variation to mimic normal pitch intonation in relation to nonenhanced EL speech, normal speech, and normal monotonous speech (fundamental frequency variation removed). In comparing all possible combinations of token pairs, listeners were asked to identify which one of each pair sounded most like normal natural speech, and then to rate on a visual analog scale how different the chosen token was from normal speech. The results indicate that although EL speech can be most improved by removing the EL noise and providing proper pitch information, the resulting quality is still well below that of normal natural speech or even that of monotonous natural speech. This suggests that, in addition to the widely acknowledged acoustic abnormalities examined in this investigation, there are other attributes that contribute significantly to the unnatural quality of EL speech. Such additional factors need to be clearly identified and remedied before EL speech can be made to more closely approximate the sound quality of normal natural speech.
15
Hemmerling TM, Michaud G, Deschamps S, Trager G. An external monitoring site at the neck cannot be used to measure neuromuscular blockade of the larynx. Anesth Analg 2005; 100:1718-1722. [PMID: 15920202 DOI: 10.1213/01.ane.0000152189.85483.0e] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/05/2022]
Abstract
Using phonomyography, a new monitoring technique of neuromuscular blockade (NMB), we compared NMB after mivacurium 0.1 mg/kg at the lateral cricoarytenoid muscle (LCA) with a possible external monitoring site of the larynx. In 12 patients, data were obtained at both sites using phonomyography. Anesthesia was induced with remifentanil 0.25-0.5 µg·kg⁻¹·min⁻¹ followed by propofol 2-3 mg/kg. A small piezo-electric microphone was positioned beside the vocal cords into the muscular process at the base of the arytenoid cartilage to record acoustic signals from the contraction of the LCA. A second microphone was positioned at an external site, lateral to the trachea, just below the thyroid notch. The recurrent laryngeal nerve was stimulated supramaximally using train-of-four (TOF) stimulation every 12 s. Onset, maximum effect, and offset of NMB were measured and compared. Peak effect, time to reach (T) 25%, 75%, and 90% of control twitch response, and TOF recovery to TOF ratios 0.5-0.8 were significantly longer at the external site. The onset time was not significantly different between the two sites. We used phonomyography with a microphone placed at the neck to evaluate the possibility to externally monitor NMB at the larynx. When compared with LCA, we found a more pronounced peak effect and longer offset of NMB. The acoustic signals recorded at this external site are unlikely to stem from laryngeal muscle contraction but are rather a result of contraction of the strap muscles of the neck.
Affiliation(s)
- Thomas M Hemmerling
  - Neuromuscular Research Group (NRG), Department of Anesthesiology, Centre Hospitalier de l'Université de Montréal (CHUM) Hôtel-Dieu, Université de Montréal, Canada