Reference Citation Analysis

For: Watson PJ, Schlauch RS. Fundamental frequency variation with an electrolarynx improves speech understanding: a case study. Am J Speech Lang Pathol 2009;18:162-167. [PMID: 19106204 DOI: 10.1044/1058-0360(2008/08-0025)] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 05/27/2023]

Cited by Other Article(s)

Cox SR, Huang T, Chen WR, Ng ML. An acoustic study of Cantonese alaryngeal speech in different speaking conditions. The Journal of the Acoustical Society of America 2023;153:2973. [PMID: 37212513 PMCID: PMC10205142 DOI: 10.1121/10.0019471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2022] [Revised: 04/30/2023] [Accepted: 05/02/2023] [Indexed: 05/23/2023]

Cox SR, McNicholl K, Shadle CH, Chen WR. Variability of Electrolaryngeal Speech Intelligibility in Multitalker Babble. American Journal of Speech-Language Pathology 2020;29:2012-2022. [PMID: 32870708 PMCID: PMC8740568 DOI: 10.1044/2020_ajslp-20-00092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2020] [Revised: 06/08/2020] [Accepted: 06/29/2020] [Indexed: 06/11/2023]

Al-Zanoon N, Parsa V, Doyle PC. Using visual feedback to enhance intonation control with a variable pitch electrolarynx. The Journal of the Acoustical Society of America 2020;147:1802. [PMID: 32237840 DOI: 10.1121/10.0000936] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2019] [Accepted: 03/03/2020] [Indexed: 06/11/2023]

Qian Z, Wang L, Zhang S, Liu C, Niu H. Mandarin Electrolaryngeal Speech Recognition Based on WaveNet-CTC. Journal of Speech, Language, and Hearing Research: JSLHR 2019;62:2203-2212. [PMID: 31200617 DOI: 10.1044/2019_jslhr-s-18-0313] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/09/2023]

Abstract

Purpose: The application of Chinese Mandarin electrolaryngeal (EL) speech for laryngectomees has been limited by drawbacks such as a single fundamental frequency, a mechanical sound, and large radiation noise. To improve the intelligibility of Chinese Mandarin EL speech, a new perspective using an automatic speech recognition (ASR) system was proposed, which can convert EL speech into healthy speech if combined with text-to-speech. Method: An ASR system was designed to recognize EL speech based on the deep learning model WaveNet and connectionist temporal classification (WaveNet-CTC). This system mainly consists of 3 parts: the acoustic model, the language model, and the decoding model. The acoustic features are extracted during speech preprocessing, and 3,230 utterances of EL speech mixed with 10,000 utterances of healthy speech are used to train the ASR system. A comparative experiment was designed to evaluate the performance of the proposed method. Results: The results show that the proposed ASR system has higher stability and generalizability compared with the traditional methods, manifesting superiority in terms of Chinese characters, Chinese words, short sentences, and long sentences. Phoneme confusion occurs more easily in the stops and affricates of EL speech than in healthy speech. However, the highest accuracy of the ASR could reach 83.24% when 3,230 utterances of EL speech were used to train the ASR system. Conclusions: This study indicates that EL speech could be recognized effectively by the ASR based on WaveNet-CTC. The proposed method has higher generalization performance and better stability than the traditional methods. A higher accuracy of the ASR system based on WaveNet-CTC can be obtained, which means that EL speech can be converted into healthy speech. Supplemental Material: https://doi.org/10.23641/asha.8250830.

Li W, Zhaopeng Q, Yijun F, Haijun N. Design and Preliminary Evaluation of Electrolarynx With F0 Control Based on Capacitive Touch Technology. IEEE Trans Neural Syst Rehabil Eng 2018. [PMID: 29522407 DOI: 10.1109/tnsre.2018.2805338] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 11/09/2022]

Abstract

An electrolarynx (EL) is one of the most popular voice rehabilitation technologies used after laryngectomy. However, most ELs generate monotonic EL speech, which has been shown to create a particular deficit in speech intelligibility, especially for Chinese Mandarin (Mandarin). Mandarin is a tonal language that makes lexical distinctions using variations in tone. Our purpose is to design an EL that can produce the four Mandarin tones, and to evaluate its performance. We designed a fundamental frequency (F0) control method for Mandarin EL speech and manufactured a touch-controlled electrolarynx (T-EL) prototype. Using monosyllables, disyllabic words, and frequently used phrases, we evaluated speech produced with a T-EL, as well as with monotone (M-EL) and variable-frequency modes (P-EL) of a commercially available TruTone EL. A male native Mandarin speaker with laryngectomy volunteered to be the speaker. Results show that the normal speech pitch contours of the four Mandarin tones were most closely matched by the characteristics produced with T-EL. The statistical accuracy of the T-EL's tone and word perception was significantly higher than that of the other EL types. Moreover, the confusion matrix indicates that the listeners could correctly identify the tones of monosyllables and disyllabic words in T-EL speech. Accurate tone judgment can improve the intelligibility of EL speech in Mandarin. The mean opinion score was used to evaluate the listeners' acceptability of EL speech. The scores of the T-EL and M-EL were very close, and the score of the P-EL was significantly lower than that of the other two ELs. However, the results from a single speaker cannot provide sufficient data to conclude which EL has a higher acceptability. The evaluation of multiple EL speakers with different EL types at different levels of proficiency should be studied in future research.

Wang L, Feng Y, Yang Z, Niu H. Development and evaluation of wheel-controlled pitch-adjustable electrolarynx. Med Biol Eng Comput 2016;55:1463-1472. [PMID: 28013472 DOI: 10.1007/s11517-016-1606-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/25/2016] [Accepted: 12/03/2016] [Indexed: 10/20/2022]

Kyong JS, Scott SK, Rosen S, Howe TB, Agnew ZK, McGettigan C. Exploring the roles of spectral detail and intonation contour in speech intelligibility: an fMRI study. J Cogn Neurosci 2014;26:1748-63. [PMID: 24568205 DOI: 10.1162/jocn_a_00583] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Indexed: 11/04/2022]

Abstract

The melodic contour of speech forms an important perceptual aspect of tonal and nontonal languages and an important limiting factor on the intelligibility of speech heard through a cochlear implant. Previous work exploring the neural correlates of speech comprehension identified a left-dominant pathway in the temporal lobes supporting the extraction of an intelligible linguistic message, whereas the right anterior temporal lobe showed an overall preference for signals clearly conveying dynamic pitch information [Johnsrude, I. S., Penhune, V. B., & Zatorre, R. J. Functional specificity in the right human auditory cortex for perceiving pitch direction. Brain, 123, 155-163, 2000; Scott, S. K., Blank, C. C., Rosen, S., & Wise, R. J. Identification of a pathway for intelligible speech in the left temporal lobe. Brain, 123, 2400-2406, 2000]. The current study combined modulations of overall intelligibility (through vocoding and spectral inversion) with a manipulation of pitch contour (normal vs. falling) to investigate the processing of spoken sentences in functional MRI. Our overall findings replicate and extend those of Scott et al. [Scott, S. K., Blank, C. C., Rosen, S., & Wise, R. J. Identification of a pathway for intelligible speech in the left temporal lobe. Brain, 123, 2400-2406, 2000], where greater sentence intelligibility was predominately associated with increased activity in the left STS, and the greatest response to normal sentence melody was found in right superior temporal gyrus. These data suggest a spatial distinction between brain areas associated with intelligibility and those involved in the processing of dynamic pitch information in speech. By including a set of complexity-matched unintelligible conditions created by spectral inversion, this is additionally the first study reporting a fully factorial exploration of spectrotemporal complexity and spectral inversion as they relate to the neural processing of speech intelligibility. 
Perhaps surprisingly, there was little evidence for an interaction between the two factors; we discuss the implications for the processing of sound and speech in the dorsolateral temporal lobes.

Nagle KF, Eadie TL, Wright DR, Sumida YA. Effect of fundamental frequency on judgments of electrolaryngeal speech. American Journal of Speech-Language Pathology 2012;21:154-166. [PMID: 22355005 DOI: 10.1044/1058-0360(2012/11-0050)] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 05/31/2023]

Heaton JT, Robertson M, Griffin C. Development of a wireless electromyographically controlled electrolarynx voice prosthesis. Annual International Conference of the IEEE Engineering in Medicine and Biology Society 2012;2011:5352-5. [PMID: 22255547 DOI: 10.1109/iembs.2011.6091324] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/05/2022]

Miller SE, Schlauch RS, Watson PJ. The effects of fundamental frequency contour manipulations on speech intelligibility in background noise. The Journal of the Acoustical Society of America 2010;128:435-43. [PMID: 20649237 DOI: 10.1121/1.3397384] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Indexed: 05/20/2023]