1. Rohlfing ML, Buckley DP, Piraquive J, Stepp CE, Tracy LF. Hey Siri: How Effective are Common Voice Recognition Systems at Recognizing Dysphonic Voices? Laryngoscope 2021;131:1599-1607. [PMID: 32949415] [DOI: 10.1002/lary.29082]
Abstract
OBJECTIVES/HYPOTHESIS Interaction with voice recognition systems, such as Siri™ and Alexa™, is an increasingly important part of everyday life. Patients with voice disorders may have difficulty with this technology, leading to frustration and reduced quality of life. This study evaluates the ability of common voice recognition systems to transcribe dysphonic voices. STUDY DESIGN Retrospective evaluation of "Rainbow Passage" voice samples from patients with and without voice disorders. METHODS Participants with (n = 30) and without (n = 23) voice disorders were recorded reading the "Rainbow Passage". Recordings were played at standardized intensity and distance to dictation programs on the Apple iPhone 6S™, Apple iPhone 11 Pro™, and Google Voice™. Word recognition scores were calculated as the proportion of correctly transcribed words and were compared to auditory-perceptual and acoustic measures. RESULTS Mean word recognition scores for participants with and without voice disorders were, respectively, 68.6% and 91.9% for the Apple iPhone 6S™ (P < .001), 71.2% and 93.7% for the Apple iPhone 11 Pro™ (P < .001), and 68.7% and 93.8% for Google Voice™ (P < .001). There were strong, approximately linear associations between CAPE-V ratings of overall severity of dysphonia and word recognition score, with coefficients of determination (R²) of 0.609 (iPhone 6S™), 0.670 (iPhone 11 Pro™), and 0.619 (Google Voice™). These relationships persisted when controlling for diagnosis, age, gender, fundamental frequency, and speech rate (P < .001 for all systems). CONCLUSION Common voice recognition systems function well with nondysphonic voices but transcribe dysphonic voices poorly. There was a strong negative correlation between word recognition scores and perceptual voice evaluation. As our society increasingly interfaces with automated voice recognition technology, the needs of patients with voice disorders should be considered.
LEVEL OF EVIDENCE: 4. Laryngoscope, 131:1599-1607, 2021.
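The word recognition score in this study is the proportion of reference-passage words that were correctly transcribed. As a rough illustration only (the study's exact matching rules, e.g. for repeated words or insertions, are not given here), one way to compute such a score:

```python
def word_recognition_score(reference: str, transcript: str) -> float:
    """Proportion of reference words found in the transcript.

    A minimal sketch: case-insensitive bag-of-words matching, with each
    transcribed word allowed to match at most one reference word.
    """
    ref = reference.lower().split()
    hyp = transcript.lower().split()
    hits = 0
    for word in ref:
        if word in hyp:
            hyp.remove(word)  # consume the match so it cannot be reused
            hits += 1
    return hits / len(ref) if ref else 0.0
```

A real evaluation would typically align the two word sequences (e.g. via edit distance) rather than ignore word order, but the proportion-correct idea is the same.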
Affiliation(s)
- Matthew L Rohlfing
- Department of Otolaryngology-Head and Neck Surgery, Boston Medical Center, Boston University School of Medicine, Boston, Massachusetts, U.S.A.
- Daniel P Buckley
- Department of Otolaryngology-Head and Neck Surgery, Boston Medical Center, Boston University School of Medicine, Boston, Massachusetts, U.S.A.
- Department of Speech, Language, and Hearing Sciences, Boston University, Boston, Massachusetts, U.S.A.
- Jacquelyn Piraquive
- Department of Otolaryngology-Head and Neck Surgery, Boston Medical Center, Boston University School of Medicine, Boston, Massachusetts, U.S.A.
- Cara E Stepp
- Department of Speech, Language, and Hearing Sciences, Boston University, Boston, Massachusetts, U.S.A.
- Lauren F Tracy
- Department of Otolaryngology-Head and Neck Surgery, Boston Medical Center, Boston University School of Medicine, Boston, Massachusetts, U.S.A.
2. Ishikawa K, Boyce S, Kelchner L, Powell MG, Schieve H, de Alarcon A, Khosla S. The Effect of Background Noise on Intelligibility of Dysphonic Speech. J Speech Lang Hear Res 2017;60:1919-1929. [PMID: 28679008] [PMCID: PMC6194928] [DOI: 10.1044/2017_jslhr-s-16-0012]
Abstract
PURPOSE The aim of this study was to determine the effect of background noise on the intelligibility of dysphonic speech and to examine the relationship between intelligibility in noise and an acoustic measure of dysphonia: cepstral peak prominence (CPP). METHOD A speech perception study was conducted using speech samples from 6 adult speakers with typical voices and 6 adult speakers with dysphonia. Speech samples were presented to 30 listeners with typical hearing in 3 noise conditions: quiet, +5 dB signal-to-noise ratio (SNR), and 0 dB SNR. Intelligibility scores were obtained via orthographic transcription as the percentage of correctly identified words. Speech samples were acoustically analyzed using CPP, and the correlation between the CPP measurements and intelligibility scores was examined. RESULTS The intelligibility of both typical and dysphonic speech was reduced as the level of background noise increased; the reduction was significantly greater for dysphonic speech. A strong correlation was noted between CPP and intelligibility score at 0 dB SNR. CONCLUSIONS Dysphonic speech is harder to understand in the presence of background noise than typical speech, and CPP may be a useful predictor of this intelligibility deficit. Future work is needed to confirm these findings with a larger number of speakers and speech materials with known predictability.
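The noise conditions here are fixed speech-to-noise power ratios. A minimal sketch of how a noise track can be scaled to hit a target SNR (in dB) before mixing; the study's actual calibration procedure is not described here, so this is illustrative only:

```python
import numpy as np

def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Scale `noise` so that the speech-to-noise power ratio equals
    `snr_db`, then add it to `speech`. Assumes both are 1-D sample arrays."""
    noise = noise[: len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # target noise power so that 10*log10(p_speech / p_target) == snr_db
    p_target = p_speech / (10 ** (snr_db / 10))
    return speech + noise * np.sqrt(p_target / p_noise)
```

At `snr_db=0` the scaled noise carries the same average power as the speech; at `snr_db=5` it carries about a third as much.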
Affiliation(s)
- Keiko Ishikawa
- Department of Communication Sciences and Disorders, University of Cincinnati, OH
- Suzanne Boyce
- Department of Communication Sciences and Disorders, University of Cincinnati, OH
- Lisa Kelchner
- Department of Communication Sciences and Disorders, University of Cincinnati, OH
- Maria Golla Powell
- Department of Communication Sciences and Disorders, University of Cincinnati, OH
- Heidi Schieve
- Department of Communication Sciences and Disorders, University of Cincinnati, OH
- Sid Khosla
- Department of Otolaryngology, University of Cincinnati, OH
3. Monson BB, Hunter EJ, Lotto AJ, Story BH. The perceptual significance of high-frequency energy in the human voice. Front Psychol 2014;5:587. [PMID: 24982643] [PMCID: PMC4059169] [DOI: 10.3389/fpsyg.2014.00587]
Abstract
While human vocalizations generate acoustic energy at frequencies up to (and beyond) 20 kHz, the energy above about 5 kHz has traditionally been neglected in speech perception research. The intent of this paper is to review (1) the historical reasons for this research trend and (2) the work that continues to elucidate the perceptual significance of high-frequency energy (HFE) in speech and singing. The historical and physical factors reveal that, while HFE was believed to be unnecessary and/or impractical for applications of interest, it was never shown to be perceptually insignificant. Rather, the focus on low-frequency energy appears to have had two main causes: the low-frequency portion of the speech spectrum was seen as perceptually sufficient, and HFE research was considered too difficult to be technologically justifiable. The advancement of technology continues to overcome the second concern, and advances in our understanding of the perceptual effects of HFE now cast doubt on the first. Emerging evidence indicates that HFE plays a more significant role than previously believed and should thus be considered in speech and voice perception research, especially research involving children and the hearing impaired.
Affiliation(s)
- Brian B. Monson
- Department of Pediatric Newborn Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- National Center for Voice and Speech, University of Utah, Salt Lake City, UT, USA
- Eric J. Hunter
- National Center for Voice and Speech, University of Utah, Salt Lake City, UT, USA
- Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, MI, USA
- Andrew J. Lotto
- Speech, Language, and Hearing Sciences, University of Arizona, Tucson, AZ, USA
- Brad H. Story
- Speech, Language, and Hearing Sciences, University of Arizona, Tucson, AZ, USA
4. Fraj S, Schoentgen J, Grenez F. Development and perceptual assessment of a synthesizer of disordered voices. J Acoust Soc Am 2012;132:2603-2615. [PMID: 23039453] [DOI: 10.1121/1.4751536]
Abstract
A synthesizer of disordered voices is presented, based on a nonlinear wave-shaping model of the glottal area, an algebraic model of the glottal aerodynamics, and concatenated-tube models of the trachea and vocal tract. Voice disorders are simulated by way of models of vocal frequency jitter and tremor, vocal amplitude shimmer and tremor, and pulsatile additive noise. Six experiments were carried out to assess the synthesizer perceptually. Three involve the perceptual categorization of male synthetic and human stimuli, and one the auditory discrimination between synthetic and human tokens. A fifth experiment reports the auditory discrimination between synthetic tokens with different levels of additive and modulation noise, and a sixth the scoring by expert listeners of male synthetic stimuli on equal-appearing interval grade-roughness-breathiness (GRB) scales. A first objective is to demonstrate the ability of the synthesizer to simulate vowel sounds that are valid exemplars of speech sounds produced by humans with voice disorders. A second objective is to learn how expert raters perceptually map vocal frequency, additive and modulation noise, and vowel categories into scores on the GRB scales.
Affiliation(s)
- Samia Fraj
- Laboratory of Signals Images and Telecommunication Devices, CP 165/51, Faculty of Applied Sciences, Université Libre de Bruxelles, 50 Avenue F.-D. Roosevelt, B-1050 Brussels, Belgium
5. Schoentgen J. Spectral models of additive and modulation noise in speech and phonatory excitation signals. J Acoust Soc Am 2003;113:553-562. [PMID: 12558291] [DOI: 10.1121/1.1523384]
Abstract
The article presents spectral models of additive and modulation noise in speech. The purpose is to learn about the causes of noise in the spectra of normal and disordered voices and to gauge whether the spectral properties of the perturbations of the phonatory excitation signal can be inferred from the spectral properties of the speech signal. The approach to modeling consists of deducing the Fourier series of the perturbed speech, assuming that the Fourier series of the noise and of the clean monocycle-periodic excitation are known. The models explain published data, take into account the effects of supraglottal tremor, demonstrate the modulation distortion owing to vocal tract filtering, establish conditions under which noise cues of different speech signals may be compared, and predict the impossibility of inferring the spectral properties of the frequency modulating noise from the spectral properties of the frequency modulation noise (e.g., phonatory jitter and frequency tremor). The general conclusion is that only phonatory frequency modulation noise is spectrally relevant. Other types of noise in speech are either epiphenomenal, or their spectral effects are masked by the spectral effects of frequency modulation noise.
Affiliation(s)
- Jean Schoentgen
- Laboratory of Experimental Phonetics, Université Libre de Bruxelles, 50 Av. F-D. Roosevelt, B-1050 Brussels, Belgium.
6. Schoentgen J. Stochastic models of jitter. J Acoust Soc Am 2001;109:1631-1650. [PMID: 11325133] [DOI: 10.1121/1.1350557]
Abstract
This study presents stochastic models of jitter. Jitter designates small, random, involuntary perturbations of the glottal cycle lengths. Jitter is a base-line phenomenon that may be observed in all voiced speech sounds. Knowledge of its properties is therefore relevant to the acoustic modeling, analysis, and synthesis of voice quality. Also, models of jitter are conceptual frameworks that enable experimenters and clinicians to distinguish jitter in particular from aperiodic cycle length patterns in general. Vocal jitter is modeled by means of the ribbon model of the glottal vibration combined with stochastic models of the disturbances of the instantaneous frequency. The disturbance model comprises correlation-free noise and vocal microtremor. Properties of jitter that are simulated are the stochasticity, stationarity, and normality of the decorrelated cycle length perturbations, the size of decorrelated jitter, the correlation between the perturbations of neighboring glottal cycles, the modulation level and modulation frequency owing to microtremor, the asynchrony between external disturbances and glottal cycles, the dependence of the size of jitter on the average glottal cycle length, and the relation between jitter and laryngeal pathologies. Modeled jitter is discussed in the light of measured jitter, as well as the physiological and statistical plausibility of the model parameters.
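The disturbance model described above — correlation-free noise plus a slow microtremor modulating the cycle lengths — can be caricatured in a few lines. All parameter names and values below are illustrative, not the paper's:

```python
import numpy as np

def jittered_cycles(f0_hz=120.0, n=200, jitter_pct=1.0,
                    tremor_hz=5.0, tremor_pct=2.0, seed=1):
    """Glottal cycle lengths (in seconds) perturbed by correlation-free
    Gaussian noise plus a slow sinusoidal microtremor."""
    rng = np.random.default_rng(seed)
    t0 = 1.0 / f0_hz                       # mean cycle length (s)
    k = np.arange(n)                       # cycle index
    tremor = (tremor_pct / 100) * np.sin(2 * np.pi * tremor_hz * k * t0)
    noise = (jitter_pct / 100) * rng.standard_normal(n)
    return t0 * (1 + tremor + noise)

def jitter_percent(cycles):
    """Mean absolute difference between consecutive cycle lengths,
    as a percentage of the mean cycle length (a common jitter measure)."""
    return 100 * np.mean(np.abs(np.diff(cycles))) / np.mean(cycles)
```

Because consecutive perturbations are independent here, the measured jitter comes out somewhat larger than `jitter_pct`; modeling correlation between neighboring cycles, as the paper does, changes that relationship.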
Affiliation(s)
- J Schoentgen
- Laboratoire de Phonétique Expérimentale, Université Libre de Bruxelles, Brussels, Belgium.
7. Schoentgen J, Bensaid M, Bucella F. Multivariate statistical analysis of flat vowel spectra with a view to characterizing dysphonic voices. J Speech Lang Hear Res 2000;43:1493-1508. [PMID: 11193968] [DOI: 10.1044/jslhr.4306.1493]
Abstract
The aim of this article is to show how dysphonic voices can be characterized by means of a multivariate statistical analysis of flat vowel spectra. The spectral contour was obtained by means of a wavelet transform of the logarithmic magnitude spectrum, which was subsequently flattened to remove interspeaker variability related to the excitation and vocal tract filter functions. The results of the statistical analysis of flat spectra were the following. Firstly, principal components analysis produced markers that separated noisy from clean spectra. Secondly, the heuristic search for harmonic peaks or interharmonic dips could be omitted. Thirdly, conventional spectral markers of noise appeared as special instances of the markers that were derived statistically. Fourthly, the levels of visually assigned hoarseness and the first two principal components were significantly correlated. The assignment of different levels of (visual) hoarseness to different vowel timbres could be explained by the variability associated with the spectral contour.
8. Wolfe VI, Martin DP, Palmer CI. Perception of dysphonic voice quality by naive listeners. J Speech Lang Hear Res 2000;43:697-705. [PMID: 10877439] [DOI: 10.1044/jslhr.4303.697]
Abstract
For clinical assessment as well as student training, there is a need for information pertaining to the perceptual dimensions of dysphonic voice. To this end, 24 naive listeners judged the similarity of 10 female and 10 male vowel samples, selected from within a narrow range of fundamental frequencies. Most of the perceptual variance for both sets of voices was associated with "degree of abnormality" as reflected by perceptual ratings as well as combined acoustic measures, based upon filtered and unfiltered signals. A second perceptual dimension for female voices was associated with high frequency noise as reflected by two acoustic measures: breathiness index (BRI) and a high-frequency power ratio. A second perceptual dimension for male voices was associated with a breathy-overtight continuum as reflected by period deviation (PDdev) and perceptual ratings of breathiness. Results are discussed in terms of perceptual training and the clinical assessment of pathological voices.
Affiliation(s)
- V I Wolfe
- Department of Communication and Dramatic Arts, Auburn University at Montgomery, AL 36117-3596, USA.
9. Murphy PJ. Perturbation-free measurement of the harmonics-to-noise ratio in voice signals using pitch synchronous harmonic analysis. J Acoust Soc Am 1999;105:2866-2881. [PMID: 10335636] [DOI: 10.1121/1.426901]
Abstract
The measurement of the harmonics-to-noise ratio (HNR) in speech signals gives an indication of the aperiodicity of the speech waveform. This may be due to the presence of jitter, shimmer, additive noise, waveshape change, or some unknown combination of these factors. In order to estimate the HNR as a measure of the additive noise component only, the contaminating effects of the other contributory components must first be removed. A pitch synchronous harmonic analysis is proposed to overcome this problem. The procedure takes advantage of the time scale compression-frequency expansion property of the Fourier series in order to eliminate jitter and shimmer. Successive spectra are added by harmonic number as opposed to frequency location, and perturbation is removed due to the fact that the relative heights of the harmonic components remain the same for scaled signals. The technique is examined on synthetically generated voice signals. A discussion of the results is given in terms of human voice signals, characterization of jitter, vocal tract filtering effects, perturbation mechanisms, nonlinear dynamics, and the possibility of developing the method for use with inverse filtering strategies.
Affiliation(s)
- P J Murphy
- Department of Physics, Royal College of Surgeons in Ireland, Dublin, Ireland
10. Fisher KV, Scherer RC, Guo CG, Owen AS. Longitudinal phonatory characteristics after botulinum toxin type A injection. J Speech Hear Res 1996;39:968-980. [PMID: 8898251] [DOI: 10.1044/jshr.3905.968]
Abstract
Following Botulinum Toxin Type A injection, glottal competency of an adductor spasmodic dysphonia patient is thought to vary over a wide range. This study quantifies variability in laryngeal adduction for one such patient over a 10-week period. Analyses of kinematic and aerodynamic measures were used to track the voice weekly. The measures included the electroglottographic waveform width (EGGW50), nondimensional electroglottographic slope quotient (SLQ), glottal flow open quotient (FOQ), dc glottal flow, and nondimensional glottal flow peak quotient (FPQ). The results suggested that change in degree of glottal adduction over time can be observed even when vocal instability is present within each recording session. Perceptual ratings of vocal quality (breathy to pressed) were related to the laryngeal measures. The coefficient of variation for EGGW50 and the percentage of dichrotic phonations reached minima during sessions with predominantly breathy and hypoadducted phonation. The methods used in this study show potential to aid decisions about dose level and sources of perceptual adductor spasmodic dysphonia symptoms for a given patient.
Affiliation(s)
- K V Fisher
- University of Oklahoma Health Sciences Center, Oklahoma City, USA.
11. de Krom G. Some spectral correlates of pathological breathy and rough voice quality for different types of vowel fragments. J Speech Hear Res 1995;38:794-811. [PMID: 7474973] [DOI: 10.1044/jshr.3804.794]
Abstract
This study deals with the relation between listeners' ratings of pathological breathiness and roughness and certain characteristics of the voice spectrum. Two general research questions were addressed: First, which spectral parameters may serve as useful predictors of breathiness and roughness? Second, does the type of speech fragment used for analysis have an effect on the obtained regression model? Listener ratings of breathiness and roughness were obtained for three types of vowel fragments: a vowel onset segment, a mid-vowel (post-onset) segment, and a vowel segment covering the onset and the acoustically more stable post-onset parts. Results indicated that the harmonics-to-noise ratio was the best single predictor of both rated breathiness and roughness, explaining up to 54% of the true rating variance. By combining different predictors, between 75% and 80% of the breathiness variance could be explained for all three types of fragments. For roughness, a strong effect of fragment type was observed, with most variance explained in vowel onset fragments (71%), and least in post-onset fragments (52%). The effect of fragment type was also observed when regression analyses were performed with six predictors based on a factor analysis of the acoustic data.
Affiliation(s)
- G de Krom
- Research Institute for Language and Speech, University of Utrecht, The Netherlands
12. de Krom G. A cepstrum-based technique for determining a harmonics-to-noise ratio in speech signals. J Speech Hear Res 1993;36:254-266. [PMID: 8487518] [DOI: 10.1044/jshr.3602.254]
Abstract
A new method to calculate a spectral harmonics-to-noise ratio (HNR) in speech signals is presented. The method involves discrimination between harmonic and noise energy in the magnitude spectrum by means of a comb-liftering operation in the cepstrum domain. Sensitivity of HNR to (a) additive noise and (b) jitter was tested with synthetic vowel-like signals, generated at 10 fundamental frequencies. All jitter and noise signals were analyzed at three window lengths in order to investigate the effect of the length of the analysis frame on the estimated HNR values. Results of a multiple linear regression analysis with noise or jitter, F0, and window length as predictors for HNR indicate a major effect of both noise and jitter on HNR, in that HNR decreases almost linearly with increasing noise levels or increasing jitter. The influence of F0 and window length on HNR is small for the jittered signals, but HNR increases considerably with increasing F0 or window length for the noise signals. We conclude that the method seems to be a valid technique for determining the amount of spectral noise, because it is almost linearly sensitive to both noise and jitter for a large part of the noise or jitter continuum. The strong negative relation between HNR and jitter illustrates that spectral noise measures cannot simply be taken as indicators of the actual amount of noise in the time signal. Instead, HNR integrates several aspects of the acoustic stability of the signal. As such, HNR may be a useful parameter in the analysis of voice quality, although it cannot be directly interpreted in terms of underlying glottal events or perceptual characteristics.
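The comb-liftering idea can be sketched as follows: take the real cepstrum of the log magnitude spectrum, zero out the rahmonics (cepstral peaks at multiples of the pitch period), and treat the liftered spectrum as an estimate of the noise floor. This is a simplified caricature, not the paper's exact procedure; windowing, notch widths, and the energy summation are assumptions here:

```python
import numpy as np

def cepstral_hnr(x, fs, f0):
    """Toy cepstrum-based harmonics-to-noise ratio (dB) for a voiced
    frame `x` sampled at `fs` Hz with fundamental frequency `f0` Hz."""
    n = len(x)
    logmag = np.log(np.abs(np.fft.rfft(x * np.hanning(n))) + 1e-12)
    ceps = np.fft.irfft(logmag)            # real cepstrum (length n, n even)
    period = int(round(fs / f0))           # pitch period in samples
    lift = ceps.copy()
    w = max(1, period // 4)                # half-width of each lifter notch
    q = period
    while q <= len(lift) // 2:
        lift[q - w : q + w + 1] = 0.0                           # rahmonic
        lift[len(lift) - q - w : len(lift) - q + w + 1] = 0.0   # mirror half
        q += period
    noise_log = np.fft.rfft(lift).real     # liftered spectrum ~ noise floor
    e_total = np.sum(np.exp(2 * logmag))
    e_noise = np.sum(np.exp(2 * noise_log))
    return 10 * np.log10(max(e_total - e_noise, 1e-12) / e_noise)
```

As the abstract notes, such a measure reacts to jitter as well as additive noise, so a higher value for a cleaner signal should be read as "more acoustically stable", not literally "less additive noise".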
Affiliation(s)
- G de Krom
- Research Institute for Language and Speech, University of Utrecht, The Netherlands
13. Green DC, Berke GS, Ward PH. Vocal fold medialization by surgical augmentation versus arytenoid adduction in the in vivo canine model. Ann Otol Rhinol Laryngol 1991;100:280-287. [PMID: 2018285] [DOI: 10.1177/000348949110000404]
Abstract
There are a variety of methods for treating unilateral vocal cord paralysis, but to date there have been few studies that compare these phonosurgical techniques by using objective measures of voice improvement. Vocal efficiency is an objective voice measure that is defined as the ratio of the acoustic power produced by the larynx to the subglottic air power. Vocal efficiency has been found to decrease with glottic disorders such as vocal cord paralysis and carcinoma. This study compared the effects of vocal fold medialization by surgical augmentation to those of arytenoid adduction on the vocal efficiency, videostroboscopy, and acoustics (jitter, shimmer, and signal-to-noise ratio) of a simulated unilateral vocal cord paralysis in an in vivo canine model. Arytenoid adduction was superior to surgical augmentation in vocal efficiency, traveling wave motion, and acoustics.
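Vocal efficiency as defined above is a simple ratio. Taking subglottic air power as mean subglottic pressure times mean airflow, a toy calculation looks like this (units and example values are illustrative only, not from the study):

```python
def subglottic_power(pressure_pa: float, flow_m3_s: float) -> float:
    """Aerodynamic (subglottic) air power in watts:
    mean subglottic pressure (Pa) x mean airflow (m^3/s)."""
    return pressure_pa * flow_m3_s

def vocal_efficiency(acoustic_power_w: float,
                     pressure_pa: float, flow_m3_s: float) -> float:
    """Ratio of radiated acoustic power to subglottic air power
    (dimensionless; often quoted as a percentage or in dB)."""
    return acoustic_power_w / subglottic_power(pressure_pa, flow_m3_s)

# e.g. 1 mW of acoustic power from 800 Pa driving 0.2 L/s of airflow
efficiency = vocal_efficiency(0.001, 800.0, 0.0002)
```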
Affiliation(s)
- D C Green
- Division of Head and Neck Surgery, University of California, Los Angeles, CA 90024
15. Smith ME, Berke GS. The effects of phonosurgery on laryngeal vibration: Part I. Theoretic considerations. Otolaryngol Head Neck Surg 1990;103:380-390. [PMID: 2122367] [DOI: 10.1177/019459989010300308]
Abstract
Surgical manipulation of the laryngeal framework (phonosurgery) is rapidly gaining interest and attention. To date, however, a comparative objective evaluation of the various phonosurgical techniques has not been reported. A theoretic model of the larynx, a four-mass model based on the work of Ishizaka (J Acoust Soc Am 1976;60:1193-8) and Koizumi et al. (J Acoust Soc Am 1987;82:1179-92), was developed and adapted to simulate laryngeal biomechanical behavior, as understood by current research. The model was then applied to a comparative evaluation of phonosurgical techniques. Input parameters that correlate laryngeal function and model simulation were developed. Surgical procedures were categorized according to their effect on these parameters. A model simulation of these techniques allowed comparison and prediction of the results of phonosurgery and a better understanding of the issues involved with surgical alteration of the voice.
Affiliation(s)
- M E Smith
- Division of Head and Neck Surgery, University of California, Los Angeles
16. Habermann G. Functional disorders of the voice and their treatment [article in German]. Arch Otorhinolaryngol 1980;227:171-345. [PMID: 7469925] [DOI: 10.1007/bf00456373]
17. Schultz-Coulon HJ. Diagnosis of dysfunction of the voice [article in German]. Arch Otorhinolaryngol 1980;227:1-169. [PMID: 7469924] [DOI: 10.1007/bf00456372]