1
Shen J, Heller Murray E. Breathy Vocal Quality, Background Noise, and Hearing Loss: How Do These Adverse Conditions Affect Speech Perception by Older Adults? Ear Hear 2024:00003446-990000000-00361. [PMID: 39494949 DOI: 10.1097/aud.0000000000001599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2024]
Abstract
OBJECTIVES Although breathy vocal quality and hearing loss are both prevalent age-related changes, their combined impact on speech communication is poorly understood. This study investigated whether breathy vocal quality affected speech perception and listening effort by older listeners. Furthermore, the study examined how this effect was modulated by the adverse listening environment of background noise and the listener's level of hearing loss. DESIGN Nineteen older adults participated in the study. Their hearing ranged from near-normal to mild-moderate sensorineural hearing loss. Participants heard speech material of low-context sentences, with stimuli resynthesized to simulate original, mild-moderately breathy, and severely breathy conditions. Speech intelligibility was measured using a speech recognition in noise paradigm, with pupillometry data collected simultaneously to measure listening effort. RESULTS Simulated severely breathy vocal quality was found to reduce intelligibility and increase listening effort. Breathiness and background noise level independently modulated listening effort. The impact of hearing loss was not observed in this dataset, which can be due to the use of individualized signal to noise ratios and a small sample size. CONCLUSION Results from this study demonstrate the challenges of listening to speech with a breathy vocal quality. Theoretically, the findings highlight the importance of periodicity cues in speech perception in noise by older listeners. Breathy voice could be challenging to separate from the noise when the noise also lacks periodicity. Clinically, it suggests the need to address both listener- and talker-related factors in speech communication by older adults.
Affiliation(s)
- Jing Shen
- Department of Communication Sciences and Disorders, College of Public Health, Temple University, Philadelphia, Pennsylvania, USA
2
Laukkanen AM, Kadiri SR, Narayanan S, Alku P. Can a Machine Distinguish High and Low Amount of Social Creak in Speech? J Voice 2024:S0892-1997(24)00342-4. [PMID: 39455325 DOI: 10.1016/j.jvoice.2024.09.050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2024] [Revised: 09/29/2024] [Accepted: 09/30/2024] [Indexed: 10/28/2024]
Abstract
OBJECTIVES Increased prevalence of social creak particularly among female speakers has been reported in several studies. The study of social creak has been previously conducted by combining perceptual evaluation of speech with conventional acoustical parameters such as the harmonic-to-noise ratio and cepstral peak prominence. In the current study, machine learning (ML) was used to automatically distinguish speech of low amount of social creak from speech of high amount of social creak. METHODS The amount of creak in continuous speech samples produced in Finnish by 90 female speakers was first perceptually assessed by two voice specialists. Based on their assessments, the speech samples were divided into two categories (low vs high amount of creak). Using the speech signals and their creak labels, seven different ML models were trained. Three spectral representations were used as feature for each model. RESULTS The results show that the best performance (accuracy of 71.1%) was obtained by the following two systems: an Adaboost classifier using the mel-spectrogram feature and a decision tree classifier using the mel-frequency cepstral coefficient feature. CONCLUSIONS The study of social creak is becoming increasingly popular in sociolinguistic and vocological research. The conventional human perceptual assessment of the amount of creak is laborious and therefore ML technology could be used to assist researchers studying social creak. The classification systems reported in this study could be considered as baselines in future ML-based studies on social creak.
Affiliation(s)
- Sudarsana Reddy Kadiri
- Signal Analysis and Interpretation Laboratory (SAIL), University of Southern California, Los Angeles
- Shrikanth Narayanan
- Signal Analysis and Interpretation Laboratory (SAIL), University of Southern California, Los Angeles
- Paavo Alku
- Department of Information and Communications Engineering, Aalto University, Espoo, Finland.
3
Leite do Ó SD, Behlau M, de Abreu SR, Englert MT, Wanderley Lopes L. Cepstral Acoustic Measurements: Influence of Speech Task and Degree of Vocal Deviation. J Voice 2024:S0892-1997(24)00281-9. [PMID: 39261203 DOI: 10.1016/j.jvoice.2024.08.027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2024] [Revised: 08/21/2024] [Accepted: 08/22/2024] [Indexed: 09/13/2024]
Abstract
OBJECTIVE To analyze whether there are differences in the cepstral measures obtained in different speech tasks, depending on the presence and degree of vocal deviation, and to analyze if there is a correlation between the cepstral measures obtained from different speech tasks and the general degree of vocal deviation. METHOD Analysis of 258 vocal samples of the sustained vowel [a] and connected speech (counting numbers) from a database, including 160 dysphonic and 98 nondysphonic voices. The counting number samples were edited in three different durations: counting from 1 to 10, from 1 to 11, and from 1 to 20. Five speech-language pathologists (SLPs), voice specialists, carried out the perceptual-auditory judgment of the overall degree of vocal deviation (ODD) using the G from the overall dysphonia grade, roughness, breathiness, asthenia, and strain (GRBAS) scale. We extracted the cepstral peak prominence (CPP) and smoothed cepstral peak prominence (CPPS) measurements from all the vocal samples using an extraction script in the free software Praat. RESULTS CPP and CPPS were different between dysphonic and nondysphonic individuals, regardless of the speech task, with lower values for dysphonic. Also, CPP values between the vowel and the connected speech tasks were different between both groups. Only the CPPS showed differences between all the speech tasks depending on the degree of vocal deviation. There was a strong negative correlation between the CPPSVowel, CPPS10, CPPS11, CPPS20, and the ODD, and a moderate negative correlation between CPPVowel, CPP10, CPP11, CPP20, and ODD. CONCLUSIONS There are differences in the cepstral measures obtained in different speech tasks, depending on the presence of dysphonia and ODD. CPP and CPPS values are different between dysphonic and nondysphonic individuals in all speech tasks. 
There is a moderate negative correlation between CPP in the different speech tasks and ODD, while there is a strong negative correlation between CPPS in the different speech tasks and ODD.
Affiliation(s)
- Samylle Danúbia Leite do Ó
- Department of Speech-Language and Hearing Science, Center for Voice Studies - CEV, São Paulo, SP, Brazil; Department of Speech-Language and Hearing Science, Federal University of São Paulo - UNIFESP, São Paulo, SP, Brazil
- Mara Behlau
- Department of Speech-Language and Hearing Science, Center for Voice Studies - CEV, São Paulo, SP, Brazil; Department of Speech-Language and Hearing Science, Federal University of São Paulo - UNIFESP, São Paulo, SP, Brazil
- Samuel Ribeiro de Abreu
- Department of Speech-Language and Hearing Science, Federal University of Paraíba - UFPB, João Pessoa, PB, Brazil
- Marina Taborda Englert
- Department of Speech-Language and Hearing Science, Center for Voice Studies - CEV, São Paulo, SP, Brazil
- Leonardo Wanderley Lopes
- Department of Speech-Language and Hearing Science, Federal University of Paraíba - UFPB, João Pessoa, PB, Brazil.
|
4
Liu GS, Jovanovic N, Sung CK, Doyle PC. A Scoping Review of Artificial Intelligence Detection of Voice Pathology: Challenges and Opportunities. Otolaryngol Head Neck Surg 2024; 171:658-666. [PMID: 38738887 DOI: 10.1002/ohn.809] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Revised: 04/05/2024] [Accepted: 04/19/2024] [Indexed: 05/14/2024]
Abstract
OBJECTIVE Survey the current literature on artificial intelligence (AI) applications for detecting and classifying vocal pathology using voice recordings, and identify challenges and opportunities for advancing the field forward. DATA SOURCES PubMed, EMBASE, CINAHL, and Scopus databases. REVIEW METHODS A comprehensive literature search was performed following the Preferred Reporting Items for Systematic Reviews and Meta-analyses Extension for Scoping Reviews guidelines. Peer-reviewed journal articles in the English language were included if they used an AI approach to detect or classify pathological voices using voice recordings from patients diagnosed with vocal pathologies. RESULTS Eighty-two studies were included in the review between the years 2000 and 2023, with an increase in publication rate from one study per year in 2012 to 10 per year in 2022. Seventy-two studies (88%) were aimed at detecting the presence of voice pathology, 24 (29%) at classifying the type of voice pathology present, and 4 (5%) at assessing pathological voice using the Grade, Roughness, Breathiness, Asthenia, and Strain scale. Thirty-six databases were used to collect and analyze speech samples. Fourteen articles (17%) did not provide information about their AI model validation methodology. Zero studies moved beyond the preclinical and offline AI model development stages. Zero studies specified following a reporting guideline for AI research. CONCLUSION There is rising interest in the potential of AI technology to aid the detection and classification of voice pathology. Three challenges-and areas of opportunities-for advancing this research are heterogeneity of databases, lack of clinical validation studies, and inconsistent reporting.
Affiliation(s)
- George S Liu
- Department of Otolaryngology-Head and Neck Surgery, Stanford University, Stanford, California, USA
- Nedeljko Jovanovic
- Rehabilitation Sciences-Voice Production and Perception Laboratory, Western University, London, Ontario, Canada
- C Kwang Sung
- Department of Otolaryngology-Head and Neck Surgery, Stanford University, Stanford, California, USA
- Philip C Doyle
- Department of Otolaryngology-Head and Neck Surgery, Stanford University, Stanford, California, USA
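The percentages in the scoping-review abstract above can be re-derived from its study counts. The following is a minimal sketch of that arithmetic; the dictionary keys and the helper name `percent` are illustrative and do not come from the paper.

```python
# Counts and percentages as reported in the abstract; labels are illustrative only.
TOTAL_STUDIES = 82
reported = {
    "detect presence of pathology": (72, 88),
    "classify type of pathology": (24, 29),
    "GRBAS-based assessment": (4, 5),
    "no validation methodology reported": (14, 17),
}


def percent(count, total=TOTAL_STUDIES):
    """Return count/total as a whole-number percentage."""
    return round(100 * count / total)


for label, (count, stated) in reported.items():
    print(f"{label}: {count}/{TOTAL_STUDIES} = {percent(count)}% (abstract: {stated}%)")

# The three study aims sum to 100 across 82 studies, so the aim categories
# cannot be mutually exclusive: some studies pursued more than one aim.
overlap = 72 + 24 + 4 - TOTAL_STUDIES
```

Each recomputed value matches the figure stated in the abstract, and the surplus in `overlap` suggests the aim categories were counted non-exclusively.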
5
Park Y, Baker Brehm S, Kelchner L, Weinrich B, McElfresh K, Anand S, Shrivastav R, de Alarcon A, Eddins DA. Effects of Vibratory Source on Auditory-Perceptual and Bio-Inspired Computational Measures of Pediatric Voice Quality. J Voice 2023:S0892-1997(23)00254-0. [PMID: 37739862 PMCID: PMC10950844 DOI: 10.1016/j.jvoice.2023.08.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 08/11/2023] [Accepted: 08/14/2023] [Indexed: 09/24/2023]
Abstract
OBJECTIVE The vibratory source for voicing in children with dysphonia is classified into three categories including a glottal vibratory source (GVS) observed in those with vocal lesions or hyperfunction; supraglottal vibratory sources (SGVS) observed secondary to laryngeal airway injuries, malformations, or reconstruction surgeries; and a combination of both glottal and supraglottal vibratory sources called mixed vibratory source (MVS). This study evaluated the effects of vibratory source on three primary dimensions of voice quality (breathiness, roughness, and strain) in children with GVS, SGVS, and MVS using single-variable matching tasks and computational measures obtained from bio-inspired auditory models. METHODS A total of 44 dysphonic voice samples from children aged 4-11 years were selected. Seven listeners rated breathiness, roughness, and strain of 1000-ms /ɑ/ samples using single-variable matching tasks. Computational estimates of pitch strength, amplitude modulation filterbank output, and sharpness were obtained through custom-designed MATLAB algorithms. RESULTS Perceived roughness and strain were significantly higher in children with SGVS and MVS compared to children with GVS. Among the computational measures, only the modulation filterbank output resulted in significant differences among vibratory sources; a posthoc test revealed that children with SGVS had greater amplitude modulation than children with GVS, as expected from their rougher voice quality. CONCLUSIONS The results indicate that the output of an auditory amplitude modulation filterbank model may capture characteristics of SGVS that are strongly related to the rough voice quality.
Affiliation(s)
- Yeonggwang Park
- Department of Communication Sciences and Disorders, University of Central Florida, Orlando, Florida.
- Susan Baker Brehm
- Department of Speech Pathology and Audiology, Miami University, Oxford, Ohio; Division of Speech-Language Pathology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio
- Lisa Kelchner
- Division of Speech-Language Pathology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio; Department of Communication Sciences and Disorders, University of Cincinnati, Cincinnati, Ohio
- Barbara Weinrich
- Department of Speech Pathology and Audiology, Miami University, Oxford, Ohio; Division of Speech-Language Pathology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio
- Kevin McElfresh
- Division of Speech-Language Pathology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio
- Supraja Anand
- Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida
- Rahul Shrivastav
- Office of the Provost & Executive Vice President, Indiana University, Bloomington, Indiana
- Alessandro de Alarcon
- Pediatric Otolaryngology Head & Neck Surgery, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio
- David A Eddins
- Department of Communication Sciences and Disorders, University of Central Florida, Orlando, Florida
6
Ikuma T, McWhorter AJ, Oral E, Kunduk M. Formant-Aware Spectral Analysis of Sustained Vowels of Pathological Breathy Voice. J Voice 2023:S0892-1997(23)00154-6. [PMID: 37302909 DOI: 10.1016/j.jvoice.2023.05.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Revised: 05/07/2023] [Accepted: 05/08/2023] [Indexed: 06/13/2023]
Abstract
OBJECTIVES This paper reports the effectiveness of formant-aware spectral parameters to predict the perceptual breathiness rating. A breathy voice has a steeper spectral slope and higher turbulent noise than a normal voice. Measuring spectral parameters of acoustic signals over lower formant regions is a known approach to capture the properties related to breathiness. This study examines this approach by testing the contemporary spectral parameters and algorithms within the framework, alternate frequency band designs, and vowel effects. METHODS Sustained vowel recordings (/a/, /i/, and /u/) of speakers with voice disorders in the German Saarbruecken Voice Database were considered (n: 367). Recordings with signal irregularities, such as subharmonics or with roughness perception, were excluded from the study. Four speech language pathologists perceptually rated the recordings for breathiness on a 100-point scale, and their averages were used in the analysis. The acoustic spectra were segmented into four frequency bands according to the vowel formant structures. Five spectral parameters (intraband harmonics-to-noise ratio, HNR; interband harmonics ratio, HHR; interband noise ratio, NNR; and interband glottal-to-noise energy, GNE, ratio) were evaluated in each band to predict the perceptual breathiness rating. Four HNR algorithms were tested. RESULTS Multiple linear regression models of spectral parameters, led by the HNRs, were shown to explain up to 85% of the variance in perceptual breathiness ratings. This performance exceeded that of the acoustic breathiness index (82%). Individually, the HNR over the first two formants best explained the variances in the breathiness (78%), exceeding the smoothed cepstrum peak prominence (74%). The performance of HNR was highly algorithm dependent (10% spread). Some vowel effects were observed in the perceptual rating (higher for /u/), predictability (5% lower for /u/), and model parameter selections. CONCLUSIONS Strong per-vowel breathiness acoustic models were found by segmenting the spectrum to isolate the portion most affected by breathiness.
Affiliation(s)
- Takeshi Ikuma
- Department of Otolaryngology-Head and Neck Surgery, Louisiana State University Health Sciences Center, New Orleans, Louisiana; Voice Center, The Our Lady of The Lake Regional Medical Center, Baton Rouge, Louisiana.
- Andrew J McWhorter
- Department of Otolaryngology-Head and Neck Surgery, Louisiana State University Health Sciences Center, New Orleans, Louisiana; Voice Center, The Our Lady of The Lake Regional Medical Center, Baton Rouge, Louisiana
- Evrim Oral
- Biostatistics Program, School of Public Health, Louisiana State University Health Sciences Center, New Orleans, Louisiana
- Melda Kunduk
- Department of Otolaryngology-Head and Neck Surgery, Louisiana State University Health Sciences Center, New Orleans, Louisiana; Voice Center, The Our Lady of The Lake Regional Medical Center, Baton Rouge, Louisiana; Dept. of Communication Sciences & Disorders, Louisiana State University, Baton Rouge, Louisiana
7
Saki N, Nasiri R, Bayat A, Nikakhlagh S, Salmanzadeh S, Khoramshahi H. Relationship Between Vocal Fatigue Index and Acoustic Voice Scales in Patients With Coronavirus Infection. J Voice 2023:S0892-1997(23)00152-2. [PMID: 37277295 DOI: 10.1016/j.jvoice.2023.04.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Revised: 04/27/2023] [Accepted: 04/28/2023] [Indexed: 06/07/2023]
Abstract
OBJECTIVES The voice quality of patients with Coronavirus Disease 2019 (COVID-19) seems to be affected due to lower and upper respiratory involvement. Patient-based voice assessment scales are important clinical measures to diagnose voice disorders and monitor treatment outcomes in COVID-19 patients. This study compared vocal fatigue between COVID-19 patients and those with normal voices. Furthermore, the relationship between vocal fatigue and acoustic voice parameters of COVID-19 patients was evaluated. METHODS This cross-sectional study enrolled 30 laboratory-confirmed patients with COVID-19 (18 males and 12 females) and 30 healthy individuals with normal voices (14 males and 16 females) to compare their respiratory or phonatory parameters. The Persian versions of the Consensus Auditory Perceptual Evaluation of Voice (CAPE-V) and the vocal fatigue index (VFI) were conducted before and after reading the text. The Jitter, shimmer, maximum phonation time, and harmonic-to-noise ratio (HNR) were analyzed by Praat software based on the recorded voices of CAPE-V tasks. The acoustic assessment and VFI questionnaire results were compared between COVID-19 patients and the control group. RESULTS There were significant differences between COVID-19 patients and their healthy counterparts in all VFI subscales (P < 0.001). Moreover, after reading the text, we found significant differences between the two groups regarding Jitter, shimmer, and HNR of /a/ and /i/ vowels (P < 0.05). Our findings also indicated a significant correlation between symptom improvement with rest and acoustic parameters in all tasks, except the Jitter of /a/ before reading the text. CONCLUSION Patients with COVID-19 showed significantly more vocal fatigue than people with normal voices after reading the text. Moreover, there was a significant relationship between Jitter, shimmer, and HNR and the tiredness of voice and physical discomfort subscales of VFI.
Affiliation(s)
- Nader Saki
- Department of Otolaryngology, Head and Neck Surgery, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Khuzestan Province, Iran; Hearing Research Center, Clinical Sciences Research Institute, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Khuzestan Province, Iran
- Reyhane Nasiri
- Hearing Research Center, Clinical Sciences Research Institute, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Khuzestan Province, Iran
- Arash Bayat
- Department of Audiology, School of Rehabilitation Sciences, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Khuzestan Province, Iran; Hearing Research Center, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Khuzestan Province, Iran
- Soheila Nikakhlagh
- Department of Otolaryngology, Head and Neck Surgery, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Khuzestan Province, Iran
- Shokrollah Salmanzadeh
- Infectious and Tropical Diseases Research Center, Health Research Institute, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Khuzestan Province, Iran
- Hassan Khoramshahi
- Mobility Impairment Research Center, Health Research Institute, Babol University of Medical Sciences, Babol, Mazandaran Province, Iran; Department of Speech Therapy, School of Rehabilitation, Babol University of Medical Sciences, Babol, Mazandaran Province, Iran.
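Jitter, shimmer, and HNR recur throughout the abstracts in this list. As a rough illustration of what these measures quantify, the sketch below uses the common textbook definitions: local jitter and shimmer as the mean absolute cycle-to-cycle difference relative to the mean, and HNR as a decibel ratio of harmonic to noise energy. It does not reproduce Praat's algorithms or the study's processing, and the cycle values are invented for illustration.

```python
import math


def local_perturbation_percent(values):
    """Mean absolute difference between consecutive values, relative to their mean, in percent."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100 * mean_diff / mean_value


def hnr_db(harmonic_energy, noise_energy):
    """Harmonics-to-noise ratio in decibels from two positive energy values."""
    return 10 * math.log10(harmonic_energy / noise_energy)


# Hypothetical glottal-cycle data, invented for this example (not from the study).
periods_ms = [5.00, 5.05, 4.95, 5.00]        # cycle durations -> jitter
peak_amplitudes = [1.00, 0.90, 1.10, 1.00]   # cycle peak amplitudes -> shimmer

jitter_percent = local_perturbation_percent(periods_ms)
shimmer_percent = local_perturbation_percent(peak_amplitudes)

print(f"jitter (local):  {jitter_percent:.2f}%")
print(f"shimmer (local): {shimmer_percent:.2f}%")
print(f"HNR for a 99:1 harmonic-to-noise energy split: {hnr_db(99.0, 1.0):.1f} dB")
```

Higher jitter and shimmer and lower HNR all indicate a less periodic voice signal, which is why these measures are often reported together.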
8
Fujiki RB, Thibeault SL. Examining Relationships Between GRBAS Ratings and Acoustic, Aerodynamic and Patient-Reported Voice Measures in Adults With Voice Disorders. J Voice 2023; 37:390-397. [PMID: 33750626 PMCID: PMC8419204 DOI: 10.1016/j.jvoice.2021.02.007] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 12/16/2020] [Revised: 01/31/2021] [Accepted: 02/09/2021] [Indexed: 11/26/2022]
Abstract
OBJECTIVE To determine if auditory-perceptual voice ratings performed using the GRBAS scale correlate with acoustic and aerodynamic measures of voice. A secondary aim was to examine the relationship between GRBAS ratings and patient-reported quality of life scales. METHODS GRBAS ratings, acoustic, aerodynamic and patient-reported quality of life ratings were collected from the University of Wisconsin Madison Voice and Swallow Outcomes Database for 508 adults with voice disorders. Acoustic measures included noise to harmonic ratio, jitter%, shimmer%, highest fundamental frequency (F0) of vocal range, lowest F0 of vocal range, maximum phonation time and dysphonia severity index. Aerodynamic measures included phonation threshold pressure, subglottal pressure, mean transglottal airflow and laryngeal airway resistance. Patient-reported quality of life measures included the Vocal Handicap Index (VHI) and Glottal Function Index (GFI). RESULTS GRBAS ratings were significantly correlated with several acoustic and aerodynamic measures, VHI and GFI. The strongest significant correlations for acoustic measures were observed between GRBAS ratings of overall voice quality and perturbation measures (jitter% r = 0.58, shimmer% r = 0.45, noise to harmonic ratio r = 0.36, Dysphonia Severity Index r = -0.56). The strongest significant correlation for aerodynamic voice measures was observed between GRBAS ratings of breathiness and transglottal airflow (r = 0.23), subglottal pressure (r = 0.49), and phonation threshold pressure (r = 0.26). GRBAS ratings were also significantly correlated with both VHI and the GFI scales. R values were higher for the VHI, but remained largely in low range for both scales. CONCLUSIONS Although GRBAS ratings were significantly correlated with multiple objective voice and patient related quality of life ratings, r values were low. 
These findings support the need for multiple voice measures when performing voice evaluations as no single voice measure was highly correlated with voice quality as measured by the GRBAS scale.
Affiliation(s)
- Robert Brinton Fujiki
- Department of Surgery, University of Wisconsin Madison, Wisconsin Institutes for Medical Research (WIMR) BLDG. 1485, Madison, Wisconsin
- Susan L Thibeault
- Department of Surgery, University of Wisconsin Madison, Wisconsin Institutes for Medical Research (WIMR) BLDG. 1485, Madison, Wisconsin.
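The abstract above concludes that the reported r values were low. One way to see this is to square each coefficient, which gives the proportion of variance the two measures share. The sketch below applies that to the acoustic correlations quoted in the abstract, assuming they are Pearson-type coefficients (the abstract does not say); the function name is illustrative.

```python
# Correlations with GRBAS overall voice quality, as quoted in the abstract.
reported_r = {
    "jitter%": 0.58,
    "shimmer%": 0.45,
    "noise to harmonic ratio": 0.36,
    "Dysphonia Severity Index": -0.56,
}


def shared_variance_percent(r):
    """Percentage of variance shared by two variables with correlation r (r squared)."""
    return 100 * r * r


shared = {name: shared_variance_percent(r) for name, r in reported_r.items()}

for name, value in shared.items():
    print(f"{name}: r = {reported_r[name]:+.2f}, shared variance = {value:.1f}%")
```

Even the strongest correlation listed leaves roughly two thirds of the variance unexplained, which is consistent with the authors' recommendation to use multiple voice measures.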
9
Nguyen DD, Madill C. Auditory-perceptual Parameters as Predictors of Voice Acoustic Measures. J Voice 2023:S0892-1997(23)00088-7. [PMID: 37003863 DOI: 10.1016/j.jvoice.2023.02.030] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/15/2022] [Revised: 02/23/2023] [Accepted: 02/23/2023] [Indexed: 04/03/2023]
Abstract
BACKGROUND Much research has examined the relationship between perceptual and acoustic measures. However, little is known about the prediction values of perceptual measures on an acoustic parameter. AIMS This study utilized simulated and disordered voice samples to investigate the prediction values of breathiness, roughness, and strain ratings on the selection of some time-based and spectral-based measures of voice quality. METHOD This study retrospectively analysed two sets of precollected data. The experimental data had been collected from nine trained speakers manipulating false vocal fold activity, true vocal fold mass, and larynx height. The voice-disordered data had been extracted from a clinical database for 68 patients with muscle tension voice disorders (MTVD). Both data sets had been perceptually rated for breathiness, roughness, and strain. Voice samples (prolonged vowel /ɑ/ and Rainbow Passage readings) had undergone acoustic analysis using Praat for harmonics-to-noise ratio (HNR) and the program "Analysis of Dysphonia in Speech and Voice" (ADSV) for cepstral peak prominence (CPP), Cepstral/Spectral Index of Dysphonia (CSID), and Low/High spectral ratio (L/H ratio). Perceptual parameters were regressed against these acoustic measures to test their prediction values. RESULTS Reliability data showed satisfactory intra- and inter-reliability of perceptual ratings for both data sets. Breathiness significantly predicted CPP (both vocal tasks) and CSID (Rainbow Passage) in experimental data and predicted all the acoustic measures in MTVD data. Roughness significantly predicted HNR, CPP, and CSID in experimental data, and CPP (Rainbow Passage) and CSID (both vocal tasks) in MTVD data. Strain (both vocal tasks) significantly predicted L/H ratio in both data sets. CONCLUSIONS Breathiness ratings predicted selection of HNR, CPP and CSID; roughness ratings predicted selection of CPP and CSID, and strain ratings predicted L/H ratio.
Affiliation(s)
- Duy Duong Nguyen
- Voice Research Laboratory, Sydney School of Health Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Catherine Madill
- Voice Research Laboratory, Sydney School of Health Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia.
10
Echternach M, Nusseck M, Strasding M, Richter B. Differences of Electroglottographical Contact Quotients between Connected Speech and Sustained Phonation in Clinical Measurement of Voice. J Voice 2023:S0892-1997(23)00077-2. [PMID: 36941166 DOI: 10.1016/j.jvoice.2023.02.020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Revised: 02/15/2023] [Accepted: 02/15/2023] [Indexed: 03/23/2023]
Abstract
INTRODUCTION In clinical practice, sustained phonation is mostly used for acoustic voice measurements, while perceptual evaluation is based on connected speech. Since sustained phonation could be associated with the use of the singing voice, and since vocal registers are more relevant for singing rather than speech, it is unclear if vocal registers contribute to observable vocal fold contact differences between sustained phonation and speech. MATERIAL AND METHODS Sustained phonation (vowel [a] on comfortable pitch and loudness) and connected speech (German text: Der Nordwind und die Sonne) were analyzed for 1216 subjects (426 with and 790 without dysphonia) using the Laryngograph system (combining electroglottography and audio recordings). From these samples, fundamental frequency (ƒo), contact quotient (CQ), sound pressure level (SPL) and frequency perturbation (jitter first for sustained and cFx for connected speech) were evaluated. RESULTS Compared to connected speech, the values of ƒo and SPL were higher for sustained phonation. For female voices, ƒo difference was greater than for male voices. At the same time, and only for the females, CQ was lower for the sustained phonation, indicating a register difference. CONCLUSION In order to achieve a better comparability, sustained phonation should be standardized regarding the ƒo and SPL values in correspondence to the ƒo and SPL range of reading a text. This should also reduce the risk of using a different register for different types of phonation.
Affiliation(s)
- Matthias Echternach
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Munich University Hospital (LMU), Munich, Germany.
- Manfred Nusseck
- Institute of Musicians' Medicine, University of Freiburg Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Malin Strasding
- Division of Fixed Prosthodontics and Biomaterials, Université de Genève, Geneve, Switzerland
- Bernhard Richter
- Institute of Musicians' Medicine, University of Freiburg Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
11
Maffei MF, Green JR, Murton O, Yunusova Y, Rowe HP, Wehbe F, Diana K, Nicholson K, Berry JD, Connaghan KP. Acoustic Measures of Dysphonia in Amyotrophic Lateral Sclerosis. Journal of Speech, Language, and Hearing Research: JSLHR 2023; 66:872-887. [PMID: 36802910 PMCID: PMC10205101 DOI: 10.1044/2022_jslhr-22-00363] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/21/2022] [Revised: 10/25/2022] [Accepted: 12/01/2022] [Indexed: 05/25/2023]
Abstract
PURPOSE Identifying efficacious measures to characterize dysphonia in complex neurodegenerative diseases is key to optimal assessment and intervention. This study evaluates the validity and sensitivity of acoustic features of phonatory disruption in amyotrophic lateral sclerosis (ALS). METHOD Forty-nine individuals with ALS (40-79 years old) were audio-recorded while producing a sustained vowel and continuous speech. Perturbation/noise-based (jitter, shimmer, and harmonics-to-noise ratio) and cepstral/spectral (cepstral peak prominence, low-high spectral ratio, and related features) acoustic measures were extracted. The criterion validity of each measure was assessed using correlations with perceptual voice ratings provided by three speech-language pathologists. Diagnostic accuracy of the acoustic features was evaluated using area-under-the-curve analysis. RESULTS Perturbation/noise-based and cepstral/spectral features extracted from /a/ were significantly correlated with listener ratings of roughness, breathiness, strain, and overall dysphonia. Fewer and smaller correlations between cepstral/spectral measures and perceptual ratings were observed for the continuous speech task, although post hoc analyses revealed stronger correlations in speakers with less perceptually impaired speech. Area-under-the-curve analyses revealed that multiple acoustic features, particularly from the sustained vowel task, adequately differentiated between individuals with ALS with and without perceptually dysphonic voices. CONCLUSIONS Our findings support using both perturbation/noise-based and cepstral/spectral measures of sustained /a/ to assess phonatory quality in ALS. Results from the continuous speech task suggest that multisubsystem involvement impacts cepstral/spectral analyses in complex motor speech disorders such as ALS. Further investigation of the validity and sensitivity of cepstral/spectral measures during continuous speech in ALS is warranted.
Affiliation(s)
- Marc F. Maffei
- Department of Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, MA
| | - Jordan R. Green
- Department of Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, MA
- Speech and Hearing Bioscience and Technology Program, Harvard University, Cambridge, MA
| | - Olivia Murton
- Department of Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, MA
| | - Yana Yunusova
- Department of Speech-Language Pathology, University of Toronto, Ontario, Canada
- Hurvitz Brain Sciences Program, Sunnybrook Research Institute, Toronto, Ontario, Canada
- Toronto Rehabilitation Institute, University Health Network, Ontario, Canada
| | - Hannah P. Rowe
- Department of Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, MA
| | - Farah Wehbe
- Department of Speech-Language Pathology, University of Toronto, Ontario, Canada
- Hurvitz Brain Sciences Program, Sunnybrook Research Institute, Toronto, Ontario, Canada
| | - Kathleen Diana
- Department of Neurology, Neurological Clinical Research Institute, Massachusetts General Hospital, Boston
| | - Katharine Nicholson
- Department of Neurology, Neurological Clinical Research Institute, Massachusetts General Hospital, Boston
| | - James D. Berry
- Department of Neurology, Neurological Clinical Research Institute, Massachusetts General Hospital, Boston
| | - Kathryn P. Connaghan
- Department of Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, MA
|
12
|
Yaslıkaya S, Geçkil AA, Birişik Z. Is There a Relationship between Voice Quality and Obstructive Sleep Apnea Severity and Cumulative Percentage of Time Spent at Saturations below Ninety Percent: Voice Analysis in Obstructive Sleep Apnea Patients. MEDICINA (KAUNAS, LITHUANIA) 2022; 58:medicina58101336. [PMID: 36295497 PMCID: PMC9608866 DOI: 10.3390/medicina58101336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2022] [Revised: 09/08/2022] [Accepted: 09/19/2022] [Indexed: 11/19/2022]
Abstract
Background and Objectives: The apnea-hypopnea index (AHI) is the most important criterion in determining the severity of obstructive sleep apnea (OSA), while the cumulative percentage of time spent at oxygen saturations below 90% during polysomnography (CT90%) is important in determining the severity of hypoxemia. As hypoxemia increases, inflammation will also increase in OSA. Inflammation in the respiratory tract may affect phonation. We aimed to determine the effects of the degree of OSA and CT90% on phonation. Materials and Methods: The patients were between the ages of 18 and 60 years and were divided into four groups: normal, mild, moderate, and severe OSA. Patients were asked to say the vowels /α:/ and /i:/ for 5 s for voice recording. Maximum phonation time (MPT) was recorded. Using the Praat voice analysis program, Jitter%, Shimmer%, harmonics-to-noise ratio (HNR), and f0 values were obtained. Results: Seventy-two patients were included. For the vowel /α:/, there was a significant difference in Jitter%, Shimmer%, and HNR measurements between the 1st and the 4th group (p < 0.001, p < 0.001, and p < 0.001, respectively) and a correlation between CT90% and Shimmer% and HNR values (p < 0.001 and p < 0.021, respectively). For the vowel /i:/, there was a significant difference in f0 values between the 1st group and the 2nd and 4th groups (p < 0.028 and p < 0.015, respectively), and in Jitter%, Shimmer%, and HNR measurements between the 1st and 4th group (p < 0.04, p < 0.000, and p < 0.000, respectively), and a correlation between CT90% and Shimmer% and HNR values (p < 0.016 and p < 0.003, respectively). The difference in MPT was significant between the 1st group and the 3rd and 4th groups (p < 0.03 and p < 0.003, respectively). Conclusions: Glottic phonation can be affected, especially in patients whose AHI scores are ≥15. Voice quality can decrease as the degree of OSA increases. The increase in CT90% can be associated with the worsening of voice and can be used as a predictor in the evaluation of voice disorders in the future.
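As a rough illustration of what Jitter% and Shimmer% express, the sketch below uses the widely used "local" definitions: the mean absolute difference between consecutive cycles, divided by the mean value. The abstract does not state which Praat settings were used, so this is an assumption; the function name and the period and amplitude values are hypothetical.

```python
def local_perturbation_percent(values):
    """Mean absolute difference between consecutive cycle values,
    expressed as a percentage of the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


# Hypothetical glottal cycle measurements (not study data).
periods_s = [0.0100, 0.0101, 0.0099, 0.0100, 0.0102]   # cycle durations in seconds
peak_amplitudes = [0.80, 0.78, 0.82, 0.79, 0.81]        # per-cycle peak amplitudes

jitter_percent = local_perturbation_percent(periods_s)
shimmer_percent = local_perturbation_percent(peak_amplitudes)
print(f"jitter = {jitter_percent:.2f}%, shimmer = {shimmer_percent:.2f}%")
```

Applied to cycle durations the function gives a jitter-like measure, and applied to cycle amplitudes it gives a shimmer-like measure; a perfectly regular voice would score 0% on both.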
Affiliation(s)
- Serhat Yaslıkaya
- Department of Otorhinolaryngology, Faculty of Medicine, Adıyaman University, Adıyaman 02100, Turkey
- Correspondence: ; Tel.: +90-4162161015
| | - Ayşegül Altıntop Geçkil
- Department of Chest Diseases, Faculty of Medicine, Malatya Turgut Özal University, Malatya 44210, Turkey
| | - Zehra Birişik
- Department of Speech and Language Therapy, Malatya Training and Research Hospital, Malatya 44000, Turkey
|
13
|
Ikuma T, Story B, McWhorter AJ, Adkins L, Kunduk M. Harmonics-to-noise ratio estimation with deterministically time-varying harmonic model for pathological voice signals. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2022; 152:1783. [PMID: 36182331 DOI: 10.1121/10.0014177] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/04/2022] [Accepted: 09/01/2022] [Indexed: 06/16/2023]
Abstract
The harmonics-to-noise ratio (HNR) and other spectral noise parameters are important in clinical objective voice assessment because they can indicate the presence of nonharmonic phenomena, which are tied to the perception of hoarseness or breathiness. Existing HNR estimators assume that the voice signal is nearly periodic (fixed over a short period), although voice pathology can induce involuntary slow modulation that voids this assumption. This paper proposes the use of a deterministically time-varying harmonic model to improve the HNR measurements. To estimate the time-varying model, a two-stage iterative least squares algorithm is proposed to reduce model overfitting. The efficacy of the proposed HNR estimator is demonstrated with synthetic signals, simulated tremor signals, and recorded acoustic signals. Results indicate that the proposed algorithm can produce consistent HNR measures as the extent and rate of tremor are varied.
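For context, the fixed-period assumption behind conventional HNR estimators can be sketched as follows. This is not the time-varying harmonic model proposed in the paper. It is a simplified autocorrelation-style estimate that assumes the period is known and constant, with HNR = 10 log10(r / (1 - r)), where r is the normalized correlation at a lag of one period. The function names and the synthetic signal are hypothetical.

```python
import math
import random


def hnr_db(signal, period_samples):
    """Estimate HNR (dB) from the normalized correlation between the
    signal and a copy of itself shifted by one (assumed fixed) period."""
    a = signal[:-period_samples]
    b = signal[period_samples:]
    energy = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if energy == 0.0:
        raise ValueError("signal has no energy")
    r = sum(x * y for x, y in zip(a, b)) / energy
    r = min(max(r, 1e-9), 1.0 - 1e-9)  # clamp so the logarithm stays finite
    return 10.0 * math.log10(r / (1.0 - r))


def synthetic_vowel(noise_sd, n=4000, period=100, seed=0):
    """Sine wave plus Gaussian noise; a crude stand-in for a sustained vowel."""
    rng = random.Random(seed)
    return [math.sin(2 * math.pi * i / period) + rng.gauss(0.0, noise_sd)
            for i in range(n)]


print(f"low noise:  {hnr_db(synthetic_vowel(0.1), 100):.1f} dB")
print(f"high noise: {hnr_db(synthetic_vowel(0.5), 100):.1f} dB")
```

Because the lag is fixed, any slow drift in the period (for example, tremor) lowers the correlation and is counted as noise, which is the limitation the abstract describes.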
Affiliation(s)
- Takeshi Ikuma
- Department of Otolaryngology-Head and Neck Surgery, Louisiana State University Health Sciences Center, New Orleans, Louisiana 70112, USA
| | - Brad Story
- Department of Speech, Language, and Hearing Sciences, University of Arizona, Tucson, Arizona 85721, USA
| | - Andrew J McWhorter
- Department of Otolaryngology-Head and Neck Surgery, Louisiana State University Health Sciences Center, New Orleans, Louisiana 70112, USA
| | - Lacey Adkins
- Department of Otolaryngology-Head and Neck Surgery, Louisiana State University Health Sciences Center, New Orleans, Louisiana 70112, USA
| | - Melda Kunduk
- Department of Communication Disorders, Louisiana State University, Baton Rouge, Louisiana 70803, USA
|
14
|
Park Y, Anand S, Ozmeral EJ, Shrivastav R, Eddins DA. Predicting Perceived Vocal Roughness Using a Bio-Inspired Computational Model of Auditory Temporal Envelope Processing. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2022; 65:2748-2758. [PMID: 35867607 PMCID: PMC9911094 DOI: 10.1044/2022_jslhr-22-00101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2022] [Revised: 04/14/2022] [Accepted: 04/25/2022] [Indexed: 06/15/2023]
Abstract
PURPOSE Vocal roughness is present in many voice disorders, but the assessment of roughness mainly depends on subjective auditory-perceptual evaluation and lacks acoustic correlates. This study aimed to apply the concept of roughness in general sound quality perception to vocal roughness assessment and to characterize the relationship between vocal roughness and temporal envelope fluctuation measures obtained from an auditory model. METHOD Ten /ɑ/ recordings with a wide range of roughness were selected from an existing database. Ten listeners rated the roughness of the recordings in a single-variable matching task. Temporal envelope fluctuations of the recordings were analyzed with an auditory processing model of amplitude modulation that utilizes a modulation filterbank of different modulation frequencies. Pitch strength and the smoothed cepstral peak prominence (CPPS) were also obtained for comparison. RESULTS Individual simple regression models yielded envelope standard deviation from a modulation filter with a low center frequency (64.3 Hz) as a statistically significant predictor of vocal roughness with a strong coefficient of determination (r² = .80). Pitch strength and CPPS were not significant predictors of roughness. CONCLUSION This result supports the possible utility of envelope fluctuation measures from an auditory model as objective correlates of vocal roughness.
Affiliation(s)
- Yeonggwang Park
- Department of Communication Sciences and Disorders, University of South Florida, Tampa
| | - Supraja Anand
- Department of Communication Sciences and Disorders, University of South Florida, Tampa
| | - Erol J. Ozmeral
- Department of Communication Sciences and Disorders, University of South Florida, Tampa
| | - Rahul Shrivastav
- Office of the Provost & Executive Vice President, Indiana University Bloomington
| | - David A. Eddins
- Department of Communication Sciences and Disorders, University of South Florida, Tampa
|
15
|
Ertan E, Gürvit HI, Hanağası HH, Bilgiç B, Tunçer MA, Yılmaz C. Intensive voice treatment (the Lee Silverman Voice Treatment [LSVT®LOUD]) for individuals with Wilson's disease and adult cerebral palsy: two case reports. LOGOP PHONIATR VOCO 2021; 47:262-270. [PMID: 34287100 DOI: 10.1080/14015439.2021.1951348] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/20/2022]
Abstract
Objective: In this case report, we aimed to examine the effects of an intensive voice treatment (the Lee Silverman Voice Treatment [LSVT®LOUD]) for Wilson's disease (WD), adult cerebral palsy (CP), and dysarthria. Method: The participants received LSVT®LOUD four times a week for 4 weeks. Acoustic and perceptual (GRBAS) analyses were performed, and data from the Voice Handicap Index (VHI) were obtained before and after treatment. Results: Apart from the harmonics-to-noise ratio (HNR) value (dB) of the participant with WD, both participants' fundamental frequency (Hz), jitter (%), and shimmer (%) values showed significant differences (p < .05) after therapy. Both participants showed significant improvements (p < .05) in the duration (s) and the sound pressure level (dB, SPL) of sustained vowel phonation (/a/), in SPL (dB) of pitch range (high and low /a/), and in reading and conversation (p < .01). There was a positive improvement in the high-frequency values (Hz) of both participants but not in the low-frequency values (Hz) of the participant with WD. Perceptual analysis with GRBAS judgements of the sustained vowel (/a/) and paragraph reading of the two participants also showed improvement. After therapy, the perceived loudness of the participants' voices increased. Conclusions: The findings provide some preliminary observations that individuals with WD and adults with CP can respond positively to intensive speech treatment such as LSVT®LOUD. Further studies are needed to investigate speech treatments specific to WD and adult CP.
Affiliation(s)
- Esra Ertan
- Institut für Deutsche Sprache und Linguistik, Humboldt-Universität zu Berlin, Berlin, Germany
| | - Hakan I Gürvit
- Behavioral Neurology and Movement Disorders Unit, Department of Neurology, Istanbul Faculty of Medicine, Istanbul University, Istanbul, Turkey
| | - Haşmet H Hanağası
- Behavioral Neurology and Movement Disorders Unit, Department of Neurology, Istanbul Faculty of Medicine, Istanbul University, Istanbul, Turkey
| | - Başar Bilgiç
- Behavioral Neurology and Movement Disorders Unit, Department of Neurology, Istanbul Faculty of Medicine, Istanbul University, Istanbul, Turkey
| | - Müge A Tunçer
- Department of Speech and Language Therapy, Faculty of Health Science, Sıtkı Koçman University, Muğla, Turkey
| | - Cemil Yılmaz
- Department of Speech and Language Therapy, Faculty of Health Science, Anadolu University, Eskişehir, Turkey
|
16
|
Labuschagne IB, Ciocca V. Noise thresholds in harmonic series maskers. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2021; 149:2492. [PMID: 33940897 DOI: 10.1121/10.0004130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2020] [Accepted: 03/15/2021] [Indexed: 06/12/2023]
Abstract
The presence of noise is a salient cue to the perception of breathiness and aspiration in speech sounds. The detection of noise within harmonic series (maskers) composed of unresolved components was found to depend on the fundamental frequency (fo) and the overall level of the masker [Gockel, Moore, and Patterson (2002). J. Acoust. Soc. Am., 111 (6), 2759-2770]. In the present study, noise detection thresholds were measured as a function of the frequency range, the fo, and the overall level of harmonic maskers. Frequency range was specified in equivalent rectangular bandwidth (ERB) units (3-13, 13-23, 23-33, or 3-33 ERBs). The results were consistent with the idea that listeners rely on spectral cues when maskers comprise only resolved components (3-13 ERBs), and on temporal (dip listening) cues when maskers contain only unresolved components (23-33 ERBs). Noise detection thresholds were generally lower when masker level was high (70 dBA) than when it was low (50 dBA). Masker fo affected thresholds only when listeners relied on spectral cues for noise detection. With the wideband (3-33 ERBs) masker, listeners likely detected noise by focusing on the frequency band (23-33 ERBs) with the most advantageous noise-to-harmonic ratio.
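To give a sense of what the ERB ranges correspond to in hertz, the sketch below converts ERB-number values to frequency. It assumes the commonly cited Glasberg and Moore ERB-number formula, ERB-number = 21.4 log10(4.37 f / 1000 + 1) with f in Hz; the abstract does not say which formula the authors used, and the function names are hypothetical.

```python
import math


def hz_to_erb_number(freq_hz):
    """ERB-number scale value for a frequency in Hz (assumed formula)."""
    return 21.4 * math.log10(4.37 * freq_hz / 1000.0 + 1.0)


def erb_number_to_hz(erb_number):
    """Inverse of hz_to_erb_number."""
    return (10.0 ** (erb_number / 21.4) - 1.0) * 1000.0 / 4.37


# Frequency ranges named in the abstract, converted under the assumed formula.
for low, high in [(3, 13), (13, 23), (23, 33), (3, 33)]:
    print(f"{low}-{high} ERBs: "
          f"{erb_number_to_hz(low):.0f}-{erb_number_to_hz(high):.0f} Hz")
```

Equal steps on this scale cover progressively wider frequency spans, which is why the highest band (23-33 ERBs) contains the unresolved harmonics described in the abstract.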
Affiliation(s)
- Ilse B Labuschagne
- School of Audiology and Speech Sciences, The University of British Columbia, 2177 Wesbrook Mall, Vancouver, British Columbia, V6T 1Z3, Canada
| | - Valter Ciocca
- School of Audiology and Speech Sciences, The University of British Columbia, 2177 Wesbrook Mall, Vancouver, British Columbia, V6T 1Z3, Canada
|
17
|
Fujiki RB, Thibeault SL. The Relationship Between Auditory-Perceptual Rating Scales and Objective Voice Measures in Children With Voice Disorders. AMERICAN JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2021; 30:228-238. [PMID: 33439742 DOI: 10.1044/2020_ajslp-20-00188] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 06/12/2023]
Abstract
Purpose The purpose of this study was to determine concurrent validity of the Grade, Roughness, Breathiness, Asthenia, and Strain (GRBAS) and Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V) auditory-perceptual scales in children with voice disorders. A secondary purpose was to determine correlation between the GRBAS, CAPE-V, and objective voice measures. Method GRBAS and CAPE-V ratings and acoustic and aerodynamic measures were collected from the University of Wisconsin-Madison Voice and Swallow Outcomes Database. Correlations between CAPE-V and GRBAS ratings were calculated for overall severity of dysphonia, roughness, breathiness, and strain. Correlations between auditory-perceptual voice ratings and objective voice measures were also examined. Results One hundred thirty GRBAS and CAPE-V auditory-perceptual ratings were significantly correlated for overall severity, roughness, breathiness, and strain. r² values were highest for overall severity of dysphonia (r² = .75) and lowest for strain (r² = .54). CAPE-V and GRBAS ratings were largely associated with similar acoustic and aerodynamic measures. The highest correlations were observed for auditory-perceptual ratings of breathiness and jitter% (CAPE-V r² = .44, GRBAS r² = .44), shimmer% (CAPE-V r² = .45, GRBAS r² = .45), noise-to-harmonic ratio (CAPE-V r² = .42, GRBAS r² = .40), fundamental frequency (CAPE-V r² = .47, GRBAS r² = .44), and maximum phonation time (CAPE-V r² = .56, GRBAS r² = .51). Akaike information criterion values indicated that CAPE-V ratings were more strongly correlated with objective voice measures than GRBAS ratings. Conclusions CAPE-V and GRBAS scales have concurrent validity in children with voice disorders. CAPE-V ratings are more strongly correlated with acoustic and aerodynamic voice measures.
|
18
|
Barone NA, Ludlow CL, Tellis CM. Acoustic and Aerodynamic Comparisons of Voice Qualities Produced After Voice Training. J Voice 2021; 35:157.e11-157.e21. [DOI: 10.1016/j.jvoice.2019.07.011] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/29/2019] [Revised: 07/11/2019] [Accepted: 07/15/2019] [Indexed: 10/26/2022]
|
19
|
Asiaee M, Vahedian-Azimi A, Atashi SS, Keramatfar A, Nourbakhsh M. Voice Quality Evaluation in Patients With COVID-19: An Acoustic Analysis. J Voice 2020; 36:879.e13-879.e19. [PMID: 33051108 PMCID: PMC7528943 DOI: 10.1016/j.jvoice.2020.09.024] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 08/02/2020] [Revised: 09/26/2020] [Accepted: 09/29/2020] [Indexed: 01/19/2023]
Abstract
Objectives With the COVID-19 outbreak around the globe and its potential effect on infected patients’ voice, this study set out to evaluate and compare the acoustic parameters of voice between healthy and infected people in an objective manner. Methods Voice samples of 64 COVID-19 patients and 70 healthy Persian speakers who produced a sustained vowel /a/ were evaluated. Between-group comparisons of the data were performed using the two-way ANOVA and Wilcoxon's rank-sum test. Results The results revealed significant differences in CPP, HNR, H1H2, F0SD, jitter, shimmer, and MPT values between COVID-19 patients and the healthy participants. There were also significant differences between the male and female participants in all the acoustic parameters, except jitter, shimmer and MPT. No interaction was observed between gender and health status in any of the acoustic parameters. Conclusion The statistical analysis of the data revealed significant differences between the experimental and control groups in this study. Changes in the acoustic parameters of voice are caused by the insufficient airflow, and increased aperiodicity, irregularity, signal perturbation and level of noise, which are the consequences of pulmonary and laryngological involvements in patients with COVID-19.
Affiliation(s)
- Maral Asiaee
- Department of Linguistics, Faculty of Literature, Alzahra University, Tehran, Iran
| | - Amir Vahedian-Azimi
- Trauma research Center, Nursing Faculty, Baqiyatallah University of Medical Sciences, Tehran, Iran
| | - Seyed Shahab Atashi
- Department of Food and Drug control, Jundishapour University of Medical Sciences, Ahvaz, Iran
| | | | - Mandana Nourbakhsh
- Department of Linguistics, Faculty of Literature, Alzahra University, Tehran, Iran.
|
20
|
Barsties V Latoszek B, Kim GH, Delgado Hernández J, Hosokawa K, Englert M, Neumann K, Hetjens S. The validity of the Acoustic Breathiness Index in the evaluation of breathy voice quality: A Meta-Analysis. Clin Otolaryngol 2020; 46:31-40. [PMID: 32770718 DOI: 10.1111/coa.13629] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 06/05/2020] [Revised: 07/03/2020] [Accepted: 07/31/2020] [Indexed: 02/01/2023]
Abstract
BACKGROUND The evaluation of voice quality with acoustic measurements is useful to objectify the diagnostic process. Breathiness in particular has been widely evaluated, and the Acoustic Breathiness Index (ABI) might have promising features. OBJECTIVE OF REVIEW The goal of the present meta-analysis is to quantify, from existing cross-validation studies, the evidence for the diagnostic accuracy of ABI, including its sensitivity and specificity. TYPE OF REVIEW Meta-analysis. SEARCH STRATEGY We searched MEDLINE, Google Scholar, and Science Citation Index, and performed a manual search, for the term Acoustic Breathiness Index from inception to February 2020. Studies were included if they used an equal proportion of continuous speech and sustained vowel segments, recording hardware of a sufficient standard for voice signal analyses, the software Praat for signal processing with the customised Praat script, and two groups of subjects (vocally healthy and voice-disordered). Furthermore, the diagnostic accuracy of ABI had to be measured. EVALUATION METHOD The primary outcome variable was ABI. The score ranged from 0 to 10, with varying thresholds according to different languages to determine the absence or presence of breathiness. A meta-analysis was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-analyses of diagnostic test accuracy study guidelines. Data were extracted, and the risk of bias was assessed using the QUADAS-2 tool. The pooled sensitivity and specificity of ABI were determined using a summary receiver operating characteristic (SROC) approach, which was also used to calculate a weighted threshold value of ABI with its sensitivity and specificity. RESULTS A total of 34 unique citations were screened and 10 full-text articles were reviewed, of which six studies were included. In total, 3603 voice samples were considered for further analysis, comprising 467 vocally healthy and 3136 voice-disordered voice samples. The pooled sensitivity was 0.84 (95% CI, 0.83-0.85), and the pooled specificity was 0.92 (95% CI, 0.89-0.94). The area under the SROC curve of this analysis showed an excellent value of 0.94. The weighted ABI threshold was determined at 3.40 (sensitivity: 0.86, 95% CI, 0.84-0.87; specificity: 0.90, 95% CI, 0.88-0.92). CONCLUSIONS The results confirm the ABI as a robust and valid objective measure for evaluating breathiness.
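How sensitivity and specificity follow from a score threshold can be shown with a small sketch. The threshold of 3.40 is the weighted value reported in the abstract; the ABI scores, the function name, and the choice to treat scores at or above the threshold as "breathy" are assumptions made for illustration.

```python
def sensitivity_specificity(disordered_scores, healthy_scores, threshold):
    """Return (sensitivity, specificity) when a score at or above the
    threshold is classified as breathy (assumed decision rule)."""
    true_pos = sum(1 for s in disordered_scores if s >= threshold)
    false_neg = len(disordered_scores) - true_pos
    true_neg = sum(1 for s in healthy_scores if s < threshold)
    false_pos = len(healthy_scores) - true_neg
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical ABI scores on the 0-10 scale (not data from the meta-analysis).
voice_disordered = [5.0, 4.0, 3.0, 6.0]
vocally_healthy = [1.0, 2.0, 3.5, 2.5]

sens, spec = sensitivity_specificity(voice_disordered, vocally_healthy, 3.40)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Raising the threshold trades sensitivity for specificity, which is why the abstract reports both values together for the weighted threshold.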
Affiliation(s)
- Ben Barsties V Latoszek
- Department of Phoniatrics and Pediatric Audiology, University Hospital Münster, Westphalian Wilhelm University, Münster, Germany; Speech-Language Pathology, SRH University of Applied Health Sciences, Düsseldorf, Germany
| | - Geun-Hyo Kim
- Department of Otorhinolaryngology-Head and Neck Surgery and Biomedical Research Institute, Pusan National University Hospital, Busan, South Korea
| | | | - Kiyohito Hosokawa
- Department of Otorhinolaryngology, Japan Community Health Care Organization, Osaka Hospital, Osaka, Japan; Department of Otorhinolaryngology, Osaka Police Hospital, Osaka, Japan; Department of Otorhinolaryngology and Head & Neck Surgery, Osaka University Graduate School of Medicine, Osaka, Japan
| | - Marina Englert
- Department of Communication Disorders, UNIFESP - Universidade Federal de São Paulo, São Paulo, Brazil; CEV, Centro de Estudos da Voz, São Paulo, Brazil
| | - Katrin Neumann
- Speech-Language Pathology, SRH University of Applied Health Sciences, Düsseldorf, Germany
| | - Svetlana Hetjens
- Department of Statistics, Medical Faculty Mannheim, Ruprecht Karls University of Heidelberg, Mannheim, Germany
|
21
|
Wu CH, Chan RW. Effects of a 6-Week Straw Phonation in Water Exercise Program on the Aging Voice. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2020; 63:1018-1032. [PMID: 32302246 DOI: 10.1044/2020_jslhr-19-00124] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 06/11/2023]
Abstract
Purpose Semi-occluded vocal tract (SOVT) exercises with tubes or straws have been widely used for a variety of voice disorders. Yet, the effects of longer periods of SOVT exercises (lasting for weeks) on the aging voice are not well understood. This study investigated the effects of a 6-week straw phonation in water (SPW) exercise program. Method Thirty-seven elderly subjects with self-perceived voice problems were assigned into two groups: (a) SPW exercises with six weekly sessions and home practice (experimental group) and (b) vocal hygiene education (control group). Before and after intervention (2 weeks after the completion of the exercise program), acoustic analysis, auditory-perceptual evaluation, and self-assessment of vocal impairment were conducted. Results Analysis of covariance revealed significant differences between the two groups in smoothed cepstral peak prominence measures, harmonics-to-noise ratio, the auditory-perceptual parameter of breathiness, and Voice Handicap Index-10 scores postintervention. No significant differences between the two groups were found for other measures. Conclusions Our results supported the positive effects of SOVT exercises for the aging voice, with a 6-week SPW exercise program being a clinical option. Future studies should involve long-term follow-up and additional outcome measures to better understand the efficacy of SOVT exercises, particularly SPW exercises, for the aging voice.
Affiliation(s)
- Chia-Hsin Wu
- Department of Speech Language Pathology and Audiology, National Taipei University of Nursing and Health Sciences, Taiwan
| | - Roger W Chan
- Department of Speech Language Pathology and Audiology, National Taipei University of Nursing and Health Sciences, Taiwan
|
22
|
Principal component analysis of the spectrogram of the speech signal: Interpretation and application to dysarthric speech. COMPUT SPEECH LANG 2020. [DOI: 10.1016/j.csl.2019.07.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 11/19/2022]
|
23
|
Park JW, Kim B, Oh JH, Kang TK, Kim DY, Woo JH. Study for Correlation between Objective and Subjective Voice Parameters in Patients with Dysphonia. Journal of The Korean Society of Laryngology, Phoniatrics and Logopedics 2019. [DOI: 10.22469/jkslp.2019.30.2.118] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022]
|
24
|
Braun B, Dehé N, Neitsch J, Wochner D, Zahner K. The Prosody of Rhetorical and Information-Seeking Questions in German. LANGUAGE AND SPEECH 2019; 62:779-807. [PMID: 30563430 DOI: 10.1177/0023830918816351] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 06/09/2023]
Abstract
This paper reports on the prosody of rhetorical questions (RQs) and information-seeking questions (ISQs) in German for two question types: polar questions and constituent questions (henceforth "wh-questions"). The results are as follows: Phonologically, polar RQs were mainly realized with H-% (high plateau), while polar ISQs mostly ended in H-^H% (high-rise). Wh-RQs almost exclusively terminated in a low edge tone, whereas wh-ISQs allowed for more tonal variation (L-%, L-H%, H-^H%). Irrespective of question type, RQs were mainly produced with L*+H accents. Phonetically, RQs were more often realized with breathy voice quality than ISQs, in particular in the beginning of the interrogative. Furthermore, they were produced with longer constituent durations than ISQs, in particular at the end of the interrogative. While the difference between RQs and ISQs is reflected in the intonational terminus of the utterance, this does not happen in the way suggested in the semantic literature, and in addition, accent type and phonetic parameters also play a role. Crucially, a simple distinction between rising and falling intonation is insufficient to capture the realization of the different illocution types (RQs, ISQs), against frequent claims in the semantic and pragmatic literature. We suggest alternative ways to interpret the findings.
|
25
|
O'Leary D, Lee A, O'Toole C, Gibbon F. Perceptual and acoustic evaluation of speech production in Down syndrome: A case series. CLINICAL LINGUISTICS & PHONETICS 2019; 34:72-91. [PMID: 31345071 DOI: 10.1080/02699206.2019.1611925] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/25/2019] [Revised: 04/18/2019] [Accepted: 04/22/2019] [Indexed: 06/10/2023]
Abstract
People with Down syndrome (DS) can experience difficulties with speech production that can affect speech intelligibility. In previous research, both perceptual and acoustic analyses have shown that people with DS can have difficulties with speech production in the areas of respiration, phonation, articulation, resonance, and prosody. However, these studies have investigated various aspects of speech production separately. No single study has examined all components of speech production and considered how these components, if impaired, may affect speech intelligibility in DS. This paper presents the data of three male speakers with DS and three age- and gender-matched controls as a case series. The participants' speech samples were analysed using a number of perceptual and acoustic parameters across the major components of speech production: respiration, phonation, articulation, resonance, and prosody. Results showed that different areas of speech production were affected in each participant, to different extents. The main perceptual difficulties included poor voice quality, monopitch, and monoloudness. Acoustic findings showed a higher mean F0, lower harmonics-to-noise ratio, and longer voice onset times. These preliminary findings show that people with DS can present with mixed profiles of speech production that can affect speech intelligibility. When assessing speech production in DS, clinicians need to evaluate all components of speech production and consider how they may be affecting intelligibility.
Affiliation(s)
- Deirdre O'Leary
- Department of Speech and Hearing Sciences, University College Cork, Cork, Ireland
| | - Alice Lee
- Department of Speech and Hearing Sciences, University College Cork, Cork, Ireland
| | - Ciara O'Toole
- Department of Speech and Hearing Sciences, University College Cork, Cork, Ireland
| | - Fiona Gibbon
- Department of Speech and Hearing Sciences, University College Cork, Cork, Ireland
|
26
|
Anand S, Skowronski MD, Shrivastav R, Eddins DA. Perceptual and Quantitative Assessment of Dysphonia Across Vowel Categories. J Voice 2019; 33:473-481. [DOI: 10.1016/j.jvoice.2017.12.018] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/17/2017] [Accepted: 12/21/2017] [Indexed: 10/16/2022]
|
27
|
Schickhofer L, Malinen J, Mihaescu M. Compressible flow simulations of voiced speech using rigid vocal tract geometries acquired by MRI. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2019; 145:2049. [PMID: 31046346 DOI: 10.1121/1.5095250] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/04/2018] [Accepted: 03/07/2019] [Indexed: 05/27/2023]
Abstract
Voiced speech consists mainly of the source signal that is frequency weighted by the acoustic filtering of the upper airways and vortex-induced sound through perturbation in the flow field. This study investigates the flow instabilities leading to vortex shedding and the importance of coherent structures in the supraglottal region downstream of the vocal folds for the far-field sound signal. Large eddy simulations of the compressible airflow through the glottal constriction are performed in realistic geometries obtained from three-dimensional magnetic resonance imaging data. Intermittent flow separation through the glottis is shown to introduce unsteady surface pressure through impingement of vortices. Additionally, dominant flow instabilities develop in the shear layer associated with the glottal jet. The aerodynamic perturbations in the near field and the acoustic signal in the far field are examined by means of spatial and temporal Fourier analysis. Furthermore, the acoustic sources due to the unsteady supraglottal flow are identified with the aid of surface spectra, and critical regions of amplification of the dominant frequencies of the investigated vowel geometries are identified.
Affiliation(s)
- Lukas Schickhofer
  - Department of Mechanics, Linné FLOW Centre, KTH Royal Institute of Technology, Stockholm, SE-10044, Sweden
- Jarmo Malinen
  - Department of Mathematics and Systems Analysis, Aalto University, Aalto, FI-00076, Finland
- Mihai Mihaescu
  - Department of Mechanics, Linné FLOW Centre, KTH Royal Institute of Technology, Stockholm, SE-10044, Sweden
|
28
|
Shoup-Knox ML, Ostrander GM, Reimann GE, Pipitone RN. Fertility-Dependent Acoustic Variation in Women's Voices Previously Shown to Affect Listener Physiology and Perception. EVOLUTIONARY PSYCHOLOGY 2019; 17:1474704919843103. [PMID: 31023082 PMCID: PMC10358420 DOI: 10.1177/1474704919843103] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/13/2018] [Accepted: 03/15/2019] [Indexed: 11/16/2022] Open
Abstract
Previous research demonstrates that listeners perceive women's voices as more attractive when recorded at high compared to low fertility phases of the menstrual cycle. This effect has been replicated with multiple sets of voice recordings, but one stimulus set has shown particularly robust replications. For this set, first collected by Pipitone and Gallup (2008), women were recorded counting from 1 to 10 on approximately the same day and time once a week for 4 weeks. Studies using these recordings have repeatedly shown that the voices of naturally cycling women recorded at high fertility are rated as more attractive than the voices of the same women at low fertility. Additionally, these stimuli have been shown to elicit autonomic nervous system arousal and precipitate a rise in testosterone levels among listeners. Although previous studies have examined the acoustic properties of voices across the menstrual cycle, they reach little consensus. The current study evaluates Pipitone and Gallup's voice stimuli from an acoustic perspective, analyzing specific vocal characteristics of both naturally cycling women and women taking hormonal contraceptives. Results show that among naturally cycling women, variation in vocal amplitude (shimmer) was significantly lower in high fertility recordings compared to the women's voices at low fertility. Harmonics-to-noise ratio and variation in voice pitch (jitter) also fluctuated systematically across voices sampled at different times during the menstrual cycle, though these effects were not statistically significant. It is possible that these acoustic changes could account for some of the replicated perceptual, hormonal, and physiological changes documented in prior literature using these voice stimuli.
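The abstract above describes shimmer as variation in vocal amplitude and jitter as variation in voice pitch, without stating which formulas were used. As a rough illustration only, the sketch below assumes the common "local" definitions: the mean absolute difference between consecutive cycles divided by the mean value. The function name and the cycle measurements are hypothetical and are not data from the study.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive values, divided by the mean value.

    Applied to glottal cycle durations this gives a local jitter estimate;
    applied to per-cycle peak amplitudes it gives a local shimmer estimate.
    (Assumed definition; the abstract does not specify one.)
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


# Hypothetical cycle-by-cycle measurements, for illustration only.
periods_ms = [5.0, 5.1, 4.9, 5.0]   # duration of each glottal cycle
amplitudes = [1.0, 0.9, 1.1, 1.0]   # peak amplitude of each cycle

jitter = local_perturbation(periods_ms)
shimmer = local_perturbation(amplitudes)
print(f"local jitter: {jitter:.2%}, local shimmer: {shimmer:.2%}")
```

Both measures are zero for a perfectly regular signal and grow as consecutive cycles differ more from one another.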
Affiliation(s)
- R. Nathan Pipitone
  - Department of Psychology, Florida Gulf Coast University, Fort Myers, FL, USA
|
29
|
Madill C, Nguyen DD, Yick-Ning Cham K, Novakovic D, McCabe P. The Impact of Nasalance on Cepstral Peak Prominence and Harmonics-to-Noise Ratio. Laryngoscope 2018; 129:E299-E304. [PMID: 30585334 PMCID: PMC6767134 DOI: 10.1002/lary.27685] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 10/22/2018] [Indexed: 11/10/2022]
Abstract
Objectives/Hypothesis Cepstral peak prominence (CPP) has been reported as a reliable measure of dysphonia and a preferred alternative to harmonics‐to‐noise ratio (HNR). However, CPP has been observed to be sensitive to articulatory variation and vocal intensity. The aim of this study was to examine the impact of nasalance on CPP and HNR of voice signals. It was hypothesized that increased nasalance would be associated with decreased CPP. Study Design Within‐subject correlation design. Methods Thirty vocally healthy female participants were recorded reading and producing a vowel in alternation with a nasal consonant while wearing a nasometer for calculation of nasalance. Recorded vowel, nasalized, and nasal segments of speech were used to calculate CPP using Analysis of Dysphonia in Speech and Voice software, and HNR and vocal intensity using Praat software. Results Significant main effects of conditions were observed for CPP. CPP values decreased significantly when phonation changed from vowel to nasalized vowel and to nasal. There was correlation between CPP and nasalance and between CPP and intensity. HNR was slightly higher in the nasal condition than in vowel. There was a weak correlation between HNR and nasalance. No correlation was found between HNR and intensity. Conclusions CPP is sensitive to changes in vocal tract configuration caused by nasalization as well as intensity, whereas HNR is not. Therefore, CPP may reflect the periodicity in source signal or the filtering effects of vocal tract. Further research is needed to clarify the application and interpretation of CPP in clinical practice. Level of Evidence 4 Laryngoscope, 129:E299–E304, 2019
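For readers unfamiliar with HNR, autocorrelation-based estimators of the kind used in Praat relate the ratio to the height of the normalized autocorrelation peak at the pitch period. The sketch below shows only that final conversion step, assuming the peak value `r_peak` has already been measured. It does not reproduce Praat's windowing or interpolation, and the function name is hypothetical.

```python
import math


def hnr_db(r_peak):
    """Convert a normalized autocorrelation peak (0 < r_peak < 1) to HNR in dB.

    r_peak is read as the fraction of signal energy that is periodic,
    so 1 - r_peak is the noise fraction.
    """
    if not 0.0 < r_peak < 1.0:
        raise ValueError("r_peak must lie strictly between 0 and 1")
    return 10.0 * math.log10(r_peak / (1.0 - r_peak))


# Equal periodic and noise energy gives 0 dB; 99% periodic energy gives about 20 dB.
print(round(hnr_db(0.5), 2), round(hnr_db(0.99), 2))
```

Because the conversion depends only on periodicity, it is plausible that HNR reacts less than CPP to changes in vocal tract filtering, which is in line with the abstract's conclusion.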
Affiliation(s)
- Catherine Madill
  - Voice Research Laboratory, The University of Sydney, Sydney, New South Wales, Australia
- Duong Duy Nguyen
  - Voice Research Laboratory, The University of Sydney, Sydney, New South Wales, Australia
- Daniel Novakovic
  - Faculty of Health Sciences and the Central Clinical School, The University of Sydney, Sydney, New South Wales, Australia
- Patricia McCabe
  - Voice Research Laboratory, The University of Sydney, Sydney, New South Wales, Australia
|
30
|
Suire A, Raymond M, Barkat-Defradas M. Human vocal behavior within competitive and courtship contexts and its relation to mating success. EVOL HUM BEHAV 2018. [DOI: 10.1016/j.evolhumbehav.2018.07.001] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022]
|
31
|
Affiliation(s)
- Jordan Raine
  - Mammal Vocal Communication and Cognition Research Group, School of Psychology, University of Sussex, Brighton, UK
- Katarzyna Pisanski
  - Mammal Vocal Communication and Cognition Research Group, School of Psychology, University of Sussex, Brighton, UK
- Julia Simner
  - MULTISENSE Research Lab, School of Psychology, University of Sussex, Brighton, UK
- David Reby
  - Mammal Vocal Communication and Cognition Research Group, School of Psychology, University of Sussex, Brighton, UK
|
32
|
V Latoszek BB, Maryn Y, Gerrits E, De Bodt M. A Meta-Analysis: Acoustic Measurement of Roughness and Breathiness. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2018; 61:298-323. [PMID: 29392295 DOI: 10.1044/2017_jslhr-s-16-0188] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/09/2016] [Accepted: 10/25/2017] [Indexed: 06/07/2023]
Abstract
PURPOSE Over the last 5 decades, many acoustic measures have been created to measure roughness and breathiness. The aim of this study is to present a meta-analysis of correlation coefficients (r) between auditory-perceptual judgment of roughness and breathiness and various acoustic measures in both sustained vowels and continuous speech. METHOD Scientific literature reporting perceptual-acoustic correlations on roughness and breathiness was sought in 28 databases. Weighted average correlation coefficients (rw) were calculated when multiple r-values were available for a specific acoustic marker. An rw ≥ .60 was the threshold for an acoustic measure to be considered acceptable. RESULTS From 103 studies of roughness and 107 studies of breathiness that were investigated, only 33 studies and 34 studies, respectively, met the inclusion criteria of the meta-analysis on sustained vowels. Eighty-six acoustic measures were identified for roughness and 85 acoustic measures for breathiness on sustained vowels, in which 43 and 39 measures, respectively, yielded multiple r-values. Finally, only 14 measures for roughness and 12 measures for breathiness produced rw ≥ .60. On continuous speech, 4 measures for roughness and 21 measures for breathiness were identified, yielding 3 and 6 measures, respectively, with multiple r-values in which only 1 and 2, respectively, had rw ≥ .60. CONCLUSION This meta-analysis showed that only a few acoustic parameters were determined as the best estimators for roughness and breathiness.
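The abstract above pools several r-values per acoustic measure into a weighted average (rw) and applies a threshold of .60, but it does not say how the weights were chosen. The sketch below assumes simple sample-size weighting purely for illustration. The function name and the study values are hypothetical.

```python
def weighted_mean_r(studies):
    """Sample-size-weighted mean of correlation coefficients.

    studies: list of (r, n) pairs, one per study reporting the same acoustic measure.
    Assumption: weights are the study sample sizes; the abstract does not confirm this.
    """
    total_n = sum(n for _, n in studies)
    if total_n <= 0:
        raise ValueError("total sample size must be positive")
    return sum(r * n for r, n in studies) / total_n


# Hypothetical r-values and sample sizes for one acoustic measure.
studies = [(0.70, 40), (0.55, 20), (0.65, 40)]

rw = weighted_mean_r(studies)
acceptable = rw >= 0.60   # threshold stated in the abstract
print(f"rw = {rw:.2f}, acceptable: {acceptable}")
```

Other weighting schemes, such as averaging Fisher z-transformed correlations, would give slightly different pooled values for the same inputs.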
Affiliation(s)
- Ben Barsties V Latoszek
  - Faculty of Medicine and Health Sciences, University of Antwerp, Belgium
  - Institute of Health Studies, HAN University of Applied Sciences, Nijmegen, the Netherlands
- Youri Maryn
  - Faculty of Medicine and Health Sciences, University of Antwerp, Belgium
  - European Institute for ORL, Sint-Augustinus Hospital, Antwerp, Belgium
  - Faculty of Education, Health & Social Work, University College Ghent, Belgium
- Ellen Gerrits
  - Faculty of Health Care, HU University of Applied Sciences Utrecht, the Netherlands
  - Faculty of Humanities, University of Utrecht, the Netherlands
  - Department of Otolaryngology, University Medical Center Utrecht, the Netherlands
- Marc De Bodt
  - Faculty of Medicine and Health Sciences, University of Antwerp, Belgium
  - Department of Otorhinolaryngology and Head & Neck Surgery, Antwerp University Hospital, Belgium
  - Faculty of Medicine & Health Sciences, University of Ghent, Belgium
|
33
|
How Do Voice Perceptual Changes Predict Acoustic Parameters in Persian Voice Patients? J Voice 2017; 32:705-709. [PMID: 29033255 DOI: 10.1016/j.jvoice.2017.08.015] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/11/2017] [Revised: 08/15/2017] [Accepted: 08/16/2017] [Indexed: 11/22/2022]
Abstract
INTRODUCTION Perceptual and acoustic analyses are essential tools that help voice therapists comprehensively assess voice quality. While perceptual evaluations are subjective and are influenced by external and culturally driven factors, acoustic analysis is an objective and reliable means of evaluating voice. The goals of this study were (1) to determine which acoustic parameters were predicted by perceptual voice quality and (2) to assess the effect of a short period of training on the reliability of perceptual voice analyses for Persian speakers. METHOD This was a cross-sectional study. Subjects were 20 patients with various voice disorders. Voice samples were obtained during text reading and /a/ prolongation. Fifteen expert voice clinicians completed perceptual evaluations on voice samples using the Grade, Roughness, Breathiness, Asthenia, and Strain scale. We repeated this process after a short period of perceptual voice evaluation training. Acoustic analysis was completed using the Praat program. We used the intraclass correlation coefficient (ICC) for reliability measurement of the perceptual evaluation results and ordinal regression procedures to analyze all data. Significance level was set at P < 0.05. RESULTS Both intrarater and interrater reliability increased after training, for all five parameters. The ICC for grade increased to 0.95 after training. Grade and roughness significantly predicted fundamental frequency (F0) (P = 0.021 and P = 0.030, respectively) and harmonic-to-noise ratio (HNR) (P = 0.019 and P = 0.016, respectively). Breathiness significantly predicted shimmer (P = 0.013). CONCLUSION Training had a positive effect and increased the reliability of perceptual voice evaluation. For Persian listeners, changes in F0, increases in HNR, and shimmer were perceptually associated with poor voice quality.
|
34
|
Chhetri DK, Merati AL, Blumin JH, Sulica L, Damrose EJ, Tsai VW. Reliability of the Perceptual Evaluation of Adductor Spasmodic Dysphonia. Ann Otol Rhinol Laryngol 2017; 117:159-65. [PMID: 18444474 DOI: 10.1177/000348940811700301] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
Objectives: Although perceptual assessment by experienced voice clinicians remains the gold standard for the diagnosis and assessment of severity of adductor spasmodic dysphonia (ADSD), the interrater reliability of voice experts for this task has not been assessed. In addition, it is unknown whether telephone-recorded or -transmitted voice samples could be used for this task. The aims of this study were 1) to assess the reliability of perceptual analysis of ADSD severity by voice experts and 2) to compare the results between digitally recorded voice samples and those recorded over the telephone. Methods: Five laryngologists randomly selected voice samples from 46 ADSD patients and rated the severity of ADSD on a 5-point rating scale. A set of digital voice recordings and a set of telephone voice recordings made from filtering the digital set via the telephone were rated, and each voice set was rated twice. Measures of intrarater and interrater reliability, as well as a measure of the probability of agreement among the raters, were calculated. Results: There was a high level of agreement on ADSD severity, with excellent interrater and intrarater reliability (Cronbach's alpha, .93 to .96). The probabilities of rater agreement on the digitally recorded and telephone-filtered voice samples were similar (χ2, p = .07). The ratings of digital versus telephone voice samples were highly correlated (Pearson r, 0.99; p<.001). Conclusions: These results demonstrate that voice experts are reliably able to judge and agree on the severity of ADSD. Telephone-filtered voices appear to convey adequate ADSD perceptual cues for expert listeners to judge the severity of spasmodic dysphonia.
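Cronbach's alpha, the reliability statistic reported above, can be computed from a table of ratings in a few lines. The sketch below is a minimal illustration using the standard formula, with raters treated as the "items". The function name and the ratings are hypothetical and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for ratings[i][j] = score given to voice sample i by rater j.

    alpha = k / (k - 1) * (1 - sum of per-rater variances / variance of summed scores)
    """
    k = len(ratings[0])
    if k < 2 or len(ratings) < 2:
        raise ValueError("need at least two raters and two samples")
    rater_vars = [variance([row[j] for row in ratings]) for j in range(k)]
    total_var = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum(rater_vars) / total_var)


# Hypothetical severity ratings on a 5-point scale: 4 voice samples, 3 raters.
ratings = [
    [1, 2, 1],
    [3, 3, 4],
    [5, 4, 5],
    [2, 2, 3],
]
alpha = cronbach_alpha(ratings)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Alpha approaches 1 when raters order the samples the same way, and equals 1 when all raters give identical scores.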
Affiliation(s)
- Dinesh K. Chhetri
  - Division of Head and Neck Surgery, University of California-Los Angeles School of Medicine, Los Angeles, California
- Albert L. Merati
  - Department of Otolaryngology and Communication Sciences, Medical College of Wisconsin, Milwaukee, Wisconsin
- Joel H. Blumin
  - Department of Otolaryngology and Communication Sciences, Medical College of Wisconsin, Milwaukee, Wisconsin
- Lucian Sulica
  - Department of Otorhinolaryngology, Weill Medical College of Cornell University, New York, New York
- Edward J. Damrose
  - Department of Otolaryngology, Stanford University School of Medicine, Palo Alto, California
- Veling W. Tsai
  - Division of Head and Neck Surgery, University of California-Los Angeles School of Medicine, Los Angeles, California
|
35
|
Šebesta P, Kleisner K, Tureček P, Kočnar T, Akoko RM, Třebický V, Havlíček J. Voices of Africa: acoustic predictors of human male vocal attractiveness. Anim Behav 2017. [DOI: 10.1016/j.anbehav.2017.03.014] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
|
36
|
Siau RTK, Goswamy J, Jones S, Khwaja S. Is OperaVOX a clinically useful tool for the assessment of voice in a general ENT clinic? BMC EAR, NOSE, AND THROAT DISORDERS 2017; 17:4. [PMID: 28439206 PMCID: PMC5399865 DOI: 10.1186/s12901-017-0037-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/03/2016] [Accepted: 04/05/2017] [Indexed: 01/26/2023]
Abstract
Background: Objective acoustic analysis is a key component of multidimensional voice assessment. OperaVOX is an iOS app which has been shown to be comparable to Multi Dimensional Voice Program for most principal measures of vocal function. As a relatively cheap, portable and easily accessible form of acoustic analysis, OperaVOX may be more clinically useful than laboratory-based software in many situations. This study aims to determine whether correlation exists between acoustic measurements obtained using OperaVOX, and perceptual evaluation of voice. Methods: Forty-four voices from the multidisciplinary voice clinic were examined. Each voice was assessed blindly by a single experienced voice therapist using the GRBAS scale, and analysed using OperaVOX. The Spearman rank correlation coefficient was calculated between each element of the GRBAS scale and acoustic measurements obtained by OperaVOX. Results: Significant correlations were identified between GRBAS scores and OperaVOX parameters. Grade correlated significantly with jitter (ρ = 0.495, p < 0.05), shimmer (ρ = 0.385, p < 0.05), noise-to-harmonic ratio (NHR; ρ = 0.526, p < 0.05) and maximum phonation time (MPT; ρ = −0.415, p < 0.05). Roughness did not correlate with any of the measured variables. Breathiness correlated significantly with jitter (ρ = 0.342, p < 0.05), NHR (ρ = 0.344, p < 0.05) and MPT (ρ = −0.336, p < 0.05). Asthenia correlated with NHR (ρ = 0.413, p < 0.05) and MPT (ρ = −0.399, p < 0.05). Strain correlated with jitter (ρ = 0.560, p < 0.05), NHR (ρ = 0.600, p < 0.05) and MPT (ρ = −0.356, p < 0.05). Conclusions: OperaVOX provides objective acoustic analysis which has shown statistically significant correlation to perceptual evaluation using the GRBAS scale. The accessibility of the software package makes it possible for a wide range of health practitioners, e.g. general ENT surgeons, vascular surgeons, thyroid surgeons and cardiothoracic surgeons, to objectively monitor outcomes and complications of surgical procedures that may affect vocal function. Given the increasing requirement for surgeons to monitor their outcomes as part of the move towards ‘surgeon reported outcomes’, this may become an invaluable tool towards that goal.
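The Spearman rank correlation used in this study is a Pearson correlation computed on ranks, with tied values sharing their average rank. That matters for ordinal GRBAS scores, where ties are frequent. The sketch below is a minimal pure-Python illustration. The helper names and the sample values are hypothetical and are not taken from the study.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical GRBAS grade scores (0-3) and jitter values for five voices.
grade = [0, 1, 1, 2, 3]
jitter_percent = [0.4, 0.6, 0.9, 1.2, 2.0]
print(f"rho = {spearman_rho(grade, jitter_percent):.3f}")
```

Because only ranks enter the calculation, the coefficient is insensitive to the scale of the acoustic measure and to any monotonic transformation of it.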
Affiliation(s)
- Richard Teck Kee Siau
  - Department of Otolaryngology - Head and Neck Surgery, University Hospital of South Manchester NHS Foundation Trust, Wythenshawe Hospital, Manchester, UK
- Jay Goswamy
  - Department of Otolaryngology - Head and Neck Surgery, University Hospital of South Manchester NHS Foundation Trust, Wythenshawe Hospital, Manchester, UK
- Sue Jones
  - Department of Speech and Language Therapy, University Hospital of South Manchester NHS Foundation Trust, Manchester, UK
- Sadie Khwaja
  - Department of Otolaryngology - Head and Neck Surgery, University Hospital of South Manchester NHS Foundation Trust, Wythenshawe Hospital, Manchester, UK
|
37
|
Samlan RA, Story BH. Influence of Left-Right Asymmetries on Voice Quality in Simulated Paramedian Vocal Fold Paralysis. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2017; 60:306-321. [PMID: 28199505 DOI: 10.1044/2016_jslhr-s-16-0076] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/24/2016] [Accepted: 05/31/2016] [Indexed: 05/25/2023]
Abstract
PURPOSE The purpose of this study was to determine the vocal fold structural and vibratory symmetries that are important to vocal function and voice quality in a simulated paramedian vocal fold paralysis. METHOD A computational kinematic speech production model was used to simulate an exemplar "voice" on the basis of asymmetric settings of parameters controlling glottal configuration. These parameters were then altered individually to determine their effect on maximum flow declination rate, spectral slope, cepstral peak prominence, harmonics-to-noise ratio, and perceived voice quality. RESULTS Asymmetry of each of the 5 vocal fold parameters influenced vocal function and voice quality; measured change was greatest for adduction and bulging. Increasing the symmetry of all parameters improved voice, and the best voice occurred with overcorrection of adduction, followed by bulging, nodal point ratio, starting phase, and amplitude of vibration. CONCLUSIONS Although vocal process adduction and edge bulging asymmetries are most influential in voice quality for simulated vocal fold motion impairment, amplitude of vibration and starting phase asymmetries are also perceptually important. These findings are consistent with the current surgical approach to vocal fold motion impairment, where goals include medializing the vocal process and straightening concave edges. The results also explain many of the residual postoperative voice limitations.
Affiliation(s)
- Robin A Samlan
  - Department of Speech, Language, and Hearing Sciences, University of Arizona, Tucson
- Brad H Story
  - Department of Speech, Language, and Hearing Sciences, University of Arizona, Tucson
|
38
|
Vocal Cues Underlying Youth and Adult Portrayals of Socio-emotional Expressions. JOURNAL OF NONVERBAL BEHAVIOR 2017. [DOI: 10.1007/s10919-017-0250-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
|
39
|
Hartl DM, Vaissière J, Laccourreye O, Brasnu DF. Acoustic Analysis of Autologous Fat Injection versus Thyroplasty in the Same Patient. Ann Otol Rhinol Laryngol 2016; 112:987-92. [PMID: 14653369 DOI: 10.1177/000348940311201112] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
We objectively measured the acoustic effects of treatment of unilateral vocal fold paralysis by injection of autologous fat and by polytetrafluoroethylene thyroplasty, in the same patient. To our knowledge, this is the first report comparing the two techniques by using the patient's normal voice as the control. The voice of a male patient was recorded before and after onset of unilateral vocal fold paralysis, after treatment with autologous fat, and after polytetrafluoroethylene thyroplasty. Acoustic analysis was performed on a long-term average spectrum of text and on the MDVP (Kay Elemetrics) evaluation of the vowel /a/. Jitter and shimmer were not normalized, but they improved to a greater extent after fat injection. The cepstral peak prominence, spectral skewness, and long-term average spectrum returned to preparalytic values after both treatments, but improved to a greater extent after fat injection. This study showed that both techniques can return the voice to preparalytic values. Spectral measurements best reflected the voice improvement. Further prospective studies in a larger number of patients will be necessary to confirm these results and to determine the long-term objective voice outcome obtained with these techniques.
Affiliation(s)
- Dana M Hartl
  - Voice, Biomaterials, and Head and Neck Oncology Research Laboratory, University Paris V, Hôpital Européen Georges Pompidou, Paris, France
|
40
|
Lee SJ, Cho Y, Song JY, Lee D, Kim Y, Kim H. Aging Effect on Korean Female Voice: Acoustic and Perceptual Examinations of Breathiness. Folia Phoniatr Logop 2016; 67:300-7. [PMID: 27160514 DOI: 10.1159/000445290] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022] Open
Abstract
OBJECTIVE This paper sought to examine perceptual and acoustic characteristics in Korean female voices, focusing on the 'breathy' quality as a function of aging. In addition, we aimed to investigate if the three selected measures, H1-H2, H1-A1, and H1-A3, demonstrated any changes along a sustained vowel production. PARTICIPANTS AND METHODS A total of 42 participants were assigned to two age groups, young women and elderly women. All participants were asked to sustain /a/ as long and as steadily as possible. Perceptual judgments of breathiness were made on the GRBAS scale and by a direct magnitude estimation technique, while three acoustic parameters, H1-H2, H1-A1, and H1-A3, were measured at five measurement time points during the sustained vowel test. RESULTS Results indicated that the H1-H2 and H1-A1 values were significantly lower for elderly women compared to young women, although no difference in the perceptual estimation of breathiness was found between the age groups. Among the acoustic measures, only H1-A1 was significantly regressed against the perceptual estimate of breathiness. In addition, no significant acoustic difference in the measures was found across the five measurement points. CONCLUSION Our findings suggest that the aging voice might not be universally characterized by the breathy quality, which hints at the need for further research on ethnic diversity in vocal quality.
Affiliation(s)
- Seung Jin Lee
  - Graduate Program in Speech and Language Pathology, Yonsei University, Seoul, Korea
|
41
|
Shin YJ, Hong KH. Cepstral Analysis of Voice in Patients With Thyroidectomy. Clin Exp Otorhinolaryngol 2016; 9:157-62. [PMID: 27090273 PMCID: PMC4881323 DOI: 10.21053/ceo.2015.00199] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/05/2015] [Revised: 03/04/2015] [Accepted: 03/18/2015] [Indexed: 11/25/2022] Open
Abstract
Objectives The vocal changes after a thyroidectomy are temporary and nonsevere; therefore, obtaining accurate analytical results on the pathological vocal characteristics following such a procedure is difficult. For a more objective acoustic analysis, this study used the cepstral analysis method to examine changes in the patients’ voices during the perioperative period regarding sustained vowel phonation. Methods The sustained phonation of the five vowels (i.e., /a/, /e/, /i/, /o/, and /u/) by 35 patients with thyroidectomy was recorded by using a Multi-Speech program. Of the 35 patients, 10 were men and 25 were women, with an average age of 51.5 years. Voice data were collected a total of 3 times (preoperatively, 5–7 days after the operation, and 6 weeks after the operation) and were edited according to each fragment (on-set, mid, and off-set) for cepstral analysis. Results The cepstral analysis on the patients’ voices revealed no significant differences between the examination periods of all vowel phonations. However, analysis of the on-set fragment of the vowel /i/ revealed pathological characteristics in which the cepstral measurements of the voice were significantly lower after the operation than before the operation, with the cepstral measurements of the voice increasing further 6 weeks following surgery. Conclusion The results of the acoustic analysis on the on-set fragment of the vowel /i/ will be important data for characterizing the vocal changes during the perioperative period. This study contributes to future research on the mechanisms underlying changes in the voice of patients with a history of thyroid or neck surgery.
Affiliation(s)
- Yu Jeong Shin
  - Department of Speech-Language Therapy, Howon University, Gunsan, Korea
- Ki Hwan Hong
  - Department of Otolaryngology-Head and Neck Surgery, Research Institute of Clinical Medicine-Biomedical Research Institute of Chonbuk National University Hospital, Jeonju, Korea
|
42
|
Modeling of Breathy Voice Quality Using Pitch-strength Estimates. J Voice 2016; 30:774.e1-774.e7. [PMID: 26775221 DOI: 10.1016/j.jvoice.2015.11.016] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2015] [Accepted: 11/20/2015] [Indexed: 11/23/2022]
Abstract
BACKGROUND The characteristic voice quality of a speaker conveys important linguistic, paralinguistic, and vocal health-related information. Pitch strength refers to the salience of pitch sensation in a sound and was recently reported to be strongly correlated with the magnitude of perceived breathiness based on a small number of voice stimuli. OBJECTIVE The current study examined the relationship between perceptual judgments of breathiness and computational estimates of pitch strength based on the Aud-SWIPE (P-NP) algorithm for a large number of voice stimuli (330 synthetic and 57 natural). METHODS AND RESULTS Similar to the earlier study, the current results confirm a strong relationship between estimated pitch strength and listener judgments of breathiness such that low pitch-strength values are associated with voices that have high perceived breathiness. Based on this result, a model was developed for the perception of breathy voice quality using a pitch-strength estimator. Regression functions derived between the pitch-strength estimates and perceptual judgments of breathiness obtained from a matching task revealed a linear relationship for a subset of the natural stimuli. We then used this function to obtain predicted breathiness values for the synthetic and the remaining natural stimuli. CONCLUSIONS Predicted breathiness values from our model were highly correlated with the perceptual data for both types of stimuli. Systematic differences between the breathiness of natural and synthetic stimuli are discussed.
|
43
|
Shiba TL, Chhetri DK. Dynamics of phonatory posturing at phonation onset. Laryngoscope 2015; 126:1837-43. [PMID: 26690882 DOI: 10.1002/lary.25816] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2015] [Revised: 10/29/2015] [Accepted: 11/13/2015] [Indexed: 11/08/2022]
Abstract
INTRODUCTION In speech and singing, the intrinsic laryngeal muscles set the prephonatory posture prior to the onset of phonation. The timing and shape of the prephonatory glottal posture can directly affect the resulting phonation type. We investigated the dynamics of human laryngeal phonatory posturing. METHODS Onset of vocal fold adduction to phonation was observed in 27 normal subjects using high-speed video recording. Subjects were asked to utter a variety of phonation types (modal, breathy, pressed, /i/ following sniff). Digital videokymography with concurrent acoustic signal was analyzed to assess the timing of the following: onset of adduction to final phonatory posture (FPT), phonation onset time (POT), and phonatory posture time (PPT). Final phonatory posture time was determined as the moment at which the laryngeal configuration used in phonation was first achieved. RESULTS Thirty-three audiovisual recordings met inclusion criteria. Average FPT, PPT, and POT were as follows: 303, 106, and 409 ms for modal; 430, 104, and 534 ms for breathy; 483, 213, and 696 ms for pressed; and 278, 98, and 376 ms for sniff-/i/. The following posturing features were observed: 1) pressed phonation: increased speed of closure just prior to final posture, complete glottal closure, and increased supraglottic hyperactivity; and 2) breathy phonation: decreased speed of closure prior to final posture, increased posterior glottal gap, and increased midmembranous gap. CONCLUSIONS Phonation onset latency was shortest for modal and longest for pressed voice. These findings are likely explained by glottal resistance and subglottal pressure requirements. LEVEL OF EVIDENCE NA. Laryngoscope, 126:1837-1843, 2016.
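The three timing measures reported above appear to be additive: in each condition the mean phonation onset time equals the mean time to final posture plus the mean time spent in that posture before phonation (POT = FPT + PPT). This reading is inferred from the reported means and is not stated explicitly in the abstract. The sketch below checks it against those means. The variable and function names are illustrative.

```python
# Reported mean timings in ms, in the abstract's order: (FPT, PPT, POT).
timings_ms = {
    "modal": (303, 106, 409),
    "breathy": (430, 104, 534),
    "pressed": (483, 213, 696),
    "sniff-/i/": (278, 98, 376),
}


def is_additive(fpt, ppt, pot):
    """True when onset time equals posturing time plus time held in posture."""
    return fpt + ppt == pot


all_additive = all(is_additive(*values) for values in timings_ms.values())

# Conditions ordered from shortest to longest mean phonation onset time.
by_onset = sorted(timings_ms, key=lambda name: timings_ms[name][2])

print(all_additive)
print(by_onset)
```

On these means, pressed phonation has the longest onset time. Modal is the shortest of the three voice qualities, while the sniff-/i/ task is slightly shorter still.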
Affiliation(s)
- Travis L Shiba
- Laryngeal Physiology Laboratory, CHS 62-132, Department of Head and Neck Surgery, UCLA School of Medicine, Los Angeles, California, U.S.A
- Dinesh K Chhetri
- Laryngeal Physiology Laboratory, CHS 62-132, Department of Head and Neck Surgery, UCLA School of Medicine, Los Angeles, California, U.S.A
|
44
|
Laugh Like You Mean It: Authenticity Modulates Acoustic, Physiological and Perceptual Properties of Laughter. JOURNAL OF NONVERBAL BEHAVIOR 2015. [DOI: 10.1007/s10919-015-0222-8] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Indexed: 11/25/2022]
|
45
|
Ouattassi N, Benmansour N, Ridal M, Zaki Z, Bendahhou K, Nejjari C, Cherkaoui A, El Alami MNEA. Acoustic assessment of erygmophonic speech of Moroccan laryngectomized patients. Pan Afr Med J 2015; 21:270. [PMID: 26587121 PMCID: PMC4633833 DOI: 10.11604/pamj.2015.21.270.4301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2014] [Accepted: 04/01/2015] [Indexed: 11/11/2022] Open
Abstract
Introduction Acoustic evaluation of alaryngeal voices is among the most prominent issues in the field of speech analysis. Many methods have been developed to date as substitutes for the classic perceptual evaluation. The aim of this study is to present our experience in the objective assessment of erygmophonic speech and to discuss the most widely used methods of acoustic speech appraisal. Through a prospective case-control study, we measured acoustic parameters of speech quality during one year of erygmophonic rehabilitation therapy of Moroccan laryngectomized patients. Methods We assessed acoustic parameters of erygmophonic speech samples from eleven laryngectomized patients over the course of speech rehabilitation therapy. Acoustic parameters were obtained by perturbation analysis and linear predictive coding algorithms, and also from the broadband spectrogram. Results Using perturbation analysis methods, we found erygmophonic voice to be significantly poorer than normal speech and to exhibit higher formant frequency values. However, erygmophonic voice also showed higher and extremely variable Error values that were greater than the acceptable level, which casts doubt on the reliability of the results of those analytic methods. Conclusion Acoustic parameters for objective evaluation of alaryngeal voices should allow a reliable representation of the perceptual evaluation of the quality of speech. This requirement has not been fulfilled by the common methods used so far. Therefore, acoustic assessment of erygmophonic speech needs further investigation.
Affiliation(s)
- Naouar Ouattassi
- ENT Head and Neck Department, Hassan II University Hospital, Fez, Morocco
- Najib Benmansour
- ENT Head and Neck Department, Hassan II University Hospital, Fez, Morocco
- Mohammed Ridal
- ENT Head and Neck Department, Hassan II University Hospital, Fez, Morocco
- Zouheir Zaki
- ENT Head and Neck Department, Hassan II University Hospital, Fez, Morocco
- Karima Bendahhou
- Epidemiology, Clinical Research and Community Health Department, Faculty of Medicine and Pharmacy, Fez, Morocco
- Chakib Nejjari
- Epidemiology, Clinical Research and Community Health Department, Faculty of Medicine and Pharmacy, Fez, Morocco
|
46
|
Carson C, Ryalls J, Hardin-Hollingsworth K, Le Normand MT, Ruddy B. Acoustic Analyses of Prolonged Vowels in Young Adults With Friedreich Ataxia. J Voice 2015; 30:272-80. [PMID: 26454768 DOI: 10.1016/j.jvoice.2015.05.008] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 02/06/2015] [Accepted: 05/15/2015] [Indexed: 11/28/2022]
Abstract
OBJECTIVES Finding measures that track disease progression and determine treatment efficacy is vital for appropriate management in Friedreich ataxia (FA). The purpose of this study was to determine which cepstral- and spectral-based measures extracted from prolonged vowels using the Analysis of Dysphonia in Speech and Voice (ADSV) program discriminate between individuals who have FA and normal voice (NV) peers. STUDY DESIGN This is a descriptive, prospective study. METHODS The initial 2 seconds of prolonged /a/, /i/, and /o/ from 20 individuals diagnosed with FA and 20 NV individuals were analyzed with ADSV. The ADSV measures used were cepstral peak prominence (CPP), cepstral peak prominence standard deviation (CPP SD), low/high spectral ratio (L/H ratio), low/high spectral ratio standard deviation (L/H ratio SD), and the Cepstral/Spectral Index of Dysphonia (CSID). RESULTS L/H ratio SD was the only measure for which significant differences between groups were found across all vowels. Comparing measures per vowel, the vowel /o/ was significantly different between groups on four of five measures. Discriminant analysis revealed that 100% of those in the FA group were classified correctly (sensitivity), whereas 95% of NV members were correctly identified (specificity) when all ADSV measures, with the exception of L/H ratio, were entered. CONCLUSIONS Unstable periods of phonation, such as initiations of voice production in vowels, may yield robust acoustic cues in the FA population. ADSV provides measures that, when considered together, have excellent sensitivity and very good specificity. Vowels yielded differing results on ADSV measures; analysis of different vowel types is recommended.
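The classification figures can be related to raw counts with the usual definitions of sensitivity and specificity. The sketch below assumes that all 20 FA and all 20 NV participants entered the discriminant analysis, which the abstract does not confirm; the implied counts and the function names are illustrative.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of affected individuals classified as affected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of unaffected individuals classified as unaffected."""
    return true_neg / (true_neg + false_pos)


# Group sizes from the abstract; assumed to be the sizes used in the analysis.
n_fa, n_nv = 20, 20

# Counts implied by the reported 100% sensitivity and 95% specificity.
true_pos = round(1.00 * n_fa)   # FA participants classified as FA
false_neg = n_fa - true_pos     # FA participants classified as NV
true_neg = round(0.95 * n_nv)   # NV participants classified as NV
false_pos = n_nv - true_neg     # NV participants classified as FA

print(f"sensitivity = {sensitivity(true_pos, false_neg):.2f}")
print(f"specificity = {specificity(true_neg, false_pos):.2f}")
```

Under that assumption, the reported percentages correspond to every FA participant being classified correctly and a single NV participant being misclassified.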
Affiliation(s)
- Cecyle Carson
- Department of Communication Sciences & Disorders, Health and Public Affairs I, University of Central Florida, Orlando, Florida 32816
- Jack Ryalls
- Department of Communication Sciences & Disorders, Health and Public Affairs I, University of Central Florida, Orlando, Florida 32816
- Kaylea Hardin-Hollingsworth
- Department of Communication Sciences & Disorders, Health and Public Affairs I, University of Central Florida, Orlando, Florida 32816
- Bari Ruddy
- Department of Communication Sciences & Disorders, Health and Public Affairs I, University of Central Florida, Orlando, Florida 32816
|
47
|
Rangarathnam B, McCullough GH, Pickett H, Zraick RI, Tulunay-Ugur O, McCullough KC. Telepractice Versus In-Person Delivery of Voice Therapy for Primary Muscle Tension Dysphonia. AMERICAN JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2015; 24:386-399. [PMID: 25836732 DOI: 10.1044/2015_ajslp-14-0017] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Received: 01/28/2014] [Accepted: 03/25/2015] [Indexed: 06/04/2023]
Abstract
PURPOSE The purpose of this study was to investigate the utility of telepractice for delivering flow phonation exercises to persons with primary muscle tension dysphonia (MTD). METHOD Fourteen participants with a diagnosis of primary MTD participated, 7 on site and 7 at remote locations. Each participant received 12 treatment sessions across 6 weeks. Treatment consisted of flow phonation voice therapy exercises. Auditory-perceptual, acoustic, aerodynamic, and quality-of-life measures were taken before and after treatment. RESULTS Perceptual and quality-of-life measures were significantly better posttreatment and were statistically equivalent across groups. Acoustic and aerodynamic measures improved in both groups, but changes did not reach statistical significance. Results for the 2 service delivery groups were comparable, with no significant differences observed for perceptual and quality-of-life measures. CONCLUSIONS Although the American Speech-Language-Hearing Association supports the use of telepractice for speech-language pathology services, evidence for the use of telepractice for providing behavioral treatment to patients with MTD has been lacking. The results of this study indicate that flow phonation exercises can be successfully used for patients with MTD using telepractice.
|
48
|
Automatic Evaluation of Voice Quality Using Text-Based Laryngograph Measurements and Prosodic Analysis. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2015; 2015:316325. [PMID: 26136813 PMCID: PMC4468283 DOI: 10.1155/2015/316325] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 02/23/2015] [Accepted: 05/13/2015] [Indexed: 11/17/2022]
Abstract
Due to low intra- and interrater reliability, perceptual voice evaluation should be supported by objective, automatic methods. In this study, text-based, computer-aided prosodic analysis and measurements of connected speech were combined in order to model perceptual evaluation of the German Roughness-Breathiness-Hoarseness (RBH) scheme. 58 connected speech samples (43 women and 15 men; 48.7 ± 17.8 years) containing the German version of the text "The North Wind and the Sun" were evaluated perceptually by 19 speech and voice therapy students according to the RBH scale. For the human-machine correlation, Support Vector Regression with measurements of the vocal fold cycle irregularities (CFx) and the closed phases of vocal fold vibration (CQx) of the Laryngograph and 33 features from a prosodic analysis module were used to model the listeners' ratings. The best human-machine results for roughness were obtained from a combination of six prosodic features and CFx (r = 0.71, ρ = 0.57). These correlations were approximately the same as the interrater agreement among human raters (r = 0.65, ρ = 0.61). CQx was one of the substantial features of the hoarseness model. For hoarseness and breathiness, the human-machine agreement was substantially lower. Nevertheless, the automatic analysis method can serve as the basis for a meaningful objective support for perceptual analysis.
|
49
|
Gunjawate DR, Aithal VU, Guddattu V, Bellur R. Acoustic Analysis of Madhya and Taar Saptak/Sthayi in Indian Classical Singers. Folia Phoniatr Logop 2015; 67:36-41. [DOI: 10.1159/000381337] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022] Open
|
50
|
Watson BC, Baken RJ, Roark RM. Effect of Voice Onset Type on Vocal Attack Time. J Voice 2015; 30:11-4. [PMID: 25795369 DOI: 10.1016/j.jvoice.2014.12.004] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 08/18/2014] [Accepted: 12/08/2014] [Indexed: 10/23/2022]
Abstract
Vocal attack time (VAT) is the time lag between the growth of sound pressure (SP) and electroglottographic (EGG) signals at vocal initiation. The characteristics of voice initiation are associated with issues of vocal hygiene, efficiency, and quality. Vocal onsets have commonly been qualitatively characterized into three types: hard, simultaneous, and breathy. This study examines the effect of voice onset type on VAT values in normal speakers. SP and EGG recordings were obtained for 55 female and 57 male subjects while producing multiple tokens of three tasks (sustained /ɑ/ and "always" as unaspirated onsets, and "hallways" as an aspirated onset). Results revealed a significant effect of onset type on VAT, with the mean VAT for the "hallways" (aspirated) task greater than the mean VAT for the sustained /ɑ/ and "always" (unaspirated) tasks. There was no significant VAT difference between the sustained /ɑ/ and "always" tasks. Findings confirm the sensitivity of the VAT measure to vocal onset type and suggest its potential application as an objective and quantitative clinical measure of the type of vocal onset.
Affiliation(s)
- Ben C Watson
- Department of Speech-Language Pathology, New York Medical College, Valhalla, New York
- R J Baken
- Department of Speech-Language Pathology, New York Medical College, Valhalla, New York
- Rick M Roark
- Department of Otolaryngology, New York Medical College, Valhalla, New York
|