1
Shen J, Heller Murray E. Breathy Vocal Quality, Background Noise, and Hearing Loss: How Do These Adverse Conditions Affect Speech Perception by Older Adults? Ear Hear 2024:00003446-990000000-00361. [PMID: 39494949 DOI: 10.1097/aud.0000000000001599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2024]
Abstract
OBJECTIVES Although breathy vocal quality and hearing loss are both prevalent age-related changes, their combined impact on speech communication is poorly understood. This study investigated whether breathy vocal quality affected speech perception and listening effort by older listeners. Furthermore, the study examined how this effect was modulated by the adverse listening environment of background noise and the listener's level of hearing loss. DESIGN Nineteen older adults participated in the study. Their hearing ranged from near-normal to mild-moderate sensorineural hearing loss. Participants heard speech material of low-context sentences, with stimuli resynthesized to simulate original, mild-moderately breathy, and severely breathy conditions. Speech intelligibility was measured using a speech recognition in noise paradigm, with pupillometry data collected simultaneously to measure listening effort. RESULTS Simulated severely breathy vocal quality was found to reduce intelligibility and increase listening effort. Breathiness and background noise level independently modulated listening effort. The impact of hearing loss was not observed in this dataset, which may be due to the use of individualized signal-to-noise ratios and a small sample size. CONCLUSION Results from this study demonstrate the challenges of listening to speech with a breathy vocal quality. Theoretically, the findings highlight the importance of periodicity cues in speech perception in noise by older listeners. Breathy voice could be challenging to separate from the noise when the noise also lacks periodicity. Clinically, the findings suggest the need to address both listener- and talker-related factors in speech communication by older adults.
Affiliation(s)
- Jing Shen
- Department of Communication Sciences and Disorders, College of Public Health, Temple University, Philadelphia, Pennsylvania, USA
2
Laukkanen AM, Kadiri SR, Narayanan S, Alku P. Can a Machine Distinguish High and Low Amount of Social Creak in Speech? J Voice 2024:S0892-1997(24)00342-4. [PMID: 39455325 DOI: 10.1016/j.jvoice.2024.09.050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2024] [Revised: 09/29/2024] [Accepted: 09/30/2024] [Indexed: 10/28/2024]
Abstract
OBJECTIVES Increased prevalence of social creak, particularly among female speakers, has been reported in several studies. Social creak has previously been studied by combining perceptual evaluation of speech with conventional acoustical parameters such as the harmonic-to-noise ratio and cepstral peak prominence. In the current study, machine learning (ML) was used to automatically distinguish speech with a low amount of social creak from speech with a high amount of social creak. METHODS The amount of creak in continuous speech samples produced in Finnish by 90 female speakers was first perceptually assessed by two voice specialists. Based on their assessments, the speech samples were divided into two categories (low vs high amount of creak). Using the speech signals and their creak labels, seven different ML models were trained. Three spectral representations were used as features for each model. RESULTS The results show that the best performance (accuracy of 71.1%) was obtained by the following two systems: an Adaboost classifier using the mel-spectrogram feature and a decision tree classifier using the mel-frequency cepstral coefficient feature. CONCLUSIONS The study of social creak is becoming increasingly popular in sociolinguistic and vocological research. The conventional human perceptual assessment of the amount of creak is laborious, and therefore ML technology could be used to assist researchers studying social creak. The classification systems reported in this study could be considered as baselines in future ML-based studies on social creak.
Affiliation(s)
- Sudarsana Reddy Kadiri
- Signal Analysis and Interpretation Laboratory (SAIL), University of Southern California, Los Angeles
- Shrikanth Narayanan
- Signal Analysis and Interpretation Laboratory (SAIL), University of Southern California, Los Angeles
- Paavo Alku
- Department of Information and Communications Engineering, Aalto University, Espoo, Finland
3
Kitayama I, Hosokawa K, Iwaki S, Yoshida M, Miyauchi A, Ogawa M, Inohara H. Validation of Subharmonics Quantification Using Two-Stage Cepstral Analysis. J Voice 2023:S0892-1997(23)00389-2. [PMID: 38142187 DOI: 10.1016/j.jvoice.2023.12.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 12/02/2023] [Accepted: 12/04/2023] [Indexed: 12/25/2023]
Abstract
OBJECTIVES Hoarseness is primarily perceived as breathiness or roughness. Despite the various tools that quantitatively assess hoarseness, roughness has been difficult to quantify because of its complex acoustic structure, such as subharmonics. The parameter obtained from the two-stage cepstral analysis is promising for evaluating roughness. Thus, this study aimed to improve the accuracy of the parameter using a customized pitch setting and to investigate the relationship between roughness and subharmonics. STUDY DESIGN The design is a retrospective study. METHODS Two-stage cepstral analysis was used to analyze the voice recordings of 455 participants (speech-impaired individuals and normal controls) using the Analysis of Dysphonia in Speech and Voice and Praat software. For validation, the ground truth of subharmonics was visually quantified using a narrowband spectrogram. The reliability and validity of the two-stage cepstral analysis and subharmonics measures on spectrograms were evaluated. RESULTS The two-stage cepstral analysis showed a very strong correlation (r = 0.963) between the two software programs. Intra- and inter-rater reliability of the subharmonics measures on spectrograms were also good. Two-stage cepstral analysis showed that even with customized pitch settings, the diagnostic accuracy and correlations for perceptual roughness and subharmonics were weak to moderate. The subharmonics measures on spectrograms showed a strong correlation with roughness and moderate diagnostic accuracy for subharmonics. CONCLUSIONS The two-stage cepstral analysis showed some improvement in diagnostic accuracy and correlation with customized pitch settings, but it did not sufficiently detect subharmonics or roughness. The analysis using subharmonics measures on spectrograms demonstrated a high correlation between subharmonics and roughness, indicating that developing acoustic analysis parameters that sufficiently detect subharmonics is necessary.
Affiliation(s)
- Itsuki Kitayama
- Department of Otorhinolaryngology and Head & Neck Surgery, Osaka University Graduate School of Medicine, Osaka, Japan
- Kiyohito Hosokawa
- Department of Otorhinolaryngology and Head & Neck Surgery, Osaka University Graduate School of Medicine, Osaka, Japan; Department of Otorhinolaryngology, Osaka Police Hospital, Osaka, Japan
- Shinobu Iwaki
- Department of Rehabilitation, Kobe University Hospital, Hyogo, Japan
- Misao Yoshida
- Department of Rehabilitation, Sakai Heisei Hospital, Osaka, Japan
- Makoto Ogawa
- Department of Otorhinolaryngology and Head & Neck Surgery, Osaka University Graduate School of Medicine, Osaka, Japan
- Hidenori Inohara
- Department of Otorhinolaryngology and Head & Neck Surgery, Osaka University Graduate School of Medicine, Osaka, Japan
4
Park Y, Anand S, Kopf LM, Shrivastav R, Eddins DA. Interactions Between Breathy and Rough Voice Qualities and Their Contributions to Overall Dysphonia Severity. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2022; 65:4071-4084. [PMID: 36260821 PMCID: PMC9940885 DOI: 10.1044/2022_jslhr-22-00012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/03/2023]
Abstract
PURPOSE Dysphonic voices typically present multiple voice quality dimensions. This study investigated potential interactions between perceived breathiness and roughness and their contributions to overall dysphonia severity. METHOD Synthetic stimuli based on four talkers were created to systematically map out potential interactions. For each talker, a stimulus matrix composed of 49 stimuli (seven breathiness steps × seven roughness steps) was created by varying aspiration noise and open quotient to manipulate breathiness and superimposing amplitude modulation of varying depths to simulate roughness. One-dimensional matching (1DMA) and magnitude estimation (1DME) tasks were used to measure perceived breathiness, roughness, their potential interactions, and overall dysphonia severity. Additional 1DME tasks were used to assess a set of natural stimuli that varied along both breathiness and roughness. RESULTS For the synthetic stimuli, the 1DMA task indicated little interaction between the two voice qualities. For the 1DME task, breathiness magnitude was influenced by roughness step to a greater extent than roughness magnitude was influenced by breathiness step. The additive contributions of breathiness and roughness to overall severity gradually diminished with increasing breathiness and roughness steps, possibly reflecting a ceiling effect in the 1DME task. For the natural stimuli, little consistent interaction was observed between breathiness and roughness. CONCLUSIONS The matching task revealed minimal interaction between perceived breathiness and roughness, whereas the magnitude estimation task revealed some interaction between the two qualities and their cumulative contributions to overall dysphonia severity. Task differences are discussed in terms of differences in response bias and the role of perceptual anchors. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.21313701.
Affiliation(s)
- Yeonggwang Park
- Department of Communication Sciences and Disorders, University of South Florida, Tampa
- Supraja Anand
- Department of Communication Sciences and Disorders, University of South Florida, Tampa
- Lisa M. Kopf
- Department of Speech, Language and Hearing Sciences, The George Washington University, Washington, DC
- Rahul Shrivastav
- Office of the Provost and Executive Vice President, Indiana University, Bloomington
- Department of Communicative Sciences and Disorders, Michigan State University, East Lansing
- David A. Eddins
- Department of Communication Sciences and Disorders, University of South Florida, Tampa
- Department of Communicative Sciences and Disorders, Michigan State University, East Lansing
5
Park Y, Anand S, Ozmeral EJ, Shrivastav R, Eddins DA. Predicting Perceived Vocal Roughness Using a Bio-Inspired Computational Model of Auditory Temporal Envelope Processing. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2022; 65:2748-2758. [PMID: 35867607 PMCID: PMC9911094 DOI: 10.1044/2022_jslhr-22-00101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2022] [Revised: 04/14/2022] [Accepted: 04/25/2022] [Indexed: 06/15/2023]
Abstract
PURPOSE Vocal roughness is present in many voice disorders, but the assessment of roughness mainly depends on subjective auditory-perceptual evaluation and lacks acoustic correlates. This study aimed to apply the concept of roughness in general sound quality perception to vocal roughness assessment and to characterize the relationship between vocal roughness and temporal envelope fluctuation measures obtained from an auditory model. METHOD Ten /ɑ/ recordings with a wide range of roughness were selected from an existing database. Ten listeners rated the roughness of the recordings in a single-variable matching task. Temporal envelope fluctuations of the recordings were analyzed with an auditory processing model of amplitude modulation that utilizes a modulation filterbank of different modulation frequencies. Pitch strength and the smoothed cepstral peak prominence (CPPS) were also obtained for comparison. RESULTS Individual simple regression models yielded envelope standard deviation from a modulation filter with a low center frequency (64.3 Hz) as a statistically significant predictor of vocal roughness with a strong coefficient of determination (r² = .80). Pitch strength and CPPS were not significant predictors of roughness. CONCLUSION This result supports the possible utility of envelope fluctuation measures from an auditory model as objective correlates of vocal roughness.
Affiliation(s)
- Yeonggwang Park
- Department of Communication Sciences and Disorders, University of South Florida, Tampa
- Supraja Anand
- Department of Communication Sciences and Disorders, University of South Florida, Tampa
- Erol J. Ozmeral
- Department of Communication Sciences and Disorders, University of South Florida, Tampa
- Rahul Shrivastav
- Office of the Provost & Executive Vice President, Indiana University Bloomington
- David A. Eddins
- Department of Communication Sciences and Disorders, University of South Florida, Tampa
6
魏 梅, 杜 建, 耿 磊, 王 巍. [Detection of speech pathology based on parameters of analysis of dysphonia in speech and voice]. LIN CHUANG ER BI YAN HOU TOU JING WAI KE ZA ZHI = JOURNAL OF CLINICAL OTORHINOLARYNGOLOGY, HEAD, AND NECK SURGERY 2022; 36:492-496. [PMID: 35822373 PMCID: PMC10128384 DOI: 10.13201/j.issn.2096-7993.2022.07.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/06/2022] [Indexed: 06/15/2023]
Abstract
Objective: To analyze speech pathology using the Analysis of Dysphonia in Speech and Voice (ADSV). Methods: Acoustic signals of continuous vowels and continuous speech were collected from 113 patients (93 with vocal cord polyps and 20 with glottic laryngeal carcinoma) and 47 volunteers without speech sound disorders. Cepstral peak prominence (CPP), CPP standard deviation (CPP SD), L/H spectral ratio (L/H ratio), L/H ratio standard deviation (L/H ratio SD), and cepstral/spectral index of dysphonia (CSID) were analyzed by ADSV to explore the role of these parameters in the recognition of speech pathology. Results: In the acoustic signals of continuous vowels, CPP and L/H ratio in the normal group were higher than those in the pathological voice group (P<0.001), while CPP SD and CSID were lower than those in the pathological voice group (P<0.001). The areas under the ROC curve for CPP and CSID were 0.95 and 0.99, respectively, making them important acoustic parameters for diagnosing pathological voice. In continuous speech acoustic signals, CPP, CPP SD, and L/H ratio in the normal group were all higher than those in the speech disorders group (P<0.001), and the area under the curve of CPP SD was 0.90, which showed high accuracy in diagnosing pathological voice. The ADSV voice analysis parameters CPP, CPP SD, CSID, and L/H ratio also showed significant differences between the vocal cord polyp group and the glottic laryngeal cancer group. The results of the discriminant analysis model show that ADSV voice parameters can distinguish vocal cord polyps from laryngeal cancers. Conclusion: The ADSV voice analysis parameters can not only distinguish the voice signals of the normal group from those of the pathological group, but also distinguish different types of pathological voices. ADSV has high sensitivity and specificity in diagnosing pathological voices.
Affiliation(s)
- 梅 魏
- Department of Otorhinolaryngology Head and Neck Surgery, Tianjin First Central Hospital, Institute of Otolaryngology of Tianjin, China Key Laboratory of Auditory Speech and Balance Medicine, Key Medical Discipline of Tianjin (Otolaryngology), China Quality Control Centre of Otolaryngology, Tianjin, 300192, China
- 建群 杜
- Department of Otorhinolaryngology Head and Neck Surgery, Tianjin First Central Hospital, Institute of Otolaryngology of Tianjin, China Key Laboratory of Auditory Speech and Balance Medicine, Key Medical Discipline of Tianjin (Otolaryngology), China Quality Control Centre of Otolaryngology, Tianjin, 300192, China
- 磊 耿
- School of Life Sciences, Tianjin University of Technology, Tianjin Key Laboratory of Photoelectric Detection Technology and System
- 巍 王
- Department of Otorhinolaryngology Head and Neck Surgery, Tianjin First Central Hospital, Institute of Otolaryngology of Tianjin, China Key Laboratory of Auditory Speech and Balance Medicine, Key Medical Discipline of Tianjin (Otolaryngology), China Quality Control Centre of Otolaryngology, Tianjin, 300192, China
7
Pierce JL, Tanner K, Merrill RM, Shnowske L, Roy N. A Field-Based Approach to Establish Normative Acoustic Data for Healthy Female Voices. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2021; 64:691-706. [PMID: 33561361 DOI: 10.1044/2020_jslhr-20-00490] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 06/12/2023]
Abstract
Purpose The primary aim of this study was to obtain high-quality acoustic normative data in natural field environments for female voices. A secondary aim was to examine acoustic measurement variability in field environments. Method This study employed a within-subject repeated-measures experimental design that included 45 young female adults with normal voices. Participants were stratified by age (18-23, 24-29, and 30-35 years). After initial evaluation and instruction, participants completed voice recordings during seven consecutive days using a standard protocol, including both connected speech and sustained vowels. Thirty-two cepstral-, spectral-, and time-based acoustic measures were acquired using Praat and the Analysis of Dysphonia in Speech and Voice. Results Among the 958 total recordings, greater than 90% satisfied inclusion criteria based on protocol compliance, peak clipping, and signal-to-noise ratio. Significant differences were observed for age (p < .05). For 19 acoustic measures, values improved significantly as signal-to-noise ratio increased. Cepstral- and spectral-based measures demonstrated less measurement variability as compared with time-based measures. Conclusions With adequate training, field audio recordings represent a viable option for clinical voice management. The significant age effects observed in this study support the need for more specific criteria when collecting and applying normative data. Cepstral- and spectral-based measures demonstrated the least measurement variability. This study provides additional evidence for multiparameter acoustic voice measurement, specifically toward ecologically valid sampling in natural environments. Future studies should expand on these findings in other populations with normal and disordered voices.
Affiliation(s)
- Jenny L Pierce
- Department of Surgery, The University of Utah, Salt Lake City
- Department of Communication Sciences & Disorders, The University of Utah, Salt Lake City
- Kristine Tanner
- Department of Communication Disorders, Brigham Young University, Provo, UT
- Ray M Merrill
- Department of Public Health, Brigham Young University, Provo, UT
- Lauren Shnowske
- Department of Communication Sciences & Disorders, The University of Utah, Salt Lake City
- Department of Communication Sciences and Disorders, University of Kentucky, Lexington
- Nelson Roy
- Department of Communication Sciences & Disorders, The University of Utah, Salt Lake City
8
Park Y, Cádiz MD, Nagle KF, Stepp CE. Perceptual and Acoustic Assessment of Strain Using Synthetically Modified Voice Samples. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2020; 63:3897-3908. [PMID: 33151770 PMCID: PMC8608200 DOI: 10.1044/2020_jslhr-20-00294] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/28/2020] [Revised: 07/23/2020] [Accepted: 08/17/2020] [Indexed: 06/11/2023]
Abstract
Purpose Assessment of strained voice quality is difficult due to the weak reliability of auditory-perceptual evaluation and lack of strong acoustic correlates. This study evaluated the contributions of relative fundamental frequency (RFF) and mid-to-high frequency noise to the perception of strain. Method Stimuli were created using recordings of speakers producing /ifi/ with a comfortable voice and with maximum vocal effort. RFF values of the comfortable voice samples were synthetically lowered, and RFF values of the maximum vocal effort samples were synthetically raised. Mid-to-high frequency noise was added to the samples. Twenty listeners rated strain in a visual sort-and-rate task. The effects of RFF modification and added noise on strain were assessed using an analysis of variance; intra- and interrater reliability were compared with and without noise. Results Lowering RFF in the comfortable voice samples increased their perceived strain, whereas raising RFF in the maximum vocal effort samples decreased their strain. Adding noise increased strain and decreased intra- and interrater reliability relative to samples without added noise. Conclusions Both RFF and mid-to-high frequency noise contribute to the perception of strain. The presence of dysphonia may decrease the reliability of auditory-perceptual evaluation of strain, which supports the need for complementary objective assessments. Supplemental Material https://doi.org/10.23641/asha.13172252.
Affiliation(s)
- Yeonggwang Park
- Department of Speech, Language and Hearing Sciences, Boston University, MA
- Manuel Díaz Cádiz
- Department of Speech, Language and Hearing Sciences, Boston University, MA
- Kathleen F. Nagle
- Department of Speech-Language Pathology, Seton Hall University, South Orange, NJ
- Cara E. Stepp
- Department of Speech, Language and Hearing Sciences, Boston University, MA
- Department of Biomedical Engineering, Boston University, MA
- Department of Otolaryngology – Head and Neck Surgery, Boston University School of Medicine, MA
9
Murton O, Hillman R, Mehta D. Cepstral Peak Prominence Values for Clinical Voice Evaluation. AMERICAN JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2020; 29:1596-1607. [PMID: 32658592 PMCID: PMC7893528 DOI: 10.1044/2020_ajslp-20-00001] [Citation(s) in RCA: 79] [Impact Index Per Article: 19.8] [Received: 01/06/2020] [Revised: 03/05/2020] [Accepted: 04/20/2020] [Indexed: 05/24/2023]
Abstract
Purpose The goal of this study was to employ frequently used analysis methods and tasks to identify values for cepstral peak prominence (CPP) that can aid clinical voice evaluation. Experiment 1 identified CPP values to distinguish speakers with and without voice disorders. Experiment 2 was an initial attempt to estimate auditory-perceptual ratings of overall dysphonia severity using CPP values. Method CPP was computed using the Analysis of Dysphonia in Speech and Voice (ADSV) program and Praat. Experiment 1 included recordings from 295 patients with medically diagnosed voice disorders and 50 vocally healthy control speakers. Speakers produced sustained /a/ vowels and the English language Rainbow Passage. CPP cutoff values that best distinguished patient and control speakers were identified. Experiment 2 analyzed recordings from 32 English speakers with varying dysphonia severity and provided preliminary validation of the Experiment 1 cutoffs. Speakers sustained the /a/ vowel and read four sentences from the Consensus Auditory-Perceptual Evaluation of Voice protocol. Trained listeners provided auditory-perceptual ratings of overall dysphonia for the recordings, which were estimated using CPP values in a linear regression model whose performance was evaluated using the coefficient of determination (r²). Results Experiment 1 identified CPP cutoff values of 11.46 dB (ADSV) and 14.45 dB (Praat) for the sustained /a/ vowels and 6.11 dB (ADSV) and 9.33 dB (Praat) for the Rainbow Passage. CPP values below those thresholds indicated the presence of a voice disorder with up to 94.5% accuracy. In Experiment 2, CPP values estimated ratings of overall dysphonia with r² values up to .74. Conclusions The CPP cutoff values identified in Experiment 1 provide normative reference points for clinical voice evaluation based on sustained /a/ vowels and the Rainbow Passage. Experiment 2 provides an initial predictive framework that can be used to relate CPP values to the auditory perception of overall dysphonia severity based on sustained /a/ vowels and Consensus Auditory-Perceptual Evaluation of Voice sentences.
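As a rough illustration of how the cutoffs reported in this abstract might be applied, the following minimal sketch compares a measured CPP value against the threshold for a given program and task. The four cutoff values come from the abstract; the dictionary keys, the function name `below_cutoff`, and the sample value of 10.2 dB are hypothetical and chosen only for illustration. It is not the authors' implementation and not a diagnostic tool.

```python
# CPP cutoffs in dB as reported in the abstract (Experiment 1).
# The key names ("ADSV"/"Praat", "vowel"/"rainbow") are illustrative labels.
CPP_CUTOFFS_DB = {
    ("ADSV", "vowel"): 11.46,    # sustained /a/ vowel
    ("Praat", "vowel"): 14.45,
    ("ADSV", "rainbow"): 6.11,   # Rainbow Passage
    ("Praat", "rainbow"): 9.33,
}


def below_cutoff(cpp_db: float, program: str, task: str) -> bool:
    """Return True when a CPP value falls below the reported cutoff.

    Per the abstract, values below the cutoff indicated the presence of a
    voice disorder (with up to 94.5% accuracy in that study's sample).
    """
    return cpp_db < CPP_CUTOFFS_DB[(program, task)]


# Hypothetical measurement of 10.2 dB, checked against two cutoffs.
print(below_cutoff(10.2, "ADSV", "vowel"))     # 10.2 < 11.46
print(below_cutoff(10.2, "Praat", "rainbow"))  # 10.2 >= 9.33
```

Because the ADSV and Praat cutoffs differ by roughly 3 dB for the same task, a sketch like this has to key the threshold on both the analysis program and the speech task rather than use a single number.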
Affiliation(s)
- Olivia Murton
- Speech and Hearing Bioscience and Technology, Division of Medical Sciences, Harvard Medical School, Boston, MA
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston
- Robert Hillman
- Speech and Hearing Bioscience and Technology, Division of Medical Sciences, Harvard Medical School, Boston, MA
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston
- MGH Institute of Health Professions, Boston, MA
- Department of Surgery, Harvard Medical School, Boston, MA
- Daryush Mehta
- Speech and Hearing Bioscience and Technology, Division of Medical Sciences, Harvard Medical School, Boston, MA
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston
- MGH Institute of Health Professions, Boston, MA
- Department of Surgery, Harvard Medical School, Boston, MA
10
Alharbi GG, Cannito MP, Buder EH, Awan SN. Spectral/Cepstral Analyses of Phonation in Parkinson's Disease before and after Voice Treatment: A Preliminary Study. Folia Phoniatr Logop 2019; 71:275-285. [PMID: 31117110 DOI: 10.1159/000495837] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 06/05/2018] [Accepted: 11/27/2018] [Indexed: 11/19/2022] Open
Abstract
PURPOSE This article examines cepstral/spectral analyses of sustained /α/ vowels produced by speakers with hypokinetic dysarthria secondary to idiopathic Parkinson's disease (PD) before and after Lee Silverman Voice Treatment (LSVT®LOUD) and the relationship of these measures with overall voice intensity. METHODOLOGY Nine speakers with PD were examined in a pre-/post-treatment design, with multiple daily audio recordings before and after treatment. Sustained vowels were analyzed for cepstral peak prominence (CPP), CPP standard deviation (CPP SD), low/high spectral ratio (L/H SR), and Cepstral/Spectral Index of Dysphonia (CSID) using the KAYPENTAX computer software. RESULTS CPP and CPP SD increased significantly and CSID decreased significantly from pre- to post-treatment recordings, with strong effect sizes. Increased CPP indicates increased dominance of harmonics in the spectrum following LSVT. After restricting the frequency cutoff to the region just above the first formant and second formant and below the third formant, L/H SR was observed to decrease significantly following treatment. Correlation analyses demonstrated that CPP was more strongly associated with CSID before treatment than after. CONCLUSION In addition to increased vocal intensity following LSVT, speakers with PD exhibited both improved harmonic structure and voice quality as reflected by cepstral/spectral analysis, indicating that there was improved harmonic structure and reduced dysphonia following treatment.
Affiliation(s)
- Ghadah G Alharbi
- Department of Special Education, College of Education, University of Jeddah, Jeddah, Saudi Arabia
- Michael P Cannito
- Department of Communicative Disorders, University of Louisiana, Lafayette, Louisiana, USA
- Eugene H Buder
- School of Communication Sciences and Disorders, University of Memphis, Memphis, Tennessee, USA
- Shaheen N Awan
- Department of Communication Sciences and Disorders, Bloomsburg University of Pennsylvania, Bloomsburg, Pennsylvania, USA