1
Nguyen DD, Novakovic D, Madill C. Voice disorder discrimination using vowel acoustic measures in female speakers. INTERNATIONAL JOURNAL OF LANGUAGE & COMMUNICATION DISORDERS 2024. [PMID: 38884559 DOI: 10.1111/1460-6984.13081] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Accepted: 05/19/2024] [Indexed: 06/18/2024]
Abstract
BACKGROUND Sustained vowels are important vocal tasks that have been investigated in discriminating voice disorders using acoustic analysis. To date, no study has combined vowel acoustic measures only that evaluate major aspects of the pathological voice signals in voice disorder discrimination. AIMS To investigate the value of vowel acoustic measures that quantify glottal noise, signal stability, signal periodicity, spectral slope and overall voice quality in discriminating female speakers with and without voice disorders. METHODS & PROCEDURES Sustained vowel /ɑ/ samples were extracted from 133 voice-disordered female patients and 97 non-voice disordered female speakers and were signal typed prior to analysis. Praat software was used to measure harmonics-to-noise ratio (HNR), glottal-to-noise excitation ratio (GNE), the standard deviation of fundamental frequency (F0SD) and cepstral peak prominence (CPPp); and the Analysis of Dysphonia in Speech and Voice (ADSV) program was used to measure CPPadsv, low/high spectral ratio (LH) and the cepstral/spectral index of dysphonia (CSID). Outcome measures included sensitivity, specificity, and discrimination accuracy. OUTCOMES & RESULTS As individual acoustic measures, only spectral-based measures showed good (CPPadsv) and acceptable (CSID) discrimination results. The HNR, GNE and CPPp measures had acceptable sensitivity but poor or non-acceptable specificity and discrimination accuracy. Logistic regression models with all Praat measures (F0SD, HNR, GNE, CPPp) plus ADSV measures (CPPadsv, LH or CSID) provided excellent sensitivity, good-to-excellent specificity and excellent discrimination accuracy. ROC analysis for all individual measures showed that CPPadsv, CSID, CPPp, GNE and F0SD had the highest area under the curve (AUC) values. CONCLUSIONS & IMPLICATIONS A combination of acoustic measures that evaluate the major aspects of vocal dysfunction resulted in good to excellent voice discrimination outcomes. 
Individual acoustic measures had lower discrimination ability than combined measures. The findings implied that acoustic measures extracted from a prolonged vowel were useful in voice disorder discrimination. WHAT THIS PAPER ADDS What is already known on this subject Acoustic measures hold great value in discriminating voice disorders from normal voices. However, no study has evaluated discrimination values of a combination of sustained vowel acoustic measures that quantify additive noise, signal stability, signal periodicity, spectral slope and overall voice quality in single-gender cohorts. Previous studies have not used signal typing (the classification of the acoustic signals) for time-based measures, impacting the reliability of discrimination. What this study adds to the existing knowledge This study was the first to implement signal typing to include sustained vowel samples of Types 1 and 2 signals for discrimination statistics. We showed that a combination of vocal acoustic measures using time- and spectral-based extraction from the sustained /ɑ/ vowel evaluating additive noise, signal stability, signal periodicity, spectral slope and overall voice quality resulted in good to excellent sensitivity, specificity and discrimination accuracy. As individual measures, traditional time-based measures such as HNR had rather limited discrimination values whilst spectral-based measures provided higher discrimination values. Measures that are sensitive to signal types have low discrimination ability. What are the potential or actual clinical implications of this work? The sustained vowel /ɑ/ is a relevant, universal vocal task for clinical application using acoustic measures to discriminate female speakers with and without voice disorders if signal typing is implemented. Clinical voice assessment using vowels may not be effective if relying solely on time-based measurements. 
Spectral-based measures perform better in voice disorder discrimination given their insensitivity to signal types. The most effective voice disorder discrimination could only be obtained using a combination of acoustic measures that quantify major phenomena in the signals of disordered voices. Using measures extracted from both programs, Praat and ADSV, is useful given that specific settings in a program may impact on discrimination accuracy.
Affiliation(s)
- Duy Duong Nguyen
- Voice Research Laboratory, Sydney School of Health Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW, Australia
- Daniel Novakovic
- Voice Research Laboratory, Sydney School of Health Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW, Australia
- Catherine Madill
- Voice Research Laboratory, Sydney School of Health Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW, Australia
2
Toles LE, Shembel AC. Acoustic and Physiologic Correlates of Vocal Effort in Individuals With and Without Primary Muscle Tension Dysphonia. AMERICAN JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2024; 33:237-247. [PMID: 37931092 PMCID: PMC11000796 DOI: 10.1044/2023_ajslp-23-00159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Revised: 08/23/2023] [Accepted: 09/16/2023] [Indexed: 11/08/2023]
Abstract
OBJECTIVES The aims of this study were to determine relationships between vocal effort and (a) acoustic correlates of vocal output and (b) supraglottic compression in individuals with primary muscle tension dysphonia (pMTD) and without voice disorders (controls) in the context of a vocal load challenge. METHOD Twenty-six individuals with pMTD and 35 vocally healthy controls participated in a 30-min vocal load challenge. The pre- and postload relationships among self-ratings of vocal effort, various acoustic voice measures, and supraglottic compression (mediolateral and anteroposterior) were tested with multiple regression models and post hoc Pearson's correlations. Acoustic measures included cepstral peak prominence (CPP), low-to-high spectral ratio, difference in intensity between the first two harmonics, fundamental frequency, and sound pressure level (dB SPL). RESULTS Regression models for CPP and mediolateral compression were statistically significant. Vocal effort, diagnosis of pMTD, and vocal demand were each significant variables influencing CPP measures. CPP was lower in the pMTD group across stages. There was no statistical change in CPP following the vocal load challenge within either group, but both groups had an increase in vocal effort postload. Vocal effort and diagnosis influenced the mediolateral compression model. Mediolateral compression was higher in the pMTD group across stages and had a negative relationship with vocal effort, but it did not differ after vocal loading. CONCLUSIONS CPP and mediolateral supraglottic compression were influenced by vocal effort and diagnosis of pMTD. Increased vocal effort was associated with lower CPP, particularly after vocal load, and decreased mediolateral supraglottic compression in the pMTD group.
Affiliation(s)
- Laura E. Toles
- Department of Otolaryngology–Head and Neck Surgery, The University of Texas Southwestern Medical Center, Dallas
- Adrianna C. Shembel
- Department of Otolaryngology–Head and Neck Surgery, The University of Texas Southwestern Medical Center, Dallas
- School of Behavioral and Brain Sciences, Department of Speech, Language, and Hearing, The University of Texas at Dallas, Richardson
3
Dimos K, He L, Dellwo V. Shouting affects temporal properties of the speech amplitude envelope. JASA EXPRESS LETTERS 2024; 4:015202. [PMID: 38169314 DOI: 10.1121/10.0023995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Accepted: 11/27/2023] [Indexed: 01/05/2024]
Abstract
Distinguishing shouted from non-shouted speech is crucial in communication. We examined how shouting affects temporal properties of the amplitude envelope (ENV) in a total of 720 sentences read by 18 Swiss German speakers in normal and shouted modes; shouting was characterised by maintaining sound pressure levels of ≥80 dB sound pressure level (dB-SPL) (C-weighted) at a 1-meter distance from the mouth. Generalized additive models revealed significant temporal alterations of ENV in shouted speech, marked by steeper ascent, delayed peak, and extended high levels. These findings offer potential cues for identifying shouting, particularly useful when fine-structure and dynamic range cues are absent, for example, in cochlear implant users.
Affiliation(s)
- Kostis Dimos
- Department of Computational Linguistics, University of Zurich, Zurich
- Lei He
- Department of Computational Linguistics, University of Zurich, Zurich
- Volker Dellwo
- Department of Computational Linguistics, University of Zurich, Zurich
4
Cacace AT, Berri B. Blast Overpressures as a Military and Occupational Health Concern. Am J Audiol 2023; 32:779-792. [PMID: 37713532 DOI: 10.1044/2023_aja-23-00125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/17/2023]
Abstract
PURPOSE This tutorial reviews effects of environmental stressors like blast overpressures and other well-known acoustic contaminants (continuous, intermittent, and impulsive noise) on hearing, tinnitus, vestibular, and balance-related functions. Based on the overall outcome of these effects, detailed consideration is given to the health and well-being of individuals. METHOD Because hearing loss and tinnitus are consequential in affecting quality of life, novel neuromodulation paradigms are reviewed for their positive abatement and treatment-related effects. Examples of clinical data, research strategies, and methodological approaches focus on repetitive transcranial magnetic stimulation (rTMS) and electrical stimulation of the vagus nerve paired with tones (VNSt) for their unique contributions to this area. RESULTS Acoustic toxicants transmitted through the atmosphere are noteworthy for their propensity to induce hearing loss and tinnitus. Mounting evidence also indicates that high-level rapid onset changes in atmospheric sound pressure can significantly impact vestibular and balance function. Indeed, the risk of falling secondary to loss of, or damage to, sensory receptor cells in otolith organs (utricle and saccule) is a primary reason for this concern. As part of the complexities involved in VNSt treatment strategies, vocal dysfunction may also manifest. In addition, evaluation of temporospatial gait parameters is worthy of consideration based on their ability to detect and monitor incipient neurological disease, cognitive decline, and mortality. CONCLUSION Highlighting these respective areas underscores the need to enhance information exchange among scientists, clinicians, and caregivers on the benefits and complications of these outcomes.
Affiliation(s)
- Anthony T Cacace
- Department of Communication Sciences & Disorders, Wayne State University, Detroit, MI
- Batoul Berri
- Department of Communication Sciences & Disorders, Wayne State University, Detroit, MI
- Department of Otolaryngology, University of Michigan, Ann Arbor
5
Whitling S, Wan Q, Berardi ML, Hunter EJ. Effects of warm-up exercises on self-assessed vocal effort. LOGOP PHONIATR VOCO 2023; 48:172-179. [PMID: 35713650 PMCID: PMC10020864 DOI: 10.1080/14015439.2022.2075459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2021] [Accepted: 04/29/2022] [Indexed: 10/18/2022]
Abstract
PURPOSE An elevated sense of vocal effort due to increased vocal demand is frequently reported by patients with voice disorders. However, effects of vocal warm-up on self-assessed vocal effort have not been thoroughly examined. A recently developed version of the Borg CR-10 Scale facilitates vocal effort assessments following different vocal warm-up tasks. METHODS Effects of a short (5 min) vocal warm-up on self-assessed vocal effort were evaluated using the Borg CR-10. Twenty-six vocally healthy participants (13F, 13M, mean age 22.6), in two randomised groups, underwent sessions of either reading aloud or semi-occluded vocal tract exercises (SOVTE). Vocal effort was evaluated at four times: pre to post vocal warm-up and two silence periods. Non-parametric analyses for repeated measures and calculations for within-subject standard deviation were applied in group comparisons. RESULTS Following vocal warm-up, vocal effort ratings were increased to a statistically significant degree in both intervention groups compared to baseline ratings. After a 5-min rest in silence following completion of the vocal warm-up, vocal effort ratings returned to baseline levels in both groups. The drop in ratings immediately post warm-up compared to 5 min later was statistically significant for the SOVTE group. CONCLUSIONS Five minutes of vocal warm-up caused increased self-perceived vocal effort in vocally healthy individuals. The increased sense of effort dissipated faster following warm-up for the SOVTE group. When using the Borg CR-10 scale to track vocal effort, it may be beneficial to apply experience-based anchors.
Affiliation(s)
- Susanna Whitling
- Department of Logopedics Phoniatrics and Audiology, Lund University, Lund, Sweden
- Qin Wan
- School of Education Science, East China Normal University, Shanghai, China
- Eric J. Hunter
- Department of Clinical Sciences and Disorders, Michigan State University, East Lansing, USA
6
Park Y, Baker Brehm S, Kelchner L, Weinrich B, McElfresh K, Anand S, Shrivastav R, de Alarcon A, Eddins DA. Effects of Vibratory Source on Auditory-Perceptual and Bio-Inspired Computational Measures of Pediatric Voice Quality. J Voice 2023:S0892-1997(23)00254-0. [PMID: 37739862 PMCID: PMC10950844 DOI: 10.1016/j.jvoice.2023.08.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 08/11/2023] [Accepted: 08/14/2023] [Indexed: 09/24/2023]
Abstract
OBJECTIVE The vibratory source for voicing in children with dysphonia is classified into three categories including a glottal vibratory source (GVS) observed in those with vocal lesions or hyperfunction; supraglottal vibratory sources (SGVS) observed secondary to laryngeal airway injuries, malformations, or reconstruction surgeries; and a combination of both glottal and supraglottal vibratory sources called mixed vibratory source (MVS). This study evaluated the effects of vibratory source on three primary dimensions of voice quality (breathiness, roughness, and strain) in children with GVS, SGVS, and MVS using single-variable matching tasks and computational measures obtained from bio-inspired auditory models. METHODS A total of 44 dysphonic voice samples from children aged 4-11 years were selected. Seven listeners rated breathiness, roughness, and strain of 1000-ms /ɑ/ samples using single-variable matching tasks. Computational estimates of pitch strength, amplitude modulation filterbank output, and sharpness were obtained through custom-designed MATLAB algorithms. RESULTS Perceived roughness and strain were significantly higher in children with SGVS and MVS compared to children with GVS. Among the computational measures, only the modulation filterbank output resulted in significant differences among vibratory sources; a posthoc test revealed that children with SGVS had greater amplitude modulation than children with GVS, as expected from their rougher voice quality. CONCLUSIONS The results indicate that the output of an auditory amplitude modulation filterbank model may capture characteristics of SGVS that are strongly related to the rough voice quality.
Affiliation(s)
- Yeonggwang Park
- Department of Communication Sciences and Disorders, University of Central Florida, Orlando, Florida
- Susan Baker Brehm
- Department of Speech Pathology and Audiology, Miami University, Oxford, Ohio; Division of Speech-Language Pathology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio
- Lisa Kelchner
- Division of Speech-Language Pathology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio; Department of Communication Sciences and Disorders, University of Cincinnati, Cincinnati, Ohio
- Barbara Weinrich
- Department of Speech Pathology and Audiology, Miami University, Oxford, Ohio; Division of Speech-Language Pathology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio
- Kevin McElfresh
- Division of Speech-Language Pathology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio
- Supraja Anand
- Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida
- Rahul Shrivastav
- Office of the Provost & Executive Vice President, Indiana University, Bloomington, Indiana
- Alessandro de Alarcon
- Pediatric Otolaryngology Head & Neck Surgery, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio
- David A Eddins
- Department of Communication Sciences and Disorders, University of Central Florida, Orlando, Florida
7
McKenna VS, Patel TH, Kendall CL, Howell RJ, Gustin RL. Voice Acoustics and Vocal Effort in Mask-Wearing Healthcare Professionals: A Comparison Pre- and Post-Workday. J Voice 2023; 37:802.e15-802.e23. [PMID: 34112547 DOI: 10.1016/j.jvoice.2021.04.016] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 02/21/2021] [Revised: 04/20/2021] [Accepted: 04/27/2021] [Indexed: 01/17/2023]
Abstract
OBJECTIVE We evaluated voice acoustics and self-perceptual ratings in healthcare workers required to wear face masks throughout their workday. METHODS Eighteen subjects (11 cisgender female, 7 cisgender male; M = 33.72 years, SD = 8.30) completed self-perceptual ratings and acoustic recordings before and after a typical workday. Chosen measures were specific to vocal effort, dysphonia, and laryngeal tension. Mixed effects models were calculated to determine the impact of session, mask type, sex, and their interactions on the set of perceptual and acoustic measures. RESULTS The subjects self-reported a significant increase in vocal effort following the workday. These perceptual changes coincided with an increase in vocal intensity and harmonics-to-noise ratio, but decrease in relative fundamental frequency offset 10. As expected, men and women differed in measures related to fundamental frequency and vocal tract length. CONCLUSION Healthcare professionals wearing masks reported greater vocal symptoms post-workday compared to pre-workday. These symptoms coincided with acoustic changes previously related to vocal effort; however, the degree of change was considered mild. Further research is needed to determine whether vocal hygiene strategies may reduce vocal symptoms in mask-wearing workers.
Affiliation(s)
- Victoria S McKenna
- Department of Communication Sciences and Disorders, University of Cincinnati; Department of Biomedical Engineering, University of Cincinnati
- Tulsi H Patel
- Department of Communication Sciences and Disorders, University of Cincinnati
- Courtney L Kendall
- Department of Communication Sciences and Disorders, University of Cincinnati
- Rebecca J Howell
- Department of Otolaryngology-Head & Neck Surgery, University of Cincinnati
- Renee L Gustin
- Department of Otolaryngology-Head & Neck Surgery, University of Cincinnati
8
Castillo-Allendes A, Guzmán-Ferrada D, Hunter EJ, Fuentes-López E. Tracking Occupational Voice State with a Visual Analog Scale: Voice Quality, Vocal Fatigue, and Effort. Laryngoscope 2023; 133:1676-1682. [PMID: 36134759 DOI: 10.1002/lary.30398] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2022] [Revised: 08/03/2022] [Accepted: 08/17/2022] [Indexed: 11/06/2022]
Abstract
BACKGROUND Due to elevated vocal health risk in industries such as call centers, there is a need to have accessible and quick self-report tools for voice symptoms. This study aimed to determine if the concurrent and construct validity of three visual analog scales (VASs) of voice quality and symptoms could be used as a screening tool in call center agents. METHODS A cross-sectional study was carried out in three call center companies. The Voice Handicap Index-10 (VHI-10) and a vocal hygiene and symptoms survey were administered to 66 call center workers. Further, acoustic parameters including harmonics-to-noise ratio (HNR), smoothed cepstral peak prominence (CPPs), L1-L0 slope, and Alpha ratio were collected. Finally, workers completed three VASs capturing self-perception of vocal effort (VAS-1), voice quality (VAS-2), and vocal fatigue (VAS-3). Linear regression models with bootstrapping evaluated the possible relationship between the three VASs measurements, self-perceived vocal symptoms, and acoustic parameters. RESULTS VAS-1 scores were associated with HNR and voice breaks, VAS-2 with voice breaks, and VAS-3 with Alpha ratio. Using the area under a receiver operating characteristic curve (AUC), the highest AUC for detecting an altered VHI-10 questionnaire score was observed for the three VASs. Also, the highest AUC for detecting altered CPPs was reached for the VAS-1. CONCLUSIONS VAS as a self-report instrument of vocal symptoms is related to psychosocial voice impairment and alterations of acoustic voice parameters in call center workers. Such instruments could be easily implemented to identify voice complaints in these populations. LEVEL OF EVIDENCE 2 (Diagnosis research question) Laryngoscope, 133:1676-1682, 2023.
Affiliation(s)
- Adrián Castillo-Allendes
- Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, Michigan, USA
- Daniel Guzmán-Ferrada
- Escuela de Fonoaudiología, Facultad de Ciencias de la Salud, Universidad Bernardo O'Higgins, Santiago, Chile
- Eric J Hunter
- Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, Michigan, USA
- Eduardo Fuentes-López
- Carrera de Fonoaudiología, Departamento de Ciencias de la Salud, Facultad de Medicina, Pontificia Universidad Católica de Chile, Santiago, Chile
9
Nguyen DD, Madill C. Auditory-perceptual Parameters as Predictors of Voice Acoustic Measures. J Voice 2023:S0892-1997(23)00088-7. [PMID: 37003863 DOI: 10.1016/j.jvoice.2023.02.030] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/15/2022] [Revised: 02/23/2023] [Accepted: 02/23/2023] [Indexed: 04/03/2023]
Abstract
BACKGROUND Much research has examined the relationship between perceptual and acoustic measures. However, little is known about the prediction values of perceptual measures on an acoustic parameter. AIMS This study utilized simulated and disordered voice samples to investigate the prediction values of breathiness, roughness, and strain ratings on the selection of some time-based and spectral-based measures of voice quality. METHOD This study retrospectively analysed two sets of precollected data. The experimental data had been collected from nine trained speakers manipulating false vocal fold activity, true vocal fold mass, and larynx height. The voice-disordered data had been extracted from a clinical database for 68 patients with muscle tension voice disorders (MTVD). Both data sets had been perceptually rated for breathiness, roughness, and strain. Voice samples (prolonged vowel /ɑ/ and Rainbow Passage readings) had undergone acoustic analysis using Praat for harmonics-to-noise ratio (HNR) and the program "Analysis of Dysphonia in Speech and Voice" (ADSV) for cepstral peak prominence (CPP), Cepstral/Spectral Index of Dysphonia (CSID), and Low/High spectral ratio (L/H ratio). Perceptual parameters were regressed against these acoustic measures to test their prediction values. RESULTS Reliability data showed satisfactory intra- and inter-reliability of perceptual ratings for both data sets. Breathiness significantly predicted CPP (both vocal tasks) and CSID (Rainbow Passage) in experimental data and predicted all the acoustic measures in MTVD data. Roughness significantly predicted HNR, CPP, and CSID in experimental data, and CPP (Rainbow Passage) and CSID (both vocal tasks) in MTVD data. Strain (both vocal tasks) significantly predicted L/H ratio in both data sets. CONCLUSIONS Breathiness ratings predicted selection of HNR, CPP and CSID; roughness ratings predicted selection of CPP and CSID, and strain ratings predicted L/H ratio.
Affiliation(s)
- Duy Duong Nguyen
- Voice Research Laboratory, Sydney School of Health Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Catherine Madill
- Voice Research Laboratory, Sydney School of Health Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
10
Park Y, Anand S, Gifford SM, Shrivastav R, Eddins DA. Development and Validation of a Single-Variable Comparison Stimulus for Matching Strained Voice Quality Using a Psychoacoustic Framework. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2023; 66:16-29. [PMID: 36516473 PMCID: PMC10023177 DOI: 10.1044/2022_jslhr-22-00280] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/16/2022] [Revised: 08/17/2022] [Accepted: 09/01/2022] [Indexed: 06/17/2023]
Abstract
PURPOSE Acoustic and perceptual quantification of vocal strain has been a vexing problem for years. To increase measurement rigor, a suitable single-variable matching stimulus for strain was developed and validated, based on the matching stimulus used previously for breathy and rough voice qualities. METHOD A set of 21 comparison stimuli for a single-variable matching task (SVMT) was synthesized based on a speech-shaped sawtooth waveform mixed with speech-shaped noise. Variable bandpass filter gain in mid-to-high frequencies achieved a wide range of computed sharpness (in constant sharpness steps) and served as the independent variable for the SVMT. Ten natural /ɑ/ stimuli with a wide range of the primary voice quality of strain and a minimum of breathiness or roughness were selected and assessed using the SVMT. Natural voice samples and synthetic comparison stimuli were also assessed using a perceptual magnitude estimation (ME) task. RESULTS ME data validated the correspondence of the set of comparison stimuli to varying perceived strain. Perceived strain magnitudes of the comparison stimuli increased significantly and linearly with computed sharpness (r² = .99). A linear regression revealed that strain matching values were significantly predicted by computed sharpness (r² = .96) and perceived strain magnitudes (r² = .95) of the natural voice stimuli. CONCLUSION The perception of vocal strain is strongly associated with computed sharpness and is captured accurately and precisely using an SVMT, in which the independent variable is the bandpass filter gain (in steps of equal sharpness) applied to the comparison stimuli.
Affiliation(s)
- Yeonggwang Park
- Department of Communication Sciences & Disorders, University of South Florida, Tampa
- Supraja Anand
- Department of Communication Sciences & Disorders, University of South Florida, Tampa
- Sophia M. Gifford
- Department of Communication Sciences & Disorders, University of South Florida, Tampa
- Rahul Shrivastav
- Office of the Provost & Executive Vice President, Indiana University, Bloomington
- David A. Eddins
- Department of Communication Sciences & Disorders, University of South Florida, Tampa
11
Harding EE, Gaudrain E, Hrycyk IJ, Harris RL, Tillmann B, Maat B, Free RH, Başkent D. Musical Emotion Categorization with Vocoders of Varying Temporal and Spectral Content. Trends Hear 2023; 27:23312165221141142. [PMID: 36628512 PMCID: PMC9837297 DOI: 10.1177/23312165221141142] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 01/12/2023]
Abstract
While previous research investigating music emotion perception of cochlear implant (CI) users observed that temporal cues informing tempo largely convey emotional arousal (relaxing/stimulating), it remains unclear how other properties of the temporal content may contribute to the transmission of arousal features. Moreover, while detailed spectral information related to pitch and harmony in music, often not well perceived by CI users, reportedly conveys emotional valence (positive, negative), it remains unclear how the quality of spectral content contributes to valence perception. Therefore, the current study used vocoders to vary temporal and spectral content of music and tested music emotion categorization (joy, fear, serenity, sadness) in 23 normal-hearing participants. Vocoders were varied with two carriers (sinewave or noise; primarily modulating temporal information), and two filter orders (low or high; primarily modulating spectral information). Results indicated that emotion categorization was above-chance in vocoded excerpts but poorer than in a non-vocoded control condition. Among vocoded conditions, better temporal content (sinewave carriers) improved emotion categorization with a large effect while better spectral content (high filter order) improved it with a small effect. Arousal features were comparably transmitted in non-vocoded and vocoded conditions, indicating that lower temporal content successfully conveyed emotional arousal. Valence feature transmission steeply declined in vocoded conditions, revealing that valence perception was difficult for both lower and higher spectral content. The reliance on arousal information for emotion categorization of vocoded music suggests that efforts to refine temporal cues in the CI user signal may immediately benefit their music emotion perception.
Collapse
Affiliation(s)
- Eleanor E. Harding
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen,
The Netherlands,Graduate School of Medical Sciences, Research School of Behavioural
and Cognitive Neurosciences, University of Groningen, Groningen,
The Netherlands,Prins Claus Conservatoire, Hanze University of Applied Sciences, Groningen, The Netherlands,Eleanor E. Harding, Department of Otorhinolarynology, University Medical Center Groningen, Hanzeplein 1 9713 GZ, Groningen, The Netherlands.
| | - Etienne Gaudrain
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands; Lyon Neuroscience Research Center, CNRS UMR5292, Inserm U1028, Université Lyon 1, Université de Saint-Etienne, Lyon, France
| | - Imke J. Hrycyk
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands; Graduate School of Medical Sciences, Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, The Netherlands
| | - Robert L. Harris
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands; Prins Claus Conservatoire, Hanze University of Applied Sciences, Groningen, The Netherlands
| | - Barbara Tillmann
- Lyon Neuroscience Research Center, CNRS UMR5292, Inserm U1028, Université Lyon 1, Université de Saint-Etienne, Lyon, France
| | - Bert Maat
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands; Graduate School of Medical Sciences, Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, The Netherlands; Cochlear Implant Center Northern Netherlands, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| | - Rolien H. Free
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands; Graduate School of Medical Sciences, Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, The Netherlands; Cochlear Implant Center Northern Netherlands, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| | - Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands; Graduate School of Medical Sciences, Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, The Netherlands
12
Groll MD, Peterson SD, Zañartu M, Vojtech JM, Stepp CE. Empirical Evaluation of the Role of Vocal Fold Collision on Relative Fundamental Frequency in Voicing Offset. J Voice 2022:S0892-1997(22)00291-0. [PMID: 36336485 PMCID: PMC10154433 DOI: 10.1016/j.jvoice.2022.09.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2022] [Revised: 09/16/2022] [Accepted: 09/19/2022] [Indexed: 11/06/2022]
Abstract
OBJECTIVES Relative fundamental frequency (RFF) is an acoustic measure of changes in fundamental frequency during voicing transitions. The physiological mechanisms underlying RFF remain unclear. Recent modeling suggests that changes in RFF during voicing offset are due to decreases in overall system stiffness as a direct result of the cessation of vocal fold collision. To evaluate this finding empirically, here we examined whether variable timing between the end of vocal fold collision and the final voicing cycle used to calculate RFF explained the variability in RFF across individual voicing offset utterances. METHODS RFF during voicing offset was calculated from /ifi/ utterances produced by 35 participants under endoscopy, with and without vocal effort. RFF was calculated via two methods, in which utterances were aligned by (1) the end of vocal fold collision, or (2) the end of voicing. Analyses of variance were used to determine the effects of vocal effort and RFF method on the mean and standard deviation of RFF. RESULTS Aligning by vocal fold collision resulted in statistically significantly lower standard deviations. RFF means were statistically higher using the collision method; however, the degree of vocal effort was statistically significant regardless of the method. CONCLUSIONS These results provide empirical evidence to support that decreases in RFF during voicing offset are a result of decreases in system stiffness due to termination of vocal fold collision.
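As a rough illustration of the measure discussed in this abstract, the sketch below expresses per-cycle fundamental frequency in semitones relative to a reference value and summarizes it by mean and standard deviation. The semitone conversion (12 × log2 of the frequency ratio) is standard; the function names, the choice of reference, and the example frequencies are hypothetical and are not taken from the study.

```python
import math


def rff_semitones(cycle_f0_hz, reference_f0_hz):
    """Express each cycle's f0 in semitones relative to a reference f0."""
    return [12.0 * math.log2(f / reference_f0_hz) for f in cycle_f0_hz]


def mean_and_sd(values):
    """Return the mean and sample standard deviation of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance)


# Hypothetical f0 values (Hz) for the last cycles before a voiceless consonant.
offset_cycles_hz = [200.0, 199.0, 197.5, 195.0, 191.0]
rff = rff_semitones(offset_cycles_hz, reference_f0_hz=200.0)
rff_mean, rff_sd = mean_and_sd(rff)
```

Changing which cycle counts as the final one (end of collision versus end of voicing) only changes which f0 values are passed in; the conversion itself stays the same.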
Affiliation(s)
- Matti D Groll
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts; Department of Speech, Language, and Hearing Sciences, Boston University, Boston, Massachusetts.
| | - Sean D Peterson
- Department of Mechanical and Mechatronics Engineering, University of Waterloo, Ontario, Canada
| | - Matías Zañartu
- Department of Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso, Chile
| | - Jennifer M Vojtech
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts; Department of Speech, Language, and Hearing Sciences, Boston University, Boston, Massachusetts
| | - Cara E Stepp
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts; Department of Speech, Language, and Hearing Sciences, Boston University, Boston, Massachusetts; Department of Otolaryngology-Head and Neck Surgery, Boston University School of Medicine, Boston, Massachusetts
13
Cortés JP, Lin JZ, Marks KL, Espinoza VM, Ibarra EJ, Zañartu M, Hillman RE, Mehta DD. Ambulatory Monitoring of Subglottal Pressure Estimated from Neck-Surface Vibration in Individuals with and without Voice Disorders. APPLIED SCIENCES (BASEL, SWITZERLAND) 2022; 12:10692. [PMID: 36777332 PMCID: PMC9910342 DOI: 10.3390/app122110692] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
The aerodynamic voice assessment of subglottal air pressure can discriminate speakers with typical voices from patients with voice disorders, with further evidence validating subglottal pressure as a clinical outcome measure. Although estimating subglottal pressure during phonation is an important component of a standard voice assessment, current methods for estimating subglottal pressure rely on non-natural speech tasks in a clinical or laboratory setting. This study reports on the validation of a method for subglottal pressure estimation in individuals with and without voice disorders that can be translated to connected speech to enable the monitoring of vocal function and behavior in real-world settings. During a laboratory calibration session, a participant-specific multiple regression model was derived to estimate subglottal pressure from a neck-surface vibration signal that can be recorded during natural speech production. The model was derived for vocally typical individuals and patients diagnosed with phonotraumatic vocal fold lesions, primary muscle tension dysphonia, and unilateral vocal fold paralysis. Estimates of subglottal pressure using the developed method exhibited significantly lower error than alternative methods in the literature, with average errors ranging from 1.13 to 2.08 cm H2O for the participant groups. The model was then applied during activities of daily living, thus yielding ambulatory estimates of subglottal pressure for the first time in these populations. Results point to the feasibility and potential of real-time monitoring of subglottal pressure during an individual's daily life for the prevention, assessment, and treatment of voice disorders.
Affiliation(s)
- Juan P. Cortés
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso 2390123, Chile
| | - Jon Z. Lin
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Katherine L. Marks
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, MA 02114, USA
- Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, MA 02129, USA
- Speech, Language & Hearing Sciences Department, College of Health & Rehabilitation: Sargent College, Boston University, Boston, MA 02215, USA
| | | | - Emiro J. Ibarra
- Department of Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso 2390123, Chile
| | - Matías Zañartu
- Department of Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso 2390123, Chile
| | - Robert E. Hillman
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, MA 02114, USA
- Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, MA 02129, USA
- Department of Surgery, Massachusetts General Hospital–Harvard Medical School, Boston, MA 02114, USA
- Speech and Hearing Bioscience and Technology, Division of Medical Sciences, Harvard Medical School, Boston, MA 02115, USA
| | - Daryush D. Mehta
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, MA 02114, USA
- Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, MA 02129, USA
- Department of Surgery, Massachusetts General Hospital–Harvard Medical School, Boston, MA 02114, USA
- Speech and Hearing Bioscience and Technology, Division of Medical Sciences, Harvard Medical School, Boston, MA 02115, USA
14
Paschalidou S. Effort inference and prediction by acoustic and movement descriptors in interactions with imaginary objects during Dhrupad vocal improvisation. WEARABLE TECHNOLOGIES 2022; 3:e14. [PMID: 38486912 PMCID: PMC10936277 DOI: 10.1017/wtc.2022.8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2021] [Revised: 05/05/2022] [Accepted: 05/19/2022] [Indexed: 03/17/2024]
Abstract
In electronic musical instruments (EMIs), the concept of "sound sculpting" proposed by Mulder, in which imaginary objects are manually sculpted to produce sounds, is promising but has had some limitations: driven by pure intuition, only the objects' geometrical properties were mapped to sound, while effort, which is often regarded as a key factor of expressivity in music performance, was neglected. The aim of this paper is to enhance such digital interactions by accounting for the perceptual measure of effort that is conveyed through well-established gesture-sound links in the ecologically valid conditions of non-digital music performances. Thus, it reports on the systematic exploration of effort in Dhrupad vocal improvisation, in which singers are often observed to engage with melodic ideas by manipulating intangible, imaginary objects with their hands. The focus is on devising formalized descriptions to infer the amount of effort that such interactions are perceived to require and to classify gestures as interactions with elastic versus rigid objects, based on original multimodal data collected in India for the specific study. Results suggest that a good part of the variance for both effort levels and gesture classes can be explained through a small set of statistically significant acoustic and movement features extracted from the raw data, and they lead to rejecting the null hypothesis that effort is unrelated to the musical context. This may have implications for how EMIs could benefit from effort as an intermediate mapping layer and naturally opens discussions on whether physiological data may offer a more intuitive measure of effort in wearable technologies.
Affiliation(s)
- Stella Paschalidou
- Hellenic Mediterranean University, School of Music and Optoacoustic Technologies, Department of Music Technology and Acoustics, Greece
15
Dahl KL, François FA, Buckley DP, Stepp CE. Voice and Speech Changes in Transmasculine Individuals Following Circumlaryngeal Massage and Laryngeal Reposturing. AMERICAN JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2022; 31:1368-1382. [PMID: 35394801 PMCID: PMC9567379 DOI: 10.1044/2022_ajslp-21-00245] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/09/2021] [Revised: 01/03/2022] [Accepted: 01/24/2022] [Indexed: 05/26/2023]
Abstract
PURPOSE The purpose of this study was to measure the short-term effects of circumlaryngeal massage and laryngeal reposturing on acoustic and perceptual characteristics of voice in transmasculine individuals. METHOD Fifteen transmasculine individuals underwent one session of sequential circumlaryngeal massage and laryngeal reposturing with a speech-language pathologist. Voice recordings were collected at three time points: baseline, postmassage, and postreposturing. Fundamental frequency (fo), formant frequencies, and relative fundamental frequency (RFF; an acoustic correlate of laryngeal tension) were measured. Estimates of vocal tract length (VTL) were derived from formant frequencies. Twelve listeners rated the perceived masculinity of participants' voices at each time point. Repeated-measures analyses of variance measured the effect of time point on fo, estimated VTL, RFF, and perceived voice masculinity. Significant effects were evaluated with post hoc Tukey's tests. RESULTS Between baseline and end of the session, fo decreased, VTL increased, and participant voices were perceived as more masculine, all with statistically significant differences. RFF did not differ significantly at any time point. Outcomes were highly variable at the individual level. CONCLUSION Circumlaryngeal massage and laryngeal reposturing have short-term effects on select acoustic (fo, estimated VTL) and perceptual characteristics (listener-assigned voice masculinity) of voice in transmasculine individuals. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.19529299.
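The abstract does not say how vocal tract length was derived from formant frequencies. One common approximation, shown here only as an assumed sketch, treats the vocal tract as a uniform tube closed at one end, whose n-th resonance is F_n = (2n - 1)c / (4L), and averages the length implied by each formant. The speed of sound value, the function name, and the example formants are assumptions for illustration.

```python
SPEED_OF_SOUND_CM_S = 35000.0  # assumed speed of sound in warm, humid air (cm/s)


def vtl_from_formants(formants_hz, speed_of_sound=SPEED_OF_SOUND_CM_S):
    """Average the tube length (cm) implied by each formant F1, F2, ...

    Uses the quarter-wavelength resonator relation
    L = (2n - 1) * c / (4 * F_n) for the n-th formant.
    """
    estimates = [
        (2 * n - 1) * speed_of_sound / (4.0 * f)
        for n, f in enumerate(formants_hz, start=1)
    ]
    return sum(estimates) / len(estimates)


# Hypothetical formants of a neutral vowel in an idealized 17.5 cm tube.
example_vtl_cm = vtl_from_formants([500.0, 1500.0, 2500.0])
```

Under this model, lower formants imply a longer estimated tract, which is the direction of change the abstract reports after the session.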
Affiliation(s)
- Kimberly L. Dahl
- Department of Speech, Language & Hearing Sciences, Boston University, MA
| | | | - Daniel P. Buckley
- Department of Speech, Language & Hearing Sciences, Boston University, MA
- Department of Otolaryngology—Head & Neck Surgery, Boston University School of Medicine, MA
| | - Cara E. Stepp
- Department of Speech, Language & Hearing Sciences, Boston University, MA
- Department of Otolaryngology—Head & Neck Surgery, Boston University School of Medicine, MA
- Department of Biomedical Engineering, Boston University, MA
16
Brkic FF, Liu DT, Campion NJ, Leonhard M, Altumbabic S, Korlatovic M, Kaider A, Kabil-Hamidovic J, Brkic F, Vyskocil E. Changes in Acoustic Aspects of Vocal Function in Children After Adenotonsillectomy. J Voice 2022; 36:438.e19-438.e24. [PMID: 32703724 DOI: 10.1016/j.jvoice.2020.06.022] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/11/2020] [Revised: 06/17/2020] [Accepted: 06/19/2020] [Indexed: 01/12/2023]
Abstract
BACKGROUND Adenotonsillectomy is one of the most common pediatric surgical procedures. Postoperative voice changes are a very common concern among patients' parents. Therefore, the aim of this study is to analyze acoustic voice parameters after adenotonsillectomy, tonsillectomy, and adenoidectomy in pediatric patients in a tertiary referral academic center. PATIENTS AND METHODS All pediatric patients undergoing an adenotonsillectomy, tonsillectomy, or adenoidectomy in a single center from 2002 to 2018 were included in the study. Changes in fundamental frequency, jitter, shimmer, and harmonic-noise ratio on the first, seventh, and 30th postoperative days compared to preoperative values were the primary outcome parameters. Statistical analysis was performed using a repeated measures analysis of variance model. RESULTS A total of 1258 patients were included in the study. The mean age of patients at the time of surgery was 8.3 years (range 3.0-18.0 years). Of these, 698 were male (55.5%) and 560 were female (44.5%). The values of fundamental frequency increased significantly on the first and seventh postoperative days (P = 0.001 for both) but normalized 1 month after surgery (P = 0.962). At the first postoperative month, values of jitter and shimmer decreased significantly (P = 0.005 and P = 0.002, respectively). Measurements of harmonic-noise ratio revealed a significant increase 30 days after surgery (P = 0.004). CONCLUSION Statistically significant differences in objective voice parameters within the first postoperative month after tonsillectomy, adenoidectomy, and adenotonsillectomy were observed. The fundamental frequency returned to normal 1 month after surgery. These findings can help soothe the concerns of parents regarding postoperative voice changes.
Affiliation(s)
- Faris F Brkic
- Department of Otorhinolaryngology, Head and Neck Surgery, Medical University of Vienna, Vienna, Austria
| | - David Tianxiang Liu
- Department of Otorhinolaryngology, Head and Neck Surgery, Medical University of Vienna, Vienna, Austria
| | - Nicholas James Campion
- Department of Otorhinolaryngology, Head and Neck Surgery, Medical University of Vienna, Vienna, Austria
| | - Matthias Leonhard
- Department of Otorhinolaryngology, Head and Neck Surgery, Medical University of Vienna, Vienna, Austria
| | - Selma Altumbabic
- Department of Otorhinolaryngology, University Clinical Center Tuzla, Bosnia and Herzegovina
| | - Mirsada Korlatovic
- Department of Otorhinolaryngology, University Clinical Center Tuzla, Bosnia and Herzegovina
| | - Alexandra Kaider
- Center for Medical Statistics, Informatics, and Intelligent Systems, Medical University of Vienna, Vienna, Austria
| | | | - Fuad Brkic
- Department of Otorhinolaryngology, University Clinical Center Tuzla, Bosnia and Herzegovina
| | - Erich Vyskocil
- Department of Otorhinolaryngology, Head and Neck Surgery, Medical University of Vienna, Vienna, Austria.
17
Kapsner-Smith MR, Díaz-Cádiz ME, Vojtech JM, Buckley DP, Mehta DD, Hillman RE, Tracy LF, Noordzij JP, Eadie TL, Stepp CE. Clinical Cutoff Scores for Acoustic Indices of Vocal Hyperfunction That Combine Relative Fundamental Frequency and Cepstral Peak Prominence. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2022; 65:1349-1369. [PMID: 35263546 PMCID: PMC9499364 DOI: 10.1044/2021_jslhr-21-00466] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 05/25/2023]
Abstract
PURPOSE This study examined the discriminative ability of acoustic indices of vocal hyperfunction combining smoothed cepstral peak prominence (CPPS) and relative fundamental frequency (RFF). METHOD Demographic, CPPS, and RFF parameters were entered into logistic regression models trained on two 1:1 case-control groups: individuals with and without nonphonotraumatic vocal hyperfunction (NPVH; n = 360) and phonotraumatic vocal hyperfunction (PVH; n = 240). Equations from the final models were used to predict group membership in two independent test sets (n = 100 each). RESULTS Both CPPS and RFF parameters significantly improved model fits for NPVH and PVH after accounting for demographics. CPPS explained unique variance beyond RFF in both models. RFF explained unique variance beyond CPPS in the PVH model. Final models included CPPS and RFF offset parameters for both NPVH and PVH; RFF onset parameters were significant only in the PVH model. Area under the receiver operating characteristic curve analysis for the independent test sets revealed acceptable classification for NPVH (72%) and good classification for PVH (86%). CONCLUSIONS A combination of CPPS and RFF parameters showed better discriminative ability than either measure alone for PVH. Clinical cutoff scores for acoustic indices of vocal hyperfunction are proposed for assessment and screening purposes.
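To make the classification metrics in this abstract concrete, the following sketch computes the area under the ROC curve as the probability that a randomly chosen case scores higher than a randomly chosen control (ties count as one half), along with sensitivity and specificity at a given cutoff. The predicted probabilities and the 0.5 cutoff are invented for illustration and are not the study's data or its proposed cutoff scores.

```python
def roc_auc(case_scores, control_scores):
    """Rank-based AUC: share of case/control pairs ordered correctly."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def sensitivity_specificity(case_scores, control_scores, cutoff):
    """Classify scores >= cutoff as cases and report both rates."""
    true_pos = sum(score >= cutoff for score in case_scores)
    true_neg = sum(score < cutoff for score in control_scores)
    return true_pos / len(case_scores), true_neg / len(control_scores)


# Hypothetical model outputs (predicted probability of vocal hyperfunction).
cases = [0.9, 0.8, 0.6, 0.4]
controls = [0.7, 0.3, 0.2, 0.1]
auc = roc_auc(cases, controls)
sens, spec = sensitivity_specificity(cases, controls, cutoff=0.5)
```

Moving the cutoff trades sensitivity against specificity, while the AUC summarizes discrimination across all possible cutoffs.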
Affiliation(s)
| | | | - Jennifer M Vojtech
- Department of Speech, Language & Hearing Sciences, Boston University, MA
- Department of Biomedical Engineering, Boston University, MA
| | - Daniel P Buckley
- Department of Speech, Language & Hearing Sciences, Boston University, MA
- Department of Otolaryngology-Head & Neck Surgery, Boston University School of Medicine, MA
| | - Daryush D Mehta
- MGH Institute of Health Professions, Boston, MA
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston
- Department of Surgery, Harvard Medical School, Cambridge, MA
| | - Robert E Hillman
- MGH Institute of Health Professions, Boston, MA
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston
- Department of Surgery, Harvard Medical School, Cambridge, MA
| | - Lauren F Tracy
- Department of Speech, Language & Hearing Sciences, Boston University, MA
- Department of Otolaryngology-Head & Neck Surgery, Boston University School of Medicine, MA
| | - J Pieter Noordzij
- Department of Speech, Language & Hearing Sciences, Boston University, MA
- Department of Otolaryngology-Head & Neck Surgery, Boston University School of Medicine, MA
| | - Tanya L Eadie
- Department of Speech & Hearing Sciences, University of Washington, Seattle
| | - Cara E Stepp
- Department of Speech, Language & Hearing Sciences, Boston University, MA
- Department of Biomedical Engineering, Boston University, MA
- Department of Otolaryngology-Head & Neck Surgery, Boston University School of Medicine, MA
18
Groll MD, Vojtech JM, Hablani S, Mehta DD, Buckley DP, Noordzij JP, Stepp CE. Automated Relative Fundamental Frequency Algorithms for Use With Neck-Surface Accelerometer Signals. J Voice 2022; 36:156-169. [PMID: 32653267 PMCID: PMC7790853 DOI: 10.1016/j.jvoice.2020.06.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/24/2020] [Accepted: 06/04/2020] [Indexed: 10/23/2022]
Abstract
OBJECTIVE Relative fundamental frequency (RFF) has been suggested as a potential acoustic measure of vocal effort. However, current clinical standards for RFF measures require time-consuming manual markings. Previous semi-automated algorithms have been developed to calculate RFF from microphone signals. The current study aimed to develop fully automated algorithms to calculate RFF from neck-surface accelerometer signals for ecological momentary assessment and ambulatory monitoring of voice. METHODS A training set of 2646 /vowel-fricative-vowel/ utterances from 317 unique speakers, with and without voice disorders, was used to develop automated algorithms to calculate RFF values from neck-surface accelerometer signals. The algorithms first rejected utterances with poor vowel-to-noise ratios, then identified fricative locations, then used signal features to determine voicing boundary cycles, and finally calculated corresponding RFF values. These automated RFF values were compared to the clinical gold standard of manual RFF calculated from simultaneously collected microphone signals in a novel test set of 639 utterances from 77 unique speakers. RESULTS Automated accelerometer-based RFF values resulted in an average mean bias error (MBE) across all cycles of 0.027 ST, with an MBE of 0.152 ST and -0.252 ST in the offset and onset cycles closest to the fricative, respectively. CONCLUSION All MBE values were smaller than the expected changes in RFF values following successful voice therapy, suggesting that the current algorithms could be used for ecological momentary assessment and ambulatory monitoring via neck-surface accelerometer signals.
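The mean bias error reported above can be illustrated with a minimal sketch: it is the average signed difference between paired automated and manual values, so positive and negative errors can cancel. The sign convention (automated minus manual), the function name, and the example values are assumptions made here for illustration.

```python
def mean_bias_error(automated, manual):
    """Average signed difference (automated - manual) over paired values."""
    if len(automated) != len(manual):
        raise ValueError("automated and manual must have the same length")
    differences = [a - m for a, m in zip(automated, manual)]
    return sum(differences) / len(differences)


# Hypothetical RFF values (semitones) for the same cycle in four utterances.
automated_rff = [-0.50, -1.25, -0.75, -1.00]
manual_rff = [-0.75, -1.00, -1.00, -1.25]
mbe = mean_bias_error(automated_rff, manual_rff)
```

Because the differences are signed, a small MBE shows little systematic offset but says nothing on its own about the spread of individual errors.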
Affiliation(s)
- Matti D. Groll
- Department of Biomedical Engineering, Boston University, Boston, 02215, Massachusetts; Department of Speech, Language and Hearing Sciences, Boston University, Boston, 02215, Massachusetts
| | - Jennifer M. Vojtech
- Department of Biomedical Engineering, Boston University, Boston, 02215, Massachusetts; Department of Speech, Language and Hearing Sciences, Boston University, Boston, 02215, Massachusetts
| | - Surbhi Hablani
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, 02215, Massachusetts
| | - Daryush D. Mehta
- Center for Laryngeal Surgery and Voice Rehabilitation and MGH Institute of Health Professions, Massachusetts General Hospital, Boston, 02114, Massachusetts; Department of Surgery, Harvard Medical School, Boston, 02144, Massachusetts; Program in Rehabilitation Sciences, MGH Institute of Health Professions, Boston, 02129, Massachusetts; Speech and Hearing Bioscience and Technology Program, Division of Medical Sciences, Harvard Medical School, Boston, 02144, Massachusetts
| | - Daniel P. Buckley
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, 02215, Massachusetts; Department of Otolaryngology – Head and Neck Surgery, Boston University School of Medicine, Boston, 02118, Massachusetts
| | - J. Pieter Noordzij
- Department of Otolaryngology – Head and Neck Surgery, Boston University School of Medicine, Boston, 02118, Massachusetts
| | - Cara E. Stepp
- Department of Biomedical Engineering, Boston University, Boston, 02215, Massachusetts; Department of Speech, Language and Hearing Sciences, Boston University, Boston, 02215, Massachusetts; Department of Otolaryngology – Head and Neck Surgery, Boston University School of Medicine, Boston, 02118, Massachusetts
19
McKenna VS, Kendall CL, Patel TH, Howell RJ, Gustin RL. Impact of Face Masks on Speech Acoustics and Vocal Effort in Healthcare Professionals. Laryngoscope 2022; 132:391-397. [PMID: 34287933 PMCID: PMC8742743 DOI: 10.1002/lary.29763] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 04/26/2021] [Revised: 07/07/2021] [Accepted: 07/08/2021] [Indexed: 02/03/2023]
Abstract
OBJECTIVES/HYPOTHESIS We investigated speech acoustics and self-reported vocal symptoms in mask-wearing healthcare professionals. We hypothesized that there would be an attenuation of spectral energies and increase in vocal effort during masked speech compared to unmasked speech. STUDY DESIGN Within and between subject quasi-experimental design. METHODS We prospectively enrolled 21 healthcare providers (13 cisgender female, 8 cisgender male; M = 32.9 years; SD = 7.9 years) and assessed acoustics and perceptual measures with and without a face mask in place. Measurements included: 1) acoustic Vowel Articulation Index (VAI); 2) cepstral and spectral acoustic measures; 3) traditional vocal measures (e.g., fundamental frequency, intensity); 4) relative fundamental frequency (RFF); and 5) self-reported ratings of vocal effort and dyspnea. RESULTS During masked speech, there was a significant reduction in VAI, high-frequency information (>4 kHz), and RFF offset 10, as well as a significant increase in cepstral peak prominence and perceived vocal effort. Further analysis showed that high-frequency attenuation was more pronounced when wearing an N95 mask compared to a simple mask. CONCLUSIONS Face masks pose an additional barrier to effective communication that primarily impacts spectral characteristics, vowel space measures, and vocal effort. Future work should evaluate how long-term mask use impacts vocal health and may contribute to vocal problems. LEVEL OF EVIDENCE 3 Laryngoscope, 132:391-397, 2022.
Affiliation(s)
- Victoria S. McKenna
- Department of Communication Sciences and Disorders, University of Cincinnati
- Department of Biomedical Engineering, University of Cincinnati
- Corresponding Author: 3225 Eden Ave, Cincinnati, Ohio 45267; 513-558-8507
| | - Courtney L. Kendall
- Department of Communication Sciences and Disorders, University of Cincinnati
| | - Tulsi H. Patel
- Department of Communication Sciences and Disorders, University of Cincinnati
| | - Rebecca J. Howell
- Department of Otolaryngology-Head & Neck Surgery, University of Cincinnati
| | - Renee L. Gustin
- Department of Otolaryngology-Head & Neck Surgery, University of Cincinnati
20
Abur D, Perkell JS, Stepp CE. Impact of Vocal Effort on Respiratory and Articulatory Kinematics. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2022; 65:5-21. [PMID: 34843405 PMCID: PMC9150749 DOI: 10.1044/2021_jslhr-21-00323] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/10/2021] [Revised: 07/27/2021] [Accepted: 08/24/2021] [Indexed: 06/13/2023]
Abstract
PURPOSE The goal of this study was to examine the effects of increases in vocal effort, without changing speech intensity, on respiratory and articulatory kinematics in young adults with typical voices. METHOD A total of 10 participants completed a reading task under three speaking conditions: baseline, mild vocal effort, and maximum vocal effort. Respiratory inductance plethysmography bands around the chest and abdomen were used to estimate lung volumes during speech, and sensor coils for electromagnetic articulography were used to transduce articulatory movements, resulting in the following outcome measures: lung volume at speech initiation (LVSI) and at speech termination (LVST), articulatory kinematic vowel space (AKVS) of two points on the tongue dorsum (body and blade), and lip aperture. RESULTS With increases in vocal effort, and no statistical changes in speech intensity, speakers showed: (a) no statistically significant differences in LVST, (b) statistically significant increases in LVSI, (c) no statistically significant differences in AKVS measures, and (d) statistically significant reductions in lip aperture. CONCLUSIONS Speakers with typical voices exhibited larger lung volumes at speech initiation during increases in vocal effort, paired with reduced lip displacements. To our knowledge, this is the first study to demonstrate evidence that articulatory kinematics are impacted by modulations in vocal effort. However, the mechanisms underlying vocal effort may differ between speakers with and without voice disorders. Thus, future work should examine the relationship between articulatory kinematics, respiratory kinematics, and laryngeal-level changes during vocal effort in speakers with and without voice disorders. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.17065457.
Affiliation(s)
- Defne Abur
- Department of Speech, Language and Hearing Sciences, Boston University, MA
| | - Joseph S. Perkell
- Department of Speech, Language and Hearing Sciences, Boston University, MA
- Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge
| | - Cara E. Stepp
- Department of Speech, Language and Hearing Sciences, Boston University, MA
- Department of Biomedical Engineering, Boston University, MA
- Department of Otolaryngology-Head & Neck Surgery, Boston University School of Medicine, MA
21
McKenna VS, Gustin RL, Howell RJ, Patel TH, Emery MB, Kendall CL, Kelliher NJ. Developing Educational Health Modules to Improve Vocal Wellness in Mask-Wearing Occupational Voice Users. J Voice 2021:S0892-1997(21)00392-1. [PMID: 34969558 PMCID: PMC9234102 DOI: 10.1016/j.jvoice.2021.11.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/22/2021] [Revised: 11/14/2021] [Accepted: 11/18/2021] [Indexed: 11/20/2022]
Abstract
OBJECTIVE To develop educational modules to improve vocal wellness and optimize communication in mask-wearing occupational voice users. METHODS Module development focused on identifying accurate, understandable, and actionable steps to improve vocal wellness in the workplace. We i) interviewed eight voice-specialized speech-language pathologists and researchers on current speech and voice recommendations for mask-wearers, ii) developed educational content using the standardized Patient Education Materials Assessment Tool (PEMAT), iii) assessed the ability of nine mask-wearing community members to learn educational content, and iv) compared behavioral, acoustical, and perceptual changes in four mask-wearing healthcare professionals following educational training. RESULTS We created three educational modules that described key vocal health and communication strategies, including microphone amplification, postural alignment, clear speech, hydration, vocal naps, and vocal warm-ups. PEMAT scores were 96% and 93% on understandability and actionability, respectively. Mask-wearing healthcare professionals increased use of 4 out of the 6 strategies following educational training and were able to retain information at rates >90% at 1-week follow-up. CONCLUSIONS We developed a set of free-to-use educational modules to promote vocal wellness among mask-wearing occupational voice users (see VSMechLab.com). Future work should examine the impact of these strategies on voice measures in a larger group of mask-wearing community members.
Affiliation(s)
- Victoria S McKenna
  - Department of Communication Sciences and Disorders, University of Cincinnati, Cincinnati, Ohio; Department of Biomedical Engineering, University of Cincinnati, Cincinnati, Ohio; Department of Otolaryngology-Head & Neck Surgery, University of Cincinnati, Cincinnati, Ohio
- Renee L Gustin
  - Department of Otolaryngology-Head & Neck Surgery, University of Cincinnati, Cincinnati, Ohio
- Rebecca J Howell
  - Department of Otolaryngology-Head & Neck Surgery, University of Cincinnati, Cincinnati, Ohio
- Tulsi H Patel
  - Department of Communication Sciences and Disorders, University of Cincinnati, Cincinnati, Ohio
- Mariah B Emery
  - Department of Communication Sciences and Disorders, University of Cincinnati, Cincinnati, Ohio
- Courtney L Kendall
  - Department of Communication Sciences and Disorders, University of Cincinnati, Cincinnati, Ohio
- Nicholas J Kelliher
  - Department of Voice, College-Conservatory of Music, University of Cincinnati, Cincinnati, Ohio
|
22
|
Marks KL, Verdi A, Toles LE, Stipancic KL, Ortiz AJ, Hillman RE, Mehta DD. Psychometric Analysis of an Ecological Vocal Effort Scale in Individuals With and Without Vocal Hyperfunction During Activities of Daily Living. AMERICAN JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2021; 30:2589-2604. [PMID: 34665647 PMCID: PMC9132024 DOI: 10.1044/2021_ajslp-21-00111] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 04/15/2021] [Revised: 06/11/2021] [Accepted: 07/07/2021] [Indexed: 05/29/2023]
Abstract
Objective The purpose of this study was to examine the psychometric properties of an ecological vocal effort scale linked to a voicing task. Method Thirty-eight patients with nodules, 18 patients with muscle tension dysphonia, and 45 vocally healthy control individuals participated in a week of ambulatory voice monitoring. A global vocal status question was asked hourly throughout the day. Participants produced a vowel-consonant-vowel syllable string and rated the vocal effort needed to produce the task on a visual analog scale. Test-retest reliability was calculated for a subset using the intraclass correlation coefficient, ICC(A, 1). Construct validity was assessed by (a) comparing the weeklong vocal effort ratings between the patient and control groups and (b) comparing weeklong vocal effort ratings before and after voice rehabilitation in a subset of 25 patients. Cohen's d, the standard error of measurement (SEM), and the minimal detectable change (MDC) assessed sensitivity. The minimal clinically important difference (MCID) assessed responsiveness. Results Test-retest reliability was excellent, ICC(A, 1) = .96. Weeklong mean effort was statistically higher in the patients than in controls (d = 1.62) and lower after voice rehabilitation (d = 1.75), supporting construct validity and sensitivity. SEM was 4.14, MDC was 11.47, and MCID was 9.74. Since the MCID was within the error of the measure, we must rely upon the MDC to detect real changes in ecological vocal effort. Conclusion The ecological vocal effort scale offers a reliable, valid, and sensitive method of monitoring vocal effort changes during the daily life of individuals with and without vocal hyperfunction.
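The SEM and MDC reported above are consistent with the conventional 95% formula MDC = 1.96 × √2 × SEM. The abstract does not state which formula the authors used, so the following is only a sketch under that assumption; the function name `minimal_detectable_change` and the default z-value are illustrative, not taken from the study.

```python
import math

def minimal_detectable_change(sem: float, z: float = 1.96) -> float:
    """Conventional MDC: z * sqrt(2) * SEM (assumed; not stated in the abstract)."""
    return z * math.sqrt(2) * sem

sem = 4.14   # standard error of measurement, as reported in the abstract
mcid = 9.74  # minimal clinically important difference, as reported in the abstract

mdc = minimal_detectable_change(sem)
print(f"MDC under the assumed formula: {mdc:.2f}")
print("MCID falls below MDC:", mcid < mdc)
```

Under this assumption the computed value lies within rounding of the reported 11.47, and the MCID of 9.74 falls below it, which matches the abstract's reason for relying on the MDC to detect real change.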
Affiliation(s)
- Katherine L. Marks
  - MGH Institute of Health Professions, Boston, MA
  - Massachusetts General Hospital, Boston
- Alessandra Verdi
  - MGH Institute of Health Professions, Boston, MA
  - Massachusetts General Hospital, Boston
- Laura E. Toles
  - MGH Institute of Health Professions, Boston, MA
  - Massachusetts General Hospital, Boston
- Kaila L. Stipancic
  - MGH Institute of Health Professions, Boston, MA
  - University at Buffalo, NY
- Andrew J. Ortiz
  - Massachusetts General Hospital, Boston
  - Harvard Medical School, Boston, MA
- Robert E. Hillman
  - MGH Institute of Health Professions, Boston, MA
  - Massachusetts General Hospital, Boston
  - Harvard Medical School, Boston, MA
- Daryush D. Mehta
  - MGH Institute of Health Professions, Boston, MA
  - Massachusetts General Hospital, Boston
  - Harvard Medical School, Boston, MA
|
23
|
Weerathunge HR, Segina RK, Tracy L, Stepp CE. Accuracy of Acoustic Measures of Voice via Telepractice Videoconferencing Platforms. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2021; 64:2586-2599. [PMID: 34157251 PMCID: PMC8632479 DOI: 10.1044/2021_jslhr-20-00625] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 10/26/2020] [Revised: 12/19/2020] [Accepted: 03/23/2021] [Indexed: 05/31/2023]
Abstract
Purpose Telepractice improves patient access to clinical care for voice disorders. Acoustic assessment has the potential to provide critical, objective information during telepractice, yet its validity via telepractice is currently unknown. The current study investigated the accuracy of acoustic measures of voice in a variety of telepractice platforms. Method Twenty-nine voice samples from individuals with dysphonia were transmitted over six video conferencing platforms (Zoom with and without enhancements, Cisco WebEx, Microsoft Teams, Doxy.me, and VSee Messenger). Standard time-, spectral-, and cepstral-based acoustic measures were calculated. The effect of transmission condition on each acoustic measure was assessed using repeated-measures analyses of variance. For those acoustic measures for which transmission condition was a significant factor, linear regression analysis was performed on the difference between the original recording and each telepractice platform, with the overall severity of dysphonia, Internet speed, and ambient noise from the transmitter as predictors. Results Transmission condition was a statistically significant factor for all acoustic measures except for mean fundamental frequency (fo). Ambient noise from the transmitter was a significant predictor of differences between platforms and the original recordings for all acoustic measures except fo measures. All telepractice platforms affected acoustic measures in a statistically significant manner, although the effects of platforms varied by measure. Conclusions Overall, measures of fo were the least impacted by telepractice transmission. Microsoft Teams had the least and Zoom (with enhancements) had the most pronounced effects on acoustic measures. These results provide valuable insight into the relative validity of acoustic measures of voice when collected via telepractice. Supplemental Material https://doi.org/10.23641/asha.14794812.
Affiliation(s)
- Hasini R. Weerathunge
  - Department of Biomedical Engineering, Boston University, MA
  - Department of Speech, Language and Hearing Sciences, Boston University, MA
- Roxanne K. Segina
  - Department of Speech, Language and Hearing Sciences, Boston University, MA
- Lauren Tracy
  - Department of Otolaryngology—Head & Neck Surgery, Boston University School of Medicine, MA
- Cara E. Stepp
  - Department of Biomedical Engineering, Boston University, MA
  - Department of Speech, Language and Hearing Sciences, Boston University, MA
  - Department of Otolaryngology—Head & Neck Surgery, Boston University School of Medicine, MA
|
24
|
Jin JL, Baylor C, Yorkston K. Predicting Communicative Participation in Adults Across Communication Disorders. AMERICAN JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2021; 30:1301-1313. [PMID: 33656912 PMCID: PMC8702843 DOI: 10.1044/2020_ajslp-20-00100] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 04/23/2020] [Revised: 09/01/2020] [Accepted: 10/20/2020] [Indexed: 05/29/2023]
Abstract
Purpose The purpose of this study was to explore the extent to which communicative participation differs across diagnoses and if there are common predictor variables for communicative participation across diagnoses. Method Survey data on self-report variables including communicative participation were collected from 141 community-dwelling adults with communication disorders due to Parkinson's disease, cerebrovascular accident, spasmodic dysphonia, or vocal fold immobility (VFI). Analysis of covariance was used to determine communicative participation differences between diagnoses, with age, sex, and hearing status as covariates. Sequential entry linear regression was used to examine associations between communicative participation and variables representing a range of psychosocial constructs across diagnoses. Results The VFI group had the least favorable communicative participation differing significantly from Parkinson's disease and spasmodic dysphonia groups. Self-rated speech/voice severity, self-rated effort, mental health, perceived social support, and resilience contributed to variance in communicative participation when pooled across diagnoses. The relationship between communicative participation and the variables of effort and resilience differed significantly when diagnosis was considered. Conclusions The findings suggest that communicative participation restrictions may vary across some diagnoses but not others. People with VFI appear to differ from other diagnosis groups in the extent of participation restrictions. Effort and resilience may play different roles in contributing to communicative participation in different disorders, but constructs such as social support, severity, and mental health appear to have consistent relationships with communicative participation across diagnoses. The findings can help clinicians identify psychosocial factors beyond the impairment that impact clients' communication in daily situations.
Affiliation(s)
- Jingyu Linna Jin
  - Department of Rehabilitation Medicine, University of Washington, Seattle
- Carolyn Baylor
  - Department of Rehabilitation Medicine, University of Washington, Seattle
- Kathryn Yorkston
  - Department of Rehabilitation Medicine, University of Washington, Seattle
|
25
|
Hunter EJ, Berardi ML, van Mersbergen M. Relationship Between Tasked Vocal Effort Levels and Measures of Vocal Intensity. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2021; 64:1829-1840. [PMID: 34057833 PMCID: PMC8740752 DOI: 10.1044/2021_jslhr-20-00465] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/07/2020] [Revised: 01/04/2021] [Accepted: 02/19/2021] [Indexed: 06/12/2023]
Abstract
Purpose Patients with voice problems commonly report increased vocal effort, regardless of the underlying pathophysiology. Previous studies investigating vocal effort and voice production have used a range of methods to quantify vocal effort. The goals of the current study were to use the Borg CR100 effort scale to (a) demonstrate the relation between vocal intensity or vocal level (dB) and tasked vocal effort goals and (b) investigate the repeated measure reliability of vocal level at tasked effort level goals. Method Three types of speech (automatic, read, and structured spontaneous) were elicited at four vocal effort level goals on the Borg CR100 scale (2, 13, 25, and 50) from 20 participants (10 females and 10 males). Results Participants' vocal level reliably changed approximately 5 dB between the elicited effort level goals; this difference was statistically significant and repeatable. Biological females produced a voice with consistently less intensity for a vocal effort level goal compared to biological males. Conclusions The results indicate the utility of the Borg CR100 in tracking effort in voice production that is repeatable with respect to vocal level (dB). Future research will investigate other metrics of voice production with the goal of understanding the mechanisms underlying vocal effort and the external environmental influences on the perception of vocal effort.
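For scale, a level difference in dB maps to a ratio through the standard decibel definitions. The sketch below converts the approximately 5 dB step reported above; the helper names are hypothetical and the calculation is not part of the study itself.

```python
def db_to_pressure_ratio(delta_db: float) -> float:
    """Sound pressure ratio for a level difference: 10 ** (dB / 20)."""
    return 10 ** (delta_db / 20)

def db_to_intensity_ratio(delta_db: float) -> float:
    """Intensity (power) ratio for a level difference: 10 ** (dB / 10)."""
    return 10 ** (delta_db / 10)

step_db = 5.0  # approximate change between effort level goals, from the abstract

pressure_ratio = db_to_pressure_ratio(step_db)
intensity_ratio = db_to_intensity_ratio(step_db)
print(f"pressure ratio:  {pressure_ratio:.3f}")
print(f"intensity ratio: {intensity_ratio:.3f}")
```

A 5 dB step therefore corresponds to a little under a doubling of sound pressure and roughly a threefold change in intensity.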
Affiliation(s)
- Eric J. Hunter
  - Department of Communicative Sciences and Disorders, Michigan State University, East Lansing
- Mark L. Berardi
  - Department of Communicative Sciences and Disorders, Michigan State University, East Lansing
|
26
|
Effects of Sidetone Amplification on Vocal Function During Telecommunication. J Voice 2021:S0892-1997(21)00124-7. [PMID: 33992477 DOI: 10.1016/j.jvoice.2021.03.027] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/31/2021] [Revised: 03/22/2021] [Accepted: 03/23/2021] [Indexed: 11/23/2022]
Abstract
PURPOSE Society has become increasingly dependent on telecommunication, which has been shown to negatively impact vocal function. This study explores the use of sidetone regulation during audio-visual communication as one potential technique to alleviate the effects of telecommunication on the voice. METHOD The speech acoustics of 18 participants with typical voices were measured during conversational tasks during three conditions of sidetone amplification: baseline (no sidetone amplification), low sidetone amplification, and high sidetone amplification. Vocal intensity, vocal quality (estimated using acoustic measures of the low-high ratio and the smoothed cepstral peak prominence), and self-perceived vocal effort were used to measure the impacts of sidetone amplification on vocal function. RESULTS Compared to baseline, there were statistically significant decreases in vocal intensity and increases in low-high ratio in the high level of sidetone amplification condition. Changes in these measures were not significantly correlated. When asked to rank conditions based on their perceived vocal effort, participants most often ranked the high level of sidetone amplification as least effortful; however, the visual-analog ratings of vocal effort were not significantly different between conditions. The smoothed cepstral peak prominence did not change with varying levels of sidetone amplification. CONCLUSIONS Vocal intensity decreased with high levels of sidetone amplification. High levels of sidetone amplification also resulted in increases in the low-high ratio, which were shown to be more than just a byproduct of decreased vocal intensity. The impact of sidetone amplification on vocal effort was less clear, but results suggested that participants generally decreased their vocal effort with increased levels of sidetone amplification. 
This was a preliminary study; future work is warranted in a population of participants with voice complaints and in noisier, more realistic environments.
|
27
|
Groll MD, Hablani S, Stepp CE. The Relationship Between Voice Onset Time and Increase in Vocal Effort and Fundamental Frequency. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2021; 64:1197-1209. [PMID: 33820431 PMCID: PMC8608153 DOI: 10.1044/2021_jslhr-20-00505] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2020] [Revised: 10/19/2020] [Accepted: 01/13/2021] [Indexed: 06/12/2023]
Abstract
Purpose Prior work suggests that voice onset time (VOT) may be impacted by laryngeal tension: VOT means decrease when individuals with typical voices increase their fundamental frequency (fo) and VOT variability is increased in individuals with vocal hyperfunction, a voice disorder characterized by increased laryngeal tension. This study further explored the relationship between VOT and laryngeal tension during increased fo, vocal effort, and vocal strain. Method Sixteen typical speakers of American English were instructed to produce VOT utterances under four conditions: baseline, high pitch, effort, and strain. Repeated-measures analysis of variance models were used to analyze the effects of condition on VOT means and standard deviations (SDs); pairwise comparisons were used to determine significant differences between conditions. Results Voicing, condition, and their interaction significantly affected VOT means. Voiceless VOT means significantly decreased for high pitch (p < .001) relative to baseline; however, no changes in voiceless VOT means were found for effort or strain relative to baseline. Although condition had a significant effect on VOT SDs, there were no significant differences between effort, strain, and high pitch conditions relative to baseline. Conclusions Speakers with typical voices likely engage different musculature to increase pitch than to increase vocal effort and strain. The increased VOT variability present with vocal hyperfunction is not seen in individuals with typical voices using increased effort and strain, supporting the assertion that this feature of vocal hyperfunction may be related to disordered vocal motor control rather than resulting from effortful voice production.
Affiliation(s)
- Matti D. Groll
  - Department of Biomedical Engineering, Boston University, MA
  - Department of Speech, Language, and Hearing Sciences, Boston University, MA
- Surbhi Hablani
  - Department of Speech, Language, and Hearing Sciences, Boston University, MA
- Cara E. Stepp
  - Department of Biomedical Engineering, Boston University, MA
  - Department of Speech, Language, and Hearing Sciences, Boston University, MA
  - Department of Otolaryngology – Head and Neck Surgery, Boston University School of Medicine, MA
|
28
|
Park Y, Wang F, Díaz-Cádiz M, Vojtech JM, Groll MD, Stepp CE. Vocal fold kinematics and relative fundamental frequency as a function of obstruent type and speaker age. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2021; 149:2189. [PMID: 33940922 PMCID: PMC8018794 DOI: 10.1121/10.0003961] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/16/2020] [Revised: 03/02/2021] [Accepted: 03/11/2021] [Indexed: 06/12/2023]
Abstract
The acoustic measure, relative fundamental frequency (RFF), has been proposed as an objective metric for assessing vocal hyperfunction; however, its underlying physiological mechanisms have not yet been fully characterized. This study aimed to characterize the relationship between RFF and vocal fold kinematics. Simultaneous acoustic and high-speed videoendoscopic (HSV) recordings were collected as younger and older speakers repeated the utterances /ifi/ and /iti/. RFF values at voicing offsets and onsets surrounding the obstruents were estimated from acoustic recordings, whereas glottal angles, durations of voicing offset and onset, and a kinematic estimate of laryngeal stiffness (KS) were obtained from HSV images. No differences were found between younger and older speakers for any measure. RFF did not differ between the two obstruents at voicing offset; however, fricatives necessitated larger glottal angles and longer durations to devoice. RFF values were lower and glottal angles were greater for stops relative to fricatives at voicing onset. KS values were greater in stops relative to fricatives. The less adducted vocal folds with greater KS and lower RFF at voicing onset for stops relative to fricatives in this study were in accordance with prior speculations that decreased vocal fold contact area and increased laryngeal stiffness may decrease RFF.
Affiliation(s)
- Yeonggwang Park
  - Department of Speech, Language, and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Feng Wang
  - Department of Biomedical Engineering, Boston University, 44 Cummington Mall, Boston, Massachusetts 02215, USA
- Manuel Díaz-Cádiz
  - Department of Speech, Language, and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Jennifer M Vojtech
  - Department of Biomedical Engineering, Boston University, 44 Cummington Mall, Boston, Massachusetts 02215, USA
- Matti D Groll
  - Department of Biomedical Engineering, Boston University, 44 Cummington Mall, Boston, Massachusetts 02215, USA
- Cara E Stepp
  - Department of Speech, Language, and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
|
29
|
Nguyen DD, McCabe P, Thomas D, Purcell A, Doble M, Novakovic D, Chacon A, Madill C. Acoustic voice characteristics with and without wearing a facemask. Sci Rep 2021; 11:5651. [PMID: 33707509 PMCID: PMC7970997 DOI: 10.1038/s41598-021-85130-8] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 09/17/2020] [Accepted: 02/19/2021] [Indexed: 01/31/2023] Open
Abstract
Facemasks are essential for healthcare workers but characteristics of the voice whilst wearing this personal protective equipment are not well understood. In the present study, we compared acoustic voice measures in recordings of sixteen adults producing standardised vocal tasks with and without wearing either a surgical mask or a KN95 mask. Data were analysed for mean spectral levels at 0-1 kHz and 1-8 kHz regions, an energy ratio between 0-1 and 1-8 kHz (LH1000), harmonics-to-noise ratio (HNR), smoothed cepstral peak prominence (CPPS), and vocal intensity. In connected speech there was significant attenuation of mean spectral level at the 1-8 kHz region and no significant change in this measure at 0-1 kHz. Mean spectral levels of the vowel did not change significantly in mask-wearing conditions. LH1000 for connected speech significantly increased whilst wearing either a surgical mask or KN95 mask but no significant change in this measure was found for the vowel. HNR was higher in the mask-wearing conditions than the no-mask condition. CPPS and vocal intensity did not change in mask-wearing conditions. These findings implied an attenuation effect of wearing these types of masks on the voice spectra, with the surgical mask showing less impact than the KN95.
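To illustrate what a low/high energy ratio such as LH1000 measures, the sketch below computes the ratio of spectral energy below 1 kHz to energy from 1 to 8 kHz for a synthetic two-tone signal. This is an assumption-laden illustration, not the authors' procedure: the abstract does not describe how LH1000 was calculated, and the function name `low_high_ratio_db`, the plain discrete Fourier transform, and the test signal are all hypothetical choices.

```python
import cmath
import math

def low_high_ratio_db(samples, sample_rate, split_hz=1000.0, upper_hz=8000.0):
    """Energy below split_hz relative to energy in [split_hz, upper_hz], in dB.

    Uses a direct DFT over the non-negative frequency bins (fine for short frames).
    """
    n = len(samples)
    low_energy = 0.0
    high_energy = 0.0
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        power = abs(coeff) ** 2
        if freq < split_hz:
            low_energy += power
        elif freq <= upper_hz:
            high_energy += power
    return 10 * math.log10(low_energy / high_energy)

# Synthetic 20 ms frame: a 500 Hz tone plus a 3 kHz tone at one tenth the amplitude.
fs = 16000
n = 320
signal = [math.sin(2 * math.pi * 500 * i / fs)
          + 0.1 * math.sin(2 * math.pi * 3000 * i / fs)
          for i in range(n)]

lh_db = low_high_ratio_db(signal, fs)
print(f"low/high ratio: {lh_db:.2f} dB")
```

Because the high-band tone has one tenth the amplitude, its energy is one hundredth that of the low-band tone, so the ratio comes out at about 20 dB. Attenuating the 1-8 kHz band further, as the abstract reports for masked connected speech, would raise this ratio.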
Affiliation(s)
- Duy Duong Nguyen
  - Voice Research Laboratory, Faculty of Medicine and Health, D18, Susan Wakil Health Building, Camperdown Campus, The University of Sydney, Western Avenue, Sydney, NSW 2006 Australia (GRID grid.1013.3; ISNI 0000 0004 1936 834X)
- Patricia McCabe
  - Voice Research Laboratory, Faculty of Medicine and Health, D18, Susan Wakil Health Building, Camperdown Campus, The University of Sydney, Western Avenue, Sydney, NSW 2006 Australia (GRID grid.1013.3; ISNI 0000 0004 1936 834X)
- Donna Thomas
  - Voice Research Laboratory, Faculty of Medicine and Health, D18, Susan Wakil Health Building, Camperdown Campus, The University of Sydney, Western Avenue, Sydney, NSW 2006 Australia (GRID grid.1013.3; ISNI 0000 0004 1936 834X)
- Alison Purcell
  - Voice Research Laboratory, Faculty of Medicine and Health, D18, Susan Wakil Health Building, Camperdown Campus, The University of Sydney, Western Avenue, Sydney, NSW 2006 Australia (GRID grid.1013.3; ISNI 0000 0004 1936 834X)
- Maree Doble
  - Voice Research Laboratory, Faculty of Medicine and Health, D18, Susan Wakil Health Building, Camperdown Campus, The University of Sydney, Western Avenue, Sydney, NSW 2006 Australia (GRID grid.1013.3; ISNI 0000 0004 1936 834X)
- Daniel Novakovic
  - Voice Research Laboratory, Faculty of Medicine and Health, D18, Susan Wakil Health Building, Camperdown Campus, The University of Sydney, Western Avenue, Sydney, NSW 2006 Australia (GRID grid.1013.3; ISNI 0000 0004 1936 834X)
- Antonia Chacon
  - Voice Research Laboratory, Faculty of Medicine and Health, D18, Susan Wakil Health Building, Camperdown Campus, The University of Sydney, Western Avenue, Sydney, NSW 2006 Australia (GRID grid.1013.3; ISNI 0000 0004 1936 834X)
- Catherine Madill
  - Voice Research Laboratory, Faculty of Medicine and Health, D18, Susan Wakil Health Building, Camperdown Campus, The University of Sydney, Western Avenue, Sydney, NSW 2006 Australia (GRID grid.1013.3; ISNI 0000 0004 1936 834X)
|
30
|
Fiorella ML, Cavallaro G, Di Nicola V, Quaranta N. Voice Differences When Wearing and Not Wearing a Surgical Mask. J Voice 2021; 37:467.e1-467.e7. [PMID: 33712355 DOI: 10.1016/j.jvoice.2021.01.026] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 12/06/2020] [Revised: 01/20/2021] [Accepted: 01/26/2021] [Indexed: 10/21/2022]
Abstract
OBJECTIVE The purpose of our study was to investigate the impact of a surgical mask on vocal parameters such as F0, vocal intensity, jitter, shimmer and harmonics-to-noise ratio in order to understand how a surgical mask can affect voice and verbal communication in adults. METHODS The study was carried out on a selected group of 60 healthy subjects. All subjects were trained to produce a sustained /a/ at a conversational voice intensity for the Maximum Phonation Time (MPT), first wearing the surgical mask and then without it. Voice samples were recorded directly in Praat. RESULTS There were no statistically significant differences in any acoustic parameter between the masked and unmasked condition. There was a non-significant decrease in vocal intensity in 65% of the subjects while wearing a surgical mask. CONCLUSIONS The statistical comparison of all the acoustic voice parameters extracted with and without a surgical mask did not reveal any statistically significant difference. Most of the subjects showed a decrease in measured vocal intensity while wearing the surgical mask. Our conclusion was that wearing a mask is likely to induce the unconscious need to increase the vocal effort, resulting over time in a greater risk of developing functional dysphonia. The reduction of intensity can also affect social interaction and speech audibility, especially for individuals with hearing loss.
Affiliation(s)
- Maria Luisa Fiorella
  - Otolaryngology Unit, Department of Biomedical Sciences, Neuroscience and Sensory Organs, University of Bari "Aldo Moro", Bari, Italy
- Giada Cavallaro
  - Otolaryngology Unit, Department of Biomedical Sciences, Neuroscience and Sensory Organs, University of Bari "Aldo Moro", Bari, Italy
- Vincenzo Di Nicola
  - Otolaryngology Unit, Department of Biomedical Sciences, Neuroscience and Sensory Organs, University of Bari "Aldo Moro", Bari, Italy
- Nicola Quaranta
  - Otolaryngology Unit, Department of Biomedical Sciences, Neuroscience and Sensory Organs, University of Bari "Aldo Moro", Bari, Italy
|
31
|
Fujiki RB, Thibeault SL. The Relationship Between Auditory-Perceptual Rating Scales and Objective Voice Measures in Children With Voice Disorders. AMERICAN JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2021; 30:228-238. [PMID: 33439742 DOI: 10.1044/2020_ajslp-20-00188] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 06/12/2023]
Abstract
Purpose The purpose of this study was to determine concurrent validity of the Grade, Roughness, Breathiness, Asthenia, and Strain (GRBAS) and Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V) auditory-perceptual scales in children with voice disorders. A secondary purpose was to determine correlation between the GRBAS, CAPE-V, and objective voice measures. Method GRBAS and CAPE-V ratings and acoustic and aerodynamic measures were collected from the University of Wisconsin-Madison Voice and Swallow Outcomes Database. Correlations between CAPE-V and GRBAS ratings were calculated for overall severity of dysphonia, roughness, breathiness, and strain. Correlations between auditory-perceptual voice ratings and objective voice measures were also examined. Results One hundred thirty GRBAS and CAPE-V auditory-perceptual ratings were significantly correlated for overall severity, roughness, breathiness, and strain. r² values were highest for overall severity of dysphonia (r² = .75) and lowest for strain (r² = .54). CAPE-V and GRBAS ratings were largely associated with similar acoustic and aerodynamic measures. The highest correlations were observed for auditory-perceptual ratings of breathiness and jitter% (CAPE-V r² = .44, GRBAS r² = .44), shimmer% (CAPE-V r² = .45, GRBAS r² = .45), noise-to-harmonic ratio (CAPE-V r² = .42, GRBAS r² = .40), fundamental frequency (CAPE-V r² = .47, GRBAS r² = .44), and maximum phonation time (CAPE-V r² = .56, GRBAS r² = .51). Akaike information criterion values indicated that CAPE-V ratings were more strongly correlated with objective voice measures than GRBAS ratings. Conclusions CAPE-V and GRBAS scales have concurrent validity in children with voice disorders. CAPE-V ratings are more strongly correlated with acoustic and aerodynamic voice measures.
|
32
|
Park Y, Cádiz MD, Nagle KF, Stepp CE. Perceptual and Acoustic Assessment of Strain Using Synthetically Modified Voice Samples. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2020; 63:3897-3908. [PMID: 33151770 PMCID: PMC8608200 DOI: 10.1044/2020_jslhr-20-00294] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/28/2020] [Revised: 07/23/2020] [Accepted: 08/17/2020] [Indexed: 06/11/2023]
Abstract
Purpose Assessment of strained voice quality is difficult due to the weak reliability of auditory-perceptual evaluation and lack of strong acoustic correlates. This study evaluated the contributions of relative fundamental frequency (RFF) and mid-to-high frequency noise to the perception of strain. Method Stimuli were created using recordings of speakers producing /ifi/ with a comfortable voice and with maximum vocal effort. RFF values of the comfortable voice samples were synthetically lowered, and RFF values of the maximum vocal effort samples were synthetically raised. Mid-to-high frequency noise was added to the samples. Twenty listeners rated strain in a visual sort-and-rate task. The effects of RFF modification and added noise on strain were assessed using an analysis of variance; intra- and interrater reliability were compared with and without noise. Results Lowering RFF in the comfortable voice samples increased their perceived strain, whereas raising RFF in the maximum vocal effort samples decreased their strain. Adding noise increased strain and decreased intra- and interrater reliability relative to samples without added noise. Conclusions Both RFF and mid-to-high frequency noise contribute to the perception of strain. The presence of dysphonia may decrease the reliability of auditory-perceptual evaluation of strain, which supports the need for complementary objective assessments. Supplemental Material https://doi.org/10.23641/asha.13172252.
Affiliation(s)
- Yeonggwang Park
  - Department of Speech, Language and Hearing Sciences, Boston University, MA
- Manuel Díaz Cádiz
  - Department of Speech, Language and Hearing Sciences, Boston University, MA
- Kathleen F. Nagle
  - Department of Speech-Language Pathology, Seton Hall University, South Orange, NJ
- Cara E. Stepp
  - Department of Speech, Language and Hearing Sciences, Boston University, MA
  - Department of Biomedical Engineering, Boston University, MA
  - Department of Otolaryngology – Head and Neck Surgery, Boston University School of Medicine, MA
|
33
|
Tracy LF, Segina RK, Cadiz MD, Stepp CE. The Impact of Communication Modality on Voice Production. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2020; 63:2913-2920. [PMID: 32762517 PMCID: PMC7890225 DOI: 10.1044/2020_jslhr-20-00161] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/09/2020] [Revised: 06/04/2020] [Accepted: 06/18/2020] [Indexed: 06/11/2023]
Abstract
Purpose Communicating remotely using audio and audiovisual technology is ubiquitous in modern work and social environments. Remote communication is increasing in medicine and in voice therapy delivery, and this evolution may have an impact on speakers' voices. This study sought to determine whether these communication modalities impact the voice production of typical speakers. Method The speech acoustics of 12 participants with healthy voices were recorded as they held standardized conversations with a single investigator using three communication modalities: in-person, remote-audio, and remote-audiovisual. Participants rated their vocal effort on a 100-mm visual analog scale. Results Compared to in-person communication, self-ratings of vocal effort were statistically significantly increased for remote-audiovisual communication; vocal effort during remote-audio and in-person communication were not significantly different. In comparison to in-person communication, vocal intensity and smoothed cepstral peak prominence (CPPS) were statistically significantly higher during remote-audio and remote-audiovisual communication. Effect sizes for CPPS changes were larger than for sound pressure level (SPL), and changes in CPPS and SPL between in-person and remote-audiovisual communication were not significantly correlated. Conclusions Vocal effort and SPL were increased when using remote-audio and remote-audiovisual communication in comparison to in-person communication. Voice quality was also impacted by technology use, with changes in CPPS that were consistent with, but not fully explained by, increases in SPL. This may impact the telepractice delivery of voice therapy, and further investigation is warranted.
Affiliation(s)
- Lauren F. Tracy
  - Department of Otolaryngology—Head and Neck Surgery, Boston University School of Medicine, MA
- Roxanne K. Segina
  - Department of Speech, Language and Hearing Sciences, Boston University, MA
- Manuel Diaz Cadiz
  - Department of Speech, Language and Hearing Sciences, Boston University, MA
- Cara E. Stepp
  - Department of Otolaryngology—Head and Neck Surgery, Boston University School of Medicine, MA
  - Department of Speech, Language and Hearing Sciences, Boston University, MA
  - Department of Biomedical Engineering, Boston University, MA
|
34
|
Buckley DP, Cadiz MD, Eadie TL, Stepp CE. Acoustic Model of Perceived Overall Severity of Dysphonia in Adductor-Type Laryngeal Dystonia. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2020; 63:2713-2722. [PMID: 32692616 PMCID: PMC7872728 DOI: 10.1044/2020_jslhr-19-00354] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/22/2019] [Revised: 01/28/2020] [Accepted: 05/19/2020] [Indexed: 05/19/2023]
Abstract
Purpose This study is a secondary analysis of existing data. The goal of the study was to construct an acoustic model of perceived overall severity of dysphonia in adductory laryngeal dystonia (AdLD). We predicted that acoustic measures (a) related to voice and pitch breaks and (b) related to vocal effort would form the primary elements of a model corresponding to auditory-perceptual ratings of overall severity of dysphonia. Method Twenty inexperienced listeners evaluated the overall severity of dysphonia of speech stimuli from 19 individuals with AdLD. Acoustic features related to primary signs of AdLD (hyperadduction resulting in pitch and voice breaks) and to a potential secondary symptom of AdLD (vocal effort, measures of relative fundamental frequency) were computed from the speech stimuli. Multiple linear regression analysis was applied to construct an acoustic model of the overall severity of dysphonia. Results The acoustic model included an acoustic feature related to pitch and voice breaks and three acoustic measures derived from relative fundamental frequency; it explained 84.9% of the variance in the auditory-perceptual ratings of overall severity of dysphonia in the speech samples. Conclusions Auditory-perceptual ratings of overall severity of dysphonia in AdLD were related to acoustic features of primary signs (pitch and voice breaks, hyperadduction associated with laryngeal spasms) and were also related to acoustic features of vocal effort. This suggests that compensatory vocal effort may be a secondary symptom in AdLD. Future work to generalize this acoustic model to a larger, independent data set is necessary before clinical translation is warranted.
Affiliation(s)
- Daniel P. Buckley
  - Department of Speech, Language, and Hearing Sciences, Boston University, MA
  - Department of Otolaryngology–Head and Neck Surgery, Boston University School of Medicine, MA
- Manuel Diaz Cadiz
  - Department of Speech, Language, and Hearing Sciences, Boston University, MA
- Tanya L. Eadie
  - Department of Speech and Hearing Sciences, University of Washington, Seattle
- Cara E. Stepp
  - Department of Speech, Language, and Hearing Sciences, Boston University, MA
  - Department of Otolaryngology–Head and Neck Surgery, Boston University School of Medicine, MA
  - Department of Biomedical Engineering, Boston University, MA
|
35
|
Cantor-Cutiva LC, Robles-Vega HY, Sánchez EA, Morales DA. Differences on Voice Acoustic Parameters between Colombian College Professors with and without Vocal Fatigue. J Voice 2020; 36:219-225. [PMID: 32564941 DOI: 10.1016/j.jvoice.2020.05.008] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/31/2019] [Revised: 05/11/2020] [Accepted: 05/12/2020] [Indexed: 11/25/2022]
Abstract
AIM To determine which acoustic parameters may be associated with vocal fatigue among college professors in Bogotá, Colombia. METHOD This was a cross-sectional study including 27 voice samples of college professors. RESULTS A gender analysis showed that mean fundamental frequency was significantly higher among men and women who reported vocal fatigue than among those without fatigue (138.2 Hz vs 122.3 Hz for males; 228.7 Hz vs 188.9 Hz for females; Mann-Whitney U test P value <0.01). Participants with vocal fatigue demonstrated a significantly lower standard deviation of vocal sound pressure level than participants without vocal fatigue (8.7 dB vs 10.2 dB; Mann-Whitney U test P value <0.01). For the males in the sample, fundamental frequency had fair discriminatory value for vocal fatigue (area under the curve = 0.7). Sensitivity and specificity were moderate at a cut-off of 125 Hz (0.7 and 0.6, respectively). For females in this sample, the discriminatory value of fundamental frequency was slightly higher (area under the curve = 0.8). At a cut-off of 200 Hz, sensitivity was high (0.9) and specificity was moderate (0.7). CONCLUSION Fundamental frequency and the standard deviation of vocal sound pressure level are good indicators of vocal fatigue and may be used to identify college professors with vocal fatigue. Clinically, voice clinicians may aim to train their clients to produce speech with a higher variation of "intensity" in order to avoid vocal fatigue.
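The sensitivity and specificity figures reported at a fundamental-frequency cut-off can be illustrated with a minimal Python sketch. It assumes that a speaker is classified as fatigued when mean fundamental frequency is at or above the cut-off (the abstract does not state the decision rule), and the function name and all sample values are hypothetical, not data from the study.

```python
def sensitivity_specificity(values, has_condition, cutoff):
    """Classify values at or above `cutoff` as positive and score against labels.

    Assumption: "at or above the cut-off" means positive. The abstract does
    not state the decision rule, so this direction is illustrative only.
    """
    tp = fn = tn = fp = 0
    for value, positive in zip(values, has_condition):
        predicted_positive = value >= cutoff
        if positive and predicted_positive:
            tp += 1  # fatigued speaker correctly flagged
        elif positive:
            fn += 1  # fatigued speaker missed
        elif predicted_positive:
            fp += 1  # non-fatigued speaker wrongly flagged
        else:
            tn += 1  # non-fatigued speaker correctly cleared
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical mean f0 values (Hz) and fatigue labels, for illustration only.
mean_f0_hz = [138.0, 121.0, 131.0, 118.0, 127.0, 122.0]
reports_fatigue = [True, True, True, False, False, False]

sens, spec = sensitivity_specificity(mean_f0_hz, reports_fatigue, cutoff=125.0)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Sweeping the cut-off over the observed values and plotting sensitivity against one minus specificity yields the ROC curve whose area the abstract reports.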
Affiliation(s)
- Lady Catherine Cantor-Cutiva
  - Department of Health Sciences, Speech and Language Pathology Program, Universidad Manuela Beltrán, Bogotá, Colombia
- Hédrick Yoseft Robles-Vega
  - Department of Engineering, Biomedical Engineering Program, Universidad Manuela Beltrán, Bogotá, Colombia
- Diego Alejandro Morales
  - Department of Engineering, Biomedical Engineering Program, Universidad Manuela Beltrán, Bogotá, Colombia
|
36
|
Groll MD, McKenna VS, Hablani S, Stepp CE. Formant-Estimated Vocal Tract Length and Extrinsic Laryngeal Muscle Activation During Modulation of Vocal Effort in Healthy Speakers. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2020; 63:1395-1403. [PMID: 32379521 PMCID: PMC7842116 DOI: 10.1044/2020_jslhr-19-00234] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/23/2019] [Revised: 12/16/2019] [Accepted: 01/28/2020] [Indexed: 05/31/2023]
Abstract
Purpose The goal of this study was to explore the relationships among vocal effort, extrinsic laryngeal muscle activity, and vocal tract length (VTL) within healthy speakers. We hypothesized that increased vocal effort would result in increased suprahyoid muscle activation and decreased VTL, as previously observed in individuals with vocal hyperfunction. Method Twenty-eight healthy speakers of American English produced vowel-consonant-vowel utterances under varying levels of vocal effort. VTL was estimated from the vowel formants. Three surface electromyography sensors measured the activation of the suprahyoid and infrahyoid muscle groups. A general linear model was used to investigate the effects of vocal effort level and surface electromyography on VTL. Two additional general linear models were used to investigate the effects of vocal effort on suprahyoid and infrahyoid muscle activities. Results Neither vocal effort nor extrinsic muscle activity showed significant effects on VTL; however, the degree of extrinsic muscle activity of both suprahyoid and infrahyoid muscle groups increased with increases in vocal effort. Conclusion Increasing vocal effort resulted in increased activation of both suprahyoid and infrahyoid musculature in healthy adults, with no change to VTL.
Affiliation(s)
- Matti D. Groll
  - Department of Biomedical Engineering, Boston University, MA
  - Department of Speech, Language, & Hearing Sciences, Boston University, MA
- Victoria S. McKenna
  - Department of Speech, Language, & Hearing Sciences, Purdue University, West Lafayette, IN
- Surbhi Hablani
  - Department of Speech, Language, & Hearing Sciences, Boston University, MA
- Cara E. Stepp
  - Department of Biomedical Engineering, Boston University, MA
  - Department of Speech, Language, & Hearing Sciences, Boston University, MA
  - Department of Otolaryngology—Head and Neck Surgery, Boston University School of Medicine, MA
|
37
|
Kochilas HL, Cacace AT, Arnold A, Seidman MD, Tarver WB. Vagus nerve stimulation paired with tones for tinnitus suppression: Effects on voice and hearing. Laryngoscope Investig Otolaryngol 2020; 5:286-296. [PMID: 32337360 PMCID: PMC7178458 DOI: 10.1002/lio2.364] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/19/2019] [Revised: 01/23/2020] [Accepted: 02/08/2020] [Indexed: 12/16/2022]
Abstract
OBJECTIVE In individuals with chronic tinnitus, our interest was to determine whether daily low-level electrical stimulation of the vagus nerve paired with tones (paired-VNSt) for tinnitus suppression had any adverse effects on motor-speech production and physiological acoustics of sustained vowels. Similarly, we were also interested in evaluating for changes in pure-tone thresholds, word-recognition performance, and minimum-masking levels. Both voice and hearing functions were measured repeatedly over a period of 1 year. STUDY DESIGN Longitudinal with repeated-measures. METHODS Digitized samples of sustained frontal, midline, and back vowels (/e/, /o/, /ah/) were analyzed with computer software to quantify the degree of jitter, shimmer, and harmonic-to-noise ratio contained in these waveforms. Pure-tone thresholds, monosyllabic word-recognition performance, and MMLs were also evaluated for VNS alterations. Linear-regression analysis was the benchmark statistic used to document change over time in voice and hearing status from a baseline condition. RESULTS Most of the regression functions for the vocal samples and audiometric variables had slope values that were not significantly different from zero. Four of the nine vocal functions showed a significant improvement over time, whereas three of the pure tone regression functions at 2-4 kHz showed some degree of decline; all changes observed were for the left ear, all were at adjacent frequencies, and all were ipsilateral to the side of VNS. However, mean pure-tone threshold changes did not exceed 4.29 dB from baseline and therefore, would not be considered clinically significant. In some individuals, larger threshold shifts were observed. No significant regression/slope effects were observed for word-recognition or MMLs. CONCLUSION Quantitative voice analysis and assessment of audiometric variables showed minimal if any evidence of adverse effects using paired-VNSt over a treatment period of 1 year. 
Therefore, we conclude that paired-VNSt is a safe tool for tinnitus abatement in humans without significant side effects. LEVEL OF EVIDENCE Level IV.
Affiliation(s)
- Helen L. Kochilas
  - North Atlanta Ears, Nose, Throat & Allergy, Alpharetta, Georgia
  - Present address: North Atlanta Ears, Nose, Throat & Allergy, Alpharetta, Georgia
- Anthony T. Cacace
  - Department of Communication Sciences & Disorders, Wayne State University, Detroit, Michigan
- Amy Arnold
  - The Hearing Clinic, Brighton, Michigan
  - Present address: The Hearing Clinic, Brighton, Michigan
- Michael D. Seidman
  - Florida ENT Surgical Specialists, Florida Hospital Medical Group, Head & Neck Surgery Center of Florida, Celebration, Florida
  - Present address: Florida Hospital Medical Group, Head & Neck Surgery Center of Florida, Celebration, Florida
|
38
|
Heller Murray ES, Segina RK, Woodnorth GH, Stepp CE. Relative Fundamental Frequency in Children With and Without Vocal Fold Nodules. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2020; 63:361-371. [PMID: 32073342 PMCID: PMC7210445 DOI: 10.1044/2019_jslhr-19-00058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Purpose Relative fundamental frequency (RFF) is an acoustic measure that is sensitive to functional voice differences in adults. The aim of the current study was to evaluate RFF in children, as there are known structural and functional differences between the pediatric and adult vocal mechanisms. Method RFF was analyzed in 28 children with vocal fold nodules (CwVN, M = 9.0 years) and 28 children with typical voices (CwTV, M = 8.9 years). RFF is the instantaneous fundamental frequency (f0) of the 10 vocalic cycles during devoicing (vocal offset) and 10 vocalic cycles during the revoicing (vocal onset) of the vowels that surround a voiceless consonant. Each cycle's f0 was normalized to a steady-state portion of the vowel. RFF values for the cycles closest to the voiceless consonant, that is, Offset Cycle 10 and Onset Cycle 1, were examined. Results Average RFF values for Offset Cycle 10 and Onset Cycle 1 did not differ between CwVN and CwTV; however, within-subject variability of Offset Cycle 10 was decreased in CwVN. Across both groups, male children had lower Offset Cycle 10 RFF values as compared to female children. Additionally, Onset Cycle 1 values were decreased in younger children as compared to those of older children. Conclusions Unlike previous work with adults, CwVN did not have significantly different RFF values than CwTV. Younger children had lower RFF values for Onset Cycle 1 than older children, suggesting that vocal onset f0 may provide information on the maturity of the laryngeal motor system.
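The normalization step described in the Method can be sketched in Python as follows. The sketch assumes that normalization means a semitone conversion relative to a single steady-state reference frequency, which the abstract does not spell out; the function name, the choice of reference, and the cycle values are hypothetical.

```python
import math


def relative_f0_semitones(cycle_f0_hz, reference_f0_hz):
    """Express each cycle's f0 in semitones relative to a steady-state reference.

    Assumption: a semitone scale, 12 * log2(f0 / reference). The abstract only
    says each cycle is normalized to a steady-state portion of the vowel.
    """
    if reference_f0_hz <= 0:
        raise ValueError("reference f0 must be positive")
    return [12.0 * math.log2(f0 / reference_f0_hz) for f0 in cycle_f0_hz]


# Hypothetical f0 values (Hz) for the 10 offset cycles before a voiceless
# consonant; here the first cycle stands in for the steady-state reference.
offset_cycles_hz = [220.0, 220.0, 219.5, 219.0, 218.0, 217.0, 215.5, 214.0, 212.0, 209.0]
offset_rff = relative_f0_semitones(offset_cycles_hz, reference_f0_hz=offset_cycles_hz[0])

# Offset Cycle 10 is the cycle closest to the voiceless consonant.
print(f"Offset Cycle 10 RFF: {offset_rff[-1]:.2f} ST")
```

A negative value means the cycle's f0 sits below the steady-state reference. Onset cycles would be handled the same way with their own reference.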
Affiliation(s)
- Elizabeth S. Heller Murray
  - Department of Speech, Language & Hearing Sciences, Boston University, MA
  - Department of Otolaryngology and Communication Enhancement, Boston Children's Hospital, MA
- Roxanne K. Segina
  - Department of Speech, Language & Hearing Sciences, Boston University, MA
- Cara E. Stepp
  - Department of Speech, Language & Hearing Sciences, Boston University, MA
  - Department of Otolaryngology—Head & Neck Surgery, Boston University School of Medicine, MA
  - Department of Biomedical Engineering, Boston University, MA
|
39
|
Lin JZ, Espinoza VM, Marks KL, Zañartu M, Mehta DD. Improved subglottal pressure estimation from neck-surface vibration in healthy speakers producing non-modal phonation. IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING 2020; 14:449-460. [PMID: 34079612 PMCID: PMC8168553 DOI: 10.1109/jstsp.2019.2959267] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 05/15/2023]
Abstract
Subglottal air pressure plays a major role in voice production and is a primary factor in controlling voice onset, offset, sound pressure level, glottal airflow, vocal fold collision pressures, and variations in fundamental frequency. Previous work has shown promise for the estimation of subglottal pressure from an unobtrusive miniature accelerometer sensor attached to the anterior base of the neck during typical modal voice production across multiple pitch and vowel contexts. This study expands on that work to incorporate additional accelerometer-based measures of vocal function to compensate for non-modal phonation characteristics and achieve an improved estimation of subglottal pressure. Subjects with normal voices repeated /p/-vowel syllable strings from loud-to-soft levels in multiple vowel contexts (/ɑ/, /i/, and /u/), pitch conditions (comfortable, lower than comfortable, higher than comfortable), and voice quality types (modal, breathy, strained, and rough). Subject-specific, stepwise regression models were constructed using root-mean-square (RMS) values of the accelerometer signal alone (baseline condition) and in combination with cepstral peak prominence, fundamental frequency, and glottal airflow measures derived using subglottal impedance-based inverse filtering. Five-fold cross-validation assessed the robustness of model performance using the root-mean-square error metric for each regression model. Each cross-validation fold exhibited up to a 25% decrease in prediction error when the model incorporated multidimensional aspects of the accelerometer signal compared with RMS-only models. Improved estimation of subglottal pressure for non-modal phonation was thus achievable, lending to future studies of subglottal pressure estimation in patients with voice disorders and in ambulatory voice recordings.
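The cross-validation procedure described above can be sketched in a simplified form. The Python example below is only an illustration under stated assumptions: it fits an ordinary least-squares line with a single predictor (standing in for the RMS-only baseline) rather than the stepwise multi-predictor models in the study, it assigns folds by interleaving indices, and all names and data are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def k_fold_rmse(xs, ys, k=5):
    """Return the held-out RMSE for each of k folds."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        # Every k-th sample, starting at `fold`, is held out for testing.
        test_idx = set(range(fold, n, k))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        slope, intercept = fit_line(train_x, train_y)
        squared = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in test_idx]
        fold_errors.append(math.sqrt(sum(squared) / len(squared)))
    return fold_errors


# Hypothetical accelerometer RMS values (arbitrary units) and pressures (cm H2O).
accel_rms = [0.10, 0.15, 0.22, 0.30, 0.35, 0.41, 0.50, 0.58, 0.66, 0.75]
pressure = [4.1, 5.0, 6.3, 7.9, 8.6, 9.9, 11.4, 13.1, 14.2, 16.0]

for fold, err in enumerate(k_fold_rmse(accel_rms, pressure), start=1):
    print(f"fold {fold}: RMSE = {err:.2f}")
```

Comparing the per-fold RMSE of a baseline model with that of a richer model is one way to express the relative decrease in prediction error that the abstract reports.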
Affiliation(s)
- Jon Z Lin
  - Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, MA 02114 USA
- Katherine L Marks
  - Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, MA 02114 USA
- Matías Zañartu
  - Department of Electronic Engineering, Universidad Técnica Federico Santa Maria, Valparaíso, Chile
- Daryush D Mehta
  - Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital-Harvard Medical School, Boston, MA 02114 USA
|
40
|
Vojtech JM, Segina RK, Buckley DP, Kolin KR, Tardif MC, Noordzij JP, Stepp CE. Refining algorithmic estimation of relative fundamental frequency: Accounting for sample characteristics and fundamental frequency estimation method. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2019; 146:3184. [PMID: 31795681 PMCID: PMC6847943 DOI: 10.1121/1.5131025] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 05/30/2019] [Revised: 10/07/2019] [Accepted: 10/08/2019] [Indexed: 05/26/2023]
Abstract
Relative fundamental frequency (RFF) is a promising acoustic measure for evaluating voice disorders. Yet, the accuracy of the current RFF algorithm varies across a broad range of vocal signals. The authors investigated how fundamental frequency (fo) estimation and sample characteristics impact the relationship between manual and semi-automated RFF estimates. Acoustic recordings were collected from 227 individuals with and 256 individuals without voice disorders. Common fo estimation techniques were compared to the autocorrelation method currently implemented in the RFF algorithm. Pitch strength-based categories were constructed using a training set (1158 samples), and algorithm thresholds were tuned to each category. RFF was then computed on an independent test set (291 samples) using category-specific thresholds and compared against manual RFF via mean bias error (MBE) and root-mean-square error (RMSE). Auditory-SWIPE' for fo estimation led to the greatest correspondence with manual RFF and was implemented in concert with category-specific thresholds. Refining fo estimation and accounting for sample characteristics led to increased correspondence with manual RFF [MBE = 0.01 semitones (ST), RMSE = 0.28 ST] compared to the unmodified algorithm (MBE = 0.90 ST, RMSE = 0.34 ST), reducing the MBE and RMSE of semi-automated RFF estimates by 88.4% and 17.3%, respectively.
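The two agreement metrics named above, mean bias error and root-mean-square error, can be sketched in Python as follows. The sign convention (semi-automated estimate minus manual estimate) is an assumption, since the abstract does not state it, and the function names and paired values are hypothetical.

```python
import math


def mean_bias_error(estimates, references):
    """Average signed difference, here taken as estimate minus reference."""
    diffs = [e - r for e, r in zip(estimates, references)]
    return sum(diffs) / len(diffs)


def root_mean_square_error(estimates, references):
    """Square root of the average squared difference."""
    diffs = [e - r for e, r in zip(estimates, references)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical paired RFF values in semitones, for illustration only.
manual_rff = [-1.20, -0.85, -2.10, -0.40, -1.65]
semi_automated_rff = [-1.05, -0.90, -1.80, -0.55, -1.60]

mbe = mean_bias_error(semi_automated_rff, manual_rff)
rmse = root_mean_square_error(semi_automated_rff, manual_rff)
print(f"MBE = {mbe:.2f} ST, RMSE = {rmse:.2f} ST")
```

MBE captures systematic over- or underestimation and can be near zero even when individual errors are large, which is why it is reported alongside RMSE.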
Affiliation(s)
- Jennifer M Vojtech
  - Department of Biomedical Engineering, Boston University, 44 Cummington Mall, Boston, Massachusetts 02215, USA
- Roxanne K Segina
  - Department of Speech, Language, and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Daniel P Buckley
  - Department of Otolaryngology-Head and Neck Surgery, Boston University School of Medicine, 72 East Concord Street, Boston, Massachusetts 02118, USA
- Katharine R Kolin
  - Department of Speech, Language, and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Monique C Tardif
  - Department of Speech, Language, and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- J Pieter Noordzij
  - Department of Otolaryngology-Head and Neck Surgery, Boston University School of Medicine, 72 East Concord Street, Boston, Massachusetts 02118, USA
- Cara E Stepp
  - Department of Speech, Language, and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
|
41
|
Cler GJ, McKenna VS, Dahl KL, Stepp CE. Longitudinal Case Study of Transgender Voice Changes Under Testosterone Hormone Therapy. J Voice 2019; 34:748-762. [PMID: 30987859 DOI: 10.1016/j.jvoice.2019.03.006] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 12/12/2018] [Revised: 03/14/2019] [Accepted: 03/14/2019] [Indexed: 10/27/2022]
Abstract
The purpose of this study was to comprehensively evaluate voice and speech changes in one healthy 30-year-old transgender male undergoing testosterone therapy for transition. Testing occurred at three timepoints before cross-sex hormone therapy and every 2 weeks thereafter for 1 year. Data collected included measures of acoustics, aerodynamics, and laryngeal structure and function via flexible laryngoscopy. Analysis included acoustic correlates of pitch, loudness, voice quality, and vocal tract length, as well as perceptual measures of voice quality and gender. Speaking fundamental frequency (fo) lowered from 183 Hz to 134 Hz. Phonatory frequency range (ie, minimum and maximum singing range) shifted from a range of D#3-E6 to a range of A2-A5. Perceptual measures of voice quality indicated no negative changes. Naïve listeners reliably rated the participant's speech samples as male after 37 weeks on testosterone. Few studies document in detail the variety of voice changes that occur during cross-sex hormone therapy, focusing instead on fo alone. This study adds to the literature a comprehensive case study of speech and voice changes experienced by one transmasculine participant undergoing testosterone therapy.
Affiliation(s)
- Gabriel J Cler
  - Graduate Program for Neuroscience, Boston University, Boston, Massachusetts; Department of Speech, Language, and Hearing Sciences, Boston University, Boston, Massachusetts
- Victoria S McKenna
  - Department of Speech, Language, and Hearing Sciences, Boston University, Boston, Massachusetts
- Kimberly L Dahl
  - Department of Speech, Language, and Hearing Sciences, Boston University, Boston, Massachusetts
- Cara E Stepp
  - Graduate Program for Neuroscience, Boston University, Boston, Massachusetts; Department of Speech, Language, and Hearing Sciences, Boston University, Boston, Massachusetts; Department of Biomedical Engineering, Boston University, Boston, Massachusetts; Department of Otolaryngology - Head and Neck Surgery, Boston University School of Medicine, Boston, Massachusetts
|