1
Iob NA, He L, Ternström S, Cai H, Brockmann-Bauser M. Effects of Speech Characteristics on Electroglottographic and Instrumental Acoustic Voice Analysis Metrics in Women With Structural Dysphonia Before and After Treatment. Journal of Speech, Language, and Hearing Research: JSLHR 2024; 67:1660-1681. [PMID: 38758676 DOI: 10.1044/2024_jslhr-23-00253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/19/2024]
Abstract
PURPOSE Literature suggests a dependency of the acoustic metrics, smoothed cepstral peak prominence (CPPS) and harmonics-to-noise ratio (HNR), on human voice loudness and fundamental frequency (F0). Even though this has been explained with different oscillatory patterns of the vocal folds, so far, it has not been specifically investigated. In the present work, the influence of three elicitation levels, calibrated sound pressure level (SPL), F0 and vowel on the electroglottographic (EGG) and time-differentiated EGG (dEGG) metrics hybrid open quotient (OQ), dEGG OQ and peak dEGG, as well as on the acoustic metrics CPPS and HNR, was examined, and their suitability for voice assessment was evaluated. METHOD In a retrospective study, 29 women with a mean age of 25 years (± 8.9, range: 18-53) diagnosed with structural vocal fold pathologies were examined before and after voice therapy or phonosurgery. Both acoustic and EGG signals were recorded simultaneously during the phonation of the sustained vowels /ɑ/, /i/, and /u/ at three elicited levels of loudness (soft/comfortable/loud) and unconstrained F0 conditions. RESULTS A linear mixed-model analysis showed a significant effect of elicitation effort levels on peak dEGG, HNR, and CPPS (all p < .01). Calibrated SPL significantly influenced HNR and CPPS (both p < .01). Furthermore, F0 had a significant effect on peak dEGG and CPPS (p < .0001). All metrics showed significant changes with regard to vowel (all p < .05). However, the treatment had no effect on the examined metrics, regardless of the treatment type (surgery vs. voice therapy). CONCLUSIONS The value of the investigated metrics for voice assessment purposes when sampled without sufficient control of SPL and F0 is limited, in that they are significantly influenced by the phonatory context, be it speech or elicited sustained vowels. 
Future studies should explore the diagnostic value of new data collation approaches such as voice mapping, which take SPL and F0 effects into account.
Affiliation(s)
- Naomi Anna Iob
  - Division of Phoniatrics and Speech Pathology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Zurich, University of Zurich, Switzerland
- Lei He
  - Division of Phoniatrics and Speech Pathology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Zurich, University of Zurich, Switzerland
  - Department of Computational Linguistics, University of Zurich, Switzerland
- Sten Ternström
  - Division of Speech, Music and Hearing, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden
- Huanchen Cai
  - Division of Speech, Music and Hearing, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden
- Meike Brockmann-Bauser
  - Division of Phoniatrics and Speech Pathology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Zurich, University of Zurich, Switzerland

2
Ternström S. Pragmatic De-Noising of Electroglottographic Signals. Bioengineering (Basel) 2024; 11:479. [PMID: 38790346 PMCID: PMC11117636 DOI: 10.3390/bioengineering11050479] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2024] [Revised: 04/30/2024] [Accepted: 05/07/2024] [Indexed: 05/26/2024] [Open access]
Abstract
In voice analysis, the electroglottographic (EGG) signal has long been recognized as a useful complement to the acoustic signal, but only when the vocal folds are actually contacting, such that this signal has an appreciable amplitude. However, phonation can also occur without the vocal folds contacting, as in breathy voice, in which case the EGG amplitude is low, but not zero. It is of great interest to identify the transition from non-contacting to contacting, because this will substantially change the nature of the vocal fold oscillations; however, that transition is not in itself audible. The magnitude of the cycle-normalized peak derivative of the EGG signal is a convenient indicator of vocal fold contacting, but no current EGG hardware has a sufficient signal-to-noise ratio of the derivative. We show how the textbook techniques of spectral thresholding and static notch filtering are straightforward to implement, can run in real time, and can mitigate several noise problems in EGG hardware. This can be useful to researchers in vocology.
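As an illustration of the static notch filtering mentioned in this abstract, the following is a minimal sketch of a generic second-order IIR notch filter in pure Python. It is not the author's implementation. The sample rate (8 kHz), the interference frequency (50 Hz), the quality factor, and all function names are assumptions chosen for the example; the coefficient formulas follow the widely used audio-EQ "cookbook" biquad form. Spectral thresholding is not shown.

```python
import math


def notch_coefficients(f0, fs, q=10.0):
    """Return (b, a) for a second-order notch at f0 Hz, normalized so a[0] == 1.

    f0, fs and q are illustrative parameters, not values from the paper.
    """
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a


def apply_biquad(x, b, a):
    """Filter the sample list x with a direct-form I biquad."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def rms(x):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))


FS = 8000.0       # assumed sample rate in Hz
HUM_HZ = 50.0     # assumed mains-hum frequency to remove
TONE_HZ = 200.0   # assumed test tone standing in for the wanted signal
N = 8000          # one second of samples

hum = [math.sin(2.0 * math.pi * HUM_HZ * i / FS) for i in range(N)]
tone = [math.sin(2.0 * math.pi * TONE_HZ * i / FS) for i in range(N)]

b, a = notch_coefficients(HUM_HZ, FS)
hum_out = apply_biquad(hum, b, a)
tone_out = apply_biquad(tone, b, a)

# Compare levels over the second half only, after the filter transient has decayed.
half = N // 2
hum_gain = rms(hum_out[half:]) / rms(hum[half:])
tone_gain = rms(tone_out[half:]) / rms(tone[half:])
print(f"gain at {HUM_HZ:.0f} Hz: {hum_gain:.6f}, gain at {TONE_HZ:.0f} Hz: {tone_gain:.4f}")
```

In this sketch the notch is static, meaning that its coefficients are computed once and never adapted, so it only helps when the interfering frequency is stable. The per-sample cost is five multiplications, which is why such a filter can run in real time.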
Affiliation(s)
- Sten Ternström
  - Division of Speech, Music and Hearing, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, 100 44 Stockholm, Sweden

3
Cai H, Ternström S, Chaffanjon P, Henrich Bernardoni N. Effects on Voice Quality of Thyroidectomy: A Qualitative and Quantitative Study Using Voice Maps. J Voice 2024:S0892-1997(24)00082-1. [PMID: 38714436 DOI: 10.1016/j.jvoice.2024.03.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Revised: 03/11/2024] [Accepted: 03/12/2024] [Indexed: 05/09/2024]
Abstract
OBJECTIVES This study aims to explore the effects of thyroidectomy (a surgical intervention involving the removal of the thyroid gland) on voice quality, as represented by acoustic and electroglottographic measures. Given the thyroid gland's proximity to the inferior and superior laryngeal nerves, thyroidectomy carries a potential risk of affecting vocal function. While earlier studies have documented effects on the voice range, few studies have looked at voice quality after thyroidectomy. Since voice quality effects could manifest in many ways that are unknown a priori, we wish to apply an exploratory approach that collects many data points from several metrics. METHODS A voice-mapping analysis paradigm was applied retrospectively on a corpus of spoken and sung sentences produced by patients who had thyroid surgery. Voice quality changes were assessed objectively for 57 patients prior to surgery and 2 months after surgery, by making comparative voice maps, pre- and post-intervention, of six acoustic and electroglottographic (EGG) metrics. RESULTS After thyroidectomy, statistically significant changes consistent with a worsening of voice quality were observed in most metrics. For all individual metrics, however, the effect sizes were too small to be clinically relevant. Statistical clustering of the metrics helped to clarify the nature of these changes. While partial thyroidectomy demonstrated greater uniformity than did total thyroidectomy, the type of perioperative damage had no discernible impact on voice quality. CONCLUSIONS Changes in voice quality after thyroidectomy were related mostly to increased phonatory instability in both the acoustic and EGG metrics. Clustered voice metrics exhibited a higher correlation to voice complaints than did individual voice metrics.
Affiliation(s)
- Huanchen Cai
  - Division of Speech, Music and Hearing, KTH Royal Institute of Technology, Stockholm, Sweden
- Sten Ternström
  - Division of Speech, Music and Hearing, KTH Royal Institute of Technology, Stockholm, Sweden
- Philippe Chaffanjon
  - University of Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, France; Medical School, Université Grenoble Alpes, Grenoble, France

4
Fleischer M, Rummel S, Stritt F, Fischer J, Bock M, Echternach M, Richter B, Traser L. Voice efficiency for different voice qualities combining experimentally derived sound signals and numerical modeling of the vocal tract. Front Physiol 2022; 13:1081622. [PMID: 36620215 PMCID: PMC9822708 DOI: 10.3389/fphys.2022.1081622] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Accepted: 12/01/2022] [Indexed: 12/24/2022] [Open access]
Abstract
Purpose: Only limited data are available to date on the voice efficiency of different singing styles, from Western classical singing to contemporary commercial music. This single-subject study attempts to quantify the acoustic sound intensity within the human glottis depending on different vocal tract configurations and vocal fold vibration. Methods: Combining finite-element models derived from 3D-MRI data, audio recordings, and electroglottography (EGG), we analyzed vocal tract transfer functions, particle velocity and acoustic pressure at the glottis, and EGG-related quantities to evaluate voice efficiency at the glottal level and resonance characteristics of different voice qualities according to Estill Voice Training®. Results: The voice qualities Opera and Belting represent highly efficient strategies but apply different vowel strategies, and should thus be capable of predominating over orchestral sounds. Twang and Belting use similar vowels, but the Twang vocal tract configuration enabled the occurrence of anti-resonances and was associated with reduced vocal fold contact but still partially comparable energy transfer from the glottis to the vocal tract. Speech was associated with highly efficient glottal-to-vocal-tract energy transfer, but the absence of psychoacoustic strategies makes it more susceptible to noise interference. Falsetto and Sobbing are less efficient: Falsetto mainly because of its voice source characteristics, Sobbing because of energy loss in the vocal tract. Technical amplification might thus be appropriate for these qualities. Conclusion: Differences exist between voice qualities regarding the sound intensity, caused by different vocal tract morphologies and oscillation characteristics of the vocal folds. The combination of numerical analysis of geometries inside the human body and experimentally determined data outside sheds light on acoustical quantities at the glottal level.
Affiliation(s)
- Mario Fleischer
  - Department of Audiology and Phoniatrics, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany, *Correspondence: Mario Fleischer,
- Fiona Stritt
  - Medical Center, Institute of Musicians’ Medicine, University of Freiburg, Freiburg, Germany; Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Johannes Fischer
  - Faculty of Medicine, University of Freiburg, Freiburg, Germany; Medical Center, Department of Radiology, Medical Physics, University of Freiburg, Freiburg, Germany
- Michael Bock
  - Faculty of Medicine, University of Freiburg, Freiburg, Germany; Medical Center, Department of Radiology, Medical Physics, University of Freiburg, Freiburg, Germany
- Matthias Echternach
  - Department of Otorhinolaryngology, Ludwig-Maximilians-Universität München, Division of Phoniatrics and Pediatric Audiology, LMU Klinikum, Munich, Germany
- Bernhard Richter
  - Medical Center, Institute of Musicians’ Medicine, University of Freiburg, Freiburg, Germany; Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Louisa Traser
  - Medical Center, Institute of Musicians’ Medicine, University of Freiburg, Freiburg, Germany; Faculty of Medicine, University of Freiburg, Freiburg, Germany

5
Patel RR, Ternström S. Quantitative and Qualitative Electroglottographic Wave Shape Differences in Children and Adults Using Voice Map-Based Analysis. Journal of Speech, Language, and Hearing Research: JSLHR 2021; 64:2977-2995. [PMID: 34319772 DOI: 10.1044/2021_jslhr-20-00717] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 05/27/2023]
Abstract
Purpose The purpose of this study is to identify the extent to which various measurements of contacting parameters differ between children and adults during habitual range and overlap vocal frequency/intensity, using voice map-based assessment of noninvasive electroglottography (EGG). Method EGG voice maps were analyzed from 26 adults (22-45 years) and 22 children (4-8 years) during connected speech and vowel /a/ over the habitual range and the overlap vocal frequency/intensity from the voice range profile task on the vowel /a/. Means and standard deviations of contact quotient by integration, normalized contacting speed, quotient of speed by integration, and cycle-rate sample entropy were obtained. Group differences were evaluated using linear mixed model analysis for the habitual range connected speech and the vowel, whereas analysis of covariance was conducted for the overlap vocal frequency/intensity from the voice range profile task. Presence of a "knee" on the EGG wave shape was determined by visual inspection of the presence of convexity along the decontacting slope of the EGG pulse and the presence of the second derivative zero-crossing. Results The contact quotient by integration, normalized contacting speed, quotient of speed by integration, and cycle-rate sample entropy were significantly different in children compared to (a) adult males for habitual range and (b) adult males and adult females for the overlap vocal frequency/intensity. None of the children had a "knee" on the decontacting slope of the EGG pulse.
Conclusion EGG parameters of contact quotient by integration, normalized contacting speed, quotient of speed by integration, cycle-rate sample entropy, and absence of a "knee" on the decontacting slope characterize the wave shape differences between children and adults, whereas the normalized contacting speed, quotient of speed by integration, cycle-rate sample entropy, and presence of a "knee" on the downward pulse slope characterize the wave shape differences between adult males and adult females. Supplemental Material https://doi.org/10.23641/asha.15057345.
Affiliation(s)
- Rita R Patel
  - Department of Speech, Language and Hearing Sciences, Indiana University Bloomington
- Sten Ternström
  - Division of Speech, Music, and Hearing, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden

6
Lã FM, Ternström S. Flow ball-assisted voice training: Immediate effects on vocal fold contacting. Biomed Signal Process Control 2020. [DOI: 10.1016/j.bspc.2020.102064] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/26/2022]