1
|
Tomaszewska JZ, Georgakis A. Electroglottography in Medical Diagnostics of Vocal Tract Pathologies: A Systematic Review. J Voice 2023:S0892-1997(23)00388-0. [PMID: 38143204 DOI: 10.1016/j.jvoice.2023.12.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Revised: 12/02/2023] [Accepted: 12/04/2023] [Indexed: 12/26/2023]
Abstract
Electroglottography (EGG) is a technology developed for measuring the vocal fold contact area during human voice production. Although EGG is considered subjective and unreliable as a sole diagnostic method, with the correct application of relevant computational methods it can become one of the most promising non-invasive diagnostic tools for voice disorders, in the form of a digital vocal tract pathology classifier. The aim of this study is to gather and evaluate existing digital voice quality assessment systems and vocal tract abnormality classification systems that rely on electroglottographic bio-impedance signals. To make the findings of this review fully comprehensible, the subject of EGG is introduced first. To that end, we summarise the most relevant existing research on EGG, with a particular focus on its application in diagnostics. We then move on to the focal point of this work, which is describing and comparing the existing EGG-based digital voice pathology classification systems. Using the PRISMA model, 13 articles were chosen and analysed in detail. Direct comparison of the chosen studies led to pivotal conclusions, which are described in Section 5 of this report. Certain limitations arising from the literature were also identified, such as a questionable understanding of the nature of EGG bio-impedance signals. Appropriate recommendations for future work were made, including the application of different methods for EGG feature extraction and the need for continued development of EGG datasets containing signals gathered under various conditions and with different equipment.
|
2
|
Codino J, Jackson-Menaldi MC, Rubin A, Torres ME. Automated Quantification of Inflection Events in the Electroglottographic Signal. J Voice 2023; 37:640-647. [PMID: 34162494 DOI: 10.1016/j.jvoice.2021.05.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2021] [Revised: 05/01/2021] [Accepted: 05/10/2021] [Indexed: 11/26/2022]
Affiliation(s)
- Juliana Codino
- Lakeshore Professional Voice Center, Lakeshore Ear, Nose and Throat Center, MI, USA
| | - María Cristina Jackson-Menaldi
- Laboratorio de Señales y Dinámicas no Lineales, Facultad de Ingeniería, Universidad Nacional de Entre Ríos, Argentina, National Council for Scientific and Technical Research (CONICET), Argentina
| | - Adam Rubin
- Laboratorio de Señales y Dinámicas no Lineales, Facultad de Ingeniería, Universidad Nacional de Entre Ríos, Argentina, National Council for Scientific and Technical Research (CONICET), Argentina
| | - María Eugenia Torres
- Laboratorio de Señales y Dinámicas no Lineales, Facultad de Ingeniería, Universidad Nacional de Entre Ríos, Argentina, National Council for Scientific and Technical Research (CONICET), Argentina
|
3
|
Zhao W, Singh R. Deriving Vocal Fold Oscillation Information from Recorded Voice Signals Using Models of Phonation. Entropy (Basel) 2023; 25:1039. [PMID: 37509986 PMCID: PMC10378572 DOI: 10.3390/e25071039] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/17/2023] [Revised: 07/03/2023] [Accepted: 07/07/2023] [Indexed: 07/30/2023]
Abstract
During phonation, the vocal folds exhibit a self-sustained oscillatory motion, which is influenced by the physical properties of the speaker's vocal folds and driven by the balance of bio-mechanical and aerodynamic forces across the glottis. Subtle changes in the speaker's physical state can affect voice production and alter these oscillatory patterns. Measuring these can be valuable in developing computational tools that analyze voice to infer the speaker's state. Traditionally, vocal fold oscillations (VFOs) are measured directly using physical devices in clinical settings. In this paper, we propose a novel analysis-by-synthesis approach that allows us to infer the VFOs directly from recorded speech signals on an individualized, speaker-by-speaker basis. The approach, called the ADLES-VFT algorithm, is proposed in the context of a joint model that combines a phonation model (with a glottal flow waveform as the output) and a vocal tract acoustic wave propagation model such that the output of the joint model is an estimated waveform. The ADLES-VFT algorithm is a forward-backward algorithm which minimizes the error between the recorded waveform and the output of this joint model to estimate its parameters. Once estimated, these parameter values are used in conjunction with a phonation model to obtain its solutions. Since the parameters correlate with the physical properties of the vocal folds of the speaker, model solutions obtained using them represent the individualized VFOs for each speaker. The approach is flexible and can be applied to various phonation models. In addition to presenting the methodology, we show how the VFOs can be quantified from a dynamical systems perspective for classification purposes. Mathematical derivations are provided in an appendix for better readability.
Affiliation(s)
- Wayne Zhao
- Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Rita Singh
- School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
|
4
|
Analysis of localized bioimpedance from healthy young adults during activities of the vocal folds using Cole-impedance model representation. Biomed Signal Process Control 2021. [DOI: 10.1016/j.bspc.2021.102665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
|
5
|
Echternach M, Herbst CT, Köberlein M, Story B, Döllinger M, Gellrich D. Are source-filter interactions detectable in classical singing during vowel glides? J Acoust Soc Am 2021; 149:4565. [PMID: 34241428 DOI: 10.1121/10.0005432] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 12/04/2020] [Accepted: 06/03/2021] [Indexed: 06/13/2023]
Abstract
In recent studies, it has been assumed that vocal tract formants (Fn) and the voice source could interact. However, only a few studies have analyzed this assumption in vivo. Here, the vowel transition /i/-/a/-/u/-/i/ of 12 professional classical singers (6 females, 6 males) phonating on the pitch D4 [fundamental frequency (ƒo) ca. 294 Hz] was analyzed using transnasal high speed videoendoscopy (20,000 fps), electroglottography (EGG), and audio recordings. Fn data were calculated using a cepstral method. Source-filter interaction candidates (SFICs) were determined by (a) algorithmic detection of major intersections of Fn/nƒo and (b) perceptual assessment of the EGG signal. Although the open quotient showed some increase for the /i-a/ and /u-i/ transitions, there were no clear effects at the expected Fn/nƒo intersections. In contrast, ƒo adjustments and changes in the phonovibrogram occurred at perceptually derived SFICs, suggesting level-two interactions. In some cases, these were constituted by intersections between higher nƒo and Fn. The presented data partially corroborate that vowel transitions may result in level-two interactions in professional singers as well. However, the lack of systematically detectable effects suggests either the absence of a strong interaction or the existence of confounding factors, which may counterbalance the level-two interactions.
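As an illustration of what "algorithmic detection of intersections of Fn/nƒo" can mean in practice, the sketch below flags frames where a formant track crosses an integer multiple of ƒo. It is a minimal example under stated assumptions, not the authors' procedure: the function name `harmonic_formant_crossings`, the frame-wise formant track, and the `max_harmonic` limit are all hypothetical.

```python
def harmonic_formant_crossings(fo, formant_track, max_harmonic=10):
    """Return (frame_index, n) pairs where the formant track crosses n * fo.

    fo            -- fundamental frequency in Hz, assumed constant (hypothetical input)
    formant_track -- one formant frequency per analysis frame, in Hz
    max_harmonic  -- highest harmonic number n to test (illustrative default)
    """
    crossings = []
    for n in range(1, max_harmonic + 1):
        harmonic = n * fo
        for i in range(1, len(formant_track)):
            below_before = formant_track[i - 1] < harmonic
            below_now = formant_track[i] < harmonic
            # A crossing occurs when the track changes side of the harmonic
            if below_before != below_now:
                crossings.append((i, n))
    return crossings


# Illustrative rising formant glide against fo = 294 Hz (pitch D4)
example = harmonic_formant_crossings(294.0, [300.0, 500.0, 700.0, 900.0])
```

In a real recording ƒo drifts and formant estimates are noisy, so a practical detector would use a per-frame ƒo and some smoothing; both are omitted here for brevity.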
Affiliation(s)
- Matthias Echternach
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Munich University Hospital (LMU), Marchioninistrasse 15, Munich, 81377, Germany
| | - Christian T Herbst
- Antonio Salieri Department of Vocal Studies and Vocal Research in Music Education, University of Music and Performing Arts Vienna, Vienna, Austria
| | - Marie Köberlein
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Munich University Hospital (LMU), Marchioninistrasse 15, Munich, 81377, Germany
| | - Brad Story
- Department of Speech, Language, and Hearing Sciences, University of Arizona, Tucson, Arizona 85718, USA
| | - Michael Döllinger
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head and Neck Surgery, University Hospital Erlangen, Medical School Waldstrasse 1, Erlangen, 91054, Germany
| | - Donata Gellrich
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Munich University Hospital (LMU), Marchioninistrasse 15, Munich, 81377, Germany
|
6
|
Frič M, Hruška V, Dlask P. Full-field face vibration measurement in singing—Case study. Biomed Signal Process Control 2021. [DOI: 10.1016/j.bspc.2021.102427] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
|
7
|
Vaiano T, Herbella FA, Behlau M. High-Resolution Manometry as a Tool for Biofeedback in Vertical Laryngeal Positioning. J Voice 2021; 35:418-421. [DOI: 10.1016/j.jvoice.2019.10.018] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/31/2019] [Revised: 10/31/2019] [Accepted: 10/31/2019] [Indexed: 10/25/2022]
|
8
|
Electroglottography – An Update. J Voice 2020; 34:503-526. [DOI: 10.1016/j.jvoice.2018.12.014] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 11/18/2018] [Revised: 12/27/2018] [Accepted: 12/28/2018] [Indexed: 11/21/2022]
|
9
|
Selamtzis A, Ternström S, Richter B, Burk F, Köberlein M, Echternach M. A comparison of electroglottographic and glottal area waveforms for phonation type differentiation in male professional singers. J Acoust Soc Am 2018; 144:3275. [PMID: 30599695 DOI: 10.1121/1.5066456] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 12/01/2017] [Accepted: 10/14/2018] [Indexed: 06/09/2023]
Abstract
This study compares the use of electroglottograms (EGGs) and glottal area waveforms (GAWs) to study phonation in different vibratory states as produced by professionally trained singers. Six western classical tenors were asked to phonate pitch glides from modal to falsetto phonation, or from modal to their stage voice above the passaggio (SVaP). For each pitch glide the sample entropy (SampEn) of the EGG signal was calculated to detect the occurrence of phonatory instabilities and establish a "ground truth" for the performed phonation type. The cycles before the maximum SampEn were labeled as modal, and the cycles after the peak were labeled as either falsetto, or SVaP. Three automatic categorizations of vibratory state were performed using clustering: one based only on the EGG, one based on the GAW, and one based on their combination. The error rate (clustering vs ground truth) was, on average, lower than 10% for all of the three settings, revealing no special advantage of the GAW over EGG, and vice versa. Modal voice cycles exhibited a larger contact quotient, larger normalized derivative peak ratio, and lower rise time, compared to SVaP and falsetto. The GAW-based normalized maximum area declination rate was larger in SVaP compared to modal voice.
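For readers unfamiliar with sample entropy, the sketch below shows one common way to compute SampEn(m, r) for a short signal segment. It is a generic textbook-style implementation, not the code used in the study: the defaults `m=2` and `r=0.2` (as a fraction of the standard deviation), the `<=` tolerance convention, and the function name are assumptions made for illustration.

```python
import math


def sample_entropy(x, m=2, r=0.2):
    """Sample entropy of sequence x.

    m -- template length (assumed default, not taken from the study)
    r -- tolerance as a fraction of the standard deviation of x (assumed default)
    """
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    tol = r * sd

    def count_matches(length):
        # Count template pairs (i < j) whose Chebyshev distance is within tol.
        # The same n - m start positions are used for both template lengths.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if max(abs(x[i + k] - x[j + k]) for k in range(length)) <= tol:
                    count += 1
        return count

    b = count_matches(m)       # matches of length m
    a = count_matches(m + 1)   # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")    # undefined for this segment; reported as infinity here
    return -math.log(a / b)
```

A perfectly regular segment gives a value of zero, while irregular segments give larger values, which is why a peak in a sliding-window SampEn can serve as a marker of phonatory instability. The quadratic loop is only suitable for short windows.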
Affiliation(s)
- Andreas Selamtzis
- Department of Speech, Music and Hearing, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Lindstedtsvägen 24, Breisacherstraße 60, Stockholm, SE-100 44, Sweden
| | - Sten Ternström
- Department of Speech, Music and Hearing, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Lindstedtsvägen 24, Breisacherstraße 60, Stockholm, SE-100 44, Sweden
| | - Bernard Richter
- Institute of Musicians' Medicine, Freiburg University Medical Center, Breisacher Strasse 60, Freiburg, 79106, Germany
| | - Fabian Burk
- Department of Otorhinolaryngology, University Medical Center Schleswig-Holstein, Arnold-Heller-Straße 3, 24105 Kiel, Germany
| | - Marie Köberlein
- Institute of Musicians' Medicine, Freiburg University Medical Center, Breisacher Strasse 60, Freiburg, 79106, Germany
| | - Matthias Echternach
- Department of Otorhinolaryngology, Division of Phoniatrics and Pediatric Audiology, Munich University, Campus Großhadern, Marchioninistraße 15, Munich, 81377, Germany
|
10
|
Rasmussen JH, Herbst CT, Elemans CPH. Quantifying syringeal dynamics in vitro using electroglottography. J Exp Biol 2018; 221:jeb.172247. [PMID: 29880637 DOI: 10.1242/jeb.172247] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/16/2017] [Accepted: 05/30/2018] [Indexed: 11/20/2022]
Abstract
The complex and elaborate vocalizations uttered by many of the 10,000 extant bird species are considered a major driver in their evolutionary success, warranting study of the underlying mechanisms of vocal production. Additionally, birdsong has developed into a highly productive model system for vocal imitation learning and motor control, where, in contrast to humans, we have experimental access to the entire neuromechanical control loop. In human voice production, complex laryngeal geometry, vocal fold tissue properties, airflow and laryngeal musculature all interact to ultimately control vocal fold kinematics. Quantifying vocal fold kinematics is thus critical to understanding neuromechanical control of voiced sound production, but in vivo imaging of vocal fold kinematics in birds is experimentally challenging. Here, we adapted and tested electroglottography (EGG) as a novel tool for examining vocal fold kinematics in the avian vocal organ, the syrinx. We furthermore imaged and quantified syringeal kinematics in the pigeon (Columba livia) syrinx with unprecedented detail. Our results show that EGG signals predict (1) the relative amount of contact between the avian equivalent of vocal folds and (2) essential parameters describing vibratory kinematics, such as fundamental frequency, and timing of syringeal opening and closing events. As such, EGG provides novel opportunities for measuring syringeal vibratory kinematic parameters in vivo. Furthermore, the opportunity for imaging syringeal vibratory kinematics from multiple planar views (horizontal and coronal) simultaneously promotes birds as an excellent model system for studying kinematics and control of voiced sound production in general, including in humans and other mammals.
Affiliation(s)
- Jeppe H Rasmussen
- Department of Biology, University of Southern Denmark, 5230 Odense, Denmark
| | - Christian T Herbst
- Department of Cognitive Biology, University of Vienna, 1090 Vienna, Austria
| | - Coen P H Elemans
- Department of Biology, University of Southern Denmark, 5230 Odense, Denmark
|
11
|
Herbst CT, Dunn JC. Non-invasive documentation of primate voice production using electroglottography. Anthropol Sci 2018. [DOI: 10.1537/ase.180201] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/01/2022]
Affiliation(s)
| | - Jacob C. Dunn
- Department of Animal and Environmental Biology, Faculty of Science & Technology, Anglia Ruskin University, Cambridge
- Division of Biological Anthropology, University of Cambridge, Cambridge
|
12
|
Echternach M, Burk F, Rose F, Herbst CT, Burdumy M, Döllinger M, Richter B. [Impact of functional mass lesions in professional female singers: Biomechanics of vocal fold oscillation in the register transition regions]. HNO 2017; 66:308-320. [PMID: 29247438 DOI: 10.1007/s00106-017-0447-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/18/2022]
Abstract
BACKGROUND The influence of functional mass lesions on vocal fold oscillation patterns in vocally challenging tasks is not yet understood in detail. MATERIALS AND METHODS Glissandi on the vowel [a:] from 220 to 440 Hz and 440 to 880 Hz were analyzed in three groups of four professional female singers: without a mass lesion or dysphonia (group A), with a functional mass lesion (swellings without a great impact on oscillation patterns during stroboscopy; group B), and with organic dysphonia (group C). High-speed digital imaging (HSDI; 20,000 fps), and acoustic and electroglottographic (EGG) signals were used for analysis. Based on the EGG sample entropy, time windows for analysis of register transition phenomena were constructed. The voice signals (glottal area waveform, GAW; acoustic and EGG signals) were perceptually rated in terms of the noticeability of registration events. RESULTS The absolute sample entropy revealed maxima in fundamental frequency regions where register transitions typically occur. Groups A and B could be distinguished neither by perceptual rating nor based on sample entropy values. In comparison to the other two groups, the absolute sample entropy values of group C were greater in the lower glissando. However, the larger vocal fold oscillatory irregularities in this group were observable for the upper glissando. CONCLUSION Functional mass lesions do not influence biomechanics adversely in vocally challenging tasks such as register transitions. The use of sample entropy as a criterion for detection of register transitions is promising, but needs further validation.
Affiliation(s)
- M Echternach
- Freiburger Institut für Musikermedizin, Medizinische Fakultät, Albert-Ludwigs-Universität und Universitätsklinikum Freiburg, Breisacher Str. 60, 79106, Freiburg i.Br., Deutschland.
| | - F Burk
- Freiburger Institut für Musikermedizin, Medizinische Fakultät, Albert-Ludwigs-Universität und Universitätsklinikum Freiburg, Breisacher Str. 60, 79106, Freiburg i.Br., Deutschland
| | - F Rose
- Freiburger Institut für Musikermedizin, Medizinische Fakultät, Albert-Ludwigs-Universität und Universitätsklinikum Freiburg, Breisacher Str. 60, 79106, Freiburg i.Br., Deutschland
| | - C T Herbst
- Department für Musikwissenschaft, Universität Mozarteum Salzburg, Salzburg, Österreich
| | - M Burdumy
- Medizin Physik, Medizinische Fakultät, Albert-Ludwigs-Universität und Universitätsklinikum Freiburg, Breisacher Str. 60a, 79106, Freiburg, Deutschland
| | - M Döllinger
- Abteilung für Phoniatrie und Pädaudiologie an der HNO Klinik Erlangen, Universitätsklinikum Erlangen, Bohlenplatz 21, 91054, Erlangen, Deutschland
| | - B Richter
- Freiburger Institut für Musikermedizin, Medizinische Fakultät, Albert-Ludwigs-Universität und Universitätsklinikum Freiburg, Breisacher Str. 60, 79106, Freiburg i.Br., Deutschland
|
13
|
Macerata A, Nacci A, Manti M, Cianchetti M, Matteucci J, Romeo SO, Fattori B, Berrettini S, Laschi C, Ursino F. Evaluation of the Electroglottographic signal variability by amplitude-speed combined analysis. Biomed Signal Process Control 2017. [DOI: 10.1016/j.bspc.2016.10.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 10/20/2022]
|
14
|
Herbst CT, Schutte HK, Bowling DL, Svec JG. Comparing Chalk With Cheese—The EGG Contact Quotient Is Only a Limited Surrogate of the Closed Quotient. J Voice 2017; 31:401-409. [DOI: 10.1016/j.jvoice.2016.11.007] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 07/14/2016] [Revised: 11/06/2016] [Accepted: 11/08/2016] [Indexed: 10/20/2022]
|
15
|
Echternach M, Burk F, Köberlein M, Selamtzis A, Döllinger M, Burdumy M, Richter B, Herbst CT. Laryngeal evidence for the first and second passaggio in professionally trained sopranos. PLoS One 2017; 12:e0175865. [PMID: 28467509 PMCID: PMC5414960 DOI: 10.1371/journal.pone.0175865] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 11/26/2016] [Accepted: 03/31/2017] [Indexed: 11/18/2022]
Abstract
Introduction: Due to a lack of empirical data, the current understanding of the laryngeal mechanics in the passaggio regions (i.e., the fundamental frequency ranges where vocal registration events usually occur) of the female singing voice is still limited. Material and methods: In this study the first and second passaggio regions of 10 professionally trained female classical soprano singers were analyzed. The sopranos performed pitch glides from A3 (ƒo = 220 Hz) to A4 (ƒo = 440 Hz) and from A4 (ƒo = 440 Hz) to A5 (ƒo = 880 Hz) on the vowel [iː]. Vocal fold vibration was assessed with trans-nasal high speed videoendoscopy at 20,000 fps, complemented by simultaneous electroglottographic (EGG) and acoustic recordings. Register breaks were perceptually rated by 12 voice experts. Voice stability was documented with the EGG-based sample entropy. Glottal opening and closing patterns during the passaggi were analyzed, supplemented with open quotient data extracted from the glottal area waveform. Results: In both the first and the second passaggio, variations of vocal fold vibration patterns were found. Four distinct patterns emerged: smooth transitions with either increasing or decreasing durations of glottal closure, abrupt register transitions, and intermediate loss of vocal fold contact. Audible register transitions (in both the first and second passaggi) generally coincided with higher sample entropy values and higher open quotient variance through the respective passaggi. Conclusions: Noteworthy vocal fold oscillatory registration events occur in both the first and the second passaggio even in professional sopranos. The respective transitions are hypothesized to be caused by either (a) a change of laryngeal biomechanical properties; or by (b) vocal tract resonance effects, constituting level 2 source-filter interactions.
Affiliation(s)
- Matthias Echternach
- Institute of Musicians’ Medicine, University of Freiburg Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
| | - Fabian Burk
- Institute of Musicians’ Medicine, University of Freiburg Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
| | - Marie Köberlein
- Institute of Musicians’ Medicine, University of Freiburg Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
| | - Andreas Selamtzis
- Royal Technical University, Music Acoustics. Lindstedtsvägen 24, Stockholm, Sweden
| | - Michael Döllinger
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Medical School, Waldstrasse 1, Erlangen, Germany
| | - Michael Burdumy
- Department of Medical Physics, University of Freiburg Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
| | - Bernhard Richter
- Institute of Musicians’ Medicine, University of Freiburg Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
| | - Christian Thomas Herbst
- Laboratory of Bio-Acoustics, Department of Cognitive Biology, University of Vienna, Althanstraße 14, Vienna, Austria
|
16
|
Echternach M, Burk F, Köberlein M, Herbst CT, Döllinger M, Burdumy M, Richter B. Oscillatory Characteristics of the Vocal Folds Across the Tenor Passaggio. J Voice 2017; 31:381.e5-381.e14. [DOI: 10.1016/j.jvoice.2016.06.015] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 05/17/2016] [Revised: 06/24/2016] [Accepted: 06/27/2016] [Indexed: 10/21/2022]
|
17
|
Automated Electroglottographic Inflection Events Detection. A Pilot Study. J Voice 2016; 30:768.e1-768.e10. [DOI: 10.1016/j.jvoice.2015.10.020] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 08/05/2015] [Accepted: 10/29/2015] [Indexed: 11/20/2022]
|
18
|
Echternach M, Burk F, Burdumy M, Herbst CT, Köberlein M, Döllinger M, Richter B. The influence of vocal fold mass lesions on the passaggio region of professional singers. Laryngoscope 2016; 127:1392-1401. [DOI: 10.1002/lary.26332] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 06/28/2016] [Revised: 08/10/2016] [Accepted: 08/30/2016] [Indexed: 11/09/2022]
Affiliation(s)
- Matthias Echternach
- Institute of Musicians' Medicine; Freiburg University Medical Center; Freiburg Germany
| | - Fabian Burk
- Institute of Musicians' Medicine; Freiburg University Medical Center; Freiburg Germany
| | - Michael Burdumy
- Division of Radiology; Department of Medical Physics; Freiburg University Medical Center; Freiburg Germany
| | - Christian T. Herbst
- Laboratory of Bio-Acoustics, Department of Cognitive Biology; University of Vienna; Vienna Austria
| | - Marie Köberlein
- Institute of Musicians' Medicine; Freiburg University Medical Center; Freiburg Germany
| | - Michael Döllinger
- Division of Phoniatrics and Pediatric Audiology; Department of Otorhinolaryngology-Head & Neck Surgery; University Hospital Erlangen Medical School; Erlangen Germany
| | - Bernhard Richter
- Institute of Musicians' Medicine; Freiburg University Medical Center; Freiburg Germany
|
19
|
Bourne T, Garnier M, Samson A. Physiological and acoustic characteristics of the male music theatre voice. J Acoust Soc Am 2016; 140:610. [PMID: 27475183 DOI: 10.1121/1.4954751] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 06/06/2023]
Abstract
Six male music theatre singers were recorded in three different voice qualities: legit and two types of belt ("chesty" and "twangy"), on two vowels ([e] and [ɔ]), at four increasing pitches in the upper limit of each singer's belt range (∼250-440 Hz). The audio signal, the electroglottographic (EGG) signal, and the vocal tract impedance were all measured simultaneously. Voice samples were analyzed and then evaluated perceptually by 16 expert listeners. The three qualities were produced with significant differences at the physiological, acoustical, and perceptual levels: Singers produced belt qualities with a higher EGG contact quotient (CQEGG) and greater contacting speed quotient (Qcs), greater sound pressure level (SPL), and energy above 1 kHz (alpha ratio), and with higher frequencies of the first two vocal tract resonances (fR1, fR2), especially in the upper pitch range when compared to legit. Singers produced the chesty belt quality with higher CQEGG, Qcs, and SPL values and lower alpha ratios over the whole belt range, and with higher fR1 at the higher pitch range when compared to twangy belt. Consistent tuning of fR1 to the second voice harmonic (2f0) was observed in all three qualities and for both vowels. Expert listeners tended to identify all qualities based on the same acoustical and physiological variations as those observed in the singers' intended qualities.
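The EGG contact quotient mentioned above is often estimated with a criterion-level method: within one glottal cycle, the fraction of samples whose normalized amplitude exceeds a fixed threshold. The sketch below illustrates that idea only. The 35% threshold, the function name, and the assumption that the input is a single pre-segmented cycle with higher values meaning more contact are illustrative choices, not details taken from the study.

```python
def contact_quotient(cycle, level=0.35):
    """Criterion-level contact quotient for one EGG cycle.

    cycle -- samples of a single glottal cycle (higher value = more contact, assumed)
    level -- threshold as a fraction of the cycle's peak-to-peak amplitude (assumed)
    """
    lowest, highest = min(cycle), max(cycle)
    if highest == lowest:
        return 0.0  # flat segment: no measurable contact phase
    threshold = lowest + level * (highest - lowest)
    contacting = sum(1 for sample in cycle if sample > threshold)
    return contacting / len(cycle)


# Idealized rectangular cycle: 60 samples "in contact", 40 samples "open"
idealized_cycle = [1.0] * 60 + [0.0] * 40
cq = contact_quotient(idealized_cycle)
```

The result depends strongly on the chosen threshold and on how cycles are segmented, which is one reason contact quotient values from different studies are not directly comparable.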
Affiliation(s)
- Tracy Bourne
- Federation University, Arts Academy, Ballarat, Victoria 3353, Australia
| | - Maëva Garnier
- CNRS, GIPSA-lab, 11 rue des Mathématiques, Grenoble Campus BP46, F-38402 Saint Martin d'Hères Cedex, France
| | - Adeline Samson
- Laboratoire Jean Kuntzmann, UMR CNRS 5225, University Grenoble-Alpes, Grenoble, France
|
20
|
Enflo L, Herbst CT, Sundberg J, McAllister A. Comparing Vocal Fold Contact Criteria Derived From Audio and Electroglottographic Signals. J Voice 2015; 30:381-8. [PMID: 26546098 DOI: 10.1016/j.jvoice.2015.05.015] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/23/2015] [Accepted: 05/20/2015] [Indexed: 11/25/2022]
Abstract
OBJECTIVES Collision threshold pressure (CTP), that is, the lowest subglottal pressure facilitating vocal fold contact during phonation, is likely to reflect relevant vocal fold properties. The amplitude of an electroglottographic (EGG) signal or the amplitude of its first derivative (dEGG) has been used as a criterion of such contact. Manual measurement of CTP is time consuming, making the development of a simpler, alternative method desirable. METHOD In this investigation, we compare CTP values measured manually to values automatically derived from dEGG and to values derived from a set of alternative parameters, some obtained from audio and some from EGG signals. One of the parameters was the novel EGG wavegram, which visualizes sequences of EGG or dEGG cycles, normalized with respect to period and amplitude. Raters with and without previous acquaintance with EGG analysis marked the disappearance of vocal fold contact in dEGG and in wavegram displays of /pa:/-sequences produced with continuously decreasing vocal loudness by seven singer subjects. RESULTS Vocal fold contact was mostly identified accurately in displays of both dEGG amplitude and wavegram. Automatically derived CTP values showed high correlation with those measured manually and with those derived from the ratings of the visual displays. Seven other parameters were tested as criteria of such contact. Mainly because of noise in the EGG signal, most of them yielded CTP values differing considerably from those derived from the manual and the automatic methods, although the EGG spectrum slope showed a high correlation. CONCLUSION The possibility of measuring CTP automatically seems promising for future investigations.
Affiliation(s)
- Laura Enflo
- Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden; Department of Speech, Music and Hearing, Royal Institute of Technology (KTH), Stockholm, Sweden; Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts.
- Christian T Herbst
- Voice Research Laboratory, Department of Biophysics, Faculty of Science, Palacký University Olomouc, Olomouc, Czech Republic; Laboratory of Bio-Acoustics, Department of Cognitive Biology, University of Vienna, Vienna, Austria
- Johan Sundberg
- Department of Speech, Music and Hearing, Royal Institute of Technology (KTH), Stockholm, Sweden
- Anita McAllister
- Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden; Department of CLINTEC, Karolinska Institute, Stockholm, Sweden
|
21
|
Awan SN, Krauss AR, Herbst CT. An Examination of the Relationship Between Electroglottographic Contact Quotient, Electroglottographic Decontacting Phase Profile, and Acoustical Spectral Moments. J Voice 2015; 29:519-29. [DOI: 10.1016/j.jvoice.2014.10.016] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 08/11/2014] [Accepted: 10/23/2014] [Indexed: 10/23/2022]
|
22
|
Herbst CT, Hess M, Müller F, Švec JG, Sundberg J. Glottal Adduction and Subglottal Pressure in Singing. J Voice 2015; 29:391-402. [PMID: 25944295 DOI: 10.1016/j.jvoice.2014.08.009] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Received: 04/05/2014] [Accepted: 08/13/2014] [Indexed: 11/17/2022]
Abstract
Previous research suggests that independent variation of vocal loudness and glottal configuration (type and degree of vocal fold adduction) does not occur in untrained speech production. This study investigated whether these factors can be varied independently in trained singing and how subglottal pressure is related to average glottal airflow, voice source properties, and sound level under these conditions. A classically trained baritone produced sustained phonations on the endoscopic vowel [i:] at pitch D4 (approximately 294 Hz), exclusively varying either (a) vocal register; (b) phonation type (from "breathy" to "pressed" via cartilaginous adduction); or (c) vocal loudness, while keeping the others constant. Phonation was documented by simultaneous recording of videokymographic, electroglottographic, airflow and voice source data, and by percutaneous measurement of relative subglottal pressure. Register shifts were clearly marked in the electroglottographic wavegram display. Compared with chest register, falsetto was produced with greater pulse amplitude of the glottal flow, H1-H2, mean airflow, and with lower maximum flow declination rate (MFDR), subglottal pressure, and sound pressure. Shifts of phonation type (breathy/flow/neutral/pressed) induced comparable systematic changes. Increase of vocal loudness resulted in increased subglottal pressure, average flow, sound pressure, MFDR, glottal flow pulse amplitude, and H1-H2. When changing either vocal register or phonation type, subglottal pressure and mean airflow showed an inverse relationship, that is, variation of glottal flow resistance. The direct relation between subglottal pressure and airflow when varying only vocal loudness demonstrated independent control of vocal loudness and glottal configuration. Achieving such independent control of phonatory control parameters would be an important target in vocal pedagogy and in voice therapy.
Affiliation(s)
- Christian T Herbst
- Voice Research Lab, Department of Biophysics, Faculty of Science, Palacký University Olomouc, Olomouc, Czech Republic; Laboratory of Bio-Acoustics, Department of Cognitive Biology, University of Vienna, Wien, Austria.
- Markus Hess
- Department of Voice, Speech and Hearing Disorders, University Medical Center Hamburg-Eppendorf, University of Hamburg, Hamburg, Germany
- Frank Müller
- Department of Voice, Speech and Hearing Disorders, University Medical Center Hamburg-Eppendorf, University of Hamburg, Hamburg, Germany
- Jan G Švec
- Voice Research Lab, Department of Biophysics, Faculty of Science, Palacký University Olomouc, Olomouc, Czech Republic
- Johan Sundberg
- Department of Speech, Music, and Hearing, School of Computer Science and Communication, KTH Voice Research Centre, Stockholm, Sweden; University College of Music Education Stockholm, Stockholm, Sweden
|
23
|
Hohm J, Döllinger M, Bohr C, Kniesburges S, Ziethe A. Influence of F0 and Sequence Length of Audio and Electroglottographic Signals on Perturbation Measures for Voice Assessment. J Voice 2015; 29:517.e11-21. [PMID: 25944290 DOI: 10.1016/j.jvoice.2014.10.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 07/17/2014] [Accepted: 10/01/2014] [Indexed: 12/01/2022]
Abstract
OBJECTIVE: Within the functional assessment of voice disorders, an objective analysis of measured parameters from audio, electroglottographic (EGG), or visual signals is desired. In a typical clinical situation, reliable objective analysis is not always possible due to missing standardization and unknown stability of the clinical parameters. The aim of this study was to investigate the robustness/stability of measured clinical parameters of the audio and EGG signals in a typical clinical setting to ensure a reliable objective analysis. In particular, the influence of F0 and of the sequence length on several definitions of jitter and shimmer will be analyzed. PATIENTS AND METHODS: Seventy-four young healthy women produced a sustained vowel /a/ and an upward triad with abrupt changeovers. Different sequence lengths (100, 150, 500, and 1000 ms) of sustained phonation and triads (100 and 150 ms) were extracted from the audio and EGG signals. In total, six variations of jitter and four variations of shimmer parameters were analyzed. RESULTS: Jitter%, Jitter11p, and JitterPPQ of the audio signal as well as Jittermean, Shimmer, and Shimmer11p of the EGG signal are unaffected by both sequence length and F0. CONCLUSIONS: Influence of F0 and sequence length on several perturbation measures of the audio and EGG signals was identified. For an objective clinical voice assessment, unaffected definitions of jitter and shimmer should be preferred and applied to enable comparability between different recordings, examinations, and studies.
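As a rough illustration of the kind of perturbation measure discussed in this abstract, the sketch below computes a commonly used "local" perturbation percentage: the mean absolute difference between consecutive cycles, relative to the mean. Applied to cycle periods it gives a jitter value, applied to cycle peak amplitudes a shimmer value. The abstract does not state the study's exact definitions of Jitter%, Jitter11p, or JitterPPQ, so this generic formula and both function names are assumptions.

```python
from statistics import mean


def local_perturbation_percent(values):
    """Mean absolute cycle-to-cycle difference relative to the mean, in percent.

    Pass cycle periods for a jitter-like value, or cycle peak
    amplitudes for a shimmer-like value (assumed generic definition).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


def cycles_in_window(f0_hz, window_ms):
    """Approximate number of complete cycles in an analysis window."""
    return int(f0_hz * window_ms / 1000.0)
```

The second helper hints at one plausible reason why F0 and sequence length could matter: at 200 Hz, a 100 ms excerpt holds only about 20 cycles, so any average over cycle-to-cycle differences rests on few values.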
Affiliation(s)
- Julian Hohm
- Department for Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Erlangen, Germany
- Michael Döllinger
- Department for Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Erlangen, Germany
- Christopher Bohr
- Department for Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Erlangen, Germany
- Stefan Kniesburges
- Department for Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Erlangen, Germany
- Anke Ziethe
- Department for Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Erlangen, Germany.
|
24
|
Jannetts S, Lowit A. Cepstral Analysis of Hypokinetic and Ataxic Voices: Correlations With Perceptual and Other Acoustic Measures. J Voice 2014; 28:673-80. [DOI: 10.1016/j.jvoice.2014.01.013] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 12/12/2013] [Accepted: 01/23/2014] [Indexed: 10/25/2022]
|
25
|
Selamtzis A, Ternström S. Analysis of vibratory states in phonation using spectral features of the electroglottographic signal. J Acoust Soc Am 2014; 136:2773-2783. [PMID: 25373977 DOI: 10.1121/1.4896466] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 06/04/2023]
Abstract
The vocal folds can oscillate in several different ways, manifest to practitioners and clinicians as "registers" or "mechanisms," of which the two most often considered are modal voice and falsetto voice. Here these will be taken as instances of different "vibratory states," i.e., distinct quasi-stationary patterns of vibration of the vocal folds. State transitions are common in biomechanical nonlinear oscillators, and they are often abrupt and impossible to predict exactly. Therefore, vibratory states are a source of confounding variation, for instance when acquiring a voice range profile (VRP). In the quest for a state-based, non-invasive VRP, a semi-automatic method based on the short-term spectrum of the electroglottographic (EGG) signal was developed. The method identifies rapid vibratory state transitions, such as the modal-falsetto switch, and clusters the EGG data based on their similarities in the relative levels and phases of the lower frequency components. Productions of known modal and falsetto voice were accurately clustered by a Gaussian mixture model. When mapped into the VRP, this EGG-based clustering revealed connected regions of different vibratory sub-regimes in both modal and falsetto.
Affiliation(s)
- Andreas Selamtzis
- Department of Speech, Music and Hearing, School of Computer Science and Communication, KTH Royal Institute of Technology, SE-100 44 Stockholm, Sweden
- Sten Ternström
- Department of Speech, Music and Hearing, School of Computer Science and Communication, KTH Royal Institute of Technology, SE-100 44 Stockholm, Sweden
|
26
|
Herbst CT, Lohscheller J, Švec JG, Henrich N, Weissengruber G, Fitch WT. Glottal opening and closing events investigated by electroglottography and super-high-speed video recordings. J Exp Biol 2014; 217:955-63. [DOI: 10.1242/jeb.093203] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.4] [Indexed: 11/20/2022]
Abstract
Previous research has suggested that the peaks in the first derivative (dEGG) of the electroglottographic (EGG) signal are good approximate indicators of the events of glottal opening and closing. These findings were based on high-speed video (HSV) recordings with frame rates 10 times lower than the sampling frequencies of the corresponding EGG data. The present study attempts to corroborate these previous findings, utilizing super-HSV recordings. The HSV and EGG recordings (sampled at 27 and 44 kHz, respectively) of an excised canine larynx phonation were synchronized by an external TTL signal to within 0.037 ms. Data were analyzed by means of glottovibrograms, digital kymograms, the glottal area waveform and the vocal fold contact length (VFCL), a new parameter representing the time-varying degree of ‘zippering’ closure along the anterior–posterior (A–P) glottal axis. The temporal offsets between glottal events (depicted in the HSV recordings) and dEGG peaks in the opening and closing phase of glottal vibration ranged from 0.02 to 0.61 ms, amounting to 0.24–10.88% of the respective glottal cycle durations. All dEGG double peaks coincided with vibratory A–P phase differences. In two out of the three analyzed video sequences, peaks in the first derivative of the VFCL coincided with dEGG peaks, again co-occurring with A–P phase differences. The findings suggest that dEGG peaks do not always coincide with the events of glottal closure and initial opening. Vocal fold contacting and de-contacting do not occur at infinitesimally small instants of time, but extend over a certain interval, particularly under the influence of A–P phase differences.
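The dEGG-based event markers examined in this abstract can be sketched as follows. The code assumes one isolated glottal cycle and a polarity in which the EGG value rises with increasing vocal fold contact (polarity conventions differ between devices); the function names and the forward-difference derivative are illustrative choices, not the authors' processing chain.

```python
def degg(egg, fs_hz):
    """First derivative of an EGG signal by forward differences."""
    return [(b - a) * fs_hz for a, b in zip(egg, egg[1:])]


def degg_peak_times(egg_cycle, fs_hz):
    """Return (contacting, de-contacting) peak times in seconds.

    Assumes one glottal cycle and a signal that rises with contact,
    so the dEGG maximum marks contacting and the minimum de-contacting.
    """
    d = degg(egg_cycle, fs_hz)
    contacting = max(range(len(d)), key=d.__getitem__)
    decontacting = min(range(len(d)), key=d.__getitem__)
    return contacting / fs_hz, decontacting / fs_hz


def offset_percent_of_cycle(offset_ms, cycle_ms):
    """Express a temporal offset as a percentage of the cycle duration."""
    return 100.0 * offset_ms / cycle_ms
```

For example, a hypothetical offset of 0.5 ms between a video-derived event and a dEGG peak in a 5 ms cycle corresponds to 10% of that cycle. A single maximum and minimum per cycle cannot represent the double peaks reported in the abstract, which is one reason such markers are only approximate.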
Affiliation(s)
- Christian T. Herbst
- Voice Research Laboratory, Department of Biophysics, Faculty of Science, Palacký University Olomouc, tr. 17. Listopadu 12, 771 46 Olomouc, Czech Republic
- Laboratory of Bio-Acoustics, Department of Cognitive Biology, University of Vienna, Althanstraße 14, 1090 Vienna, Austria
- Jörg Lohscheller
- University of Applied Sciences, Department of Computer Science, Schneidershof, 54293 Trier, Germany
- Jan G. Švec
- Voice Research Laboratory, Department of Biophysics, Faculty of Science, Palacký University Olomouc, tr. 17. Listopadu 12, 771 46 Olomouc, Czech Republic
- Nathalie Henrich
- GIPSA-lab, CNRS, Grenoble INP, Grenoble University, 11 rue des Mathématiques – BP 46, 38402 Saint Martin d'Hères cedex, France
- Gerald Weissengruber
- University of Veterinary Medicine Vienna, Institute for Anatomy, Histology and Embryology, Veterinärplatz 1, 1210 Vienna, Austria
- W. Tecumseh Fitch
- Laboratory of Bio-Acoustics, Department of Cognitive Biology, University of Vienna, Althanstraße 14, 1090 Vienna, Austria
|
27
|
Herbst CT, Herzel H, Švec JG, Wyman MT, Fitch WT. Visualization of system dynamics using phasegrams. J R Soc Interface 2013; 10:20130288. [PMID: 23697715 PMCID: PMC4043161 DOI: 10.1098/rsif.2013.0288] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 03/29/2013] [Accepted: 04/29/2013] [Indexed: 11/29/2022] Open
Abstract
A new tool for visualization and analysis of system dynamics is introduced: the phasegram. Its application is illustrated with both classical nonlinear systems (logistic map and Lorenz system) and with biological voice signals. Phasegrams combine the advantages of sliding-window analysis (such as the spectrogram) with well-established visualization techniques from the domain of nonlinear dynamics. In a phasegram, time is mapped onto the x-axis, and various vibratory regimes, such as periodic oscillation, subharmonics or chaos, are identified within the generated graph by the number and stability of horizontal lines. A phasegram can be interpreted as a bifurcation diagram in time. In contrast to other analysis techniques, it can be automatically constructed from time-series data alone: no additional system parameter needs to be known. Phasegrams show great potential for signal classification and can act as the quantitative basis for further analysis of oscillating systems in many scientific fields, such as physics (particularly acoustics), biology or medicine.
Affiliation(s)
- Christian T Herbst
- Department of Cognitive Biology, Laboratory of Bioacoustics, University of Vienna, Althanstrasse 14, 1090 Vienna, Austria.
|
28
|
Unger J, Meyer T, Herbst CT, Fitch WTS, Döllinger M, Lohscheller J. Phonovibrographic wavegrams: visualizing vocal fold kinematics. J Acoust Soc Am 2013; 133:1055-1064. [PMID: 23363121 DOI: 10.1121/1.4774378] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 06/01/2023]
Abstract
Recently, endoscopic high-speed laryngoscopy has been established for commercial use as a state-of-the-art technique to examine vocal fold kinematics. Since modern cameras provide sampling rates of several thousand frames per second, a high volume of data has to be considered for visual and objective analysis. A method for visualizing endoscopic high-speed videos in three-dimensional cycle-based graphs combining and extending the approaches of phonovibrograms and electroglottographic wavegrams is presented. To build a phonovibrographic wavegram, individual cycles of a phonovibrogram are segmented, normalized in cycle duration, and concatenated over time. For analysis, the emerging three-dimensional scalar field is visualized with different rendering techniques providing information on different aspects of vocal fold kinematics. The phonovibrographic wavegram incorporates information about the glottal closure type, size, and location of the amplitudes, symmetry, periodicity, and phase information. The potential of the approach to visualize the characteristics of vocal fold vibration in a compact and intuitive way is demonstrated in two healthy and three pathologic subjects. The phonovibrographic wavegram allows a comprehensive analysis of vocal fold kinematics and reveals information that remains hidden with other visualization techniques.
Affiliation(s)
- Jakob Unger
- Department of Computer Science, University of Applied Science Trier, Schneidershof, 54293 Trier, Germany.
|