51
|
Bonilha HS, Deliyski DD. Period and Glottal Width Irregularities in Vocally Normal Speakers. J Voice 2008; 22:699-708. [DOI: 10.1016/j.jvoice.2007.03.002] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.9] [Received: 12/06/2006] [Accepted: 03/01/2007] [Indexed: 10/22/2022]
|
52
|
Schwarz R, Döllinger M, Wurzbacher T, Eysholdt U, Lohscheller J. Spatio-temporal quantification of vocal fold vibrations using high-speed videoendoscopy and a biomechanical model. J Acoust Soc Am 2008; 123:2717-32. [PMID: 18529190 DOI: 10.1121/1.2902167] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Indexed: 05/09/2023]
Abstract
Pathologic changes within the organic constitution of vocal folds or a functional impairment of the larynx may result in disturbed or even irregular vocal fold vibrations. The consequences are perturbations of the acoustic speech signal which are perceived as a hoarse voice. By means of appropriate image processing techniques, the vocal fold dynamics are extracted from digital high-speed videos. This study addresses the approach to obtain a parametric description of the spatio-temporal characteristics of the vocal fold oscillations for the aim of classification. For this purpose a biomechanical vocal fold model is introduced. An automatic optimization procedure is developed for fitting the model dynamics to the observed vocal fold oscillations. Thus, the resulting parameter values represent a specific vibration pattern and serve as an objective quantification measure. Performance and reliability of the optimization procedure are validated with synthetically generated data sets. The high-speed videos of two normal voice subjects and six patients suffering from different voice disorders are processed. The resulting model parameters represent a rough approximation of physiological parameters along the entire vocal folds.
Affiliation(s)
- Raphael Schwarz
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, South Carolina 29208, USA.
|
53
|
Wurzbacher T, Döllinger M, Schwarz R, Hoppe U, Eysholdt U, Lohscheller J. Spatiotemporal classification of vocal fold dynamics by a multimass model comprising time-dependent parameters. J Acoust Soc Am 2008; 123:2324-34. [PMID: 18397036 DOI: 10.1121/1.2835435] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.7] [Indexed: 05/09/2023]
Abstract
A model-based approach is proposed to objectively measure and classify vocal fold vibrations by left-right asymmetries along the anterior-posterior direction, especially in the case of nonstationary phonation. For this purpose, vocal fold dynamics are recorded in real time with a digital high-speed camera during phonation of sustained vowels as well as pitch raises. The dynamics of a multimass model with time-dependent parameters are matched to vocal fold vibrations extracted at dorsal, medial, and ventral positions by an automatic optimization procedure. The block-based optimization accounts for nonstationary vibrations and compares the vocal fold and model dynamics by wavelet coefficients. The optimization is verified with synthetically generated data sets and is applied to 40 clinical high-speed recordings comprising normal and pathological voice subjects. The resulting model parameters allow an intuitive visual assessment of vocal fold instabilities within an asymmetry diagram and are applicable to an objective quantification of asymmetries.
Affiliation(s)
- Tobias Wurzbacher
- Department of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Medical School, Erlangen, Germany.
|
54
|
Lohscheller J, Eysholdt U, Toy H, Döllinger M. Phonovibrography: mapping high-speed movies of vocal fold vibrations into 2-D diagrams for visualizing and analyzing the underlying laryngeal dynamics. IEEE Trans Med Imaging 2008; 27:300-9. [PMID: 18334426 DOI: 10.1109/tmi.2007.903690] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.3] [Indexed: 05/04/2023]
Abstract
Endoscopic high-speed laryngoscopy in combination with image analysis strategies is the most promising approach to investigate the interrelation between vocal fold vibrations and voice disorders. So far, due to the lack of an objective and standardized analysis procedure a unique characterization of vocal fold vibrations has not been achieved yet. We present a visualization and analysis strategy which transforms the segmented edges of vibrating vocal folds into a single 2-D image, denoted Phonovibrogram (PVG). Within a PVG the individual type of vocal fold vibration becomes uniquely characterized by specific geometric patterns. The PVG geometries give an intuitive access on the type and degree of the laryngeal asymmetry and can be quantified using an image segmentation approach. The PVG analysis was applied to 14 representative recordings derived from a high-speed database comprising normal and pathological voices. We demonstrate that PVGs are capable to differentiate and quantify different types of normal and pathological vocal fold vibrations. The objective and precise quantification of the PVG geometry may have the potential to realize a novel classification of vocal fold vibrations.
Affiliation(s)
- Jörg Lohscheller
- Department of Phoniatrics and Pediatric Audiology, University Hospital Erlangen Medical School, 91054 Erlangen, Germany.
|
55
|
Braunschweig T, Flaschka J, Schelhorn-Neise P, Döllinger M. High-speed video analysis of the phonation onset, with an application to the diagnosis of functional dysphonias. Med Eng Phys 2008; 30:59-66. [PMID: 17317268 DOI: 10.1016/j.medengphy.2006.12.007] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.6] [Received: 07/13/2006] [Revised: 12/18/2006] [Accepted: 12/29/2006] [Indexed: 12/01/2022]
Abstract
An objective method for the diagnosis of functional dysphonias is presented. The mathematically motivated approach was evaluated on 71 female subjects with normal voice or functional dysphonia. Using digital high-speed recordings, the phonation onset process was recorded in real-time for 8-10 different sound pressure levels for each subject. From these recordings two parameters were mathematically estimated, reflecting the phonation onset dynamics. The growth of the vocal fold amplitudes during the phonation onset process was described by a parameter a for which its lower threshold value a(th) was extrapolated. This threshold reflects the myoelastic tonus within the vocal folds. The second parameter was the maximum sound pressure level L(max). It allows conclusions on voice efficiency with respect to the necessary subglottal pressure and the myoelastic forces. Due to the significant differences of these parameters between the pathological groups and normal voices, the presented method is a stable and objective tool for medical diagnosis.
Affiliation(s)
- T Braunschweig
- University Hospital Jena, Department of Phoniatrics and Pediatric Audiology, Stoystr. 3, 07743 Jena, Germany.
|
56
|
Tao C, Zhang Y, Jiang JJ. Extracting Physiologically Relevant Parameters of Vocal Folds From High-Speed Video Image Series. IEEE Trans Biomed Eng 2007; 54:794-801. [PMID: 17518275 DOI: 10.1109/tbme.2006.889182] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.2] [Indexed: 11/09/2022]
Abstract
In this paper, a new method is proposed to extract the physiologically relevant parameters of the vocal fold mathematic model including masses, spring constants and damper constants from high-speed video (HSV) image series. This method uses a genetic algorithm to optimize the model parameters until the model and the realistic vocal folds have similar dynamic behavior. Numerical experiments theoretically test the validity of the proposed parameter estimation method. Then the validated method is applied to extract the physiologically relevant parameters from the glottal area series measured by HSV in an excised larynx model. With the estimated parameters, the vocal fold model accurately describes the vibration of the observed vocal folds. Further studies show that the proposed parameter estimation method can successfully detect the increase of longitudinal tension due to the vocal fold elongation from the glottal area signal. These results imply the potential clinical application of this method in inspecting the tissue properties of vocal fold.
Affiliation(s)
- Chao Tao
- Department of Surgery, Division of Otolaryngology Head and Neck Surgery, University of Wisconsin Medical School, Madison, WI 53792-7375, USA.
|
57
|
Wurzbacher T, Schwarz R, Döllinger M, Hoppe U, Eysholdt U, Lohscheller J. Model-based classification of nonstationary vocal fold vibrations. J Acoust Soc Am 2006; 120:1012-27. [PMID: 16938988 DOI: 10.1121/1.2211550] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.4] [Indexed: 05/09/2023]
Abstract
Classification of vocal fold vibrations is an essential task of the objective assessment of voice disorders. For historical reasons, the conventional clinical examination of vocal fold vibrations is done during stationary, sustained phonation. However, the conclusions drawn from a stationary phonation are restricted to the observed steady-state vocal fold vibrations and cannot be generalized to voice mechanisms during running speech. This study addresses the approach of classifying real-time recordings of vocal fold oscillations during a nonstationary phonation paradigm in the form of a pitch raise. The classification is based on asymmetry measures derived from a time-dependent biomechanical two-mass model of the vocal folds which is adapted to observed vocal fold motion curves with an optimization procedure. After verification of the algorithm performance the method was applied to clinical problems. Recordings of ten subjects with normal voice and ten dysphonic subjects have been evaluated during stationary as well as nonstationary phonation. In the case of nonstationary phonation the model-based classification into "normal" and "dysphonic" succeeds in all cases, while it fails in the case of sustained phonation. The nonstationary vocal fold vibrations contain additional information about vocal fold irregularities, which are needed for an objective interpretation and classification of voice disorders.
Affiliation(s)
- Tobias Wurzbacher
- Department of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Medical School, Erlangen, Germany
|
58
|
Schwarz R, Hoppe U, Schuster M, Wurzbacher T, Eysholdt U, Lohscheller J. Classification of unilateral vocal fold paralysis by endoscopic digital high-speed recordings and inversion of a biomechanical model. IEEE Trans Biomed Eng 2006; 53:1099-108. [PMID: 16761837 DOI: 10.1109/tbme.2006.873396] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.3] [Indexed: 11/08/2022]
Abstract
Hoarseness in unilateral vocal fold paralysis is mainly due to irregular vocal fold vibrations caused by asymmetries within the larynx physiology. By means of a digital high-speed camera vocal fold oscillations can be observed in real-time. It is possible to extract the irregular vocal fold oscillations from the high-speed recordings using appropriate image processing techniques. An inversion procedure is developed which adjusts the parameters of a biomechanical model of the vocal folds to reproduce the irregular vocal fold oscillations. Within the inversion procedure a first parameter approximation is achieved through a knowledge-based algorithm. The final parameter optimization is performed using a genetic algorithm. The performance of the inversion procedure is evaluated using 430 synthetically generated data sets. The evaluation results comprise an error estimation of the inversion procedure and show the reliability of the algorithm. The inversion procedure is applied to 15 healthy voice subjects and 15 subjects suffering from unilateral vocal fold paralysis. The optimized parameter sets allow a classification of pathologic and healthy vocal fold oscillations. The classification may serve as a basis for therapy selection and quantification of therapy outcome in case of unilateral vocal fold paralysis.
Affiliation(s)
- Raphael Schwarz
- Department of Phoniatrics and Pediatric Audiology, University of Erlangen-Nürnberg, Germany.
|
59
|
Dresel C, Mergell P, Hoppe U, Eysholdt U. An asymmetric smooth contour two-mass model for recurrent laryngeal nerve paralysis. Logoped Phoniatr Vocol 2006; 31:61-75. [PMID: 16754278 DOI: 10.1080/14015430500363232] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 10/25/2022]
Abstract
Irregular vocal fold vibrations are assumed to be a major cause of hoarseness. A common clinical condition presenting with hoarseness is a unilateral recurrent laryngeal nerve paralysis (RLNP). In order to explain high-speed video recordings of clinical RLNP, RLNP-type vocal fold vibrations are simulated by extending the well known two-mass model (2MM) to an asymmetric smooth-contour two-mass model (SC2MM). Polynomial interpolations form a smooth surface over the lumped elements of the 2MM. Laryngeal asymmetry is accounted for by introduction of an asymmetry coefficient and an anterior commissure angle which models a variable glottal closure insufficiency. Compared to the 2MM, the SC2MM yields a smaller glottal volume flow and is more stable in critical parameter constellations of RLNP-like conditions. It is able to model the vocal fold dynamics during a glottal closure insufficiency.
Affiliation(s)
- Christian Dresel
- Department of Neurology, Klinikum rechts der Isar, Technische Universitaet Muenchen, Germany.
|
60
|
Zhang Y, Tao C, Jiang JJ. Parameter estimation of an asymmetric vocal-fold system from glottal area time series using chaos synchronization. Chaos 2006; 16:023118. [PMID: 16822021 DOI: 10.1063/1.2203092] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 05/10/2023]
Abstract
In this paper, we apply an iterative parameter adaption scheme based on chaos synchronization to estimate system parameters of the asymmetric vocal folds from glottal area time series. The original asymmetric vocal-fold system associated with recurrent laryngeal paralysis shows chaotic vibrations with positive Lyapunov exponents. Aperiodic glottal area time series from the original system will be applied as the feedback variable coupling the simulative and the original vocal-fold systems. The parameter adaption technique based on chaos synchronization is employed to manipulate the simulative system parameters. The chaotic vibrations, system parameters, and the bifurcation diagram of the original vocal-fold system can be exactly reproduced in the simulative system, and the two chaotic systems can be synchronized. Furthermore, the effects of noise, sampling rate, and equation difference due to nonlinear spring terms on vocal-fold parameter estimations are investigated. Despite large noise perturbations, large equation differences, and low sampling rate, the parameter adaption scheme can effectively estimate the original vocal-fold system parameters. This study provides a theoretical base to apply chaos synchronization to estimate the vocal-fold system parameters from the glottal area data and show its potential application in laryngeal physiology.
Affiliation(s)
- Yu Zhang
- Department of Surgery, Division of Otolaryngology Head and Neck Surgery, University of Wisconsin Medical School, Madison, Wisconsin 53792-7375, USA
|
61
|
Döllinger M, Berry DA, Berke GS. Medial surface dynamics of an in vivo canine vocal fold during phonation. J Acoust Soc Am 2005; 117:3174-83. [PMID: 15957785 DOI: 10.1121/1.1871772] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.6] [Indexed: 05/03/2023]
Abstract
Quantitative measurement of the medial surface dynamics of the vocal folds is important for understanding how sound is generated within the larynx. Building upon previous excised hemilarynx studies, the present study extended the hemilarynx methodology to the in vivo canine larynx. Through use of an in vivo model, the medial surface dynamics of the vocal fold were examined as a function of active thyroarytenoid muscle contraction. Data were collected using high-speed digital imaging at a sampling frequency of 2000 Hz, and a spatial resolution of 1024 x 1024 pixels. Chest-like and fry-like vibrations were observed, but could not be distinguished based on the input stimulation current to the recurrent laryngeal nerve. The subglottal pressure did distinguish the registers, as did an estimate of the thyroarytenoid muscle activity. Upon quantification of the three-dimensional motion, the method of Empirical Eigenfunctions was used to extract the underlying modes of vibration, and to investigate mechanisms of sustained oscillation. Results were compared with previous findings from excised larynx experiments and theoretical models.
Affiliation(s)
- Michael Döllinger
- Laryngeal Dynamics Laboratory, UCLA Division of Head & Neck Surgery, 1000 Veteran Ave. Suite 31-24, Los Angeles, California, 90095-1794, USA.
|
62
|
Lucero JC, Koenig LL. Simulations of temporal patterns of oral airflow in men and women using a two-mass model of the vocal folds under dynamic control. J Acoust Soc Am 2005; 117:1362-72. [PMID: 15807024 DOI: 10.1121/1.1853235] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 05/09/2023]
Abstract
In this study we use a low-dimensional laryngeal model to reproduce temporal variations in oral airflow produced by speakers in the vicinity of an abduction gesture. It attempts to characterize these temporal patterns in terms of biomechanical parameters such as glottal area, vocal fold stiffness, subglottal pressure, and gender differences in laryngeal dimensions. A two-mass model of the vocal folds coupled to a two-tube approximation of the vocal tract is fitted to oral airflow records measured in men and women during the production of /aha/ utterances, using the subglottal pressure, glottal width, and Q factor as control parameters. The results show that the model is capable of reproducing the airflow records with good approximation. A nonlinear damping characteristics is needed, to reproduce the flow variation at glottal abduction. Devoicing is achieved by the combined action of vocal fold abduction, the decrease of subglottal pressure, and the increase of vocal fold tension. In general, the female larynx has a more restricted region of vocal fold oscillation than the male one. This would explain the more frequent devoicing in glottal abduction-adduction gestures for /h/ in running speech by women, compared to men.
Affiliation(s)
- Jorge C Lucero
- Department of Mathematics, University of Brasilia, Brasilia DF 70910-900, Brazil.
|
63
|
Tokuda I, Herzel H. Detecting synchronizations in an asymmetric vocal fold model from time series data. Chaos 2005; 15:13702. [PMID: 15836270 DOI: 10.1063/1.1848232] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 05/24/2023]
Abstract
A nonlinear modeling approach is presented for the reconstruction of the synchronization structure in an asymmetric two-mass model from time series data. The asymmetric two-mass model describes a variety of normal and pathological human voices associated with synchronous and desynchronous oscillations of the two asymmetric vocal folds. Our technique recovers the synchronization diagram, which yields the regimes of synchronization as well as desynchronization, which are dependent upon the asymmetry parameter and the subglottal pressure. This allows the prediction of the regime of pathological phonation associated with desynchronization of the vocal folds from a few sets of recorded time series. It is shown that the modeling is quite effective when the time series data are chaotic and if they are taken from a regime of desynchronization. We discuss the applicability of the present approach as a diagnostic tool for voice pathologies.
Affiliation(s)
- Isao Tokuda
- Institute for Theoretical Biology, Humboldt University of Berlin, Invalidenstrasse 43, D-10115 Berlin, Germany.
|
64
|
Tao C, Zhang Y, Du G, Jiang JJ. Estimating model parameters by chaos synchronization. Phys Rev E Stat Nonlin Soft Matter Phys 2004; 69:036204. [PMID: 15089389 DOI: 10.1103/physreve.69.036204] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 09/23/2003] [Indexed: 05/24/2023]
Abstract
Using chaos synchronization and a proposed iterative method of parameter adaptation, we precisely estimate the model parameters of chaotic systems and synchronize two chaotic systems with originally mismatching model parameters. This parameter adaptation method can be applied to a spatiotemporal chaotic system with a one-way-coupled map lattice. As a biomedical application, this method is capable of estimating the asymmetric tension parameter of a vocal fold model.
Affiliation(s)
- Chao Tao
- Institute of Acoustics, State Key Laboratory of Modern Acoustics, Nanjing University, Nanjing 210093, People's Republic of China
|
65
|
Eysholdt U, Rosanowski F, Hoppe U. Vocal fold vibration irregularities caused by different types of laryngeal asymmetry. Eur Arch Otorhinolaryngol 2003; 260:412-7. [PMID: 12690514 DOI: 10.1007/s00405-003-0606-y] [Citation(s) in RCA: 86] [Impact Index Per Article: 4.1] [Received: 09/09/2002] [Accepted: 03/12/2003] [Indexed: 11/29/2022]
Abstract
The common symptom of hoarseness is regarded to be caused by (1) turbulences and air loss due to incomplete glottic closure and (2) irregular vibrations of the vocal folds. With real time resolution, the latter can only be observed using high-speed recording techniques (≥2,000 images/s). In this paper an actual recording method is described, called high-speed glottography (HGG), which quantifies vibration irregularities. It combines imaging and image processing techniques with a functional endoscopy of the disordered voice and delivers motion curves separately for each vocal fold. They are fitted with a computer simulation in order to identify the underlying driving parameters of the vibration. A vocal fold is assumed to vibrate as a system of two coupled oscillators ("two-mass model"). From the model fit to bilateral motion curves, the subglottal pressure, muscular tension and oscillating masses of the vocal folds can be computed with reasonable accuracy. Besides normal voices, HGG has been applied to selected clinical cases of voice disorders. Two types of irregularities have been measured: there is a frequency difference either between left and right vocal folds (horizontal asymmetry) or on one side between the ventral and dorsal third (vertical asymmetry). By modeling, both categories of irregular motion curves can be explained in detail. It is presumed that laryngeal asymmetry (either in mass or tension) causes irregular vibrations.
Affiliation(s)
- U Eysholdt
- Abteilung für Phoniatrie und Pädaudiologie, Universitätsklinikum Erlangen, Bohlenplatz 21, 91054 Erlangen, Germany.
|