51
|
Kimura M, Nito T, Imagawa H, Sakakibara KI, Chan RW, Tayama N. Collagen injection for correcting vocal fold asymmetry: high-speed imaging. Ann Otol Rhinol Laryngol 2010; 119:359-68. [PMID: 20583733 DOI: 10.1177/000348941011900601] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 11/16/2022]
Abstract
OBJECTIVES We hypothesized that high-speed digital imaging with videokymographic and laryngotopographic analysis would provide a quantitative method to evaluate the effect of collagen injection for the correction of asymmetric and irregular vocal fold vibration in unilateral vocal fold paralysis. METHODS Videokymographic and laryngotopographic analysis was performed for high-speed digital recordings of vocal fold vibration for visualizing the glottal vibratory patterns, and for quantifying the frequency of vibration of each vocal fold, respectively, including comparisons between the paralyzed and normal vocal folds before and after surgery. This included prospective observations of 11 subjects with unilateral vocal fold paralysis (4 male, 7 female; mean ± SD age, 67.1 ± 12.0 years) using high-speed digital image analysis before and after collagen injection. RESULTS Analysis of the laryngotopographs revealed 2 distinct frequencies of vibration for the paralyzed and contralateral vocal folds for 8 of the 11 subjects before surgery. After collagen injection, the vibration frequencies became identical, despite asymmetric vibration amplitudes. Asymmetric vibration amplitudes were also observed in the other 3 subjects before surgery, but the amplitudes became symmetric after collagen injection, despite a persistent phase shift. CONCLUSIONS Asymmetric vibration in vocal fold paralysis was exemplified by differences in vibration frequency and amplitude between the vocal folds. The present study showed that after collagen injection, these aspects of vibratory patterns improved toward symmetry. This surgical procedure could improve the functional symmetry of the larynx for phonation.
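The per-fold frequency comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' laryngotopographic pipeline: it assumes that a displacement trace has already been extracted for each vocal fold, and it simply picks the strongest non-DC bin of a discrete Fourier transform. The function name `dominant_frequency`, the 4000 frames/s rate, and the synthetic 200 Hz and 170 Hz traces are all hypothetical values chosen for illustration.

```python
import cmath
import math


def dominant_frequency(samples, frame_rate_hz):
    """Return the frequency (Hz) of the strongest non-DC DFT bin."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the DC offset
    best_bin, best_power = 0, -1.0
    # Naive O(n^2) DFT over the positive frequencies up to Nyquist.
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * frame_rate_hz / n


# Synthetic traces (hypothetical): 0.1 s recorded at 4000 frames/s.
FRAME_RATE = 4000
N = 400
healthy = [math.sin(2 * math.pi * 200 * t / FRAME_RATE) for t in range(N)]
paralyzed = [0.5 * math.sin(2 * math.pi * 170 * t / FRAME_RATE) for t in range(N)]

f_healthy = dominant_frequency(healthy, FRAME_RATE)
f_paralyzed = dominant_frequency(paralyzed, FRAME_RATE)
print(f_healthy, f_paralyzed)
```

With 400 samples at 4000 frames/s the frequency resolution is 10 Hz, so two folds vibrating 30 Hz apart fall into clearly separate bins; shorter recordings would blur that distinction.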
Affiliation(s)
- Miwako Kimura
- Dept of Otolaryngology-Head and Neck Surgery, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390-9035, USA
|
52
|
Kimura M, Imagawa H, Nito T, Sakakibara KI, Chan RW, Tayama N. Arytenoid Adduction for Correcting Vocal Fold Asymmetry: High-Speed Imaging. Ann Otol Rhinol Laryngol 2010; 119:439-46. [DOI: 10.1177/000348941011900703] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Indexed: 11/15/2022]
Abstract
Objectives We hypothesized that high-speed digital imaging provides a quantitative method to evaluate the effect of arytenoid adduction for the correction of asymmetric and irregular vocal fold vibration in unilateral vocal fold paralysis. Methods Six subjects with unilateral vocal fold paralysis participated in the study (4 male, 2 female; mean [±SD] age, 52.5 ± 21.3 years). Videokymographic and laryngotopographic methods for image analysis were applied to high-speed recordings of vocal fold vibration for visualizing the glottal vibratory patterns, and for quantifying the frequency of vibration of each vocal fold, respectively. Comparisons of the paralyzed and the normal vocal folds were made before and after arytenoid adduction. Results Analysis of the laryngotopographs revealed 2 distinct frequencies of vibration for the paralyzed and the contralateral vocal folds for all subjects before surgery. After arytenoid adduction, the vibration frequencies became identical or nearly identical in all subjects. Conclusions Asymmetric vibration in vocal fold paralysis was exemplified by differences in vibration frequency between the vocal folds. The present data showed that after arytenoid adduction the vibration frequencies and the vibratory patterns of the contralateral vocal folds approached symmetry. This surgical procedure could improve the functional symmetry of the larynx for phonation.
Affiliation(s)
- Miwako Kimura
- Departments of Otolaryngology–Head and Neck Surgery, Tokyo, Japan
- University of Texas Southwestern Medical Center, Dallas, Texas, Department of Otolaryngology, International Medical Center of Japan, Tokyo, Japan
- Hiroshi Imagawa
- Department of Otorhinolaryngology–Head and Neck Surgery, University of Tokyo, Tokyo, Japan
- Takaharu Nito
- Department of Otorhinolaryngology–Head and Neck Surgery, University of Tokyo, Tokyo, Japan
- Ken-Ichi Sakakibara
- Department of Communication Disorders, Health Sciences University of Hokkaido, Hokkaido, Japan
- Roger W. Chan
- Departments of Otolaryngology–Head and Neck Surgery, Tokyo, Japan
- Biomedical Engineering, Tokyo, Japan
- Niro Tayama
- University of Texas Southwestern Medical Center, Dallas, Texas, Department of Otolaryngology, International Medical Center of Japan, Tokyo, Japan
- Department of Otorhinolaryngology–Head and Neck Surgery, University of Tokyo, Tokyo, Japan
|
53
|
Classification of functional voice disorders based on phonovibrograms. Artif Intell Med 2010; 49:51-9. [DOI: 10.1016/j.artmed.2010.01.001] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.5] [Received: 11/12/2008] [Revised: 08/20/2009] [Accepted: 01/10/2010] [Indexed: 11/17/2022]
|
54
|
Yang A, Lohscheller J, Berry DA, Becker S, Eysholdt U, Voigt D, Döllinger M. Biomechanical modeling of the three-dimensional aspects of human vocal fold dynamics. J Acoust Soc Am 2010; 127:1014-31. [PMID: 20136223 PMCID: PMC3137461 DOI: 10.1121/1.3277165] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Received: 05/14/2009] [Revised: 10/15/2009] [Accepted: 11/24/2009] [Indexed: 05/23/2023]
Abstract
Human voice originates from the three-dimensional (3D) oscillations of the vocal folds. In previous studies, biomechanical properties of vocal fold tissues have been predicted by optimizing the parameters of simple two-mass models to fit their dynamics to high-speed imaging data from the clinic. However, only lateral and longitudinal displacements of the vocal folds were considered. To extend previous studies, a 3D mass-spring cover model is developed, which predicts the 3D vibrations of the entire medial surface of the vocal fold. The model consists of five mass planes arranged in the vertical direction. Each plane contains five longitudinal, coupled mass-spring oscillators. Feasibility of the model is assessed using a large body of dynamical data previously obtained from excised human larynx experiments, in vivo canine larynx experiments, physical models, and numerical models. Typical model output was found to be similar to existing findings. The resulting model enables visualization of the 3D dynamics of the human vocal folds during phonation for both symmetric and asymmetric vibrations.
Affiliation(s)
- Anxiong Yang
- Department of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Medical School, Bohlenplatz 21, 91054 Erlangen, Germany.
|
55
|
Kunduk M, Doellinger M, McWhorter AJ, Lohscheller J. Assessment of the variability of vocal fold dynamics within and between recordings with high-speed imaging and by phonovibrogram. Laryngoscope 2010; 120:981-7. [DOI: 10.1002/lary.20832] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Indexed: 11/06/2022]
|
56
|
Advances in laryngeal imaging. Eur Arch Otorhinolaryngol 2009; 266:1509-20. [PMID: 19618198 DOI: 10.1007/s00405-009-1050-4] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Received: 12/05/2008] [Accepted: 07/07/2009] [Indexed: 10/20/2022]
Abstract
Imaging and image analysis have become important in laryngeal diagnostics. Various techniques, such as videostroboscopy, videokymography, digital kymography, and ultrasonography, are available and are used in research and clinical practice. This paper reviews recent advances in imaging for laryngeal diagnostics.
|
57
|
Elemans CPH, Muller M, Larsen ON, van Leeuwen JL. Amplitude and frequency modulation control of sound production in a mechanical model of the avian syrinx. J Exp Biol 2009; 212:1212-24. [PMID: 19329754 DOI: 10.1242/jeb.026872] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 11/20/2022]
Abstract
Birdsong has developed into one of the important models for motor control of learned behaviour and shows many parallels with speech acquisition in humans. However, there are several experimental limitations to studying the vocal organ - the syrinx - in vivo. The multidisciplinary approach of combining experimental data and mathematical modelling has greatly improved the understanding of neural control and peripheral motor dynamics of sound generation in birds. Here, we present a simple mechanical model of the syrinx that facilitates detailed study of vibrations and sound production. Our model resembles the 'starling resistor', a collapsible tube model, and consists of a tube with a single membrane in its casing, suspended in an external pressure chamber and driven by various pressure patterns. With this design, we can separately control 'bronchial' pressure and tension in the oscillating membrane and generate a wide variety of 'syllables' with simple sweeps of the control parameters. We show that the membrane exhibits high frequency, self-sustained oscillations in the audio range (>600 Hz fundamental frequency) using laser Doppler vibrometry, and systematically explore the conditions for sound production of the model in its control space. The fundamental frequency of the sound increases with tension in three membranes with different stiffness and mass. The lower-bound fundamental frequency increases with membrane mass. The membrane vibrations are strongly coupled to the resonance properties of the distal tube, most likely because of its reflective properties to sound waves. Our model is a gross simplification of the complex morphology found in birds, and more closely resembles mathematical models of the syrinx. Our results confirm several assumptions underlying existing mathematical models in a complex geometry.
Affiliation(s)
- Coen P H Elemans
- Experimental Zoology Group, Wageningen University, Marijkeweg 40, NL-6709 PG Wageningen, The Netherlands.
|
58
|
Variability of Normal Vocal Fold Dynamics for Different Vocal Loading in One Healthy Subject Investigated by Phonovibrograms. J Voice 2009; 23:175-81. [DOI: 10.1016/j.jvoice.2007.09.008] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Received: 06/26/2007] [Accepted: 09/25/2007] [Indexed: 10/22/2022]
|
59
|
Bonilha HS, Deliyski DD, Gerlach TT. Phase asymmetries in normophonic speakers: visual judgments and objective findings. Am J Speech Lang Pathol 2008; 17:367-76. [PMID: 18840697 PMCID: PMC2632767 DOI: 10.1044/1058-0360(2008/07-0059)] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.3] [Indexed: 05/14/2023]
Abstract
PURPOSE To ascertain the amount of phase asymmetry of the vocal fold vibration in normophonic speakers via visualization techniques and compare findings for habitual and pressed phonations. METHOD Fifty-two normophonic speakers underwent stroboscopy and high-speed videoendoscopy (HSV). The HSV images were further processed into 4 visual displays: HSV playbacks, digital kymography (DKG) playbacks, mucosal wave kymography playbacks, and static kymographic images of the medial line from the DKG playback. Two types of phase asymmetries, left-right and anterior-posterior, were rated on a scale from 1 to 5. Objective measures of left-right phase asymmetry were obtained. RESULTS The majority of normophonic speakers (81%) were noted to display anterior-posterior asymmetry; however, 66% of those were characterized as mild. Seventy-nine percent of participants were noted to display left-right asymmetry; however, 72% of those were mild. A moderate relationship between the objective measures and subjective ratings was found. CONCLUSIONS Most normophonic speakers exhibit mild left-right and anterior-posterior asymmetries for both habitual and pressed phonations. Asymmetries were noted more often during habitual than pressed phonations, and when visualized by HSV and kymography than stroboscopy. Differences between objective measures and visual judgments support the need to quantify vocal fold vibratory features.
Affiliation(s)
- Heather Shaw Bonilha
- Communication Sciences and Disorders, University of South Carolina, 1621 Greene Street, 6th Floor, Columbia, SC 29208, USA.
|
60
|
Bonilha HS, Deliyski DD. Period and Glottal Width Irregularities in Vocally Normal Speakers. J Voice 2008; 22:699-708. [DOI: 10.1016/j.jvoice.2007.03.002] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.9] [Received: 12/06/2006] [Accepted: 03/01/2007] [Indexed: 10/22/2022]
|
61
|
Mortensen M, Woo P. High-speed imaging used to detect vocal fold paresis: a case report. Ann Otol Rhinol Laryngol 2008; 117:684-7. [PMID: 18834072 DOI: 10.1177/000348940811700910] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 11/16/2022]
Abstract
High-speed imaging has been used to study vocal fold vibration and has been shown to provide additional information in aid of our understanding of pathologic vocal fold vibration. This is the first case report of vocal fold paresis diagnosed by high-speed imaging. An 18-year-old girl presented with intermittent voice loss that had been present for 4 years. The patient had been seen by other otolaryngologists and had been given proton pump inhibitors without any improvement in her voice. Her voice was diplophonic. The patient was examined by rigid stroboscopy and was found to have a predominantly open phase pattern but a normal vibratory pattern. High-speed photography showed a distinct vibratory frequency for each vocal fold, suggestive of a paresis pattern. Laryngeal electromyography confirmed the diagnosis of vocal fold paresis. A computed tomographic scan of the larynx and chest showed a thymoma. After thymectomy, the patient recovered full voice function. High-speed imaging is useful for the clinical evaluation of pathologic vocal fold vibration and can detect subtle features of paralysis that may not be detected on fiberoptic endoscopy and rigid stroboscopy. The additional information from high-speed imaging helped to make the diagnosis of vocal fold paresis in this patient.
Affiliation(s)
- Melissa Mortensen
- Department of Otolaryngology-Head and Neck Surgery, Mount Sinai Medical Center, New York, New York 10029, USA
|
62
|
Schwarz R, Döllinger M, Wurzbacher T, Eysholdt U, Lohscheller J. Spatio-temporal quantification of vocal fold vibrations using high-speed videoendoscopy and a biomechanical model. J Acoust Soc Am 2008; 123:2717-32. [PMID: 18529190 DOI: 10.1121/1.2902167] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 05/09/2023]
Abstract
Pathologic changes within the organic constitution of vocal folds or a functional impairment of the larynx may result in disturbed or even irregular vocal fold vibrations. The consequences are perturbations of the acoustic speech signal which are perceived as a hoarse voice. By means of appropriate image processing techniques, the vocal fold dynamics are extracted from digital high-speed videos. This study addresses the approach to obtain a parametric description of the spatio-temporal characteristics of the vocal fold oscillations for the aim of classification. For this purpose a biomechanical vocal fold model is introduced. An automatic optimization procedure is developed for fitting the model dynamics to the observed vocal fold oscillations. Thus, the resulting parameter values represent a specific vibration pattern and serve as an objective quantification measure. Performance and reliability of the optimization procedure are validated with synthetically generated data sets. The high-speed videos of two normal voice subjects and six patients suffering from different voice disorders are processed. The resulting model parameters represent a rough approximation of physiological parameters along the entire vocal folds.
Affiliation(s)
- Raphael Schwarz
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, South Carolina 29208, USA.
|
63
|
Wurzbacher T, Döllinger M, Schwarz R, Hoppe U, Eysholdt U, Lohscheller J. Spatiotemporal classification of vocal fold dynamics by a multimass model comprising time-dependent parameters. J Acoust Soc Am 2008; 123:2324-34. [PMID: 18397036 DOI: 10.1121/1.2835435] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 05/09/2023]
Abstract
A model-based approach is proposed to objectively measure and classify vocal fold vibrations by left-right asymmetries along the anterior-posterior direction, especially in the case of nonstationary phonation. For this purpose, vocal fold dynamics are recorded in real time with a digital high-speed camera during phonation of sustained vowels as well as pitch raises. The dynamics of a multimass model with time-dependent parameters are matched to vocal fold vibrations extracted at dorsal, medial, and ventral positions by an automatic optimization procedure. The block-based optimization accounts for nonstationary vibrations and compares the vocal fold and model dynamics by wavelet coefficients. The optimization is verified with synthetically generated data sets and is applied to 40 clinical high-speed recordings comprising normal and pathological voice subjects. The resulting model parameters allow an intuitive visual assessment of vocal fold instabilities within an asymmetry diagram and are applicable to an objective quantification of asymmetries.
Affiliation(s)
- Tobias Wurzbacher
- Department of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Medical School, Erlangen, Germany.
|
64
|
|
65
|
Lohscheller J, Eysholdt U, Toy H, Dollinger M. Phonovibrography: mapping high-speed movies of vocal fold vibrations into 2-D diagrams for visualizing and analyzing the underlying laryngeal dynamics. IEEE Trans Med Imaging 2008; 27:300-9. [PMID: 18334426 DOI: 10.1109/tmi.2007.903690] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.3] [Indexed: 05/04/2023]
Abstract
Endoscopic high-speed laryngoscopy in combination with image analysis strategies is the most promising approach to investigate the interrelation between vocal fold vibrations and voice disorders. So far, due to the lack of an objective and standardized analysis procedure, a unique characterization of vocal fold vibrations has not been achieved. We present a visualization and analysis strategy which transforms the segmented edges of vibrating vocal folds into a single 2-D image, denoted Phonovibrogram (PVG). Within a PVG the individual type of vocal fold vibration becomes uniquely characterized by specific geometric patterns. The PVG geometries give intuitive access to the type and degree of the laryngeal asymmetry and can be quantified using an image segmentation approach. The PVG analysis was applied to 14 representative recordings derived from a high-speed database comprising normal and pathological voices. We demonstrate that PVGs are capable of differentiating and quantifying different types of normal and pathological vocal fold vibrations. The objective and precise quantification of the PVG geometry may have the potential to realize a novel classification of vocal fold vibrations.
Affiliation(s)
- Jörg Lohscheller
- Department of Phoniatrics and Pediatric Audiology, University Hospital Erlangen Medical School, 91054 Erlangen, Germany.
|
66
|
Calibration of laryngeal endoscopic high-speed image sequences by an automated detection of parallel laser line projections. Med Image Anal 2008; 12:300-17. [PMID: 18373942 DOI: 10.1016/j.media.2007.12.007] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 02/27/2007] [Revised: 10/16/2007] [Accepted: 12/15/2007] [Indexed: 11/22/2022]
Abstract
High-speed laryngeal endoscopic systems record vocal fold vibrations during phonation in real time. For a quantitative analysis of vocal fold dynamics, a metrical scale is required to obtain absolute laryngeal dimensions from the recorded image sequence. No automated and stable calibration procedure has so far been available for clinical use. A calibration method is presented that consists of a laser projection device and the corresponding image processing for the automated detection of the laser calibration marks. The laser projection device is clipped to the endoscope and projects two parallel laser lines with a known distance to each other as calibration information onto the vocal folds. Image processing methods automatically identify the pixels belonging to the projected laser lines in the image data. The line detection is based on a Radon transform approach and is a two-stage process, which successively uses temporal and spatial characteristics of the projected laser lines in the high-speed image sequence. The robustness and the applicability are demonstrated with clinical endoscopic image sequences. The combination of the laser projection device and the image processing enables the calibration of laryngeal endoscopic images within the vocal fold plane and thus provides quantitative metrical data of vocal fold dynamics.
|
67
|
Braunschweig T, Flaschka J, Schelhorn-Neise P, Döllinger M. High-speed video analysis of the phonation onset, with an application to the diagnosis of functional dysphonias. Med Eng Phys 2008; 30:59-66. [PMID: 17317268 DOI: 10.1016/j.medengphy.2006.12.007] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.6] [Received: 07/13/2006] [Revised: 12/18/2006] [Accepted: 12/29/2006] [Indexed: 12/01/2022]
Abstract
An objective method for the diagnosis of functional dysphonias is presented. The mathematically motivated approach was evaluated on 71 female subjects with normal voice or functional dysphonia. Using digital high-speed recordings, the phonation onset process was recorded in real-time for 8-10 different sound pressure levels for each subject. From these recordings two parameters were mathematically estimated, reflecting the phonation onset dynamics. The growth of the vocal fold amplitudes during the phonation onset process was described by a parameter a for which its lower threshold value a(th) was extrapolated. This threshold reflects the myoelastic tonus within the vocal folds. The second parameter was the maximum sound pressure level L(max). It allows conclusions on voice efficiency with respect to the necessary subglottal pressure and the myoelastic forces. Due to the significant differences of these parameters between the pathological groups and normal voices, the presented method is a stable and objective tool for medical diagnosis.
Affiliation(s)
- T Braunschweig
- University Hospital Jena, Department of Phoniatrics and Pediatric Audiology, Stoystr. 3, 07743 Jena, Germany.
|
68
|
Lohscheller J, Toy H, Rosanowski F, Eysholdt U, Döllinger M. Clinically evaluated procedure for the reconstruction of vocal fold vibrations from endoscopic digital high-speed videos. Med Image Anal 2007; 11:400-13. [PMID: 17544839 DOI: 10.1016/j.media.2007.04.005] [Citation(s) in RCA: 120] [Impact Index Per Article: 7.1] [Received: 04/07/2006] [Revised: 02/27/2007] [Accepted: 04/24/2007] [Indexed: 11/29/2022]
Abstract
Investigation of voice disorders requires the examination of vocal fold vibrations. The state of the art is the recording of endoscopic high-speed movies, which capture vocal fold vibrations in real time. This enables investigation of the interrelation between disturbances of vocal fold vibrations and voice disorders. However, the lack of clinical studies and of a standardized procedure to reconstruct vocal fold vibrations from high-speed videos constrains the clinical acceptance of the high-speed technique. An image processing approach is presented that extracts the vibrating vocal fold edges from digital high-speed movies. The initial segmentation is principally based on a seeded region-growing algorithm. Even in movies with low image quality, the algorithm successfully segments the glottal area by means of an introduced two-dimensional threshold matrix. Following segmentation, the vocal fold edges are reconstructed from the computed time-varying glottal area. The performance of the procedure was objectively evaluated within a study comprising 372 high-speed recordings. The accuracy of vocal fold reconstruction exceeds manual segmentation results obtained by clinical experts. The algorithm reaches an information flow-rate of up to 98 images per second. The robustness and high accuracy of the procedure make it suitable for application in clinical routine. It enables an objective and highly accurate description of vocal fold vibrations, which is essential for extensive clinical studies focused on the classification of voice disorders.
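Seeded region growing, the technique this abstract names for the initial segmentation, can be sketched in a few lines. This is a generic textbook version under stated assumptions, not the published procedure: a single global intensity threshold stands in for the paper's two-dimensional threshold matrix (whose construction the abstract does not describe), the seed is supplied by hand, and the function name `region_grow` and the 5x5 toy frame are hypothetical.

```python
from collections import deque


def region_grow(image, seed, threshold):
    """Collect 4-connected pixels reachable from `seed` with intensity <= threshold."""
    rows, cols = len(image), len(image[0])
    seed_row, seed_col = seed
    if image[seed_row][seed_col] > threshold:
        return set()  # the seed itself is not dark enough to start a region
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region and image[nr][nc] <= threshold:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


# Toy grayscale frame (hypothetical): the dark glottal gap sits in the middle,
# and one isolated dark pixel at (3, 4) must not be included.
frame = [
    [200, 200, 200, 200, 200],
    [200,  30,  20, 200, 200],
    [200,  25,  15,  35, 200],
    [200, 200,  40, 200,  10],
    [200, 200, 200, 200, 200],
]
glottis = region_grow(frame, seed=(2, 2), threshold=60)
glottal_area = len(glottis)  # area in pixels for this frame
print(glottal_area)
```

Repeating this for every frame would give a time-varying glottal area in pixels; how the published method then reconstructs the vocal fold edges from that area is not covered by this sketch.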
Affiliation(s)
- Jörg Lohscheller
- Department of Phoniatrics and Pediatric Audiology, Bohlenplatz 21, 91054 Erlangen, Germany.
|
69
|
Doellinger M, Berry DA. Visualization and Quantification of the Medial Surface Dynamics of an Excised Human Vocal Fold During Phonation. J Voice 2006; 20:401-13. [PMID: 16300925 DOI: 10.1016/j.jvoice.2005.08.003] [Citation(s) in RCA: 75] [Impact Index Per Article: 4.2] [Received: 05/25/2005] [Accepted: 05/25/2005] [Indexed: 11/22/2022]
Abstract
SUMMARY The purpose of this study was to investigate physical mechanisms of vocal fold vibration during normal phonation through quantification of the medial surface dynamics of the fold. An excised hemilarynx setup was used. The dynamics of 30 microsutures mounted on the medial surface of a human vocal fold were analyzed across 18 phonatory conditions. The vibrations were recorded with a digital high-speed camera at a frequency of 4,000 Hz. The positions of the sutures were extracted and converted to three-dimensional coordinates using a linear approximation technique. The data were reduced to principal eigenfunctions, which captured over 90% of the variance of the data, and suggested mechanisms of sustained vocal fold oscillation. The vibrations were imaged as the following phonatory conditions were manipulated: glottal airflow, an adductory force applied to the muscular process, and an elongation force applied to the thyroid cartilage. Over the range of variables studied, only the variation in glottal airflow yielded significant changes in subglottal pressure and fundamental frequency. All recordings showed high correlation for the distribution of the dynamics across the medial surface of the vocal fold. The distribution of the different displacement directions and velocities showed the highest variations around the superior region of the medial surface. Although the computed vibration patterns of the two largest empirical eigenfunctions were consistent with previous experimental observations, the relative prominence of the two eigenfunctions changed as a function of glottal airflow, impacting theories of vocal efficiency and vocal economy.
Affiliation(s)
- Michael Doellinger
- Laryngeal Dynamics Laboratory, Division of Head & Neck Surgery, UCLA School of Medicine, Los Angeles, CA 90095-1794, USA.
|
70
|
Wurzbacher T, Schwarz R, Döllinger M, Hoppe U, Eysholdt U, Lohscheller J. Model-based classification of nonstationary vocal fold vibrations. J Acoust Soc Am 2006; 120:1012-27. [PMID: 16938988 DOI: 10.1121/1.2211550] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.8] [Indexed: 05/09/2023]
Abstract
Classification of vocal fold vibrations is an essential task of the objective assessment of voice disorders. For historical reasons, the conventional clinical examination of vocal fold vibrations is done during stationary, sustained phonation. However, the conclusions drawn from a stationary phonation are restricted to the observed steady-state vocal fold vibrations and cannot be generalized to voice mechanisms during running speech. This study addresses the approach of classifying real-time recordings of vocal fold oscillations during a nonstationary phonation paradigm in the form of a pitch raise. The classification is based on asymmetry measures derived from a time-dependent biomechanical two-mass model of the vocal folds which is adapted to observed vocal fold motion curves with an optimization procedure. After verification of the algorithm performance the method was applied to clinical problems. Recordings of ten subjects with normal voice and ten dysphonic subjects have been evaluated during stationary as well as nonstationary phonation. In the case of nonstationary phonation the model-based classification into "normal" and "dysphonic" succeeds in all cases, while it fails in the case of sustained phonation. The nonstationary vocal fold vibrations contain additional information about vocal fold irregularities, which are needed for an objective interpretation and classification of voice disorders.
Affiliation(s)
- Tobias Wurzbacher
- Department of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Medical School, Erlangen, Germany
|
71
|
Dresel C, Mergell P, Hoppe U, Eysholdt U. An asymmetric smooth contour two-mass model for recurrent laryngeal nerve paralysis. Logoped Phoniatr Vocol 2006; 31:61-75. [PMID: 16754278 DOI: 10.1080/14015430500363232] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 10/25/2022]
Abstract
Irregular vocal fold vibrations are assumed to be a major cause of hoarseness. A common clinical condition presenting with hoarseness is a unilateral recurrent laryngeal nerve paralysis (RLNP). In order to explain high-speed video recordings of clinical RLNP, RLNP-type vocal fold vibrations are simulated by extending the well known two-mass model (2MM) to an asymmetric smooth-contour two-mass model (SC2MM). Polynomial interpolations form a smooth surface over the lumped elements of the 2MM. Laryngeal asymmetry is accounted for by introduction of an asymmetry coefficient and an anterior commissure angle which models a variable glottal closure insufficiency. Compared to the 2MM, the SC2MM yields a smaller glottal volume flow and is more stable in critical parameter constellations of RLNP-like conditions. It is able to model the vocal fold dynamics during a glottal closure insufficiency.
Affiliation(s)
- Christian Dresel
- Department of Neurology, Klinikum rechts der Isar, Technische Universitaet Muenchen, Germany.
|
72
|
Abstract
The main symptom of unilateral vocal fold palsy is hoarseness, which can cause considerable disturbance to the patient depending on its extent and the patient's individual situation. Therapy aims at the restitution of a tuneful and resilient voice, which can be achieved by surgical or conservative means, improving the glottal closure and synchronizing the vocal fold vibrations during phonation. Vocal therapy is a common conservative method that may be supported by psychotherapeutic or physical procedures. In surgical therapy, there is a distinction between techniques of endoscopic augmentation by injecting different materials into the vocal folds and transcutaneous laryngeal framework surgery, i.e., transferring the paralyzed vocal fold to the glottal midline. Particularly apt for injection are biocompatible materials whose amount and position can easily be controlled. However, the inevitable resorption of many materials causes deterioration in voice quality. Furthermore, the change of vocal fold morphology obstructs regular phonatory vibration. On the other hand, medialization thyroplasty leads to permanent voice amelioration without a substantial complication rate when performed by experienced surgeons.
Affiliation(s)
- M Schuster
- Abteilung für Phoniatrie und Pädaudiologie, Universitätsklinikum Erlangen.
|
73
|
Deliyski DD. Endoscope Motion Compensation for Laryngeal High-Speed Videoendoscopy. J Voice 2005; 19:485-96. [PMID: 16102674 DOI: 10.1016/j.jvoice.2004.07.006] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.1] [Accepted: 07/23/2004] [Indexed: 11/18/2022]
Abstract
The purpose of this study was to develop and evaluate a method for automatic endoscopic motion compensation (MC) specialized for laryngeal high-speed videoendoscopy (HSV) of sustained phonation. Specifically, the left-right and posterior-anterior endoscopic shifts were addressed. The method is based on the hypothesis that the difference between endoscopic and vocal fold dynamics is sufficient for accurate estimation of the endoscopic motion relative to the vocal folds. The research questions of particular interest were as follows: is MC of laryngeal HSV possible at the subpixel level, and what are the validity, reliability, and accuracy of the proposed MC method? First, the MC accuracy was assessed on simulated HSV data with exactly known motion trajectories. Then, the accuracy and reliability of MC were evaluated on real HSV data. Finally, the validity of MC was established by correlating objective to perceptual ratings. The method demonstrated subpixel accuracy and reliable performance when tested on both simulated and real data. Perceptual ratings of the endoscopic motion demonstrated high intrarater and interrater reliabilities. Perceptual and quantitative assessments of endoscopic motion were in agreement. The reported result is encouraging for resolving a persistent problem toward building better methods for visual and quantitative voice assessment.
Affiliation(s)
- Dimitar D Deliyski
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, 29208, USA.
|
74
|
Schuster M, Lohscheller J, Kummer P, Eysholdt U, Hoppe U. Laser projection in high-speed glottography for high-precision measurements of laryngeal dimensions and dynamics. Eur Arch Otorhinolaryngol 2004; 262:477-81. [PMID: 15942801 DOI: 10.1007/s00405-004-0862-5] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 02/24/2004] [Accepted: 08/17/2004] [Indexed: 10/26/2022]
Abstract
The detection of metric dimensions of laryngeal structures yields valuable information for both clinical and research purposes. The use of a laser projection system combined with a high-speed camera system enables the derivation of absolute spatial dimensions of the larynx. Vocal fold length, vibratory amplitudes and velocity can be derived. This was shown on 13 female and 9 male larynges during phonation of a vowel at different pitches. The vocal fold length, the amplitude of oscillation and the velocity of vibration were analyzed at pitches between 119 and 236 Hz in the male group and between 181 and 555 Hz in the female group. The vocal folds' length ranged from 8.4 to 14.3 mm in the male group and from 7.7 to 15.6 mm in the female group. Corresponding amplitudes varied from 0.33 to 1.24 mm (male) and from 0.38 to 0.82 mm (female). The maximal velocity of vibration was between 0.48 and 0.85 m/s in males and between 0.47 and 1.3 m/s in females, with no significant correlation among the parameters. The described technique enables the detection of absolute spatial laryngeal dimensions of female and male subjects at different pitches. Dynamic processes such as velocity of vibration can be quantified. The detection of metric data serves to optimize biomechanical model computations and provides valuable information in diagnostics and interpretation of organic and non-organic voice disorders.
Affiliation(s)
- Maria Schuster
- Department of Phoniatrics and Pedaudiology, University Hospital, Erlangen, Germany.
|