1
Nobel SMN, Swapno SMMR, Islam MR, Safran M, Alfarhood S, Mridha MF. A machine learning approach for vocal fold segmentation and disorder classification based on ensemble method. Sci Rep 2024; 14:14435. [PMID: 38910146 DOI: 10.1038/s41598-024-64987-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2024] [Accepted: 06/14/2024] [Indexed: 06/25/2024] Open
Abstract
In the healthcare domain, understanding and classifying diseases that affect the vocal folds (VFs) is an essential task, and the accurate identification of VF disease is the key issue. Integrating VF segmentation and disease classification into a single system is challenging but important for precise diagnostics. Our study addresses this challenge by combining VF disease classification and VF segmentation in a single integrated system. We used two ensemble machine learning methods: EfficientNetV2L-LGBM and UNet-BiGRU. The EfficientNetV2L-LGBM model, used for classification, achieved a training accuracy of 98.88%, a validation accuracy of 97.73%, and a test accuracy of 97.88%. These outcomes highlight the system's ability to classify different VF diseases precisely. The UNet-BiGRU model, used for segmentation, attained a training accuracy of 92.55%, a validation accuracy of 89.87%, and a test accuracy of 91.47%. For the segmentation task, we also examined several methods for improving segmentation quality, resulting in a testing accuracy of 91.99% and an Intersection over Union (IoU) of 87.46%. These measures demonstrate the skill of the model in accurately delineating and separating the VFs. The system's classification and segmentation results confirm its capacity to identify and segment VF disorders effectively, representing a significant advance in diagnostic accuracy and healthcare in this specialized field. This study emphasizes the potential of machine learning to transform the medical field's capacity to classify VF disease and segment the VFs, providing clinicians with a vital instrument to mitigate the profound impact of the condition. Implementing this approach is expected to enhance medical procedures and offer optimism to those affected by VF disease worldwide.
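The two segmentation figures quoted above, pixel accuracy and Intersection over Union, measure different things, and a minimal sketch may make the distinction concrete. The code below is only an illustration under assumptions: masks are taken to be equal-shape nested lists of 0/1 values, and the function names are hypothetical rather than taken from the study.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN over two equal-shape binary masks (nested lists)."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label matches the reference label."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    return (tp + tn) / (tp + fp + fn + tn)


def iou(pred, truth):
    """Intersection over Union of the foreground (label 1) regions."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    union = tp + fp + fn
    # Convention assumed here: two empty masks count as perfect agreement.
    return tp / union if union else 1.0


# Toy masks, invented for illustration only.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
print(pixel_accuracy(pred, truth), iou(pred, truth))
```

Because IoU ignores true-negative background pixels, it is usually lower than pixel accuracy when the segmented structure occupies a small part of the image, which is consistent with the gap between the two reported values.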
Affiliation(s)
- S M Nuruzzaman Nobel
  - Department of Computer Science and Engineering, Bangladesh University of Business and Technology, Dhaka, 1216, Bangladesh
- S M Masfequier Rahman Swapno
  - Department of Computer Science and Engineering, Bangladesh University of Business and Technology, Dhaka, 1216, Bangladesh
- Md Rajibul Islam
  - Department of Electrical and Electronic Engineering, The Hong Kong Polytechnic University, Hong Kong, China
- Mejdl Safran
  - Department of Computer Science, College of Computer and Information Sciences, King Saud University, P. O. Box 51178, 11543, Riyadh, Saudi Arabia
- Sultan Alfarhood
  - Department of Computer Science, College of Computer and Information Sciences, King Saud University, P. O. Box 51178, 11543, Riyadh, Saudi Arabia
- M F Mridha
  - Department of Computer Science, American International University-Bangladesh, Dhaka, 1229, Bangladesh
2
Schlegel P, Döllinger M, Reddy NK, Zhang Z, Chhetri DK. Validation and enhancement of a vocal fold medial surface 3D reconstruction approach for in-vivo application. Sci Rep 2023; 13:10705. [PMID: 37400470 DOI: 10.1038/s41598-023-36022-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2022] [Accepted: 05/27/2023] [Indexed: 07/05/2023] Open
Abstract
In laryngeal research, the vertical component of vocal fold oscillation is often disregarded. However, vocal fold oscillation is by nature a three-dimensional process. In the past, we developed an in-vivo experimental protocol to reconstruct the full, three-dimensional vocal fold vibration. The goal of this study is to validate this 3D reconstruction method. We present an in-vivo canine hemilarynx setup using high-speed video recording and a right-angle prism for 3D reconstruction of vocal fold medial surface vibrations. The 3D surface is reconstructed from the split image provided by the prism. For validation, reconstruction error was calculated for objects located up to 15 mm away from the prism. The influence of camera angle, changes in calibrated volume, and calibration errors was determined. The overall average 3D reconstruction error is low and does not exceed 0.12 mm at a distance of 5 mm from the prism. A moderate (5°) and a large (10°) deviation in camera angle led to a slight increase in error to 0.16 mm and 0.17 mm, respectively. The procedure is robust to changes in calibration volume and to small calibration errors. This makes the 3D reconstruction approach a useful tool for reconstructing accessible, moving tissue surfaces.
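The abstract reports an average 3D reconstruction error in millimetres but does not spell out the formula. One common way to compute such a figure, assumed here purely for illustration, is the mean Euclidean distance between reconstructed points and their known reference positions. The function name and the sample coordinates below are hypothetical.

```python
import math


def mean_reconstruction_error(reconstructed, reference):
    """Mean Euclidean distance (same unit as the inputs) between paired 3D points."""
    if len(reconstructed) != len(reference) or not reference:
        raise ValueError("need two non-empty point lists of equal length")
    distances = [math.dist(p, q) for p, q in zip(reconstructed, reference)]
    return sum(distances) / len(distances)


# Invented calibration-target coordinates in mm, not data from the study.
reference_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reconstructed_points = [(0.0, 0.0, 0.1), (1.0, 0.2, 0.0)]
print(mean_reconstruction_error(reconstructed_points, reference_points))
```

Under this definition, the camera-angle results in the abstract would correspond to rerunning the same comparison with the reconstruction obtained from a tilted camera.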
Affiliation(s)
- Patrick Schlegel
  - Department of Head and Neck Surgery, University of California, Los Angeles, UCLA Rehabilitation Services, 1000 Veteran Ave, Los Angeles, CA, 90095, USA
- Michael Döllinger
  - Department of Head and Neck Surgery, Division of Phoniatrics and Pediatric Audiology, Friedrich Alexander University Erlangen-Nürnberg, Erlangen, Germany
- Neha K Reddy
  - Department of Head and Neck Surgery, University of California, Los Angeles, UCLA Rehabilitation Services, 1000 Veteran Ave, Los Angeles, CA, 90095, USA
- Zhaoyan Zhang
  - Department of Head and Neck Surgery, University of California, Los Angeles, UCLA Rehabilitation Services, 1000 Veteran Ave, Los Angeles, CA, 90095, USA
- Dinesh K Chhetri
  - Department of Head and Neck Surgery, University of California, Los Angeles, UCLA Rehabilitation Services, 1000 Veteran Ave, Los Angeles, CA, 90095, USA
3
Bottasso-Arias N, Burra K, Sinner D, Riede T. Disruption of BMP4 signaling is associated with laryngeal birth defects in a mouse model. Dev Biol 2023:S0012-1606(23)00068-4. [PMID: 37230380 DOI: 10.1016/j.ydbio.2023.04.007] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/07/2023] [Revised: 04/18/2023] [Accepted: 04/24/2023] [Indexed: 05/27/2023]
Abstract
Laryngeal birth defects are considered rare, but they can be life-threatening. The BMP4 gene plays an important role in organ development and tissue remodeling throughout life. Here we examined its role in laryngeal development, complementing similar efforts for the lung, pharynx, and cranial base. Our goal was to determine how different imaging techniques contribute to a better understanding of the embryonic anatomy of the normal and diseased larynx in small specimens. Contrast-enhanced micro-CT images of embryonic larynx tissue from a mouse model with Bmp4 deletion, informed by histology and whole-mount immunofluorescence, were used to reconstruct the laryngeal cartilaginous framework in three dimensions. Laryngeal defects included laryngeal cleft, laryngeal asymmetry, ankylosis, and atresia. The results implicate BMP4 in laryngeal development and show that 3D reconstruction of laryngeal elements provides a powerful approach to visualizing laryngeal defects, thereby overcoming the shortcomings of 2D histological sectioning and whole-mount immunofluorescence.
Affiliation(s)
- N Bottasso-Arias
  - Neonatology and Pulmonary Biology, Perinatal Institute Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- K Burra
  - Neonatology and Pulmonary Biology, Perinatal Institute Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- D Sinner
  - Neonatology and Pulmonary Biology, Perinatal Institute Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; College of Medicine, University of Cincinnati, Cincinnati, OH, USA
- T Riede
  - Department of Physiology, Midwestern University, Glendale, AZ, USA
4
Pedersen M, Larsen CF, Madsen B, Eeg M. Localization and quantification of glottal gaps on deep learning segmentation of vocal folds. Sci Rep 2023; 13:878. [PMID: 36650265 PMCID: PMC9845318 DOI: 10.1038/s41598-023-27980-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/01/2022] [Accepted: 01/11/2023] [Indexed: 01/19/2023] Open
Abstract
Tracking of the vocal folds, both manual and automatic, has mostly focused on the entire glottis. From a treatment point of view, however, the individual regions of the glottis are of specific interest. The aim of the study was to test whether an existing convolutional neural network (CNN) could be supplemented with post-network calculations for the localization and quantification of posterior glottal gaps during phonation, usable for vocal fold function analysis of, e.g., laryngopharyngeal reflux findings. Thirty subjects/videos with insufficient closure in the rear glottal area and 20 normal subjects/videos were selected from our database, recorded with a commercial high-speed video setup (HSV with 4000 frames per second), and segmented with an open-source CNN for validating voice function. We made post-network calculations to localize and quantify the 10% and 50% distance lines from the rear part of the glottis. The algorithm showed a significant difference between the two groups at the 10% distance line (p < 0.0001) and no difference at 50%. These novel results show that it is possible to use post-network calculations on CNNs for the localization and quantification of posterior glottal gaps.
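The post-network calculation described above can be pictured as measuring the segmented glottal opening along lines placed at fixed fractions of the glottal length, counted from the posterior end. The sketch below is a rough illustration, not the authors' algorithm. It assumes a binary segmentation mask given as a list of pixel rows, an anterior-posterior axis aligned with the rows, and posterior and anterior end rows that are already known (for example from the maximally open frame). All names are hypothetical.

```python
def gap_width_at(mask, posterior_row, anterior_row, fraction):
    """Number of open-glottis pixels on the line at `fraction` of the glottal length.

    `fraction` is measured from the posterior end (0.1 for the 10% line,
    0.5 for the 50% line). `mask` is a list of rows of 0/1 values.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    target_row = posterior_row + round(fraction * (anterior_row - posterior_row))
    return sum(1 for pixel in mask[target_row] if pixel)


# Invented closed-phase frame: only a small posterior gap (rows 0 to 2) stays open.
closed_frame = [[0, 1, 1, 0],
                [0, 1, 1, 0],
                [0, 1, 0, 0]] + [[0, 0, 0, 0] for _ in range(8)]

print(gap_width_at(closed_frame, posterior_row=0, anterior_row=10, fraction=0.1))
print(gap_width_at(closed_frame, posterior_row=0, anterior_row=10, fraction=0.5))
```

In this toy frame the 10% line still crosses open pixels while the 50% line does not, which mirrors the pattern the abstract reports for subjects with insufficient posterior closure.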
5
Ikuma T, McWhorter AJ, Adkins L, Kunduk M. Investigation of Vocal Bifurcations and Voice Patterns Induced by Asymmetry of Pathological Vocal Folds. Journal of Speech, Language, and Hearing Research: JSLHR 2023; 66:48-60. [PMID: 36472934 DOI: 10.1044/2022_jslhr-21-00499] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/17/2023]
Abstract
PURPOSE Vocal fold asymmetry creates irregular entrainments and modulations in voice, which may lead to rough perceptual quality. The presence of asymmetry can also cause mid-phonation bifurcations, where a small change in the phonatory system causes a drastic change in vibration pattern, resulting in transitions in and out of rough voice. This study surveys sustained phonation recordings of speakers diagnosed with vocal fold polyp or unilateral vocal fold paralysis to investigate the resulting voice patterns. METHOD This retrospective study observed 71 sustained phonation recordings from 48 patients. Segments with distinctive signal patterns were identified within each recording with narrowband spectrograms and computer-assisted analysis of spectral peaks. RESULTS Phonation segmentation yielded 240 segments across all the recordings. Five voice patterns were recognized: (regularly or irregularly) entrained, modulated, uncoupled, unstable, and pulsed. Thirty-six patients (75%) exhibited irregular patterns. No single irregular pattern lasted for the entire phonation; each was accompanied by at least one mid-phonation bifurcation. Durations of the irregular segments (M = 0.4 s) were significantly shorter than those of the segments with the regular pattern (M = 1.4 s). CONCLUSIONS The results suggest that vocal fold pathology frequently introduces dynamic vibratory patterns that affect both the acoustic signals and perceptions. Because of these abnormalities, it is important for clinical voice assessment protocols, both perceptual and acoustic, to account for these possible bifurcations, irregular signal patterns, and their tendencies.
Affiliation(s)
- Takeshi Ikuma
  - Department of Otolaryngology-Head and Neck Surgery, Louisiana State University Health Sciences Center, New Orleans
  - Voice Center, The Our Lady of the Lake Regional Medical Center, Baton Rouge, LA
- Andrew J McWhorter
  - Department of Otolaryngology-Head and Neck Surgery, Louisiana State University Health Sciences Center, New Orleans
  - Voice Center, The Our Lady of the Lake Regional Medical Center, Baton Rouge, LA
- Lacey Adkins
  - Department of Otolaryngology-Head and Neck Surgery, Louisiana State University Health Sciences Center, New Orleans
  - Voice Center, The Our Lady of the Lake Regional Medical Center, Baton Rouge, LA
- Melda Kunduk
  - Department of Otolaryngology-Head and Neck Surgery, Louisiana State University Health Sciences Center, New Orleans
  - Voice Center, The Our Lady of the Lake Regional Medical Center, Baton Rouge, LA
  - Department of Communication Sciences & Disorders, Louisiana State University, Baton Rouge
6
Stewart ME, Erath BD. Investigating blunt force trauma to the larynx: The role of inferior-superior vocal fold displacement on phonation. J Biomech 2021; 121:110377. [PMID: 33819698 DOI: 10.1016/j.jbiomech.2021.110377] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2020] [Revised: 02/24/2021] [Accepted: 03/01/2021] [Indexed: 11/26/2022]
Abstract
Blunt force trauma to the larynx, which may result from motor vehicle collisions, sports activities, etc., can cause significant damage, often leading to displaced fractures of the laryngeal cartilages and thereby disrupting vocal function. Current surgical interventions primarily focus on airway restoration to stabilize the patient, with restoration of vocal function usually being a secondary consideration. Laryngeal fracture often causes asymmetric vertical misalignment of the left or right vocal fold (VF) in the inferior-superior direction. This affects VF closure and can lead to a weak, breathy voice requiring increased vocal effort. It is unclear, however, how much vertical VF misalignment can be tolerated before voice quality degrades significantly. To address this need, the influence of inferior-superior VF displacement on phonation is investigated in 1.0 mm increments using synthetic, self-oscillating VF models in a physiologically representative facility. Acoustic (SPL, frequency, H1-H2, jitter, and shimmer), kinematic (amplitude and phase differences), and aerodynamic (flow rate and subglottal pressure) parameters are investigated as a function of inferior-superior vertical displacement. Significant findings include that once the vertical displacement surpasses the inferior-superior medial length of the VF, sustained phonation degrades precipitously, becoming severely pathological. If laryngeal reconstruction approaches can ensure that VF contact is maintained during phonation (i.e., vertical displacement does not surpass VF medial length), improved vocal outcomes are expected.
Affiliation(s)
- Molly E Stewart
  - Department of Mechanical and Aeronautical Engineering, Clarkson University, 8 Clarkson Ave, Potsdam, NY 13699, United States
- Byron D Erath
  - Department of Mechanical and Aeronautical Engineering, Clarkson University, 8 Clarkson Ave, Potsdam, NY 13699, United States
7
Falk S, Kniesburges S, Schoder S, Jakubaß B, Maurerlehner P, Echternach M, Kaltenbacher M, Döllinger M. 3D-FV-FE Aeroacoustic Larynx Model for Investigation of Functional Based Voice Disorders. Front Physiol 2021; 12:616985. [PMID: 33762964 PMCID: PMC7982522 DOI: 10.3389/fphys.2021.616985] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 10/13/2020] [Accepted: 02/09/2021] [Indexed: 12/02/2022] Open
Abstract
For the clinical analysis of the underlying mechanisms of voice disorders, we developed a numerical aeroacoustic larynx model, called simVoice, that mimics commonly observed functional laryngeal disorders such as glottal insufficiency and vibrational left-right asymmetries. The model combines the Finite Volume (FV) CFD solver Star-CCM+ with the Finite Element (FE) aeroacoustic solver CFS++. simVoice models turbulence using Large Eddy Simulations (LES) and the acoustic wave propagation with the perturbed convective wave equation (PCWE). Its geometry corresponds to a simplified larynx and a vocal tract model representing the vowel /a/. The oscillations of the vocal folds are externally driven. In total, 10 configurations with different degrees of functional-based disorders were simulated and analyzed. The energy transfer between the glottal airflow and the vocal folds decreases with increasing glottal insufficiency and potentially reflects the higher effort during speech for affected patients. This loss of energy transfer may also have an essential influence on the quality of the sound signal, as expressed by decreasing sound pressure level (SPL), Cepstral Peak Prominence (CPP), and Vocal Efficiency (VE). Asymmetry in the vocal fold oscillations also reduces the quality of the sound signal. However, simVoice confirmed previous clinical and experimental observations that a high level of glottal insufficiency worsens the acoustic signal quality more than oscillatory left-right asymmetry does. Both symptoms in combination further reduce the quality of the sound signal. In summary, simVoice allows for detailed analysis of the origins of disordered voice production and hence fosters further understanding of laryngeal physiology, including the dependencies that occur. With a prospective increase in computing power, the current wall time of 10 h/cycle is promising for future clinical use of simVoice.
Affiliation(s)
- Sebastian Falk
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head & Neck Surgery, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Stefan Kniesburges
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head & Neck Surgery, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Stefan Schoder
  - Institute of Fundamentals and Theory in Electrical Engineering, Division Vibro- and Aeroacoustics, Graz University of Technology, Graz, Austria
- Bernhard Jakubaß
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head & Neck Surgery, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Paul Maurerlehner
  - Institute of Fundamentals and Theory in Electrical Engineering, Division Vibro- and Aeroacoustics, Graz University of Technology, Graz, Austria
- Matthias Echternach
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Munich University Hospital (LMU), Munich, Germany
- Manfred Kaltenbacher
  - Institute of Fundamentals and Theory in Electrical Engineering, Division Vibro- and Aeroacoustics, Graz University of Technology, Graz, Austria
- Michael Döllinger
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head & Neck Surgery, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
8
Semmler M, Berry DA, Schützenberger A, Döllinger M. Fluid-structure-acoustic interactions in an ex vivo porcine phonation model. The Journal of the Acoustical Society of America 2021; 149:1657. [PMID: 33765793 PMCID: PMC7952141 DOI: 10.1121/10.0003602] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 07/28/2020] [Revised: 01/29/2021] [Accepted: 02/07/2021] [Indexed: 05/02/2023]
Abstract
In the clinic, many diagnostic and therapeutic procedures focus on the oscillation patterns of the vocal folds (VFs). Dynamic characteristics of the VFs, such as symmetry, periodicity, and full glottal closure, are considered essential features of healthy phonation. However, the relevance of these individual factors in the complex interaction between the airflow, laryngeal structures, and the resulting acoustics has not yet been quantified. Sustained phonation was induced in nine excised porcine larynges without a vocal tract (supraglottal structures had been removed above the ventricular folds). The multimodal setup was designed to simultaneously control and monitor key aspects of phonation in the three essential parts of the larynx. More specifically, the measurements comprised (1) the subglottal pressure signal, (2) high-speed recordings in the glottal plane, and (3) the acoustic signal in the supraglottal region. The automated setup regulates glottal airflow, asymmetric arytenoid adduction, and the pre-phonatory glottal gap. Statistical analysis revealed a beneficial influence of VF periodicity and glottal closure on the signal quality of the subglottal pressure and the supraglottal acoustics, whereas VF symmetry had only a negligible influence. Strong correlations were found between the subglottal and supraglottal signal quality, with significant improvement of the acoustic quality for high levels of periodicity and glottal closure.
Affiliation(s)
- Marion Semmler
  - Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander-Universität Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany
- David A Berry
  - Laryngeal Dynamics Laboratory, Department of Head and Neck Surgery, David Geffen School of Medicine, UCLA, Los Angeles, California 90024, USA
- Anne Schützenberger
  - Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander-Universität Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany
- Michael Döllinger
  - Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander-Universität Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany
9
Zäske R, Skuk VG, Schweinberger SR. Attractiveness and distinctiveness between speakers' voices in naturalistic speech and their faces are uncorrelated. Royal Society Open Science 2020; 7:201244. [PMID: 33489273 PMCID: PMC7813223 DOI: 10.1098/rsos.201244] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/13/2020] [Accepted: 11/20/2020] [Indexed: 05/28/2023]
Abstract
Facial attractiveness has been linked to the averageness (or typicality) of a face and, more tentatively, to a speaker's vocal attractiveness, via the 'honest signal' hypothesis, holding that attractiveness signals good genes. In four experiments, we assessed ratings for attractiveness and two common measures of distinctiveness ('distinctiveness-in-the-crowd', DITC and 'deviation-based distinctiveness', DEV) for faces and voices (simple vowels, or more naturalistic sentences) from 64 young adult speakers (32 female). Consistent and substantial negative correlations between attractiveness and DEV generally supported the averageness account of attractiveness, for both voices and faces. By contrast, and indicating that both measures of distinctiveness reflect different constructs, correlations between attractiveness and DITC were numerically positive for faces (though small and non-significant), and significant for voices in sentence stimuli. Between faces and voices, distinctiveness ratings were uncorrelated. Remarkably, and at variance with the honest signal hypothesis, vocal and facial attractiveness were also uncorrelated in all analyses involving naturalistic, i.e. sentence-based, speech. This result pattern was confirmed using a new set of stimuli and raters (experiment 5). Overall, while our findings strongly support an averageness account of attractiveness for both domains, they provide no evidence for an honest signal account of facial and vocal attractiveness in complex naturalistic speech.
Affiliation(s)
- Romi Zäske
  - Department for General Psychology and Cognitive Neuroscience & DFG Research Unit Person Perception, Institute of Psychology, Friedrich Schiller University of Jena, Am Steiger 3/1, 07743 Jena, Germany
  - Department of Otorhinolaryngology, Jena University Hospital, Am Klinikum 1, 07747 Jena, Germany
- Verena Gabriele Skuk
  - Department for General Psychology and Cognitive Neuroscience & DFG Research Unit Person Perception, Institute of Psychology, Friedrich Schiller University of Jena, Am Steiger 3/1, 07743 Jena, Germany
  - Department of Otorhinolaryngology, Jena University Hospital, Am Klinikum 1, 07747 Jena, Germany
- Stefan R. Schweinberger
  - Department for General Psychology and Cognitive Neuroscience & DFG Research Unit Person Perception, Institute of Psychology, Friedrich Schiller University of Jena, Am Steiger 3/1, 07743 Jena, Germany
  - International Max Planck Research School (IMPRS) for the Science of Human History, Max Planck Institute for the Science of Human History, Kahlaische Strasse 10, 07745 Jena, Germany
10
11
Schlegel P, Kniesburges S, Dürr S, Schützenberger A, Döllinger M. Machine learning based identification of relevant parameters for functional voice disorders derived from endoscopic high-speed recordings. Sci Rep 2020; 10:10517. [PMID: 32601277 PMCID: PMC7324600 DOI: 10.1038/s41598-020-66405-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 01/28/2020] [Accepted: 05/20/2020] [Indexed: 11/13/2022] Open
Abstract
In voice research and clinical assessment, many objective parameters are in use. However, there is no commonly used set of parameters that reflects certain voice disorders, such as functional dysphonia (FD), i.e., disorders with no visible anatomical changes. Hence, 358 high-speed videoendoscopy (HSV) recordings (159 normal females (NF), 101 FD females (FDF), 66 normal males (NM), 32 FD males (FDM)) were analyzed. We investigated the significance of 91 quantitative HSV parameters. First, 25 highly correlated parameters were discarded. Second, a further 54 parameters were discarded using a LogitBoost decision stumps approach. This yielded a subset of 12 parameters sufficient to reflect functional dysphonia. These parameters separated the groups NF vs. FDF and NM vs. FDM with fair accuracies of 0.745 and 0.768, respectively. Parameters computed solely from the changing glottal area waveform (a 1D function called GAW) between the vocal folds were less important than parameters describing the oscillation characteristics along the vocal folds (a 2D function called Phonovibrogram). The most important parameters were the regularity of GAW phases and peak shape, harmonic structure, and Phonovibrogram-based vocal fold opening and closing angles. This study shows the high degree of redundancy among HSV voice parameters but also affirms the need for multidimensional assessment of clinical data.
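The first reduction step described above, discarding highly correlated parameters, can be sketched as a greedy filter that keeps a parameter only if it is not strongly correlated with any parameter already kept. The abstract does not give the correlation measure or cut-off, so the Pearson coefficient, the 0.9 threshold, and all names below are assumptions made for illustration. The second step (LogitBoost with decision stumps) is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant parameter is treated as uncorrelated here
    return cov / (spread_x * spread_y)


def drop_correlated(parameters, threshold=0.9):
    """Keep each parameter unless |r| with an already kept one exceeds `threshold`.

    `parameters` maps a parameter name to its values across recordings.
    The 0.9 default is an assumed cut-off, not the value used in the study.
    """
    kept = []
    for name, values in parameters.items():
        if all(abs(pearson(values, parameters[other])) <= threshold for other in kept):
            kept.append(name)
    return kept


# Invented values for three hypothetical parameters across four recordings.
toy_parameters = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.1],  # almost a multiple of "a"
    "c": [4.0, 1.0, 3.0, 2.0],
}
print(drop_correlated(toy_parameters))
```

A greedy filter like this depends on the order in which parameters are visited, so a real analysis would need a documented rule for which member of a correlated pair to keep.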
Affiliation(s)
- Patrick Schlegel
  - Department of Otorhinolaryngology, Division of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Stefan Kniesburges
  - Department of Otorhinolaryngology, Division of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Stephan Dürr
  - Department of Otorhinolaryngology, Division of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Anne Schützenberger
  - Department of Otorhinolaryngology, Division of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Michael Döllinger
  - Department of Otorhinolaryngology, Division of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
12
Mohd Khairuddin KA, Ahmad K, Ibrahim HM, Yan Y. Effects of Using Laryngeal High-Speed Videoendoscopy Images Visualizing Partial Views of The Glottis on Measurement Outcomes. J Voice 2020; 36:106-112. [PMID: 32456835 DOI: 10.1016/j.jvoice.2020.04.027] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/23/2020] [Revised: 04/21/2020] [Accepted: 04/22/2020] [Indexed: 11/29/2022]
Abstract
Ideally, an analysis method for laryngeal high-speed videoendoscopy (LHSV) based on the glottal area waveform (GAW) requires images of a complete view of the glottis to ensure findings that are representative of the vibratory behavior of the whole vocal folds. In practice, however, the preferred images cannot always be obtained. Often, the only images available to a clinician show a partial view of the glottis. This study aims to examine the effects of using images of a partial view of the glottis (ie, posterior-middle, anterior-middle, or middle) on the LHSV-based measures (ie, fundamental frequency (F0GAW), frequency perturbation (jitterGAW), amplitude perturbation (shimmerGAW), open quotient (OQGAW), and Nyquist plot). The participants were 9 young normophonic females. The procedures involved LHSV recording of the vibration of the vocal folds. The images of the complete view of the glottis were analyzed to obtain the LHSV-based measures. The same images were used to simulate images of partial views of the glottis by changing the outline of the region of interest to include only the posterior-middle, anterior-middle, or middle part of the glottis. The LHSV-based measures from the images of the partial views were then compared with those from the complete view. The results showed that all LHSV-based measures from the images of the posterior-middle view were similar to those of the complete view. However, only the F0GAW, jitterGAW, and shimmerGAW from the images of the anterior-middle and middle views were similar to those of the complete view. The images of the anterior-middle and middle views produced lower OQGAW and different Nyquist plots than the complete view.
In conclusion, all LHSV-based measures from the images of the posterior-middle view of the glottis, but only the F0GAW, jitterGAW, and shimmerGAW from the images of the anterior-middle and middle views of the glottis, reflect the vibratory behavior of the whole vocal folds. The same conclusion cannot be applied to the OQGAW and Nyquist plots from the images of the anterior-middle and middle views of the glottis. A possible effect of the presence or absence of a posterior glottal gap on the findings warrants further confirmation.
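For readers unfamiliar with the GAW-based measures named above, the sketch below shows one widely used way of computing perturbation and open quotient from per-cycle values. The abstract does not state the exact formulas used in the study, so the "local" perturbation definition (mean absolute cycle-to-cycle difference divided by the mean) and the function names are assumptions.

```python
def relative_perturbation(values):
    """Mean absolute difference between consecutive values, relative to their mean.

    Applied to cycle periods this gives a local jitter estimate; applied to
    per-cycle peak glottal areas it gives a local shimmer estimate.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(later - earlier) for earlier, later in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def open_quotient(open_frames, cycle_frames):
    """Fraction of one glottal cycle during which the glottal area is non-zero."""
    if cycle_frames <= 0:
        raise ValueError("cycle length must be positive")
    return open_frames / cycle_frames


# Invented per-cycle values, for illustration only.
periods_ms = [4.0, 4.2, 4.0, 4.2]
peak_areas_px = [100.0, 110.0, 100.0, 110.0]
print(relative_perturbation(periods_ms))     # jitter-like measure
print(relative_perturbation(peak_areas_px))  # shimmer-like measure
print(open_quotient(open_frames=60, cycle_frames=100))
```

Seen this way, the study's finding is plausible: period and peak-to-peak variation can survive cropping the region of interest, whereas the open quotient depends on whether a region that stays open, such as a posterior gap, is inside the view.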
Affiliation(s)
- Khairy Anuar Mohd Khairuddin
  - Speech Sciences Program, Centre for Rehabilitation and Special Needs, Faculty of Health Sciences, Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia; Speech Pathology Program, School of Health Sciences, Universiti Sains Malaysia, Kelantan, Malaysia
- Kartini Ahmad
  - Speech Sciences Program, Centre for Rehabilitation and Special Needs, Faculty of Health Sciences, Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia
- Hasherah Mohd Ibrahim
  - Speech Sciences Program, Centre for Rehabilitation and Special Needs, Faculty of Health Sciences, Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia
- Yuling Yan
  - Department of Bioengineering, School of Engineering, Santa Clara University, California
13
|
Schlegel P, Kist AM, Semmler M, Döllinger M, Kunduk M, Dürr S, Schützenberger A. Determination of Clinical Parameters Sensitive to Functional Voice Disorders Applying Boosted Decision Stumps. IEEE Journal of Translational Engineering in Health and Medicine 2020; 8:2100511. [PMID: 32518739 PMCID: PMC7274815 DOI: 10.1109/jtehm.2020.2985026] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/03/2019] [Revised: 02/21/2020] [Accepted: 03/28/2020] [Indexed: 12/30/2022]
Abstract
BACKGROUND Various voice assessment tools, such as questionnaires and aerodynamic voice characteristics, can be used to assess the vocal function of individuals. However, not much is known about the best combinations of these parameters for identifying functional dysphonia in clinical settings. METHODS This study investigated six scores from questionnaires commonly used in clinical practice and seven acoustic parameters. In total, 514 females and 277 males were analyzed. The subjects were divided into three groups: one healthy group (N01) (49 females, 50 males) and two disordered groups with perceptually hoarse (FD23) (220 females, 96 males) and perceptually not hoarse (FD01) (245 females, 131 males) sounding voices. An AdaBoost approach with decision tree stumps was applied to find the subset of parameters that best separates the groups. Subsequently, it was determined whether this parameter subset reflects treatment outcome for 120 female and 51 male patients by pairwise pre- and post-treatment comparisons of parameters. RESULTS The questionnaire "Voice-related-quality-of-Life" and three objective parameters ("maximum fundamental frequency", "maximum Intensity" and "Jitter Percent") were sufficient to separate the groups (accuracy ranging from 0.690 (FD01 vs. FD23, females) to 0.961 (N01 vs. FD23, females)). Our study suggests that a reduced parameter subset (4 out of 13) is sufficient to separate these three groups. All parameters reflected treatment outcome for patients with hoarse voices; Voice-related-quality-of-Life showed improvement for the not hoarse group (FD01). CONCLUSION Results show that single parameters are insufficient to separate voice disorders, but a set of several well-chosen parameters is sufficient. These findings will help to optimize and reduce clinical assessment time.
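The idea of using boosted decision stumps to find a small, informative parameter subset can be sketched as follows. This is a generic, pure-Python AdaBoost with single-feature threshold rules, not the authors' implementation, which the abstract does not describe. All function names are hypothetical. Because each stump uses exactly one parameter, summing the stump weights per parameter gives a rough indication of which parameters carry the separation.

```python
import math


def stump_predict(row, j, t, sign):
    """Predict +1/-1 from a single feature j compared against threshold t."""
    return sign if row[j] > t else -sign


def fit_stump(X, y, w):
    """Find the one-feature threshold rule with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for t in [(a + b) / 2 for a, b in zip(values, values[1:])]:
            for sign in (1, -1):
                err = sum(wi for wi, row, yi in zip(w, X, y)
                          if stump_predict(row, j, t, sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    if best is None:
        raise ValueError("no feature has more than one distinct value")
    return best


def adaboost(X, y, rounds=10):
    """Train AdaBoost on labels in {-1, +1}; returns a list of weighted stumps."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, j, t, sign = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)      # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)    # weight of this stump
        model.append((alpha, j, t, sign))
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, t, sign))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, row):
    """Sign of the weighted vote of all stumps."""
    score = sum(a * stump_predict(row, j, t, s) for a, j, t, s in model)
    return 1 if score >= 0 else -1


def feature_importance(model, n_features):
    """Share of total stump weight assigned to each feature."""
    imp = [0.0] * n_features
    for alpha, j, _, _ in model:
        imp[j] += alpha
    total = sum(imp)
    return [v / total for v in imp]
```

Parameters that never receive stump weight are candidates for removal, which is the spirit of reducing 13 parameters to 4. How the study actually selected its subset is not stated in the abstract.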
Affiliation(s)
- Patrick Schlegel
- Department of Otorhinolaryngology, Head and Neck Surgery, Division of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany
- Andreas M. Kist
- Department of Otorhinolaryngology, Head and Neck Surgery, Division of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany
- Marion Semmler
- Department of Otorhinolaryngology, Head and Neck Surgery, Division of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany
- Michael Döllinger
- Department of Otorhinolaryngology, Head and Neck Surgery, Division of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany
- Melda Kunduk
- Department of Communication Sciences and Disorders, Louisiana State University, Baton Rouge, LA 70803, USA
- Stephan Dürr
- Department of Otorhinolaryngology, Head and Neck Surgery, Division of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany
- Anne Schützenberger
- Department of Otorhinolaryngology, Head and Neck Surgery, Division of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany
|
14
|
Fehling MK, Grosch F, Schuster ME, Schick B, Lohscheller J. Fully automatic segmentation of glottis and vocal folds in endoscopic laryngeal high-speed videos using a deep Convolutional LSTM Network. PLoS One 2020; 15:e0227791. [PMID: 32040514 PMCID: PMC7010264 DOI: 10.1371/journal.pone.0227791] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 02/21/2019] [Accepted: 12/25/2019] [Indexed: 01/22/2023] Open
Abstract
The objective investigation of the dynamic properties of vocal fold vibrations demands the recording and further quantitative analysis of laryngeal high-speed video (HSV). Quantification of the vocal fold vibration patterns requires, as a first step, the segmentation of the glottal area within each video frame, from which the vibrating edges of the vocal folds are usually derived. Consequently, the outcome of any further vibration analysis depends on the quality of this initial segmentation process. In this work we propose for the first time a procedure to fully automatically segment not only the time-varying glottal area but also the vocal fold tissue directly from laryngeal HSV using a deep Convolutional Neural Network (CNN) approach. Eighteen different CNN configurations were trained and evaluated on a total of 13,000 HSV frames obtained from 56 healthy and 74 pathologic subjects. The segmentation quality of the best-performing CNN model, which uses Long Short-Term Memory (LSTM) cells to also take the temporal context into account, was investigated in depth on 15 test video sequences comprising 100 consecutive images each. The Dice Coefficient (DC) and the precision of four anatomical landmark positions were used as performance measures. Over all test data, a mean DC of 0.85 was obtained for the glottis and 0.91 and 0.90 for the right and left vocal fold (VF), respectively. The grand average precision of the identified landmarks amounts to 2.2 pixels and is in the same range as comparable manual expert segmentations, which can be regarded as the gold standard. The method proposed here requires no user interaction and overcomes the limitations of current semiautomatic or computationally expensive approaches. Thus, it also allows for the analysis of long HSV sequences and holds the promise to facilitate the objective analysis of vocal fold vibrations in clinical routine. The dataset used here, including the ground truth, will be provided freely to all scientific groups to allow a quantitative benchmarking of segmentation approaches in the future.
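For readers unfamiliar with the Dice Coefficient reported above, here is a minimal Python sketch of how it is computed for a pair of binary masks, alongside the closely related Intersection over Union. The function names and the nested-list mask representation are assumptions made for illustration, and returning 1.0 for two empty masks is a convention chosen here, not something taken from the paper.

```python
def _flatten(mask):
    """Flatten a 2-D mask (list of rows) into a list of 0/1 values."""
    return [1 if v else 0 for row in mask for v in row]


def _overlap(pred, truth):
    """Return (intersection size, |pred|, |truth|) for two binary masks."""
    p, t = _flatten(pred), _flatten(truth)
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of pixels")
    inter = sum(a & b for a, b in zip(p, t))
    return inter, sum(p), sum(t)


def dice_coefficient(pred, truth):
    """Dice = 2 * |A ∩ B| / (|A| + |B|)."""
    inter, n_pred, n_truth = _overlap(pred, truth)
    total = n_pred + n_truth
    return 1.0 if total == 0 else 2.0 * inter / total


def iou(pred, truth):
    """IoU = |A ∩ B| / |A ∪ B|."""
    inter, n_pred, n_truth = _overlap(pred, truth)
    union = n_pred + n_truth - inter
    return 1.0 if union == 0 else inter / union
```

The two scores are monotonically related (IoU = Dice / (2 - Dice)), so a Dice of 0.85 corresponds to an IoU of roughly 0.74. This is worth keeping in mind when comparing papers that report different overlap metrics.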
Affiliation(s)
- Mona Kirstin Fehling
- Department of Computer Science, Trier University of Applied Sciences, Schneidershof, Trier, Germany
- Fabian Grosch
- Department of Computer Science, Trier University of Applied Sciences, Schneidershof, Trier, Germany
- Maria Elke Schuster
- Department of Otorhinolaryngology and Head and Neck Surgery, University of Munich, Campus Grosshadern, München, Germany
- Bernhard Schick
- Department of Otorhinolaryngology, Saarland University Hospital, Homburg/Saar, Germany
- Jörg Lohscheller
- Department of Computer Science, Trier University of Applied Sciences, Schneidershof, Trier, Germany
|
15
|
Influence of spatial camera resolution in high-speed videoendoscopy on laryngeal parameters. PLoS One 2019; 14:e0215168. [PMID: 31009488 PMCID: PMC6476512 DOI: 10.1371/journal.pone.0215168] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 11/26/2018] [Accepted: 03/27/2019] [Indexed: 11/19/2022] Open
Abstract
In laryngeal high-speed videoendoscopy (HSV), the area between the vibrating vocal folds during phonation is of interest and is referred to as the glottal area waveform (GAW). Varying camera resolution may influence parameters computed on the GAW and hence hinder the comparability between examinations. This study investigates the influence of spatial camera resolution on quantitative vocal fold vibratory function parameters obtained from the GAW. In total, 40 HSV recordings during sustained phonation (20 healthy males and 20 healthy females) were investigated. A clinically used Photron Fastcam MC2 camera with a frame rate of 4000 fps and a spatial resolution of 512×256 pixels was applied. This initial resolution was reduced by pixel averaging to (1) a resolution of 256×128 pixels and (2) a resolution of 128×64 pixels, yielding three sets of recordings. The GAW was extracted, and in total 50 vocal fold vibratory parameters representing different features of the GAW were computed. Statistical analyses were performed using SPSS Statistics, version 21. Fifteen parameters showing strong mathematical dependencies with other parameters were excluded from the main analysis but are given in the Supporting Information. Data analysis revealed a clear influence of spatial resolution on GAW parameters. Fundamental period measures and period perturbation measures were the least affected. Amplitude perturbation measures and mechanical measures were most strongly influenced. Most glottal dynamic characteristics and symmetry measures deviated significantly. Most energy perturbation measures changed significantly in males but were mostly unaffected in females. In females, 18 of the 35 remaining parameters (51%) and in males 22 parameters (63%) changed significantly between spatial resolutions. This work represents the first step in studying the impact of video resolution on quantitative HSV parameters. Clear influences of spatial camera resolution on computed parameters were found. The study results suggest avoiding the use of the most strongly affected parameters. Further, the use of cameras with high resolution is recommended to analyze GAW measures in HSV data.
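The resolution reduction "by pixel averaging" described above can be illustrated with a short Python sketch. It assumes non-overlapping square blocks whose mean becomes one output pixel. The abstract does not specify the exact scheme, and the function name `downsample_by_averaging` is hypothetical.

```python
def downsample_by_averaging(frame, factor=2):
    """Reduce a grayscale frame (list of rows) by averaging factor x factor blocks.

    Assumption: blocks do not overlap, and both frame dimensions are
    divisible by `factor`.
    """
    height, width = len(frame), len(frame[0])
    if height % factor or width % factor:
        raise ValueError("frame dimensions must be divisible by factor")
    out = []
    for r in range(0, height, factor):
        row = []
        for c in range(0, width, factor):
            block = [frame[r + dr][c + dc]
                     for dr in range(factor) for dc in range(factor)]
            row.append(sum(block) / len(block))  # mean intensity of the block
        out.append(row)
    return out
```

Under these assumptions, a factor of 2 would take a 512×256 frame to 256×128, and a factor of 4 (or two passes with factor 2) would take it to 128×64. Averaging blurs the glottal edge, so the segmented area per frame changes slightly, which is a plausible reason why amplitude-related GAW parameters were affected more than period-based ones.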
|
16
|
Powell ME, Deliyski DD, Zeitels SM, Burns JA, Hillman RE, Gerlach TT, Mehta DD. Efficacy of Videostroboscopy and High-Speed Videoendoscopy to Obtain Functional Outcomes From Perioperative Ratings in Patients With Vocal Fold Mass Lesions. J Voice 2019; 34:769-782. [PMID: 31005449 DOI: 10.1016/j.jvoice.2019.03.012] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 10/22/2018] [Revised: 03/20/2019] [Accepted: 03/21/2019] [Indexed: 11/30/2022]
Abstract
OBJECTIVES A major limitation of comparing the efficacy of videostroboscopy (VS) and high-speed videoendoscopy (HSV) is the lack of an objective reference by which to compare the functional assessment ratings of the two techniques. For patients with vocal fold mass lesions, intraoperative measures of lesion size and depth may serve as this objective reference. This study compared the relationships between the pre- to postoperative change in VS and HSV visual-perceptual ratings to intraoperative measures of lesion size and depth. DESIGN Prospective visual-perceptual study with intraoperative measures of lesion size and depth. METHODS VS and HSV samples were obtained preoperatively and postoperatively from 28 patients with vocal fold lesions and from 17 vocally healthy controls. Two experienced clinicians rated amplitude, mucosal wave, vertical phase difference, left-right phase asymmetry, and vocal fold edge on a visual-analog scale using both imaging techniques. The change in perioperative ratings from VS and HSV was compared between groups and correlated to intraoperative measures of lesion size and depth. RESULTS HSV was as reliable as VS for ratings of amplitude and edge, and substantially more reliable for ratings of mucosal wave and left-right phase asymmetry. Both VS and HSV had mild-moderate correlations between change in perioperative ratings and intraoperative measures of lesion area. Change in function could be obtained in more patients and for more parameters using HSV than VS. Group differences were noted for postoperative ratings of amplitude and edge; however, these differences were within one level of the visual-perceptual rating scale. The presence of asynchronicity in VS recordings renders vibratory features either uninterpretable or potentially distorted and thus should not be rated. CONCLUSIONS Amplitude and edge are robust vibratory measures for perioperative functional assessment, regardless of imaging modality. 
HSV is indicated for evaluation of subepithelial lesions or if asynchronicity is present in the VS image sequence.
Affiliation(s)
- Maria E Powell
- Department of Otolaryngology, Vanderbilt University Medical Center, Nashville, Tennessee; Communication Sciences Research Center, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio; Department of Communication Sciences and Disorders, University of Cincinnati, Cincinnati, Ohio
- Dimitar D Deliyski
- Communication Sciences Research Center, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio; Department of Communication Sciences and Disorders, University of Cincinnati, Cincinnati, Ohio; Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, Michigan
- Steven M Zeitels
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, Massachusetts; Department of Surgery, Harvard Medical School, Boston, Massachusetts
- James A Burns
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, Massachusetts; Department of Surgery, Harvard Medical School, Boston, Massachusetts
- Robert E Hillman
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, Massachusetts; Department of Surgery, Harvard Medical School, Boston, Massachusetts; Department of Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, Massachusetts
- Terri Treman Gerlach
- Voice and Swallowing Center, Charlotte Eye Ear Nose and Throat Associates, Charlotte, North Carolina
- Daryush D Mehta
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, Massachusetts; Department of Surgery, Harvard Medical School, Boston, Massachusetts; Department of Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, Massachusetts
|
17
|
Caffier PP, Nawka T, Ibrahim-Nasr A, Thomas B, Müller H, Ko SR, Song W, Gross M, Weikert S. Development of three-dimensional laryngostroboscopy for office-based laryngeal diagnostics and phonosurgical therapy. Laryngoscope 2018; 128:2823-2831. [PMID: 30328614 DOI: 10.1002/lary.27260] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 02/02/2018] [Revised: 03/19/2018] [Accepted: 04/06/2018] [Indexed: 11/10/2022]
Abstract
OBJECTIVE To develop a three-dimensional (3D) laryngostroboscopic examination unit, compare the optic playback quality in relation to established 2D procedures, and report the first case series using 3D rigid laryngostroboscopy for diagnosis and management of laryngotracheal diseases. STUDY DESIGN Laboratory study, prospective case series. METHODS The optical efficacy of newly developed rigid 3D endoscopes was examined in a laboratory setting. Diagnostic suitability was investigated in 100 subjects (50 male, 50 female) receiving 2D high-definition (HD) and 3D laryngostroboscopy. Two of the subjects subsequently underwent 3D-assisted office-based transoral phonosurgery under local anesthesia. Main outcome measures were comparative visualization of laryngotracheal pathologies, influence on preoperative planning, and evaluation of prognostic factors for the outcome of phonosurgical interventions. RESULTS Three-dimensional endostroboscopic procedures were effectively optimized to establish an examination protocol for all-day clinical use. Office-based 3D laryngostroboscopy was successfully applied in subjects with normal anatomy (n = 10) and various laryngotracheal findings (n = 90). In comparison to 2D HD videolaryngostroboscopy, the 3D view offered enhanced visualization of laryngotracheal anatomy, with qualitatively improved depth perception and spatial representation. In organic pathologies, this resulted in a more precise indication of phonosurgical procedures, increased accuracy in surgical planning, facilitated office-based endoscopic surgery, and better evaluation of prognostic factors for the outcome of phonosurgical interventions. CONCLUSION Three-dimensional laryngostroboscopy proved to increase the understanding of functional and surgical anatomy. Its application has enormous potential for improving the diagnostic value of laryngoscopy, surgical precision in laryngotracheal interventions, tissue preservation, and methods of teaching. 
LEVEL OF EVIDENCE NA Laryngoscope, 128:2823-2831, 2018.
Affiliation(s)
- Philipp P Caffier
- Department of Audiology and Phoniatrics, Charité-University Medicine Berlin, Berlin, Germany
- Tadeus Nawka
- Department of Audiology and Phoniatrics, Charité-University Medicine Berlin, Berlin, Germany
- Ahmed Ibrahim-Nasr
- Department of Audiology and Phoniatrics, Charité-University Medicine Berlin, Berlin, Germany
- Seo-Rin Ko
- Department of Audiology and Phoniatrics, Charité-University Medicine Berlin, Berlin, Germany
- Wen Song
- Department of Audiology and Phoniatrics, Charité-University Medicine Berlin, Berlin, Germany
- Manfred Gross
- Department of Audiology and Phoniatrics, Charité-University Medicine Berlin, Berlin, Germany
- Sebastian Weikert
- Department of Audiology and Phoniatrics, Charité-University Medicine Berlin, Berlin, Germany
|
18
|
Semmler M, Döllinger M, Patel RR, Ziethe A, Schützenberger A. Clinical relevance of endoscopic three-dimensional imaging for quantitative assessment of phonation. Laryngoscope 2018. [DOI: 10.1002/lary.27165] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 11/09/2022]
Affiliation(s)
- Marion Semmler
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head and Neck Surgery; University Hospital Erlangen Medical School; Erlangen Germany
- Michael Döllinger
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head and Neck Surgery; University Hospital Erlangen Medical School; Erlangen Germany
- Rita R. Patel
- Department of Speech and Hearing Sciences; Indiana University; Bloomington Indiana U.S.A
- Anke Ziethe
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head and Neck Surgery; University Hospital Erlangen Medical School; Erlangen Germany
- Anne Schützenberger
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head and Neck Surgery; University Hospital Erlangen Medical School; Erlangen Germany
|
19
|
Birk V, Kniesburges S, Semmler M, Berry DA, Bohr C, Döllinger M, Schützenberger A. Influence of glottal closure on the phonatory process in ex vivo porcine larynges. The Journal of the Acoustical Society of America 2017; 142:2197. [PMID: 29092569 PMCID: PMC6909995 DOI: 10.1121/1.5007952] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 05/11/2023]
Abstract
Many cases of disturbed voice signals can be attributed to incomplete glottal closure, vocal fold oscillation asymmetries, and aperiodicity. Often these phenomena occur simultaneously and interact with each other, making a systematic, isolated investigation challenging. Therefore, ex vivo porcine experiments were performed which enable direct control of glottal configurations. Different pre-phonatory glottal gap sizes, adduction levels, and flow rates were adjusted. The resulting glottal closure types were identified in a post-processing step. Finally, the acoustic quality, aerodynamic parameters, and the characteristics of vocal fold oscillation were analyzed in reference to the glottal closure types. Results show that complete glottal closure stabilizes the phonation process indicated through a reduced left-right phase asymmetry, increased amplitude and time periodicity, and an increase in the acoustic quality. Although asymmetry and periodicity parameter variation covers only a small range of absolute values, these small variations have a remarkable influence on the acoustic quality. Due to the fact that these parameters cannot be influenced directly, the authors suggest that the (surgical) reduction of the glottal gap seems to be a promising method to stabilize the phonatory process, which has to be confirmed in future studies.
Affiliation(s)
- Veronika Birk
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander-Universität Erlangen-Nürnberg, Waldstr. 1, 91054 Erlangen, Germany
- Stefan Kniesburges
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander-Universität Erlangen-Nürnberg, Waldstr. 1, 91054 Erlangen, Germany
- Marion Semmler
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander-Universität Erlangen-Nürnberg, Waldstr. 1, 91054 Erlangen, Germany
- David A Berry
- Laryngeal Dynamics Laboratory, Division of Head and Neck Surgery, David Geffen School of Medicine at UCLA, 10833 Le Conte Avenue, Los Angeles, California 90095-1624, USA
- Christopher Bohr
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander-Universität Erlangen-Nürnberg, Waldstr. 1, 91054 Erlangen, Germany
- Michael Döllinger
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander-Universität Erlangen-Nürnberg, Waldstr. 1, 91054 Erlangen, Germany
- Anne Schützenberger
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander-Universität Erlangen-Nürnberg, Waldstr. 1, 91054 Erlangen, Germany
|
20
|
Tokuda IT, Shimamura R. Effect of level difference between left and right vocal folds on phonation: Physical experiment and theoretical study. The Journal of the Acoustical Society of America 2017; 142:482. [PMID: 28863607 DOI: 10.1121/1.4996105] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 06/07/2023]
Abstract
As an alternative factor to produce asymmetry between left and right vocal folds, the present study focuses on level difference, which is defined as the distance between the upper surfaces of the bilateral vocal folds in the inferior-superior direction. Physical models of the vocal folds were utilized to study the effect of the level difference on the phonation threshold pressure. A vocal tract model was also attached to the vocal fold model. For two types of different models, experiments revealed that the phonation threshold pressure tended to increase as the level difference was extended. Based upon a small amplitude approximation of the vocal fold oscillations, a theoretical formula was derived for the phonation threshold pressure. This theory agrees with the experiments, especially when the phase difference between the left and right vocal folds is not extensive. Furthermore, an asymmetric two-mass model was simulated with a level difference to validate the experiments as well as the theory. The primary conclusion is that the level difference has a potential effect on voice production especially for patients with an extended level of vertical difference in the vocal folds, which might be taken into account for the diagnosis of voice disorders.
Affiliation(s)
- Isao T Tokuda
- Graduate School of Science and Engineering, Ritsumeikan University, Noji-higashi, Kusatsu, Shiga 525-8577, Japan
- Ryo Shimamura
- Graduate School of Science and Engineering, Ritsumeikan University, Noji-higashi, Kusatsu, Shiga 525-8577, Japan
|
21
|
Samlan RA, Story BH. Influence of Left-Right Asymmetries on Voice Quality in Simulated Paramedian Vocal Fold Paralysis. Journal of Speech, Language, and Hearing Research: JSLHR 2017; 60:306-321. [PMID: 28199505 DOI: 10.1044/2016_jslhr-s-16-0076] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 02/24/2016] [Accepted: 05/31/2016] [Indexed: 05/25/2023]
Abstract
PURPOSE The purpose of this study was to determine the vocal fold structural and vibratory symmetries that are important to vocal function and voice quality in a simulated paramedian vocal fold paralysis. METHOD A computational kinematic speech production model was used to simulate an exemplar "voice" on the basis of asymmetric settings of parameters controlling glottal configuration. These parameters were then altered individually to determine their effect on maximum flow declination rate, spectral slope, cepstral peak prominence, harmonics-to-noise ratio, and perceived voice quality. RESULTS Asymmetry of each of the 5 vocal fold parameters influenced vocal function and voice quality; measured change was greatest for adduction and bulging. Increasing the symmetry of all parameters improved voice, and the best voice occurred with overcorrection of adduction, followed by bulging, nodal point ratio, starting phase, and amplitude of vibration. CONCLUSIONS Although vocal process adduction and edge bulging asymmetries are most influential in voice quality for simulated vocal fold motion impairment, amplitude of vibration and starting phase asymmetries are also perceptually important. These findings are consistent with the current surgical approach to vocal fold motion impairment, where goals include medializing the vocal process and straightening concave edges. The results also explain many of the residual postoperative voice limitations.
Affiliation(s)
- Robin A Samlan
- Department of Speech, Language, and Hearing Sciences, University of Arizona, Tucson
- Brad H Story
- Department of Speech, Language, and Hearing Sciences, University of Arizona, Tucson
|
22
|
Investigation of the Immediate Effects of Humming on Vocal Fold Vibration Irregularity Using Electroglottography and High-speed Laryngoscopy in Patients With Organic Voice Disorders. J Voice 2017; 31:48-56. [DOI: 10.1016/j.jvoice.2016.03.010] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 01/04/2016] [Accepted: 03/17/2016] [Indexed: 11/22/2022]
|
23
|
Evaluation of an asymmetric anterior glottic web in an excised canine larynx model. Eur Arch Otorhinolaryngol 2016; 274:1609-1615. [PMID: 27826648 DOI: 10.1007/s00405-016-4364-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/07/2016] [Accepted: 10/26/2016] [Indexed: 10/20/2022]
Abstract
The main objective of the study is to model asymmetry within anterior glottic webs in excised larynges using sutures and to apply aerodynamic and acoustic analyses. Anterior glottic webs (AGW) were modeled in eight excised larynges using sutures secured at the level of the glottis to mimic the scar tissue of the web. Each of the eight larynges was tested under three different pressure increments for each of the three models of AGW: symmetric, vertically asymmetric, and laterally asymmetric. Phonation threshold pressure (PTP) and flow (PTF) differed significantly across AGW conditions (p = 0.006 and p = 0.005, respectively). Additionally, vocal efficiency was significantly different among conditions (p = 0.005) as well as significantly lower in the asymmetric groups (p = 0.015 and p = 0.007). Perturbation measures were not significantly different across conditions. Correlation dimension (D2) was significantly different at PTP, 1.25 × PTP, and 1.5 × PTP (p = 0.003, p = 0.010, and p < 0.001, respectively) as well as significantly higher in the asymmetric groups at each pressure increment. The increased PTP, PTF, and D2 values as well as decreased vocal efficiency among the asymmetric conditions indicate a significant decrease in vocal function, suggesting that asymmetries could be a contributing factor to the pathological symptoms associated with glottic webs.
|
24
|
Hill AK, Cárdenas RA, Wheatley JR, Welling LLM, Burriss RP, Claes P, Apicella CL, McDaniel MA, Little AC, Shriver MD, Puts DA. Are there vocal cues to human developmental stability? Relationships between facial fluctuating asymmetry and voice attractiveness. Evol Hum Behav 2016; 38:249-258. [PMID: 34629843 DOI: 10.1016/j.evolhumbehav.2016.10.008] [Citation(s) in RCA: 49] [Impact Index Per Article: 6.1] [Indexed: 11/26/2022]
Abstract
Fluctuating asymmetry (FA), deviation from perfect bilateral symmetry, is thought to reflect an organism's relative inability to maintain stable morphological development in the face of environmental and genetic stressors. Previous research has documented negative relationships between FA and attractiveness judgments in humans, but scant research has explored relationships between the human voice and this putative marker of genetic quality in either sex. Only one study (and in women only) has explored relationships between vocal attractiveness and asymmetry of the face, a feature-rich trait space central in prior work on human genetic quality and mate choice. We therefore examined this relationship in three studies comprising 231 men and 240 women from two Western samples as well as Hadza hunter-gatherers of Tanzania. Voice recordings were collected and rated for attractiveness, and FA was computed from two-dimensional facial images as well as, for a subset of men, three-dimensional facial scans. Through meta-analysis of our results and those of prior studies, we found a negative association between FA and vocal attractiveness that was highly robust and statistically significant whether we included effect sizes from previously published work, or only those from the present research, and regardless of the inclusion of any individual sample or method of assessing FA (e.g., facial or limb FA). Weighted mean correlations between FA and vocal attractiveness across studies were -.23 for men and -.29 for women. This research thus offers strong support for the hypothesis that voices provide cues to genetic quality in humans.
Affiliation(s)
- Alexander K Hill
- Department of Anthropology, The Pennsylvania State University, University Park, PA 16802
- Rodrigo A Cárdenas
- Department of Psychology, The Pennsylvania State University, University Park, PA 16802
- John R Wheatley
- Department of Anthropology, The Pennsylvania State University, University Park, PA 16802
- Lisa L M Welling
- Department of Anthropology, The Pennsylvania State University, University Park, PA 16802
- Robert P Burriss
- Department of Anthropology, The Pennsylvania State University, University Park, PA 16802
- Peter Claes
- KU Leuven, ESAT/PSI - UZ Leuven, MIRC - iMinds, Medical IT Department, Belgium
- Coren L Apicella
- Department of Psychology, University of Pennsylvania, Philadelphia, PA 19104
- Michael A McDaniel
- Department of Management, Virginia Commonwealth University, Richmond, VA 23284
- Mark D Shriver
- Department of Anthropology, The Pennsylvania State University, University Park, PA 16802
- David A Puts
- Department of Anthropology, The Pennsylvania State University, University Park, PA 16802; Center for Brain, Behavior, and Cognition, The Pennsylvania State University, University Park, PA 16802
|
25
|
Quantitative Analysis of Vocal Fold Vibration in Vocal Fold Paralysis With the Use of High-speed Digital Imaging. J Voice 2016; 30:766.e13-766.e22. [DOI: 10.1016/j.jvoice.2015.10.015] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/07/2015] [Accepted: 10/22/2015] [Indexed: 11/21/2022]
|
26
|
Yamauchi A, Yokonishi H, Imagawa H, Sakakibara KI, Nito T, Tayama N, Yamasoba T. Visualization and Estimation of Vibratory Disturbance in Vocal Fold Scar Using High-Speed Digital Imaging. J Voice 2016; 30:493-500. [DOI: 10.1016/j.jvoice.2015.07.003] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/02/2015] [Accepted: 07/08/2015] [Indexed: 11/17/2022]
|
27
|
Semmler M, Kniesburges S, Birk V, Ziethe A, Patel R, Dollinger M. 3D Reconstruction of Human Laryngeal Dynamics Based on Endoscopic High-Speed Recordings. IEEE TRANSACTIONS ON MEDICAL IMAGING 2016; 35:1615-1624. [PMID: 26829782 DOI: 10.1109/tmi.2016.2521419] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]
Abstract
Standard laryngoscopic imaging techniques provide only limited two-dimensional insights into the vocal fold vibrations not taking the vertical component into account. However, previous experiments have shown a significant vertical component in the vibration of the vocal folds. We present a 3D reconstruction of the entire superior vocal fold surface from 2D high-speed videoendoscopy via stereo triangulation. In a typical camera-laser set-up the structured laser light pattern is projected on the vocal folds and captured at 4000 fps. The measuring device is suitable for in vivo application since the external dimensions of the miniaturized set-up barely exceed the size of a standard rigid laryngoscope. We provide a conservative estimate on the resulting resolution based on the hardware components and point out the possibilities and limitations of the miniaturized camera-laser set-up. In addition to the 3D vocal fold surface, we extended previous approaches with a G2-continuous model of the vocal fold edge. The clinical applicability was successfully established by the reconstruction of visual data acquired from 2D in vivo high-speed recordings of a female and a male subject. We present extracted dynamic parameters like maximum amplitude and velocity in the vertical direction. The additional vertical component reveals deeper insights into the vibratory dynamics of the vocal folds by means of a non-invasive method. The successful miniaturization allows for in vivo application giving access to the most realistic model available and hence enables a comprehensive understanding of the human phonation process.
|
28
|
Abstract
Objectives: Kymographic imaging through videokymography has been recognized as a convenient, novel way to display laryngeal behavior, yet little systematic research has been done to map the relevant features displayed in such images. Here we have aimed at specification of these features to enable systematic visual characterization and categorization of vocal fold vibratory patterns in voice disorders. Methods: A cross-sectional, descriptive design was used. We selected 45 subjects and extracted 100 videokymographic images from the archive of more than 7,000 videokymographic examinations of subjects with a wide range of voice disorders. The images showed a large variety of vocal fold vibratory behaviors during sustained phonations. We visually identified the prominent features that distinguished the vibration patterns across the images. Results: We divided the findings into 10 feature categories. They included refined traditional features (eg, mucosal waves), as well as additional features that are obscured in strobolaryngoscopy (eg, different types of irregularities, left-right frequency differences, shapes of lateral and medial peaks, cycle aberrations). Conclusions: The variations in the identified features reveal different behavioral origins of voice disorders. The findings open new possibilities for objective documentation and for monitoring vocal fold behavior in clinical practice through kymographic imaging.
Affiliation(s)
- Jan G Svec
- Groningen Voice Research Laboratory, Dept of Biomedical Engineering, University Medical Center Groningen, University of Groningen, Antonius Deusinglaan 1, NL 9713 AV Groningen, the Netherlands
|
29
|
Efremova KO, Frey R, Volodin IA, Fritsch G, Soldatova NV, Volodina EV. The postnatal ontogeny of the sexually dimorphic vocal apparatus in goitred gazelles (Gazella subgutturosa). J Morphol 2016; 277:826-44. [PMID: 26997608 DOI: 10.1002/jmor.20538] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/11/2015] [Revised: 02/24/2016] [Accepted: 02/28/2016] [Indexed: 11/11/2022]
Abstract
This study quantitatively documents the progressive development of sexual dimorphism of the vocal organs along the ontogeny of the goitred gazelle (Gazella subgutturosa). The major male-specific secondary sexual features of vocal anatomy in the goitred gazelle are an enlarged larynx and a marked laryngeal descent. These features appear to have evolved by sexual selection and may serve as a model for similar events in male humans. Sexual dimorphism of larynx size and larynx position in adult goitred gazelles is more pronounced than in humans, whereas the vocal anatomy of neonate goitred gazelles does not differ between sexes. This study examines the vocal anatomy of 19 (11 male, 8 female) goitred gazelle specimens across three age-classes, that is, neonates, subadults and mature adults. The postnatal ontogenetic development of the vocal organs up to their respective end states takes considerably longer in males than in females. Both sexes share the same features of vocal morphology but differences emerge in the course of ontogeny, ultimately resulting in the pronounced sexual dimorphism of the vocal apparatus in adults. The main differences comprise larynx size, vocal fold length, vocal tract length, and mobility of the larynx. The resilience of the thyrohyoid ligament and the pharynx, including the soft palate, and the length changes during contraction and relaxation of the extrinsic laryngeal muscles play a decisive role in the mobility of the larynx in both sexes but to substantially different degrees in adult females and males. Goitred gazelles are born with an undescended larynx and, therefore, larynx descent has to develop in the course of ontogeny. This might result from a trade-off between natural selection and sexual selection requiring a temporal separation of different laryngeal functions at birth and shortly after from those later in life. J. Morphol. 277:826-844, 2016. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Kseniya O Efremova
- Department of General Biology, Medicobiological Faculty, Pirogov Russian National Research Medical University (RNRMU), Moscow, Russia
- Roland Frey
- Department of Reproduction Management, Leibniz Institute for Zoo and Wildlife Research (IZW), Berlin, Germany
- Ilya A Volodin
- Department of Vertebrate Zoology, Faculty of Biology, Lomonosov Moscow State University, Moscow, Russia; Scientific Research Department, Moscow Zoo, Moscow, Russia
- Guido Fritsch
- Department of Reproduction Management, Leibniz Institute for Zoo and Wildlife Research (IZW), Berlin, Germany
|
30
|
Unger J, Schuster M, Hecker DJ, Schick B, Lohscheller J. A generalized procedure for analyzing sustained and dynamic vocal fold vibrations from laryngeal high-speed videos using phonovibrograms. Artif Intell Med 2015; 66:15-28. [PMID: 26597002 DOI: 10.1016/j.artmed.2015.10.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2015] [Revised: 09/28/2015] [Accepted: 10/20/2015] [Indexed: 12/01/2022]
Abstract
OBJECTIVE This work presents a computer-based approach to analyze the two-dimensional vocal fold dynamics of endoscopic high-speed videos, and constitutes an extension and generalization of a previously proposed wavelet-based procedure. While most approaches aim at analyzing sustained phonation conditions, the proposed method allows for a clinically adequate analysis of both dynamic and sustained phonation paradigms. MATERIALS AND METHODS The analysis procedure is based on a spatio-temporal visualization technique, the phonovibrogram, that facilitates the documentation of the visible laryngeal dynamics. From the phonovibrogram, a low-dimensional set of features is computed using a principal component analysis strategy that quantifies the type of vibration patterns, irregularity, lateral symmetry and synchronicity, as a function of time. Two different test bench data sets are used to validate the approach: (I) 150 healthy and pathologic subjects examined during sustained phonation. (II) 20 healthy and pathologic subjects that were examined twice: during sustained phonation and a glissando from a low to a higher fundamental frequency. In order to assess the discriminative power of the extracted features, a Support Vector Machine is trained to distinguish between physiologic and pathologic vibrations. The results for sustained phonation sequences are compared to the previous approach. Finally, the classification performance of the stationary analyzing procedure is compared to the transient analysis of the glissando maneuver. RESULTS For the first test bench the proposed procedure outperformed the previous approach (proposed feature set: accuracy: 91.3%, sensitivity: 80%, specificity: 97%, previous approach: accuracy: 89.3%, sensitivity: 76%, specificity: 96%). Comparing the classification performance of the second test bench further corroborates that analyzing transient paradigms provides clear additional diagnostic value (glissando maneuver: accuracy: 90%, sensitivity: 100%, specificity: 80%, sustained phonation: accuracy: 75%, sensitivity: 80%, specificity: 70%). CONCLUSIONS The incorporation of parameters describing the temporal evolvement of vocal fold vibration clearly improves the automatic identification of pathologic vibration patterns. Furthermore, incorporating a dynamic phonation paradigm provides additional valuable information about the underlying laryngeal dynamics that cannot be derived from sustained conditions. The proposed generalized approach provides a better overall classification performance than the previous approach, and hence constitutes a new advantageous tool for an improved clinical diagnosis of voice disorders.
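As a worked illustration of how the accuracy, sensitivity and specificity figures relate, the sketch below computes all three from confusion-matrix counts. The counts are an assumption: the abstract does not give the healthy/pathologic split of the 20 subjects in the second test bench, but 10 pathologic and 10 healthy subjects reproduce the reported percentages if all 20 are scored. The function name is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # share of pathologic cases correctly flagged
    specificity = tn / (tn + fp)  # share of healthy cases correctly cleared
    return accuracy, sensitivity, specificity


# Assumed counts: 10 pathologic and 10 healthy subjects (not stated in the abstract)
glissando = classification_metrics(tp=10, fn=0, tn=8, fp=2)
sustained = classification_metrics(tp=8, fn=2, tn=7, fp=3)
print(glissando)  # (0.9, 1.0, 0.8)
print(sustained)  # (0.75, 0.8, 0.7)
```

Under these assumed counts, the glissando analysis misses no pathologic subject but misclassifies two healthy ones, while the sustained analysis makes errors in both groups.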
Affiliation(s)
- Jakob Unger
- Department of Computer Science, Trier University of Applied Sciences, Schneidershof, 54293 Trier, Germany
- Maria Schuster
- Department of Otorhinolaryngology and Head and Neck Surgery, University of Munich, Campus Grosshadern, Marchioninistr. 13, 81366 München, Germany
- Dietmar J Hecker
- Department of Otorhinolaryngology, Saarland University Hospital, Kirrbergerstr., 66424 Homburg/Saar, Germany
- Bernhard Schick
- Department of Otorhinolaryngology, Saarland University Hospital, Kirrbergerstr., 66424 Homburg/Saar, Germany
- Jörg Lohscheller
- Department of Computer Science, Trier University of Applied Sciences, Schneidershof, 54293 Trier, Germany
|
31
|
Lucero JC, Schoentgen J, Haas J, Luizard P, Pelorson X. Self-entrainment of the right and left vocal fold oscillators. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2015; 137:2036-46. [PMID: 25920854 DOI: 10.1121/1.4916601] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/09/2023]
Abstract
This article presents an analysis of entrained oscillations of the right and left vocal folds in the presence of asymmetries. A simple one-mass model is proposed for each vocal fold. A stiffness asymmetry and open glottis oscillations are considered first, and regions of oscillation are determined by a stability analysis and an averaging technique. The results show that the subglottal threshold pressure for 1:1 entrainment increases with the asymmetry. Within that region, both folds oscillate with the same amplitude and with the lax fold delayed in time with regard to the tense fold. At large asymmetries, a region involving several different phase entrainments or toroidal regimes at constant threshold pressure appears. The effect of vocal fold collisions and asymmetry in the damping coefficients of the oscillators are explored next by means of numerical analyses. It is shown that the damping asymmetry expands the 1:1 entrainment region at low subglottal pressures across the whole asymmetry range. In the expanded region, the oscillator with the lowest natural frequency is dominant and the other oscillator has a large phase advance and small amplitude. The theoretical results are finally compared with data collected from a mechanical replica of the vocal folds.
Affiliation(s)
- Jorge C Lucero
- Department of Computer Science, University of Brasilia, Brasilia, Federal District, 70910-900, Brazil
- Jean Schoentgen
- Laboratories of Image, Signal Processing and Acoustics, Université Libre de Bruxelles, Faculty of Applied Sciences 50, Avenue Franklin D. Roosevelt, B-1050, Brussels, Belgium
- Jessy Haas
- Grenoble Images Parole Signal Automatique, Unité Mixte de Recherche 5216, Centre National de la Recherche Scientifique, Grenoble Universities, 961 rue de la Houille Blanche, BP 46, 38402 Saint-Martin d'Heres, France
- Paul Luizard
- Grenoble Images Parole Signal Automatique, Unité Mixte de Recherche 5216, Centre National de la Recherche Scientifique, Grenoble Universities, 961 rue de la Houille Blanche, BP 46, 38402 Saint-Martin d'Heres, France
- Xavier Pelorson
- Grenoble Images Parole Signal Automatique, Unité Mixte de Recherche 5216, Centre National de la Recherche Scientifique, Grenoble Universities, 961 rue de la Houille Blanche, BP 46, 38402 Saint-Martin d'Heres, France
|
32
|
Unger J, Lohscheller J, Reiter M, Eder K, Betz CS, Schuster M. A Noninvasive Procedure for Early-Stage Discrimination of Malignant and Precancerous Vocal Fold Lesions Based on Laryngeal Dynamics Analysis. Cancer Res 2014; 75:31-9. [DOI: 10.1158/0008-5472.can-14-1458] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
|
33
|
Abstract
The laryngeal video stroboscope is an important instrument for examining glottal diseases and for reading vocal fold images and voice quality in clinical diagnosis. This study aimed to develop a medical system capable of automatic intelligent recognition of dynamic images. The static images of the glottis opening to the largest extent and closing to the smallest extent were screened automatically using color space transformation and image preprocessing. The glottal area was also quantized. Because tongue base movements affected the position of the laryngoscope and saliva resulted in unclear images, this study used the gray scale adaptive entropy value to set the threshold in order to establish an elimination system. The proposed system can improve the automatic capture of glottis images and achieves an accuracy rate of 96%. In addition, the glottal area and area segmentation threshold were calculated effectively. The glottis area segmentation was corrected, and the glottal area waveform pattern was drawn automatically to assist in vocal fold diagnosis. When developing the intelligent recognition system for vocal fold disorders, this study analyzed the characteristic values of four vocal fold patterns, namely, normal vocal fold, vocal fold paralysis, vocal fold polyp, and vocal fold cyst. It also used the support vector machine classifier to identify vocal fold disorders and achieved an identification accuracy rate of 98.75%. The results can serve as a very valuable reference for diagnosis.
|
34
|
Chhetri DK, Neubauer J, Sofer E. Influence of asymmetric recurrent laryngeal nerve stimulation on vibration, acoustics, and aerodynamics. Laryngoscope 2014; 124:2544-50. [PMID: 24913182 DOI: 10.1002/lary.24774] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/31/2013] [Revised: 04/22/2014] [Accepted: 05/20/2014] [Indexed: 11/06/2022]
Abstract
OBJECTIVES/HYPOTHESIS Evaluate the influence of asymmetric recurrent laryngeal nerve (RLN) stimulation on the vibratory phase, acoustics and aerodynamics of phonation. STUDY DESIGN Basic science study using an in vivo canine model. METHODS The RLNs were symmetrically and asymmetrically stimulated over eight graded levels to test a range of vocal fold activation conditions from subtle paresis to paralysis. Vibratory phase, fundamental frequency (F0), subglottal pressure, and airflow were noted at phonation onset. The evaluations were repeated for three levels of symmetric superior laryngeal nerve (SLN) stimulation. RESULTS Asymmetric laryngeal adductor activation from asymmetric left-right RLN stimulation led to a consistent pattern of vibratory phase asymmetry, with the more activated vocal fold leading in the opening phase of the glottal cycle and in mucosal wave amplitude. Vibratory amplitude asymmetry was also observed, with more lateral excursion of the glottis of the less activated side. Onset fundamental frequency was higher with asymmetric activation because the two RLNs were synergistic in decreasing F0, glottal width, and strain. Phonation onset pressure increased and airflow decreased with symmetric RLN activation. CONCLUSION Asymmetric laryngeal activation from RLN paresis and paralysis has consistent effects on vocal fold vibration, acoustics, and aerodynamics. This information may be useful in diagnosis and management of vocal fold paresis. LEVEL OF EVIDENCE N/A.
Affiliation(s)
- Dinesh K Chhetri
- Laryngeal Physiology Laboratory, Department of Head and Neck Surgery, UCLA School of Medicine, Los Angeles, California, U.S.A
|
35
|
Unger J, Hecker DJ, Kunduk M, Schuster M, Schick B, Lohscheller J. Quantifying spatiotemporal properties of vocal fold dynamics based on a multiscale analysis of phonovibrograms. IEEE Trans Biomed Eng 2014; 61:2422-33. [PMID: 24771562 DOI: 10.1109/tbme.2014.2318774] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
In order to objectively assess the laryngeal vibratory behavior, endoscopic high-speed cameras capture several thousand frames per second of the vocal folds during phonation. However, judging all inherent clinically relevant features is a challenging task and requires well-founded expert knowledge. In this study, an automated wavelet-based analysis of laryngeal high-speed videos based on phonovibrograms is presented. The phonovibrogram is an image representation of the spatiotemporal pattern of vocal fold vibration and constitutes the basis for a computer-based analysis of laryngeal dynamics. The features extracted from the wavelet transform are shown to be closely related to a basic set of video-based measurements categorized by the European Laryngological Society for a subjective assessment of pathologic voices. The wavelet-based analysis further offers information about irregularity and lateral asymmetry and asynchrony. It is demonstrated in healthy and pathologic subjects as well as for a surgical group that was examined before and after the removal of a vocal fold polyp. The features were found to not only classify glottal closure characteristics but also quantify the impact of pathologies on the vibratory behavior. The interpretability and the discriminative power of the proposed feature set show promising relevance for a computer-assisted diagnosis and classification of voice disorders.
|
36
|
Moisik SR, Esling JH. Modeling the biomechanical influence of epilaryngeal stricture on the vocal folds: a low-dimensional model of vocal-ventricular fold coupling. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2014; 57:S687-S704. [PMID: 24687007 DOI: 10.1044/2014_jslhr-s-12-0279] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/03/2023]
Abstract
PURPOSE Physiological and phonetic studies suggest that, at moderate levels of epilaryngeal stricture, the ventricular folds impinge upon the vocal folds and influence their dynamical behavior, which is thought to be responsible for constricted laryngeal sounds. In this work, the authors examine this hypothesis through biomechanical modeling. METHOD The dynamical response of a low-dimensional, lumped-element model of the vocal folds under the influence of vocal-ventricular fold coupling was evaluated. The model was assessed for F0 and cover-mass phase difference. Case studies of simulations of different constricted phonation types and of glottal stop illustrate various additional aspects of model performance. RESULTS Simulated vocal-ventricular fold coupling lowers F0 and perturbs the mucosal wave. It also appears to reinforce irregular patterns of oscillation, and it can enhance laryngeal closure in glottal stop production. CONCLUSION The effects of simulated vocal-ventricular fold coupling are consistent with sounds, such as creaky voice, harsh voice, and glottal stop, that have been observed to involve epilaryngeal stricture and apparent contact between the vocal folds and ventricular folds. This supports the view that vocal-ventricular fold coupling is important in the vibratory dynamics of such sounds and, furthermore, suggests that these sounds may intrinsically require epilaryngeal stricture.
|
37
|
Kuo CFJ, Wang HW, Hsiao SW, Peng KC, Chou YL, Lai CY, Hsu CTM. Development of laryngeal video stroboscope with laser marking module for dynamic glottis measurement. Comput Med Imaging Graph 2014; 38:34-41. [DOI: 10.1016/j.compmedimag.2013.10.004] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2013] [Revised: 09/05/2013] [Accepted: 10/16/2013] [Indexed: 10/26/2022]
|
38
|
Kuo CFJ, Chu YH, Wang PC, Lai CY, Chu WL, Leu YS, Wang HW. Using image processing technology and mathematical algorithm in the automatic selection of vocal cord opening and closing images from the larynx endoscopy video. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2013; 112:455-465. [PMID: 24070546 DOI: 10.1016/j.cmpb.2013.08.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/23/2012] [Revised: 08/06/2013] [Accepted: 08/08/2013] [Indexed: 06/02/2023]
Abstract
The human larynx is an important organ for voice production and respiratory mechanisms. The vocal cords are approximated for voice production and open for breathing. The videolaryngoscope is widely used for vocal cord examination. At present, physicians usually diagnose vocal cord diseases by manually selecting the image of the vocal cord opening to the largest extent (abduction), thus maximally exposing the vocal cord lesion. On the other hand, the severity of diseases such as vocal palsy and atrophic vocal cord is largely dependent on the vocal cord closing to the smallest extent (adduction). Therefore, diseases can be assessed by the image of the vocal cord opening to the largest extent, and the seriousness of breathy voice is closely correlated to the gap between the vocal cords when closing to the smallest extent. The aim of the study was to design an automatic vocal cord image selection system to improve the conventional selection process by physicians and enhance diagnosis efficiency. In addition, because the examination process yields unwanted blurred images caused by human factors as well as non-vocal cord images, texture analysis is added in this study to measure image entropy and establish a screening and elimination system that effectively enhances the accuracy of selecting the image of the vocal cord closing to the smallest extent.
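The entropy-based screening mentioned above can be sketched with the Shannon entropy of a frame's gray-level histogram. This is only an illustration of the general measure: the abstract does not give the exact texture statistic, the threshold value, or whether low or high entropy marks an unusable frame, so the screening rule and all names below are assumptions.

```python
import math
from collections import Counter


def gray_level_entropy(pixels):
    """Shannon entropy, in bits, of the gray-level histogram of a flat pixel list."""
    n = len(pixels)
    counts = Counter(pixels)
    # Sum p * log2(1/p) over the gray levels that actually occur
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def passes_screen(pixels, min_entropy):
    """Hypothetical rule: keep a frame whose entropy reaches min_entropy."""
    return gray_level_entropy(pixels) >= min_entropy


flat_frame = [128] * 16         # a single gray level
two_level_frame = [0, 255] * 8  # two equally frequent gray levels
print(gray_level_entropy(flat_frame))       # 0.0
print(gray_level_entropy(two_level_frame))  # 1.0
```

A frame with no gray-level variation carries zero entropy, which is why a histogram entropy can serve as a cheap proxy for how much texture a frame contains.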
Affiliation(s)
- Chung-Feng Jeffrey Kuo
- Graduate Institute of Automation and Control, National Taiwan University of Science and Technology, No. 43, Keelung Road, Sec. 4, Taipei 106, Taiwan, ROC.
|
39
|
Chhetri DK, Neubauer J, Bergeron JL, Sofer E, Peng KA, Jamal N. Effects of asymmetric superior laryngeal nerve stimulation on glottic posture, acoustics, vibration. Laryngoscope 2013; 123:3110-6. [PMID: 23712542 DOI: 10.1002/lary.24209] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/22/2013] [Revised: 04/26/2013] [Accepted: 04/26/2013] [Indexed: 11/12/2022]
Abstract
OBJECTIVES/HYPOTHESIS Evaluate the effects of asymmetric superior laryngeal nerve stimulation on the vibratory phase, laryngeal posture, and acoustics. STUDY DESIGN Basic science study using an in vivo canine model. METHODS The superior laryngeal nerves were symmetrically and asymmetrically stimulated over eight activation levels to mimic laryngeal asymmetries representing various levels of superior laryngeal nerve paresis and paralysis conditions. Glottal posture change, vocal fold speed, and vibration of these 64 distinct laryngeal-activation conditions were evaluated by high speed video and concurrent acoustic and aerodynamic recordings. Assessments were made at phonation onset. RESULTS Vibratory phase was symmetric in all symmetric activation conditions, but consistent phase asymmetry toward the vocal fold with higher superior laryngeal nerve activation was observed. Superior laryngeal nerve paresis and paralysis conditions had reduced vocal fold strain and fundamental frequency. Superior laryngeal nerve activation increased vocal fold closure speed, but this effect was more pronounced for the ipsilateral vocal fold. Increasing asymmetry led to aperiodic and chaotic vibration. CONCLUSIONS This study directly links vocal fold tension asymmetry with vibratory phase asymmetry; in particular, the side with greater tension leads in the opening phase. The clinical observations of vocal fold lag, reduced vocal range, and aperiodic voice in superior laryngeal nerve paresis and paralysis are also supported.
Affiliation(s)
- Dinesh K Chhetri
- Department of Head and Neck Surgery, Laryngeal Physiology Laboratory, CHS 62-132, Los Angeles, California, U.S.A
|
40
|
Zorrilla AM, Zapirain BG, Izquierdo AP. Computer aided tool for diagnosis of ENT pathologies using digital signal processing of speech and stroboscopic images. SPRINGERPLUS 2013; 1:64. [PMID: 23483585 PMCID: PMC3586405 DOI: 10.1186/2193-1801-1-64] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/01/2012] [Accepted: 12/08/2012] [Indexed: 11/19/2022]
Abstract
The development of computer software and other technologies greatly facilitates the evaluation of pathological voice patients. It reduces exploration time, improves the reproducibility of results and makes possible the test protocol standardization needed for communication between different voice specialists. The proposed application encompasses the most important aspects which should be taken into account regarding dysphonic patients. It has a multidimensional scope which involves subjective questionnaires and perceptual, aerodynamic, acoustic and stroboscopic evaluations. In this system, the authors have designed and created simple tools for recording and automatic acoustic analysis, and for the acquisition and editing of stroboscopic images. The purpose is to work with all necessary tools running on a single application, without having to export and import data from other computer programs. Therefore, the objective is to synthesize the basic voice assessment and the exploration of the vocal folds, simplifying it through the design of a program which helps to analyze each aspect of the vocal pathology step by step. The tool was evaluated by otolaryngologists through periodic medical appointments with 25 patients over one year, and the results are promising both for the professionals and for the patients, who receive a detailed report with objective information concerning the features of their voice and vocal cords.
Affiliation(s)
- Amaia Méndez Zorrilla
- DeustoTech-LIFE Unit, DeustoTech Institute of Technology, University of Deusto 24, Bilbao, 48007 Spain
|
41
|
Lohscheller J, Svec JG, Döllinger M. Vocal fold vibration amplitude, open quotient, speed quotient and their variability along glottal length: kymographic data from normal subjects. LOGOP PHONIATR VOCO 2012; 38:182-92. [PMID: 23173880 DOI: 10.3109/14015439.2012.731083] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
Abstract
Quantitative knowledge about healthy vocal fold vibration characteristics provides the basis for an objective assessment of vocal fold vibrations. In this study, using high-speed videolaryngoscopy, the alterations of the relative vibration amplitudes, open quotients, and speed quotients were analyzed along the glottal length in 30 male and 30 female healthy subjects. The maximum vibration amplitude was identified at 41.1% ± 10.8% and 46.5% ± 18.0% of the visible glottal length in females and males, respectively. The average open quotients decreased in females and males from posterior to anterior, while the speed quotients did not change systematically. The reported normative values can be used to distinguish normal and abnormal vibrations in clinical practice when aiming at quantitative diagnosis of functional voice disorders.
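For readers unfamiliar with the two quotients, the sketch below uses their usual definitions: open quotient as open-phase duration over cycle period, and speed quotient as opening-phase duration over closing-phase duration. It works on one vibratory cycle of a glottal width trace, such as one kymographic line. The function name, the closed-level threshold, and the sample values are assumptions for illustration and are not the authors' procedure.

```python
def open_and_speed_quotient(width, closed_level=0.0):
    """Return (open quotient, speed quotient) for one vibratory cycle.

    width: glottal width samples covering exactly one cycle that starts
    and ends with closed samples (width <= closed_level).
    """
    open_idx = [i for i, w in enumerate(width) if w > closed_level]
    if not open_idx or open_idx[0] == 0 or open_idx[-1] == len(width) - 1:
        raise ValueError("cycle must start and end with closed samples")
    first, last = open_idx[0], open_idx[-1]
    peak = max(range(first, last + 1), key=lambda i: width[i])
    opening = peak - (first - 1)  # intervals from last closed sample to peak
    closing = (last + 1) - peak   # intervals from peak to next closed sample
    open_quotient = (opening + closing) / len(width)
    speed_quotient = opening / closing
    return open_quotient, speed_quotient


# Made-up cycle: slow opening, fast closing, then a closed phase
cycle = [0, 1, 2, 3, 4, 2, 0, 0, 0, 0]
print(open_and_speed_quotient(cycle))  # (0.6, 2.0)
```

Applying the same function to traces taken at several positions along the glottal length would give the kind of position-dependent profile the abstract describes.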
Affiliation(s)
- Jörg Lohscheller
- Department of Computer Science, University of Applied Sciences Trier , Germany
|
42
|
Elidan G, Elidan J. Vocal Folds Analysis Using Global Energy Tracking. J Voice 2012; 26:760-8. [DOI: 10.1016/j.jvoice.2011.07.010] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 05/02/2011] [Accepted: 07/18/2011] [Indexed: 10/14/2022]
|
43
|
Analysis of longitudinal phase differences in vocal-fold vibration using synchronous high-speed videoendoscopy and electroglottography. J Voice 2012; 26:816.e13-20. [PMID: 23059188 DOI: 10.1016/j.jvoice.2012.04.009] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Received: 12/16/2011] [Accepted: 04/26/2012] [Indexed: 11/23/2022]
Abstract
OBJECTIVE This investigation used synchronous high-speed videoendoscopy and electroglottography (EGG) to systematically study contact and separation behavior along the length of the vocal folds. DESIGN Repeated measures. METHODS Facilitated by EGG and digital kymograms derived at 20%, 35%, 50%, 65%, and 80% of the posteroanterior length of the vocal folds, the pattern of vocal-fold contact and separation was determined for seven female and seven male vocally healthy subjects while producing "breathy," "comfortable," and "pressed" phonations. RESULTS The female subjects consistently used an anterior-to-posterior contact pattern and posterior-to-anterior separation pattern when producing a breathy or comfortable voice, with several using a simultaneous pattern of contact and/or separation for pressed phonation. The male subjects showed more variable "zipperlike" separation patterns, but consistently used a simultaneous contact pattern for pressed voice that was also commonly used when producing comfortable phonation. CONCLUSIONS Findings indicate longitudinal phase differences in vocal-fold vibration are both common and expected in vocally healthy speakers. The implications for vocal assessment, as well as for the use and interpretation of the EGG signal, are discussed.
|
44
|
Abstract
PURPOSE OF REVIEW This review presents recent advances in high-speed digital imaging (HSDI) of the larynx including data acquisition, data analysis, and clinical applicability. RECENT FINDINGS Software designed to summarize the large amounts of data captured with HSDI makes it possible to quantitatively analyze recordings from patients, improving the accuracy of the methodology. The new software has been used in studies of normal individuals, increasing our knowledge of normal vocal fold vibratory behavior. HSDI has also been used in patient populations and shows promise in distinguishing various laryngeal conditions that are difficult to distinguish with other imaging modalities. Studies of postoperative patients with HSDI demonstrate the return of some vibratory characteristics but not others, potentially leading the way to improvements in surgical technique. SUMMARY Recent advances in HSDI technology have increased the clinical usefulness of the imaging technology and recent studies demonstrate the clinical applicability of HSDI. However, challenges to widespread clinical use of HSDI remain.
|
45
|
Bonilha HS, Deliyski DD, Whiteside JP, Gerlach TT. Vocal fold phase asymmetries in patients with voice disorders: a study across visualization techniques. AMERICAN JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2012; 21:3-15. [PMID: 22049403 PMCID: PMC7587608 DOI: 10.1044/1058-0360(2011/09-0086)] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Indexed: 05/24/2023]
Abstract
PURPOSE To examine differences in vocal fold vibratory phase asymmetry judged from stroboscopy, high-speed videoendoscopy (HSV), and the HSV-derived playbacks of mucosal wave kymography, digital kymography, and a static medial digital kymography image of persons with hypofunctional and hyperfunctional voice disorders. Differences between the methods of visual judgments and objective measures of left-right phase asymmetry were assessed. The findings were compared with those from a previous study with vocally normal speakers. METHOD Forty-nine persons with voice disorders underwent stroboscopy and HSV. The HSV images were processed, resulting in 4 different spatial or kymographic displays. Two types of phase asymmetries, left-right and anterior-posterior, were visually rated. Objective measures of left-right phase asymmetry were obtained. RESULTS From stroboscopy, the HSV playback, and the HSV-derived playbacks, left-right phase symmetry was judged to be symmetrical in 41%, 32%, and 19% of cases, respectively. This difference in playbacks was not seen for anterior-posterior asymmetry. Correlation between visual judgments and objective measures was mild for stroboscopy and moderate to high for all HSV-based playbacks. CONCLUSIONS The use of kymography appears important for judgments of phase asymmetry. Stroboscopy appears to be sensitive, but possibly not specific, to phase asymmetries. Further development of objective measures is warranted for this feature.
|
46
|
Physical simulation of laryngeal disorders using a multiple-mass vocal fold model. Biomed Signal Process Control 2012. [DOI: 10.1016/j.bspc.2011.04.002] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/22/2022]
|
47
|
Chodara AM, Krausert CR, Jiang JJ. Kymographic characterization of vibration in human vocal folds with nodules and polyps. Laryngoscope 2011; 122:58-65. [PMID: 21898450 DOI: 10.1002/lary.22324] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Received: 06/06/2011] [Accepted: 07/22/2011] [Indexed: 11/10/2022]
Abstract
OBJECTIVES/HYPOTHESIS Digital kymography (DKG) can provide objective quantitative data about vocal fold vibration, which may help distinguish normal from pathological vocal folds as well as nodules from polyps. STUDY DESIGN Case-control study. METHODS There were 87 subjects who were separated into three groups: control, nodules, and unilateral polyps, and examined using a high-speed camera attached to an endoscope. Videos were analyzed using a custom MATLAB program, and three DKG line-scan positions (25%, 50%, and 75% of vocal fold length) were used in statistical analyses to compare vocal fold vibrational frequency, amplitude symmetry index (ASI), amplitude order, and vertical and lateral phase difference (VPD and LPD, respectively). RESULTS Significant differences among groups were found in all vibrational parameters except frequency. Polyps and nodules groups exhibited greater ASI values (less amplitude symmetry) than the control group. Although the control group consistently showed its largest amplitudes at the midline, the polyps group showed larger amplitudes toward the posterior end of the vocal folds. A significant anterior-posterior pattern in amplitude was not found in the nodules group. LPD values were usually largest (most symmetrical) in the control group, followed by nodules and polyps. LPD at the 25% position allowed for differentiation between polyp and nodule groups. The largest VPD (more pronounced mucosal wave) values were usually found in the control group. CONCLUSIONS Vibratory characteristics of normal and pathological vocal folds were quantitatively examined and compared using multiline DKG. These findings may allow for better characterization of pathologies and eventually assist in improving the clinical utility of DKG.
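The abstract reports that larger ASI values mean less amplitude symmetry but does not give the formula. The sketch below therefore uses an assumed normalized-difference definition, |A_L − A_R| / (A_L + A_R), where each amplitude is the peak-to-peak excursion of one fold edge along a single kymographic line. Both the formula and the function names are illustrative assumptions and may differ from the study's MATLAB program.

```python
from typing import Sequence


def peak_to_peak(edge_positions: Sequence[float]) -> float:
    """Peak-to-peak lateral excursion of one vocal fold edge (e.g. in pixels)."""
    return max(edge_positions) - min(edge_positions)


def amplitude_symmetry_index(
    left_edge: Sequence[float], right_edge: Sequence[float]
) -> float:
    """ASSUMED definition: |A_L - A_R| / (A_L + A_R).

    Returns 0.0 for equal left and right amplitudes and approaches 1.0
    as one fold stops moving; the paper's exact formula is not stated
    in the abstract.
    """
    a_left = peak_to_peak(left_edge)
    a_right = peak_to_peak(right_edge)
    total = a_left + a_right
    if total == 0:
        raise ValueError("neither fold edge moves; index is undefined")
    return abs(a_left - a_right) / total


# Hypothetical edge traces taken from one line-scan position.
symmetric = amplitude_symmetry_index([0, 2, 4, 2, 0], [0, 2, 4, 2, 0])
asymmetric = amplitude_symmetry_index([0, 2, 4, 2, 0], [0, 1, 2, 1, 0])
print(symmetric, asymmetric)
```

Under this assumed definition the index is unitless, so values from the 25%, 50%, and 75% line-scan positions can be compared directly across subjects.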
Affiliation(s)
- Ann M Chodara
- Department of Surgery, Division of Otolaryngology-Head and Neck Surgery, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin 53706, USA
|
48
|
Kniesburges S, Thomson SL, Barney A, Triep M, Sidlof P, Horáček J, Brücker C, Becker S. In vitro experimental investigation of voice production. Curr Bioinform 2011. [PMID: 23181007 DOI: 10.2174/157489311796904637] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Indexed: 11/22/2022]
Abstract
The process of human phonation involves a complex interaction between the physical domains of structural dynamics, fluid flow, and acoustic sound production and radiation. Given the high degree of nonlinearity of these processes, even small anatomical or physiological disturbances can significantly affect the voice signal. In the worst cases, patients can lose their voice and hence the normal mode of speech communication. To improve medical therapies and surgical techniques it is very important to understand better the physics of the human phonation process. Due to the limited experimental access to the human larynx, alternative strategies, including artificial vocal folds, have been developed. The following review gives an overview of experimental investigations of artificial vocal folds within the last 30 years. The models are sorted into three groups: static models, externally driven models, and self-oscillating models. The focus is on the different models of the human vocal folds and on the ways in which they have been applied.
Affiliation(s)
- Stefan Kniesburges
- Institute of Process Machinery and Systems Engineering, University Erlangen-Nuremberg, Cauerstr. 4, 91058 Erlangen, Germany
|
49
|
Krausert CR, Ying D, Zhang Y, Jiang JJ. Quantitative study of vibrational symmetry of injured vocal folds via digital kymography in excised canine larynges. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2011; 54:1022-1038. [PMID: 21173386 PMCID: PMC3187921 DOI: 10.1044/1092-4388(2010/10-0105)] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 05/30/2023]
Abstract
PURPOSE Digital kymography and vocal fold curve fitting are blended with detailed symmetry analysis of kymograms to provide a comprehensive characterization of the vibratory properties of injured vocal folds. METHOD Vocal fold vibration of 12 excised canine larynges was recorded under uninjured, unilaterally injured, and bilaterally injured conditions. Kymograms were created at 25%, 50%, and 75% of the vocal fold length, and vibratory parameters were compared quantitatively among conditions and were studied with respect to right-left and anterior-posterior symmetries. RESULTS Anterior-posterior amplitude asymmetry was found in the bilateral condition. The unilateral condition showed significant right-left amplitude asymmetry, and it showed the lowest right-left phase symmetry among the conditions. In condition comparisons, vertical phase difference did not show significant differences among conditions, whereas amplitudes were significantly different among conditions at all line scan positions and most vocal fold lips. Significant differences in frequency were found among the conditions at all 4 vocal fold lips, with the bilateral condition exhibiting the greatest frequency. CONCLUSION Digital kymography and curve fitting provide detailed information about the vibratory behavior of injured vocal folds. Awareness of vibratory properties associated with vocal fold injury may aid in diagnosis, and the quantitative abilities of digital kymography may allow for objective treatment selection.
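To make the line-scan idea concrete, the sketch below shows one way a digital kymogram could be assembled from a high-speed recording: choose the image row lying at a given fraction of the vocal fold length, then stack that row from every frame so that time runs down the image. The helper names, the assumption that the glottal axis is aligned with the image rows, and the choice of which end counts as 0% are all illustrative; the abstract does not describe the actual processing.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]  # one grayscale image: rows of pixel values


def scan_row(start_row: int, end_row: int, fraction: float) -> int:
    """Image row at `fraction` (0..1) of the fold length.

    `start_row` and `end_row` are hypothetical row indices of the two
    ends of the vocal folds in the image.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return round(start_row + fraction * (end_row - start_row))


def kymogram(frames: Sequence[Frame], row: int) -> List[List[int]]:
    """Stack the same row from every frame: one output line per time step."""
    return [list(frame[row]) for frame in frames]


# Tiny synthetic recording: 3 frames, each 5 rows by 3 columns.
frames = [[[t * 10 + r] * 3 for r in range(5)] for t in range(3)]
rows = [scan_row(0, 4, f) for f in (0.25, 0.50, 0.75)]
mid_kymogram = kymogram(frames, rows[1])
print(rows, mid_kymogram)
```

Each resulting kymogram can then be passed to edge detection or curve fitting to obtain per-fold amplitude, phase, and frequency at that position.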
Affiliation(s)
- Christopher R. Krausert
- Department of Surgery, Division of Otolaryngology - Head and Neck Surgery, University of Wisconsin School of Medicine and Public Health, Madison, WI 53792-7375
- Di Ying
- Department of Surgery, Division of Otolaryngology - Head and Neck Surgery, University of Wisconsin School of Medicine and Public Health, Madison, WI 53792-7375
- Yu Zhang
- Key Laboratory of Underwater Acoustic Communication and Marine Information Technology of the Ministry of Education, Xiamen University, Xiamen Fujian 361005, China
- Jack J. Jiang
- Department of Surgery, Division of Otolaryngology - Head and Neck Surgery, University of Wisconsin School of Medicine and Public Health, Madison, WI 53792-7375
|
50
|
Dollinger M, Berry DA, Huttner B, Bohr C. Assessment of local vocal fold deformation characteristics in an in vitro static tensile test. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2011; 130:977-985. [PMID: 21877810 PMCID: PMC3190661 DOI: 10.1121/1.3605671] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 05/12/2010] [Revised: 06/07/2011] [Accepted: 06/08/2011] [Indexed: 05/31/2023]
Abstract
Voice quality is strongly dependent on vocal fold dynamics, which in turn are dependent on lung pressure and vocal fold biomechanics. Numerical and physical models are often used to investigate the interactions of these different subsystems. However, the utility of numerical and physical models is limited unless appropriately validated with data from physiological models. Hence a method that enables analysis of local vocal fold deformations along the entire surface is presented. In static tensile tests, forces are applied to distinctive working points being located in cover and muscle, respectively, so that specific layer properties can be investigated. The forces are directed vertically upward and are applied along or above the vocal fold edge. The resulting deformations are analyzed using multiple perspectives and three-dimensional reconstruction. Deformation characteristics of four human vocal folds were investigated. Preliminary results showed two phases of deformation: a range with a small slope for small deformations fading into a significant nonlinear deformation trend with a high slope. An increase of tissue stiffness from posterior to anterior was detected. This trend is more significant for muscle and in the mid-anterior half of the vocal fold.
Affiliation(s)
- M Dollinger
- Department of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Germany.
|