1
Patel RR, Döllinger M, Jakubaß B, Pinhack H, Katz U, Semmler M. Analyzing Vocal Fold Frequency Dynamics Using High-Speed 3D Laser Video Endoscopy. Laryngoscope 2024; 134:3267-3276. [PMID: 38481073 PMCID: PMC11182720 DOI: 10.1002/lary.31394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Revised: 02/24/2024] [Accepted: 02/29/2024] [Indexed: 06/18/2024]
Abstract
OBJECTIVE To examine changes in lateral and vertical vibratory motion along the anterior, middle, and posterior sections of the vocal folds, as a function of vocal frequency variations. METHODS Absolute measurements of vocal fold surface dynamics from high-speed videoendoscopy with a custom laser endoscope were made on 23 vocally healthy adults during sustained /i:/ production at 10%, 20%, and 80% of pitch range. The 3D parameters of amplitude (mm), maximum velocity opening/closing (mm/s), and mean velocity opening/closing (mm/s) were computed for the lateral and vertical vibratory motion along the anterior, middle, and posterior sections of the vocal folds. Linear mixed model analysis was conducted to evaluate the differences in (a) vocal frequency levels (high vs. normal vs. low pitch), (b) axis level (vertical vs. lateral), (c) position level (anterior vs. middle vs. posterior), and (d) gender (male vs. female). RESULTS Overall, the superior surface vertical motion of the vocal fold is greater than the lateral motion, especially in males. Along the superior surface, the mean and maximum closing velocities are greater posteriorly for low pitch. The location (anterior, middle, and posterior) along the superior surface is relevant only for vocal fold closing rather than opening, as the dynamics are different along the various locations. CONCLUSIONS The study highlights the significance of assessing the vertical motion of the superior surface of the vocal fold to understand the complex dynamics of voice production. LEVEL OF EVIDENCE NA Laryngoscope, 134:3267-3276, 2024.
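As a rough illustration of the amplitude and opening/closing velocity parameters named in this abstract, the sketch below derives them from a single displacement trajectory by finite differences. The frame rate, the sample trajectory, the peak-to-peak amplitude definition, and the function name `velocities` are all assumptions made for illustration; the abstract does not describe the authors' actual computation.

```python
def velocities(displacement_mm, fps):
    """Finite-difference velocities (mm/s), split into opening and closing phases.

    Assumption: positive slope = opening, negative slope = closing.
    """
    dt = 1.0 / fps
    v = [(b - a) / dt for a, b in zip(displacement_mm, displacement_mm[1:])]
    opening = [x for x in v if x > 0]
    closing = [-x for x in v if x < 0]  # report closing speeds as magnitudes

    def stats(values):
        if not values:
            return {"max": 0.0, "mean": 0.0}
        return {"max": max(values), "mean": sum(values) / len(values)}

    return {
        # Assumption: amplitude taken as peak-to-peak displacement.
        "amplitude_mm": max(displacement_mm) - min(displacement_mm),
        "opening": stats(opening),
        "closing": stats(closing),
    }


# Hypothetical one-cycle trajectory (mm) sampled at a hypothetical 4000 frames/s.
trajectory = [0.0, 0.2, 0.4, 0.6, 0.3, 0.0]
result = velocities(trajectory, fps=4000)
print(result)
```

In this toy cycle the fold closes faster than it opens, which is the kind of asymmetry the opening/closing split is meant to expose.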
Affiliation(s)
- Rita R. Patel
- Department of Otolaryngology Head and Neck Surgery, Indiana University, Indianapolis, Indiana, United States
- Michael Döllinger
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Bernhard Jakubaß
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Hanna Pinhack
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Ute Katz
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Marion Semmler
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
2
Donhauser J, Tur B, Döllinger M. Neural network-based estimation of biomechanical vocal fold parameters. Front Physiol 2024; 15:1282574. [PMID: 38449783 PMCID: PMC10916882 DOI: 10.3389/fphys.2024.1282574] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Accepted: 01/09/2024] [Indexed: 03/08/2024] Open
Abstract
Vocal fold (VF) vibrations are the primary source of human phonation. High-speed video (HSV) endoscopy enables the computation of descriptive VF parameters for assessment of physiological properties of laryngeal dynamics, i.e., the vibration of the VFs. However, underlying biomechanical factors responsible for physiological and disordered VF vibrations cannot be accessed. In contrast, physically based numerical VF models reveal insights into the organ's oscillations, which remain inaccessible through endoscopy. To estimate biomechanical properties, previous research has fitted subglottal pressure-driven mass-spring-damper systems, as an inverse problem, to the HSV-recorded VF trajectories by global optimization of the numerical model. A neural network trained on the numerical model may be used as a substitute for computationally expensive optimization, yielding a fast-evaluating surrogate of the biomechanical inverse problem. This paper proposes a convolutional recurrent neural network (CRNN)-based architecture trained on regression of a physiologically based biomechanical six-mass model (6 MM). To compare with previous research, the prediction of the underlying biomechanical factor "subglottal pressure" was tested against 288 HSV ex vivo porcine recordings. The contributions of this work are two-fold: first, the presented CRNN with the 6 MM handles multiple trajectories along the VFs, which allows for investigations on local changes in VF characteristics. Second, the network was trained to reproduce further important biomechanical model parameters like VF mass and stiffness on synthetic data. Unlike in previous work, the network in this study is therefore an entire surrogate of the inverse problem, which allowed for explicit computation of the fitted model using our approach.
The presented approach achieves a best-case mean absolute error (MAE) of 133 Pa (13.9%) in subglottal pressure prediction with 76.6% correlation on experimental data and a re-estimated fundamental frequency MAE of 15.9 Hz (9.9%). A detailed training analysis revealed subglottal pressure as the most learnable parameter. With the physiologically based model design and advances in fast parameter prediction, this work is a next step in biomechanical VF model fitting and the estimation of laryngeal kinematics.
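The error metrics quoted in this abstract (MAE, relative MAE, and correlation) can be sketched as follows. The pressure values are invented for illustration, and normalizing the MAE by the mean measured value is an assumption; the abstract does not state how its percentage figures were derived.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical subglottal pressures in Pa (not data from the study).
measured = [800.0, 1000.0, 1200.0, 1400.0]
predicted = [900.0, 950.0, 1300.0, 1350.0]

mae = mean_absolute_error(measured, predicted)
# Assumption: relative error expressed against the mean measured pressure.
relative_mae = mae / (sum(measured) / len(measured))
r = pearson(measured, predicted)
print(f"MAE = {mae:.1f} Pa ({relative_mae:.1%}), r = {r:.3f}")
```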
Affiliation(s)
- Jonas Donhauser
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
3
Tur B, Gühring L, Wendler O, Schlicht S, Drummer D, Kniesburges S. Effect of Ligament Fibers on Dynamics of Synthetic, Self-Oscillating Vocal Folds in a Biomimetic Larynx Model. Bioengineering (Basel) 2023; 10:1130. [PMID: 37892860 PMCID: PMC10604794 DOI: 10.3390/bioengineering10101130] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/08/2023] [Revised: 09/13/2023] [Accepted: 09/25/2023] [Indexed: 10/29/2023] Open
Abstract
Synthetic silicone larynx models are essential for understanding the biomechanics of physiological and pathological vocal fold vibrations. The aim of this study is to investigate the effects of artificial ligament fibers on vocal fold vibrations in a synthetic larynx model, which is capable of replicating physiological laryngeal functions such as elongation, abduction, and adduction. A multi-layer silicone model with different mechanical properties for the musculus vocalis and the lamina propria consisting of ligament and mucosa was used. Ligament fibers of various diameters and break resistances were cast into the vocal folds and tested at different tension levels. An electromechanical setup was developed to mimic laryngeal physiology. The measurements included high-speed video recordings of vocal fold vibrations, subglottal pressure, and acoustics. To evaluate the vibration characteristics, all measured values were compared with parameters from ex vivo and in vivo studies. The fundamental frequency of the synthetic larynx model was found to be approximately 200-520 Hz depending on integrated fiber types and tension levels. This range of the fundamental frequency corresponds to the reproduction of a female normal and singing voice range. The investigated voice parameters from vocal fold vibration, acoustics, and subglottal pressure were within normal value ranges from ex vivo and in vivo studies. The integration of ligament fibers leads to an increase in the fundamental frequency with increasing airflow, while the tensioning of the ligament fibers remains constant. In addition, a tension increase in the fibers also generates a rise in the fundamental frequency, matching the physiologically expected dynamic behavior of vocal folds.
Affiliation(s)
- Bogac Tur
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Medical School, Friedrich-Alexander-Universität Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany
- Lucia Gühring
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Medical School, Friedrich-Alexander-Universität Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany
- Olaf Wendler
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Medical School, Friedrich-Alexander-Universität Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany
- Samuel Schlicht
- Institute of Polymer Technology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Am Weichselgarten 10, 91058 Erlangen, Germany
- Dietmar Drummer
- Institute of Polymer Technology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Am Weichselgarten 10, 91058 Erlangen, Germany
- Stefan Kniesburges
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Medical School, Friedrich-Alexander-Universität Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany
4
Arias-Vergara T, Döllinger M, Schraut T, Mohd Khairuddin KA, Schützenberger A. Nyquist Plot Parametrization for Quantitative Analysis of Vibration of the Vocal Folds. J Voice 2023:S0892-1997(23)00014-0. [PMID: 36774264 DOI: 10.1016/j.jvoice.2023.01.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2022] [Revised: 01/12/2023] [Accepted: 01/12/2023] [Indexed: 02/11/2023]
Abstract
OBJECTIVES The Nyquist plot provides a graphical representation of the glottal cycles as elliptical trajectories in a 2D plane. This study proposes a methodology to parameterize the Nyquist plot with application to support the quantitative analysis of voice disorders. METHODS We considered high-speed videoendoscopy recordings of 33 functional dysphonia (FD) patients and 33 normophonic controls (NC). Quantitative analysis was performed by computing four shape-based parameters from the Nyquist plot: Variability, Size (Perimeter and Area), and Consistency. Additionally, we performed automatic classification using a linear support vector machine and feature importance analysis by combining the proposed features with state-of-the-art glottal area waveform (GAW) parameters. RESULTS We found that the inter-cycle variability was significantly higher in FD patients compared to NC. We achieved a classification accuracy of 83% when the top 30 most important features were used. Furthermore, the proposed Nyquist plot features were ranked in the top 12 most important features. CONCLUSIONS The Nyquist plot provides complementary information for subjective and objective assessment of voice disorders. On the one hand, with visual inspection it is possible to observe intra- and inter-glottal cycle irregularities during sustained phonation. On the other hand, shape-based parameters allow quantifying such irregularities and provide complementary information to state-of-the-art GAW parameters.
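To make the Size parameters (Perimeter and Area) concrete, here is a minimal sketch that measures a closed 2D trajectory with the polygon perimeter and the shoelace formula. It assumes the trajectory of one glottal cycle is already available as a list of (x, y) points; how the study builds the Nyquist plot and defines its parameters is not given in the abstract, so the function names and test shapes are illustrative only.

```python
import math


def perimeter(points):
    """Length of the closed polygon through the points, in order."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def area(points):
    """Enclosed area of the closed polygon (shoelace formula)."""
    n = len(points)
    twice_area = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


# Sanity check on a unit square.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]

# Hypothetical elliptical cycle with semi-axes 2 and 1, sampled at 360 points.
ellipse = [
    (2.0 * math.cos(math.radians(k)), 1.0 * math.sin(math.radians(k)))
    for k in range(360)
]
print(area(square), perimeter(square), area(ellipse))
```

Computing these two values per cycle and comparing them across cycles would be one plausible way to quantify inter-cycle irregularity, although that is a reading of the abstract rather than the published procedure.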
Affiliation(s)
- Tomás Arias-Vergara
- University Hospital Erlangen, Medical School Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany.
- Michael Döllinger
- University Hospital Erlangen, Medical School Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
- Tobias Schraut
- University Hospital Erlangen, Medical School Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
- Anne Schützenberger
- University Hospital Erlangen, Medical School Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
5
Long-term performance assessment of fully automatic biomedical glottis segmentation at the point of care. PLoS One 2022; 17:e0266989. [PMID: 36129922 PMCID: PMC9491538 DOI: 10.1371/journal.pone.0266989] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2022] [Accepted: 07/25/2022] [Indexed: 12/04/2022] Open
Abstract
Deep learning has a large impact on medical image analysis and lately has been adopted for clinical use at the point of care. However, there is only a small number of reports of long-term studies that show the performance of deep neural networks (DNNs) in such an environment. In this study, we measured the long-term performance of a clinically optimized DNN for laryngeal glottis segmentation. We have collected the video footage for two years from an AI-powered laryngeal high-speed videoendoscopy imaging system and found that the footage image quality is stable across time. Next, we determined the DNN segmentation performance on lossy and lossless compressed data, revealing that only 9% of recordings contain segmentation artifacts. We found that lossy and lossless compression are on par for glottis segmentation; however, lossless compression provides significantly superior image quality. Lastly, we employed continual learning strategies to continuously incorporate new data into the DNN to remove the aforementioned segmentation artifacts. With modest manual intervention, we were able to reduce these segmentation artifacts by up to 81%. We believe that our suggested deep learning-enhanced laryngeal imaging platform consistently provides clinically sound results, and together with our proposed continual learning scheme will have a long-lasting impact on the future of laryngeal imaging.
6
A single latent channel is sufficient for biomedical glottis segmentation. Sci Rep 2022; 12:14292. [PMID: 35995933 PMCID: PMC9395348 DOI: 10.1038/s41598-022-17764-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2022] [Accepted: 07/30/2022] [Indexed: 11/23/2022] Open
Abstract
Glottis segmentation is a crucial step to quantify endoscopic footage in laryngeal high-speed videoendoscopy. Recent advances in deep neural networks for glottis segmentation allow for a fully automatic workflow. However, the integral parts of these deep segmentation networks are not yet precisely understood, and understanding the inner workings is crucial for acceptance in clinical practice. Here, we show that a single latent channel as a bottleneck layer is sufficient for glottal area segmentation using systematic ablations. We further demonstrate that the latent space is an abstraction of the glottal area segmentation relying on three spatially defined pixel subtypes allowing for a transparent interpretation. We further provide evidence that the latent space is highly correlated with the glottal area waveform, can be encoded with four bits, and decoded using lean decoders while maintaining a high reconstruction accuracy. Our findings suggest that glottis segmentation is a task that can be highly optimized to gain very efficient and explainable deep neural networks, important for application in the clinic. In the future, we believe that online deep learning-assisted monitoring is a game-changer in laryngeal examinations.
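For readers unfamiliar with what "encoded with four bits" implies, the sketch below shows uniform quantization of a scalar to 16 levels and the worst-case round-trip error that follows. This is a generic illustration under assumed value ranges; the abstract does not say which encoding scheme the authors applied to the latent space, and `encode_4bit` and `decode_4bit` are hypothetical names.

```python
LEVELS = 15  # four bits give 16 codes, 0..15


def encode_4bit(value, lo=0.0, hi=1.0):
    """Map a value in [lo, hi] to an integer code in 0..15 (values are clipped)."""
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) / (hi - lo) * LEVELS)


def decode_4bit(code, lo=0.0, hi=1.0):
    """Map a 4-bit code back to the centre of its quantization level."""
    return lo + code / LEVELS * (hi - lo)


# Hypothetical latent activations, assumed to lie in [0, 1].
samples = [0.0, 0.1, 0.5, 0.93, 1.0]
codes = [encode_4bit(s) for s in samples]
restored = [decode_4bit(c) for c in codes]
max_error = max(abs(s - r) for s, r in zip(samples, restored))
print(codes, max_error)
```

With 16 levels on a unit range the reconstruction error is bounded by half a step (1/30), which gives a feel for how coarse a representation can be while still tracking a smooth signal.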
7
Kist AM, Dürr S, Schützenberger A, Döllinger M. OpenHSV: an open platform for laryngeal high-speed videoendoscopy. Sci Rep 2021; 11:13760. [PMID: 34215788 PMCID: PMC8253769 DOI: 10.1038/s41598-021-93149-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 11/04/2020] [Accepted: 06/03/2021] [Indexed: 11/22/2022] Open
Abstract
High-speed videoendoscopy is an important tool to study laryngeal dynamics, to quantify vocal fold oscillations, to diagnose voice impairments at the laryngeal level, and to monitor treatment progress. However, there is a significant lack of an open-source, expandable research tool that features the latest hardware and data analysis. In this work, we propose an open research platform termed OpenHSV that is based on state-of-the-art, commercially available equipment and features a fully automatic data analysis pipeline. A publicly available, user-friendly graphical user interface implemented in Python is used to interface the hardware. Video and audio data are recorded in synchrony and are subsequently fully automatically analyzed. Video segmentation of the glottal area is performed using efficient deep neural networks to derive the glottal area waveform and glottal midline. Established quantitative, clinically relevant video and audio parameters were implemented and computed. In a preliminary clinical study, we recorded video and audio data from 28 healthy subjects. Analyzing these data in terms of image quality and derived quantitative parameters, we show the applicability, performance, and usefulness of OpenHSV. Therefore, OpenHSV provides valid, standardized access to high-speed videoendoscopy data acquisition and analysis for voice scientists, highlighting its use as a valuable research tool in understanding voice physiology. We envision that OpenHSV serves as a basis for the next generation of clinical HSV systems.
Affiliation(s)
- Andreas M Kist
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Waldstr. 1, 91054, Erlangen, Germany; Department of Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-University Erlangen-Nürnberg, Henkestr. 91, 91054, Erlangen, Germany
- Stephan Dürr
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Waldstr. 1, 91054, Erlangen, Germany
- Anne Schützenberger
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Waldstr. 1, 91054, Erlangen, Germany
- Michael Döllinger
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Waldstr. 1, 91054, Erlangen, Germany
8
Kist AM, Gómez P, Dubrovskiy D, Schlegel P, Kunduk M, Echternach M, Patel R, Semmler M, Bohr C, Dürr S, Schützenberger A, Döllinger M. A Deep Learning Enhanced Novel Software Tool for Laryngeal Dynamics Analysis. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2021; 64:1889-1903. [PMID: 34000199 DOI: 10.1044/2021_jslhr-20-00498] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Indexed: 06/12/2023]
Abstract
Purpose High-speed videoendoscopy (HSV) is an emerging, but barely used, endoscopy technique in the clinic to assess and diagnose voice disorders because of the lack of dedicated software to analyze the data. HSV allows quantification of the vocal fold oscillations by segmenting the glottal area. This challenging task has been tackled by various studies; however, the proposed approaches are mostly limited and not suitable for daily clinical routine. Method We developed user-friendly software in C# that allows the editing, motion correction, segmentation, and quantitative analysis of HSV data. We further provide pretrained deep neural networks for fully automatic glottis segmentation. Results We freely provide our software Glottis Analysis Tools (GAT). Using GAT, we provide a general threshold-based region growing platform that enables the user to analyze data from various sources, such as in vivo recordings, ex vivo recordings, and high-speed footage of artificial vocal folds. Additionally, especially for in vivo recordings, we provide three robust neural networks at various speed and quality settings to allow a fully automatic glottis segmentation needed for application by untrained personnel. GAT further evaluates video and audio data in parallel and is able to extract various features from the video data, among others the glottal area waveform, that is, the changing glottal area over time. In total, GAT provides 79 unique quantitative analysis parameters for video- and audio-based signals. Many of these parameters have already been shown to reflect voice disorders, highlighting the clinical importance and usefulness of the GAT software. Conclusion GAT is a unique tool to process HSV and audio data to determine quantitative, clinically relevant parameters for research, diagnosis, and treatment of laryngeal disorders. Supplemental Material https://doi.org/10.23641/asha.14575533.
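The threshold-based region growing mentioned in this abstract can be illustrated with a textbook version of the algorithm: start from a seed pixel inside the dark glottal gap and add 4-connected neighbours whose intensity stays at or below a threshold. This is a generic sketch in Python, not GAT's C# implementation; the frame, seed, threshold, and the name `region_grow` are all hypothetical.

```python
from collections import deque


def region_grow(image, seed, threshold):
    """Return the set of (row, col) pixels connected to the seed with intensity <= threshold."""
    rows, cols = len(image), len(image[0])
    seed_row, seed_col = seed
    if image[seed_row][seed_col] > threshold:
        return set()  # the seed itself is not dark enough
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):  # 4-connectivity
            nr, nc = r + dr, c + dc
            if (
                0 <= nr < rows
                and 0 <= nc < cols
                and (nr, nc) not in region
                and image[nr][nc] <= threshold
            ):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


# Hypothetical grayscale frame: bright tissue (200) around a dark gap,
# plus one isolated dark pixel that is not connected to the seed.
frame = [
    [200, 200, 200, 200, 200],
    [200,  30,  20, 200,  10],
    [200,  25,  15, 200, 200],
    [200, 200, 200, 200, 200],
]
glottis = region_grow(frame, seed=(1, 1), threshold=50)
area_px = len(glottis)  # one sample of a glottal area waveform
print(sorted(glottis), area_px)
```

Repeating the pixel count frame by frame yields the glottal area over time, which is the glottal area waveform the abstract refers to.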
Affiliation(s)
- Andreas M Kist
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Pablo Gómez
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Denis Dubrovskiy
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Patrick Schlegel
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Melda Kunduk
- Department of Communication Sciences and Disorders, Louisiana State University, Baton Rouge
- Matthias Echternach
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Munich University Hospital (LMU), Germany
- Rita Patel
- Department of Speech, Language and Hearing Sciences, College of Arts and Sciences, Indiana University, Bloomington
- Marion Semmler
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Christopher Bohr
- Klinik und Poliklinik für Hals-Nasen-Ohren-Heilkunde Universitätsklinikum Regensburg, Germany
- Stephan Dürr
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Anne Schützenberger
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Michael Döllinger
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
9
Patel RR, Sandage MJ, Kluess H, Plexico LW. High-Speed Characterization of Vocal Fold Vibrations in Normally Cycling and Postmenopausal Women: Randomized Double-Blind Analyses. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2021; 64:1869-1888. [PMID: 33971105 PMCID: PMC8740695 DOI: 10.1044/2021_jslhr-20-00706] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/15/2023]
Abstract
Purpose The aim of this study was to examine the influence of menstrual cycle phases (follicular, ovulatory, luteal, and ischemic) and hormone levels (estradiol, testosterone, progesterone, and neuropeptide Y) on vocal fold vibrations in reproductive and postmenopausal women. Method Glottal area waveforms were extracted from high-speed videoendoscopy during sustained phonation, inhalation phonation, and voice onset/offset in the reproductive (n = 15) and postmenopausal (n = 13) groups. Linear mixed-model analysis was conducted to evaluate hormone levels and high-speed videoendoscopy outcome variables between the reproductive and postmenopausal groups. In the reproductive group, simple linear regression and multiple regression were conducted to determine the effects of hormones on the dependent variables. Results Group differences between reproductive and postmenopausal women were identified for stiffness index, oscillatory onset time, and oscillatory offset time. Neuropeptide Y hormone in the ischemic phase significantly predicted changes in the reproductive group for some dependent variables; however, the relationship varied for sustained phonation and inhalation phonation. Conclusion These findings provide preliminary evidence that vocal fold vibrations in the reproductive group are different predominantly in the ischemic phase due to neuropeptide Y changes.
Affiliation(s)
- Rita R. Patel
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington
- Mary J. Sandage
- Department of Speech, Language, and Hearing Sciences, Auburn University, AL
- Laura W. Plexico
- Department of Speech, Language, and Hearing Sciences, Auburn University, AL
10
Oliveira RCCD, Gama ACC, Genilhú PDFL, Santos MAR. High speed digital videolaringoscopy: evaluation of vocal nodules and cysts in women. Codas 2021; 33:e20200095. [PMID: 34008770 DOI: 10.1590/2317-1782/20202020095] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/14/2020] [Accepted: 07/04/2020] [Indexed: 11/22/2022] Open
Abstract
PURPOSE To evaluate and compare the parameters of digital kymography obtained through high-speed videolaryngoscopy in women without laryngeal disorders, women with vocal fold nodules, and women with vocal cysts. METHODS A cross-sectional observational study in which 60 women aged between 18 and 45 years were selected. Three study groups were formed: 20 women without laryngeal disorder forming the control group (Group 1), 20 women with a diagnosis of vocal fold nodules forming Group 2, and 20 women with a diagnosis of vocal cysts forming Group 3. Subsequently, the participants were evaluated by high-speed videolaryngoscopy for analysis and comparison of laryngeal images using digital kymography. The laryngeal parameters processed by the program KIPS® were: minimum, maximum and mean opening; dominant amplitude of the left and right vocal folds; dominant frequency of the right and left vocal folds; and close. RESULTS The analysis of digital kymography suggests that the presence of vocal fold nodules and vocal cysts tends to further restrict the maximum and minimum opening of the vocal folds and the dominant amplitude of the opening variation in the middle region of the glottis. CONCLUSION Digital kymography parameters were similar in the presence of vocal fold nodules and vocal cyst lesions.
Affiliation(s)
- Renata Cristina Cordeiro Diniz Oliveira
- Programa de Pós-graduação em Ciências Fonoaudiológicas, Departamento de Fonoaudiologia, Faculdade de Medicina, Universidade Federal de Minas Gerais - UFMG - Belo Horizonte (MG), Brasil
- Ana Cristina Côrtes Gama
- Departamento de Fonoaudiologia, Faculdade de Medicina, Universidade Federal de Minas Gerais - UFMG - Belo Horizonte (MG), Brasil
- Patrícia de Freitas Lopes Genilhú
- Programa de Pós-graduação em Ciências Fonoaudiológicas, Departamento de Fonoaudiologia, Faculdade de Medicina, Universidade Federal de Minas Gerais - UFMG - Belo Horizonte (MG), Brasil
- Marco Aurélio Rocha Santos
- Programa de Pós-graduação em Ciências Fonoaudiológicas, Departamento de Fonoaudiologia, Faculdade de Medicina, Universidade Federal de Minas Gerais - UFMG - Belo Horizonte (MG), Brasil
11
Schlegel P, Kist AM, Kunduk M, Dürr S, Döllinger M, Schützenberger A. Interdependencies between acoustic and high-speed videoendoscopy parameters. PLoS One 2021; 16:e0246136. [PMID: 33529244 PMCID: PMC7853476 DOI: 10.1371/journal.pone.0246136] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/08/2020] [Accepted: 01/13/2021] [Indexed: 02/06/2023] Open
Abstract
In voice research, uncovering relations between the oscillating vocal folds, being the sound source of phonation, and the resulting perceived acoustic signal is of great interest. This is especially the case in the context of voice disorders, such as functional dysphonia (FD). We investigated 250 high-speed videoendoscopy (HSV) recordings with simultaneously recorded acoustic signals (124 healthy females, 60 FD females, 44 healthy males, 22 FD males). 35 glottal area waveform (GAW) parameters and 14 acoustic parameters were calculated for each recording. Linear and non-linear relations between GAW and acoustic parameters were investigated using Pearson correlation coefficients (PCC) and distance correlation coefficients (DCC). Further, norm values for parameters obtained from 250 ms long sustained phonation data (vowel /i/) were provided. 26 PCCs in females (5.3%) and 8 in males (1.6%) were found to be statistically significant (|corr.| ≥ 0.3). Only minor differences were found between PCCs and DCCs, indicating the presence of weak non-linear dependencies between parameters. Fundamental frequency was involved in the majority of all relevant PCCs between GAW and acoustic parameters (19 in females and 7 in males). The most distinct difference between correlations in females and males was found for the parameter Period Variability Index. The study shows only weak relations between the investigated acoustic and GAW parameters. This indicates that the reduction of the complex 3D glottal dynamics to the 1D GAW may erase laryngeal dynamic characteristics that are reflected within the acoustic signal. Hence, other GAW parameters, 2D and 3D laryngeal dynamics, and vocal tract parameters should be further investigated towards potential correlations to the acoustic signal.
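The distinction between Pearson correlation (linear dependence) and distance correlation (any dependence) can be seen in a small sketch. The sample-based distance correlation below follows the standard double-centring definition; the data are synthetic and chosen so that the relation is purely non-linear, and nothing here reproduces the study's parameters or pipeline.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient (sensitive to linear relations only)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def _centered_distances(values):
    """Pairwise absolute distances with row, column, and grand means removed."""
    n = len(values)
    d = [[abs(a - b) for b in values] for a in values]
    row_mean = [sum(row) / n for row in d]  # equals column means by symmetry
    grand_mean = sum(row_mean) / n
    return [
        [d[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation in [0, 1]; positive for dependent samples."""
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)
    pairs = [(i, j) for i in range(n) for j in range(n)]
    dcov2 = sum(a[i][j] * b[i][j] for i, j in pairs) / n**2
    dvar_x = sum(a[i][j] ** 2 for i, j in pairs) / n**2
    dvar_y = sum(b[i][j] ** 2 for i, j in pairs) / n**2
    denom = math.sqrt(dvar_x * dvar_y)
    return math.sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0


# Synthetic example: y depends on x, but not linearly.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [v * v for v in x]
pcc = pearson(x, y)
dcc = distance_correlation(x, y)
print(f"PCC = {pcc:.3f}, DCC = {dcc:.3f}")
```

Here the Pearson coefficient vanishes while the distance correlation does not, which is why a large gap between the two would point to non-linear dependence and a small gap, as reported in the abstract, would not.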
Affiliation(s)
- Patrick Schlegel
- Department of Head & Neck Surgery, David Geffen School of Medicine, University of California Los Angeles (UCLA), Los Angeles, California, United States of America
- Dep. of Otorhinolaryngology, Div. of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- * E-mail:
| | - Andreas M. Kist
- Dep. of Otorhinolaryngology, Div. of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
| | - Melda Kunduk
- Dep. of Communication Sciences and Disorders, Louisiana State University, Baton Rouge, Louisiana, United States of America
| | - Stephan Dürr
- Dep. of Otorhinolaryngology, Div. of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
| | - Michael Döllinger
- Dep. of Otorhinolaryngology, Div. of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
| | - Anne Schützenberger
- Dep. of Otorhinolaryngology, Div. of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
| |
Collapse
|
12
|
Schlegel P, Kniesburges S, Dürr S, Schützenberger A, Döllinger M. Machine learning based identification of relevant parameters for functional voice disorders derived from endoscopic high-speed recordings. Sci Rep 2020; 10:10517. [PMID: 32601277 PMCID: PMC7324600 DOI: 10.1038/s41598-020-66405-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 01/28/2020] [Accepted: 05/20/2020] [Indexed: 11/13/2022] [Open Access]
Abstract
In voice research and clinical assessment, many objective parameters are in use. However, there is no commonly used set of parameters that reflect certain voice disorders, such as functional dysphonia (FD); i.e. disorders with no visible anatomical changes. Hence, 358 high-speed videoendoscopy (HSV) recordings (159 normal females (NF), 101 FD females (FDF), 66 normal males (NM), 32 FD males (FDM)) were analyzed. We investigated 91 quantitative HSV parameters towards their significance. First, 25 highly correlated parameters were discarded. Second, further 54 parameters were discarded by using a LogitBoost decision stumps approach. This yielded a subset of 12 parameters sufficient to reflect functional dysphonia. These parameters separated groups NF vs. FDF and NM vs. FDM with fair accuracy of 0.745 or 0.768, respectively. Parameters solely computed from the changing glottal area waveform (1D-function called GAW) between the vocal folds were less important than parameters describing the oscillation characteristics along the vocal folds (2D-function called Phonovibrogram). Regularity of GAW phases and peak shape, harmonic structure and Phonovibrogram-based vocal fold open and closing angles were mainly important. This study showed the high degree of redundancy of HSV-voice-parameters but also affirms the need of multidimensional based assessment of clinical data.
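The first filtering step described in this abstract, discarding highly correlated parameters, could be sketched as follows. The greedy strategy, the 0.9 threshold, and the name `drop_highly_correlated` are assumptions for illustration only; the abstract does not state the criterion actually used, and the subsequent LogitBoost step is not shown.

```python
import numpy as np

def drop_highly_correlated(X, names, threshold=0.9):
    """Greedily keep a column only if its absolute Pearson correlation
    with every previously kept column is below `threshold` (assumed value).

    X     : array of shape (n_recordings, n_parameters)
    names : list of parameter names, one per column
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in keep):
            keep.append(j)
    return [names[j] for j in keep]

# Hypothetical parameters: "b" is a linear function of "a" and is dropped.
a = np.arange(10.0)
c = np.array([1.0, -1.0] * 5)
X = np.column_stack([a, 2 * a + 1, c])
print(drop_highly_correlated(X, ["a", "b", "c"]))
```

Which member of a correlated pair survives depends on column order in this sketch, so a real analysis would need an explicit rule for that choice.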
Affiliation(s)
- Patrick Schlegel
  - Department of Otorhinolaryngology, Division of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Stefan Kniesburges
  - Department of Otorhinolaryngology, Division of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Stephan Dürr
  - Department of Otorhinolaryngology, Division of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Anne Schützenberger
  - Department of Otorhinolaryngology, Division of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Michael Döllinger
  - Department of Otorhinolaryngology, Division of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
|
13
|
Kniesburges S, Lodermeyer A, Semmler M, Schulz YK, Schützenberger A, Becker S. Analysis of the tonal sound generation during phonation with and without glottis closure. The Journal of the Acoustical Society of America 2020; 147:3285. [PMID: 32486803 DOI: 10.1121/10.0001184] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 05/11/2023]
Abstract
The human phonation is characterized by periodical oscillations of the vocal folds with a complete glottis closure. In contrast, a glottal insufficiency (GI) represents an oscillation without glottis closure resulting in a breathy and weak voice. In this study, flow-induced oscillations of silicone vocal folds were modeled with and without glottis closure. The measurements comprised the flow pressure in the model, the generated sound, and the high-speed footage of the vocal fold motion. The analysis revealed that the sound signal for vocal fold oscillations without closure exhibits a lower number of harmonic tones with smaller amplitudes compared to the case with complete closure. The time series of the pressure signals showed small and periodical oscillations occurring less frequently and with smaller amplitude for the GI case. Accordingly, the pressure spectra include fewer harmonics similar to the sound. The analysis of the high-speed videos indicates that the strength of the pressure oscillations correlates with the divergence angle of the glottal duct during the closing motion. Physiologically, large divergence angles typically occur for a pronounced mucosal wave motion with glottis closure. Thus, the results indicate a correlation between the intensity of the mucosal wave and the development of harmonic tones.
Affiliation(s)
- Stefan Kniesburges
  - Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander University Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany
- Alexander Lodermeyer
  - Department of Process Machinery and Systems Engineering, Friedrich-Alexander University Erlangen-Nürnberg, Cauerstrasse 7, 91058 Erlangen, Germany
- Marion Semmler
  - Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander University Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany
- Yvonne Katrin Schulz
  - Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander University Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany
- Anne Schützenberger
  - Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander University Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany
- Stefan Becker
  - Department of Process Machinery and Systems Engineering, Friedrich-Alexander University Erlangen-Nürnberg, Cauerstrasse 7, 91058 Erlangen, Germany
|
14
|
Maryn Y, Verguts M, Demarsin H, van Dinther J, Gomez P, Schlegel P, Döllinger M. Intersegmenter Variability in High-Speed Laryngoscopy-Based Glottal Area Waveform Measures. Laryngoscope 2019; 130:E654-E661. [PMID: 31840827 DOI: 10.1002/lary.28475] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 08/23/2019] [Accepted: 11/26/2019] [Indexed: 12/31/2022]
Abstract
OBJECTIVES/HYPOTHESIS High-speed videoendoscopy (HSV) has potential to objectively quantify vibratory vocal fold characteristics during phonation. Glottal Analysis Tools (GAT) version 2018, developed in Erlangen, Germany, is software for determining various glottal area waveform (GAW) quantities. Before having GAT analyze HSV videos, segmenters have to define glottis manually across videos in a semiautomatic segmentation protocol. Such interventions are hypothesized to induce variability of subsequent GAW measure computation across segmenters and may attenuate GAT measures' reliability to a certain point. This study explored intersegmenter variability in GAT's GAW measures based on semiautomatic image processing. STUDY DESIGN Cohort study of rater reliability. METHODS In total, 20 HSV videos from normophonic and dysphonic subjects with various laryngeal disorders were selected for this study and segmented by three trained segmenters. They separately segmented glottis areas in the same frame sets of the videos. Upon analysis of GAW, GAT offers 46 measures related to topologic GAW dynamic characteristics, GAW periodicity and perturbation characteristics, and GAW harmonic components. To address GAT's reliability, intersegmenter-based variability in these measures was examined with intraclass correlation coefficient (ICC). RESULTS In general, ICC behavior of the 46 GAW measures across three raters was highly acceptable. ICC of one parameter was moderate (0.5 < ICC < 0.75), good for seven parameters (0.75 < ICC < 0.9), and excellent for 38 parameters (0.9 < ICC). CONCLUSIONS Overall, high ICC values confirm clinical applicability of GAT for objective and quantitative assessment of HSV. Small intersegmenter differences with actual small parameter differences suggest that manual or semiautomatic segmentation in GAT does not noticeably influence clinical assessment outcome. To guarantee the software's performance, we suggest segmentation training before clinical application. LEVEL OF EVIDENCE 2b Laryngoscope, 130:E654-E661, 2020.
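As an illustration of the reliability analysis, the sketch below computes an intraclass correlation coefficient from a recordings-by-segmenters matrix and maps it to the bands quoted in the abstract. The abstract does not say which ICC form was used, so the choice of ICC(2,1) (two-way random effects, absolute agreement, single rater) is an assumption, as are the function names and the "poor" label for values at or below 0.5.

```python
import numpy as np

def icc_2_1(ratings):
    """ICC(2,1) for a matrix of shape (n_targets, k_raters). Assumed ICC form."""
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape
    grand = ratings.mean()
    ss_rows = k * np.sum((ratings.mean(axis=1) - grand) ** 2)   # between targets
    ss_cols = n * np.sum((ratings.mean(axis=0) - grand) ** 2)   # between raters
    ss_total = np.sum((ratings - grand) ** 2)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return float((ms_rows - ms_err) /
                 (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n))

def classify_icc(icc):
    """Bands as quoted in the abstract; 'poor' is an added, assumed label."""
    if icc > 0.9:
        return "excellent"
    if icc > 0.75:
        return "good"
    if icc > 0.5:
        return "moderate"
    return "poor"

# Hypothetical values of one GAW measure: 4 recordings, 3 segmenters.
measure = np.array([[0.52, 0.50, 0.53],
                    [0.61, 0.63, 0.60],
                    [0.45, 0.44, 0.47],
                    [0.70, 0.69, 0.71]])
value = icc_2_1(measure)
print(value, classify_icc(value))
```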
Affiliation(s)
- Youri Maryn
  - Department of Otorhinolaryngology-Head and Neck Surgery, European Institute for Otorhinolaryngology-Head and Neck Surgery, GasthuisZusters Antwerpen Sint-Augustinus, Wilrijk/Antwerp, Belgium
  - Department of Speech, Language, and Hearing Sciences, University of Ghent, Ghent, Belgium
  - Faculty of Education, Health, and Social Work, University College of Ghent, Ghent, Belgium
  - Faculty of Psychology and Educational Sciences, School of Logopedics, Université Catholique de Louvain, Louvain-la-Neuve, Belgium
  - Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium
  - Phonanium, Lokeren, Belgium
- Monique Verguts
  - Department of Otorhinolaryngology-Head and Neck Surgery, European Institute for Otorhinolaryngology-Head and Neck Surgery, GasthuisZusters Antwerpen Sint-Augustinus, Wilrijk/Antwerp, Belgium
  - Department of Otorhinolaryngology and Voice Disorders, Diest General Hospital, Diest, Belgium
- Hannelore Demarsin
  - Department of Otorhinolaryngology-Head and Neck Surgery, European Institute for Otorhinolaryngology-Head and Neck Surgery, GasthuisZusters Antwerpen Sint-Augustinus, Wilrijk/Antwerp, Belgium
- Joost van Dinther
  - Department of Otorhinolaryngology-Head and Neck Surgery, European Institute for Otorhinolaryngology-Head and Neck Surgery, GasthuisZusters Antwerpen Sint-Augustinus, Wilrijk/Antwerp, Belgium
- Pablo Gomez
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
- Patrick Schlegel
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
- Michael Döllinger
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
|
15
|
Diaz-Cadiz M, McKenna VS, Vojtech JM, Stepp CE. Adductory Vocal Fold Kinematic Trajectories During Conventional Versus High-Speed Videoendoscopy. Journal of Speech, Language, and Hearing Research: JSLHR 2019; 62:1685-1706. [PMID: 31181175 PMCID: PMC6808372 DOI: 10.1044/2019_jslhr-s-18-0405] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 05/16/2023]
Abstract
Objective Prephonatory vocal fold angle trajectories may supply useful information about the laryngeal system but were examined in previous studies using sigmoidal curves fit to data collected at 30 frames per second (fps). Here, high-speed videoendoscopy (HSV) was used to investigate the impacts of video frame rate and sigmoidal fitting strategy on vocal fold adductory patterns for voicing onsets. Method Twenty-five participants with healthy voices performed /ifi/ sequences under flexible nasendoscopy at 1,000 fps. Glottic angles were extracted during adduction for voicing onset; resulting vocal fold trajectories (i.e., changes in glottic angle over time) were down-sampled to simulate different frame rate conditions (30-1,000 fps). Vocal fold adduction data were fit with asymmetric sigmoids using 5 fitting strategies with varying parameter restrictions. Adduction trajectories and maximum adduction velocities were compared between the fits and the actual HSV data. Adduction trajectory errors between HSV data and fits were evaluated using root-mean-square error and maximum angular velocity error. Results Simulated data were generally well fit by sigmoid models; however, when compared to the actual 1,000-fps data, sigmoid fits were found to overestimate maximum angle velocities. Errors decreased as frame rate increased, reaching a plateau by 120 fps. Conclusion In healthy adults, vocal fold kinematic behavior during adduction is generally sigmoidal, although such fits can produce substantial errors when data are acquired at frame rates lower than 120 fps.
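The frame-rate simulation described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the glottic-angle trajectory is a synthetic symmetric sigmoid with made-up parameters, resampling uses linear interpolation, and velocity is estimated by finite differences rather than by the asymmetric sigmoid fits evaluated in the study, so the sketch does not reproduce the reported overestimation by those fits.

```python
import numpy as np

def sigmoid_angle(t, theta0=40.0, t_mid=0.15, tau=0.02):
    """Hypothetical glottic angle in degrees, closing from theta0 toward 0."""
    return theta0 / (1.0 + np.exp((t - t_mid) / tau))

def resample(t, angle, fps):
    """Simulate a lower frame rate by linear interpolation at 1/fps spacing."""
    t_new = np.arange(t[0], t[-1], 1.0 / fps)
    return t_new, np.interp(t_new, t, angle)

def max_adduction_velocity(t, angle):
    """Largest closing speed in degrees per second, by finite differences."""
    return float(np.max(-np.diff(angle) / np.diff(t)))

# Synthetic 1,000-fps recording of 0.3 s, then simulated lower frame rates.
t = np.arange(0.0, 0.3, 0.001)
angle = sigmoid_angle(t)
for fps in (30, 60, 120, 1000):
    t_s, a_s = resample(t, angle, fps)
    print(fps, max_adduction_velocity(t_s, a_s))
```

With finite differences, coarse sampling averages the slope over longer intervals, so the estimated peak velocity can only drop as the frame rate falls.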
Affiliation(s)
- Manuel Diaz-Cadiz
  - Department of Speech, Language, and Hearing Sciences, Boston University, MA
- Jennifer M. Vojtech
  - Department of Speech, Language, and Hearing Sciences, Boston University, MA
  - Department of Biomedical Engineering, Boston University, MA
- Cara E. Stepp
  - Department of Speech, Language, and Hearing Sciences, Boston University, MA
  - Department of Biomedical Engineering, Boston University, MA
  - Department of Otolaryngology–Head and Neck Surgery, Boston University School of Medicine, MA
|
16
|
Influence of spatial camera resolution in high-speed videoendoscopy on laryngeal parameters. PLoS One 2019; 14:e0215168. [PMID: 31009488 PMCID: PMC6476512 DOI: 10.1371/journal.pone.0215168] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 11/26/2018] [Accepted: 03/27/2019] [Indexed: 11/19/2022] [Open Access]
Abstract
In laryngeal high-speed videoendoscopy (HSV) the area between the vibrating vocal folds during phonation is of interest, being referred to as glottal area waveform (GAW). Varying camera resolution may influence parameters computed on the GAW and hence hinder the comparability between examinations. This study investigates the influence of spatial camera resolution on quantitative vocal fold vibratory function parameters obtained from the GAW. In total 40 HSV recordings during sustained phonation (20 healthy males and 20 healthy females) were investigated. A clinically used Photron Fastcam MC2 camera with a frame rate of 4000 fps and a spatial resolution of 512×256 pixels was applied. This initial resolution was reduced by pixel averaging to (1) a resolution of 256×128 and (2) a resolution of 128×64 pixels, yielding three sets of recordings. The GAW was extracted and in total 50 vocal fold vibratory parameters representing different features of the GAW were computed. Statistical analysis using SPSS Statistics, version 21, was performed. Fifteen parameters showing strong mathematical dependencies with other parameters were excluded from the main analysis but are given in the Supporting Information. Data analysis revealed clear influence of spatial resolution on GAW parameters. Fundamental period measures and period perturbation measures were the least affected. Amplitude perturbation measures and mechanical measures were most strongly influenced. Most glottal dynamic characteristics and symmetry measures deviated significantly. Most energy perturbation measures changed significantly in males but were mostly unaffected in females. In females 18 of 35 remaining parameters (51%) and in males 22 parameters (63%) changed significantly between spatial resolutions. This work represents the first step in studying the impact of video resolution on quantitative HSV parameters. Clear influences of spatial camera resolution on computed parameters were found. The study results suggest avoiding the use of the most strongly affected parameters. Further, the use of cameras with high resolution is recommended to analyze GAW measures in HSV data.
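The resolution reduction by pixel averaging could look like the following NumPy sketch. It assumes non-overlapping 2×2 block averaging on a single grayscale frame stored as (rows, columns) = (256, 512); the abstract only states that pixel averaging was used, so the block layout and the function name `downsample_by_averaging` are assumptions.

```python
import numpy as np

def downsample_by_averaging(frame, factor=2):
    """Average non-overlapping factor x factor pixel blocks of a 2D frame."""
    h, w = frame.shape
    if h % factor or w % factor:
        raise ValueError("frame dimensions must be divisible by factor")
    # Split each axis into (blocks, pixels within block), then average per block.
    blocks = frame.reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

# Hypothetical frame at the original 512x256 resolution (256 rows, 512 columns).
frame = np.random.default_rng(0).random((256, 512))
half = downsample_by_averaging(frame)      # 256x128 pixels
quarter = downsample_by_averaging(half)    # 128x64 pixels
print(frame.shape, half.shape, quarter.shape)
```

Applying the function twice yields the three resolution sets named in the abstract from one original recording.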
|
17
|
Birk V, Kniesburges S, Semmler M, Berry DA, Bohr C, Döllinger M, Schützenberger A. Influence of glottal closure on the phonatory process in ex vivo porcine larynges. The Journal of the Acoustical Society of America 2017; 142:2197. [PMID: 29092569 PMCID: PMC6909995 DOI: 10.1121/1.5007952] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 05/11/2023]
Abstract
Many cases of disturbed voice signals can be attributed to incomplete glottal closure, vocal fold oscillation asymmetries, and aperiodicity. Often these phenomena occur simultaneously and interact with each other, making a systematic, isolated investigation challenging. Therefore, ex vivo porcine experiments were performed which enable direct control of glottal configurations. Different pre-phonatory glottal gap sizes, adduction levels, and flow rates were adjusted. The resulting glottal closure types were identified in a post-processing step. Finally, the acoustic quality, aerodynamic parameters, and the characteristics of vocal fold oscillation were analyzed in reference to the glottal closure types. Results show that complete glottal closure stabilizes the phonation process indicated through a reduced left-right phase asymmetry, increased amplitude and time periodicity, and an increase in the acoustic quality. Although asymmetry and periodicity parameter variation covers only a small range of absolute values, these small variations have a remarkable influence on the acoustic quality. Due to the fact that these parameters cannot be influenced directly, the authors suggest that the (surgical) reduction of the glottal gap seems to be a promising method to stabilize the phonatory process, which has to be confirmed in future studies.
Affiliation(s)
- Veronika Birk
  - Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander-Universität Erlangen-Nürnberg, Waldstr. 1, 91054 Erlangen, Germany
- Stefan Kniesburges
  - Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander-Universität Erlangen-Nürnberg, Waldstr. 1, 91054 Erlangen, Germany
- Marion Semmler
  - Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander-Universität Erlangen-Nürnberg, Waldstr. 1, 91054 Erlangen, Germany
- David A Berry
  - Laryngeal Dynamics Laboratory, Division of Head and Neck Surgery, David Geffen School of Medicine at UCLA, 10833 Le Conte Avenue, Los Angeles, California 90095-1624, USA
- Christopher Bohr
  - Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander-Universität Erlangen-Nürnberg, Waldstr. 1, 91054 Erlangen, Germany
- Michael Döllinger
  - Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander-Universität Erlangen-Nürnberg, Waldstr. 1, 91054 Erlangen, Germany
- Anne Schützenberger
  - Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander-Universität Erlangen-Nürnberg, Waldstr. 1, 91054 Erlangen, Germany
|
18
|
High-speed Videolaryngoscopy: Quantitative Parameters of Glottal Area Waveforms and High-speed Kymography in Healthy Individuals. J Voice 2017; 31:282-290. [DOI: 10.1016/j.jvoice.2016.09.026] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 04/13/2016] [Revised: 09/22/2016] [Accepted: 09/23/2016] [Indexed: 11/21/2022]
|
19
|
Oscillatory Onset and Offset in Young Vocally Healthy Adults Across Various Measurement Methods. J Voice 2017; 31:512.e17-512.e24. [PMID: 28169095 DOI: 10.1016/j.jvoice.2016.12.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 09/21/2016] [Revised: 12/01/2016] [Accepted: 12/02/2016] [Indexed: 11/20/2022]
Abstract
OBJECTIVE This study aimed to investigate the relationship between (1) oscillatory onset-offset time across various approaches that use different measurement criteria and (2) oscillatory onset and offset times in vocally healthy young adults. METHOD Oscillatory onset-offset times were obtained from 71 vocally normal adults, using high-speed videoendoscopy. Comparisons between the different onset methods involved measurement of the oscillatory onset time (OOT), voice initiation period (VIP), and the phonation onset time (POT), and for offset methods involved computation of the oscillatory offset time (OOToff) and the phonation offset time. RESULTS Correlation of the OOT with the VIP was 0.240 (P = 0.04) and with the POT from the glottal area waveform was 0.248 (P = 0.04); however, correlation between the VIP and the POT from the glottal area waveform was 0.661 (P < 0.001). For offset, there was a moderate correlation (rS = 0.503, P < 0.001) across OOToff and vocal offset period. The onset time was longest for the OOT followed by the VIP and the POT. There was no correlation between onset and offset for all methods. CONCLUSIONS A framework for quantification of oscillatory onset-offset time was developed for /hi/ tasks, which can be used for future measurements of disordered voice. A positive relationship was observed between VIP and POT and between OOToff and vocal offset period. There was a nonlinear relationship between the OOT, VIP, and POT measures. Onset-offset times are strongly influenced by the calculation method used, the pros and cons of which are discussed in this paper. Vibratory onset and offset represent physiologically different phenomena.
|
20
|
Laryngeal High-Speed Videoendoscopy: Sensitivity of Objective Parameters towards Recording Frame Rate. BioMed Research International 2016; 2016:4575437. [PMID: 27990428 PMCID: PMC5136634 DOI: 10.1155/2016/4575437] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 07/06/2016] [Accepted: 10/10/2016] [Indexed: 11/29/2022]
Abstract
The current use of laryngeal high-speed videoendoscopy in clinic settings involves subjective visual assessment of vocal fold vibratory characteristics. However, objective quantification of vocal fold vibrations for evidence-based diagnosis and therapy is desired, and objective parameters assessing laryngeal dynamics have therefore been suggested. This study investigated the sensitivity of the objective parameters and their dependence on recording frame rate. A total of 300 endoscopic high-speed videos with recording frame rates between 1000 and 15 000 fps were analyzed for a vocally healthy female subject during sustained phonation. Twenty parameters, representing laryngeal dynamics, were computed. Four different parameter characteristics were found: parameters showing no change with increasing frame rate; parameters changing up to a certain frame rate, but then remaining constant; parameters remaining constant within a particular range of recording frame rates; and parameters changing with nearly every frame rate. The results suggest that (1) parameter values are influenced by recording frame rates and different parameters have varying sensitivities to recording frame rate; (2) normative values should be determined based on recording frame rates; and (3) the typically used recording frame rate of 4000 fps seems to be too low to distinguish accurately certain characteristics of the human phonation process in detail.
|
21
|
Patel RR. Vibratory onset and offset times in children: A laryngeal imaging study. Int J Pediatr Otorhinolaryngol 2016; 87:11-7. [PMID: 27368436 PMCID: PMC4930831 DOI: 10.1016/j.ijporl.2016.05.019] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 04/05/2016] [Revised: 05/10/2016] [Accepted: 05/12/2016] [Indexed: 10/21/2022]
Abstract
OBJECTIVES The aim of the study was to evaluate the differences in vibratory onset and offset times across age (adult males, adult females, and children) and waveform types (total glottal area waveform, left glottal area waveform, and right glottal area waveform) using high-speed videoendoscopy. METHODS In this prospective study, vibratory onset and offset times were evaluated in a total of 86 participants. Forty-three children (23 girls, 18 boys) between 5 and 11 years and 43 gender-matched vocally normal young adults (23 females and 18 males) in the age range of 21-45 years were recruited. Vibratory onset and offset times were calculated in milliseconds from the total, left, and right Glottal Area Waveform (GAW). A two-factor analysis of variance was used to compare the means among the subject groups (children, adult male, and adult female) and waveform type (total GAW, left GAW, right GAW) for onset and offset variables. Post hoc analyses were performed using Fisher's Least Significant Difference test with Bonferroni correction for multiple comparisons. RESULTS Children exhibited significantly shorter vibratory onset and offset times compared to adult males and females. Differences in vibratory onset and offset times were not statistically significant between adult males and females. Across all waveform types (i.e. total GAW, left GAW, and right GAW), no statistical significance was observed among the subject groups. CONCLUSION This is the first study reporting vibratory onset and offset times in the pediatric population. The study findings lay the foundation for the development of a large age- and gender-based database of the pediatric population to aid the study of the effects of maturation of vocal fold vibration in adulthood. The findings from this study may also provide the basis for evaluating the impact of numerous lesions on tissue pliability, and thereby have potential utility for the clinical differentiation of various lesions.
Affiliation(s)
- Rita R. Patel
  - Department of Speech and Hearing Sciences, Indiana University
|
22
|
Patel RR, Walker R, Sivasankar PM. Spatiotemporal Quantification of Vocal Fold Vibration After Exposure to Superficial Laryngeal Dehydration: A Preliminary Study. J Voice 2015; 30:427-33. [PMID: 26277075 DOI: 10.1016/j.jvoice.2015.07.009] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 03/16/2015] [Accepted: 07/17/2015] [Indexed: 11/17/2022]
Abstract
OBJECTIVES The aim of the study was to evaluate the effects of a superficial laryngeal dehydration challenge on vocal fold vibration in young healthy adults using high-speed video imaging. SUBJECTS AND METHODS In this prospective study, the effects of a 60-minute superficial laryngeal dehydration challenge on spatial (speed quotient, amplitude quotient) and temporal measures (jitter percentage, vibratory onset time) of vocal fold vibration and phonation threshold pressure (PTP) were evaluated in 10 (male = 4, female = 6) vocally normal adults (21-29 years). All measures except the vibratory onset time were measured at the 10 (low) and 80 (high) percent level of their pitch range. The vibratory onset time was obtained at habitual pitch and loudness level. Superficial laryngeal dehydration was induced by oral breathing in low ambient humidity. Prechallenge and postchallenge differences were statistically investigated using t tests with Bonferroni correction. RESULTS The speed quotient at low-pitch phonation significantly decreased after oral breathing of low ambient humidity. Other spatiotemporal measures and PTP at low and high pitch were not significant after challenge. CONCLUSIONS Results from this initial study have implications for the use of high-speed video imaging to detect and quantify the subtle changes in vocal fold vibrations after superficial dehydration in healthy individuals. Preliminary findings indicate that superficial dehydration in healthy individuals results in spatial deviations at low pitch. However, further studies are warranted to identify additional spatiotemporal changes in vocal fold vibration after superficial dehydration in normal and disordered populations.
Affiliation(s)
- Rita R Patel
  - Department of Speech and Hearing Sciences, Indiana University, Bloomington, Indiana
- Reuben Walker
  - Department of Speech and Hearing Sciences, Indiana University, Bloomington, Indiana
- Preeti M Sivasankar
  - Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana
|
23
|
Comparative Analysis of Vocal Fold Vibration Using High-Speed Videoendoscopy and Digital Kymography. J Voice 2014; 28:603-7. [DOI: 10.1016/j.jvoice.2013.12.019] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 10/30/2013] [Accepted: 12/30/2013] [Indexed: 11/30/2022]
|
24
|
Bohr C, Kräck A, Dubrovskiy D, Eysholdt U, Svec J, Psychogios G, Ziethe A, Döllinger M. Spatiotemporal analysis of high-speed videolaryngoscopic imaging of organic pathologies in males. Journal of Speech, Language, and Hearing Research: JSLHR 2014; 57:1148-1161. [PMID: 24686496 DOI: 10.1044/2014_jslhr-s-12-0076] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Indexed: 06/03/2023]
Abstract
PURPOSE The aim of this study was to identify parameters that would differentiate healthy from pathological organic-based vocal fold vibrations to emphasize clinical usefulness of high-speed imaging. METHOD Fifty-five men (M age = 36 years, SD = 20 years) were examined and separated into 4 groups: 1 healthy (26 individuals) and 3 pathological (10 individuals with contact granuloma, 12 with polyps, and 7 with cysts). Vocal fold vibrations were recorded using a high-speed camera during sustained phonation. Twenty objective glottal area waveform and 24 phonovibrogram parameters representing spatiotemporal characteristics were analyzed. Statistical group comparisons were performed to document spatiotemporal changes for organic lesions that cannot be determined visually. To look for specific pattern profiles within organic lesions, the authors performed linear discriminant analysis. RESULTS Thirteen parameters showed significant differences between the healthy group and at least 1 pathological group. The differences occurred more in temporal than in spatial parameters. Contact granuloma showed the fewest statistical differences (3 parameters), followed by cysts (9 parameters), and polyps (10 parameters). Linear discriminant analysis achieved accuracy performance of 76% (all groups separated) and 82% (healthy vs. pathological). CONCLUSION The results suggest that for males, the differences between healthy voices and organic voice disorders may be more pronounced within temporal characteristics that cannot be visually detected without high-speed imaging.
|
25
|
Quantitative measurement of vocal fold vibration in male radio performers and healthy controls using high-speed videoendoscopy. PLoS One 2014; 9:e101128. [PMID: 24971625 PMCID: PMC4074127 DOI: 10.1371/journal.pone.0101128] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 05/17/2013] [Accepted: 06/03/2014] [Indexed: 11/19/2022] [Open Access]
Abstract
PURPOSE Acoustic and perceptual studies show a number of differences between the voices of radio performers and controls. Despite this, the vocal fold kinematics underlying these differences are largely unknown. Using high-speed videoendoscopy, this study sought to determine whether the vocal vibration features of radio performers differed from those of non-performing controls. METHOD Using high-speed videoendoscopy, recordings of a mid-phonatory /i/ in 16 male radio performers (aged 25-52 years) and 16 age-matched controls (aged 25-52 years) were collected. Videos were extracted and analysed semi-automatically using High-Speed Video Program, obtaining measures of fundamental frequency (f0), open quotient and speed quotient. Post-hoc analyses of sound pressure level (SPL) were also performed (n = 19). Pearson's correlations were calculated between SPL and both speed and open quotients. RESULTS Male radio performers had a significantly higher speed quotient than their matched controls (t = 3.308, p = 0.005). No significant differences were found for f0 or open quotient. No significant correlation was found between either open or speed quotient with SPL. DISCUSSION A higher speed quotient in male radio performers suggests that their vocal fold vibration was characterised by a higher ratio of glottal opening to closing times than controls. This result may explain findings of better voice quality, higher equivalent sound level and greater spectral tilt seen in previous research. Open quotient was not significantly different between groups, indicating that the durations of complete vocal fold closure were not different between the radio performers and controls. Further validation of these results is required to determine the aetiology of the higher speed quotient result and its implications for voice training and clinical management in performers.
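As a rough illustration of the two quotients reported in this abstract, the Python sketch below computes an open quotient and a speed quotient from one cycle of a glottal area waveform. It is not the method used by High-Speed Video Program; the function name `glottal_quotients`, the `threshold` parameter, the sample data, and the frame-counting convention are all assumptions made for this example. The open quotient is taken as the fraction of the cycle in which the glottal area exceeds the threshold, and the speed quotient as opening time divided by closing time, which matches the ratio described in the abstract.

```python
def glottal_quotients(area, threshold=0.0):
    """Return (open_quotient, speed_quotient) for one glottal cycle.

    `area` holds glottal area samples for exactly one cycle at a uniform
    frame rate. Assumption: the cycle starts and ends with the glottis
    closed and contains a single contiguous open phase.
    """
    open_idx = [i for i, a in enumerate(area) if a > threshold]
    if not open_idx:
        raise ValueError("glottis never opens in this cycle")

    first, last = open_idx[0], open_idx[-1]
    # Frame of maximum glottal area (first one if there are ties).
    peak = max(range(first, last + 1), key=lambda i: area[i])

    # Open quotient: open frames divided by all frames in the cycle.
    open_quotient = (last - first + 1) / len(area)

    # Opening time: frame intervals from the last closed frame to the peak.
    opening = peak - first + 1
    # Closing time: frame intervals from the peak to the next closed frame.
    closing = last - peak + 1
    speed_quotient = opening / closing

    return open_quotient, speed_quotient


# Hypothetical cycle: slow opening (4 intervals), fast closing (2 intervals).
cycle = [0, 0, 1, 3, 5, 6, 4, 0, 0, 0]
print(glottal_quotients(cycle))  # (0.5, 2.0)
```

Under this convention, a speed quotient above 1 means the glottis takes longer to open than to close; published tools may define the phase boundaries differently, so absolute values are not comparable across implementations.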
|
26
|
Yamauchi A, Imagawa H, Sakakibara KI, Yokonishi H, Nito T, Yamasoba T, Tayama N. Characteristics of vocal fold vibrations in vocally healthy subjects: analysis with multi-line kymography. J Speech Lang Hear Res 2014; 57:S648-S657. [PMID: 24686860 DOI: 10.1044/2014_jslhr-s-12-0269] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Indexed: 06/03/2023]
Abstract
PURPOSE In this study, the authors aimed to analyze longitudinal data from high-speed digital images in normative subjects using multi-line kymography. METHOD Vocally healthy subjects were divided into young (9 men and 17 women; M age = 27 years) and older groups (8 men and 12 women; M age = 73 years). From high-speed digital images of phonation at a conversational frequency, kymograms were created at 5 different levels of the vocal fold and were analyzed to determine the opening/closing longitudinal phase difference, open quotient, and speed index. Then age- and gender-related differences of these parameters were analyzed statistically. RESULTS Young women frequently showed a pattern of posterior-to-anterior glottal opening and anterior-to-posterior glottal closure, and older women demonstrated various opening and closing patterns. Both young men and older men were similar to older women. The open quotient was maximal at the most posterior glottal level in young women, but it tended to be maximal at the anterior glottis in the other subgroups. The mean value of the 5 open quotients was largest in young women. The mean speed index had a large negative value in older subjects. CONCLUSION This study provides the first information about age-related differences of longitudinal oscillatory characteristics of the vocal folds obtained with high-speed digital imaging.
|
27
|
Patel R, Dubrovskiy D, Döllinger M. Characterizing vibratory kinematics in children and adults with high-speed digital imaging. J Speech Lang Hear Res 2014; 57:S674-86. [PMID: 24686982 PMCID: PMC7315516 DOI: 10.1044/2014_jslhr-s-12-0278] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Indexed: 05/11/2023]
Abstract
PURPOSE The aim of this study is to quantify and identify characteristic vibratory motion in typically developing prepubertal children and young adults using high-speed digital imaging. METHOD The vibrations of the vocal folds were recorded from 27 children (ages 5-9 years) and 35 adults (ages 21-45 years), with high speed at 4,000 frames per second for sustained phonation. Kinematic features of amplitude periodicity, time periodicity, phase asymmetry, spatial symmetry, and glottal gap index were analyzed from the glottal area waveform across mean and standard deviation (i.e., intercycle variability) for each measure. RESULTS Children exhibited lower mean amplitude periodicity compared to men and women and lower time periodicity compared to men. Children and women exhibited greater variability in amplitude periodicity, time periodicity, phase asymmetry, and glottal gap index compared to men. Women had lower mean values of amplitude periodicity and time periodicity compared to men. CONCLUSION Children differed both spatially but more temporally in vocal fold motion, suggesting the need for the development of children-specific kinematic norms. Results suggest more uncontrolled vibratory motion in children, reflecting changes in the vocal fold layered structure and aero-acoustic source mechanisms.
|
28
|
Patel RR, Dubrovskiy D, Döllinger M. Measurement of glottal cycle characteristics between children and adults: physiological variations. J Voice 2014; 28:476-86. [PMID: 24629646 DOI: 10.1016/j.jvoice.2013.12.010] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 09/03/2013] [Accepted: 12/16/2013] [Indexed: 10/25/2022]
Abstract
OBJECTIVES The aim of this study was to quantify phases of the vibratory cycle using measurements of glottal cycle quotients and glottal cycle derivatives, in typically developing prepubertal children and young adults with the use of high-speed digital imaging (HSDI). METHODS Vocal fold vibrations were recorded from 27 children (age range 5-9 years) and 35 adults (age range 21-45 years), with HSDI at 4000 frames per second for sustained phonation. Glottal area waveform measures of Open Quotient, Closing Quotient, Speed Index (SI), Rate Quotient, and Asymmetry Quotient (AsyQ) were computed. Glottal cycle derivatives of Amplitude Quotient (AQ) and Maximum Area Declination Rate (MADR) were also computed. Group differences (adult females, adult males, and children) were statistically investigated for mean and standard deviation values of the glottal cycle quotients and glottal cycle derivatives. RESULTS Children exhibited higher values of SI and AsyQ and lower values of MADR compared with adult males. Children exhibited the highest mean value and lowest variability in AQ compared with adult males and females. Adult males showed lower values of SI, AsyQ, AQ, and higher values of MADR compared with adult females. CONCLUSIONS Glottal cycle vibratory motion in children is functionally different compared with adult males and females, suggesting the need for development of children specific norms for both normal and disordered voice qualities.
Affiliation(s)
- Rita R Patel, Department of Speech and Hearing Sciences, College of Arts and Sciences, Indiana University, Bloomington, Indiana.
- Denis Dubrovskiy, Department of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Medical School, Erlangen, Germany.
- Michael Döllinger, Department of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Medical School, Erlangen, Germany.
|
29
|
Abstract
PURPOSE OF REVIEW Kymographic imaging is a modern method for displaying and evaluating vibratory behaviour of the vocal folds which is crucial for voice production. This review summarizes the state of the art of this method, and focuses on the progress in this area within the last 5 years. RECENT FINDINGS Videokymography, using a special videocamera, offers high-speed (video)kymographic images in real time, which is advantageous in daily clinical practice. Two other methods use software to create kymograms retrospectively: digital kymography processes high-speed videolaryngoscopic recordings and offers numerous research possibilities, whereas strobovideokymography processes videostroboscopic recordings, and its use is limited to regular vibration patterns. Current studies reveal that high-speed kymographic images allow more reliable visual evaluation of vibrations than by watching video recordings. Image analysis procedures have been advanced to quantify the vibration properties of the vocal folds. New information has been obtained on asymmetry, mucosal waves, irregularities, phonation onset, and nonlinear dynamic phenomena in voice disorders, as well as in singing. SUMMARY High-speed kymography visualizes vibratory features which are not simply observable via traditional methods. It shows large potential in better understanding the functional origin of hoarseness and unsteady phonatory states. Further research in this area is envisioned.
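To make the idea of digital kymography concrete, the Python sketch below builds a kymogram retrospectively from a recorded frame sequence by taking the same pixel row from every frame and stacking those rows in time order. This is a minimal sketch, not the procedure of any specific tool named in the review: the function names `kymogram` and `glottal_widths`, the brightness threshold, and the tiny synthetic frames are all assumptions made for illustration.

```python
def kymogram(frames, row):
    """Stack pixel row `row` from every frame; result[t] is that line at time t.

    `frames` is a sequence of 2D images (lists of pixel rows), one per
    recorded frame, all of the same size.
    """
    return [list(frame[row]) for frame in frames]


def glottal_widths(kymo, threshold=5):
    """Count dark pixels per time step as a crude glottal width estimate.

    Assumption: the glottal gap appears darker than `threshold`, while
    vocal fold tissue appears brighter.
    """
    return [sum(1 for pixel in line if pixel < threshold) for line in kymo]


# Hypothetical 3-frame recording, 3 rows x 5 columns (9 = tissue, 0 = gap).
frames = [
    [[9, 9, 9, 9, 9], [9, 9, 9, 9, 9], [9, 9, 9, 9, 9]],  # closed
    [[9, 9, 9, 9, 9], [9, 9, 0, 9, 9], [9, 9, 9, 9, 9]],  # starting to open
    [[9, 9, 0, 9, 9], [9, 0, 0, 0, 9], [9, 9, 0, 9, 9]],  # widest opening
]

kymo = kymogram(frames, row=1)
print(glottal_widths(kymo))  # [0, 1, 3]
```

Calling `kymogram` with several different `row` values gives one kymogram per glottal level, which is the basic idea behind analysing vibration at multiple positions along the vocal folds.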
|
30
|
Bohr C, Kraeck A, Eysholdt U, Ziethe A, Döllinger M. Quantitative analysis of organic vocal fold pathologies in females by high-speed endoscopy. Laryngoscope 2013; 123:1686-93. [PMID: 23649746 DOI: 10.1002/lary.23783] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 03/12/2012] [Revised: 08/23/2012] [Accepted: 09/18/2012] [Indexed: 11/11/2022]
Abstract
OBJECTIVES/HYPOTHESIS Quantitative analysis of endoscopic high-speed video recordings of vocal fold vibrations has been growing in importance in recent years. The videos have mainly been analyzed using subjective evaluation, but this is examiner dependent, and the results show inadequate interobserver agreement. The aims of this study were therefore to identify appropriate objective parameters for analyzing high-speed recordings to differentiate healthy voice production from organic disorders. STUDY DESIGN METHODS A total of 152 females were examined, divided into 77 healthy and 75 with four different pathological conditions: laryngeal epithelial thickening, Reinke edema, vocal fold polyps, and vocal fold cysts. Vocal fold vibrations were recorded with a high-speed camera (4,000 Hz, 256 × 256 pixels) during sustained phonation. Parameters computed from the glottal area waveform (GAW) and from phonovibrogram (PVG) were analyzed. Multiparametric linear discriminant analysis was performed to classify pathological conditions versus the healthy group. RESULTS Twenty of 44 parameters were identified that are capable of distinguishing between the individual types of pathology. PVG parameters showed better performance than GAW parameters. Parameters representing vibrational periodicity via standard deviation showed better performance than absolute parameters. In addition, linear discriminant analysis achieved reliable differentiation between healthy and pathological vocal fold vibrations: 72% for the five-class problem (all groups separately) and 88% for the two-class problem (healthy vs. all pathologies taken as one class). CONCLUSIONS The study succeeded in defining objective parameters for analyzing endoscopic high-speed videos and suggesting first parameters for differentiation between healthy dynamics and dynamics of organic pathologies.
Affiliation(s)
- Christopher Bohr, Department of Otorhinolaryngology, Head and Neck Surgery, Erlangen University Hospital, Erlangen, Germany.
|
31
|
Abstract
PURPOSE OF REVIEW This review presents recent advances in high-speed digital imaging (HSDI) of the larynx including data acquisition, data analysis, and clinical applicability. RECENT FINDINGS Software designed to summarize the large amounts of data captured with HSDI makes it possible to quantitatively analyze recordings from patients, improving the accuracy of the methodology. The new software has been used in studies of normal individuals, increasing our knowledge of normal vocal fold vibratory behavior. HSDI has also been used in patient populations and shows promise in distinguishing various laryngeal conditions that are difficult to distinguish with other imaging modalities. Studies of postoperative patients with HSDI demonstrate the return of some vibratory characteristics but not others, potentially leading the way to improvements in surgical technique. SUMMARY Recent advances in HSDI technology have increased the clinical usefulness of the imaging technology and recent studies demonstrate the clinical applicability of HSDI. However, challenges to widespread clinical use of HSDI remain.
|
32
|
Döllinger M, Dubrovskiy D, Patel R. Spatiotemporal analysis of vocal fold vibrations between children and adults. Laryngoscope 2012; 122:2511-8. [PMID: 22965771 DOI: 10.1002/lary.23568] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Accepted: 06/11/2012] [Indexed: 11/10/2022]
Abstract
OBJECTIVES/HYPOTHESIS Aim of the study is to quantify differences in spatiotemporal features of vibratory motion in typically developing prepubertal children and adults with use of high speed digital imaging. STUDY DESIGN Prospective case-control study. METHODS Vocal fold oscillations of 31 children and 35 adults were analyzed. Endoscopic high-speed imaging was performed during sustained phonation at typical pitch and loudness. Quantitative technique of Phonovibrogram was used to compute spatiotemporal features. Spatial features are represented by opening and closing angles along the anterior and posterior parts of the vocal folds, as well as by left-right symmetry ratio. Temporal features are represented by the cycle-to-cycle variability of the spatial features. Group differences (adult females, adult males, and children) were statistically investigated. RESULTS Statistical differences were more pronounced in the temporal behavior compared to the spatial behavior. Children demonstrated greater cycle-to-cycle variability in oscillations compared to adults. Most differences between children and adults were found for temporal characteristics along the anterior parts during closing phase. The spatiotemporal features differed more between children and males than between children and females. Both adults and children showed equally high left-right symmetry. CONCLUSIONS Results suggest a more unstable phonation in children than in adults, yielding increased perturbation in periodicity. Children demonstrated longer phase delay in the anterior/posterior and medio-lateral parts during the opening phase compared to adults. The data presented may provide the bases for differentiating normal vibratory characteristics from the disordered in the pediatric population, and eventually assist in aiding the clinical utility of high speed imaging.
Affiliation(s)
- Michael Döllinger, Department of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Medical School, Erlangen, Germany.
|
33
|
Kunduk M, Döllinger M, McWhorter AJ, Švec JG, Lohscheller J. Vocal Fold Vibratory Behavior Changes following Surgical Treatment of Polyps Investigated with High-Speed Videoendoscopy and Phonovibrography. Ann Otol Rhinol Laryngol 2012; 121:355-63. [DOI: 10.1177/000348941212100601] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 11/17/2022]
Abstract
Objectives: The goal of this study was to objectively quantify the changes in vocal fold vibratory characteristics before and after surgery with high-speed videoendoscopy and the image analysis tool phonovibrography. Methods: High-speed videoendoscopic data, audio recordings, and Voice Handicap Index scores were collected from 8 subjects with a diagnosis of unilateral vocal fold polyps, before operation and at 1 week and 1 to 3 months after operation. We then analyzed the objective phonovibrographic patterns and parameters describing the vocal fold vibratory behavior. Results: On phonovibrography, the visual representations of the vocal fold vibratory characteristics, from both the individual and the group data, demonstrated very different patterns before surgery and both 1 week and 1 to 3 months after surgery. The individual phonovibrograms obtained from the left and right true vocal folds clearly demonstrated the lesion site and its effects on the vocal fold vibratory characteristics for each subject. The improvements in amplitude and symmetry (relative vibratory amplitude and vibration amplitude symmetry) of vocal fold vibration were quantified; the difference was greatest between data from before surgery and data from 1 week after surgery. Conclusions: The visual phonovibrographic patterns and quantitative data revealed marked changes in vocal fold vibratory patterns after operation and continued improvement at 1 to 3 months.
|