1
Echternach M, Burk F, Köberlein M, Döllinger M, Burdumy M, Richter B, Titze IR, Elemans CPH, Herbst CT. Biomechanics of sound production in high-pitched classical singing. Sci Rep 2024; 14:13132. [PMID: 38849382; PMCID: PMC11161605; DOI: 10.1038/s41598-024-62598-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2023] [Accepted: 05/20/2024] [Indexed: 06/09/2024] [Open Access]
Abstract
Voice production of humans and most mammals is governed by the MyoElastic-AeroDynamic (MEAD) principle, where an air stream is modulated by self-sustained vocal fold oscillation to generate audible air pressure fluctuations. An alternative mechanism is found in ultrasonic vocalizations of rodents, which are established by an aeroacoustic (AA) phenomenon without vibration of laryngeal tissue. Previously, some authors argued that high-pitched human vocalization is also produced by the AA principle. Here, we investigate the so-called "whistle register" voice production in nine professional female operatic sopranos singing a scale from C6 (≈ 1047 Hz) to G6 (≈ 1568 Hz). Super-high-speed videolaryngoscopy revealed vocal fold collision in all participants, with closed quotients from 30 to 73%. Computational modeling showed that the biomechanical requirements to produce such high-pitched voice would be an increased contraction of the cricothyroid muscle, vocal fold strain of about 50%, and high subglottal pressure. Our data suggest that high-pitched operatic soprano singing uses the MEAD mechanism. Consequently, the commonly used term "whistle register" does not reflect the physical principle of a whistle with regard to voice generation in high pitched classical singing.
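The pitch range quoted in this abstract can be checked with a short calculation. The sketch below assumes twelve-tone equal temperament with A4 = 440 Hz and MIDI note numbering (C6 = 84, G6 = 91); the function name `note_frequency` is illustrative and does not come from the paper.

```python
def note_frequency(midi_note: int, a4_hz: float = 440.0) -> float:
    """Frequency in Hz of a MIDI note under equal temperament (A4 = MIDI 69)."""
    return a4_hz * 2.0 ** ((midi_note - 69) / 12.0)


# Assumed MIDI numbers for the scale endpoints named in the abstract.
C6, G6 = 84, 91

for name, midi in (("C6", C6), ("G6", G6)):
    freq = note_frequency(midi)
    period_ms = 1000.0 / freq  # duration of one oscillation cycle
    print(f"{name}: {freq:.1f} Hz, period {period_ms:.3f} ms")
```

At these frequencies one vibratory cycle lasts well under a millisecond, which is presumably why very high frame rates are needed to resolve vocal fold contact.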
Affiliation(s)
- Matthias Echternach
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, LMU University Hospital, Marchioninistr. 15, 81377, Munich, Germany
- Fabian Burk
  - Department of Otorhinolaryngology and Plastic Surgery, SRH Wald-Klinikum Gera, Strasse des Friedens 122, Gera, Germany
- Marie Köberlein
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, LMU University Hospital, Marchioninistr. 15, 81377, Munich, Germany
- Michael Döllinger
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology Head and Neck Surgery, University Hospital Erlangen, Waldstr. 1, 91054, Erlangen, Germany
- Michael Burdumy
  - Department of Medical Physics, Department of Radiology, Faculty of Medicine, Medical Center-University of Freiburg, Breisacher Str. 60, 79106, Freiburg, Germany
- Bernhard Richter
  - Institute of Musicians' Medicine, Freiburg University Medical Center and Faculty of Medicine Freiburg University, Elsässer Str. 2m, 79110, Freiburg, Germany
- Ingo R Titze
  - Utah Center for Vocology, 240 S 1500 E, Room 206, Salt Lake City, UT, 84112, USA
- Coen P H Elemans
  - Vocal Neuromechanics Lab, Sound Communication and Behavior Group, Department of Biology, University of Southern Denmark, Campusvej 55, DK-5230, Odense M, Denmark
- Christian T Herbst
  - Department of Behavioural and Cognitive Biology, University of Vienna, Djerassiplatz 1, 1030, Vienna, Austria
  - Janette Ogg Voice Research Center, Shenandoah Conservatory, Winchester, VA, USA
2
Donhauser J, Tur B, Döllinger M. Neural network-based estimation of biomechanical vocal fold parameters. Front Physiol 2024; 15:1282574. [PMID: 38449783; PMCID: PMC10916882; DOI: 10.3389/fphys.2024.1282574] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Accepted: 01/09/2024] [Indexed: 03/08/2024] [Open Access]
Abstract
Vocal fold (VF) vibrations are the primary source of human phonation. High-speed video (HSV) endoscopy enables the computation of descriptive VF parameters for assessment of physiological properties of laryngeal dynamics, i.e., the vibration of the VFs. However, underlying biomechanical factors responsible for physiological and disordered VF vibrations cannot be accessed. In contrast, physically based numerical VF models reveal insights into the organ's oscillations, which remain inaccessible through endoscopy. To estimate biomechanical properties, previous research has fitted subglottal pressure-driven mass-spring-damper systems, as inverse problem to the HSV-recorded VF trajectories, by global optimization of the numerical model. A neural network trained on the numerical model may be used as a substitute for computationally expensive optimization, yielding a fast evaluating surrogate of the biomechanical inverse problem. This paper proposes a convolutional recurrent neural network (CRNN)-based architecture trained on regression of a physiological-based biomechanical six-mass model (6 MM). To compare with previous research, the underlying biomechanical factor "subglottal pressure" prediction was tested against 288 HSV ex vivo porcine recordings. The contributions of this work are two-fold: first, the presented CRNN with the 6 MM handles multiple trajectories along the VFs, which allows for investigations on local changes in VF characteristics. Second, the network was trained to reproduce further important biomechanical model parameters like VF mass and stiffness on synthetic data. Unlike in a previous work, the network in this study is therefore an entire surrogate of the inverse problem, which allowed for explicit computation of the fitted model using our approach. 
The presented approach achieves a best-case mean absolute error (MAE) of 133 Pa (13.9%) in subglottal pressure prediction with 76.6% correlation on experimental data and a re-estimated fundamental frequency MAE of 15.9 Hz (9.9%). In-detail training analysis revealed subglottal pressure as the most learnable parameter. With the physiological-based model design and advances in fast parameter prediction, this work is a next step in biomechanical VF model fitting and the estimation of laryngeal kinematics.
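The error measures reported here (mean absolute error, a percentage error, and a correlation) can be illustrated with a minimal sketch. The pressure values below are made up for illustration, and normalising the MAE by the mean measured value is an assumption: the abstract does not state how the 13.9% figure was normalised.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_mae(y_true, y_pred):
    """MAE divided by the mean measured value (assumed normalisation)."""
    return mean_absolute_error(y_true, y_pred) / (sum(y_true) / len(y_true))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical subglottal pressures in Pa (not data from the study).
measured = [800.0, 1000.0, 1200.0]
predicted = [900.0, 1000.0, 1100.0]

print(f"MAE: {mean_absolute_error(measured, predicted):.1f} Pa")
print(f"Relative MAE: {100 * relative_mae(measured, predicted):.1f} %")
print(f"Pearson r: {pearson_r(measured, predicted):.3f}")
```

The example also shows why the two measures are reported together: the predictions here correlate perfectly with the measurements yet still carry a nonzero absolute error.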
Affiliation(s)
- Jonas Donhauser
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
3
Nogueira do Nascimento U, Santos MAR, Gama ACC. Digital Videokymography: Analysis of Glottal Closure in Adults. J Voice 2024; 38:18-24. [PMID: 34417083; DOI: 10.1016/j.jvoice.2021.07.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2021] [Revised: 07/01/2021] [Accepted: 07/06/2021] [Indexed: 11/25/2022]
Abstract
INTRODUCTION High-speed videolaryngoscopy and quantitative analysis of laryngeal images are relevant in accurately diagnosing vocal fold closure patterns. OBJECTIVE To analyze the parameters of digital videokymography obtained through high-speed videolaryngoscopy in women and men with complete and incomplete glottal closure, and posterior glottal chink. METHODS We conducted an observational, analytical, cross-sectional study with data from 65 adults, which we divided into groups according to sex and glottal closure. Digital videokymography parameters were analyzed using an image-processing program. The Anderson-Darling and Mann-Whitney U tests were used to verify sample normality and compare videokymography parameters between groups, respectively. The significance level was set at 5%. RESULTS Among 65 laryngeal images, 20 each were from women with complete and incomplete glottal closure, and 20 and 5 were from men with complete and incomplete glottal closure, respectively. Considering the clinical relevance of the evaluated data, groups of 11 women and 4 men with posterior glottal chink were compared with sex-similar groups with complete glottal closure. Digital videokymography showed a lower maximum and mean vocal fold opening in women with incomplete glottal closure, and a lower dominant left vocal fold-opening amplitude and higher dominant frequency of bilateral vocal fold opening in men with incomplete glottal closure. It also showed a lower closed phase percentage in the posterior region for women and men, with higher closed phase percentage in the anterior and middle regions in women. Both groups with posterior glottal chink showed similar results. CONCLUSION Incomplete glottal closure may interfere with the results of the digital videokymography parameters, with higher impact on the posterior vocal fold region in males and the middle and anterior vocal fold regions in females.
Affiliation(s)
- Ualisson Nogueira do Nascimento
  - Federal University of Minas Gerais (UFMG), School of Medicine, Graduate Program in Speech Therapy Sciences, Belo Horizonte, Minas Gerais, Brazil
- Ana Cristina Côrtes Gama
  - Federal University of Minas Gerais (UFMG), School of Medicine, Graduate Program in Speech Therapy Sciences, Belo Horizonte, Minas Gerais, Brazil
4
Yamauchi A, Imagawa H, Yokonishi H, Sakakibara KI, Tayama N. Multivariate Analysis of Vocal Fold Vibrations in Normal Speakers Using High-Speed Digital Imaging. J Voice 2024; 38:10-17. [PMID: 34470706; DOI: 10.1016/j.jvoice.2021.08.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/06/2021] [Revised: 07/30/2021] [Accepted: 08/02/2021] [Indexed: 11/18/2022]
Abstract
INTRODUCTION Little is known about the normal variations in vocal fold vibrations. We conducted a prospective study on normal subjects using high-speed digital imaging (HSDI) to elucidate key parameters regarding age/gender-related normal variations. METHODS Forty-six healthy adult volunteers were divided into young (aged ≤35 years) male, young female, elderly (aged ≥65 years) male, and elderly female subgroups. HSDI data of sustained phonation of /i/ at a comfortable pitch and loudness were obtained, and vibratory parameters were calculated using the visual-perceptual rating, laryngotopography, digital kymography, and glottal area waveform. Multivariate analysis was then performed on these parameters to clarify the subgroup-specific key parameters. RESULTS Four key parameters were identified from a total of 83: one from visual perceptual rating and three from laryngotopography. Subgroup analyses showed that posterior-to-anterior longitudinal phase difference (PD) and high fundamental frequency (F0) were specific to young female participants. A low F0 was specific to young male participants. Large anterior-to-posterior longitudinal PD and its left-right difference were specific to elderly male participants. There were no key parameters for elderly female participants. CONCLUSIONS Methods that can assess F0 and longitudinal PD, such as visual-perceptual rating and laryngotopography, were effective in the evaluation of normal vocal fold vibrations and their variations.
Affiliation(s)
- Akihito Yamauchi
  - Department of Otolaryngology, The University of Tokyo Hospital, Bunkyo-Ku, Tokyo, Japan
- Hiroshi Imagawa
  - Department of Otolaryngology, The University of Tokyo Hospital, Bunkyo-Ku, Tokyo, Japan
- Hisayuki Yokonishi
  - Department of Otolaryngology, Tokyo Metropolitan Bokutoh Hospital, Sumida-Ku, Tokyo, Japan
- Ken-Ichi Sakakibara
  - Department of Communication Disorders, Health Sciences University of Hokkaido, Ishikari-Gun, Hokkaido, Japan
- Niro Tayama
  - Department of Otolaryngology and Tracheo-esophagology, National Center for Global Health and Medicine, Shinjuku-Ku, Tokyo, Japan
5
Malinowski J, Pietruszewska W, Stawiski K, Kowalczyk M, Barańska M, Rycerz A, Niebudek-Bogusz E. High-Speed Videoendoscopy Enhances the Objective Assessment of Glottic Organic Lesions: A Case-Control Study with Multivariable Data-Mining Model Development. Cancers (Basel) 2023; 15:3716. [PMID: 37509377; PMCID: PMC10378075; DOI: 10.3390/cancers15143716] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/20/2023] [Revised: 07/13/2023] [Accepted: 07/19/2023] [Indexed: 07/30/2023] [Open Access]
Abstract
The aim of the study was to utilize a quantitative assessment of the vibratory characteristics of vocal folds in diagnosing benign and malignant lesions of the glottis using high-speed videolaryngoscopy (HSV). METHODS Case-control study including 100 patients with unilateral vocal fold lesions in comparison to 38 normophonic subjects. Quantitative assessment with the determination of vocal fold oscillation parameters was performed based on HSV kymography. Machine-learning predictive models were developed and validated. RESULTS All calculated parameters differed significantly between healthy subjects and patients with organic lesions. The first predictive model distinguishing any organic lesion patients from healthy subjects reached an area under the curve (AUC) equal to 0.983 and presented with 89.3% accuracy, 97.0% sensitivity, and 71.4% specificity on the testing set. The second model identifying malignancy among organic lesions reached an AUC equal to 0.85 and presented with 80.6% accuracy, 100% sensitivity, and 71.1% specificity on the training set. Important predictive factors for the models were frequency perturbation measures. CONCLUSIONS The standard protocol for distinguishing between benign and malignant lesions continues to be clinical evaluation by an experienced ENT specialist and confirmed by histopathological examination. Our findings did suggest that advanced machine learning models, which consider the complex interactions present in HSV data, could potentially indicate a heightened risk of malignancy. Therefore, this technology could prove pivotal in aiding in early cancer detection, thereby emphasizing the need for further investigation and validation.
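The accuracy, sensitivity, and specificity figures quoted for the predictive models all derive from a confusion matrix. The sketch below shows the standard definitions; the counts are hypothetical and chosen only for illustration, since the abstract does not report the underlying confusion matrices.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: cases with the condition that were detected / missed.
    tn/fp: cases without the condition that were cleared / falsely flagged.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,      # all correct decisions
        "sensitivity": tp / (tp + fn),      # true-positive rate
        "specificity": tn / (tn + fp),      # true-negative rate
    }


# Hypothetical counts (NOT taken from the study).
metrics = classification_metrics(tp=45, fn=5, tn=30, fp=20)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f} %")
```

A model can reach high sensitivity with modest specificity, as in this toy example, which is the same pattern the abstract reports: few missed lesions at the cost of more false alarms.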
Affiliation(s)
- Jakub Malinowski
  - Department of Otolaryngology, Head and Neck Oncology, Medical University of Lodz, 90-419 Lodz, Poland
- Wioletta Pietruszewska
  - Department of Otolaryngology, Head and Neck Oncology, Medical University of Lodz, 90-419 Lodz, Poland
- Konrad Stawiski
  - Department of Radiation Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02115, USA
  - Department of Biostatistics and Translational Medicine, Medical University of Lodz, 90-419 Lodz, Poland
- Magdalena Kowalczyk
  - Department of Otolaryngology, Head and Neck Oncology, Medical University of Lodz, 90-419 Lodz, Poland
- Magda Barańska
  - Department of Otolaryngology, Head and Neck Oncology, Medical University of Lodz, 90-419 Lodz, Poland
- Aleksander Rycerz
  - Department of Biostatistics and Translational Medicine, Medical University of Lodz, 90-419 Lodz, Poland
- Ewa Niebudek-Bogusz
  - Department of Otolaryngology, Head and Neck Oncology, Medical University of Lodz, 90-419 Lodz, Poland
6
Fujiki RB, Croegaert-Koch CK, Thibeault SL. Videostroboscopy Versus High-Speed Videoendoscopy: Factors Influencing Ratings of Laryngeal Oscillation. J Speech Lang Hear Res 2023; 66:1496-1510. [PMID: 37040690; PMCID: PMC10457078; DOI: 10.1044/2023_jslhr-22-00649] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Revised: 01/16/2023] [Accepted: 01/23/2023] [Indexed: 05/11/2023]
Abstract
PURPOSE The purpose of this study was to determine whether patient voice-related diagnosis, severity of dysphonia, and rater's experience influence the relationship between laryngeal oscillation ratings made from videostroboscopic and high-speed videoendoscopic (HSV) exams. METHOD Stroboscopy and HSV exams from 15 patients with adductor spasmodic dysphonia (ADSD) and 15 with benign vocal fold lesions were rated for laryngeal oscillation and closure by 10 licensed speech-language pathologists (SLPs). Raters were divided into low- (< 5 years) and high-experience (> 5 years) groups. Ratings of vocal fold amplitude, mucosal wave, periodicity, phase symmetry, nonvibrating portion of the vocal fold, and glottal closure were examined using an online form adapted from the Voice Vibratory Assessment of Laryngeal Imaging (VALI). RESULTS Stroboscopy and HSV ratings were more strongly positively correlated for patients with benign vocal fold lesions (r between .43 and .75) than for those with ADSD (r between .40 and .68). Differences between stroboscopy and HSV exams were significantly greater for ratings of amplitude, mucosal wave, and periodicity in patients with ADSD than for patients with benign vocal fold lesions. Raters with < 5 years of experience showed significantly greater differences between stroboscopy and HSV ratings of amplitude and nonvibrating portion of the vocal fold for patients with ADSD only. Significantly greater differences between ratings of periodicity and phase symmetry were observed in patients with more severe dysphonia. CONCLUSIONS Differences in laryngeal ratings made between HSV and stroboscopy exams may be influenced by patient diagnosis, severity of dysphonia, and rater experience. Future study is warranted to determine how the differences observed influence clinical diagnosis and outcomes.
7
Motie-Shirazi M, Zañartu M, Peterson SD, Mehta DD, Hillman RE, Erath BD. Effect of nodule size and stiffness on phonation threshold and collision pressures in a synthetic hemilaryngeal vocal fold model. J Acoust Soc Am 2023; 153:654. [PMID: 36732229; PMCID: PMC9884154; DOI: 10.1121/10.0016997] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/27/2022] [Revised: 12/19/2022] [Accepted: 01/06/2023] [Indexed: 06/18/2023]
Abstract
Synthetic vocal fold (VF) replicas were used to explore the role of nodule size and stiffness on kinematic, aerodynamic, and acoustic measures of voiced speech production. Emphasis was placed on determining how changes in collision pressure may contribute to the development of phonotrauma. This was performed by adding spherical beads with different sizes and moduli of elasticity at the middle of the medial surface of synthetic silicone VF models, representing nodules of varying size and stiffness. The VF models were incorporated into a hemilaryngeal flow facility. For each case, self-sustained oscillations were investigated at the phonation threshold pressure. It was found that increasing the nodule diameter increased the open quotient, phonation threshold pressure, and phonation threshold flow rate. However, these values did not change considerably as a function of the modulus of elasticity of the nodule. Nevertheless, the ratio of collision pressure to subglottal pressure increased significantly for both increasing nodule size and stiffness. This suggests that over time, both growth in size and fibrosis of nodules will lead to an increasing cycle of compensatory vocal hyperfunction that accelerates phonotrauma.
Affiliation(s)
- Mohsen Motie-Shirazi
  - Department of Mechanical and Aerospace Engineering, Clarkson University, Potsdam, New York 13699, USA
- Matías Zañartu
  - Department of Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso, Chile
- Sean D Peterson
  - Department of Mechanical and Mechatronics Engineering, University of Waterloo, Waterloo, Ontario, Canada
- Daryush D Mehta
  - Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
- Robert E Hillman
  - Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
- Byron D Erath
  - Department of Mechanical and Aerospace Engineering, Clarkson University, Potsdam, New York 13699, USA
8
Hao Z, Peng J, Dang X, Yan H, Wang R. mmSafe: A Voice Security Verification System Based on Millimeter-Wave Radar. Sensors (Basel) 2022; 22:9309. [PMID: 36502011; PMCID: PMC9739021; DOI: 10.3390/s22239309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2022] [Revised: 11/15/2022] [Accepted: 11/25/2022] [Indexed: 06/17/2023]
Abstract
With the increasing popularity of smart devices, users can control their mobile phones, TVs, cars, and smart furniture by using voice assistants, but voice assistants are susceptible to intrusion by outsider speakers or playback attacks. In order to address this security issue, a millimeter-wave radar-based voice security authentication system is proposed in this paper. First, the speaker's fine-grained vocal cord vibration signal is extracted by eliminating static object clutter and motion effects; second, the weighted Mel Frequency Cepstrum Coefficients (MFCCs) are obtained as biometric features; and finally, text-independent security authentication is performed by the WMHS (Weighted MFCCs and Hog-based SVM) method. This system is highly adaptable and can authenticate designated speakers, resist intrusion by other unspecified speakers as well as playback attacks, and is secure for smart devices. Extensive experiments have verified that the system achieves a 93.4% speaker verification accuracy and a 5.8% miss detection rate for playback attacks.
Affiliation(s)
- Zhanjun Hao
  - School of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, China
  - Gansu Province Internet of Things Engineering Research Centre, Northwest Normal University, Lanzhou 730070, China
- Jianxiang Peng
  - School of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, China
- Xiaochao Dang
  - School of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, China
  - Gansu Province Internet of Things Engineering Research Centre, Northwest Normal University, Lanzhou 730070, China
- Hao Yan
  - School of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, China
- Ruidong Wang
  - School of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, China
9
Motie-Shirazi M, Zañartu M, Peterson SD, Mehta DD, Hillman RE, Erath BD. Collision Pressure and Dissipated Power Dose in a Self-Oscillating Silicone Vocal Fold Model With a Posterior Glottal Opening. J Speech Lang Hear Res 2022; 65:2829-2845. [PMID: 35914018; PMCID: PMC9911124; DOI: 10.1044/2022_jslhr-21-00471] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/02/2021] [Revised: 01/24/2022] [Accepted: 05/04/2022] [Indexed: 06/15/2023]
Abstract
PURPOSE The goal of this study was to experimentally evaluate how compensating for the adverse acoustic effects of a posterior glottal opening (PGO) by increasing subglottal pressure and changing supraglottal compression, as have been associated with vocal hyperfunction, influences the risk of vocal fold (VF) trauma. METHOD A self-oscillating synthetic silicone model of the VFs with an airflow bypass that modeled a PGO was investigated in a hemilaryngeal flow facility. The influence of compensatory mechanisms on collision pressure and dissipated collision power was investigated for different PGO areas and supraglottal compression. Compensatory behaviors were mimicked by increasing the subglottal pressure to achieve a target sound pressure level (SPL). RESULTS Increasing the subglottal pressure to compensate for decreased SPL due to a PGO produced higher values for both collision pressure and dissipated collision power. Whereas a 10-mm2 PGO area produced a 12% increase in the peak collision pressure, the dissipated collision power increased by 122%, mainly due to an increase in the magnitude of the collision velocity. This suggests that the value of peak collision pressure may not fully capture the mechanisms by which phonotrauma occurs. It was also found that an optimal value of supraglottal compression exists that maximizes the radiated SPL, indicating the potential utility of supraglottal compression as a compensatory mechanism. CONCLUSIONS Larger PGO areas are expected to increase the risk of phonotrauma due to the concomitant increase in dissipated collision power associated with maintaining SPL. Furthermore, the risk of VF damage may not be fully characterized by only the peak collision pressure.
Affiliation(s)
- Mohsen Motie-Shirazi
  - Department of Mechanical and Aeronautical Engineering, Clarkson University, Potsdam, NY
- Matías Zañartu
  - Department of Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso, Chile
- Sean D. Peterson
  - Department of Mechanical and Mechatronics Engineering, University of Waterloo, Ontario, Canada
- Daryush D. Mehta
  - Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston
- Robert E. Hillman
  - Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston
- Byron D. Erath
  - Department of Mechanical and Aeronautical Engineering, Clarkson University, Potsdam, NY
10
Taylor CJ, Thomson SL. Optimization of Synthetic Vocal Fold Models for Glottal Closure. J Eng Sci Med Diagn Ther 2022; 5:031106. [PMID: 35832120; PMCID: PMC9132011; DOI: 10.1115/1.4054194] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2021] [Revised: 03/23/2022] [Indexed: 06/15/2023]
Abstract
Synthetic, self-oscillating models of the human vocal folds are used to study the complex and inter-related flow, structure, and acoustical aspects of voice production. The vocal folds typically collide during each cycle, thereby creating a brief period of glottal closure that has important implications for flow, acoustic, and motion-related outcomes. Many previous synthetic models, however, have been limited by incomplete glottal closure during vibration. In this study, a low-fidelity, two-dimensional, multilayer finite element model of vocal fold flow-induced vibration was coupled with a custom genetic algorithm optimization code to determine geometric and material characteristics that would be expected to yield physiologically-realistic frequency and closed quotient values. The optimization process yielded computational models that vibrated with favorable frequency and closed quotient characteristics. A tradeoff was observed between frequency and closed quotient. A synthetic, self-oscillating vocal fold model with geometric and material properties informed by the simulation outcomes was fabricated and tested for onset pressure, oscillation frequency, and closed quotient. The synthetic model successfully vibrated at a realistic frequency and exhibited a nonzero closed quotient. The methodology described in this study provides potential direction for fabricating synthetic models using isotropic silicone materials that can be designed to vibrate with physiologically-realistic frequencies and closed quotient values. The results also show the potential for a low-fidelity model optimization approach to be used to tune synthetic vocal fold model characteristics for specific vibratory outcomes.
Affiliation(s)
- Cassandra J. Taylor
  - Department of Mechanical Engineering, Brigham Young University, 350 EB, Provo, UT 84602
- Scott L. Thomson
  - Department of Mechanical Engineering, Brigham Young University, 350 EB, Provo, UT 84602
11
Quantitative Analysis of Vocal Fold Vibration using High-Speed Videoendoscopy in Children with and without Bilateral Lesions. J Voice 2022; 36:176-182. [PMID: 32712076; PMCID: PMC7854946; DOI: 10.1016/j.jvoice.2020.05.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2020] [Revised: 05/04/2020] [Accepted: 05/07/2020] [Indexed: 11/22/2022]
Abstract
OBJECTIVE To provide data on the measurable vocal fold vibratory differences in children with and without vocal fold lesions using high-speed videoendoscopy. DESIGN Prospective study, 24 participants (8 healthy; 16 with lesions) between the ages of 5 and 10. METHODS Rigid high-speed videoendoscopy at the rate of 8,000 frames per second was used to examine participants. Four objective vocal fold phase linearity measures were obtained to establish anterior-posterior contact and separation vibratory patterns. RESULTS All objective measures showed a difference between nonlesion and bilateral vocal fold lesion groups. Contact-separation patterns in all nonlesion girls and young pre-pubertal boys exhibited an anterior-to-posterior contact and posterior-to-anterior separation; while older boys differed. The objective measures of open quotient, left-right relative phase asymmetry and speed index, showed linear anterior-posterior patterns within the nonlesion group; while the bilateral vocal fold lesion group displayed nonlinear patterns. Patterns in the posterior region of the vocal fold were similar in both groups; while patterns in the anterior region differed. CONCLUSIONS This study suggests lesions have an effect on the anterior aspect of vocal fold vibratory patterns specifically anterior to the lesions. Age-related differences for males are also evidenced, prompting further investigation of laryngeal development in males and females from childhood to adulthood. This study could serve as a basis for the development of objective clinical measurements of vocal fold vibration in presence of lesions. Further findings could help redefine the theoretical framework of pediatric voice.
12
Kopczynski B, Niebudek-Bogusz E, Pietruszewska W, Strumillo P. Segmentation of Glottal Images from High-Speed Videoendoscopy Optimized by Synchronous Acoustic Recordings. Sensors (Basel) 2022; 22:1751. [PMID: 35270897; PMCID: PMC8915112; DOI: 10.3390/s22051751] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/30/2021] [Revised: 02/12/2022] [Accepted: 02/15/2022] [Indexed: 05/17/2023]
Abstract
Laryngeal high-speed videoendoscopy (LHSV) is an imaging technique offering novel visualization quality of the vibratory activity of the vocal folds. However, in most image analysis methods, the interaction of the medical personnel and access to ground truth annotations are required to achieve accurate detection of vocal folds edges. In our fully automatic method, we combine video and acoustic data that are synchronously recorded during the laryngeal endoscopy. We show that the image segmentation algorithm of the glottal area can be optimized by matching the Fourier spectra of the pre-processed video and the spectra of the acoustic recording during the phonation of sustained vowel /i:/. We verify our method on a set of LHSV recordings taken from subjects with normophonic voice and patients with voice disorders due to glottal insufficiency. We show that the computed geometric indices of the glottal area make it possible to discriminate between normal and pathologic voices. The median of the Open Quotient and Minimal Relative Glottal Area values for healthy subjects were 0.69 and 0.06, respectively, while for dysphonic subjects were 1 and 0.35, respectively. We also validate these results using independent phoniatrician experts.
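The two indices reported in this abstract can be illustrated on a glottal area waveform, the per-frame area of the glottal opening over one vibratory cycle. The sketch below uses commonly cited definitions that are assumed here, not taken from the paper: Open Quotient as the fraction of the cycle during which the glottis is open, and Minimal Relative Glottal Area as the minimum area divided by the maximum area. The sample waveforms are invented for illustration.

```python
def open_quotient(area, threshold: float = 0.0) -> float:
    """Fraction of frames in one cycle where the glottal area exceeds threshold."""
    open_frames = sum(1 for a in area if a > threshold)
    return open_frames / len(area)


def minimal_relative_glottal_area(area) -> float:
    """Smallest glottal area in the cycle relative to the largest."""
    return min(area) / max(area)


# Hypothetical one-cycle waveforms in arbitrary area units (not study data).
complete_closure = [0, 0, 0, 2, 5, 8, 10, 8, 5, 2]    # glottis closes fully
incomplete_closure = [4, 5, 7, 9, 10, 9, 7, 5, 4, 4]  # glottis never closes

for label, gaw in (("complete", complete_closure),
                   ("incomplete", incomplete_closure)):
    print(f"{label}: OQ = {open_quotient(gaw):.2f}, "
          f"MRGA = {minimal_relative_glottal_area(gaw):.2f}")
```

Under these definitions a glottis that never closes gives an Open Quotient of 1 and a nonzero minimal relative area, which is the pattern of the median values the abstract reports for dysphonic subjects with glottal insufficiency.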
Affiliation(s)
- Bartosz Kopczynski
- Institute of Electronics, Lodz University of Technology, 90-924 Lodz, Poland
- Ewa Niebudek-Bogusz
- Department of Otolaryngology, Head and Neck Oncology, Medical University of Lodz, 90-001 Lodz, Poland
- Wioletta Pietruszewska
- Department of Otolaryngology, Head and Neck Oncology, Medical University of Lodz, 90-001 Lodz, Poland
- Pawel Strumillo
- Institute of Electronics, Lodz University of Technology, 90-924 Lodz, Poland

13
Movahhedi M, Geng B, Xue Q, Zheng X. A computational framework for patient-specific surgical planning of type 1 thyroplasty. JASA Express Letters 2021; 1:125203. [PMID: 36154377 DOI: 10.1121/10.0009084] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/16/2023]
Abstract
A computational framework is proposed for virtual optimization of implant configurations of type 1 thyroplasty based on patient-specific laryngeal structures reconstructed from MRI images. Through integration of a muscle mechanics-based laryngeal posturing model, a flow-structure-acoustics interaction voice production model, a real-coded genetic algorithm, and virtual implant insertion, the framework acquires the implant configuration that achieves the optimal acoustic objectives. The framework is showcased by successfully optimizing an implant that restores acoustic features of a diseased voice resulting from unilateral vocal fold paralysis (UVFP) in producing a sustained vowel utterance. The sound intensity is improved from 62 dB (UVFP) to 81 dB (post-correction).
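To put the reported 62 dB to 81 dB improvement in perspective, a level difference in decibels can be converted to a linear ratio. The Python sketch below uses the standard relations, a factor of 10^(ΔL/10) for intensity and 10^(ΔL/20) for sound pressure. The function names are illustrative and are not part of the cited framework.

```python
def intensity_ratio(level_before_db, level_after_db):
    """Linear intensity ratio corresponding to a change in level (dB)."""
    return 10 ** ((level_after_db - level_before_db) / 10)


def pressure_ratio(level_before_db, level_after_db):
    """Linear sound-pressure ratio corresponding to a change in level (dB)."""
    return 10 ** ((level_after_db - level_before_db) / 20)


# Values taken from the abstract: 62 dB (UVFP) and 81 dB (post-correction).
print(81 - 62)                            # 19 dB gain
print(round(intensity_ratio(62, 81), 1))  # 79.4
print(round(pressure_ratio(62, 81), 1))   # 8.9
```

A 19 dB gain therefore corresponds to roughly an 80-fold increase in intensity, or about a 9-fold increase in sound pressure amplitude.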
Affiliation(s)
- Mohammadreza Movahhedi
- Department of Mechanical Engineering, University of Maine, Orono, Maine 04473, USA
- Biao Geng
- Department of Mechanical Engineering, University of Maine, Orono, Maine 04473, USA
- Qian Xue
- Department of Mechanical Engineering, University of Maine, Orono, Maine 04473, USA
- Xudong Zheng
- Department of Mechanical Engineering, University of Maine, Orono, Maine 04473, USA

14
Comparative analysis of high-speed videolaryngoscopy images and sound data simultaneously acquired from rigid and flexible laryngoscope: a pilot study. Sci Rep 2021; 11:20480. [PMID: 34650174 PMCID: PMC8516923 DOI: 10.1038/s41598-021-99948-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/28/2020] [Accepted: 10/04/2021] [Indexed: 12/03/2022] Open
Abstract
High-Speed Videoendoscopy (HSV) is becoming a robust tool for the assessment of vocal fold vibration in laboratory investigation and clinical practice. We describe the first successful application of flexible High-Speed Videoendoscopy with an innovative laser light source conducted in clinical settings. The acquired image and simultaneously recorded audio data are compared to the results obtained by means of a rigid endoscope. We demonstrate that HSV recordings with a fiber-optic laryngoscope yield consistently bright color images suitable for parametrization of vocal fold oscillation, similar to HSV data obtained from a rigid laryngoscope. The comparison of period and amplitude perturbation parameters calculated from image and audio data acquired from flexible and rigid HSV recordings objectively confirms that flexible High-Speed Videoendoscopy is a more suitable method for examination of natural phonation. The HSV-based measures generated from this kymographic analysis are arguably a better representation of the vocal fold vibrations than the acoustic analysis because their quantification is independent of vocal tract influences. This experimental study has several implications for further research on the application of HSV in the clinical assessment of the nature of glottal pathologies and their effect on vocal fold vibrations.

15
Malinowski J, Niebudek-Bogusz E, Just M, Morawska J, Racino A, Hoffman J, Barańska M, Kowalczyk MM, Pietruszewska W. Laryngeal High-Speed Videoendoscopy with Laser Illumination: A Preliminary Report. Otolaryngol Pol 2021; 75:1-10. [PMID: 35175220 DOI: 10.5604/01.3001.0015.2575] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/13/2022]
Abstract
Introduction: Advances in computer image analysis have enabled the use of new functional imaging methods in the diagnosis of laryngeal diseases. Particularly interesting techniques of dynamic laryngeal imaging involve High Speed Videoendoscopy (HSV). This still-developing technique overcomes the limitations of laryngovideostroboscopy (LVS) and allows a more detailed analysis of glottal function based on images of the actual vibrations of the vocal folds. It also enables the determination of objective coefficients parameterizing phonatory vibrations of the vocal folds.
Aim: The aim of this pilot study was to evaluate the use of a high-speed videoendoscopy set with laser illumination for the diagnosis of glottic pathology in ENT practice.
Material and methods: The study included 40 patients who underwent LVS followed by HSV. The modern HSV examination kit, the Advanced Larynx Imager System (ALIS), used for the first time in a clinical setting in Poland, has significantly improved operational parameters compared with previously used high-speed cameras: a light head, continuous lighting operation without excessive heating of the head tip, and registration of the image in full color scale. Thanks to these improvements, the safety and course of the examination do not differ from laryngoscopy conducted with commonly used recorders. The device owes some of these improvements to a laser illuminator, which was used for the first time as the main light source in a high-speed camera. In the study, two cases were selected to present the results of HSV and the analysis of the generated kymograms: a woman with no glottic pathology and a man with a polyp of the right vocal fold. In the first case, the HSV examination, compared with LVS, revealed a subtle functional disorder of the glottis in the form of a tendency to hyperphonation. The patient with an organic lesion had a clearly visible irregularity of vocal fold vibrations, which also made it possible to trace mucosal wave disturbances related to its reflection from the pathological structure of the glottis and the formation of a return wave, both on the fold affected by the lesion and, to a lesser extent, contralaterally. The glottic dysfunctions observed in the studied patients were confirmed in the generated kymograms and the graphs of the glottal width waveform (GWW), as well as in the parameters calculated on their basis, assessing the frequency and amplitude of phonatory vibrations.
Conclusions: The use of high-speed videoendoscopy allows for a much more accurate assessment of the phonatory function of the glottis than laryngovideostroboscopy. The presented HSV system provides high-quality kinematic images of the larynx, color fidelity, and contrast. The use of this technology in laryngological practice enables precise structural and functional assessment of the glottis and detection of subtle phonation disorders that elude the techniques used so far.
Affiliation(s)
- Jakub Malinowski
- Department of Otolaryngology, Head and Neck Oncology, Medical University of Lodz, Poland
- Ewa Niebudek-Bogusz
- Department of Otolaryngology, Head and Neck Oncology, Medical University of Lodz, Poland
- Marcin Just
- Diagnova Technologies, Wroclaw Technology Park, Wroclaw, Poland
- Joanna Morawska
- Department of Otolaryngology, Head and Neck Oncology, Medical University of Lodz, Poland
- Anna Racino
- Diagnova Technologies, Wroclaw Technology Park, Wroclaw, Poland
- Joanna Hoffman
- Department of Otolaryngology, Head and Neck Oncology, Medical University of Lodz, Poland
- Magda Barańska
- Department of Otolaryngology, Head and Neck Oncology, Medical University of Lodz, Poland
- Wioletta Pietruszewska
- Department of Otolaryngology, Head and Neck Oncology, Medical University of Lodz, Poland

16
Motie-Shirazi M, Zañartu M, Peterson SD, Erath BD. Vocal fold dynamics in a synthetic self-oscillating model: Intraglottal aerodynamic pressure and energy. The Journal of the Acoustical Society of America 2021; 150:1332. [PMID: 34470335 PMCID: PMC8387087 DOI: 10.1121/10.0005882] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/22/2020] [Revised: 07/21/2021] [Accepted: 07/26/2021] [Indexed: 06/13/2023]
Abstract
Self-sustained oscillations of the vocal folds (VFs) during phonation are the result of the energy exchange between the airflow and VF tissue. Understanding this mechanism requires accurate investigation of the aerodynamic pressures acting on the VF surface during oscillation. A self-oscillating silicone VF model was used in a hemilaryngeal flow facility to measure the time-varying pressure distribution along the inferior-superior thickness of the VF and at four discrete locations in the anterior-posterior direction. It was found that the intraglottal pressures during the opening and closing phases of the glottis are highly dependent on three-dimensional and unsteady flow behaviors. The measured aerodynamic pressures and estimates of the medial surface velocity were used to compute the intraglottal energy transfer from the airflow to the VFs. The energy was greatest at the anterior-posterior midline and decreased significantly toward the anterior/posterior endpoints. The findings provide insight into the dynamics of the VF oscillation and potential causes of some VF disorders.
Affiliation(s)
- Mohsen Motie-Shirazi
- Department of Mechanical and Aeronautical Engineering, Clarkson University, Potsdam, New York, USA
- Matías Zañartu
- Department of Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso, Chile
- Sean D Peterson
- Department of Mechanical and Mechatronics Engineering, University of Waterloo, Waterloo, Ontario, Canada
- Byron D Erath
- Department of Mechanical and Aeronautical Engineering, Clarkson University, Potsdam, New York, USA

17
Motie-Shirazi M, Zañartu M, Peterson SD, Erath BD. Vocal fold dynamics in a synthetic self-oscillating model: Contact pressure and dissipated-energy dose. The Journal of the Acoustical Society of America 2021; 150:478. [PMID: 34340498 PMCID: PMC8298101 DOI: 10.1121/10.0005596] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/02/2020] [Revised: 06/18/2021] [Accepted: 06/21/2021] [Indexed: 06/13/2023]
Abstract
The energy dissipated during vocal fold (VF) contact is a predictor of phonotrauma. Difficulty measuring contact pressure has forced prior energy dissipation estimates to rely upon generalized approximations of the contact dynamics. To address this shortcoming, contact pressure was measured in a self-oscillating synthetic VF model with high spatiotemporal resolution using a hemilaryngeal configuration. The approach yields a temporal resolution of less than 0.26 ms and a spatial resolution of 0.254 mm in the inferior-superior direction. The average contact pressure was found to be 32% of the peak contact pressure, 60% higher than the ratio estimated in prior studies. It was found that 52% of the total power was dissipated due to collision. The power dissipated during contact was an order of magnitude higher than the power dissipated due to internal friction during the non-contact phase of oscillation. Both the contact pressure magnitude and dissipated power were found to be maximums at the mid anterior-posterior position, supporting the idea that collision is responsible for the formation of benign lesions, which normally appear at the middle third of the VF.
Affiliation(s)
- Mohsen Motie-Shirazi
- Department of Mechanical and Aeronautical Engineering, Clarkson University, Potsdam, New York 13699, USA
- Matías Zañartu
- Department of Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso, Chile
- Sean D Peterson
- Department of Mechanical and Mechatronics Engineering, University of Waterloo, Waterloo, Ontario, Canada
- Byron D Erath
- Department of Mechanical and Aeronautical Engineering, Clarkson University, Potsdam, New York 13699, USA

18
Stewart ME, Erath BD. Investigating blunt force trauma to the larynx: The role of inferior-superior vocal fold displacement on phonation. J Biomech 2021; 121:110377. [PMID: 33819698 DOI: 10.1016/j.jbiomech.2021.110377] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2020] [Revised: 02/24/2021] [Accepted: 03/01/2021] [Indexed: 11/26/2022]
Abstract
Blunt force trauma to the larynx, which may result from motor vehicle collisions, sports activities, etc., can cause significant damage, often leading to displaced fractures of the laryngeal cartilages, thereby disrupting vocal function. Current surgical interventions primarily focus on airway restoration to stabilize the patient, with restoration of vocal function usually being a secondary consideration. Due to laryngeal fracture, asymmetric vertical misalignment of the left or right vocal fold (VF) in the inferior-superior direction often occurs. This affects VF closure and can lead to a weak, breathy voice requiring increased vocal effort. It is unclear, however, how much vertical VF misalignment can be tolerated before voice quality degrades significantly. To address this need, the influence of inferior-superior VF displacement on phonation is investigated in 1.0 mm increments using synthetic, self-oscillating VF models in a physiologically representative facility. Acoustic (SPL, frequency, H1-H2, jitter, and shimmer), kinematic (amplitude and phase differences), and aerodynamic (flow rate and subglottal pressure) parameters are investigated as a function of inferior-superior vertical displacement. Significant findings include that once the inferior-superior medial length of the VF is surpassed, sustained phonation degrades precipitously, becoming severely pathological. If laryngeal reconstruction approaches can ensure VF contact is maintained during phonation (i.e., vertical displacement doesn't surpass VF medial length), improved vocal outcomes are expected.
Affiliation(s)
- Molly E Stewart
- Department of Mechanical and Aeronautical Engineering, Clarkson University, 8 Clarkson Ave, Potsdam, NY 13699, United States
- Byron D Erath
- Department of Mechanical and Aeronautical Engineering, Clarkson University, 8 Clarkson Ave, Potsdam, NY 13699, United States

19
Fitting synthetic to clinical kymographic images for deriving kinematic vocal fold parameters: Application to left-right vibratory phase differences. Biomed Signal Process Control 2021. [DOI: 10.1016/j.bspc.2020.102253] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]

20
Gómez P, Kist AM, Schlegel P, Berry DA, Chhetri DK, Dürr S, Echternach M, Johnson AM, Kniesburges S, Kunduk M, Maryn Y, Schützenberger A, Verguts M, Döllinger M. BAGLS, a multihospital Benchmark for Automatic Glottis Segmentation. Sci Data 2020; 7:186. [PMID: 32561845 PMCID: PMC7305104 DOI: 10.1038/s41597-020-0526-3] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 09/09/2019] [Accepted: 05/15/2020] [Indexed: 02/06/2023] Open
Abstract
Laryngeal videoendoscopy is one of the main tools in clinical examinations for voice disorders and voice research. Using high-speed videoendoscopy, it is possible to fully capture the vocal fold oscillations; however, processing the recordings typically involves a time-consuming segmentation of the glottal area by trained experts. Even though automatic methods have been proposed and the task is particularly suited for deep learning methods, there are no public datasets and benchmarks available to compare methods and to allow training of generalizing deep learning models. In an international collaboration of researchers from seven institutions from the EU and USA, we have created BAGLS, a large, multihospital dataset of 59,250 high-speed videoendoscopy frames with individually annotated segmentation masks. The frames are based on 640 recordings of healthy and disordered subjects that were recorded with varying technical equipment by numerous clinicians. The BAGLS dataset will allow an objective comparison of glottis segmentation methods and will enable interested researchers to train their own models and compare their methods.
Affiliation(s)
- Pablo Gómez
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nürnberg, Waldstraße 1, 91054, Erlangen, Germany
- Andreas M Kist
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nürnberg, Waldstraße 1, 91054, Erlangen, Germany
- Patrick Schlegel
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nürnberg, Waldstraße 1, 91054, Erlangen, Germany
- David A Berry
- Department of Head and Neck Surgery, David Geffen School of Medicine at the University of California, Los Angeles, Los Angeles, California, USA
- Dinesh K Chhetri
- Department of Head and Neck Surgery, David Geffen School of Medicine at the University of California, Los Angeles, Los Angeles, California, USA
- Stephan Dürr
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nürnberg, Waldstraße 1, 91054, Erlangen, Germany
- Matthias Echternach
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Munich University Hospital (LMU), Munich, Germany
- Aaron M Johnson
- NYU Voice Center, Department of Otolaryngology - Head and Neck Surgery, New York University School of Medicine, New York, New York, USA
- Stefan Kniesburges
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nürnberg, Waldstraße 1, 91054, Erlangen, Germany
- Melda Kunduk
- Department of Communication Sciences and Disorders, Louisiana State University, Baton Rouge, Louisiana, USA
- Youri Maryn
- European Institute for ORL-HNS, Department of Otorhinolaryngology and Head & Neck Surgery, Sint-Augustinus GZA, Wilrijk, Belgium
- Department of Speech, Language and Hearing sciences, University of Ghent, Ghent, Belgium
- Faculty of Education, Health and Social Work, University College Ghent, Ghent, Belgium
- Faculty of Psychology and Educational Sciences, School of Logopedics, Université Catholique de Louvain, Louvain-la-Neuve, Belgium
- Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium
- Anne Schützenberger
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nürnberg, Waldstraße 1, 91054, Erlangen, Germany
- Monique Verguts
- European Institute for ORL-HNS, Department of Otorhinolaryngology and Head & Neck Surgery, Sint-Augustinus GZA, Wilrijk, Belgium
- Department of Otorhinolaryngology and Voice Disorders, Diest General Hospital, Diest, Belgium
- Michael Döllinger
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nürnberg, Waldstraße 1, 91054, Erlangen, Germany

21
Mohd Khairuddin KA, Ahmad K, Ibrahim HM, Yan Y. Effects of Using Laryngeal High-Speed Videoendoscopy Images Visualizing Partial Views of The Glottis on Measurement Outcomes. J Voice 2020; 36:106-112. [PMID: 32456835 DOI: 10.1016/j.jvoice.2020.04.027] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/23/2020] [Revised: 04/21/2020] [Accepted: 04/22/2020] [Indexed: 11/29/2022]
Abstract
Ideally, an analysis method for laryngeal high-speed videoendoscopy (LHSV) based on the glottal area waveform (GAW) requires images of a complete view of the glottis to ensure findings that are representative of the vibratory behaviors of the whole vocal folds. However, in practice, the preferred images may not be obtained at all times. Often, the only available images that a clinician has to work with consist of a partial view of the glottis. This study aims to examine the effects of using images of a partial view of the glottis (ie, posterior-middle, anterior-middle, or middle) on the LHSV-based measures (ie, fundamental frequency (F0GAW), frequency perturbation (jitterGAW), amplitude perturbation (shimmerGAW), open quotient (OQGAW), and Nyquist plot). The participants consisted of 9 young normophonic females. The procedures involved LHSV recording of the vibration of the vocal folds. The images of the complete view of the glottis were analyzed to obtain the LHSV-based measures. The same images were used to simulate the images of partial views of the glottis by changing the outline of the region of interest to include only either the posterior-middle, anterior-middle, or middle parts of the glottis. The LHSV-based measures from the images of the partial views were then compared to those from the complete view. The results showed that all LHSV-based measures from the images of the posterior-middle view were similar to those of the complete view. However, only the F0GAW, jitterGAW, and shimmerGAW from the images of the anterior-middle and middle views were similar to those of the complete view. The images of the anterior-middle and middle views generated lower OQGAW and different Nyquist plots than those of the complete view.
In conclusion, all LHSV-based measures from the images of the posterior-middle view of the glottis, and only the F0GAW, jitterGAW, and shimmerGAW from the images of the anterior-middle and middle views of the glottis reflect the vibratory behaviors of the whole vocal folds. The same conclusion could not be applied to the OQGAW and Nyquist plots of the images of the anterior-middle and middle views of the glottis. A possible effect of the presence or absence of a posterior glottal gap on the findings warrants further confirmation.
Affiliation(s)
- Khairy Anuar Mohd Khairuddin
- Speech Sciences Program, Centre for Rehabilitation and Special Needs, Faculty of Health Sciences, Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia; Speech Pathology Program, School of Health Sciences, Universiti Sains Malaysia, Kelantan, Malaysia
- Kartini Ahmad
- Speech Sciences Program, Centre for Rehabilitation and Special Needs, Faculty of Health Sciences, Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia
- Hasherah Mohd Ibrahim
- Speech Sciences Program, Centre for Rehabilitation and Special Needs, Faculty of Health Sciences, Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia
- Yuling Yan
- Department of Bioengineering, School of Engineering, Santa Clara University, California

22
Abstract
This review provides a comprehensive compilation, from a digital image processing point of view, of the most important techniques currently developed to characterize and quantify the vibration behaviour of the vocal folds, along with a detailed description of the laryngeal imaging modalities currently used in the clinic. The review presents an overview of the most significant glottal-gap segmentation and facilitative playback techniques used in the literature for this purpose, and shows the drawbacks and challenges that remain unsolved in developing robust vocal fold vibration analysis tools based on digital image processing.

23
Fehling MK, Grosch F, Schuster ME, Schick B, Lohscheller J. Fully automatic segmentation of glottis and vocal folds in endoscopic laryngeal high-speed videos using a deep Convolutional LSTM Network. PLoS One 2020; 15:e0227791. [PMID: 32040514 PMCID: PMC7010264 DOI: 10.1371/journal.pone.0227791] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 02/21/2019] [Accepted: 12/25/2019] [Indexed: 01/22/2023] Open
Abstract
The objective investigation of the dynamic properties of vocal fold vibrations demands the recording and further quantitative analysis of laryngeal high-speed video (HSV). Quantification of the vocal fold vibration patterns requires as a first step the segmentation of the glottal area within each video frame, from which the vibrating edges of the vocal folds are usually derived. Consequently, the outcome of any further vibration analysis depends on the quality of this initial segmentation process. In this work we propose for the first time a procedure to fully automatically segment not only the time-varying glottal area but also the vocal fold tissue directly from laryngeal HSV using a deep Convolutional Neural Network (CNN) approach. Eighteen different CNN configurations were trained and evaluated on a total of 13,000 HSV frames obtained from 56 healthy and 74 pathologic subjects. The segmentation quality of the best-performing CNN model, which uses Long Short-Term Memory (LSTM) cells to also take the temporal context into account, was investigated in depth on 15 test video sequences comprising 100 consecutive images each. As performance measures, the Dice Coefficient (DC) and the precision of four anatomical landmark positions were used. Over all test data, a mean DC of 0.85 was obtained for the glottis and 0.91 and 0.90 for the right and left vocal fold (VF), respectively. The grand average precision of the identified landmarks amounts to 2.2 pixels and is in the same range as comparable manual expert segmentations, which can be regarded as the gold standard. The method proposed here requires no user interaction and overcomes the limitations of current semiautomatic or computationally expensive approaches.
Thus, it also allows for the analysis of long HSV sequences and holds the promise of facilitating the objective analysis of vocal fold vibrations in clinical routine. The dataset used here, including the ground truth, will be provided freely to all scientific groups to allow quantitative benchmarking of segmentation approaches in the future.
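The Dice Coefficient used as the segmentation quality measure above has a standard definition: twice the overlap of two masks divided by the sum of their sizes. The following Python sketch illustrates it on small binary masks. The function name, the nested-list mask format, and the example masks are assumptions made for illustration and are not taken from the cited work.

```python
def dice_coefficient(predicted, reference):
    """Dice coefficient 2*|A ∩ B| / (|A| + |B|) for two binary masks.

    Masks are nested lists of 0/1 values with identical shape.
    Returns 1.0 when both masks are empty (a common convention).
    """
    if len(predicted) != len(reference) or any(
        len(p_row) != len(r_row) for p_row, r_row in zip(predicted, reference)
    ):
        raise ValueError("masks must have the same shape")

    overlap = predicted_size = reference_size = 0
    for p_row, r_row in zip(predicted, reference):
        for p, r in zip(p_row, r_row):
            p, r = bool(p), bool(r)
            overlap += int(p and r)
            predicted_size += int(p)
            reference_size += int(r)

    total = predicted_size + reference_size
    return 1.0 if total == 0 else 2 * overlap / total


# Hypothetical 3x4 glottis masks: model output vs. manual annotation.
predicted_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
reference_mask = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]

# 3 overlapping pixels, mask sizes 5 and 4: 2*3 / (5+4)
print(round(dice_coefficient(predicted_mask, reference_mask), 3))  # 0.667
```

A value of 1 means the two masks coincide exactly, so the reported means of 0.85 to 0.91 indicate substantial but not perfect overlap with the manual annotations.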
Affiliation(s)
- Mona Kirstin Fehling
- Department of Computer Science, Trier University of Applied Sciences, Schneidershof, Trier, Germany
- Fabian Grosch
- Department of Computer Science, Trier University of Applied Sciences, Schneidershof, Trier, Germany
- Maria Elke Schuster
- Department of Otorhinolaryngology and Head and Neck Surgery, University of Munich, Campus Grosshadern, München, Germany
- Bernhard Schick
- Department of Otorhinolaryngology, Saarland University Hospital, Homburg/Saar, Germany
- Jörg Lohscheller
- Department of Computer Science, Trier University of Applied Sciences, Schneidershof, Trier, Germany

24
Kim GH, Lee YW, Bae IH, Park HJ, Wang SG, Kwon SB. Usefulness of Two-Dimensional Digital Kymography in Patients With Vocal Fold Scarring. J Voice 2019; 33:906-914. [DOI: 10.1016/j.jvoice.2018.06.003] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/11/2018] [Revised: 06/04/2018] [Accepted: 06/06/2018] [Indexed: 11/29/2022]

25
Motie-Shirazi M, Zañartu M, Peterson SD, Mehta DD, Kobler JB, Hillman RE, Erath BD. Toward Development of a Vocal Fold Contact Pressure Probe: Sensor Characterization and Validation Using Synthetic Vocal Fold Models. Applied Sciences-Basel 2019; 9. [PMID: 32377408 PMCID: PMC7202565 DOI: 10.3390/app9153002] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 11/16/2022]
Abstract
Excessive vocal fold collision pressures during phonation are considered to play a primary role in the formation of benign vocal fold lesions, such as nodules. The ability to accurately and reliably acquire intraglottal pressure has the potential to provide unique insights into the pathophysiology of phonotrauma. Difficulties arise, however, in directly measuring vocal fold contact pressures due to physical intrusion from the sensor that may disrupt the contact mechanics, as well as difficulty in determining probe/sensor position relative to the contact location. These issues are quantified and addressed through the implementation of a novel approach for identifying the timing and location of vocal fold contact, and measuring intraglottal and vocal fold contact pressures via a pressure probe embedded in the wall of a hemi-laryngeal flow facility. The accuracy and sensitivity of the pressure measurements are validated against ground truth values. Application to in vivo approaches is assessed by acquiring intraglottal and VF contact pressures using a synthetic, self-oscillating vocal fold model in a hemi-laryngeal configuration, where the sensitivity of the measured intraglottal and vocal fold contact pressure relative to the sensor position is explored.
Affiliation(s)
- Mohsen Motie-Shirazi
- Department of Mechanical & Aeronautical Engineering, Clarkson University, Potsdam, NY 13699, USA
- Matías Zañartu
- Department of Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso 2390123, Chile
- Sean D. Peterson
- Department of Mechanical and Mechatronics Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada
- Daryush D. Mehta
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, MA 02114, USA
- James B. Kobler
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, MA 02114, USA
- Robert E. Hillman
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, MA 02114, USA
- Byron D. Erath
- Department of Mechanical & Aeronautical Engineering, Clarkson University, Potsdam, NY 13699, USA

26
Zhang Z. Vocal fold contact pressure in a three-dimensional body-cover phonation model. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2019; 146:256. [PMID: 31370600 PMCID: PMC6642050 DOI: 10.1121/1.5116138] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 01/07/2019] [Revised: 06/18/2019] [Accepted: 06/20/2019] [Indexed: 05/18/2023]
Abstract
The goal of this study is to identify vocal fold geometric and mechanical conditions that are likely to produce large contact pressure and thus high risk of vocal fold injury. Using a three-dimensional computational model of phonation, parametric simulations are performed with co-variations in vocal fold geometry and stiffness, with and without a vocal tract. For each simulation, the peak contact pressure is calculated. The results show that the subglottal pressure and the transverse stiffness of the vocal folds in the coronal plane have the largest and most consistent effect on the peak contact pressure, indicating the importance of maintaining a balance between the subglottal pressure and transverse stiffness to avoiding vocal fold injury. The presence of a vocal tract generally increases the peak contact pressure, particularly for an open-mouth vocal tract configuration. While a low degree of vocal fold approximation significantly reduces vocal fold contact pressure, for conditions of moderate and tight vocal fold approximation changes in vocal fold approximation may increase or decrease the peak contact pressure. The effects of the medial surface thickness and vocal fold stiffness along the anterior-posterior direction are similarly inconsistent and vary depending on other control parameters and the vocal tract configuration.
Affiliation(s)
- Zhaoyan Zhang
- Department of Head and Neck Surgery, University of California, Los Angeles, 31-24 Rehabilitation Center, 1000 Veteran Avenue, Los Angeles, California 90095-1794, USA
|
27
|
Lee JC, Wang SG, Sung ES, Bae IH, Kim ST, Lee YW. Clinical Practicability of a Newly Developed Real-time Digital Kymographic System. J Voice 2019; 33:346-351. [DOI: 10.1016/j.jvoice.2017.10.024] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 09/19/2017] [Revised: 10/28/2017] [Accepted: 10/31/2017] [Indexed: 10/18/2022]
|
28
|
Sadeghi H, Döllinger M, Kaltenbacher M, Kniesburges S. Aerodynamic impact of the ventricular folds in computational larynx models. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2019; 145:2376. [PMID: 31046372 DOI: 10.1121/1.5098775] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/21/2018] [Accepted: 04/01/2019] [Indexed: 06/09/2023]
Abstract
Ventricular folds (VeFs) act as passive, non-moving structures during normal phonation. According to the literature, VeFs potentially aid the flow-driven oscillations of the vocal folds (VFs) that produce the primary sound of human phonation. In this study, large eddy simulations were performed to analyze this influence in a numerical model with imposed VF motion as measured experimentally from a synthetic silicone vocal fold model. Model configurations with and without VeFs were considered. Furthermore, configurations with rectangular and elliptical glottis shapes were simulated to investigate the effects of three-dimensional glottal jet evolutions. Results showed that VeFs increased flow rate and transglottal pressure difference by a decrease in the pressure level in the ventricles immediately downstream of the VFs. This led to an increase in the glottal flow resistance, increased energy transfer rate between the flow and VFs, and a simultaneous decrease in the laryngeal flow resistance, which shows a higher amount of kinetic energy in the glottal flow. This enhancement was more pronounced in the rectangular glottis and varied with the subglottal pressure and VeF gap size.
Affiliation(s)
- Hossein Sadeghi
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander University Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany
- Michael Döllinger
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander University Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany
- Manfred Kaltenbacher
- Institute of Mechanics and Mechatronics, Technical University Vienna, Getreidemarkt 9, 1060 Vienna, Austria
- Stefan Kniesburges
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander University Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany
|
29
|
Sielska-Badurek EM, Jędra K, Sobol M, Niemczyk K, Osuch-Wójcikiewicz E. Laryngeal stroboscopy-Normative values for amplitude, open quotient, asymmetry and phase difference in young adults. Clin Otolaryngol 2018; 44:158-165. [PMID: 30353981 DOI: 10.1111/coa.13247] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 02/18/2017] [Revised: 05/10/2018] [Accepted: 10/18/2018] [Indexed: 11/26/2022]
Abstract
OBJECTIVE: To provide the normative values for laryngeal stroboscopy (LS) concerning amplitude, open quotient, asymmetry and phase difference in healthy, young subjects. STUDY DESIGN: Prospective case-control study. SETTING: Patients treated at a single institute. METHODS: A total of 68 healthy subjects were included in the study (35 women, 33 men), aged 18-35 years. After obtaining LS recordings, image processing was performed to attain parameters of vocal fold vibration. RESULTS: In women, the location of the maximum vibration amplitude is approximately in the 1/3 posterior part of the glottis, while in men, the location is moved to the glottis centre. In males, the relative amplitude vibration of the vocal folds in the 1/3 anterior part of the glottis was significantly higher than in females (P = 0.029). Women showed significantly higher open quotients (OQ) at the posterior part of the glottis than the male subjects (P < 0.001) and men presented significantly higher OQ at the anterior part of the glottis than the females (P < 0.001). The average OQ values for both sexes were almost the same. Females showed significantly higher relative glottal gap area (P = 0.044). Women presented a significantly lower amplitude asymmetry than men (P = 0.002). The weighted absolute left-right phase difference reached up to 24° and remained insignificantly higher in the men than the women (P = 0.142). CONCLUSIONS: The study provides normative values for LS in young adults for the measurement of therapy outcomes in patients with voice disorders and realisation of evidence-based medicine. The LS parametrisation is easy to perform in clinical practice.
Affiliation(s)
- Katarzyna Jędra
- Department of Otolaryngology, Medical University of Warsaw, Warsaw, Poland
- Maria Sobol
- Department of Biophysics and Human Physiology, Medical University of Warsaw, Warsaw, Poland
- Kazimierz Niemczyk
- Department of Otolaryngology, Medical University of Warsaw, Warsaw, Poland
|
30
|
Kumar SP, Phadke KV, Vydrová J, Novozámský A, Zita A, Zitová B, Švec JG. Visual and Automatic Evaluation of Vocal Fold Mucosal Waves Through Sharpness of Lateral Peaks in High-Speed Videokymographic Images. J Voice 2018; 34:170-178. [PMID: 30314931 DOI: 10.1016/j.jvoice.2018.08.022] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 04/01/2018] [Revised: 07/12/2018] [Accepted: 08/30/2018] [Indexed: 01/14/2023]
Abstract
INTRODUCTION: The sharpness of lateral peaks is a visually helpful clinical feature in high-speed videokymographic (VKG) images indicating vertical phase differences and mucosal waves on the vibrating vocal folds and giving insights into the health and pliability of vocal fold mucosa. This study aims at investigating parameters that can be helpful in objectively quantifying the lateral peak sharpness from the VKG images. METHOD: Forty-five clinical VKG images with different degrees of sharpness of lateral peaks were independently evaluated visually by three raters. The ratings were compared to parameters obtained by automatic image analysis of the vocal fold contours: Open Time Percentage Quotients (OTQ) and Plateau Quotients (PQ). The OTQ parameters were derived as fractions of the period during which the vocal fold displacement exceeds a predetermined percentage of the vibratory amplitude. The PQ parameters were derived similarly but as a fraction of the open phase instead of a period. RESULTS: The best correspondence between the visual ratings and the automatically derived quotients was found for the OTQ and PQ parameters derived at 95% and 80% of the amplitude, named OTQ95, PQ95, OTQ80 and PQ80. Their Spearman's rank correlation coefficients were in the range of 0.73 to 0.77 (P < 0.001) indicating strong relationships with the visual ratings. The strengths of these correlations were similar to those found from inter-rater comparisons of visual evaluations of peak sharpness. CONCLUSION: The Open Time Percentage and Plateau Quotients at 95% and 80% of the amplitude stood out as the possible candidates for capturing the sharpness of the lateral peaks with their reliability comparable to that of visual ratings.
Affiliation(s)
- S Pravin Kumar
- Voice Research Lab, Department of Biophysics, Faculty of Science, Palacký University, Olomouc, Czech Republic
- Ketaki Vasant Phadke
- Voice Research Lab, Department of Biophysics, Faculty of Science, Palacký University, Olomouc, Czech Republic
- Jitka Vydrová
- Voice and Hearing Centre, Medical Healthcom Ltd., Prague, Czech Republic
- Adam Novozámský
- Department of Image Processing, Institute of Information Theory and Automation of the Czech Academy of Sciences, Prague, Czech Republic
- Aleš Zita
- Department of Image Processing, Institute of Information Theory and Automation of the Czech Academy of Sciences, Prague, Czech Republic
- Barbara Zitová
- Department of Image Processing, Institute of Information Theory and Automation of the Czech Academy of Sciences, Prague, Czech Republic
- Jan G Švec
- Voice Research Lab, Department of Biophysics, Faculty of Science, Palacký University, Olomouc, Czech Republic
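The OTQ and PQ definitions given in the Kumar et al. abstract above can be illustrated with a minimal sketch. It assumes a single vibratory cycle sampled uniformly, displacement measured from the closed (midline) position, and an open phase defined as displacement above a threshold; the function name `otq_and_pq`, the threshold, and the sample values are hypothetical and are not taken from the paper.

```python
def otq_and_pq(displacement, percent, open_threshold=0.0):
    """Return (OTQ, PQ) for one vibratory cycle.

    displacement:   lateral vocal fold displacement samples over one period,
                    with 0 meaning the fold is at the closed position (assumed).
    percent:        amplitude level in percent, e.g. 95 or 80.
    open_threshold: displacement above which the glottis counts as open (assumed).
    """
    if not displacement:
        raise ValueError("displacement must contain at least one sample")
    amplitude = max(displacement)
    level = amplitude * percent / 100.0
    # Samples where the displacement exceeds the chosen amplitude level.
    above = sum(1 for d in displacement if d > level)
    # Samples that belong to the open phase.
    open_samples = sum(1 for d in displacement if d > open_threshold)
    if open_samples == 0:
        raise ValueError("no open phase found in this cycle")
    otq = above / len(displacement)  # fraction of the whole period
    pq = above / open_samples        # fraction of the open phase only
    return otq, pq


# Hypothetical cycle in arbitrary units (illustrative only, not study data).
cycle = [0, 0, 2, 6, 9, 10, 9, 6, 2, 0]
otq80, pq80 = otq_and_pq(cycle, 80)
otq95, pq95 = otq_and_pq(cycle, 95)
```

For this toy cycle, three of ten samples exceed 80% of the amplitude and seven samples are open, so OTQ80 is 0.3 and PQ80 is 3/7.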
|
31
|
Pathological Voice Source Analysis System Using a Flow Waveform-Matched Biomechanical Model. Appl Bionics Biomech 2018; 2018:3158439. [PMID: 30057647 PMCID: PMC6051280 DOI: 10.1155/2018/3158439] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 03/30/2018] [Accepted: 05/24/2018] [Indexed: 11/24/2022] Open
Abstract
Voice production occurs through vocal cord vibration coupled to glottal airflow. Vocal cord lesions affect the vocal system and lead to voice disorders. In this paper, a pathological voice source analysis system is designed. This study integrates nonlinear dynamics with an optimized asymmetric two-mass model to explore nonlinear characteristics of vocal cord vibration, and changes in acoustic parameters, such as fundamental frequency, caused by distinct subglottal pressure and varying degrees of vocal cord paralysis are analyzed. Various samples of sustained vowel /a/ of normal and pathological voices were extracted from the MEEI (Massachusetts Eye and Ear Infirmary) database. A fitting procedure combining genetic particle swarm optimization and a quasi-Newton method was developed to optimize the biomechanical model parameters and match the targeted voice source. Experimental results validate the applicability of the proposed model to reproduce vocal cord vibration with high accuracy, and show that a paralyzed vocal cord increases the model coupling stiffness.
|
32
|
Semmler M, Döllinger M, Patel RR, Ziethe A, Schützenberger A. Clinical relevance of endoscopic three-dimensional imaging for quantitative assessment of phonation. Laryngoscope 2018. [DOI: 10.1002/lary.27165] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 11/09/2022]
Affiliation(s)
- Marion Semmler
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head and Neck Surgery; University Hospital Erlangen Medical School; Erlangen Germany
- Michael Döllinger
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head and Neck Surgery; University Hospital Erlangen Medical School; Erlangen Germany
- Rita R. Patel
- Department of Speech and Hearing Sciences; Indiana University; Bloomington Indiana U.S.A
- Anke Ziethe
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head and Neck Surgery; University Hospital Erlangen Medical School; Erlangen Germany
- Anne Schützenberger
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head and Neck Surgery; University Hospital Erlangen Medical School; Erlangen Germany
|
33
|
Krasnodębska P, Szkiełkowska A, Miaśkiewicz B, Włodarczyk E, Domeracka-Kołodziej A, Skarżyński H. Objective measurement of mucosal wave parameters in diagnosing benign lesions of the vocal folds. LOGOP PHONIATR VOCO 2018; 44:73-78. [PMID: 29318925 DOI: 10.1080/14015439.2017.1402950] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/18/2022]
Abstract
INTRODUCTION: The diagnostic procedure of phonation is dominated by subjective assessment tools. It seems reasonable to seek methods of quantitative glottal cycle assessment. OBJECTIVE: The aim of our study was the analysis of open quotients (OQ) of the glottis. METHODS: One hundred and twenty-four people were included in the study. Methodology was based on tools available in everyday phoniatrics practice: laryngovideostroboscopy (LVS) and electroglottography (EGG). There were statistically significant differences between the control and study groups. Vocal fold polyps, nodules and edema influence glottal function in different ways, which can be illustrated by objective glottal function parameters. Establishing Videostroboscopic Open Quotient values from three parts of the glottis and the Electroglottographic Quasi Open Quotient (QOQ) value can help in dividing patients with benign lesions of the vocal folds according to the type of disease. RESULTS AND CONCLUSIONS: Measurement of the open quotient from three parts of the glottis helps to differentially diagnose and localize glottal vocal fold lesions. Videostroboscopic Open Quotient and Electroglottographic QOQ values can be used to quantify the glottal cycle. Videostroboscopic Open Quotient, Electroglottographic QOQ and their ratio vary depending on the type of organic dysphonia.
Affiliation(s)
- Paulina Krasnodębska
- Audiology and Phoniatrics Clinic, Institute of Physiology and Pathology of Hearing, Warsaw, Poland
- Agata Szkiełkowska
- Audiology and Phoniatrics Clinic, Institute of Physiology and Pathology of Hearing, Warsaw, Poland; Audiology and Phoniatrics Faculty, Fryderyk Chopin University of Music, Warsaw, Poland
- Beata Miaśkiewicz
- Audiology and Phoniatrics Clinic, Institute of Physiology and Pathology of Hearing, Warsaw, Poland
- Elżbieta Włodarczyk
- Audiology and Phoniatrics Clinic, Institute of Physiology and Pathology of Hearing, Warsaw, Poland
- Anna Domeracka-Kołodziej
- Audiology and Phoniatrics Clinic, Institute of Physiology and Pathology of Hearing, Warsaw, Poland
- Henryk Skarżyński
- Audiology and Phoniatrics Clinic, Institute of Physiology and Pathology of Hearing, Warsaw, Poland; Audiology and Phoniatrics Faculty, Fryderyk Chopin University of Music, Warsaw, Poland
|
34
|
Döllinger M, Gómez P, Patel RR, Alexiou C, Bohr C, Schützenberger A. Biomechanical simulation of vocal fold dynamics in adults based on laryngeal high-speed videoendoscopy. PLoS One 2017; 12:e0187486. [PMID: 29121085 PMCID: PMC5679561 DOI: 10.1371/journal.pone.0187486] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 08/15/2016] [Accepted: 10/18/2017] [Indexed: 12/18/2022] Open
Abstract
MOTIVATION: Human voice is generated in the larynx by the two oscillating vocal folds. Owing to the limited space and accessibility of the larynx, endoscopic investigation of the actual phonatory process in detail is challenging. Hence the biomechanics of the human phonatory process are not yet fully understood. Therefore, we adapt a mathematical model of the vocal folds towards vocal fold oscillations to quantify gender and age related differences expressed by computed biomechanical model parameters. METHODS: The vocal fold dynamics are visualized by laryngeal high-speed videoendoscopy (4000 fps). A total of 33 healthy young subjects (16 females, 17 males) and 11 elderly subjects (5 females, 6 males) were recorded. A numerical two-mass model is adapted to the recorded vocal fold oscillations by varying model masses, stiffness and subglottal pressure. For adapting the model towards the recorded vocal fold dynamics, three different optimization algorithms (Nelder-Mead, Particle Swarm Optimization and Simulated Bee Colony) in combination with three cost functions were considered for applicability. Gender differences and age-related kinematic differences reflected by the model parameters were analyzed. RESULTS AND CONCLUSION: The biomechanical model in combination with numerical optimization techniques allowed phonatory behavior to be simulated and laryngeal parameters involved to be quantified. All three optimization algorithms showed promising results. However, only one cost function seems to be suitable for this optimization task. The gained model parameters reflect the phonatory biomechanics for men and women well and show quantitative age- and gender-specific differences. The model parameters for younger females and males showed lower subglottal pressures, lower stiffness and higher masses than the corresponding elderly groups. Females exhibited higher subglottal pressures, smaller oscillation masses and larger stiffness than the corresponding similarly aged male groups. Optimizing numerical models towards vocal fold oscillations is useful to identify underlying laryngeal components controlling the phonatory process.
Affiliation(s)
- Michael Döllinger
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, Medical School, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Pablo Gómez
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, Medical School, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Rita R. Patel
- Department of Speech and Hearing Sciences, Indiana University, Bloomington, Indiana, United States of America
- Christoph Alexiou
- Section of Experimental Oncology and Nanomedicine (SEON), Department of Otorhinolaryngology, Head and Neck Surgery, Medical School, Else Kröner-Fresenius-Stiftung-Professorship, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Christopher Bohr
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, Medical School, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Anne Schützenberger
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, Medical School, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
|
35
|
Evaluation of clinical value of videokymography for diagnosis and treatment of voice disorders. Eur Arch Otorhinolaryngol 2017; 274:3941-3949. [PMID: 28856469 DOI: 10.1007/s00405-017-4726-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 04/13/2017] [Accepted: 08/21/2017] [Indexed: 10/19/2022]
Abstract
This study aimed at determining the clinical value of videokymography (VKG) as an additional tool for the assessment of voice disorders. 105 subjects with voice disorders were examined by an experienced laryngologist. A questionnaire was used to specify diagnosis, diagnostic confidence, and treatment recommendations before and after VKG. The first part of the questionnaire was completed by the laryngologist for each patient after routine ear-nose-throat evaluation, including stroboscopy, and the second part after the subsequent VKG examination. In 31% of subjects VKG confirmed the stroboscopic diagnosis, in 44% it made the diagnosis more accurate, in 20% there was adjustment of the treatment, and in 5% it was not found diagnostically useful. After VKG the diagnostic confidence increased in 68% of the subjects. VKG may help clinicians to take some important treatment decisions and may be recommended for patients in whom clinicians are uncertain about diagnosis and treatment.
|
36
|
Herbst CT, Schutte HK, Bowling DL, Svec JG. Comparing Chalk With Cheese—The EGG Contact Quotient Is Only a Limited Surrogate of the Closed Quotient. J Voice 2017; 31:401-409. [DOI: 10.1016/j.jvoice.2016.11.007] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 07/14/2016] [Revised: 11/06/2016] [Accepted: 11/08/2016] [Indexed: 10/20/2022]
|
37
|
High-speed Videolaryngoscopy: Quantitative Parameters of Glottal Area Waveforms and High-speed Kymography in Healthy Individuals. J Voice 2017; 31:282-290. [DOI: 10.1016/j.jvoice.2016.09.026] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 04/13/2016] [Revised: 09/22/2016] [Accepted: 09/23/2016] [Indexed: 11/21/2022]
|
38
|
Andrade-Miranda G, Henrich Bernardoni N, Godino-Llorente JI. Synthesizing the motion of the vocal folds using optical flow based techniques. Biomed Signal Process Control 2017. [DOI: 10.1016/j.bspc.2017.01.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 10/20/2022]
|
39
|
Volgger V, Felicio A, Lohscheller J, Englhard AS, Al-Muzaini H, Betz CS, Schuster ME. Evaluation of the combined use of narrow band imaging and high-speed imaging to discriminate laryngeal lesions. Lasers Surg Med 2017; 49:609-618. [PMID: 28231400 DOI: 10.1002/lsm.22652] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Accepted: 02/04/2017] [Indexed: 02/05/2023]
Abstract
BACKGROUND AND OBJECTIVE: Laryngeal lesions are usually investigated by microlaryngoscopy, biopsy, and histopathology. This study aimed to evaluate the combined use of Narrow Band Imaging (NBI) and High-Speed Imaging (HSI) in the differentiation of glottic lesions in awake patients. STUDY DESIGN: Prospective diagnostic study. MATERIALS AND METHODS: Thirty-six awake patients with 41 glottic lesions were investigated with both NBI and HSI, and the suspected diagnoses were compared to the histopathological results of tissue biopsies taken during subsequent microlaryngoscopies. Of the 41 lesions, 28 were primary lesions and 13 recurrent lesions after previous laryngeal pathologies. RESULTS: Sensitivity, specificity, positive predictive value, and negative predictive value in the differentiation between benign/premalignant and malignant lesions with both NBI and HSI amounted to 100.0%, 79.4%, 50.0%, and 100.0%. Sensitivities and specificities were 100.0% and 85.7% for HSI alone, and 100.0% and 79.4% for NBI alone. Regarding only primary lesions the results were generally better with sensitivities and specificities of 100% and 81% for NBI, 100% and 84.2% for HSI and 100% and 85.7% for the combination of both methods, respectively. CONCLUSION: NBI and HSI both seem to be promising adjunct tools in the differentiation of various laryngeal lesions in awake patients with high sensitivities. Specificities, however, were moderate but could be increased when using NBI and HSI in combination in a subgroup of patients with only primary lesions. Although both methods still have limitations they might ameliorate the evaluation of suspicious laryngeal lesions in the future and could possibly spare patients from repeated invasive tissue biopsies.
Affiliation(s)
- Veronika Volgger
- Department of Otorhinolaryngology, Head and Neck Surgery, Klinikum der Universität München, 81377, Munich, Germany
- Axelle Felicio
- Department of Otorhinolaryngology, Head and Neck Surgery, Klinikum der Universität München, 81377, Munich, Germany
- Jörg Lohscheller
- Department of Informatics, Trier University of Applied Sciences, Schneidershof, 54208, Trier, Germany
- Anna S Englhard
- Department of Otorhinolaryngology, Head and Neck Surgery, Klinikum der Universität München, 81377, Munich, Germany
- Hanan Al-Muzaini
- Department of Otorhinolaryngology, Head and Neck Surgery, Klinikum der Universität München, 81377, Munich, Germany
- Christian S Betz
- Department of Otorhinolaryngology, Head and Neck Surgery, Klinikum der Universität München, 81377, Munich, Germany
- Maria E Schuster
- Department of Otorhinolaryngology, Head and Neck Surgery, Klinikum der Universität München, 81377, Munich, Germany
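The four diagnostic measures reported in the Volgger et al. abstract above follow from a 2x2 table of test results against histopathology. The sketch below shows the standard formulas. The counts passed in are hypothetical: they were chosen only because they add up to 41 lesions and reproduce the reported percentages for the combined method, and the paper's actual table is not part of this listing.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard 2x2 diagnostic accuracy measures.

    tp/fn: malignant lesions rated malignant / not malignant by the test.
    tn/fp: non-malignant lesions rated not malignant / malignant by the test.
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of malignant lesions detected
        "specificity": tn / (tn + fp),  # share of non-malignant lesions cleared
        "ppv": tp / (tp + fp),          # probability a positive rating is correct
        "npv": tn / (tn + fn),          # probability a negative rating is correct
    }


# Hypothetical counts (assumption): 7 + 7 + 27 + 0 = 41 lesions.
metrics = diagnostic_metrics(tp=7, fp=7, tn=27, fn=0)
percentages = {name: round(100 * value, 1) for name, value in metrics.items()}
```

With these illustrative counts the rounded percentages are 100.0, 79.4, 50.0 and 100.0, which shows how a test can combine perfect sensitivity with a modest positive predictive value when false positives are as frequent as true positives.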
|
40
|
Granados A, Misztal MK, Brunskog J, Visseq V, Erleben K. A numerical strategy for finite element modeling of frictionless asymmetric vocal fold collision. INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN BIOMEDICAL ENGINEERING 2017; 33. [PMID: 27058999 DOI: 10.1002/cnm.2793] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/09/2015] [Revised: 02/23/2016] [Accepted: 03/28/2016] [Indexed: 05/08/2023]
Abstract
Analysis of voice pathologies may require vocal fold models that include relevant features such as vocal fold asymmetric collision. The present study numerically addresses the problem of frictionless asymmetric collision in a self-sustained three-dimensional continuum model of the vocal folds. Theoretical background and numerical analysis of the finite-element position-based contact model are presented, along with validation. A novel contact detection mechanism capable of detecting collision in asymmetric oscillations is developed. The effect of inexact contact constraint enforcement on vocal fold dynamics is examined by different variational methods for inequality constrained minimization problems, namely, the Lagrange multiplier method and the penalty method. In contrast to the penalty solution, which is related to classical spring-like contact forces, numerical examples show that the parameter-independent Lagrange multiplier solution is more robust and accurate in the estimation of dynamical and mechanical features at vocal fold contact. Furthermore, special attention is paid to the temporal integration schemes in relation to the contact problem, the results suggesting an advantage of highly diffusive schemes. Finally, vocal fold contact enforcement is shown to affect asymmetric oscillations. The present model may be adapted to existing vocal fold models, which may contribute to a better understanding of the effect of the nonlinear contact phenomenon on phonation.
Affiliation(s)
- Alba Granados
- Department of Electrical Engineering, Technical University of Denmark, Kgs. Lyngby, DK-2800, Denmark
- Jonas Brunskog
- Department of Electrical Engineering, Technical University of Denmark, Kgs. Lyngby, DK-2800, Denmark
- Vincent Visseq
- Institut Supérieur de Mécanique de Paris, Saint-Ouen, F-93400, France
- Kenny Erleben
- Department of Computer Science, University of Copenhagen, Copenhagen, DK-2100, Denmark
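The contrast the Granados et al. abstract above draws between penalty and Lagrange multiplier contact enforcement can be seen in a one-dimensional toy problem. This is only a sketch under stated assumptions, not the paper's finite element model: a single linear spring of stiffness `k` with rest position `x0` is blocked by a rigid wall at `gap`, and the static equilibrium is solved in closed form. All names and values are hypothetical.

```python
def lagrange_contact(k, x0, gap):
    """Exact constraint x <= gap; the multiplier is the contact force."""
    if x0 <= gap:
        return x0, 0.0            # no contact, constraint inactive
    return gap, k * (x0 - gap)    # position held at the wall, force balances spring


def penalty_contact(k, x0, gap, k_pen):
    """Penalty method: the wall acts as a stiff spring, so some penetration remains."""
    if x0 <= gap:
        return x0, 0.0
    # Minimise 0.5*k*(x - x0)**2 + 0.5*k_pen*(x - gap)**2 for x > gap.
    x = (k * x0 + k_pen * gap) / (k + k_pen)
    return x, k_pen * (x - gap)   # spring-like contact force


k, x0, gap = 100.0, 1.2, 1.0      # hypothetical values, arbitrary units
x_lag, f_lag = lagrange_contact(k, x0, gap)
x_soft, f_soft = penalty_contact(k, x0, gap, k_pen=1e3)
x_stiff, f_stiff = penalty_contact(k, x0, gap, k_pen=1e6)
```

In this toy case the penalty solution penetrates the wall and underestimates the contact force by an amount that depends on the chosen `k_pen`, while the Lagrange solution has no such parameter. That is consistent with the abstract's description of the penalty method as spring-like and of the Lagrange multiplier solution as parameter-independent.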
|
41
|
Yamauchi A, Yokonishi H, Imagawa H, Sakakibara KI, Nito T, Tayama N, Yamasoba T. Characterization of Vocal Fold Vibration in Sulcus Vocalis Using High-Speed Digital Imaging. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2017; 60:24-37. [PMID: 28114611 DOI: 10.1044/2016_jslhr-s-14-0285] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 10/10/2014] [Accepted: 07/07/2016] [Indexed: 06/06/2023]
Abstract
PURPOSE: The aim of the present study was to qualitatively and quantitatively characterize vocal fold vibrations in sulcus vocalis by high-speed digital imaging (HSDI) and to clarify the correlations between HSDI-derived parameters and traditional vocal parameters. METHOD: HSDI was performed in 20 vocally healthy subjects (8 men and 12 women) and 41 patients with sulcus vocalis (33 men and 8 women). Then HSDI data were evaluated by assessing the visual-perceptual rating, digital kymography, and glottal area waveform. RESULTS: Patients with sulcus vocalis frequently had spindle-shaped glottal gaps and a decreased mucosal wave. Compared with the control group, the sulcus vocalis group showed higher open quotient as well as a shorter duration of the visible mucosal wave, a smaller speed index, and a smaller glottal area difference index ([maximal glottal area - minimal glottal area]/maximal glottal area). These parameters deteriorated in order of the control group and Type I, II, and III sulcus vocalis. There were no gender-related differences. Strong correlations were noted between the open quotient and the type of sulcus vocalis. CONCLUSIONS: HSDI was an effective method for documenting the characteristics of vocal fold vibrations in patients with sulcus vocalis and estimating the severity of dysphonia.
Affiliation(s)
- Akihito Yamauchi
- Department of Otolaryngology, The University of Tokyo Hospital, Japan
- Hiroshi Imagawa
- Department of Otolaryngology, The University of Tokyo Hospital, Japan
- Ken-Ichi Sakakibara
- Department of Communication Disorders, The Health Sciences University of Hokkaido, Japan
- Takaharu Nito
- Department of Otolaryngology, The University of Tokyo Hospital, Japan
- Niro Tayama
- Department of Otolaryngology and Tracheo-esophagology, The National Center for Global Health and Medicine, Tokyo, Japan
- Tatsuya Yamasoba
- Department of Otolaryngology, The University of Tokyo Hospital, Japan
|
42
|
Ikuma T, Kunduk M, Fink D, McWhorter AJ. Synthetic multi-line kymographic analysis: A spatiotemporal data reduction technique for high-speed videoendoscopy. J Acoust Soc Am 2016; 140:2703. [PMID: 27794340 DOI: 10.1121/1.4964400] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]
Abstract
High-speed videoendoscopy (HSV) enables observation of the true vibratory behavior of the vocal folds. To quantify the vocal fold vibration captured by the HSV, lateral movement features (e.g., glottal width and vocal fold edge displacements) have been extracted as functions of time. The most common analysis method is to extract the features on a lateral strip used to form digital kymogram. The weakness of this method is that it can only capture the vibrational behavior local to the strip location. While the multi-line kymographic approach has been utilized to capture the spatial diversity, the observation points are either fixed or manually positioned. Behaviors of pathological vocal folds, especially those with lesions, are expected to be spatially diverse and also diverse among speakers, making fixed observation points ineffective. This paper proposes a technique to synthesize kymographic waveforms from full spatiotemporal HSV feature data to extract distinctive behaviors automatically. Each synthesized waveform represents a non-overlapping section of the glottis, where vocal folds are locally behaving homogeneously. The efficacy of the algorithm is demonstrated with four HSV recordings (three pathological) and discussed, including mitigation of the known drawbacks.
Affiliation(s)
- Takeshi Ikuma
- Department of Otolaryngology-Head and Neck Surgery, Louisiana State University Health Sciences Center, New Orleans, Louisiana 70112, USA
- Melda Kunduk
- Department of Communication Disorders, Louisiana State University, Baton Rouge, Louisiana 70803, USA
- Daniel Fink
- Department of Otolaryngology-Head and Neck Surgery, Louisiana State University Health Sciences Center, New Orleans, Louisiana 70112, USA
- Andrew J McWhorter
- Department of Otolaryngology-Head and Neck Surgery, Louisiana State University Health Sciences Center, New Orleans, Louisiana 70112, USA
|
43
|
Semmler M, Kniesburges S, Birk V, Ziethe A, Patel R, Döllinger M. 3D Reconstruction of Human Laryngeal Dynamics Based on Endoscopic High-Speed Recordings. IEEE Trans Med Imaging 2016; 35:1615-1624. [PMID: 26829782 DOI: 10.1109/tmi.2016.2521419] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Indexed: 06/05/2023]
Abstract
Standard laryngoscopic imaging techniques provide only limited two-dimensional insights into the vocal fold vibrations not taking the vertical component into account. However, previous experiments have shown a significant vertical component in the vibration of the vocal folds. We present a 3D reconstruction of the entire superior vocal fold surface from 2D high-speed videoendoscopy via stereo triangulation. In a typical camera-laser set-up the structured laser light pattern is projected on the vocal folds and captured at 4000 fps. The measuring device is suitable for in vivo application since the external dimensions of the miniaturized set-up barely exceed the size of a standard rigid laryngoscope. We provide a conservative estimate on the resulting resolution based on the hardware components and point out the possibilities and limitations of the miniaturized camera-laser set-up. In addition to the 3D vocal fold surface, we extended previous approaches with a G2-continuous model of the vocal fold edge. The clinical applicability was successfully established by the reconstruction of visual data acquired from 2D in vivo high-speed recordings of a female and a male subject. We present extracted dynamic parameters like maximum amplitude and velocity in the vertical direction. The additional vertical component reveals deeper insights into the vibratory dynamics of the vocal folds by means of a non-invasive method. The successful miniaturization allows for in vivo application giving access to the most realistic model available and hence enables a comprehensive understanding of the human phonation process.
|
44
|
Niebudek-Bogusz E, Kopczynski B, Strumillo P, Morawska J, Wiktorowicz J, Sliwinska-Kowalska M. Quantitative assessment of videolaryngostroboscopic images in patients with glottic pathologies. Logoped Phoniatr Vocol 2016; 42:73-83. [DOI: 10.3109/14015439.2016.1174293] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 11/13/2022]
|
45
|
Döllinger M, Berry DA, Kniesburges S. Dynamic vocal fold parameters with changing adduction in ex-vivo hemilarynx experiments. J Acoust Soc Am 2016; 139:2372. [PMID: 27250133 PMCID: PMC4859834 DOI: 10.1121/1.4947044] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 09/18/2014] [Revised: 03/22/2016] [Accepted: 04/05/2016] [Indexed: 05/25/2023]
Abstract
Ex-vivo hemilarynx experiments allow the visualization and quantification of three-dimensional dynamics of the medial vocal fold surface. For three excised human male larynges, the vibrational output, the glottal flow resistance, and the sound pressure during sustained phonation were analyzed as a function of vocal fold adduction for varying subglottal pressure. Empirical eigenfunctions, displacements, and velocities were investigated along the vocal fold surface. For two larynges, an increase of adduction level resulted in an increase of the glottal flow resistance at equal subglottal pressures. This caused an increase of lateral and vertical oscillation amplitudes and velocity indicating an improved energy transfer from the airflow to the vocal folds. In contrast, the third larynx exhibited an amplitude decrease for rising adduction accompanying reduction of the flow resistance. By evaluating the empirical eigenfunctions, this reduced flow resistance was assigned to an unbalanced oscillation pattern with predominantly lateral amplitudes. The results suggest that adduction facilitates the phonatory process by increasing the glottal flow resistance and enhancing the vibrational amplitudes. However, this interrelation only holds for a maintained balanced ratio between vertical and lateral displacements. Indeed, a balanced vertical-lateral oscillation pattern may be more beneficial to phonation than strong periodicity with predominantly lateral vibrations.
Affiliation(s)
- Michael Döllinger
- Division of Phoniatrics and Pediatric Audiology-Computational Medicine, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander-Universität Erlangen-Nürnberg, Bohlenplatz 21, 91054 Erlangen, Germany
- David A Berry
- The Laryngeal Dynamics Laboratory, Division of Head and Neck Surgery, David Geffen School of Medicine at UCLA, 31-24 Rehab Center, 1000 Veteran Avenue, Los Angeles, California 90095-1794, USA
- Stefan Kniesburges
- Division of Phoniatrics and Pediatric Audiology-Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Medical School at Friedrich-Alexander-Universität Erlangen-Nürnberg, Bohlenplatz 21, 91054 Erlangen, Germany
|
46
|
Patel RR, Unnikrishnan H, Donohue KD. Effects of Vocal Fold Nodules on Glottal Cycle Measurements Derived from High-Speed Videoendoscopy in Children. PLoS One 2016; 11:e0154586. [PMID: 27124157 PMCID: PMC4849744 DOI: 10.1371/journal.pone.0154586] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 09/02/2015] [Accepted: 04/17/2016] [Indexed: 11/18/2022] Open
Abstract
The goal of this study is to quantify the effects of vocal fold nodules on vibratory motion in children using high-speed videoendoscopy. Differences in vibratory motion were evaluated in 20 children with vocal fold nodules (5–11 years) and 20 age and gender matched typically developing children (5–11 years) during sustained phonation at typical pitch and loudness. Normalized kinematic features of vocal fold displacements from the mid-membranous vocal fold point were extracted from the steady-state high-speed video. A total of 12 kinematic features representing spatial and temporal characteristics of vibratory motion were calculated. Average values and standard deviations (cycle-to-cycle variability) of the following kinematic features were computed: normalized peak displacement, normalized average opening velocity, normalized average closing velocity, normalized peak closing velocity, speed quotient, and open quotient. Group differences between children with and without vocal fold nodules were statistically investigated. While a moderate effect size was observed for the spatial feature of speed quotient, and the temporal feature of normalized average closing velocity in children with nodules compared to vocally normal children, none of the features were statistically significant between the groups after Bonferroni correction. The kinematic analysis of the mid-membranous vocal fold displacement revealed that children with nodules primarily differ from typically developing children in closing phase kinematics of the glottal cycle, whereas the opening phase kinematics are similar. Higher speed quotients and similar opening phase velocities suggest greater relative forces are acting on vocal fold in the closing phase. These findings suggest that future large-scale studies should focus on spatial and temporal features related to the closing phase of the glottal cycle for differentiating the kinematics of children with and without vocal fold nodules.
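As an illustration of how a per-cycle feature and its cycle-to-cycle variability might be computed, the sketch below uses the commonly cited definition of speed quotient (opening-phase duration divided by closing-phase duration), which the abstract itself does not spell out. The displacement samples, the closure-to-closure segmentation, and the function name are hypothetical.

```python
from statistics import mean, stdev


def speed_quotient(cycle):
    """Opening-phase duration divided by closing-phase duration, in frames.

    `cycle` holds displacement samples for one glottal cycle and is assumed
    to be segmented from closure to closure (zero means closed).
    """
    open_frames = [i for i, d in enumerate(cycle) if d > 0]
    if not open_frames:
        raise ValueError("cycle contains no open phase")
    # First frame at which the displacement reaches its maximum.
    peak = max(open_frames, key=lambda i: cycle[i])
    opening = peak - (open_frames[0] - 1)   # last closed frame to peak
    closing = (open_frames[-1] + 1) - peak  # peak to first closed frame
    return opening / closing


# Hypothetical normalized displacements for two consecutive cycles.
cycles = [
    [0, 0, 1, 3, 5, 3, 0, 0],
    [0, 0, 1, 2, 4, 5, 2, 0],
]

quotients = [speed_quotient(c) for c in cycles]
average = mean(quotients)       # average value across cycles
variability = stdev(quotients)  # cycle-to-cycle variability
print(quotients, average, variability)
```

A speed quotient above 1 means the folds take longer to open than to close, which is how a faster closing phase would show up in this measure.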
Affiliation(s)
- Rita R. Patel
- Department of Speech & Hearing Sciences, Indiana University, Bloomington, Indiana, United States of America
- Harikrishnan Unnikrishnan
- Department of Electrical and Computer Engineering, University of Kentucky, Lexington, Kentucky, United States of America
- Kevin D. Donohue
- Department of Electrical and Computer Engineering, University of Kentucky, Lexington, Kentucky, United States of America
|
47
|
Relationship of Various Open Quotients With Acoustic Property, Phonation Types, Fundamental Frequency, and Intensity. J Voice 2016; 30:145-57. [DOI: 10.1016/j.jvoice.2015.01.009] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 10/17/2014] [Accepted: 01/30/2015] [Indexed: 10/23/2022]
|
48
|
Unger J, Schuster M, Hecker DJ, Schick B, Lohscheller J. A generalized procedure for analyzing sustained and dynamic vocal fold vibrations from laryngeal high-speed videos using phonovibrograms. Artif Intell Med 2015; 66:15-28. [PMID: 26597002 DOI: 10.1016/j.artmed.2015.10.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 01/16/2015] [Revised: 09/28/2015] [Accepted: 10/20/2015] [Indexed: 12/01/2022]
Abstract
OBJECTIVE This work presents a computer-based approach to analyze the two-dimensional vocal fold dynamics of endoscopic high-speed videos, and constitutes an extension and generalization of a previously proposed wavelet-based procedure. While most approaches aim for analyzing sustained phonation conditions, the proposed method allows for a clinically adequate analysis of both dynamic as well as sustained phonation paradigms. MATERIALS AND METHODS The analysis procedure is based on a spatio-temporal visualization technique, the phonovibrogram, that facilitates the documentation of the visible laryngeal dynamics. From the phonovibrogram, a low-dimensional set of features is computed using a principal component analysis strategy that quantifies the type of vibration patterns, irregularity, lateral symmetry and synchronicity, as a function of time. Two different test bench data sets are used to validate the approach: (I) 150 healthy and pathologic subjects examined during sustained phonation. (II) 20 healthy and pathologic subjects that were examined twice: during sustained phonation and a glissando from a low to a higher fundamental frequency. In order to assess the discriminative power of the extracted features, a Support Vector Machine is trained to distinguish between physiologic and pathologic vibrations. The results for sustained phonation sequences are compared to the previous approach. Finally, the classification performance of the stationary analyzing procedure is compared to the transient analysis of the glissando maneuver. RESULTS For the first test bench the proposed procedure outperformed the previous approach (proposed feature set: accuracy: 91.3%, sensitivity: 80%, specificity: 97%, previous approach: accuracy: 89.3%, sensitivity: 76%, specificity: 96%). Comparing the classification performance of the second test bench further corroborates that analyzing transient paradigms provides clear additional diagnostic value (glissando maneuver: accuracy: 90%, sensitivity: 100%, specificity: 80%, sustained phonation: accuracy: 75%, sensitivity: 80%, specificity: 70%). CONCLUSIONS The incorporation of parameters describing the temporal evolution of vocal fold vibration clearly improves the automatic identification of pathologic vibration patterns. Furthermore, incorporating a dynamic phonation paradigm provides additional valuable information about the underlying laryngeal dynamics that cannot be derived from sustained conditions. The proposed generalized approach provides a better overall classification performance than the previous approach, and hence constitutes a new advantageous tool for an improved clinical diagnosis of voice disorders.
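The accuracy, sensitivity, and specificity figures reported for the second test bench follow from confusion-matrix counts, as the minimal Python sketch below shows. The counts are hypothetical: they assume an even split of 10 pathologic and 10 healthy subjects, which the abstract does not state, and were chosen only because they reproduce the reported percentages.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: pathologic subjects classified correctly/incorrectly.
    tn/fp: healthy subjects classified correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    if total == 0 or tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must contain at least one subject")
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts, assuming 10 pathologic and 10 healthy subjects.
glissando = classification_metrics(tp=10, fn=0, tn=8, fp=2)
sustained = classification_metrics(tp=8, fn=2, tn=7, fp=3)

print(glissando)
print(sustained)
```

Under that assumed split, the glissando counts give 90% accuracy, 100% sensitivity, and 80% specificity, and the sustained-phonation counts give 75%, 80%, and 70%.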
Affiliation(s)
- Jakob Unger
- Department of Computer Science, Trier University of Applied Sciences, Schneidershof, 54293 Trier, Germany.
- Maria Schuster
- Department of Otorhinolaryngology and Head and Neck Surgery, University of Munich, Campus Grosshadern, Marchioninistr. 13, 81366 München, Germany
- Dietmar J Hecker
- Department of Otorhinolaryngology, Saarland University Hospital, Kirrbergerstr., 66424 Homburg/Saar, Germany
- Bernhard Schick
- Department of Otorhinolaryngology, Saarland University Hospital, Kirrbergerstr., 66424 Homburg/Saar, Germany
- Jörg Lohscheller
- Department of Computer Science, Trier University of Applied Sciences, Schneidershof, 54293 Trier, Germany
|
49
|
Andrade-Miranda G, Godino-Llorente JI, Moro-Velázquez L, Gómez-García JA. An automatic method to detect and track the glottal gap from high speed videoendoscopic images. Biomed Eng Online 2015; 14:100. [PMID: 26510707 PMCID: PMC4625946 DOI: 10.1186/s12938-015-0096-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 06/16/2015] [Accepted: 10/20/2015] [Indexed: 11/17/2022] Open
Abstract
BACKGROUND The image-based analysis of vocal fold vibration plays an important role in the diagnosis of voice disorders. The analysis is based not only on direct observation of the video sequences, but also on an objective characterization of the phonation process by means of features extracted from the recorded images. However, such analysis depends on a prior accurate identification of the glottal gap, which is the most challenging step for further automatic assessment of vocal fold vibration. METHODS In this work, a complete framework to automatically segment and track the glottal area (or glottal gap) is proposed. The algorithm identifies a region of interest (ROI) that is adapted over time and combines active contours and the watershed transform for the final delineation of the glottis. An automatic procedure for synthesizing different videokymograms is also proposed. RESULTS Thanks to the ROI implementation, the technique is robust to camera shifting. The objective test also demonstrated the effectiveness and performance of the approach in the most challenging scenarios, namely those with inadequate closure of the vocal folds. CONCLUSIONS The novelty of the proposed algorithm lies in the use of temporal information to identify an adaptive ROI and in the use of watershed merging combined with active contours for glottis delimitation. Additionally, an automatic procedure for synthesizing multiline videokymograms (VKG) by identifying the glottal main axis is developed.
Affiliation(s)
- Gustavo Andrade-Miranda
- Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Campus de Montegancedo, Crta. M40 km, 38, Madrid, Spain.
- Juan I Godino-Llorente
- Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Campus de Montegancedo, Crta. M40 km, 38, Madrid, Spain.
- Laureano Moro-Velázquez
- Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Campus de Montegancedo, Crta. M40 km, 38, Madrid, Spain.
- Jorge Andrés Gómez-García
- Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Campus de Montegancedo, Crta. M40 km, 38, Madrid, Spain.
|
50
|
A Preliminary Quantitative Comparison of Vibratory Amplitude Using Rigid and Flexible Stroboscopic Assessment. J Voice 2015; 30:485-92. [PMID: 26149662 DOI: 10.1016/j.jvoice.2015.05.018] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 03/18/2015] [Accepted: 05/29/2015] [Indexed: 11/22/2022]
Abstract
STUDY OBJECTIVE The purpose of this study was to establish preliminary, quantitative data on amplitude of vibration during stroboscopic assessment in healthy speakers with normal voice characteristics. Amplitude of vocal fold vibration is a core physiological parameter used in diagnosing voice disorders, yet quantitative data are lacking to guide the determination of what constitutes normal vibratory amplitude. METHODS/STUDY DESIGN Eleven participants were assessed during sustained vowel production using rigid and flexible endoscopy with stroboscopy. Still images were extracted from digital recordings of a sustained /i/ produced at a comfortable pitch and loudness, with F0 controlled so that levels were within ±15% of each participant's comfortable mean level as determined from connected speech. Glottal width (GW), true vocal fold (TVF) length, and TVF width were measured from still frames representing the maximum open phase of the vibratory cycle. To control for anatomic and magnification differences across participants, GW was normalized to TVF length. GW as a ratio of TVF width was also computed for comparison with prior studies. RESULTS Mean values and standard deviations were computed for the normalized measures. Paired t tests showed no significant differences between rigid and flexible endoscopy methods. Interrater and intrarater reliability values for raw measurements were found to be high (0.89-0.99). CONCLUSIONS These preliminary quantitative data may be helpful in determining normality or abnormality of vocal fold vibration. Results indicate that quantified amplitude of vibration is similar between endoscopic methods, a clinically relevant finding for individuals performing and interpreting stroboscopic assessments.
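The normalization step described in this abstract, expressing glottal width relative to true vocal fold length or width, can be sketched in a few lines of Python. The pixel values and names below are hypothetical; the example only shows why a ratio removes the effect of image magnification.

```python
def normalized_glottal_width(glottal_width, reference_length):
    """Glottal width as a ratio of a reference measure from the same frame."""
    if reference_length <= 0:
        raise ValueError("reference length must be positive")
    return glottal_width / reference_length


# Hypothetical pixel measurements of the same larynx at two magnifications.
near = {"gw": 60.0, "tvf_length": 240.0, "tvf_width": 80.0}
far = {"gw": 30.0, "tvf_length": 120.0, "tvf_width": 40.0}

for view in (near, far):
    gw_per_length = normalized_glottal_width(view["gw"], view["tvf_length"])
    gw_per_width = normalized_glottal_width(view["gw"], view["tvf_width"])
    print(gw_per_length, gw_per_width)
```

Both views yield the same ratios even though the raw pixel widths differ by a factor of two, which is why a normalized measure can be compared across participants and endoscope types.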
|