1
Patel RR, Döllinger M, Jakubaß B, Pinhack H, Katz U, Semmler M. Analyzing Vocal Fold Frequency Dynamics Using High-Speed 3D Laser Video Endoscopy. Laryngoscope 2024; 134:3267-3276. [PMID: 38481073 PMCID: PMC11182720 DOI: 10.1002/lary.31394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Revised: 02/24/2024] [Accepted: 02/29/2024] [Indexed: 06/18/2024]
Abstract
OBJECTIVE To examine changes in lateral and vertical vibratory motion along the anterior, middle, and posterior sections of the vocal folds, as a function of vocal frequency variations. METHODS Absolute measurements of vocal fold surface dynamics from high-speed videoendoscopy with custom laser endoscope were made on 23 vocally healthy adults during sustained /i:/ production at 10%, 20%, and 80% of pitch range. The 3D parameters of amplitude (mm), maximum velocity opening/closing (mm/s), and mean velocity opening/closing (mm/s) were computed for the lateral and vertical vibratory motion along the anterior, middle, and posterior sections of the vocal folds. Linear mixed model analysis was conducted to evaluate the differences in (a) vocal frequency levels (high vs. normal vs. low pitch), (b) axis level (vertical vs. lateral), (c) position level (anterior vs. middle vs. posterior), and (d) gender differences (male vs. female). RESULTS Overall, the superior surface vertical motion of the vocal fold is greater compared with the lateral motion, especially in males. Along the superior surface, the mean and maximum closing velocities are greater posteriorly for low pitch. The location (anterior, middle, and posterior) along the superior surface is relevant only for vocal fold closing rather than opening, as the dynamics are different along the various locations. CONCLUSIONS The study highlights the significance of assessing the vertical motion of the superior surface of the vocal fold to understand the complex dynamics of voice production. LEVEL OF EVIDENCE NA Laryngoscope, 134:3267-3276, 2024.
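The velocity parameters named in this abstract can be illustrated with a minimal sketch. It assumes a displacement trajectory sampled at a known frame rate, forward differences for velocity, and peak-to-peak amplitude; the function name, the sample values, and the frame rate are hypothetical and are not taken from the study.

```python
def velocity_stats(displacement_mm, fps):
    """Summarize one sampled trajectory: amplitude plus opening/closing velocities.

    Assumption: positive frame-to-frame change means opening, negative means closing.
    """
    dt = 1.0 / fps
    # Forward differences between consecutive frames, in mm/s
    velocities = [(b - a) / dt for a, b in zip(displacement_mm, displacement_mm[1:])]
    opening = [v for v in velocities if v > 0]
    closing = [-v for v in velocities if v < 0]  # reported as positive speeds

    def summarize(values):
        if not values:
            return {"max": 0.0, "mean": 0.0}
        return {"max": max(values), "mean": sum(values) / len(values)}

    return {
        "amplitude_mm": max(displacement_mm) - min(displacement_mm),  # peak-to-peak
        "opening": summarize(opening),
        "closing": summarize(closing),
    }


# Hypothetical single cycle sampled at 4000 frames/s (values in mm)
cycle = [0.0, 0.2, 0.5, 0.6, 0.3, 0.0]
stats = velocity_stats(cycle, fps=4000)
```

Running the same function separately on lateral and vertical trajectories at anterior, middle, and posterior positions would give the kind of per-axis, per-position values the abstract compares.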
Affiliation(s)
- Rita R. Patel
- Department of Otolaryngology Head and Neck Surgery, Indiana University, Indianapolis, Indiana, United States
- Michael Döllinger
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Bernhard Jakubaß
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Hanna Pinhack
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Ute Katz
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Marion Semmler
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
2
Garcia GJM, Catalano D, Shum A, Larkee CE, Rhee JS. Estimation of Nasal Airway Cross-sectional Area From Endoscopy Using Depth Maps: A Proof-of-Concept Study. Otolaryngol Head Neck Surg 2024; 170:1581-1589. [PMID: 38329226 DOI: 10.1002/ohn.669] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 12/07/2023] [Accepted: 01/13/2024] [Indexed: 02/09/2024]
Abstract
OBJECTIVE Endoscopy is routinely used to diagnose obstructive airway diseases. Currently, endoscopy is only a visualization technique and does not allow quantification of airspace cross-sectional areas (CSAs). This pilot study tested the hypothesis that CSAs can be accurately estimated from depth maps created from virtual endoscopy videos. STUDY DESIGN Cross-sectional. SETTING Academic tertiary medical center. METHODS Virtual endoscopy and depth map videos of the nasal cavity were digitally created based on anatomically accurate three-dimensional (3D) models built from computed tomography scans of 30 subjects. A software tool was developed to outline the airway perimeter and estimate the airspace CSA from the depth maps. Two otolaryngologists used the software tool to estimate the nasopharynx CSA and the nasal valve minimal CSA (mCSA) in the left and right nasal cavities. Model validation statistics were performed. RESULTS Nasopharynx CSA had a median percent error of 3.7% to 4.6% when compared to the true values measured in the 3D models. Nasal valve mCSA had a median percent error of 22.7% to 33.6% relative to the true values. Raters successfully used the software tool to identify subjects with nasal valve stenosis (ie, mCSA < 0.20 cm²) with a sensitivity of 83.3%, specificity ≥ 90.7%, and classification accuracy ≥ 90.0%. Interrater and intrarater agreements were high. CONCLUSION This study demonstrates that airway CSAs in 3D models can be accurately estimated from depth maps. The development of artificial intelligence algorithms to compute depth maps may soon allow the quantification of airspace CSAs from clinical endoscopies.
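The validation statistics reported here (median percent error, and sensitivity, specificity, and accuracy for the mCSA < 0.20 cm² stenosis cutoff) can be sketched as follows. This is a minimal illustration, not the authors' software tool: the function names and the sample areas are hypothetical, and only the 0.20 cm² threshold comes from the abstract.

```python
from statistics import median

STENOSIS_THRESHOLD_CM2 = 0.20  # cutoff quoted in the abstract


def median_percent_error(estimates, truths):
    """Median of |estimate - truth| / truth, expressed in percent."""
    return median(abs(e - t) / t * 100.0 for e, t in zip(estimates, truths))


def stenosis_metrics(estimates, truths, threshold=STENOSIS_THRESHOLD_CM2):
    """Treat area < threshold as stenosis and score estimates against truths."""
    tp = fp = tn = fn = 0
    for e, t in zip(estimates, truths):
        predicted, actual = e < threshold, t < threshold
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "accuracy": (tp + tn) / len(truths),
    }


# Hypothetical areas in cm^2 (not data from the study)
true_mcsa = [0.15, 0.18, 0.25, 0.40, 0.50]
rater_mcsa = [0.12, 0.22, 0.30, 0.38, 0.19]
error_pct = median_percent_error(rater_mcsa, true_mcsa)
metrics = stenosis_metrics(rater_mcsa, true_mcsa)
```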
Affiliation(s)
- Guilherme J M Garcia
- Department of Biomedical Engineering, Marquette University and The Medical College of Wisconsin, Milwaukee, Wisconsin, USA
- Department of Otolaryngology and Communication Sciences, Medical College of Wisconsin, Milwaukee, Wisconsin, USA
- Dominic Catalano
- Department of Otolaryngology and Communication Sciences, Medical College of Wisconsin, Milwaukee, Wisconsin, USA
- Axel Shum
- Department of Otolaryngology and Communication Sciences, Medical College of Wisconsin, Milwaukee, Wisconsin, USA
- Christopher E Larkee
- Department of Biomedical Engineering, Marquette University and The Medical College of Wisconsin, Milwaukee, Wisconsin, USA
- John S Rhee
- Department of Otolaryngology and Communication Sciences, Medical College of Wisconsin, Milwaukee, Wisconsin, USA
3
Movahhedi M, Liu XY, Geng B, Elemans C, Xue Q, Wang JX, Zheng X. Predicting 3D soft tissue dynamics from 2D imaging using physics informed neural networks. Commun Biol 2023; 6:541. [PMID: 37208428 DOI: 10.1038/s42003-023-04914-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2022] [Accepted: 05/04/2023] [Indexed: 05/21/2023] Open
Abstract
Tissue dynamics play critical roles in many physiological functions and provide important metrics for clinical diagnosis. Capturing real-time high-resolution 3D images of tissue dynamics, however, remains a challenge. This study presents a hybrid physics-informed neural network algorithm that infers 3D flow-induced tissue dynamics and other physical quantities from sparse 2D images. The algorithm combines a recurrent neural network model of soft tissue with a differentiable fluid solver, leveraging prior knowledge in solid mechanics to project the governing equation on a discrete eigen space. The algorithm uses a long short-term memory (LSTM)-based recurrent encoder-decoder connected with a fully connected neural network to capture the temporal dependence of flow-structure interaction. The effectiveness and merit of the proposed algorithm are demonstrated on synthetic data from a canine vocal fold model and experimental data from excised pigeon syringes. The results showed that the algorithm accurately reconstructs 3D vocal dynamics, aerodynamics, and acoustics from sparse 2D vibration profiles.
Affiliation(s)
- Xin-Yang Liu
- Aerospace and Mechanical Engineering Department, University of Notre Dame, Notre Dame, IN, 46556, USA
- Biao Geng
- Mechanical Engineering Department, University of Maine, Orono, ME, 04469, USA
- Mechanical Engineering Department, Rochester Institute of Technology, Rochester, NY, 14623, USA
- Coen Elemans
- Department of Biology, University of Southern Denmark, Odense M, 5230, Denmark
- Qian Xue
- Mechanical Engineering Department, University of Maine, Orono, ME, 04469, USA
- Mechanical Engineering Department, Rochester Institute of Technology, Rochester, NY, 14623, USA
- Jian-Xun Wang
- Aerospace and Mechanical Engineering Department, University of Notre Dame, Notre Dame, IN, 46556, USA.
- Xudong Zheng
- Mechanical Engineering Department, University of Maine, Orono, ME, 04469, USA.
- Mechanical Engineering Department, Rochester Institute of Technology, Rochester, NY, 14623, USA.
4
Arias-Vergara T, Döllinger M, Schraut T, Mohd Khairuddin KA, Schützenberger A. Nyquist Plot Parametrization for Quantitative Analysis of Vibration of the Vocal Folds. J Voice 2023:S0892-1997(23)00014-0. [PMID: 36774264 DOI: 10.1016/j.jvoice.2023.01.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2022] [Revised: 01/12/2023] [Accepted: 01/12/2023] [Indexed: 02/11/2023]
Abstract
OBJECTIVES The Nyquist plot provides a graphical representation of the glottal cycles as elliptical trajectories in a 2D plane. This study proposes a methodology to parameterize the Nyquist plot with application to support the quantitative analysis of voice disorders. METHODS We considered high-speed videoendoscopy recordings of 33 functional dysphonia (FD) patients and 33 normophonic controls (NC). Quantitative analysis was performed by computing four shape-based parameters from the Nyquist plot: Variability, Size (Perimeter and Area), and Consistency. Additionally, we performed automatic classification using a linear support vector machine and feature importance analysis by combining the proposed features with state-of-the-art glottal area waveform (GAW) parameters. RESULTS We found that the inter-cycle variability was significantly higher in FD patients compared to NC. We achieved a classification accuracy of 83% when the top 30 most important features were used. Furthermore, the proposed Nyquist plot features were ranked in the top 12 most important features. CONCLUSIONS The Nyquist plot provides complementary information for subjective and objective assessment of voice disorders. On the one hand, with visual inspection it is possible to observe intra- and inter-glottal cycle irregularities during sustained phonation. On the other hand, shape-based parameters allow quantifying such irregularities and provide complementary information to state-of-the-art GAW parameters.
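As a rough illustration of the Size parameters (Perimeter and Area), the sketch below measures a single closed 2D trajectory with summed segment lengths and the shoelace formula. The abstract does not give the authors' exact definitions, so this is an assumption about how such shape measures might be computed; the function name and the sampled circle are hypothetical.

```python
import math


def perimeter_and_area(points):
    """Perimeter and enclosed area of a closed polygonal trajectory.

    points: list of (x, y) samples of one cycle; the last point is
    implicitly joined back to the first.
    """
    n = len(points)
    perimeter = 0.0
    twice_area = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        perimeter += math.hypot(x1 - x0, y1 - y0)
        twice_area += x0 * y1 - x1 * y0  # shoelace term
    return perimeter, abs(twice_area) / 2.0


# Hypothetical cycle: a unit circle sampled at 360 points
circle = [(math.cos(2 * math.pi * k / 360), math.sin(2 * math.pi * k / 360))
          for k in range(360)]
perimeter, area = perimeter_and_area(circle)
```

Applied cycle by cycle, the spread of these values across cycles would be one plausible way to express inter-cycle variability.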
Affiliation(s)
- Tomás Arias-Vergara
- University Hospital Erlangen, Medical School Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany.
- Michael Döllinger
- University Hospital Erlangen, Medical School Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
- Tobias Schraut
- University Hospital Erlangen, Medical School Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
- Anne Schützenberger
- University Hospital Erlangen, Medical School Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
5
Zhao YM, Currie EH, Kavoussi L, Rabbany SY. Laser scanner for 3D reconstruction of a wound's edge and topology. Int J Comput Assist Radiol Surg 2021; 16:1761-1773. [PMID: 34424457 DOI: 10.1007/s11548-021-02459-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/02/2020] [Accepted: 07/12/2021] [Indexed: 10/20/2022]
Abstract
PURPOSE Robotic systems have the potential to overcome inherent human limitations and offer substantial advantages to patients, including reduced surgery time. Our group has undertaken the challenge of developing an autonomous wound closure system. One of the initial steps is to allow accurate assessment of wound skin topology and wound edge location. We present a vision-laser scanner that generates a 3D point cloud for 3D reconstruction of a wound's edge and topology. METHODS While the laser range sensor measures the Z coordinate, two encoders installed on the actuators of the gantry robot simultaneously provide precise values of the X and Y coordinates. The 3D point cloud of the wound skin is generated by recording X, Y, and Z while scanning over the wound skin surface. To reduce the scanning time, we use a supplementary laser LED to project a regular laser spot onto the wound skin surface, which provides an additional measurement point through an artificial neural network estimation approach. Meanwhile, the point cloud of the wound edge is extracted by detecting whether the laser spot lies on the wound edge in the image from the 2D camera. RESULTS The mean absolute error (MAE) and standard deviation (σ) of the wound edge are measured in the MeshLab environment. The MAE (σ) in X (tangent), Y (tangent), and Z (normal) are 0.32 (0.22) mm, 0.37 (0.34) mm, and 0.61 (0.29) mm, respectively. The experimental results demonstrate that the vision-laser scanner attains high accuracy in determining wound edge location along the tangent of the wound skin. CONCLUSION A vision-laser scanner is developed for 3D reconstruction of a wound's edge and topology. Experimental tests on different wound models showed the effectiveness of the vision-laser scanner. The proposed scanner can generate a 3D point cloud of the wound skin and its edge simultaneously, and thus significantly improve the accuracy of wound closure in clinical applications.
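The per-axis accuracy figures reported as MAE (σ) can be reproduced in form with a short sketch. It assumes that σ is the population standard deviation of the absolute errors, which the abstract does not state; the function name and coordinate values are hypothetical.

```python
from statistics import mean, pstdev


def mae_and_sd(measured, reference):
    """Mean absolute error and standard deviation of the absolute errors.

    Assumption: sigma is the population SD of |measured - reference|.
    """
    errors = [abs(m - r) for m, r in zip(measured, reference)]
    return mean(errors), pstdev(errors)


# Hypothetical Z coordinates in mm for four edge points (not data from the study)
measured_z = [10.2, 9.7, 10.5, 10.0]
reference_z = [10.0, 10.0, 10.0, 10.0]
mae_z, sd_z = mae_and_sd(measured_z, reference_z)
```

Calling the function once per axis (X, Y, Z) would yield a triple of MAE (σ) pairs in the same layout as the reported results.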
Affiliation(s)
- Y M Zhao
- DeMatteis School of Engineering and Applied Science, Hofstra University, Hempstead, NY, USA
- Edward H Currie
- DeMatteis School of Engineering and Applied Science, Hofstra University, Hempstead, NY, USA.
- Louis Kavoussi
- Department of Urology, Long Island Jewish Medical Centre, Queens, NY, USA
- Sina Y Rabbany
- DeMatteis School of Engineering and Applied Science, Hofstra University, Hempstead, NY, USA
6
Kist AM, Gómez P, Dubrovskiy D, Schlegel P, Kunduk M, Echternach M, Patel R, Semmler M, Bohr C, Dürr S, Schützenberger A, Döllinger M. A Deep Learning Enhanced Novel Software Tool for Laryngeal Dynamics Analysis. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH: JSLHR 2021; 64:1889-1903. [PMID: 34000199 DOI: 10.1044/2021_jslhr-20-00498] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Indexed: 06/12/2023]
Abstract
Purpose High-speed videoendoscopy (HSV) is an emerging, but barely used, endoscopy technique in the clinic to assess and diagnose voice disorders because of the lack of dedicated software to analyze the data. HSV allows quantification of the vocal fold oscillations by segmenting the glottal area. This challenging task has been tackled by various studies; however, the proposed approaches are mostly limited and not suitable for daily clinical routine. Method We developed user-friendly software in C# that allows the editing, motion correction, segmentation, and quantitative analysis of HSV data. We further provide pretrained deep neural networks for fully automatic glottis segmentation. Results We freely provide our software Glottis Analysis Tools (GAT). Using GAT, we provide a general threshold-based region growing platform that enables the user to analyze data from various sources, such as in vivo recordings, ex vivo recordings, and high-speed footage of artificial vocal folds. Additionally, especially for in vivo recordings, we provide three robust neural networks at various speed and quality settings to allow a fully automatic glottis segmentation needed for application by untrained personnel. GAT further evaluates video and audio data in parallel and is able to extract various features from the video data, among others the glottal area waveform, that is, the changing glottal area over time. In total, GAT provides 79 unique quantitative analysis parameters for video- and audio-based signals. Many of these parameters have already been shown to reflect voice disorders, highlighting the clinical importance and usefulness of the GAT software. Conclusion GAT is a unique tool to process HSV and audio data to determine quantitative, clinically relevant parameters for research, diagnosis, and treatment of laryngeal disorders. Supplemental Material https://doi.org/10.23641/asha.14575533.
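The glottal area waveform described here, the glottal area over time, can be sketched from per-frame segmentation masks. This is not GAT code (GAT is written in C#); it is a minimal Python illustration that assumes binary masks with 1 marking glottis pixels, and the function names, the simplified open-fraction measure, and the toy frames are hypothetical.

```python
def glottal_area_waveform(masks, mm2_per_pixel=None):
    """Glottal area per frame from binary masks (1 = glottis pixel).

    Returns pixel counts, or areas in mm^2 if a calibration factor is given.
    """
    areas = [sum(sum(row) for row in mask) for mask in masks]
    if mm2_per_pixel is not None:
        areas = [a * mm2_per_pixel for a in areas]
    return areas


def open_fraction(gaw):
    """Simplified measure: fraction of frames with a non-zero glottal area."""
    return sum(1 for a in gaw if a > 0) / len(gaw)


# Hypothetical 3x3 masks for four consecutive frames
frames = [
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]],  # closed
    [[0, 1, 0], [0, 1, 0], [0, 0, 0]],  # opening
    [[0, 1, 0], [1, 1, 1], [0, 1, 0]],  # maximally open
    [[0, 0, 0], [0, 1, 0], [0, 0, 0]],  # closing
]
gaw = glottal_area_waveform(frames)
fraction_open = open_fraction(gaw)
```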
Affiliation(s)
- Andreas M Kist
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Pablo Gómez
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Denis Dubrovskiy
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Patrick Schlegel
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Melda Kunduk
- Department of Communication Sciences and Disorders, Louisiana State University, Baton Rouge
- Matthias Echternach
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Munich University Hospital (LMU), Germany
- Rita Patel
- Department of Speech, Language and Hearing Sciences, College of Arts and Sciences, Indiana University, Bloomington
- Marion Semmler
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Christopher Bohr
- Klinik und Poliklinik für Hals-Nasen-Ohren-Heilkunde Universitätsklinikum Regensburg, Germany
- Stephan Dürr
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Anne Schützenberger
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Michael Döllinger
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
7
Falk S, Kniesburges S, Schoder S, Jakubaß B, Maurerlehner P, Echternach M, Kaltenbacher M, Döllinger M. 3D-FV-FE Aeroacoustic Larynx Model for Investigation of Functional Based Voice Disorders. Front Physiol 2021; 12:616985. [PMID: 33762964 PMCID: PMC7982522 DOI: 10.3389/fphys.2021.616985] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 10/13/2020] [Accepted: 02/09/2021] [Indexed: 12/02/2022] Open
Abstract
For the clinical analysis of underlying mechanisms of voice disorders, we developed a numerical aeroacoustic larynx model, called simVoice, that mimics commonly observed functional laryngeal disorders such as glottal insufficiency and vibrational left-right asymmetries. The model is a combination of the Finite Volume (FV) CFD solver Star-CCM+ and the Finite Element (FE) aeroacoustic solver CFS++. simVoice models turbulence using Large Eddy Simulations (LES) and the acoustic wave propagation with the perturbed convective wave equation (PCWE). Its geometry corresponds to a simplified larynx and a vocal tract model representing the vowel /a/. The oscillations of the vocal folds are externally driven. In total, 10 configurations with different degrees of functional-based disorders were simulated and analyzed. The energy transfer between the glottal airflow and the vocal folds decreases with an increasing glottal insufficiency and potentially reflects the higher effort during speech for affected patients. This loss of energy transfer may also have an essential influence on the quality of the sound signal as expressed by decreasing sound pressure level (SPL), Cepstral Peak Prominence (CPP), and Vocal Efficiency (VE). Asymmetry in the vocal fold oscillations also reduces the quality of the sound signal. However, simVoice confirmed previous clinical and experimental observations that a high level of glottal insufficiency worsens the acoustic signal quality more than oscillatory left-right asymmetry. Both symptoms in combination will further reduce the quality of the sound signal. In summary, simVoice allows for detailed analysis of the origins of disordered voice production and hence fosters the further understanding of laryngeal physiology, including occurring dependencies. A current walltime of 10 h/cycle is, with a prospective increase in computing power, auspicious for a future clinical use of simVoice.
Affiliation(s)
- Sebastian Falk
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head & Neck Surgery, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Stefan Kniesburges
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head & Neck Surgery, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Stefan Schoder
- Institute of Fundamentals and Theory in Electrical Engineering, Division Vibro- and Aeroacoustics, Graz University of Technology, Graz, Austria
- Bernhard Jakubaß
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head & Neck Surgery, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Paul Maurerlehner
- Institute of Fundamentals and Theory in Electrical Engineering, Division Vibro- and Aeroacoustics, Graz University of Technology, Graz, Austria
- Matthias Echternach
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Munich University Hospital (LMU), Munich, Germany
- Manfred Kaltenbacher
- Institute of Fundamentals and Theory in Electrical Engineering, Division Vibro- and Aeroacoustics, Graz University of Technology, Graz, Austria
- Michael Döllinger
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head & Neck Surgery, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
8
Barone NA, Ludlow CL, Tellis CM. Acoustic and Aerodynamic Comparisons of Voice Qualities Produced After Voice Training. J Voice 2021; 35:157.e11-157.e21. [DOI: 10.1016/j.jvoice.2019.07.011] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/29/2019] [Revised: 07/11/2019] [Accepted: 07/15/2019] [Indexed: 10/26/2022]
9
Schlegel P, Kist AM, Semmler M, Döllinger M, Kunduk M, Dürr S, Schützenberger A. Determination of Clinical Parameters Sensitive to Functional Voice Disorders Applying Boosted Decision Stumps. IEEE JOURNAL OF TRANSLATIONAL ENGINEERING IN HEALTH AND MEDICINE 2020; 8:2100511. [PMID: 32518739 PMCID: PMC7274815 DOI: 10.1109/jtehm.2020.2985026] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/03/2019] [Revised: 02/21/2020] [Accepted: 03/28/2020] [Indexed: 12/30/2022]
Abstract
BACKGROUND Various voice assessment tools, such as questionnaires and aerodynamic voice characteristics, can be used to assess vocal function of individuals. However, not much is known about the best combinations of these parameters in identification of functional dysphonia in clinical settings. METHODS This study investigated six scores from clinically commonly used questionnaires and seven acoustic parameters. 514 females and 277 males were analyzed. The subjects were divided into three groups: one healthy group (N01) (49 females, 50 males) and two disordered groups with perceptually hoarse (FD23) (220 females, 96 males) and perceptually not hoarse (FD01) (245 females, 131 males) sounding voices. A tree stumps Adaboost approach was applied to find the subset of parameters that best separates the groups. Subsequently, it was determined if this parameter subset reflects treatment outcome for 120 female and 51 male patients by pairwise pre- and post-treatment comparisons of parameters. RESULTS The questionnaire "Voice-related-quality-of-Life" and three objective parameters ("maximum fundamental frequency", "maximum Intensity" and "Jitter Percent") were sufficient to separate the groups (accuracy ranging from 0.690 (FD01 vs. FD23, females) to 0.961 (N01 vs. FD23, females)). Our study suggests that a reduced parameter subset (4 out of 13) is sufficient to separate these three groups. All parameters reflected treatment outcome for patients with hoarse voices; Voice-related-quality-of-Life showed improvement for the not hoarse group (FD01). CONCLUSION Results show that single parameters are insufficient to separate voice disorders but a set of several well-chosen parameters is. These findings will help to optimize and reduce clinical assessment time.
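The "tree stumps Adaboost" idea can be sketched with a generic, textbook AdaBoost over one-feature threshold classifiers. This is not the authors' implementation: the function names are hypothetical, the two features stand in for a jitter value and a questionnaire score, and the toy data are invented and trivially separable so that the behaviour is easy to follow.

```python
import math


def stump_predict(row, feature, threshold, polarity):
    """Decision stump: +/- polarity depending on one feature threshold."""
    return polarity if row[feature] <= threshold else -polarity


def train_stump(X, y, weights):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                error = sum(
                    w for w, row, label in zip(weights, X, y)
                    if stump_predict(row, feature, threshold, polarity) != label
                )
                if best is None or error < best[0]:
                    best = (error, feature, threshold, polarity)
    return best


def adaboost(X, y, rounds=5):
    """Train a weighted ensemble of stumps; labels must be +1 or -1."""
    n = len(X)
    weights = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        error, feature, threshold, polarity = train_stump(X, y, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - error) / error)
        model.append((alpha, feature, threshold, polarity))
        # Increase the weight of misclassified samples, then renormalize
        weights = [
            w * math.exp(-alpha * label * stump_predict(row, feature, threshold, polarity))
            for w, row, label in zip(weights, X, y)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def predict(model, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(row, f, t, p) for a, f, t, p in model)
    return 1 if score >= 0 else -1


# Hypothetical data: [jitter value, questionnaire score]; -1 = healthy, +1 = disordered
X = [[0.3, 5], [0.4, 8], [0.5, 6], [1.2, 30], [1.5, 25], [1.1, 40]]
y = [-1, -1, -1, 1, 1, 1]
model = adaboost(X, y, rounds=3)
selected_features = {feature for _, feature, _, _ in model}
```

Inspecting which features the stumps select is the step that corresponds to finding a reduced parameter subset; on real data, accuracy would need to be estimated on held-out subjects rather than on the training set.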
Affiliation(s)
- Patrick Schlegel
- Department of Otorhinolaryngology Head and Neck Surgery, Division of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany
- Andreas M. Kist
- Department of Otorhinolaryngology Head and Neck Surgery, Division of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany
- Marion Semmler
- Department of Otorhinolaryngology Head and Neck Surgery, Division of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany
- Michael Döllinger
- Department of Otorhinolaryngology Head and Neck Surgery, Division of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany
- Melda Kunduk
- Department of Communication Sciences and Disorders, Louisiana State University, Baton Rouge, LA 70803, USA
- Stephan Dürr
- Department of Otorhinolaryngology Head and Neck Surgery, Division of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany
- Anne Schützenberger
- Department of Otorhinolaryngology Head and Neck Surgery, Division of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany
10
Abstract
This review provides a comprehensive compilation, from a digital image processing point of view, of the most important techniques currently developed to characterize and quantify the vibration behaviour of the vocal folds, along with a detailed description of the laryngeal imaging modalities currently used in the clinic. The review presents an overview of the most significant glottal-gap segmentation and facilitative playback techniques used in the literature for this purpose, and shows the drawbacks and challenges that remain unsolved in developing robust vocal fold vibration analysis tools based on digital image processing.
11
Maguluri G, Mehta D, Kobler J, Park J, Iftimia N. Synchronized, concurrent optical coherence tomography and videostroboscopy for monitoring vocal fold morphology and kinematics. BIOMEDICAL OPTICS EXPRESS 2019; 10:4450-4461. [PMID: 31565501 PMCID: PMC6757476 DOI: 10.1364/boe.10.004450] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 02/26/2019] [Revised: 07/17/2019] [Accepted: 07/24/2019] [Indexed: 06/10/2023]
Abstract
Voice disorders affect a large number of adults in the United States, and their clinical evaluation heavily relies on laryngeal videostroboscopy, which captures the medial-lateral and anterior-posterior motion of the vocal folds using stroboscopic sampling. However, videostroboscopy does not provide direct visualization of the superior-inferior movement of the vocal folds, which yields important clinical insight. In this paper, we present a novel technology that complements videostroboscopic findings by adding the ability to image the coronal plane and visualize the superior-inferior movement of the vocal folds. The technology is based on optical coherence tomography, which is combined with videostroboscopy within the same endoscopic probe to provide spatially and temporally co-registered images of the mucosal wave motion, as well as vocal folds subsurface morphology. We demonstrate the capability of the rigid endoscopic probe, in a benchtop setting, to characterize the complex movement and subsurface structure of the aerodynamically driven excised larynx models within the 50 to 200 Hz phonation range. Our preliminary results encourage future development of this technology with the goal of its use for in vivo laryngeal imaging.
Affiliation(s)
- Daryush Mehta
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, MA 02114, USA
- James Kobler
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, MA 02114, USA
- Jesung Park
- Physical Sciences Inc., Andover, MA 01810, USA
12
Deliyski DD, Shishkov M, Mehta DD, Ghasemzadeh H, Bouma B, Zañartu M, de Alarcon A, Hillman RE. Laser-Calibrated System for Transnasal Fiberoptic Laryngeal High-Speed Videoendoscopy. J Voice 2019; 35:122-128. [PMID: 31383516 DOI: 10.1016/j.jvoice.2019.07.013] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/21/2019] [Accepted: 07/16/2019] [Indexed: 10/26/2022]
Abstract
The design specifications and experimental characteristics of a newly developed laser-projection transnasal flexible endoscope coupled with a high-speed videoendoscopy system are provided. The hardware and software design of the proposed system benefits from the combination of structured green light projection and laser triangulation techniques, which provide the capability of calibrated absolute measurements of the laryngeal structures along the horizontal and vertical planes during phonation. Visual inspection of in vivo acquired images demonstrated sharp contrast between laser points and background, confirming successful design of the system. Objective analyses were carried out for assessing the irradiance of the system and the penetration of the green laser light into the red and blue channels in the recorded images. The analysis showed that the system has irradiance of 372 W/m² at a working distance of 20 mm, which is well within the safety limits, indicating minimal risk of usage of the device on human subjects. Additionally, the color penetration analysis showed that, with probability of 90%, the ratio of contamination of the red channel from the green laser light is less than 0.002. This indicates minimal effect of the laser projection on the measurements performed on the red data channel, making the system applicable for calibrated 3D spatial-temporal segmentation and data-driven subject-specific modeling, which is important for further advancing voice science and clinical voice assessment.
Affiliation(s)
- Dimitar D Deliyski
  - Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, Michigan
- Milen Shishkov
  - Wellman Center for Photomedicine, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts
- Daryush D Mehta
  - Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, Massachusetts; Department of Surgery, Harvard Medical School, Boston, Massachusetts; Division of Medical Sciences, Speech and Hearing Bioscience and Technology, Harvard Medical School, Boston, Massachusetts
- Hamzeh Ghasemzadeh
  - Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, Michigan; Department of Computational Mathematics Science and Engineering, Michigan State University, East Lansing, Michigan
- Brett Bouma
  - Wellman Center for Photomedicine, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts
- Matias Zañartu
  - Department of Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso, Chile
- Alessandro de Alarcon
  - Division of Pediatric Otolaryngology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio
- Robert E Hillman
  - Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, Massachusetts; Department of Surgery, Harvard Medical School, Boston, Massachusetts; Division of Medical Sciences, Speech and Hearing Bioscience and Technology, Harvard Medical School, Boston, Massachusetts
|
13
|
Imaging the Vocal Folds: A Feasibility Study on Strain Imaging and Elastography of Porcine Vocal Folds. APPLIED SCIENCES-BASEL 2019. [DOI: 10.3390/app9132729] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 01/31/2023]
Abstract
Vocal folds are an essential part of human voice production. The biomechanical properties are a good indicator for pathological changes. In particular, as an oscillation system, changes in the biomechanical properties have an impact on the vibration behavior. Subsequently, those changes could lead to voice-related disturbances. However, no existing examination combines biomechanical properties and spatial imaging. Therefore, we propose an image registration-based approach, using ultrasound in order to gain this information synchronously. We used a quasi-static load to compress the tissue and measured the displacement by image registration. The strain distribution was directly calculated from the displacement field, whereas the elastic properties were estimated by a finite element model. In order to show the feasibility and reliability of the algorithm, we tested it on gelatin phantoms. Further, by examining ex vivo porcine vocal folds, we were able to show the practicability of the approach. We displayed the strain distribution in the tissue and the elastic properties of the vocal folds. The results were superimposed on the corresponding ultrasound images. The findings are promising and show the feasibility of the suggested approach. Possible applications are in improved diagnosis of voice disorders, by measuring the biomechanical properties of the vocal folds with ultrasound. The transducer will be placed on the vocal folds of the anesthetized patient, and the elastic properties will be measured. Further, the understanding of the vocal folds’ biomechanics and the voice forming process could benefit from it.
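The step "strain calculated directly from the displacement field" can be illustrated with a minimal one-dimensional sketch. It assumes displacements sampled at evenly spaced depths along the compression axis and uses simple finite differences; the function name `axial_strain`, the sample values, and the 1D simplification are illustrative assumptions, not the authors' implementation.

```python
def axial_strain(displacement, spacing):
    """Estimate strain as the spatial derivative of displacement (du/dx).

    displacement: displacements (e.g. in mm) at evenly spaced depths
    spacing: distance between neighbouring samples, in the same unit
    Central differences are used inside the profile, one-sided at the ends.
    """
    n = len(displacement)
    if n < 2:
        raise ValueError("need at least two displacement samples")
    strain = []
    for i in range(n):
        lo = max(i - 1, 0)        # clamp the stencil at the boundaries
        hi = min(i + 1, n - 1)
        strain.append((displacement[hi] - displacement[lo]) / ((hi - lo) * spacing))
    return strain


# Hypothetical profile: displacement decreasing linearly with depth,
# which corresponds to a uniform compressive strain of about -0.02.
example_displacement = [-0.01 * i for i in range(5)]
example_strain = axial_strain(example_displacement, spacing=0.5)
```

A negative value denotes compression under this sign convention. Estimating elastic properties from such a strain map would additionally require a mechanical model, which the abstract describes as a finite element model and which is not sketched here.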
|
14
|
Influence of spatial camera resolution in high-speed videoendoscopy on laryngeal parameters. PLoS One 2019; 14:e0215168. [PMID: 31009488 PMCID: PMC6476512 DOI: 10.1371/journal.pone.0215168] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 11/26/2018] [Accepted: 03/27/2019] [Indexed: 11/19/2022] Open
Abstract
In laryngeal high-speed videoendoscopy (HSV) the area between the vibrating vocal folds during phonation is of interest, being referred to as glottal area waveform (GAW). Varying camera resolution may influence parameters computed on the GAW and hence hinder the comparability between examinations. This study investigates the influence of spatial camera resolution on quantitative vocal fold vibratory function parameters obtained from the GAW. In total, 40 HSV recordings during sustained phonation (20 healthy males and 20 healthy females) were investigated. A clinically used Photron Fastcam MC2 camera with a frame rate of 4000 fps and a spatial resolution of 512×256 pixels was applied. This initial resolution was reduced by pixel averaging to (1) a resolution of 256×128 pixels and (2) a resolution of 128×64 pixels, yielding three sets of recordings. The GAW was extracted and in total 50 vocal fold vibratory parameters representing different features of the GAW were computed. Statistical analysis using SPSS Statistics, version 21, was performed. Fifteen parameters showing strong mathematical dependencies with other parameters were excluded from the main analysis but are given in the Supporting Information. Data analysis revealed clear influence of spatial resolution on GAW parameters. Fundamental period measures and period perturbation measures were the least affected. Amplitude perturbation measures and mechanical measures were most strongly influenced. Most glottal dynamic characteristics and symmetry measures deviated significantly. Most energy perturbation measures changed significantly in males but were mostly unaffected in females. In females 18 of 35 remaining parameters (51%) and in males 22 parameters (63%) changed significantly between spatial resolutions. This work represents the first step in studying the impact of video resolution on quantitative HSV parameters. Clear influences of spatial camera resolution on computed parameters were found. The study results suggest avoiding the use of the most strongly affected parameters. Further, the use of cameras with high resolution is recommended to analyze GAW measures in HSV data.
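The resolution reduction by pixel averaging described above can be sketched as follows. The abstract does not state the exact averaging kernel, so this example assumes non-overlapping 2×2 blocks applied twice (512×256 to 256×128 to 128×64); the function name `downsample_by_averaging` and the synthetic frame are hypothetical.

```python
def downsample_by_averaging(frame, factor=2):
    """Shrink a 2D grayscale frame by averaging non-overlapping blocks.

    frame: list of rows, each a list of pixel intensities
    factor: block edge length; both frame dimensions must be divisible by it
    """
    rows, cols = len(frame), len(frame[0])
    if rows % factor or cols % factor:
        raise ValueError("frame dimensions must be divisible by factor")
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            # Collect the factor x factor block and replace it by its mean
            block = [frame[r + dr][c + dc]
                     for dr in range(factor) for dc in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


# Synthetic frame with 256 rows and 512 columns (not data from the study)
full = [[(r + c) % 256 for c in range(512)] for r in range(256)]
half = downsample_by_averaging(full)       # 128 rows, 256 columns
quarter = downsample_by_averaging(half)    # 64 rows, 128 columns
```

Applying the same segmentation and parameter computation to `full`, `half`, and `quarter` would then yield three parameter sets per recording, which is the kind of comparison the study reports.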
|