1
Patil N, Jain S, Wadhwa S. Unveiling the Potential: A Comprehensive Review of Dynamic Slow-Motion Video Endoscopy for Eustachian Tube Dysfunction Evaluation. Cureus 2024; 16:e63811. [PMID: 39099922 PMCID: PMC11297562 DOI: 10.7759/cureus.63811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2024] [Accepted: 07/04/2024] [Indexed: 08/06/2024]
Abstract
Eustachian tube dysfunction (ETD) poses diagnostic challenges due to its complex pathophysiology and varied clinical presentation. Traditional diagnostic methods often lack direct visualization of the Eustachian tube (ET) function, leading to suboptimal evaluation and management. Dynamic slow-motion video endoscopy (DSVE) has emerged as a novel approach to address these limitations, offering real-time visualization of ET dynamics with enhanced clarity and precision. This comprehensive review provides an overview of DSVE as a promising tool for evaluating ETD. We discuss its methodology, clinical applications, comparative analysis with traditional methods, and future directions. Key findings from the literature highlight DSVE's ability to enhance diagnostic accuracy, facilitate targeted treatment strategies, and improve patient outcomes. Integrating DSVE into routine clinical practice holds significant implications for the diagnosis and management of ETD, offering clinicians valuable insights into underlying pathophysiology and guiding personalized treatment interventions. Future research should focus on standardizing DSVE protocols, validating its diagnostic accuracy, and exploring its role in guiding novel treatment modalities. By advancing our understanding of ETD and optimizing diagnostic and therapeutic approaches, DSVE has the potential to revolutionize the management of this common yet challenging otologic condition.
Affiliation(s)
- Nimisha Patil
- Otolaryngology - Head and Neck Surgery, Jawaharlal Nehru Medical College, Datta Meghe Institute of Higher Education and Research, Wardha, IND
- Shraddha Jain
- Otolaryngology - Head and Neck Surgery, Jawaharlal Nehru Medical College, Datta Meghe Institute of Higher Education and Research, Wardha, IND
- Smriti Wadhwa
- Otolaryngology - Head and Neck Surgery, Jawaharlal Nehru Medical College, Datta Meghe Institute of Higher Education and Research, Wardha, IND
2
Kopczynski B, Niebudek-Bogusz E, Pietruszewska W, Strumillo P. Segmentation of Glottal Images from High-Speed Videoendoscopy Optimized by Synchronous Acoustic Recordings. Sensors (Basel, Switzerland) 2022; 22:1751. [PMID: 35270897 PMCID: PMC8915112 DOI: 10.3390/s22051751] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/30/2021] [Revised: 02/12/2022] [Accepted: 02/15/2022] [Indexed: 05/17/2023]
Abstract
Laryngeal high-speed videoendoscopy (LHSV) is an imaging technique offering novel visualization quality of the vibratory activity of the vocal folds. However, most image analysis methods require interaction with medical personnel and access to ground truth annotations to achieve accurate detection of vocal fold edges. In our fully automatic method, we combine video and acoustic data that are synchronously recorded during the laryngeal endoscopy. We show that the image segmentation algorithm of the glottal area can be optimized by matching the Fourier spectra of the pre-processed video and the spectra of the acoustic recording during the phonation of the sustained vowel /i:/. We verify our method on a set of LHSV recordings taken from subjects with normophonic voice and patients with voice disorders due to glottal insufficiency. We show that the computed geometric indices of the glottal area make it possible to discriminate between normal and pathologic voices. The median Open Quotient and Minimal Relative Glottal Area values were 0.69 and 0.06, respectively, for healthy subjects and 1 and 0.35, respectively, for dysphonic subjects. We also validate these results using independent phoniatrician experts.
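The two geometric indices named in this abstract can be illustrated with a minimal sketch. The abstract does not define them, so the definitions below are assumptions: the Open Quotient is taken as the fraction of samples in which the glottal area exceeds a small threshold, and the Minimal Relative Glottal Area as the minimum area divided by the maximum area. The function names, the threshold, and the synthetic waveforms are hypothetical and are not taken from the paper.

```python
import numpy as np


def open_quotient(area, closed_threshold=1e-6):
    """Fraction of samples in which the glottis counts as open.

    Assumed definition: a sample is "open" when its area exceeds
    closed_threshold (an illustrative value, not from the paper).
    """
    area = np.asarray(area, dtype=float)
    return float(np.mean(area > closed_threshold))


def minimal_relative_glottal_area(area):
    """Minimum glottal area divided by maximum glottal area (assumed definition)."""
    area = np.asarray(area, dtype=float)
    peak = area.max()
    if peak <= 0:
        raise ValueError("glottal area waveform never opens")
    return float(area.min() / peak)


# Synthetic glottal area waveforms covering 10 whole cycles (hypothetical data).
phase = 2 * np.pi * np.arange(1000) / 100.0

# Complete closure during half of every cycle.
closing_area = np.clip(np.sin(phase), 0.0, None)

# Incomplete closure: the area never falls below 35% of its maximum.
leaky_area = 0.35 + 0.65 * 0.5 * (1.0 + np.sin(phase))

for name, waveform in [("closing", closing_area), ("leaky", leaky_area)]:
    print(name, open_quotient(waveform), minimal_relative_glottal_area(waveform))
```

Under these assumed definitions, a waveform that never closes yields an Open Quotient of 1 and a Minimal Relative Glottal Area well above zero, which is the pattern the abstract reports for dysphonic subjects with glottal insufficiency.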
Affiliation(s)
- Bartosz Kopczynski
- Institute of Electronics, Lodz University of Technology, 90-924 Lodz, Poland
- Ewa Niebudek-Bogusz
- Department of Otolaryngology, Head and Neck Oncology, Medical University of Lodz, 90-001 Lodz, Poland
- Wioletta Pietruszewska
- Department of Otolaryngology, Head and Neck Oncology, Medical University of Lodz, 90-001 Lodz, Poland
- Pawel Strumillo
- Institute of Electronics, Lodz University of Technology, 90-924 Lodz, Poland
3
Abstract
A healthy voice is crucial for verbal communication and hence for daily as well as professional life. The basis of a healthy voice is the sound-producing vocal folds in the larynx. A hallmark of healthy vocal fold oscillation is the symmetric motion of the left and right vocal folds. Clinically, videoendoscopy is applied to assess the symmetry of the oscillation, which is evaluated subjectively. High-speed videoendoscopy, an emerging method that allows quantification of the vocal fold oscillation, is more commonly employed in research due to the amount of data and the complex, semi-automatic analysis. In this study, we provide a comprehensive evaluation of methods that detect the glottal midline fully automatically. We used a biophysical model to simulate different vocal fold oscillations, extended the openly available BAGLS dataset with manual annotations, used both simulations and annotated endoscopic images to train deep neural networks at different stages of the analysis workflow, and compared these to established computer vision algorithms. We found that classical computer vision algorithms perform well at detecting the glottal midline in glottis segmentation data but are outperformed by deep neural networks on this task. We further propose GlottisNet, a multi-task neural architecture that simultaneously predicts both the opening between the vocal folds and the symmetry axis. By fully automating segmentation and midline detection, it is a substantial step towards the clinical applicability of quantitative, deep learning-assisted laryngeal endoscopy.
4
Twenty Eight Days in the Life of a Vocal Pathology: A Case Study of Videolaryngostroboscopy, Acoustic, and Perceptual Variability. J Voice 2020; 36:732.e21-732.e31. [PMID: 32891478 DOI: 10.1016/j.jvoice.2020.08.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2020] [Revised: 08/10/2020] [Accepted: 08/11/2020] [Indexed: 11/21/2022]
Abstract
The purpose of this investigation was to observe laryngeal tissue and vocal function changes over the course of 28 days in a single participant diagnosed by a laryngologist with bilateral nodules. Laryngeal imaging, acoustic variables, perceptual assessments of voice quality, and perceived vocal effort were obtained every morning for 28 consecutive days. A daily journal of occupational and recreational voice use, as well as menstruation and alcohol consumption, was maintained. It was hypothesized that the laryngeal pathology would appear more extensive and the vocal function measures would be worse following extensive voice use. The laryngeal imaging, acoustic variables, and perceptual measures provided evidence to support the study hypotheses. The size, extent, and asymmetry of the bilateral vocal pathologies observed were more extensive on days following occupational and recreational vocal loading. The acoustic and perceptual measures correlated with the laryngeal tissue changes observed. Study findings support a more holistic approach to laryngeal pathology diagnosis, one based on a more thorough understanding of vocal loading in the 48 hours before laryngeal endoscopy, to better understand the pathophysiology of the observed lesion(s) and reach the most accurate clinical diagnosis.
5
Abstract
This review provides a comprehensive compilation, from a digital image processing point of view, of the most important techniques currently developed to characterize and quantify the vibration behaviour of the vocal folds, along with a detailed description of the laryngeal imaging modalities currently used in the clinic. The review presents an overview of the most significant glottal-gap segmentation and facilitative playback techniques used in the literature for this purpose, and describes the drawbacks and challenges that remain unsolved in developing robust tools, based on digital image processing, for analysing vocal fold vibration function.
6
Maguluri G, Mehta D, Kobler J, Park J, Iftimia N. Synchronized, concurrent optical coherence tomography and videostroboscopy for monitoring vocal fold morphology and kinematics. Biomedical Optics Express 2019; 10:4450-4461. [PMID: 31565501 PMCID: PMC6757476 DOI: 10.1364/boe.10.004450] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 02/26/2019] [Revised: 07/17/2019] [Accepted: 07/24/2019] [Indexed: 06/10/2023]
Abstract
Voice disorders affect a large number of adults in the United States, and their clinical evaluation heavily relies on laryngeal videostroboscopy, which captures the medial-lateral and anterior-posterior motion of the vocal folds using stroboscopic sampling. However, videostroboscopy does not provide direct visualization of the superior-inferior movement of the vocal folds, which yields important clinical insight. In this paper, we present a novel technology that complements videostroboscopic findings by adding the ability to image the coronal plane and visualize the superior-inferior movement of the vocal folds. The technology is based on optical coherence tomography, which is combined with videostroboscopy within the same endoscopic probe to provide spatially and temporally co-registered images of the mucosal wave motion, as well as vocal folds subsurface morphology. We demonstrate the capability of the rigid endoscopic probe, in a benchtop setting, to characterize the complex movement and subsurface structure of the aerodynamically driven excised larynx models within the 50 to 200 Hz phonation range. Our preliminary results encourage future development of this technology with the goal of its use for in vivo laryngeal imaging.
Affiliation(s)
- Daryush Mehta
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, MA 02114, USA
- James Kobler
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, MA 02114, USA
- Jesung Park
- Physical Sciences Inc., Andover, MA 01810, USA
7
Patel RR, Awan SN, Barkmeier-Kraemer J, Courey M, Deliyski D, Eadie T, Paul D, Švec JG, Hillman R. Recommended Protocols for Instrumental Assessment of Voice: American Speech-Language-Hearing Association Expert Panel to Develop a Protocol for Instrumental Assessment of Vocal Function. American Journal of Speech-Language Pathology 2018; 27:887-905. [PMID: 29955816 DOI: 10.1044/2018_ajslp-17-0009] [Citation(s) in RCA: 355] [Impact Index Per Article: 59.2] [Received: 01/20/2017] [Accepted: 02/17/2018] [Indexed: 05/09/2023]
Abstract
PURPOSE The aim of this study was to recommend protocols for instrumental assessment of voice production in the areas of laryngeal endoscopic imaging, acoustic analyses, and aerodynamic procedures, which will (a) improve the evidence for voice assessment measures, (b) enable valid comparisons of assessment results within and across clients and facilities, and (c) facilitate the evaluation of treatment efficacy. METHOD Existing evidence was combined with expert consensus in areas with a lack of evidence. In addition, a survey of clinicians and a peer review of an initial version of the protocol via VoiceServe and the American Speech-Language-Hearing Association's Special Interest Group 3 (Voice and Voice Disorders) Community were used to create the recommendations for the final protocols. RESULTS The protocols include recommendations regarding technical specifications for data acquisition, voice and speech tasks, analysis methods, and reporting of results for instrumental evaluation of voice production in the areas of laryngeal endoscopic imaging, acoustics, and aerodynamics. CONCLUSION The recommended protocols for instrumental assessment of voice using laryngeal endoscopic imaging, acoustic, and aerodynamic methods will enable clinicians and researchers to collect a uniform set of valid and reliable measures that can be compared across assessments, clients, and facilities.
Affiliation(s)
- Rita R Patel
- Department of Speech and Hearing Sciences, Indiana University, Bloomington
- Shaheen N Awan
- Department of Audiology and Speech-Language Pathology, Bloomsburg University of Pennsylvania
- Mark Courey
- Otolaryngology, The Mount Sinai Hospital, New York Eye and Ear Infirmary of Mount Sinai
- Dimitar Deliyski
- Department of Communicative Sciences and Disorders, Michigan State University, East Lansing
- Tanya Eadie
- Department of Speech and Hearing Sciences, University of Washington, Seattle
- Diane Paul
- Director, Clinical Issues in Speech-Language Pathology, American Speech-Language-Hearing Association, Rockville, MD
- Jan G Švec
- Department of Biophysics, Faculty of Science, Palacký University, Olomouc, Czech Republic
- Robert Hillman
- Massachusetts General Hospital, Harvard Medical School, MGH Institute of Health Professions, Boston
8
Popolo PS. Investigation of Flexible High-Speed Video Nasolaryngoscopy. J Voice 2017; 32:529-537. [PMID: 28958874 DOI: 10.1016/j.jvoice.2017.08.017] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 06/20/2017] [Revised: 08/17/2017] [Accepted: 08/17/2017] [Indexed: 11/30/2022]
Abstract
OBJECTIVE High-speed videolaryngoscopy is widely used in voice practices as a complement to videostroboscopy, especially when it is desired to visualize asymmetric and nonperiodic vocal fold vibration or voice onset and offset. Because of the requirement for greater illumination at higher frame rates, the high-speed exam is usually performed with a rigid transoral laryngoscope. Although it is possible to obtain color high-speed video images with a flexible fiberoptic nasoendoscope, the results are often disappointing because of the inability to provide adequate lighting inside the larynx. This paper will present the results of a systematic exploration of tools and techniques to optimize the image brightness of flexible color high-speed videolaryngoscopy exams using the KayPENTAX Model 9710 Color High-Speed Video (CHSV) System. METHODS The KayPENTAX CHSV System was used with three PENTAX flexible fiberoptic nasolaryngoscopes and a new supplemental light fiber bundle to perform high-speed examinations of healthy vocal folds. Variables of the investigation included camera frame rate, camera sensitivity (color head versus black-and-white head), optics (camera lens focal length), light coupling, nasoendoscope outer diameter, and endoscopy technique (visually perceived distance of the distal tip of scope from the glottal plane). RESULTS AND CONCLUSIONS The manipulation of camera gain, the proper selection of lens coupler focal length, and the adjustment of scope distal tip distance from the glottal plane were found to be most effective for optimizing image brightness, whereas the supplemental light fiber bundle provided minimal benefits. Other factors considered include patient comfort, practicality, and ease of use by the clinician.
Affiliation(s)
- Peter S Popolo
- Department of Communication Sciences and Disorders, Montclair State University, Montclair, New Jersey.
9
Oscillatory Onset and Offset in Young Vocally Healthy Adults Across Various Measurement Methods. J Voice 2017; 31:512.e17-512.e24. [PMID: 28169095 DOI: 10.1016/j.jvoice.2016.12.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 09/21/2016] [Revised: 12/01/2016] [Accepted: 12/02/2016] [Indexed: 11/20/2022]
Abstract
OBJECTIVE This study aimed to investigate the relationship between (1) oscillatory onset-offset time across various approaches that use different measurement criteria and (2) oscillatory onset and offset times in vocally healthy young adults. METHOD Oscillatory onset-offset times were obtained from 71 vocally normal adults, using high-speed videoendoscopy. Comparisons between the different onset methods involved measurement of the oscillatory onset time (OOT), voice initiation period (VIP), and the phonation onset time (POT), and for offset methods involved computation of the oscillatory offset time (OOToff) and the phonation offset time. RESULTS Correlation of the OOT with the VIP was 0.240 (P = 0.04) and with the POT from the glottal area waveform was 0.248 (P = 0.04); however, correlation between the VIP and the POT from the glottal area waveform was 0.661 (P < 0.001). For offset, there was a moderate correlation (rS = 0.503, P < 0.001) between OOToff and vocal offset period. The onset time was longest for the OOT, followed by the VIP and the POT. There was no correlation between onset and offset for any method. CONCLUSIONS A framework for quantification of oscillatory onset-offset time was developed for /hi/ tasks, which can be used for future measurements of disordered voice. A positive relationship was observed between VIP and POT and between OOToff and vocal offset period. There was a nonlinear relationship between the OOT, VIP, and POT measures. Onset-offset times are strongly influenced by the calculation method used, the pros and cons of which are discussed in this paper. Vibratory onset and offset represent physiologically different phenomena.
10
Niebudek-Bogusz E, Kopczynski B, Strumillo P, Morawska J, Wiktorowicz J, Sliwinska-Kowalska M. Quantitative assessment of videolaryngostroboscopic images in patients with glottic pathologies. Logop Phoniatr Voco 2016; 42:73-83. [DOI: 10.3109/14015439.2016.1174293] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 11/13/2022]