Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Yousef AM, Deliyski DD, Zacharias SRC, de Alarcon A, Orlikoff RF, Naghibolhosseini M. Spatial Segmentation for Laryngeal High-Speed Videoendoscopy in Connected Speech. J Voice 2023;37:26-36. [PMID: 33257208 PMCID: PMC8411982 DOI: 10.1016/j.jvoice.2020.10.017] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 08/16/2020] [Revised: 10/11/2020] [Accepted: 10/23/2020] [Indexed: 01/17/2023]

Cited by Other Article(s)

Naghibolhosseini M, Henry TM, Zayernouri M, Zacharias SRC, Deliyski DD. Supraglottic Laryngeal Maneuvers in Adductor Laryngeal Dystonia During Connected Speech. J Voice 2024:S0892-1997(24)00257-1. [PMID: 39217084 DOI: 10.1016/j.jvoice.2024.08.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2024] [Revised: 08/05/2024] [Accepted: 08/06/2024] [Indexed: 09/04/2024]

Abstract

OBJECTIVE

Adductor laryngeal dystonia (AdLD) disrupts fine motor movements of the vocal folds during speech, resulting in a strained, broken, and strangled voice. Laryngeal high-speed videoendoscopy (HSV) in connected speech enables direct visualization of detailed laryngeal dynamics and can therefore be used effectively to study AdLD. The current study uses HSV to investigate supraglottic laryngeal tissue maneuvers that obstruct the view of the vocal folds in AdLD and normophonic speakers during connected speech. Characterizing the laryngeal maneuvers in these groups can facilitate a deeper understanding of normophonic voice physiology and AdLD voice pathophysiology.

METHODS

HSV data were obtained from six normophonic speakers and six patients with AdLD during production of connected speech. Three experienced raters visually analyzed the data to determine laryngeal tissues leading to obstructions of vocal folds in HSV images. The raters recorded the duration of each obstruction and indicated the specific tissue(s) leading to the obstruction. After the completion of their individual visual analysis, the raters came to consensus about their observations and measurements.

RESULTS

Statistical analysis indicated that AdLD patients exhibited higher occurrences of vocal fold obstructions and longer durations of obstructions compared with the normophonic group. Similar obstruction types were found in both groups, with the epiglottis being the primary site of obstruction for both. Participants with AdLD displayed significantly elevated occurrences of sphincteric compression resulting in vocal fold obstruction.

CONCLUSION

HSV can be used to study the movements of laryngeal tissues in detail during connected speech. The analysis of supraglottic laryngeal tissue dynamics in speech can help us characterize the AdLD pathophysiology. The study's findings regarding the tissues implicated in obstructions may potentially inform the development of patient-specific therapeutic strategies targeting individual control over specific laryngeal muscles during phonation and speech production.

Yousef AM, Deliyski DD, Zacharias SRC, Naghibolhosseini M. Detection of Vocal Fold Image Obstructions in High-Speed Videoendoscopy During Connected Speech in Adductor Spasmodic Dysphonia: A Convolutional Neural Networks Approach. J Voice 2024;38:951-962. [PMID: 35304042 PMCID: PMC9474736 DOI: 10.1016/j.jvoice.2022.01.028] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 12/07/2021] [Revised: 01/30/2022] [Accepted: 01/30/2022] [Indexed: 01/10/2023]

Abstract

OBJECTIVE

Adductor spasmodic dysphonia (AdSD) is a neurogenic voice disorder that affects intrinsic laryngeal muscle control. AdSD leads to involuntary laryngeal spasms and manifests only during connected speech. Laryngeal high-speed videoendoscopy (HSV) coupled with a flexible fiberoptic endoscope provides a unique opportunity to study voice production and visualize the vocal fold vibrations in AdSD during speech. The goal of this study is to automatically detect instances during which the image of the vocal folds is optically obstructed in HSV recordings obtained during connected speech.

METHODS

HSV data were recorded from vocally normal adults and patients with AdSD during reading of the "Rainbow Passage", six CAPE-V sentences, and production of the vowel /i/. A convolutional neural network was developed and trained as a classifier to detect obstructed/unobstructed vocal folds in HSV frames. Manually labelled data were used for training, validating, and testing of the network. Moreover, a comprehensive robustness evaluation was conducted to compare the performance of the developed classifier and visual analysis of HSV data.

RESULTS

The developed convolutional neural network was able to automatically detect the vocal fold obstructions in HSV data in vocally normal participants and AdSD patients. The trained network was tested successfully and showed an overall classification accuracy of 94.18% on the testing dataset. The robustness evaluation showed an average overall accuracy of 94.81% on a massive number of HSV frames demonstrating the high robustness of the introduced technique while keeping a high level of accuracy.

CONCLUSIONS

The proposed approach can be used for efficient analysis of HSV data to study laryngeal maneuvers in patients with AdSD during connected speech. Additionally, this method will facilitate development of vocal fold vibratory measures for HSV frames with an unobstructed view of the vocal folds. Indicating parts of connected speech that provide an unobstructed view of the vocal folds can be used for developing optimal passages for precise HSV examination during connected speech and subject-specific clinical voice assessment protocols.
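As an illustration only, here is how per-frame classifier output might be turned into the unobstructed stretches of a recording mentioned above. It is a minimal sketch, not the authors' code. The function names, the 0/1 label convention, and the frame rate passed by the caller are all assumptions.

```python
from itertools import groupby


def unobstructed_segments(labels, fps):
    """Return (start_s, end_s) for each run of unobstructed frames.

    labels: one value per frame, 0 = unobstructed, 1 = obstructed
            (a hypothetical convention for this sketch).
    fps:    recording frame rate in frames per second (assumed known).
    """
    segments = []
    index = 0
    for value, run in groupby(labels):
        length = len(list(run))
        if value == 0:
            # Convert frame indices to seconds.
            segments.append((index / fps, (index + length) / fps))
        index += length
    return segments


def accuracy(predicted, truth):
    """Fraction of frames where the predicted label matches the manual label."""
    matches = sum(p == t for p, t in zip(predicted, truth))
    return matches / len(truth)
```

With a toy frame rate of 10 frames per second, `unobstructed_segments([0, 0, 1, 1, 1, 0, 0, 0, 1, 0], 10)` yields three segments. The first spans 0.0 to 0.2 s.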

Stager S, Maryn Y. A Retrospective Study of Acoustic Measures of Glottal Stop Production to Assess Vocal Function in Unilateral Vocal Fold Paresis/Paralysis Patients. J Speech Lang Hear Res 2024;67:1643-1659. [PMID: 38683058 DOI: 10.1044/2024_jslhr-23-00576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/01/2024]

Abstract

PURPOSE

The aim of this study was to determine (a) diagnostic accuracy of acoustic measures of glottal stop production (GSP; intensity differences, slopes, complete voicing cessation) to distinguish between unilateral vocal fold paresis/paralysis (UVFP) patients and controls; (b) if acoustic measures of GSP significantly correlated with an acoustic measure of voice disorder severity, acoustic voice quality index (AVQI); and (c) if acoustic measures from another type of voicing cessation, voiceless consonant production, also significantly differed between groups.

METHOD

Ninety-seven patients with unilateral paresis/paralysis and 35 controls with normal laryngostroboscopic signs produced two sets of five repeated [i] and four repeated [isi]. Tokens were randomized by type between groups and analyzed blinded using a customized Praat program that computed intensity differences and slopes between vowel maxima and glottal stop minima for inter-[i] tokens and vowel maxima and voiceless consonant minima for intra-[isi] tokens. The number of voicing cessations for inter-[i] tokens was obtained.

RESULTS

Onset and offset intensity differences and number of voicing cessations from inter-[i] tokens had the greatest areas under the curve (.854, .856, and .835, respectively). Correlation coefficients were significant (p < .01) between AVQI and all GSP acoustic measures with weak/medium effect sizes. No significant differences were found between controls and participants with UVFP for acoustic measures from intra-[isi].

CONCLUSIONS

Acoustic GSP measures demonstrated good diagnostic accuracy and some relationship to severity of voice disorder. No significant differences in acoustic measures for medial voiceless fricative consonants between controls and participants with UVFP suggested that voicing cessation for voiceless fricatives differs from voicing cessation for GSP.

Malinowski J, Pietruszewska W, Kowalczyk M, Niebudek-Bogusz E. Value of high-speed videoendoscopy as an auxiliary tool in differentiation of benign and malignant unilateral vocal lesions. J Cancer Res Clin Oncol 2024;150:10. [PMID: 38216796 PMCID: PMC10786956 DOI: 10.1007/s00432-023-05543-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 12/13/2023] [Indexed: 01/14/2024]

Abstract

PURPOSE

The study aimed to assess the relevance of objective vibratory parameters derived from high-speed videolaryngoscopy (HSV) as a supporting tool, to assist clinicians in establishing the initial diagnosis of benign and malignant glottal organic lesions.

METHODS

The HSV examinations were conducted in 175 subjects: 50 normophonic, 85 subjects with benign vocal fold lesions, and 40 with early glottic cancer; organic lesions were confirmed by histopathologic examination. The parameters derived from HSV kymography (amplitude, symmetry, and glottal dynamic characteristics) were compared statistically between the groups, followed by ROC analysis.

RESULTS

Among 14 calculated parameters, 10 differed significantly between the groups. Four of them, the average resultant amplitude of the involved vocal fold (AmpInvolvedAvg), average amplitude asymmetry for the whole glottis and its middle third part (AmplAsymAvg; AmplAsymAvg_2/3), and absolute average phase difference (AbsPhaseDiffAvg), showed significant differences between benign and malignant lesions. Amplitude values were decreasing, while asymmetry and phase difference values were increasing with the risk of malignancy. In ROC analysis, the highest AUC was observed for AmpAsymAvg (0.719; p < 0.0001), and next in order was AmpInvolvedAvg (0.70; p = 0.0002).
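For readers unfamiliar with the AUC values reported above, here is a minimal sketch of how an area under the ROC curve can be computed for a single parameter. It uses the pairwise (Mann-Whitney) formulation. It is not the study's analysis code. The function name and the convention that higher values indicate the positive class are assumptions.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case has a
    higher score than a randomly chosen negative case; ties count half.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

A value of 0.5 corresponds to chance-level separation and 1.0 to perfect separation. For a parameter that decreases with the condition of interest, the scores would be negated first.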

CONCLUSION

The gold standard in the diagnosis of organic lesions of the glottis remains clinical examination with videolaryngoscopy, confirmed by histopathological examination. Our results showed that measurements of amplitude, asymmetry, and phase of vibrations in malignant vocal fold masses deteriorate significantly in comparison to benign vocal lesions. High-speed videolaryngoscopy could aid their preliminary differentiation noninvasively before histopathological examination; however, further research on larger groups is needed.

Yousef AM, Deliyski DD, Zayernouri M, Zacharias SRC, Naghibolhosseini M. Deep Learning-Based Analysis of Glottal Attack and Offset Times in Adductor Laryngeal Dystonia. J Voice 2023:S0892-1997(23)00319-3. [PMID: 37977969 PMCID: PMC11093885 DOI: 10.1016/j.jvoice.2023.10.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Revised: 10/05/2023] [Accepted: 10/06/2023] [Indexed: 11/19/2023]

Abstract

OBJECTIVE

Diagnosis of adductor laryngeal dystonia (AdLD) is challenging as it mimics voice features of other voice disorders. This could lead to misdiagnosis (or delayed diagnosis) and ineffective treatments of AdLD. This paper develops automated measurements of glottal attack time (GAT) and glottal offset time (GOT) from high-speed videoendoscopy (HSV) in connected speech as objective measures that can potentially facilitate the diagnosis of this disorder in the future.

METHODS

HSV data were recorded from vocally normal adults and patients with AdLD during the reading of the "Rainbow Passage" and six CAPE-V (Consensus Auditory-Perceptual Evaluation of Voice) sentences. A deep learning framework was designed and trained to segment the glottal area and detect the vocal fold edges in the HSV dataset. This automated framework allowed us to automatically measure and quantify the GATs and GOTs for the participants. Accordingly, a comparison was held between the obtained measurements among vocally normal speakers and those with AdLD.

RESULTS

The automated framework was successfully developed and was able to accurately segment the glottal area/edges. The automated measurements of GAT and GOT revealed minor, nonsignificant differences compared to the results of manual analysis, showing a strong correlation between the automated and manual measures. The results showed significant differences in the GAT values between the vocally normal subjects and AdLD patients, with larger variability in both the GAT and GOT measures in the AdLD group.

CONCLUSIONS

The developed automated approach for GAT and GOT measurement can be valuable in clinical practice. These quantitative measurements can be used as meaningful biomarkers of the impaired vocal function in AdLD and help its differential diagnosis in the future.

Tsilivigkos C, Athanasopoulos M, Micco RD, Giotakis A, Mastronikolis NS, Mulita F, Verras GI, Maroulis I, Giotakis E. Deep Learning Techniques and Imaging in Otorhinolaryngology-A State-of-the-Art Review. J Clin Med 2023;12:6973. [PMID: 38002588 PMCID: PMC10672270 DOI: 10.3390/jcm12226973] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2023] [Revised: 11/02/2023] [Accepted: 11/06/2023] [Indexed: 11/26/2023]

Naghibolhosseini M, Zacharias SRC, Zenas S, Levesque F, Deliyski DD. Laryngeal Imaging Study of Glottal Attack/Offset Time in Adductor Spasmodic Dysphonia during Connected Speech. Appl Sci (Basel) 2023;13:2979. [PMID: 37034315 PMCID: PMC10077958 DOI: 10.3390/app13052979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/12/2023]

Sakthivel S, Prabhu V. Optimal Deep Learning-Based Vocal Fold Disorder Detection and Classification Model on High-Speed Video Endoscopy. J Healthc Eng 2022;2022:4248938. [PMID: 36353680 PMCID: PMC9640237 DOI: 10.1155/2022/4248938] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/23/2022] [Revised: 09/04/2022] [Accepted: 09/21/2022] [Indexed: 08/08/2023]

Yousef AM, Deliyski DD, Zacharias SRC, Naghibolhosseini M. Deep-Learning-Based Representation of Vocal Fold Dynamics in Adductor Spasmodic Dysphonia during Connected Speech in High-Speed Videoendoscopy. J Voice 2022:S0892-1997(22)00263-6. [PMID: 36154973 PMCID: PMC10030376 DOI: 10.1016/j.jvoice.2022.08.022] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/08/2022] [Revised: 08/14/2022] [Accepted: 08/17/2022] [Indexed: 11/28/2022]

Abstract

OBJECTIVE

Adductor spasmodic dysphonia (AdSD) is a neurogenic dystonia that causes spasms of the laryngeal muscles. This disorder mainly affects production of connected speech. To understand how AdSD affects vocal fold (VF) movements, and hence the speech signal, it is necessary to study VF kinematics during running speech. This paper introduces an automated method for analysis of VF vibrations in AdSD using laryngeal high-speed videoendoscopy (HSV) in running speech.

METHODS

A monochrome HSV system was used to obtain video recordings from vocally normal individuals and AdSD patients during production of the six CAPE-V sentences and the "Rainbow Passage." A deep neural network was designed based on the UNet architecture. The network was developed for glottal area segmentation in HSV data, providing a tool for quantitative analysis of VF vibrations in both vocally normal individuals and AdSD patients. The network was trained and validated using manually labeled HSV frames. After training, the segmentation quality was quantitatively evaluated against visual analysis results on a test dataset that included separate HSV frames and a short sequence of VF vibrations in consecutive frames.

RESULTS

The developed convolutional network was successfully trained and demonstrated an accurate segmentation on the testing dataset with a mean Intersection over Union (IoU) of 0.81 and a mean Boundary-F1 score of 0.93. Moreover, the visual assessment of the automated technique showed an accurate detection of the glottal edges/area in the HSV data even with challenging image quality and excessive laryngeal maneuvers of AdSD patients during the running speech.

CONCLUSION

The introduced automated approach provides an accurate representation of the glottal edges/area during connected speech in HSV data for norm and AdSD patients. This method facilitates the development of HSV-based measures to quantify VF dynamics in AdSD. Using HSV to automatically analyze VF vibrations in AdSD can allow for understanding AdSD vocal mechanisms and characteristics.

Yousef AM, Deliyski DD, Zacharias SRC, de Alarcon A, Orlikoff RF, Naghibolhosseini M. A Deep Learning Approach for Quantifying Vocal Fold Dynamics During Connected Speech Using Laryngeal High-Speed Videoendoscopy. J Speech Lang Hear Res 2022;65:2098-2113. [PMID: 35605603 PMCID: PMC9567340 DOI: 10.1044/2022_jslhr-21-00540] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/07/2021] [Revised: 01/30/2022] [Accepted: 02/28/2022] [Indexed: 06/15/2023]

Abstract

PURPOSE

Voice disorders are best assessed by examining vocal fold dynamics in connected speech. This can be achieved using flexible laryngeal high-speed videoendoscopy (HSV), which enables us to study vocal fold mechanics with high temporal details. Analysis of vocal fold vibration using HSV requires accurate segmentation of the vocal fold edges. This article presents an automated deep-learning scheme to segment the glottal area in HSV from which the glottal edges are derived during connected speech.

METHOD

Using a custom-built HSV system, data were obtained from a vocally healthy participant reciting the "Rainbow Passage." A deep neural network was designed for glottal area segmentation in the HSV data. A recently introduced hybrid approach by the authors was utilized as an automated labeling tool to train the network on a set of HSV frames, where the glottis region was automatically annotated during vocal fold vibrations. The network was then tested against manually segmented frames using different metrics, intersection over union (IoU), and Boundary F1 (BF) score, and its performance was assessed on various phonatory events on the HSV sequence.

RESULTS

The designed network was successfully trained using the hybrid approach, without the need for manual labeling, and tested on the manually labeled data. The performance metrics showed a mean IoU of 0.82 and a mean BF score of 0.96. In addition, the evaluation assessment of the network's performance demonstrated an accurate segmentation of the glottal edges/area even during complex nonstationary phonatory events and when vocal folds were not vibrating, thus overcoming the limitations of the previous hybrid approach that could only be applied to the vibrating vocal folds.
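The intersection over union (IoU) metric reported above compares a predicted segmentation mask with a reference mask. The following is a minimal sketch of that computation on binary masks stored as nested lists. It is not the authors' evaluation code. The function name is hypothetical, and returning 1.0 when both masks are empty is an assumed convention.

```python
def iou(mask_a, mask_b):
    """Intersection over union of two equally sized binary masks.

    Each mask is a list of rows; truthy entries mark glottal-area pixels.
    """
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    # Assumed convention: two empty masks agree perfectly.
    return intersection / union if union else 1.0
```

A mean IoU over a test set is then the average of this value across frames.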

CONCLUSIONS

The introduced automated scheme guarantees accurate glottis representation in challenging color HSV data with lower image quality and excessive laryngeal maneuvers during all instances of connected speech. This facilitates the future development of HSV-based measures to assess the running vibratory characteristics of the vocal folds in speakers with and without voice disorder.

SUPPLEMENTAL MATERIAL

https://doi.org/10.23641/asha.19798864.

Kopczynski B, Niebudek-Bogusz E, Pietruszewska W, Strumillo P. Segmentation of Glottal Images from High-Speed Videoendoscopy Optimized by Synchronous Acoustic Recordings. Sensors (Basel) 2022;22:1751. [PMID: 35270897 PMCID: PMC8915112 DOI: 10.3390/s22051751] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/30/2021] [Revised: 02/12/2022] [Accepted: 02/15/2022] [Indexed: 05/17/2023]

Kist AM, Gómez P, Dubrovskiy D, Schlegel P, Kunduk M, Echternach M, Patel R, Semmler M, Bohr C, Dürr S, Schützenberger A, Döllinger M. A Deep Learning Enhanced Novel Software Tool for Laryngeal Dynamics Analysis. J Speech Lang Hear Res 2021;64:1889-1903. [PMID: 34000199 DOI: 10.1044/2021_jslhr-20-00498] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Indexed: 06/12/2023]

Abstract

PURPOSE

High-speed videoendoscopy (HSV) is an emerging, but barely used, endoscopy technique in the clinic to assess and diagnose voice disorders because of the lack of dedicated software to analyze the data. HSV allows the vocal fold oscillations to be quantified by segmenting the glottal area. This challenging task has been tackled by various studies; however, the proposed approaches are mostly limited and not suitable for daily clinical routine.

METHOD

We developed user-friendly software in C# that allows the editing, motion correction, segmentation, and quantitative analysis of HSV data. We further provide pretrained deep neural networks for fully automatic glottis segmentation.

RESULTS

We freely provide our software Glottis Analysis Tools (GAT). Using GAT, we provide a general threshold-based region growing platform that enables the user to analyze data from various sources, such as in vivo recordings, ex vivo recordings, and high-speed footage of artificial vocal folds. Additionally, especially for in vivo recordings, we provide three robust neural networks at various speed and quality settings to allow a fully automatic glottis segmentation needed for application by untrained personnel. GAT further evaluates video and audio data in parallel and is able to extract various features from the video data, among others the glottal area waveform, that is, the changing glottal area over time. In total, GAT provides 79 unique quantitative analysis parameters for video- and audio-based signals. Many of these parameters have already been shown to reflect voice disorders, highlighting the clinical importance and usefulness of the GAT software.

CONCLUSION

GAT is a unique tool to process HSV and audio data to determine quantitative, clinically relevant parameters for research, diagnosis, and treatment of laryngeal disorders.

SUPPLEMENTAL MATERIAL

https://doi.org/10.23641/asha.14575533.
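To make the idea of threshold-based region growing concrete, here is a generic sketch in Python. It is not the GAT implementation, which is written in C#. The function name, the 4-connectivity, and the choice to grow over pixels at or below an intensity threshold (the glottal gap usually appears dark) are all assumptions for illustration.

```python
from collections import deque


def region_grow(image, seed, threshold):
    """Grow a region from `seed` over 4-connected pixels <= threshold.

    image:     list of rows of grayscale intensities.
    seed:      (row, col) starting pixel, e.g. chosen by a user.
    threshold: maximum intensity still counted as part of the region.
    Returns the set of (row, col) pixels in the grown region.
    """
    rows, cols = len(image), len(image[0])
    seed_row, seed_col = seed
    if image[seed_row][seed_col] > threshold:
        return set()  # Seed itself is too bright; nothing to grow.
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region and image[nr][nc] <= threshold:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region
```

Taking `len(region_grow(frame, seed, threshold))` for each frame of a recording gives one sample of a glottal area waveform in pixel units.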

Yousef AM, Deliyski DD, Zacharias SRC, de Alarcon A, Orlikoff RF, Naghibolhosseini M. A Hybrid Machine-Learning-Based Method for Analytic Representation of the Vocal Fold Edges during Connected Speech. Appl Sci (Basel) 2021;11. [PMID: 33717604 PMCID: PMC7954580 DOI: 10.3390/app11031179] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/17/2022]