Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Kist AM, Zilker J, Gómez P, Schützenberger A, Döllinger M. Rethinking glottal midline detection. Sci Rep 2020;10:20723. [PMID: 33244031 DOI: 10.1038/s41598-020-77216-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/20/2020] [Accepted: 11/06/2020] [Indexed: 11/22/2022] Open

For:	Kist AM, Zilker J, Gómez P, Schützenberger A, Döllinger M. Rethinking glottal midline detection. Sci Rep 2020;10:20723. [PMID: 33244031 DOI: 10.1038/s41598-020-77216-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/20/2020] [Accepted: 11/06/2020] [Indexed: 11/22/2022] Open

Number

Cited by Other Article(s)

Darvish M, Kist AM. A Generative Method for a Laryngeal Biosignal. J Voice 2024:S0892-1997(24)00019-5. [PMID: 38395653 DOI: 10.1016/j.jvoice.2024.01.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/04/2023] [Revised: 01/26/2024] [Accepted: 01/26/2024] [Indexed: 02/25/2024]

Pennington-FitzGerald W, Joshi A, Honzel E, Hernandez-Morato I, Pitman MJ, Moayedi Y. Development and Application of Automated Vocal Fold Tracking Software in a Rat Surgical Model. Laryngoscope 2024;134:340-346. [PMID: 37543969 DOI: 10.1002/lary.30930] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/24/2023] [Revised: 06/21/2023] [Accepted: 07/15/2023] [Indexed: 08/08/2023]

Kruse E, Döllinger M, Schützenberger A, Kist AM. GlottisNetV2: Temporal Glottal Midline Detection Using Deep Convolutional Neural Networks. IEEE JOURNAL OF TRANSLATIONAL ENGINEERING IN HEALTH AND MEDICINE 2023;11:137-144. [PMID: 36816097 PMCID: PMC9933989 DOI: 10.1109/jtehm.2023.3237859] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/01/2022] [Revised: 11/27/2022] [Accepted: 01/04/2023] [Indexed: 11/26/2023]

Peterson QA, Fei T, Sy LE, Froeschke LL, Mendelsohn AH, Berke GS, Peterson DA. Correlating Perceptual Voice Quality in Adductor Spasmodic Dysphonia With Computer Vision Assessment of Glottal Geometry Dynamics. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2022;65:3695-3708. [PMID: 36130065 PMCID: PMC9927624 DOI: 10.1044/2022_jslhr-22-00053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/12/2023]

Sakthivel S, Prabhu V. Optimal Deep Learning-Based Vocal Fold Disorder Detection and Classification Model on High-Speed Video Endoscopy. JOURNAL OF HEALTHCARE ENGINEERING 2022;2022:4248938. [PMID: 36353680 PMCID: PMC9640237 DOI: 10.1155/2022/4248938] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/23/2022] [Revised: 09/04/2022] [Accepted: 09/21/2022] [Indexed: 08/08/2023]

Döllinger M, Schraut T, Henrich LA, Chhetri D, Echternach M, Johnson AM, Kunduk M, Maryn Y, Patel RR, Samlan R, Semmler M, Schützenberger A. Re-Training of Convolutional Neural Networks for Glottis Segmentation in Endoscopic High-Speed Videos. APPLIED SCIENCES (BASEL, SWITZERLAND) 2022;12:9791. [PMID: 37583544 PMCID: PMC10427138 DOI: 10.3390/app12199791] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Indexed: 08/17/2023]

Affiliation(s)

Michael Döllinger Division of Phoniatrics and Pediatric Audiology, Department of Otorhino-laryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany
Tobias Schraut Division of Phoniatrics and Pediatric Audiology, Department of Otorhino-laryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany
Lea A. Henrich Division of Phoniatrics and Pediatric Audiology, Department of Otorhino-laryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany
Dinesh Chhetri Department of Head and Neck Surgery, David Geffen School of Medicine at the University of California, Los Angeles, Los Angeles, CA 90095, USA
Matthias Echternach Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Munich University Hospital (LMU), 80331 Munich, Germany
Aaron M. Johnson NYU Voice Center, Department of Otolaryngology–Head and Neck Surgery, New York University, Grossman School of Medicine, New York, NY 10001, USA
Melda Kunduk Department of Communication Sciences and Disorders, Louisiana State University, Baton Rouge, LA 70801, USA
Youri Maryn Department of Speech, Language and Hearing Sciences, University of Ghent, 9000 Ghent, Belgium
Rita R. Patel Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IA 47401, USA
Robin Samlan Department of Speech, Language, & Hearing Sciences, University of Arizona, Tucson, AZ 85641, USA
Marion Semmler Division of Phoniatrics and Pediatric Audiology, Department of Otorhino-laryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany
Anne Schützenberger Division of Phoniatrics and Pediatric Audiology, Department of Otorhino-laryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany

Collapse

Yousef AM, Deliyski DD, Zacharias SRC, Naghibolhosseini M. Deep-Learning-Based Representation of Vocal Fold Dynamics in Adductor Spasmodic Dysphonia during Connected Speech in High-Speed Videoendoscopy. J Voice 2022:S0892-1997(22)00263-6. [PMID: 36154973 PMCID: PMC10030376 DOI: 10.1016/j.jvoice.2022.08.022] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/08/2022] [Revised: 08/14/2022] [Accepted: 08/17/2022] [Indexed: 11/28/2022]

Abstract

OBJECTIVE

Adductor spasmodic dysphonia (AdSD) is a neurogenic dystonia, which causes spasms of the laryngeal muscles. This disorder mainly affects production of connected speech. To understand how AdSD affects vocal fold (VF) movements and hence, the speech signal, it is necessary to study VF kinematics during the running speech. This paper introduces an automated method for analysis of VF vibrations in AdSD using laryngeal high-speed videoendoscopy (HSV) in running speech.

METHODS

A monochrome HSV system was used to obtain video recordings from vocally normal individuals and AdSD patients during production of the six CAPE-V sentences and the "Rainbow Passage." A deep neural network was designed based on the UNet architecture. The network was developed for glottal area segmentation in HSV data providing a tool for quantitative analysis of VF vibrations in both norm and AdSD. The network was trained and validated using the manually labeled HSV frames. After training the network, the segmentation quality was quantitatively evaluated against visual analysis results of a test dataset including segregated HSV frames and a short sequence of VF vibrations in consecutive frames.

RESULTS

The developed convolutional network was successfully trained and demonstrated an accurate segmentation on the testing dataset with a mean Intersection over Union (IoU) of 0.81 and a mean Boundary-F1 score of 0.93. Moreover, the visual assessment of the automated technique showed an accurate detection of the glottal edges/area in the HSV data even with challenging image quality and excessive laryngeal maneuvers of AdSD patients during the running speech.

CONCLUSION

The introduced automated approach provides an accurate representation of the glottal edges/area during connected speech in HSV data for norm and AdSD patients. This method facilitates the development of HSV-based measures to quantify VF dynamics in AdSD. Using HSV to automatically analyze VF vibrations in AdSD can allow for understanding AdSD vocal mechanisms and characteristics.

Collapse

Paderno A, Gennarini F, Sordi A, Montenegro C, Lancini D, Villani FP, Moccia S, Piazza C. Artificial intelligence in clinical endoscopy: Insights in the field of videomics. Front Surg 2022;9:933297. [PMID: 36171813 PMCID: PMC9510389 DOI: 10.3389/fsurg.2022.933297] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2022] [Accepted: 08/22/2022] [Indexed: 11/13/2022] Open

A single latent channel is sufficient for biomedical glottis segmentation. Sci Rep 2022;12:14292. [PMID: 35995933 PMCID: PMC9395348 DOI: 10.1038/s41598-022-17764-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2022] [Accepted: 07/30/2022] [Indexed: 11/23/2022] Open

Yousef AM, Deliyski DD, Zacharias SRC, de Alarcon A, Orlikoff RF, Naghibolhosseini M. A Deep Learning Approach for Quantifying Vocal Fold Dynamics During Connected Speech Using Laryngeal High-Speed Videoendoscopy. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2022;65:2098-2113. [PMID: 35605603 PMCID: PMC9567340 DOI: 10.1044/2022_jslhr-21-00540] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/07/2021] [Revised: 01/30/2022] [Accepted: 02/28/2022] [Indexed: 06/15/2023]

Abstract

PURPOSE

Voice disorders are best assessed by examining vocal fold dynamics in connected speech. This can be achieved using flexible laryngeal high-speed videoendoscopy (HSV), which enables us to study vocal fold mechanics with high temporal details. Analysis of vocal fold vibration using HSV requires accurate segmentation of the vocal fold edges. This article presents an automated deep-learning scheme to segment the glottal area in HSV from which the glottal edges are derived during connected speech.

METHOD

Using a custom-built HSV system, data were obtained from a vocally healthy participant reciting the "Rainbow Passage." A deep neural network was designed for glottal area segmentation in the HSV data. A recently introduced hybrid approach by the authors was utilized as an automated labeling tool to train the network on a set of HSV frames, where the glottis region was automatically annotated during vocal fold vibrations. The network was then tested against manually segmented frames using different metrics, intersection over union (IoU), and Boundary F1 (BF) score, and its performance was assessed on various phonatory events on the HSV sequence.

RESULTS

The designed network was successfully trained using the hybrid approach, without the need for manual labeling, and tested on the manually labeled data. The performance metrics showed a mean IoU of 0.82 and a mean BF score of 0.96. In addition, the evaluation assessment of the network's performance demonstrated an accurate segmentation of the glottal edges/area even during complex nonstationary phonatory events and when vocal folds were not vibrating, thus overcoming the limitations of the previous hybrid approach that could only be applied to the vibrating vocal folds.

CONCLUSIONS

The introduced automated scheme guarantees accurate glottis representation in challenging color HSV data with lower image quality and excessive laryngeal maneuvers during all instances of connected speech. This facilitates the future development of HSV-based measures to assess the running vibratory characteristics of the vocal folds in speakers with and without voice disorder.

SUPPLEMENTAL MATERIAL

https://doi.org/10.23641/asha.19798864.

Collapse

Kist AM, Dürr S, Schützenberger A, Döllinger M. OpenHSV: an open platform for laryngeal high-speed videoendoscopy. Sci Rep 2021;11:13760. [PMID: 34215788 PMCID: PMC8253769 DOI: 10.1038/s41598-021-93149-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/04/2020] [Accepted: 06/03/2021] [Indexed: 11/22/2022] Open

Yousef AM, Deliyski DD, Zacharias SRC, de Alarcon A, Orlikoff RF, Naghibolhosseini M. A Hybrid Machine-Learning-Based Method for Analytic Representation of the Vocal Fold Edges during Connected Speech. APPLIED SCIENCES-BASEL 2021;11. [PMID: 33717604 PMCID: PMC7954580 DOI: 10.3390/app11031179] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]