1
Nobel SMN, Swapno SMMR, Islam MR, Safran M, Alfarhood S, Mridha MF. A machine learning approach for vocal fold segmentation and disorder classification based on ensemble method. Sci Rep 2024; 14:14435. [PMID: 38910146 DOI: 10.1038/s41598-024-64987-5] [Received: 03/02/2024] [Accepted: 06/14/2024] [Indexed: 06/25/2024] Open
Abstract
In the healthcare domain, the essential task is to understand and classify diseases affecting the vocal folds (VFs). The accurate identification of VF disease is the key issue in this domain. Integrating VF segmentation and disease classification into a single system is challenging but important for precise diagnostics. Our study addresses this challenge by combining VF illness categorization and VF segmentation into a single integrated system. We utilized two effective ensemble machine learning methods: ensemble EfficientNetV2L-LGBM and ensemble UNet-BiGRU. We utilized the EfficientNetV2L-LGBM model for classification, achieving a training accuracy of 98.88%, validation accuracy of 97.73%, and test accuracy of 97.88%. These exceptional outcomes highlight the system's ability to classify different VF illnesses precisely. In addition, we utilized the UNet-BiGRU model for segmentation, which attained a training accuracy of 92.55%, a validation accuracy of 89.87%, and a significant test accuracy of 91.47%. In the segmentation task, we examined some methods to improve our ability to divide data into segments, resulting in a testing accuracy score of 91.99% and an Intersection over Union (IOU) of 87.46%. These measures demonstrate the skill of the model in accurately defining and separating the VFs. Our system's classification and segmentation results confirm its capacity to effectively identify and segment VF disorders, representing a significant advancement in enhancing diagnostic accuracy and healthcare in this specialized field. This study emphasizes the potential of machine learning to transform the medical field's capacity to categorize VF disorders and segment VFs, providing clinicians with a vital instrument to mitigate the profound impact of the condition. Implementing this innovative approach is expected to enhance medical procedures and provide a sense of optimism to those globally affected by VF disease.
Affiliation(s)
- S M Nuruzzaman Nobel
  - Department of Computer Science and Engineering, Bangladesh University of Business and Technology, Dhaka, 1216, Bangladesh
- S M Masfequier Rahman Swapno
  - Department of Computer Science and Engineering, Bangladesh University of Business and Technology, Dhaka, 1216, Bangladesh
- Md Rajibul Islam
  - Department of Electrical and Electronic Engineering, The Hong Kong Polytechnic University, Hong Kong, China
- Mejdl Safran
  - Department of Computer Science, College of Computer and Information Sciences, King Saud University, P. O. Box 51178, 11543, Riyadh, Saudi Arabia
- Sultan Alfarhood
  - Department of Computer Science, College of Computer and Information Sciences, King Saud University, P. O. Box 51178, 11543, Riyadh, Saudi Arabia
- M F Mridha
  - Department of Computer Science, American International University-Bangladesh, Dhaka, 1229, Bangladesh
2
Hackman A, Chen CH, Chen AWG, Chen MK. Automatic Segmentation of Membranous Glottal Gap Area with U-Net-Based Architecture. Laryngoscope 2024; 134:2835-2843. [PMID: 38217455 DOI: 10.1002/lary.31266] [Received: 08/09/2023] [Revised: 12/10/2023] [Accepted: 12/21/2023] [Indexed: 01/15/2024]
Abstract
BACKGROUND: While videostroboscopy is recognized as the most popular approach for investigating vocal fold function, evaluating the numerical values, such as the membranous glottal gap area, remains too time consuming for clinical applications. METHODS: We used a total of 2507 videostroboscopy images from 137 patients and developed five U-Net-based deep-learning image segmentation models for automatic masking of the membranous glottal gap area. To further validate the models, we used another 410 images from 41 different patients. RESULTS: During development, all five models exhibited acceptable and similar metrics. While the VGG19 U-Net had a long inference time of 1654 ms, the other four models had more practical inference times, ranging from 16 to 138 ms. During further validation, Efficient U-Net demonstrated the highest intersection over union of 0.8455, the highest Dice coefficient of 0.9163, and the lowest Hausdorff distance of 1.5626. The normalized membranous glottal gap area index was also calculated and validated. Efficient U-Net and VGG19 U-Net exhibited the lowest mean squared errors (3.5476 and 3.3842) and the lowest mean absolute errors (1.8835 and 1.8396). CONCLUSIONS: Automatic segmentation of the membranous glottal gap area can be achieved through U-Net-based architecture. Considering the segmentation quality and speed, Efficient U-Net is a reasonable choice for this task, while the other four models remain valuable competitors. The models' masked area enables possible calculation of the normalized membranous glottal gap area and analysis of the glottal area waveform, revealing promising clinical applications for this model. LEVEL OF EVIDENCE: NA. Laryngoscope, 134:2835-2843, 2024.
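For readers unfamiliar with the overlap metrics quoted in this abstract, the following is a minimal sketch of how intersection over union and the Dice coefficient are commonly computed for a pair of binary masks. It is not the authors' code; the function names (`overlap_counts`, `iou`, `dice`) and the toy masks are hypothetical, and the convention of returning 1.0 for two empty masks is an assumption.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]  # 2-D binary mask: nonzero = segmented area, 0 = background


def overlap_counts(pred: Mask, truth: Mask) -> tuple[int, int, int]:
    """Return (intersection, predicted area, true area) in pixels."""
    if len(pred) != len(truth) or any(len(p) != len(t) for p, t in zip(pred, truth)):
        raise ValueError("masks must have the same shape")
    inter = pred_area = truth_area = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            p, t = bool(p), bool(t)
            inter += p and t
            pred_area += p
            truth_area += t
    return inter, pred_area, truth_area


def iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union; 1.0 when both masks are empty (assumed convention)."""
    inter, a, b = overlap_counts(pred, truth)
    union = a + b - inter
    return inter / union if union else 1.0


def dice(pred: Mask, truth: Mask) -> float:
    """Dice coefficient; 1.0 when both masks are empty (assumed convention)."""
    inter, a, b = overlap_counts(pred, truth)
    return 2 * inter / (a + b) if (a + b) else 1.0


# Hypothetical 2x3 masks: 2 shared pixels, 3 pixels in each mask.
pred = [[0, 1, 1], [0, 1, 0]]
truth = [[0, 1, 0], [0, 1, 1]]
print(iou(pred, truth), dice(pred, truth))
```

For a single mask pair the two metrics are related by Dice = 2·IoU / (1 + IoU). The reported values of 0.8455 and 0.9163 are numerically consistent with that identity, although metrics averaged over many images need not satisfy it exactly.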
Affiliation(s)
- Acquah Hackman
  - Artificial Intelligence Development Center, Changhua Christian Hospital, Changhua, Taiwan
- Chih-Hua Chen
  - Department of Otorhinolaryngology, Head and Neck Surgery, Changhua Christian Hospital, Changhua, Taiwan
- Andy Wei-Ge Chen
  - Department of Otorhinolaryngology, Head and Neck Surgery, Changhua Christian Hospital, Changhua, Taiwan
  - Doctoral Program in Translational Medicine, National Chung Hsing University, Taichung, Taiwan
  - Rong Hsing Translational Medicine Research Center, National Chung Hsing University, Taichung, Taiwan
- Mu-Kuan Chen
  - Department of Otorhinolaryngology, Head and Neck Surgery, Changhua Christian Hospital, Changhua, Taiwan
  - Department of Post-Baccalaureate Medicine, College of Medicine, National Chung Hsing University, Taichung, Taiwan
3
Pennington-FitzGerald W, Joshi A, Honzel E, Hernandez-Morato I, Pitman MJ, Moayedi Y. Development and Application of Automated Vocal Fold Tracking Software in a Rat Surgical Model. Laryngoscope 2024; 134:340-346. [PMID: 37543969 DOI: 10.1002/lary.30930] [Received: 01/24/2023] [Revised: 06/21/2023] [Accepted: 07/15/2023] [Indexed: 08/08/2023]
Abstract
OBJECTIVE: The rat is a widely used model for studying vocal fold (VF) function after recurrent laryngeal nerve injury, but common techniques for evaluating rat VF motion remain subjective and imprecise. To address this, we developed a software package, called RatVocalTracker1.0 (RVT1.0), to quantify VF motion and tested it on rats with iatrogenic unilateral vocal fold paralysis (VFP). METHODS: A deep neural network was trained to identify the positions of the VFs and arytenoid cartilages (ACs) in transoral laryngoscope videos of the rat glottis. Software was developed to estimate glottic midline, VF displacement, VF velocity, and AC angle. The software was applied to laryngoscope videos of adult rats before and after right recurrent and superior laryngeal nerve transection (N = 15; 6M, 9F). All software-calculated metrics were compared before and after injury and validated against manually calculated metrics. RESULTS: RVT1.0 accurately tracked and quantified VF displacement, VF velocity, and AC angle. Significant differences were found before and after surgery for all RVT1.0-calculated metrics. There was strong agreement between programmatically and manually calculated measures. Automated analysis was also more efficient than nearly all manual methods. CONCLUSION: This approach provides fast, accurate assessment of VF motion in rats with minimal labor and allows for quantitative comparison of lateral differences in movement. Through this novel analysis method, we can differentiate healthy movement from unilateral VFP. RVT1.0 is open-source and will be a valuable tool for researchers using the rat model for laryngology research. LEVEL OF EVIDENCE: NA. Laryngoscope, 134:340-346, 2024.
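The abstract does not describe how RVT1.0 derives its metrics, so the following is only an illustrative sketch of one plausible way to turn tracked landmark coordinates into displacement, velocity, and angle values. Every name here (`signed_distance`, `velocities`, `angle_deg`), the frame rate, and the coordinates are hypothetical, and the geometric definitions (perpendicular distance to a midline, forward differences, angle between two rays) are assumptions rather than the published method.

```python
import math

Point = tuple[float, float]  # (x, y) pixel coordinates of a tracked landmark


def signed_distance(p: Point, a: Point, b: Point) -> float:
    """Signed perpendicular distance of p from the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("midline endpoints must differ")
    return (dx * (p[1] - a[1]) - dy * (p[0] - a[0])) / length


def velocities(displacements: list[float], fps: float) -> list[float]:
    """Forward differences between consecutive frames, in pixels per second."""
    return [(d1 - d0) * fps for d0, d1 in zip(displacements, displacements[1:])]


def angle_deg(vertex: Point, p: Point, q: Point) -> float:
    """Unsigned angle at `vertex` between the rays toward p and q, in degrees."""
    ux, uy = p[0] - vertex[0], p[1] - vertex[1]
    vx, vy = q[0] - vertex[0], q[1] - vertex[1]
    return math.degrees(math.atan2(abs(ux * vy - uy * vx), ux * vx + uy * vy))


# Hypothetical data: a vertical midline and one VF point tracked over three frames.
midline = ((0.0, 0.0), (0.0, 10.0))
track = [(1.0, 5.0), (2.0, 5.0), (4.0, 5.0)]
disp = [abs(signed_distance(p, *midline)) for p in track]
vel = velocities(disp, fps=30.0)
print(disp, vel)
```

Computing displacement relative to an estimated midline, rather than in raw image coordinates, is what would allow left and right sides to be compared on the same scale, which matches the abstract's emphasis on lateral differences.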
Affiliation(s)
- Abhinav Joshi
  - The Center for Voice and Swallowing, Department of Otolaryngology-Head & Neck Surgery, Columbia University Irving Medical Center, New York, New York, U.S.A
- Emily Honzel
  - College of Physicians and Surgeons, Columbia University, New York, New York, U.S.A
- Ignacio Hernandez-Morato
  - The Center for Voice and Swallowing, Department of Otolaryngology-Head & Neck Surgery, Columbia University Irving Medical Center, New York, New York, U.S.A
- Michael J Pitman
  - The Center for Voice and Swallowing, Department of Otolaryngology-Head & Neck Surgery, Columbia University Irving Medical Center, New York, New York, U.S.A
- Yalda Moayedi
  - The Center for Voice and Swallowing, Department of Otolaryngology-Head & Neck Surgery, Columbia University Irving Medical Center, New York, New York, U.S.A
  - Department of Neurology, Columbia University, New York, New York, U.S.A
4
Semmler M, Lasar S, Kremer F, Reinwald L, Wittig F, Peters G, Schraut T, Wendler O, Seyferth S, Schützenberger A, Dürr S. Extent and Effect of Covering Laryngeal Structures with Synthetic Laryngeal Mucus via Two Different Administration Techniques. J Voice 2023:S0892-1997(23)00228-X. [PMID: 37648625 DOI: 10.1016/j.jvoice.2023.07.019] [Received: 06/06/2023] [Revised: 07/20/2023] [Accepted: 07/21/2023] [Indexed: 09/01/2023]
Abstract
OBJECTIVE: The first goal of this study was to investigate the coverage of laryngeal structures using two potential administration techniques for synthetic mucus: inhalation and lozenge ingestion. As a second research question, the study investigated the potential effects of these techniques on standardized voice assessment parameters. METHODS: Fluorescein was added to throat lozenges and to an inhalation solution to visualize the coverage of laryngeal structures through blue light imaging. The study included 70 vocally healthy subjects. Fifty subjects underwent administration via lozenge ingestion and 20 subjects performed the inhalation process. For the first research question, the recordings from the blue light imaging system were categorized to compare the extent of coverage on individual laryngeal structures objectively. Secondly, a standardized voice evaluation protocol was performed before and after each administration to determine any measurable effects on typical voice parameters. RESULTS: The administration via inhalation demonstrated complete coverage of all laryngeal structures, including the vocal folds, ventricular folds, and arytenoid cartilages, as visualized by the fluorescent dye. In contrast, the application of the lozenge predominantly covered the pharynx and laryngeal surface toward the aryepiglottic fold, but not the inferior structures. Overall, the comparison before and after administration showed no clear effect, although a minor deterioration of the acoustic signal was noted in the shimmer and cepstral peak prominence after the inhalation. CONCLUSIONS: Our findings indicate that the inhalation process is a more effective technique for covering deeper laryngeal structures such as the vocal folds and ventricular folds with synthetic mucus. This knowledge enables further in vivo studies on the role of laryngeal mucus in phonation in general, and how it can be substituted or supplemented for patients with reduced glandular activity as well as for heavy voice users.
Affiliation(s)
- Marion Semmler
  - University Hospital Erlangen, Medical School, Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, Friedrich-Alexander-University Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany.
- Sarina Lasar
  - University Hospital Erlangen, Medical School, Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, Friedrich-Alexander-University Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany.
- Franziska Kremer
  - University Hospital Erlangen, Medical School, Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, Friedrich-Alexander-University Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany.
- Laura Reinwald
  - University Hospital Erlangen, Medical School, Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, Friedrich-Alexander-University Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany.
- Fiori Wittig
  - University Hospital Erlangen, Medical School, Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, Friedrich-Alexander-University Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany.
- Gregor Peters
  - University Hospital Erlangen, Medical School, Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, Friedrich-Alexander-University Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany.
- Tobias Schraut
  - University Hospital Erlangen, Medical School, Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, Friedrich-Alexander-University Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany.
- Olaf Wendler
  - University Hospital Erlangen, Medical School, Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, Friedrich-Alexander-University Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany.
- Stefan Seyferth
  - Department of Chemistry and Pharmacy, Chair of Pharmaceutics, Friedrich-Alexander-University Erlangen-Nürnberg, Cauerstr. 4, 91058 Erlangen, Germany.
- Anne Schützenberger
  - University Hospital Erlangen, Medical School, Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, Friedrich-Alexander-University Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany.
- Stephan Dürr
  - University Hospital Regensburg, Department of Otorhinolaryngology, Division of Phoniatrics and Pediatric Audiology, Franz-Josef-Strauß-Allee 11, 93053 Regensburg, Germany.