1. Malinowski J, Pietruszewska W, Kowalczyk M, Niebudek-Bogusz E. Value of high-speed videoendoscopy as an auxiliary tool in differentiation of benign and malignant unilateral vocal lesions. J Cancer Res Clin Oncol 2024; 150:10. PMID: 38216796; PMCID: PMC10786956; DOI: 10.1007/s00432-023-05543-y.
Abstract
PURPOSE The study aimed to assess the relevance of objective vibratory parameters derived from high-speed videolaryngoscopy (HSV) as a supporting tool to assist clinicians in establishing the initial diagnosis of benign and malignant glottal organic lesions. METHODS HSV examinations were conducted in 175 subjects: 50 normophonic, 85 with benign vocal fold lesions, and 40 with early glottic cancer; organic lesions were confirmed by histopathologic examination. Parameters derived from HSV kymography (amplitude, symmetry, and glottal dynamic characteristics) were compared statistically between the groups, followed by ROC analysis. RESULTS Among 14 calculated parameters, 10 differed significantly between the groups. Four of them, the average resultant amplitude of the involved vocal fold (AmpInvolvedAvg), the average amplitude asymmetry for the whole glottis and for its middle third (AmplAsymAvg; AmplAsymAvg_2/3), and the absolute average phase difference (AbsPhaseDiffAvg), showed significant differences between benign and malignant lesions. Amplitude values decreased, while asymmetry and phase difference values increased, with the risk of malignancy. In ROC analysis, the highest AUC was observed for AmpAsymAvg (0.719; p < 0.0001), followed by AmpInvolvedAvg (0.70; p = 0.0002). CONCLUSION The gold standard in the diagnosis of organic lesions of the glottis remains clinical examination with videolaryngoscopy, confirmed by histopathological examination. Our results showed that measurements of amplitude, asymmetry, and phase of vibration deteriorate significantly in malignant vocal fold masses compared with benign vocal lesions. High-speed videolaryngoscopy could aid their preliminary, noninvasive differentiation before histopathological examination; however, further research on larger groups is needed.
Affiliation(s)
- Jakub Malinowski: Department of Otolaryngology, Head and Neck Oncology, Medical University of Lodz, Lodz, Poland
- Wioletta Pietruszewska: Department of Otolaryngology, Head and Neck Oncology, Medical University of Lodz, Lodz, Poland
- Magdalena Kowalczyk: Department of Otolaryngology, Head and Neck Oncology, Medical University of Lodz, Lodz, Poland
- Ewa Niebudek-Bogusz: Department of Otolaryngology, Head and Neck Oncology, Medical University of Lodz, Lodz, Poland
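For a single scalar parameter, the ROC AUC reported in this abstract equals the probability that a randomly chosen malignant case scores higher than a randomly chosen benign one (the normalized Mann-Whitney U statistic). A minimal sketch with made-up asymmetry values, not the study's data:

```python
def auc_from_scores(benign, malignant):
    """AUC of a scalar marker: the probability that a randomly chosen
    malignant case scores higher than a randomly chosen benign one,
    counting ties as half. Equivalent to the normalized Mann-Whitney U."""
    pairs = [(b, m) for b in benign for m in malignant]
    wins = sum(1.0 if m > b else 0.5 if m == b else 0.0 for b, m in pairs)
    return wins / len(pairs)

# Illustrative values only: amplitude asymmetry tends to be higher in
# malignant lesions, so the marker should score AUC > 0.5.
benign_asym    = [0.10, 0.15, 0.12, 0.20, 0.18]
malignant_asym = [0.22, 0.30, 0.16, 0.28, 0.35]
print(auc_from_scores(benign_asym, malignant_asym))  # 0.92
```

Flipping the groups gives 1 - AUC, which is why a parameter that decreases with malignancy (like AmpInvolvedAvg) is equally usable once the decision direction is inverted.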
2. Zhang J, Wu J, Qiu Y, Song A, Li W, Li X, Liu Y. Intelligent speech technologies for transcription, disease diagnosis, and medical equipment interactive control in smart hospitals: A review. Comput Biol Med 2023; 153:106517. PMID: 36623438; PMCID: PMC9814440; DOI: 10.1016/j.compbiomed.2022.106517.
Abstract
Growth and aging of the world population have driven a shortage of medical resources in recent years, especially during the COVID-19 pandemic. Fortunately, the rapid development of robotics and artificial intelligence technologies has helped the healthcare field adapt to these challenges. Among them, intelligent speech technology (IST) has served doctors and patients, improving the efficiency of medical work and alleviating the medical burden. However, problems such as noise interference in complex medical scenarios and pronunciation differences between patients and healthy people hamper the broad application of IST in hospitals. In recent years, technologies such as machine learning have developed rapidly in intelligent speech recognition and are expected to solve these problems. This paper first introduces IST's procedure and system architecture and analyzes its application in medical scenarios. Second, we review existing IST applications in smart hospitals in detail, including electronic medical documentation, disease diagnosis and evaluation, and human-medical equipment interaction. In addition, we elaborate on an application case of IST in the early recognition, diagnosis, rehabilitation training, evaluation, and daily care of stroke patients. Finally, we discuss IST's limitations, challenges, and future directions in the medical field. Furthermore, we propose a novel medical voice analysis system architecture that employs active hardware, active software, and human-computer interaction to realize intelligent and evolvable speech recognition. This comprehensive review and the proposed architecture offer directions for future studies on IST and its applications in smart hospitals.
Affiliation(s)
- Jun Zhang: The State Key Laboratory of Bioelectronics, School of Instrument Science and Engineering, Southeast University, Nanjing, 210096, China (corresponding author)
- Jingyue Wu: The State Key Laboratory of Bioelectronics, School of Instrument Science and Engineering, Southeast University, Nanjing, 210096, China
- Yiyi Qiu: The State Key Laboratory of Bioelectronics, School of Instrument Science and Engineering, Southeast University, Nanjing, 210096, China
- Aiguo Song: The State Key Laboratory of Bioelectronics, School of Instrument Science and Engineering, Southeast University, Nanjing, 210096, China
- Weifeng Li: Department of Emergency Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, 510080, China
- Xin Li: Department of Emergency Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, 510080, China
- Yecheng Liu: Emergency Department, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, 100730, China
3. Pedersen M, Larsen CF, Madsen B, Eeg M. Localization and quantification of glottal gaps on deep learning segmentation of vocal folds. Sci Rep 2023; 13:878. PMID: 36650265; PMCID: PMC9845318; DOI: 10.1038/s41598-023-27980-y.
Abstract
The tracking of the vocal folds, both manual and automatic, has mostly focused on the entire glottis. From a treatment point of view, however, specific regions of the glottis are of particular interest. The aim of the study was to test whether an existing convolutional neural network (CNN) could be supplemented with post-network calculations for the localization and quantification of posterior glottal gaps during phonation, usable for vocal fold function analysis of, e.g., laryngopharyngeal reflux findings. Thirty subjects/videos with insufficient closure in the posterior glottal area and 20 normal subjects/videos were selected from our database, recorded with a commercial high-speed video setup (HSV at 4000 frames per second) and segmented with an open-source CNN validated for voice function. We made post-network calculations to localize and quantify the glottal gap at the 10% and 50% distance lines from the posterior part of the glottis. The algorithm showed a significant difference between the two groups at the 10% distance line (p < 0.0001) and no difference at the 50% line. These novel results show that post-network calculations on CNN output can be used for the localization and quantification of posterior glottal gaps.
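The post-network step described above can be illustrated with a toy sketch: given a binary segmentation mask of the glottis, measure the gap width at fractional distances from the posterior end. The mask, the orientation convention (posterior at row 0), and the rounding are illustrative assumptions, not the paper's implementation:

```python
def gap_width_at(mask, frac):
    """Width (pixel count) of the segmented glottal gap at a fractional
    distance along the anteroposterior axis, measured from the posterior
    end. Convention here: row 0 of the binary mask is most posterior."""
    rows = [i for i, row in enumerate(mask) if any(row)]
    if not rows:
        return 0
    top, bottom = rows[0], rows[-1]
    r = top + int(frac * (bottom - top) + 0.5)  # nearest row
    return sum(mask[r])

# Toy 8x8 mask: a posterior gap that narrows anteriorly (hypothetical,
# not output of the paper's CNN).
mask = [
    [0, 0, 1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
print(gap_width_at(mask, 0.10), gap_width_at(mask, 0.50))  # 4 2
```

A wide gap at the 10% line but not at the 50% line is exactly the posterior-closure pattern the study quantifies.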
4. Ding H, Cen Q, Si X, Pan Z, Chen X. Automatic glottis segmentation for laryngeal endoscopic images based on U-Net. Biomed Signal Process Control 2022. DOI: 10.1016/j.bspc.2021.103116.
5. Ishac D, Matta S, Bin S, Aziz H, Karam E, Abche A, Nassar G. Objective Assessment of Covid-19 Severity Affecting the Vocal and Respiratory System Using a Wearable, Autonomous Sound Collar. Cell Mol Bioeng 2021; 15:67-86. PMID: 34777597; PMCID: PMC8570400; DOI: 10.1007/s12195-021-00712-w.
Abstract
Introduction Since the outbreak began in January 2020, Covid-19 has affected more than 161 million people worldwide and resulted in about 3.3 million deaths. Despite efforts to detect human infection with the virus as early as possible, the confirmatory test still requires the analysis of sputum or blood, with results available within approximately 30 minutes; this may be followed by clinical referral if the patient shows signs of aggravated pneumonia. This work aims to implement a soft collar as a sound device dedicated to the objective evaluation of the pathophysiological state resulting from dysphonia of laryngeal origin or respiratory failure of inflammatory origin, in particular caused by Covid-19. Methods In this study, we exploited the vibrations of waves generated by the vocal and respiratory systems of 30 people. A biocompatible acoustic sensor embedded in a soft collar around the neck collects these waves. The collar is also equipped with thermal sensors and a cross-data analysis module operating in both the time and frequency domains (via the short-time Fourier transform, STFT). The optimal coupling conditions and the electrical and dimensional characteristics of the sensors were defined using a mathematical approach based on a matrix formalism. Results The characteristics of the signals in the time domain, combined with the quantities obtained from the STFT, offer multidimensional information and a decision-support tool for determining a pathophysiological state representative of the symptoms explored. The device, tested on 30 people, was able to differentiate patients with mild symptoms from those who had developed acute signs of respiratory failure on a severity scale of 1 to 10. Conclusion Given the health constraints imposed by Covid-19 and the heavy organizational burden arising from the flow of diagnostics, testing, and clinical management, there was an urgent need to develop innovative and safe biomedical technologies. This passive listening technique will contribute to the noninvasive assessment and dynamic observation of lesions, and it merits further examination as a support for medical operators to improve clinical management.
Affiliation(s)
- D Ishac: Electrical Engineering Department, University of Balamand (UOB), Balamand, Lebanon
- S Matta: Electrical Engineering Department, University of Balamand (UOB), Balamand, Lebanon
- S Bin: College of Physics, University of Qingdao, Qingdao, China
- H Aziz: Department of Pulmonary Pathology, Sahlgrenska University Hospital, Göteborg, Sweden
- E Karam: Electrical Engineering Department, University of Balamand (UOB), Balamand, Lebanon
- A Abche: Electrical Engineering Department, University of Balamand (UOB), Balamand, Lebanon
- G Nassar: IEMN - CNRS UMR 8520-INSA (HdF)-Lille academic, Lille, France
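The time-frequency analysis this abstract relies on (the STFT) can be sketched minimally in pure Python. Window length, hop size, and the Hann window below are arbitrary choices for illustration, not the device's actual settings:

```python
import cmath
import math

def stft_mag(signal, win=64, hop=32):
    """Magnitude STFT via a direct DFT of Hann-windowed frames."""
    frames = []
    for start in range(0, len(signal) - win + 1, hop):
        frame = [signal[start + n] * (0.5 - 0.5 * math.cos(2 * math.pi * n / win))
                 for n in range(win)]
        frames.append([abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / win)
                               for n in range(win)))
                       for k in range(win // 2 + 1)])
    return frames

# A 200 Hz tone sampled at 1600 Hz falls exactly on DFT bin
# k = 200 / (1600 / 64) = 8, so each frame should peak there.
fs, f0 = 1600, 200
sig = [math.sin(2 * math.pi * f0 * t / fs) for t in range(256)]
mags = stft_mag(sig)
peak_bin = max(range(len(mags[0])), key=lambda k: mags[0][k])
print(len(mags), peak_bin)  # 7 8
```

Tracking how such per-frame spectra evolve over time is what lets a wearable sensor combine time-domain and frequency-domain quantities into one decision-support signal.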
6. Kist AM, Gómez P, Dubrovskiy D, Schlegel P, Kunduk M, Echternach M, Patel R, Semmler M, Bohr C, Dürr S, Schützenberger A, Döllinger M. A Deep Learning Enhanced Novel Software Tool for Laryngeal Dynamics Analysis. J Speech Lang Hear Res 2021; 64:1889-1903. PMID: 34000199; DOI: 10.1044/2021_jslhr-20-00498.
Abstract
Purpose High-speed videoendoscopy (HSV) is an emerging, but still rarely used, endoscopy technique for assessing and diagnosing voice disorders in the clinic, largely because of the lack of dedicated software to analyze the data. HSV makes it possible to quantify vocal fold oscillations by segmenting the glottal area. This challenging task has been tackled by various studies; however, the proposed approaches are mostly limited and not suitable for daily clinical routine. Method We developed user-friendly software in C# that allows the editing, motion correction, segmentation, and quantitative analysis of HSV data. We further provide pretrained deep neural networks for fully automatic glottis segmentation. Results We freely provide our software, Glottis Analysis Tools (GAT). GAT offers a general threshold-based region-growing platform that enables the user to analyze data from various sources, such as in vivo recordings, ex vivo recordings, and high-speed footage of artificial vocal folds. Additionally, especially for in vivo recordings, we provide three robust neural networks at various speed and quality settings, allowing the fully automatic glottis segmentation needed for use by untrained personnel. GAT further evaluates video and audio data in parallel and extracts various features from the video data, among them the glottal area waveform, that is, the changing glottal area over time. In total, GAT provides 79 unique quantitative analysis parameters for video- and audio-based signals. Many of these parameters have already been shown to reflect voice disorders, highlighting the clinical importance and usefulness of the GAT software. Conclusion GAT is a unique tool for processing HSV and audio data to determine quantitative, clinically relevant parameters for research, diagnosis, and treatment of laryngeal disorders. Supplemental Material https://doi.org/10.23641/asha.14575533.
Affiliation(s)
- Andreas M Kist: Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Pablo Gómez: Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Denis Dubrovskiy: Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Patrick Schlegel: Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Melda Kunduk: Department of Communication Sciences and Disorders, Louisiana State University, Baton Rouge
- Matthias Echternach: Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Munich University Hospital (LMU), Germany
- Rita Patel: Department of Speech, Language and Hearing Sciences, College of Arts and Sciences, Indiana University, Bloomington
- Marion Semmler: Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Christopher Bohr: Klinik und Poliklinik für Hals-Nasen-Ohren-Heilkunde, Universitätsklinikum Regensburg, Germany
- Stephan Dürr: Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Anne Schützenberger: Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Michael Döllinger: Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
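One feature this abstract names, the glottal area waveform, is conceptually simple: the segmented glottal area of each video frame, plotted over time. A minimal sketch assuming binary segmentation masks (this is not GAT's code):

```python
def glottal_area_waveform(frames):
    """Glottal area waveform (GAW): segmented glottal area, here the
    pixel count of a binary mask, for each video frame in sequence."""
    return [sum(sum(row) for row in frame) for frame in frames]

# Toy 4x4 masks imitating one open-close cycle of the glottis
# (hypothetical data, not GAT output).
closed = [[0, 0, 0, 0]] * 4
half   = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
open_  = [[0, 1, 1, 0], [1, 1, 1, 1], [1, 1, 1, 1], [0, 1, 1, 0]]
gaw = glottal_area_waveform([closed, half, open_, half, closed])
print(gaw)  # [0, 4, 12, 4, 0]
```

Quantitative vibratory parameters (open quotient, periodicity, amplitude measures) are then derived from this one-dimensional signal rather than from the raw video.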
7. Tsarapkin GY, Sergeev SN, Kunelskaya NL, Cherepanov EO, Romanenko SG, Ogorodnikov DY, Kishinevsky AE, Gorovaya EV. [Prospects for a passive acoustic research method in otorhinolaryngology]. Vestn Otorinolaringol 2021; 86:66-72. PMID: 33929155; DOI: 10.17116/otorino20218602166.
Abstract
The authors review acoustic research methods in otorhinolaryngology. Acoustic diagnostic methods can be divided into active and passive. Active acoustic methods are based on the emission of acoustic vibrations, in some cases with subsequent reception and processing of the reflected vibrations. Passive acoustic methods are based on recording and analyzing the sounds that arise during the physiological functioning of the organs and systems under study. In otorhinolaryngology, active acoustic methods of examining the ENT organs are more widespread: audiometry, acoustic impedance measurement, ultrasound examination of hearing, auditory evoked potentials, sonotubometry, acoustic rhinometry, and ultrasound examination of the soft tissues of the neck and the paranasal sinuses. Among passive acoustic methods, the most developed in clinical otorhinolaryngological practice is computer-based acoustic voice analysis, i.e., assessment of the phonatory function of the larynx. Using similar technologies, a technique for the acoustic analysis of nasal breathing, a functional assessment of the external nasal valve, has been developed. Several groups have also experimentally studied the sounds that occur when the auditory tube opens. Advances in acoustics and the introduction of advanced technologies in medicine create prerequisites for improving existing methods of acoustic analysis of the ENT organs and for developing new ones.
Affiliation(s)
- G Yu Tsarapkin: L.I. Sverzhevskiy Research and Clinical Institute of Otorhinolaryngology of the Moscow Healthcare Department, Moscow, Russia
- S N Sergeev: FGUP «Research Institute of Applied Acoustics», Dubna, Russia
- N L Kunelskaya: L.I. Sverzhevskiy Research and Clinical Institute of Otorhinolaryngology of the Moscow Healthcare Department, Moscow, Russia; N.I. Pirogov Russian National Research Medical University of the Ministry of Health, Moscow, Russia
- E O Cherepanov: FGUP «Research Institute of Applied Acoustics», Dubna, Russia
- S G Romanenko: L.I. Sverzhevskiy Research and Clinical Institute of Otorhinolaryngology of the Moscow Healthcare Department, Moscow, Russia
- D Yu Ogorodnikov: L.I. Sverzhevskiy Research and Clinical Institute of Otorhinolaryngology of the Moscow Healthcare Department, Moscow, Russia; N.I. Pirogov Russian National Research Medical University of the Ministry of Health, Moscow, Russia
- A E Kishinevsky: L.I. Sverzhevskiy Research and Clinical Institute of Otorhinolaryngology of the Moscow Healthcare Department, Moscow, Russia
- E V Gorovaya: L.I. Sverzhevskiy Research and Clinical Institute of Otorhinolaryngology of the Moscow Healthcare Department, Moscow, Russia
8. Esmaeili N, Boese A, Davaris N, Arens C, Navab N, Friebe M, Illanes A. Cyclist Effort Features: A Novel Technique for Image Texture Characterization Applied to Larynx Cancer Classification in Contact Endoscopy-Narrow Band Imaging. Diagnostics (Basel) 2021; 11:432. PMID: 33802625; PMCID: PMC8001098; DOI: 10.3390/diagnostics11030432.
Abstract
BACKGROUND Feature extraction is an essential part of a computer-aided diagnosis (CAD) system; it is usually preceded by a pre-processing step and followed by image classification, and a large number of features is typically needed to reach the desired classification results. In this work, we propose a novel approach to texture feature extraction. The method was tested on the classification of larynx Contact Endoscopy (CE)-Narrow Band Imaging (NBI) images to provide otolaryngologists with more objective information regarding the stage of laryngeal cancer. METHODS The main idea of the proposed method is to represent an image as a hilly surface on which different paths can be traced between a starting and an ending point. Each path can be thought of as a Tour de France stage profile along which a cyclist must expend a specific effort to reach the finish line. Several paths can be generated in an image, and the cyclists' average effort represents important textural characteristics of the image. Energy and power were extracted as two Cyclist Effort Features (CyEfF) using this concept. The performance of the proposed features was evaluated on the classification of 2701 CE-NBI images into benign and malignant lesions using four supervised classifiers, and subsequently compared with the performance of 24 Geometrical Features (GF) and 13 Entropy Features (EF). RESULTS The CyEfF features achieved a maximum classification accuracy of 0.882 and improved the GF classification accuracy by 3 to 12 percentage points. Moreover, two feature-ranking methods placed the CyEfF features among the top 10 features, alongside some features from the GF set. CONCLUSION The results show that CyEfF, with only two features, can describe the textural characteristics of CE-NBI images and can form part of a CAD system, in combination with GF, for laryngeal cancer diagnosis.
Affiliation(s)
- Nazila Esmaeili: INKA—Innovation Laboratory for Image Guided Therapy, Otto-von-Guericke University Magdeburg, 39120 Magdeburg, Germany; Chair for Computer Aided Medical Procedures and Augmented Reality, Technical University Munich, 85748 Munich, Germany
- Axel Boese: INKA—Innovation Laboratory for Image Guided Therapy, Otto-von-Guericke University Magdeburg, 39120 Magdeburg, Germany
- Nikolaos Davaris: Department of Otorhinolaryngology, Head and Neck Surgery, Magdeburg University Hospital, 39120 Magdeburg, Germany
- Christoph Arens: Department of Otorhinolaryngology, Head and Neck Surgery, Magdeburg University Hospital, 39120 Magdeburg, Germany
- Nassir Navab: Chair for Computer Aided Medical Procedures and Augmented Reality, Technical University Munich, 85748 Munich, Germany
- Michael Friebe: INKA—Innovation Laboratory for Image Guided Therapy, Otto-von-Guericke University Magdeburg, 39120 Magdeburg, Germany; IDTM GmbH, 45657 Recklinghausen, Germany
- Alfredo Illanes: INKA—Innovation Laboratory for Image Guided Therapy, Otto-von-Guericke University Magdeburg, 39120 Magdeburg, Germany
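The "hilly surface" idea in the abstract above can be sketched loosely: treat an intensity path across the image as an elevation profile and measure how much climbing a cyclist would do along it. The definitions of energy and power below are a plain illustrative interpretation; the paper's exact formulation may differ:

```python
def cyclist_effort(profile):
    """'Effort' along one intensity path read as an elevation profile:
    energy as the total positive climb, power as climb per unit of
    horizontal distance. A loose reading of the CyEfF idea, for
    illustration only."""
    climbs = [b - a for a, b in zip(profile, profile[1:]) if b > a]
    energy = sum(climbs)
    power = energy / (len(profile) - 1)
    return energy, power

# Two hypothetical gray-level paths: a smooth texture climbs little,
# a rough one forces repeated steep ascents.
smooth = [10, 11, 12, 12, 13, 14]
rough  = [10, 30, 5, 40, 8, 35]
print(cyclist_effort(smooth))  # (4, 0.8)
print(cyclist_effort(rough))   # (82, 16.4)
```

Averaging such efforts over many paths through the image yields a compact, texture-sensitive descriptor, which is why only two CyEfF features can compete with dozens of geometrical ones.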