1
Paderno A, Gennarini F, Sordi A, Montenegro C, Lancini D, Villani FP, Moccia S, Piazza C. Artificial intelligence in clinical endoscopy: Insights in the field of videomics. Front Surg 2022; 9:933297. [PMID: 36171813 PMCID: PMC9510389 DOI: 10.3389/fsurg.2022.933297] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 04/30/2022] [Accepted: 08/22/2022] [Indexed: 11/13/2022]
Abstract
Artificial intelligence is being increasingly seen as a useful tool in medicine. Specifically, these technologies have the objective to extract insights from complex datasets that cannot easily be analyzed by conventional statistical methods. While promising results have been obtained for various -omics datasets, radiological images, and histopathologic slides, analysis of videoendoscopic frames still represents a major challenge. In this context, videomics represents a burgeoning field wherein several methods of computer vision are systematically used to organize unstructured data from frames obtained during diagnostic videoendoscopy. Recent studies have focused on five broad tasks with increasing complexity: quality assessment of endoscopic images, classification of pathologic and nonpathologic frames, detection of lesions inside frames, segmentation of pathologic lesions, and in-depth characterization of neoplastic lesions. Herein, we present a broad overview of the field, with a focus on conceptual key points and future perspectives.
Affiliation(s)
- Alberto Paderno
  - Unit of Otorhinolaryngology-Head and Neck Surgery, ASST Spedali Civili of Brescia, Brescia, Italy
  - Department of Medical and Surgical Specialties, Radiological Sciences, and Public Health, School of Medicine, University of Brescia, Brescia, Italy
  - Correspondence: Alberto Paderno
- Francesca Gennarini
  - Unit of Otorhinolaryngology-Head and Neck Surgery, ASST Spedali Civili of Brescia, Brescia, Italy
  - Department of Medical and Surgical Specialties, Radiological Sciences, and Public Health, School of Medicine, University of Brescia, Brescia, Italy
- Alessandra Sordi
  - Unit of Otorhinolaryngology-Head and Neck Surgery, ASST Spedali Civili of Brescia, Brescia, Italy
  - Department of Medical and Surgical Specialties, Radiological Sciences, and Public Health, School of Medicine, University of Brescia, Brescia, Italy
- Claudia Montenegro
  - Unit of Otorhinolaryngology-Head and Neck Surgery, ASST Spedali Civili of Brescia, Brescia, Italy
  - Department of Medical and Surgical Specialties, Radiological Sciences, and Public Health, School of Medicine, University of Brescia, Brescia, Italy
- Davide Lancini
  - Unit of Otorhinolaryngology-Head and Neck Surgery, ASST Spedali Civili of Brescia, Brescia, Italy
- Francesca Pia Villani
  - The BioRobotics Institute, Scuola Superiore Sant’Anna, Pisa, Italy
  - Department of Excellence in Robotics and AI, Scuola Superiore Sant’Anna, Pisa, Italy
- Sara Moccia
  - The BioRobotics Institute, Scuola Superiore Sant’Anna, Pisa, Italy
  - Department of Excellence in Robotics and AI, Scuola Superiore Sant’Anna, Pisa, Italy
- Cesare Piazza
  - Unit of Otorhinolaryngology-Head and Neck Surgery, ASST Spedali Civili of Brescia, Brescia, Italy
  - Department of Medical and Surgical Specialties, Radiological Sciences, and Public Health, School of Medicine, University of Brescia, Brescia, Italy
2
宋 琦, 李 晓. [Application and development of voice analysis and endoscopic technology combined with artificial intelligence in the diagnosis and treatment of throat disease]. LIN CHUANG ER BI YAN HOU TOU JING WAI KE ZA ZHI = JOURNAL OF CLINICAL OTORHINOLARYNGOLOGY, HEAD, AND NECK SURGERY 2022; 36:647-650. [PMID: 35959588 PMCID: PMC10128196 DOI: 10.13201/j.issn.2096-7993.2022.08.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2021] [Indexed: 06/15/2023]
Abstract
Combining voice analysis or endoscopic technology with artificial intelligence has advanced rapidly in the diagnosis and treatment of throat disease. This paper reviews the history and principles of these combinations, summarizes their current applications and development, and identifies their advantages: strong learning and interpretation ability, high speed and tolerance, and stable replication and expansion. The main constraints on further development are the uncertainty inherent in machine learning, the errors caused by small samples, and unresolved ethical and philosophical questions. Going forward, otolaryngology-head and neck surgeons should build on their professional expertise by learning the relevant epidemiology and classical statistics and by strengthening exchange and cooperation with machine learning developers, so that advanced science and technology can ultimately be applied in clinical practice to maximize the benefit to the majority of patients.
Affiliation(s)
- 琦 宋
  - Department of Otolaryngology Head and Neck Surgery, the 980th Hospital of the Joint Logistics Support Unit of the Chinese PLA, Shijiazhuang, 050082, China
- 晓明 李
  - Department of Otolaryngology Head and Neck Surgery, the 980th Hospital of the Joint Logistics Support Unit of the Chinese PLA, Shijiazhuang, 050082, China
3
Kist AM, Gómez P, Dubrovskiy D, Schlegel P, Kunduk M, Echternach M, Patel R, Semmler M, Bohr C, Dürr S, Schützenberger A, Döllinger M. A Deep Learning Enhanced Novel Software Tool for Laryngeal Dynamics Analysis. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2021; 64:1889-1903. [PMID: 34000199 DOI: 10.1044/2021_jslhr-20-00498] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Indexed: 06/12/2023]
Abstract
Purpose: High-speed videoendoscopy (HSV) is an emerging endoscopy technique for assessing and diagnosing voice disorders, but it is barely used in the clinic because dedicated software to analyze the data is lacking. HSV allows vocal fold oscillations to be quantified by segmenting the glottal area. This challenging task has been tackled by various studies; however, the proposed approaches are mostly limited and not suitable for daily clinical routine. Method: We developed user-friendly software in C# that allows the editing, motion correction, segmentation, and quantitative analysis of HSV data. We further provide pretrained deep neural networks for fully automatic glottis segmentation. Results: We freely provide our software, Glottis Analysis Tools (GAT). Using GAT, we provide a general threshold-based region growing platform that enables the user to analyze data from various sources, such as in vivo recordings, ex vivo recordings, and high-speed footage of artificial vocal folds. Additionally, especially for in vivo recordings, we provide three robust neural networks at various speed and quality settings to allow the fully automatic glottis segmentation needed for application by untrained personnel. GAT further evaluates video and audio data in parallel and is able to extract various features from the video data, among others the glottal area waveform, that is, the changing glottal area over time. In total, GAT provides 79 unique quantitative analysis parameters for video- and audio-based signals. Many of these parameters have already been shown to reflect voice disorders, highlighting the clinical importance and usefulness of the GAT software. Conclusion: GAT is a unique tool to process HSV and audio data to determine quantitative, clinically relevant parameters for research, diagnosis, and treatment of laryngeal disorders. Supplemental Material: https://doi.org/10.23641/asha.14575533.
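The glottal area waveform mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes every frame has already been segmented into a binary mask (1 = glottis pixel). The function names and the simplified open-quotient measure are hypothetical illustrations and are not taken from GAT, which the abstract describes as C# software.

```python
from typing import List, Sequence

# One segmented frame: rows of 0/1 values, where 1 marks a glottis pixel.
Mask = Sequence[Sequence[int]]


def glottal_area_waveform(masks: Sequence[Mask]) -> List[int]:
    """Return the glottal area, as a pixel count, for each segmented frame."""
    return [sum(sum(row) for row in mask) for mask in masks]


def open_quotient(gaw: Sequence[int]) -> float:
    """Fraction of frames with a non-zero glottal area.

    This is a simplified stand-in computed over the whole recording,
    not a per-cycle clinical definition.
    """
    if not gaw:
        raise ValueError("empty waveform")
    return sum(1 for area in gaw if area > 0) / len(gaw)


# Four toy frames of a single oscillation (assumed data, not from the paper).
frames = [
    [[0, 0, 0], [0, 0, 0]],  # closed
    [[0, 1, 0], [0, 1, 0]],  # partly open
    [[1, 1, 0], [1, 1, 1]],  # wide open
    [[0, 1, 0], [0, 0, 0]],  # closing
]
gaw = glottal_area_waveform(frames)
oq = open_quotient(gaw)
```

Here `gaw` holds one area value per frame, which is the "changing glottal area over time" that further parameters would be derived from.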
Affiliation(s)
- Andreas M Kist
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Pablo Gómez
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Denis Dubrovskiy
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Patrick Schlegel
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Melda Kunduk
  - Department of Communication Sciences and Disorders, Louisiana State University, Baton Rouge
- Matthias Echternach
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Munich University Hospital (LMU), Germany
- Rita Patel
  - Department of Speech, Language and Hearing Sciences, College of Arts and Sciences, Indiana University, Bloomington
- Marion Semmler
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Christopher Bohr
  - Klinik und Poliklinik für Hals-Nasen-Ohren-Heilkunde Universitätsklinikum Regensburg, Germany
- Stephan Dürr
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Anne Schützenberger
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
- Michael Döllinger
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology-Head & Neck Surgery, University Hospital Erlangen, Germany
4
Abstract
PURPOSE OF REVIEW: Machine learning (ML) algorithms have augmented human judgment in various fields of clinical medicine. However, little progress has been made in applying these tools to video-endoscopy. We review the field of video analysis (herein termed 'Videomics' for the first time) as applied to diagnostic endoscopy, assess its preliminary findings, potential, and limitations, and consider future developments. RECENT FINDINGS: ML has been applied to diagnostic endoscopy with different aims: blind-spot detection, automatic quality control, lesion detection, classification, and characterization. The early experience in gastrointestinal endoscopy has recently been expanded to the upper aerodigestive tract, demonstrating promising results in both clinical fields. From top to bottom, multispectral imaging (such as Narrow Band Imaging) appeared to provide significant information drawn from endoscopic images. SUMMARY: Videomics is an emerging discipline that has the potential to significantly improve human detection and characterization of clinically significant lesions during endoscopy across medical and surgical disciplines. Research teams should focus on the standardization of data collection, identification of common targets, and optimal reporting. With such a collaborative stepwise approach, Videomics is likely to soon augment clinical endoscopy, significantly impacting cancer patient outcomes.
5
Turkmen HI, Karsligil ME, Kocak I. Visible Vessels of Vocal Folds: Can they have a Diagnostic Role? Curr Med Imaging 2020; 15:785-795. [PMID: 32008546 DOI: 10.2174/1573405614666180604083854] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2018] [Revised: 02/16/2018] [Accepted: 02/21/2018] [Indexed: 11/22/2022]
Abstract
BACKGROUND: Challenges in the visual identification of laryngeal disorders have led researchers to investigate new ways to support clinical examination. This paper presents an efficient and simple method that extracts and assesses blood vessels on vocal fold tissue to support medical diagnosis. METHODS: The proposed vessel segmentation approach was designed to overcome difficulties raised by the design specifications of videolaryngostroboscopy and the anatomic structure of the vocal fold vasculature. The limited number of medical studies on vocal fold vasculature indicate that the direction of blood vessels and the amount of vasculature are discriminative features for vocal fold disorders. Therefore, we extracted vessel features on the basis of these studies. We represent vessels as vascular vectors and suggest a vector-field-based measurement that quantifies the orientation pattern of blood vessels towards vocal fold pathologies. RESULTS: To demonstrate the relationship between vessel structure and vocal fold disorders, we classified vocal fold disorders using only vessel features. A binary tree of Support Vector Machines (SVMs) was used for classification. The average recall of the proposed vessel extraction method was 0.82, and a classification accuracy of 0.75 was achieved for healthy, sulcus vocalis, and laryngitis cases. CONCLUSION: These success rates show that vocal fold vessels can serve as an indicator of laryngeal diseases.
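The idea of treating vessels as vectors and measuring their orientation can be sketched in a few lines of Python. The abstract does not define the actual vector-field measurement, so this example swaps in a simple stand-in: the length-weighted mean acute angle between vessel segments and an assumed vocal fold axis. All names and the toy data are hypothetical.

```python
import math
from typing import Sequence, Tuple

Vector = Tuple[float, float]  # (dx, dy) of one vessel segment, in pixels


def angle_to_axis(vessel: Vector, axis: Vector) -> float:
    """Acute angle in degrees (0 to 90) between a vessel segment and the axis."""
    dot = vessel[0] * axis[0] + vessel[1] * axis[1]
    norm = math.hypot(*vessel) * math.hypot(*axis)
    if norm == 0:
        raise ValueError("zero-length vector")
    # abs() ignores direction along the vessel; min() guards against rounding.
    cos_angle = min(1.0, abs(dot) / norm)
    return math.degrees(math.acos(cos_angle))


def orientation_score(vessels: Sequence[Vector], axis: Vector) -> float:
    """Length-weighted mean angle of all vessel segments to the axis.

    An illustrative feature only, not the measurement used in the paper.
    """
    lengths = [math.hypot(*v) for v in vessels]
    total = sum(lengths)
    if total == 0:
        raise ValueError("no vessel length to weight by")
    weighted = sum(l * angle_to_axis(v, axis) for v, l in zip(vessels, lengths))
    return weighted / total


fold_axis = (1.0, 0.0)                       # assumed fold direction
parallel_vessels = [(2.0, 0.0), (3.0, 0.0)]  # run along the fold
mixed_vessels = [(1.0, 0.0), (0.0, 1.0)]     # one parallel, one perpendicular
parallel_score = orientation_score(parallel_vessels, fold_axis)
mixed_score = orientation_score(mixed_vessels, fold_axis)
```

A score near 0 means vessels run along the assumed axis, while larger values indicate more transverse vessels. A feature of this kind could then be passed to a classifier such as the SVM tree the abstract mentions.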
Affiliation(s)
- Hafiza Irem Turkmen
  - Computer Engineering Department, Faculty of Electrical & Electronics Engineering, Yildiz Technical University, Istanbul, Turkey
- Mine Elif Karsligil
  - Computer Engineering Department, Faculty of Electrical & Electronics Engineering, Yildiz Technical University, Istanbul, Turkey
- Ismail Kocak
  - Otorhinolaryngology Department, Faculty of Medicine, Okan University, Istanbul, Turkey
6
Jeffrey Kuo CF, Li YC, Weng WH, Pinos Leon KB, Chu YH. Applied image processing techniques in video laryngoscope for occult tumor detection. Biomed Signal Process Control 2020. [DOI: 10.1016/j.bspc.2019.101633] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/26/2022]
7
Turkmen HI, Karsligil ME. Advanced computing solutions for analysis of laryngeal disorders. Med Biol Eng Comput 2019; 57:2535-2552. [DOI: 10.1007/s11517-019-02031-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/07/2018] [Accepted: 08/13/2019] [Indexed: 11/29/2022]
8
Kuo CFJ, Kuo J, Hsiao SW, Lee CL, Lee JC, Ke BH. Automatic and quantitative measurement of laryngeal video stroboscopic images. Proc Inst Mech Eng H 2017; 231:48-57. [PMID: 28097934 DOI: 10.1177/0954411916679200] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/17/2022]
Abstract
The laryngeal video stroboscope is an important instrument for physicians to analyze abnormalities and diseases in the glottal area, and it is widely used around the world. However, without quantitative indices, physicians can only make subjective judgments on glottal images. We designed a new laser projection marking module and applied it to the laryngeal video stroboscope to provide scale conversion reference parameters for glottal imaging and to convert the physiological parameters of the glottis. Image processing was used to segment the important regions of interest. Information about the glottis was quantified, and a vocal fold image segmentation system was completed to assist clinical diagnosis and increase accuracy. Regarding image processing, histogram equalization was used to enhance glottal image contrast. A center-weighted median filter removed image noise while retaining the texture of the glottal image. Statistical threshold determination was used for automatic segmentation of the glottal image. Because the glottal image contains saliva and light spots, which are treated as image noise, this noise was eliminated by erosion, dilation, opening, and closing operations to highlight the vocal area. We also used image processing to automatically identify the vocal fold region in order to quantify information from the glottal image, such as glottal area, vocal fold perimeter, vocal fold length, glottal width, and vocal fold angle. A database of quantified glottal images was created to assist physicians in diagnosing glottal diseases more objectively.
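The scale conversion step described in this abstract can be illustrated with a short Python sketch. It assumes two projected laser dots with a known physical spacing are visible in the frame and that the glottis has already been segmented into a binary mask. The dot coordinates, the 4 mm spacing, and all function names are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]     # (x, y) pixel coordinates
Mask = Sequence[Sequence[int]]  # binary glottis mask, 1 = glottis pixel


def mm_per_pixel(dot_a: Point, dot_b: Point, spacing_mm: float) -> float:
    """Scale factor from two laser dots a known distance apart."""
    pixel_distance = math.dist(dot_a, dot_b)
    if pixel_distance == 0:
        raise ValueError("laser dots coincide")
    return spacing_mm / pixel_distance


def glottal_area_mm2(mask: Mask, scale: float) -> float:
    """Glottal area in square millimetres (pixel count times scale squared)."""
    return sum(sum(row) for row in mask) * scale ** 2


def glottal_width_mm(mask: Mask, scale: float) -> float:
    """Largest per-row count of glottis pixels, converted to millimetres.

    Assumes the glottis forms one connected run of pixels in each row.
    """
    return max(sum(row) for row in mask) * scale


# Assumed calibration: dots 40 px apart that are 4 mm apart on the tissue.
scale = mm_per_pixel((10.0, 20.0), (50.0, 20.0), spacing_mm=4.0)
mask = [
    [0, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
]
area = glottal_area_mm2(mask, scale)
width = glottal_width_mm(mask, scale)
```

Lengths scale linearly with the factor and areas with its square, which is why a single calibration value is enough to convert pixel measurements such as glottal area and width into physical units.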
Affiliation(s)
- Chung-Feng Jeffrey Kuo
  - Department of Materials Science and Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Joseph Kuo
  - Wisconsin State Laboratory of Hygiene, Department of Pathobiological Sciences, University of Wisconsin-Madison, Madison, WI, USA
- Shang-Wun Hsiao
  - Department of Materials Science and Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Chi-Lung Lee
  - Department of Materials Science and Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Jih-Chin Lee
  - Department of Otolaryngology-Head and Neck Surgery, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan
- Bo-Han Ke
  - Graduate Institute of Automation and Control, National Taiwan University of Science and Technology, Taipei, Taiwan
9
Niebudek-Bogusz E, Kopczynski B, Strumillo P, Morawska J, Wiktorowicz J, Sliwinska-Kowalska M. Quantitative assessment of videolaryngostroboscopic images in patients with glottic pathologies. LOGOP PHONIATR VOCO 2016; 42:73-83. [DOI: 10.3109/14015439.2016.1174293] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 11/13/2022]