1. A Novel Framework of Manifold Learning Cascade-Clustering for the Informative Frame Selection. Diagnostics (Basel) 2023; 13:diagnostics13061151. [PMID: 36980459 PMCID: PMC10047422 DOI: 10.3390/diagnostics13061151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Revised: 03/05/2023] [Accepted: 03/10/2023] [Indexed: 03/19/2023]
Abstract
Narrow band imaging (NBI) is an established non-invasive tool used for the early detection of laryngeal cancer in surveillance examinations. Most frames produced during the examination are uninformative, for example because they are blurred, underexposed, or contain specular reflections. Removing the uninformative frames is vital to improve detection accuracy and speed up computer-aided diagnosis, and manually picking out the informative frames often takes the physician a lot of time. This issue is commonly addressed by a classifier with task-specific categories of uninformative frames. However, the definition of the uninformative categories is ambiguous, and tedious labeling still cannot be avoided. Here, we show that a novel unsupervised scheme is comparable to the current benchmarks on the NBI-InfFrames dataset. We extract feature embeddings using a standard neural network (VGG16) and apply a dimensionality reduction method called UMAP, which separates the feature embeddings in a lower-dimensional space. Together with the proposed automatic cluster labeling algorithm and a cost function for Bayesian optimization, the proposed method coupled with UMAP achieves state-of-the-art performance. It outperforms the baseline by 12 percentage points. The overall median recall of the proposed method, 96%, is currently the highest. Our results demonstrate the effectiveness of the proposed scheme and its robustness in detecting the informative frames. They also suggest that the patterns embedded in the data can help develop flexible algorithms that do not require manual labeling.
2. Wang YY, Hamad AS, Lever TE, Bunyak F. Orthogonal Region Selection Network for Laryngeal Closure Detection in Laryngoscopy Videos. Annual International Conference of the IEEE Engineering in Medicine and Biology Society 2020; 2020:2167-2172. [PMID: 33018436 DOI: 10.1109/embc44109.2020.9176149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Abstract
Vocal folds (VFs) play a critical role in breathing, swallowing, and speech production. VF dysfunctions caused by various medical conditions can significantly reduce patients' quality of life and lead to life-threatening conditions such as aspiration pneumonia, caused by food and/or liquid "invasion" into the windpipe. Laryngeal endoscopy is routinely used in clinical practice to inspect the larynx and to assess VF function. Unfortunately, the resulting videos are only inspected visually, leading to the loss of valuable information that could be used for early diagnosis and for disease or treatment monitoring. In this paper, we propose a deep learning-based image analysis solution for automated detection of laryngeal adductor reflex (LAR) events in laryngeal endoscopy videos. Laryngeal endoscopy image analysis is a challenging task because of anatomical variations and various imaging problems. Analysis of LAR events is even more challenging because of data imbalance, since these are rare events. To tackle this problem, we propose a deep learning system that consists of a two-stream network with a novel orthogonal region selection subnetwork. To the best of our knowledge, this is the first deep learning network that learns to directly map its input to a VF open/close state without first segmenting or tracking the VF region, which drastically reduces the labor-intensive manual annotation needed for mask or track generation. The proposed two-stream network and the orthogonal region selection subnetwork allow integration of local and global information for improved performance. The experimental results show promising performance for the automated, objective, and quantitative analysis of LAR events from laryngeal endoscopy videos. Clinical relevance: This paper presents an objective, quantitative, and automatic deep learning-based system for detection of laryngeal adductor reflex (LAR) events in laryngoscopy videos.
3. Kuo CFJ, Li YC, Weng WH, Pinos Leon KB, Chu YH. Applied image processing techniques in video laryngoscope for occult tumor detection. Biomed Signal Process Control 2020. [DOI: 10.1016/j.bspc.2019.101633] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/26/2022]
4. Moccia S, Vanone GO, De Momi E, Laborai A, Guastini L, Peretti G, Mattos LS. Learning-based classification of informative laryngoscopic frames. Computer Methods and Programs in Biomedicine 2018; 158:21-30. [PMID: 29544787 DOI: 10.1016/j.cmpb.2018.01.030] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 09/27/2017] [Revised: 12/18/2017] [Accepted: 01/29/2018] [Indexed: 06/08/2023]
Abstract
BACKGROUND AND OBJECTIVE: Early-stage diagnosis of laryngeal cancer is of primary importance to reduce patient morbidity. Narrow-band imaging (NBI) endoscopy is commonly used for screening purposes, reducing the risks linked to a biopsy but at the cost of some drawbacks, such as the large amount of data to review to make the diagnosis. The purpose of this paper is to present a strategy for the automatic selection of informative endoscopic video frames, which can reduce the amount of data to process and potentially increase diagnostic performance. METHODS: A new method to classify NBI endoscopic frames based on intensity, keypoint, and image spatial content features is proposed. Support vector machines with the radial basis function kernel and the one-versus-one scheme are used to classify frames as informative, blurred, with saliva or specular reflections, or underexposed. RESULTS: When tested on a balanced set of 720 images from 18 different laryngoscopic videos, a classification recall of 91% was achieved for informative frames, significantly outperforming three state-of-the-art methods (Wilcoxon signed-rank test, significance level = 0.05). CONCLUSIONS: Due to its high performance in identifying informative frames, the approach is a valuable tool for informative frame selection, which can potentially be applied in different fields, such as computer-assisted diagnosis and endoscopic view expansion.
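The headline figure in this abstract is a per-class recall. As a minimal sketch of how such a value is computed for a multi-class frame classifier, the following uses hypothetical labels (`I` informative, `B` blurred, `S` saliva or specular reflections, `U` underexposed) and a hypothetical helper name; none of the data comes from the paper.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Recall per class: correctly predicted frames / frames truly in that class."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical ground truth and predictions for eight frames (illustrative only).
y_true = ["I", "I", "I", "I", "B", "B", "S", "U"]
y_pred = ["I", "I", "I", "B", "B", "I", "S", "U"]

recalls = per_class_recall(y_true, y_pred)
print(recalls)
```

In this toy data three of the four informative frames are recovered, so the recall for class `I` is 0.75.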
Affiliation(s)
- Sara Moccia
  - Department of Electronics, Information, and Bioengineering, Politecnico di Milano, Milan, Italy; Department of Advanced Robotics, Istituto Italiano di Tecnologia, Genoa, Italy
- Gabriele O Vanone
  - Department of Electronics, Information, and Bioengineering, Politecnico di Milano, Milan, Italy
- Elena De Momi
  - Department of Electronics, Information, and Bioengineering, Politecnico di Milano, Milan, Italy
- Andrea Laborai
  - Department of Otorhinolaryngology, Head and Neck Surgery, University of Genoa, Genoa, Italy
- Luca Guastini
  - Department of Otorhinolaryngology, Head and Neck Surgery, University of Genoa, Genoa, Italy
- Giorgio Peretti
  - Department of Otorhinolaryngology, Head and Neck Surgery, University of Genoa, Genoa, Italy
- Leonardo S Mattos
  - Department of Advanced Robotics, Istituto Italiano di Tecnologia, Genoa, Italy
5. Chen F, Li S, Zhang Y, Wang J. Detection of the Vibration Signal from Human Vocal Folds Using a 94-GHz Millimeter-Wave Radar. Sensors 2017; 17:s17030543. [PMID: 28282892 PMCID: PMC5375829 DOI: 10.3390/s17030543] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 12/09/2016] [Revised: 03/03/2017] [Accepted: 03/04/2017] [Indexed: 11/16/2022]
Abstract
The detection of the vibration signal from human vocal folds provides essential information for studying human phonation and diagnosing voice disorders. Doppler radar technology has enabled the noncontact measurement of the human-vocal-fold vibration. However, existing systems must be placed in close proximity to the human throat and detailed information may be lost because of the low operating frequency. In this paper, a long-distance detection method, involving the use of a 94-GHz millimeter-wave radar sensor, is proposed for detecting the vibration signals from human vocal folds. An algorithm that combines empirical mode decomposition (EMD) and the auto-correlation function (ACF) method is proposed for detecting the signal. First, the EMD method is employed to suppress the noise of the radar-detected signal. Further, the ratio of the energy and entropy is used to detect voice activity in the radar-detected signal, following which, a short-time ACF is employed to extract the vibration signal of the human vocal folds from the processed signal. For validating the method and assessing the performance of the radar system, a vibration measurement sensor and microphone system are additionally employed for comparison. The experimental results obtained from the spectrograms, the vibration frequency of the vocal folds, and coherence analysis demonstrate that the proposed method can effectively detect the vibration of human vocal folds from a long detection distance.
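The last stage described above, a short-time auto-correlation function (ACF) used to recover the vibration frequency, can be illustrated with a small sketch. This covers only the ACF step on a single frame, not the EMD denoising or the energy/entropy voice-activity detection; the function names, the synthetic 200 Hz tone, the 8 kHz sample rate, and the 80 to 400 Hz search band are all assumptions made for illustration.

```python
import math


def autocorrelation(frame, lag):
    """Unnormalized autocorrelation of one short frame at a given lag."""
    return sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))


def estimate_frequency(frame, sample_rate, f_min, f_max):
    """Pick the lag with the strongest autocorrelation inside the search band."""
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    best_lag = max(range(lag_min, lag_max + 1),
                   key=lambda lag: autocorrelation(frame, lag))
    return sample_rate / best_lag


# Synthetic stand-in for a denoised radar frame: a pure 200 Hz tone (assumption).
sample_rate = 8000
tone_hz = 200
frame = [math.sin(2 * math.pi * tone_hz * n / sample_rate) for n in range(800)]

estimated_hz = estimate_frequency(frame, sample_rate, f_min=80, f_max=400)
print(estimated_hz)
```

Restricting the lag search to a plausible frequency band keeps the zero-lag peak out of the comparison, and the unnormalized sum slightly favors the shortest matching period over its multiples.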
Affiliation(s)
- Fuming Chen
  - Department of Biomedical Engineering, Fourth Military Medical University, Xi'an 710032, China
- Sheng Li
  - College of Control Engineering, Xijing University, Xi'an 710123, China
- Yang Zhang
  - Center for Disease Control and Prevention of Guangzhou Military Region, Guangzhou 510507, China
- Jianqi Wang
  - Department of Biomedical Engineering, Fourth Military Medical University, Xi'an 710032, China
6. Kuo CFJ, Kuo J, Hsiao SW, Lee CL, Lee JC, Ke BH. Automatic and quantitative measurement of laryngeal video stroboscopic images. Proc Inst Mech Eng H 2017; 231:48-57. [PMID: 28097934 DOI: 10.1177/0954411916679200] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/17/2022]
Abstract
The laryngeal video stroboscope is an important instrument for physicians to analyze abnormalities and diseases in the glottal area, and it is widely used around the world. However, without quantitative indices, physicians can only make subjective judgments on glottal images. We designed a new laser projection marking module and applied it to the laryngeal video stroboscope to provide scale conversion reference parameters for glottal imaging and to convert the physiological parameters of the glottis. Image processing technology was used to segment the important regions of interest in the image. Information about the glottis was quantified, and a vocal fold image segmentation system was completed to assist clinical diagnosis and increase accuracy. Regarding image processing, histogram equalization was used to enhance glottal image contrast. A center-weighted median filter removed image noise while retaining the texture of the glottal image. Statistical threshold determination was used for automatic segmentation of the glottal image. As the glottal image contains saliva and light spots, which are classified as image noise, this noise was eliminated by erosion, dilation, opening, and closing operations to highlight the vocal area. We also used image processing to automatically identify the vocal fold region in order to quantify information from the glottal image, such as glottal area, vocal fold perimeter, vocal fold length, glottal width, and vocal fold angle. A quantified glottal image database was created to assist physicians in diagnosing glottal diseases more objectively.
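Histogram equalization, the contrast-enhancement step named in this abstract, can be sketched in a few lines. This is the textbook cumulative-histogram mapping on a tiny hypothetical grayscale image, not the authors' implementation; the function name and pixel values are assumptions.

```python
def equalize_histogram(image, levels=256):
    """Spread gray levels using the cumulative histogram (textbook equalization)."""
    pixels = [p for row in image for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        return [row[:] for row in image]  # constant image: nothing to stretch

    # Lookup table mapping each original level to its equalized level.
    lut = [max(0, round((c - cdf_min) / (total - cdf_min) * (levels - 1)))
           for c in cdf]
    return [[lut[p] for p in row] for row in image]


# Hypothetical low-contrast 3x3 image with values squeezed into 50..80.
image = [[50, 50, 60],
         [60, 70, 70],
         [80, 80, 80]]
equalized = equalize_histogram(image)
print(equalized)
```

The darkest input level maps to 0 and the brightest to 255, so the narrow 50 to 80 range is stretched across the full scale while the ordering of intensities is preserved.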
Affiliation(s)
- Chung-Feng Jeffrey Kuo
  - Department of Materials Science and Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Joseph Kuo
  - Wisconsin State Laboratory of Hygiene, Department of Pathobiological Sciences, University of Wisconsin-Madison, Madison, WI, USA
- Shang-Wun Hsiao
  - Department of Materials Science and Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Chi-Lung Lee
  - Department of Materials Science and Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Jih-Chin Lee
  - Department of Otolaryngology-Head and Neck Surgery, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan
- Bo-Han Ke
  - Graduate Institute of Automation and Control, National Taiwan University of Science and Technology, Taipei, Taiwan
7. Perperidis A, Akram A, Altmann Y, McCool P, Westerfeld J, Wilson D, Dhaliwal K, McLaughlin S. Automated Detection of Uninformative Frames in Pulmonary Optical Endomicroscopy. IEEE Trans Biomed Eng 2017; 64:87-98. [PMID: 26978410 DOI: 10.1109/tbme.2016.2538084] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 11/09/2022]
Abstract
SIGNIFICANCE: Optical endomicroscopy (OEM) is a novel real-time imaging technology that provides endoscopic images at a microscopic level. The nature of OEM data, as acquired in clinical use, gives rise to uninformative frames (i.e., pure-noise and motion-artefact frames). Uninformative frames can make up a considerable proportion (in some cases more than 25%) of a dataset, increasing the resources required for analyzing the data (both manually and automatically), as well as diluting the results of any automated quantification analysis. OBJECTIVE: There is, therefore, a need to automatically detect and remove as many of these uninformative frames as possible while keeping frames with structural information intact. METHODS: This paper employs Gray Level Co-occurrence Matrix texture measures and detection theory to identify and remove such frames. The detection of pure-noise frames and the detection of motion-artefact frames are treated as two independent problems. RESULTS: Pulmonary OEM frame sequences of the distal lung are employed for the development and assessment of the approach. The proposed approach identifies and removes uninformative frames with a sensitivity of 93% and a specificity of 92.6%. CONCLUSION: The detection algorithm is accurate and robust in pulmonary OEM frame sequences. Subject to appropriate model refinement, the algorithms could be applied to other organs.
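To make the Gray Level Co-occurrence Matrix (GLCM) idea concrete, here is a minimal sketch that builds a normalized GLCM for horizontally adjacent pixels and derives two standard texture measures, contrast and energy. The abstract does not say which measures or offsets the authors use, so the choice of measures, the offset, the function names, and the two toy images are assumptions.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for pixel pairs offset by (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """High when neighboring pixels differ strongly in gray level."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def energy(p):
    """High when a few gray-level pairs dominate (uniform texture)."""
    return sum(v * v for row in p for v in row)


# Two hypothetical 3x3 images quantized to 4 gray levels.
flat = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
checker = [[0, 3, 0], [3, 0, 3], [0, 3, 0]]

flat_p = glcm(flat, levels=4)
checker_p = glcm(checker, levels=4)
print(contrast(flat_p), energy(flat_p))
print(contrast(checker_p), energy(checker_p))
```

The flat patch yields zero contrast and maximal energy, while the checkerboard yields high contrast, which is the kind of separation a texture-based detector can threshold on.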
8. Ishijima A, Schwarz RA, Shin D, Mondrik S, Vigneswaran N, Gillenwater AM, Anandasabapathy S, Richards-Kortum R. Automated frame selection process for high-resolution microendoscopy. Journal of Biomedical Optics 2015; 20:46014. [PMID: 25919426 PMCID: PMC4412137 DOI: 10.1117/1.jbo.20.4.046014] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 08/27/2014] [Accepted: 04/09/2015] [Indexed: 05/21/2023]
Abstract
We developed an automated frame selection algorithm for high-resolution microendoscopy video sequences. The algorithm rapidly selects a representative frame with minimal motion artifact from a short video sequence, enabling fully automated image analysis at the point-of-care. The algorithm was evaluated by quantitative comparison of diagnostically relevant image features and diagnostic classification results obtained using automated frame selection versus manual frame selection. A data set consisting of video sequences collected in vivo from 100 oral sites and 167 esophageal sites was used in the analysis. The area under the receiver operating characteristic curve was 0.78 (automated selection) versus 0.82 (manual selection) for oral sites, and 0.93 (automated selection) versus 0.92 (manual selection) for esophageal sites. The implementation of fully automated high-resolution microendoscopy at the point-of-care has the potential to reduce the number of biopsies needed for accurate diagnosis of precancer and cancer in low-resource settings where there may be limited infrastructure and personnel for standard histologic analysis.
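The abstract does not spell out how motion is measured, so the following is only a sketch of one plausible approach: score each interior frame by its mean absolute difference to both neighbors and keep the frame with the lowest score. The function names and the tiny 2x2 "frames" are hypothetical.

```python
def frame_difference(a, b):
    """Mean absolute pixel difference between two equally sized frames."""
    diffs = [abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def select_stillest_frame(frames):
    """Index of the interior frame that changes least relative to its neighbors."""
    if len(frames) < 3:
        raise ValueError("need at least three frames")
    scores = {
        i: frame_difference(frames[i - 1], frames[i])
        + frame_difference(frames[i], frames[i + 1])
        for i in range(1, len(frames) - 1)
    }
    return min(scores, key=scores.get)


def uniform(value):
    """Hypothetical 2x2 frame filled with one intensity."""
    return [[value, value], [value, value]]


# Intensity jumps at the start and end of the clip mimic probe motion (assumption).
frames = [uniform(0), uniform(10), uniform(11), uniform(12), uniform(40)]
best_index = select_stillest_frame(frames)
print(best_index)
```

Frame 2 sits between two nearly identical neighbors, so it is the one selected here.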
Affiliation(s)
- Ayumu Ishijima
  - Rice University, Department of Bioengineering MS 142, 6100 Main Street, Houston, Texas 77005, United States
  - University of Tokyo, Department of Precision Engineering, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
- Richard A. Schwarz
  - Rice University, Department of Bioengineering MS 142, 6100 Main Street, Houston, Texas 77005, United States
- Dongsuk Shin
  - Rice University, Department of Bioengineering MS 142, 6100 Main Street, Houston, Texas 77005, United States
- Sharon Mondrik
  - Rice University, Department of Bioengineering MS 142, 6100 Main Street, Houston, Texas 77005, United States
- Nadarajah Vigneswaran
  - University of Texas School of Dentistry, 7500 Cambridge Street, Houston, Texas 77054, United States
- Ann M. Gillenwater
  - University of Texas M. D. Anderson Cancer Center, 1515 Holcombe Boulevard, Houston, Texas 77030, United States
- Sharmila Anandasabapathy
  - Mount Sinai Medical Center, Division of Gastroenterology, One Gustave L. Levy Place, New York, New York 10029, United States
- Rebecca Richards-Kortum
  - Rice University, Department of Bioengineering MS 142, 6100 Main Street, Houston, Texas 77005, United States
  - Address all correspondence to: Rebecca Richards-Kortum, E-mail:
9. Kumar A, Wang YY, Wu CJ, Liu KC, Wu HS. Stereoscopic visualization of laparoscope image using depth information from 3D model. Computer Methods and Programs in Biomedicine 2014; 113:862-868. [PMID: 24444752 DOI: 10.1016/j.cmpb.2013.12.013] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 06/11/2013] [Revised: 11/20/2013] [Accepted: 12/18/2013] [Indexed: 06/03/2023]
Abstract
Laparoscopic surgery is an indispensable part of current surgical practice. It uses an endoscope system comprising a camera and a light source, together with surgical instruments that pass through small incisions in the patient's abdomen. Conventional laparoscope (endoscope) systems produce 2D color video images, which do not give surgeons true depth perception of the scene. In this work, the problem was formulated as synthesizing a stereo image from the monocular (conventional) laparoscope image by incorporating depth information from a 3D CT model. Several computer vision algorithms were combined to build the system, including algorithms for feature detection, matching, and tracking in the video frames, and for reconstructing 3D shape from shading in the 2D laparoscope image. The method was applied to laparoscope video at a rate of up to 5 frames per second to produce a stereo video. The correlation between the depth maps calculated with our method and those from the shape-from-shading algorithm was investigated. The correlation coefficients between the depth maps were within the range of 0.70-0.95 (P<0.05). A t-test was used for the statistical analysis.
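The agreement between two depth maps reported above is a correlation coefficient. A minimal sketch of a Pearson correlation computed over flattened depth maps follows; the abstract does not state which correlation variant was used, so Pearson is an assumption, as are the function name and the small example maps.

```python
import math


def depth_map_correlation(map_a, map_b):
    """Pearson correlation between two equally sized depth maps."""
    xs = [v for row in map_a for v in row]
    ys = [v for row in map_b for v in row]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical 2x2 depth maps in arbitrary units.
model_depth = [[1.0, 2.0], [3.0, 4.0]]
scaled_depth = [[3.0, 5.0], [7.0, 9.0]]     # exactly 2 * model + 1
inverted_depth = [[4.0, 3.0], [2.0, 1.0]]   # reversed ordering

r_scaled = depth_map_correlation(model_depth, scaled_depth)
r_inverted = depth_map_correlation(model_depth, inverted_depth)
print(r_scaled, r_inverted)
```

Because the coefficient is invariant to scale and offset, it compares the shape of the two depth maps rather than their absolute units, which suits maps produced by different reconstruction methods.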
Affiliation(s)
- Atul Kumar
  - Medical Imaging Research Laboratory, IRCAD, Taiwan; Department of General Surgery, Chang Bing Show Chwan Memorial Hospital, Taiwan
- Yen-Yu Wang
  - Medical Imaging Research Laboratory, IRCAD, Taiwan; Department of General Surgery, Chang Bing Show Chwan Memorial Hospital, Taiwan
- Ching-Jen Wu
  - Medical Imaging Research Laboratory, IRCAD, Taiwan; Department of General Surgery, Chang Bing Show Chwan Memorial Hospital, Taiwan
- Kai-Che Liu
  - Medical Imaging Research Laboratory, IRCAD, Taiwan; Department of General Surgery, Chang Bing Show Chwan Memorial Hospital, Taiwan
- Hurng-Sheng Wu
  - Medical Imaging Research Laboratory, IRCAD, Taiwan; Department of General Surgery, Chang Bing Show Chwan Memorial Hospital, Taiwan