1
Kim ES, Lee KS. Artificial intelligence in colonoscopy: from detection to diagnosis. Korean J Intern Med 2024; 39:555-562. PMID: 38695105; PMCID: PMC11236815; DOI: 10.3904/kjim.2023.332.
Abstract
This study reviews recent progress in artificial intelligence for colonoscopy, from detection to diagnosis. The source of data was 27 original studies in PubMed. The search terms were "colonoscopy" (title) and "deep learning" (abstract). The eligibility criteria were: (1) a dependent variable of gastrointestinal disease; (2) an intervention of deep learning for classification, detection and/or segmentation in colonoscopy; (3) outcomes of accuracy, sensitivity, specificity, area under the curve (AUC), precision, F1, intersection over union (IoU), Dice and/or inference frames per second (FPS); (4) publication in 2021 or later; (5) publication in English. Based on the results of this study, different deep learning methods would be appropriate for different colonoscopy tasks, e.g., EfficientNet with neural architecture search (AUC 99.8%) for classification, You Only Look Once with an instance tracking head (F1 96.3%) for detection, and U-Net with dense-dilation-residual blocks (Dice 97.3%) for segmentation. The reported performance measures varied within 74.0-95.0% for accuracy, 60.0-93.0% for sensitivity, 60.0-100.0% for specificity, 71.0-99.8% for AUC, 70.1-93.3% for precision, 81.0-96.3% for F1, 57.2-89.5% for IoU, 75.1-97.3% for Dice and 66-182 for FPS. In conclusion, artificial intelligence provides an effective, non-invasive decision support system for colonoscopy, from detection to diagnosis.
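The overlap measures tabulated in this review (IoU, Dice) follow directly from binary masks; a minimal NumPy sketch, not taken from the review itself (the smoothing term `eps` is an illustrative assumption):

```python
import numpy as np

def iou_and_dice(pred, target, eps=1e-7):
    """IoU and Dice coefficient for two binary segmentation masks
    of equal shape. eps avoids division by zero for empty masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    iou = inter / (union + eps)
    # Dice = 2|A∩B| / (|A| + |B|); equivalently 2*IoU / (1 + IoU)
    dice = 2 * inter / (pred.sum() + target.sum() + eps)
    return float(iou), float(dice)
```

Dice and IoU are monotonically related (Dice = 2·IoU/(1+IoU)), which is why their reported ranges track each other across these studies.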
Affiliation(s)
- Eun Sun Kim
- Department of Gastroenterology, Korea University Anam Hospital, Seoul, Korea
- Kwang-Sig Lee
- AI Center, Korea University Anam Hospital, Seoul, Korea
2
Subhashini R, Velswamy R, Sree Rathna Lakshmi NVS, Sivanandam C. An innovative breast cancer detection framework using multiscale dilated densenet with attention mechanism. Network (Bristol, England) 2024:1-37. PMID: 38648017; DOI: 10.1080/0954898x.2024.2343348.
Abstract
Cancer-related deadly diseases affect both developed and underdeveloped nations worldwide. Effective network learning is crucial to more reliably identify and categorize breast carcinoma in vast and unbalanced image datasets. The absence of early cancer symptoms makes early identification challenging. Therefore, from the perspectives of diagnosis, prevention, and therapy, cancer continues to be among the healthcare concerns that numerous researchers work to advance. It is therefore essential to design an innovative breast cancer detection model that addresses the complications of classical techniques. Initially, breast cancer images are gathered from online sources and subjected to segmentation. Here, the images are segmented using an Adaptive Trans-Dense-Unet (A-TDUNet), whose parameters are tuned using the developed Modified Sheep Flock Optimization Algorithm (MSFOA). The segmented images are then passed to the detection stage, where breast cancer detection is performed by a Multiscale Dilated Densenet with Attention Mechanism (MDD-AM). In the result validation, the Negative Predictive Value (NPV) and accuracy rate of the designed approach are 96.719% and 93.494%. Hence, the implemented breast cancer detection model secured a better efficacy rate than baseline detection methods under diverse experimental conditions.
Affiliation(s)
- R Subhashini
- Department of Information Technology, Sona College of Technology, Salem, Tamil Nadu, India
- Rajasekar Velswamy
- Department of Computer Science and Engineering, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, India
- N V S Sree Rathna Lakshmi
- Department of Electronics and Communication Engineering, Agni College of Technology, Thazhambur, Tamil Nadu, India
- Chakaravarthi Sivanandam
- Department of Computer Science and Engineering, Panimalar Engineering College, Poonamallee, Chennai, Tamil Nadu, India
3
Sikkandar MY, Sundaram SG, Alassaf A, AlMohimeed I, Alhussaini K, Aleid A, Alolayan SA, Ramkumar P, Almutairi MK, Begum SS. Utilizing adaptive deformable convolution and position embedding for colon polyp segmentation with a visual transformer. Sci Rep 2024; 14:7318. PMID: 38538774; PMCID: PMC11377543; DOI: 10.1038/s41598-024-57993-0.
Abstract
Polyp detection is a challenging task in the diagnosis of Colorectal Cancer (CRC), and it demands clinical expertise due to the diverse nature of polyps. Recent years have witnessed the development of automated polyp detection systems that assist experts in early diagnosis, considerably reducing time consumption and diagnostic errors. In automated CRC diagnosis, polyp segmentation is an important step, carried out with deep learning segmentation models. Recently, Vision Transformers (ViTs) have been slowly replacing these models due to their ability to capture long-range dependencies among image patches. However, existing ViTs for polyp segmentation do not fully harness the inherent self-attention ability and instead incorporate complex attention mechanisms. This paper presents the Polyp-Vision Transformer (Polyp-ViT), a novel Transformer model based on the conventional Transformer architecture, enhanced with adaptive mechanisms for feature extraction and positional embedding. Polyp-ViT is tested on the Kvasir-SEG and CVC-ClinicDB datasets, achieving segmentation accuracies of 0.9891 ± 0.01 and 0.9875 ± 0.71 respectively, outperforming state-of-the-art models. Polyp-ViT is a prospective tool for polyp segmentation that can also be adapted to other medical image segmentation tasks, owing to its ability to generalize well.
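The patch-plus-positional-embedding front end that ViT-based segmenters such as this one build on can be sketched as follows; the patch size, embedding width, and random projection below are illustrative assumptions, and Polyp-ViT's adaptive deformable convolution and adaptive position embedding are not reproduced:

```python
import numpy as np

def patchify(img, patch=4):
    """Split an HxWxC image into non-overlapping, flattened patches,
    row-major over patch blocks -> (num_patches, patch*patch*C)."""
    h, w, c = img.shape
    assert h % patch == 0 and w % patch == 0
    return (img.reshape(h // patch, patch, w // patch, patch, c)
               .transpose(0, 2, 1, 3, 4)
               .reshape(-1, patch * patch * c))

rng = np.random.default_rng(0)
img = rng.random((16, 16, 3))                # toy input image
tokens = patchify(img)                       # (16, 48): 16 patches of 4x4x3
proj = rng.normal(size=(48, 32))             # linear patch-embedding matrix
pos = rng.normal(size=(tokens.shape[0], 32)) # positional embedding (random init here)
embedded = tokens @ proj + pos               # token sequence fed to the Transformer encoder
```

In a trained model `proj` and `pos` are learned; the positional term is what lets self-attention, which is otherwise permutation-invariant, distinguish patch locations.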
Affiliation(s)
- Mohamed Yacin Sikkandar
- Department of Medical Equipment Technology, College of Applied Medical Sciences, Majmaah University, Al Majmaah, 11952, Saudi Arabia
- Sankar Ganesh Sundaram
- Department of Artificial Intelligence and Data Science, KPR Institute of Engineering and Technology, Coimbatore, 641407, India
- Ahmad Alassaf
- Department of Medical Equipment Technology, College of Applied Medical Sciences, Majmaah University, Al Majmaah, 11952, Saudi Arabia
- Ibrahim AlMohimeed
- Department of Medical Equipment Technology, College of Applied Medical Sciences, Majmaah University, Al Majmaah, 11952, Saudi Arabia
- Khalid Alhussaini
- Department of Biomedical Technology, College of Applied Medical Sciences, King Saud University, Riyadh, 12372, Saudi Arabia
- Adham Aleid
- Department of Biomedical Technology, College of Applied Medical Sciences, King Saud University, Riyadh, 12372, Saudi Arabia
- Salem Ali Alolayan
- Department of Medical Equipment Technology, College of Applied Medical Sciences, Majmaah University, Al Majmaah, 11952, Saudi Arabia
- P Ramkumar
- Department of Computer Science and Engineering, Sri Sairam College of Engineering, Anekal, Bengaluru, 562106, Karnataka, India
- Meshal Khalaf Almutairi
- Department of Medical Equipment Technology, College of Applied Medical Sciences, Majmaah University, Al Majmaah, 11952, Saudi Arabia
- S Sabarunisha Begum
- Department of Biotechnology, P.S.R. Engineering College, Sivakasi, 626140, India
4
Sharma P, Nayak DR, Balabantaray BK, Tanveer M, Nayak R. A survey on cancer detection via convolutional neural networks: current challenges and future directions. Neural Netw 2024; 169:637-659. PMID: 37972509; DOI: 10.1016/j.neunet.2023.11.006.
Abstract
Cancer is a condition in which abnormal cells uncontrollably split and damage the body tissues. Hence, detecting cancer at an early stage is highly essential. Currently, medical images play an indispensable role in detecting various cancers; however, manual interpretation of these images by radiologists is observer-dependent, time-consuming, and tedious. An automatic decision-making process is thus an essential need for cancer detection and diagnosis. This paper presents a comprehensive survey on automated cancer detection in various human body organs, namely, the breast, lung, liver, prostate, brain, skin, and colon, using convolutional neural networks (CNNs) and medical imaging techniques. It also includes a brief discussion of state-of-the-art deep learning-based cancer detection methods, their outcomes, and the medical imaging data used. Finally, the datasets used for cancer detection, the limitations of existing solutions, future trends, and challenges in this domain are discussed. The ultimate goal of this paper is to provide comprehensive and insightful information to researchers who have a keen interest in developing CNN-based models for cancer detection.
Affiliation(s)
- Pallabi Sharma
- School of Computer Science, UPES, Dehradun, 248007, Uttarakhand, India
- Deepak Ranjan Nayak
- Department of Computer Science and Engineering, Malaviya National Institute of Technology, Jaipur, 302017, Rajasthan, India
- Bunil Kumar Balabantaray
- Computer Science and Engineering, National Institute of Technology Meghalaya, Shillong, 793003, Meghalaya, India
- M Tanveer
- Department of Mathematics, Indian Institute of Technology Indore, Simrol, 453552, Indore, India
- Rajashree Nayak
- School of Applied Sciences, Birla Global University, Bhubaneswar, 751029, Odisha, India
5
Nguyen TC, Nguyen TP, Cao T, Dao TTP, Ho TN, Nguyen TV, Tran MT. MANet: multi-branch attention auxiliary learning for lung nodule detection and segmentation. Comput Methods Programs Biomed 2023; 241:107748. PMID: 37598474; DOI: 10.1016/j.cmpb.2023.107748.
Abstract
BACKGROUND AND OBJECTIVE Pulmonary nodule detection and segmentation are currently the two primary tasks in analyzing chest computed tomography (chest CT) for signs of lung cancer, enabling early treatment to reduce mortality. Even though many methods have been proposed to reduce false positives and obtain effective detection results, distinguishing a pulmonary nodule from the background region remains challenging because their biological characteristics are similar and nodules vary in size. The purpose of our work is to propose a method for automatic nodule detection and segmentation in chest CT by enhancing the feature information of pulmonary nodules. METHODS We propose a new UNet-based backbone with a multi-branch attention auxiliary learning mechanism, which contains three novel modules, namely, a Projection module, a Fast Cascading Context module, and a Boundary Enhancement module, to further enhance the nodule feature representation. Based on this backbone, we build MANet, a lung nodule localization network that simultaneously detects and segments precise nodule positions. Furthermore, MANet contains a Proposal Refinement step that refines the initially generated proposals to effectively reduce false positives and thereby improve segmentation quality. RESULTS Comprehensive experiments on the combination of the two benchmarks LUNA16 and LIDC-IDRI show that our proposed model outperforms state-of-the-art methods in nodule detection and segmentation in terms of FROC, IoU, and DSC metrics. Our method reports an average FROC score of 88.11% in lung nodule detection. For lung nodule segmentation, the results reach an average IoU score of 71.29% and a DSC score of 82.74%. The ablation study also shows the effectiveness of the new modules, which can be integrated into other UNet-based models.
CONCLUSIONS The experiments demonstrate that our method, with its multi-branch attention auxiliary learning ability, is a promising approach for detecting and segmenting pulmonary nodule instances compared to the original UNet design.
Affiliation(s)
- Tan-Cong Nguyen
- University of Science - VNUHCM, Ho Chi Minh City, Viet Nam; University of Social Sciences and Humanities - VNUHCM, Ho Chi Minh City, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam
- Tien-Phat Nguyen
- University of Science - VNUHCM, Ho Chi Minh City, Viet Nam; John von Neumann Institute - VNUHCM, Ho Chi Minh City, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam
- Tri Cao
- University of Science - VNUHCM, Ho Chi Minh City, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam
- Thao Thi Phuong Dao
- University of Science - VNUHCM, Ho Chi Minh City, Viet Nam; John von Neumann Institute - VNUHCM, Ho Chi Minh City, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam; Thong Nhat Hospital, Ho Chi Minh City, Viet Nam
- Thi-Ngoc Ho
- University of Social Sciences and Humanities - VNUHCM, Ho Chi Minh City, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam
- Tam V Nguyen
- University of Dayton, Dayton, OH, United States
- Minh-Triet Tran
- University of Science - VNUHCM, Ho Chi Minh City, Viet Nam; John von Neumann Institute - VNUHCM, Ho Chi Minh City, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam
6
AL Qurri A, Almekkawy M. Improved UNet with attention for medical image segmentation. Sensors (Basel) 2023; 23:8589. PMID: 37896682; PMCID: PMC10611347; DOI: 10.3390/s23208589.
Abstract
Medical image segmentation is crucial for medical image processing and the development of computer-aided diagnostics. In recent years, deep Convolutional Neural Networks (CNNs) have been widely adopted for medical image segmentation and have achieved significant success. UNet, which is based on CNNs, is the mainstream method used for medical image segmentation. However, its performance suffers owing to its inability to capture long-range dependencies. Transformers, initially designed for Natural Language Processing (NLP) and sequence-to-sequence applications, have demonstrated the ability to capture long-range dependencies; however, their ability to acquire local information is limited. Hybrid architectures of CNNs and Transformers, such as TransUNet, have been proposed to benefit from the Transformer's long-range dependencies and the CNN's low-level details. Nevertheless, automatic medical image segmentation remains a challenging task due to factors such as blurred boundaries, low-contrast tissue environments, and, in the context of ultrasound, issues like speckle noise and attenuation. In this paper, we propose a new model that combines the strengths of both CNNs and Transformers, with architectural improvements designed to enrich the feature representation captured by the skip connections and the decoder. To this end, we devised a new attention module called Three-Level Attention (TLA), composed of an Attention Gate (AG), channel attention, and a spatial normalization mechanism. The AG preserves structural information, whereas channel attention helps to model the interdependencies between channels. Spatial normalization employs the spatial coefficient of the Transformer to improve spatial attention, akin to TransNorm. To further improve the skip connections and reduce the semantic gap, the skip connections between the encoder and decoder were redesigned in a manner similar to the UNet++ dense connection.
Moreover, deep supervision using a side-output channel was introduced, analogous to BASNet, which was originally used for saliency predictions. Two datasets from different modalities, a CT scan dataset and an ultrasound dataset, were used to evaluate the proposed UNet architecture. The experimental results showed that our model consistently improved the prediction performance of the UNet across different datasets.
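The Attention Gate (AG) named above re-weights skip-connection features with a gating signal from the decoder; a minimal additive-attention sketch with random placeholder weights (the TLA module's channel attention and spatial normalization are not reproduced, and the 1x1 convolutions of the usual formulation are reduced to per-pixel linear maps):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_gate(x, g, Wx, Wg, psi):
    """Additive attention gate over a skip connection.
    x: skip features (H, W, Cx); g: decoder gating signal (H, W, Cg).
    Returns x scaled by a spatial attention map in (0, 1)."""
    q = np.maximum(0.0, x @ Wx + g @ Wg)   # joint feature (H, W, Ci), ReLU
    alpha = sigmoid(q @ psi)               # (H, W, 1) attention coefficients
    return x * alpha                       # re-weighted skip features

rng = np.random.default_rng(1)
H, W, Cx, Cg, Ci = 8, 8, 16, 16, 8
x = rng.random((H, W, Cx))                 # skip-connection features
g = rng.random((H, W, Cg))                 # gating signal from the decoder
out = attention_gate(x, g,
                     rng.normal(size=(Cx, Ci)),
                     rng.normal(size=(Cg, Ci)),
                     rng.normal(size=(Ci, 1)))
```

Because the attention map lies in (0, 1), the gate can only suppress skip features, which is how it filters structurally irrelevant regions before the decoder merges them.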
7
Houwen BBSL, Nass KJ, Vleugels JLA, Fockens P, Hazewinkel Y, Dekker E. Comprehensive review of publicly available colonoscopic imaging databases for artificial intelligence research: availability, accessibility, and usability. Gastrointest Endosc 2023; 97:184-199.e16. PMID: 36084720; DOI: 10.1016/j.gie.2022.08.043.
Abstract
BACKGROUND AND AIMS Publicly available databases containing colonoscopic imaging data are valuable resources for artificial intelligence (AI) research. Currently, little is known regarding the number and content of these databases. This review aimed to describe the availability, accessibility, and usability of publicly available colonoscopic imaging databases, focusing on polyp detection, polyp characterization, and quality of colonoscopy. METHODS A systematic literature search was performed in MEDLINE and Embase to identify AI studies describing publicly available colonoscopic imaging databases published after 2010. Second, a targeted search using Google's Dataset Search, Google Search, GitHub, and Figshare was done to identify databases directly. Databases were included if they contained data on polyp detection, polyp characterization, or quality of colonoscopy. To assess the accessibility of databases, the following categories were defined: open access, open access with barriers, and regulated access. To assess the potential usability of the included databases, essential details of each database were extracted using a checklist derived from the Checklist for Artificial Intelligence in Medical Imaging. RESULTS We identified 22 databases with open access, 3 databases with open access with barriers, and 15 databases with regulated access. The 22 open access databases contained 19,463 images and 952 videos. Nineteen of these databases focused on polyp detection, localization, and/or segmentation; 6 on polyp characterization; and 3 on quality of colonoscopy. Only half of these databases have been used by other researchers to develop, train, or benchmark their AI systems. Although technical details were in general well reported, important details such as polyp and patient demographics and the annotation process were under-reported in almost all databases.
CONCLUSIONS This review provides greater insight into the public availability of colonoscopic imaging databases for AI research. Incomplete reporting of important details limits the ability of researchers to assess the usability of current databases.
Affiliation(s)
- Britt B S L Houwen
- Department of Gastroenterology and Hepatology, Amsterdam Gastroenterology Endocrinology Metabolism, Amsterdam University Medical Centres, location Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands
- Karlijn J Nass
- Department of Gastroenterology and Hepatology, Amsterdam Gastroenterology Endocrinology Metabolism, Amsterdam University Medical Centres, location Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands
- Jasper L A Vleugels
- Department of Gastroenterology and Hepatology, Amsterdam Gastroenterology Endocrinology Metabolism, Amsterdam University Medical Centres, location Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands
- Paul Fockens
- Department of Gastroenterology and Hepatology, Amsterdam Gastroenterology Endocrinology Metabolism, Amsterdam University Medical Centres, location Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands
- Yark Hazewinkel
- Department of Gastroenterology and Hepatology, Radboud University Nijmegen Medical Center, Radboud University of Nijmegen, Nijmegen, the Netherlands
- Evelien Dekker
- Department of Gastroenterology and Hepatology, Amsterdam Gastroenterology Endocrinology Metabolism, Amsterdam University Medical Centres, location Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands
8
ELKarazle K, Raman V, Then P, Chua C. Detection of colorectal polyps from colonoscopy using machine learning: a survey on modern techniques. Sensors (Basel) 2023; 23:1225. PMID: 36772263; PMCID: PMC9953705; DOI: 10.3390/s23031225.
Abstract
Given the increased interest in utilizing artificial intelligence as an assistive tool in the medical sector, colorectal polyp detection and classification using deep learning techniques has been an active area of research in recent years. The motivation for researching this topic is that physicians miss polyps from time to time due to fatigue and lack of experience in carrying out the procedure. Unidentified polyps can cause further complications and ultimately lead to colorectal cancer (CRC), one of the leading causes of cancer mortality. Although various techniques have been presented recently, several key issues, such as the lack of sufficient training data, white-light reflection, and blur, affect the performance of such methods. This paper presents a survey of recently proposed methods for detecting polyps from colonoscopy. The survey covers benchmark dataset analysis, evaluation metrics, common challenges, standard methods of building polyp detectors, and a review of the latest work in the literature. We conclude by providing a precise analysis of the gaps and trends discovered in the reviewed literature to guide future work.
Affiliation(s)
- Khaled ELKarazle
- School of Information and Communication Technologies, Swinburne University of Technology, Sarawak Campus, Kuching 93350, Malaysia
- Valliappan Raman
- Department of Artificial Intelligence and Data Science, Coimbatore Institute of Technology, Coimbatore 641014, India
- Patrick Then
- School of Information and Communication Technologies, Swinburne University of Technology, Sarawak Campus, Kuching 93350, Malaysia
- Caslon Chua
- Department of Computer Science and Software Engineering, Swinburne University of Technology, Melbourne 3122, Australia
9
Lan K, Cheng J, Jiang J, Jiang X, Zhang Q. Modified UNet++ with atrous spatial pyramid pooling for blood cell image segmentation. Math Biosci Eng 2023; 20:1420-1433. PMID: 36650817; DOI: 10.3934/mbe.2023064.
Abstract
Blood cell image segmentation is an important part of the field of computer-aided diagnosis. However, due to low contrast, large differences in cell morphology and the scarcity of labeled images, the segmentation performance for cells cannot meet the requirements of actual diagnosis. To address these limitations, we present a deep learning-based approach to cell segmentation on pathological images. Specifically, the algorithm selects UNet++ as the backbone network to extract multi-scale features. Then, the skip connections are redesigned to improve the degradation problem and reduce computational complexity. In addition, atrous spatial pyramid pooling (ASPP) is introduced to obtain cell image feature information at each layer through different receptive fields. Finally, the multi-sided output fusion (MSOF) strategy is utilized to fuse features of different semantic levels, so as to improve the accuracy of target segmentation. Experimental results on the blood cell images for segmentation and classification (BCISC) dataset show that the proposed method yields significant improvements in Matthews correlation coefficient (MCC), Dice and Jaccard values, outperforming classical semantic segmentation networks.
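Atrous spatial pyramid pooling runs parallel dilated filters at several rates over the same feature map and fuses the results; a 1-D NumPy sketch under illustrative assumptions (the paper's module is 2-D and sits inside UNet++; the rates and kernel here are placeholders):

```python
import numpy as np

def dilated_conv1d(x, kernel, rate):
    """'Same'-padded 1-D convolution whose kernel taps are spaced
    `rate` samples apart (atrous convolution): the receptive field
    grows with the rate while the parameter count stays fixed."""
    k = len(kernel)
    span = (k - 1) * rate
    xp = np.pad(x, (span // 2, span - span // 2))
    return np.array([sum(kernel[j] * xp[i + j * rate] for j in range(k))
                     for i in range(len(x))])

def aspp_1d(x, kernel, rates=(1, 2, 4)):
    """One parallel branch per dilation rate, stacked like the
    concatenation step of ASPP -> shape (len(rates), len(x))."""
    return np.stack([dilated_conv1d(x, kernel, r) for r in rates])

x = np.arange(8, dtype=float)
feats = aspp_1d(x, kernel=[1.0, 1.0, 1.0])   # 3 branches over the same input
```

Each branch sees the same input at a different effective scale, which is what lets the fused output capture both fine cell boundaries and wider context.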
Affiliation(s)
- Kun Lan
- College of Mechanical Engineering, Quzhou University, Quzhou 324000, China
- Jianzhen Cheng
- Department of Rehabilitation, Quzhou Third Hospital, Quzhou 324000, China
- Jinyun Jiang
- College of Mechanical Engineering, Quzhou University, Quzhou 324000, China
- Xiaoliang Jiang
- College of Mechanical Engineering, Quzhou University, Quzhou 324000, China
- Qile Zhang
- Department of Rehabilitation, The Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People's Hospital, Quzhou 324000, China
10
Ali S. Where do we stand in AI for endoscopic image analysis? Deciphering gaps and future directions. NPJ Digit Med 2022; 5:184. PMID: 36539473; PMCID: PMC9767933; DOI: 10.1038/s41746-022-00733-3.
Abstract
Recent developments in deep learning have enabled data-driven algorithms that can reach human-level performance and beyond. However, the development and deployment of medical image analysis methods face several challenges, including data heterogeneity due to population diversity and different device manufacturers, and a reliable development process requires substantial input from experts. While the exponential growth in clinical imaging data has enabled deep learning to flourish, data heterogeneity, multi-modality, and rare or inconspicuous disease cases still need to be explored. Since endoscopy is highly operator-dependent, with grim clinical outcomes in some disease cases, reliable and accurate automated system guidance can improve patient care. Most existing methods do not generalise well to unseen target data, patient population variability, and variable disease appearances. This paper reviews recent works on endoscopic image analysis with artificial intelligence (AI) and emphasises the current unmet needs in this field. Finally, it outlines future directions for clinically relevant, complex AI solutions to improve patient outcomes.
Affiliation(s)
- Sharib Ali
- School of Computing, University of Leeds, LS2 9JT, Leeds, UK
11
Wu C, Long C, Li S, Yang J, Jiang F, Zhou R. MSRAformer: multiscale spatial reverse attention network for polyp segmentation. Comput Biol Med 2022; 151:106274. PMID: 36375412; DOI: 10.1016/j.compbiomed.2022.106274.
Abstract
Colon polyps are an important reference basis in the diagnosis of colorectal cancer (CRC). In routine diagnosis, the polyp area is segmented from the colonoscopy image, and the obtained pathological information is used to assist in disease diagnosis and surgery. Accurate segmentation of polyps in colonoscopy images remains a challenging task: polyps of the same type differ greatly in shape, size, color and texture, and it is difficult to distinguish the polyp region from the mucosal boundary. In recent years, convolutional neural networks (CNNs) have achieved good results in medical image segmentation. However, CNNs focus on extracting local features and fall short in extracting global feature information. This paper presents a Multiscale Spatial Reverse Attention network, called MSRAformer, with high performance in medical segmentation. It adopts a Swin Transformer encoder with a pyramid structure to extract features at four different stages and extracts multi-scale feature information through a multi-scale channel attention module, which enhances the global feature extraction ability and generalization of the network and preliminarily aggregates a pre-segmentation result. The paper also proposes a spatial reverse attention module that gradually supplements the edge structure and detail information of the polyp region. Extensive experiments show that MSRAformer's segmentation results on colonoscopy polyp datasets are better than those of most state-of-the-art (SOTA) medical image segmentation methods, with better generalization performance. A reference implementation of MSRAformer is available at https://github.com/ChengLong1222/MSRAformer-main.
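The reverse-attention idea used here weights deeper features by the complement of a coarse prediction, steering refinement toward regions (edges, details) the coarse map missed; a minimal sketch with illustrative shapes (MSRAformer's Swin encoder and multi-scale channel attention are not reproduced):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def reverse_attention(features, coarse_logits):
    """Erase the already-confident foreground: weight each feature map by
    (1 - sigmoid(coarse prediction)), which is near zero wherever the
    coarse map is confident and near one in uncovered regions."""
    rev = 1.0 - sigmoid(coarse_logits)
    return features * rev[..., None]       # broadcast over the channel axis

rng = np.random.default_rng(2)
feat = rng.random((8, 8, 4))               # features from one decoder stage
logits = rng.normal(size=(8, 8))           # coarse pre-segmentation logits
refined = reverse_attention(feat, logits)
```

Stacking such stages lets each level add the boundary detail the previous prediction lacked, which matches the "gradually supplement the edge structure" description above.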
Affiliation(s)
- Cong Wu
- School of Computer Science, Hubei University of Technology, Wuhan, China
- Cheng Long
- School of Computer Science, Hubei University of Technology, Wuhan, China
- Shijun Li
- School of Computer Science, Hubei University of Technology, Wuhan, China
- Junjie Yang
- Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Fagang Jiang
- Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Ran Zhou
- School of Computer Science, Hubei University of Technology, Wuhan, China
12
Nodirov J, Abdusalomov AB, Whangbo TK. Attention 3D U-Net with multiple skip connections for segmentation of brain tumor images. Sensors (Basel) 2022; 22:6501. PMID: 36080958; PMCID: PMC9460422; DOI: 10.3390/s22176501.
Abstract
2D medical image segmentation models are popular among researchers using both traditional and newer machine learning and deep learning techniques. In addition, 3D volumetric data have recently become more accessible, as a result of the high number of studies in recent years on the creation of 3D volumes. Using these 3D data, researchers have begun creating 3D segmentation models for tasks such as brain tumor segmentation and classification. Since more crucial features can be extracted from 3D data than from 2D data, 3D brain tumor detection models have grown in popularity. Various significant research works have focused on 3D versions of popular models, such as 3D U-Net and V-Net. In this study, we used 3D brain image data and created a new architecture based on a 3D U-Net model that uses multiple skip connections with cost-efficient pretrained 3D MobileNetV2 blocks and attention modules. These pretrained MobileNetV2 blocks assist our architecture with a smaller parameter count, keeping the model size operable within our computational capability and helping the model converge faster. We added additional skip connections between the encoder and decoder blocks to ease the exchange of extracted features between the two, resulting in maximum use of the features. We also used attention modules to filter out irrelevant features coming through the skip connections and, thus, preserved more computational power while achieving improved accuracy.
Affiliation(s)
- Jakhongir Nodirov
- Department of IT Convergence Engineering, Gachon University, Sujeong-Gu, Seongnam-Si 461-701, Gyeonggi-Do, Korea
- Taeg Keun Whangbo
- Department of Computer Engineering, Gachon University, Sujeong-Gu, Seongnam-Si 461-701, Gyeonggi-Do, Korea
13
Park HC, Poudel S, Ghimire R, Lee SW. Polyp segmentation with consistency training and continuous update of pseudo-label. Sci Rep 2022; 12:14626. [PMID: 36028547 PMCID: PMC9418164 DOI: 10.1038/s41598-022-17843-3]
Abstract
Supervised polyp segmentation has made great strides over the years. However, obtaining a large number of labeled images is often challenging in the medical domain. To address this, we employ semi-supervised methods that exploit unlabeled data to improve polyp image segmentation. First, we propose an encoder-decoder-based method well suited to polyps of varying shape, size, and scale. Second, we adopt the teacher-student training scheme, in which the teacher model is an exponential moving average of the student model. Third, to leverage the unlabeled data, we enforce a consistency constraint that forces the teacher model to produce similar outputs for differently perturbed versions of the same input. Finally, we upgrade the traditional pseudo-labeling method by training the model with continuously updated pseudo-labels. We demonstrate the efficacy of the proposed method on several polyp datasets, attaining better results in semi-supervised settings. Extensive experiments show that the method propagates essential information from the unlabeled data to improve performance.
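The teacher-student scheme this entry describes can be sketched in a few lines. The decay value and the mean-squared form of the consistency loss are illustrative assumptions, not the paper's exact hyperparameters.

```python
def ema_update(teacher_weights, student_weights, decay=0.99):
    """Mean-teacher update: the teacher is an exponential moving average of
    the student, so it changes slowly and provides stable consistency targets."""
    return [decay * t + (1.0 - decay) * s
            for t, s in zip(teacher_weights, student_weights)]

def consistency_loss(student_out, teacher_out):
    """Mean squared difference between student and teacher predictions on
    differently perturbed views of the same unlabeled image."""
    n = len(student_out)
    return sum((s - t) ** 2 for s, t in zip(student_out, teacher_out)) / n

teacher = [0.0, 0.0]
student = [1.0, 2.0]
teacher = ema_update(teacher, student)           # drifts slowly toward the student
loss = consistency_loss([0.8, 0.6], [0.7, 0.6])  # penalizes the disagreeing pixel
```

Only the student receives gradient updates; after each step the teacher is refreshed by `ema_update`, which is what makes its pseudo-targets smoother than the student's own noisy predictions.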
Affiliation(s)
- Hyun-Cheol Park
- Department of IT Convergence Engineering, Gachon University, Seongnam, 13120, South Korea
- Sahadev Poudel
- Department of IT Convergence Engineering, Gachon University, Seongnam, 13120, South Korea
- Raman Ghimire
- Department of IT Convergence Engineering, Gachon University, Seongnam, 13120, South Korea
- Sang-Woong Lee
- School of Computing, Gachon University, Seongnam, 13120, South Korea
14
An Improved U-Net Image Segmentation Method and Its Application for Metallic Grain Size Statistics. Materials (Basel) 2022; 15:4417. [PMID: 35806543 PMCID: PMC9267311 DOI: 10.3390/ma15134417]
Abstract
Grain size is one of the most important parameters in metallographic microstructure analysis and partly determines material performance. Measuring grain size relies on accurate image segmentation, for which both traditional image processing methods and emerging machine-learning-based methods exist. Unfortunately, traditional image processing can hardly segment grains correctly in metallographic images with low contrast and blurry boundaries, while existing machine-learning-based methods need a large dataset to train the model and struggle with complex images that have fuzzy boundaries and intricate structure. In this paper, an improved U-Net model is proposed to segment complex metallographic images automatically with only a small training set. Experiments on metallographic images show the method's significant advantage, especially for images with low contrast, fuzzy boundaries, and complex structure. Compared with other deep learning methods, the improved U-Net scored higher on the ACC, MIoU, Precision, and F1 indexes, achieving an ACC of 0.97, MIoU of 0.752, Precision of 0.98, and F1 of 0.96. Grain size was then calculated from the segmentation according to American Society for Testing and Materials (ASTM) standards, producing satisfactory results.
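A minimal sketch of the evaluation pipeline this entry reports: per-pixel metrics (ACC, IoU, Precision, F1) computed from flattened binary masks, plus the ASTM E112 relation between grain count and grain size number. The toy masks below are assumptions for illustration, not the paper's data.

```python
import math

def segmentation_metrics(pred, truth):
    """Per-pixel binary metrics from flattened 0/1 masks (1 = grain boundary)."""
    tp = sum(p and t for p, t in zip(pred, truth))
    fp = sum(p and not t for p, t in zip(pred, truth))
    fn = sum((not p) and t for p, t in zip(pred, truth))
    tn = sum((not p) and (not t) for p, t in zip(pred, truth))
    acc = (tp + tn) / len(pred)
    iou = tp / (tp + fp + fn)                      # intersection over union
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return acc, iou, precision, f1

def astm_grain_size_number(grains_per_sq_inch_at_100x):
    """ASTM E112: n = 2**(G - 1) grains per square inch at 100x magnification,
    hence G = log2(n) + 1."""
    return math.log2(grains_per_sq_inch_at_100x) + 1

pred  = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 0, 1, 1]
acc, iou, precision, f1 = segmentation_metrics(pred, truth)
G = astm_grain_size_number(64)   # 64 grains/in^2 at 100x gives G = 7
```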
15
Caires Silveira E, Santos Corrêa CF, Madureira Silva L, Almeida Santos B, Mattos Pretti S, Freire de Melo F. Recognition of esophagitis in endoscopic images using transfer learning. World J Gastrointest Endosc 2022; 14:311-319. [PMID: 35719896 PMCID: PMC9157692 DOI: 10.4253/wjge.v14.i5.311]
Abstract
BACKGROUND Esophagitis is an inflammatory and damaging process of the esophageal mucosa, which is confirmed by endoscopic visualization and may, in extreme cases, result in stenosis, fistulization, and esophageal perforation. Deep learning (a field of artificial intelligence) techniques can be used to determine the presence of esophageal lesions compatible with esophagitis.
AIM To develop, using transfer learning, a deep neural network model to recognize the presence of esophagitis in endoscopic images.
METHODS Endoscopic images of 1932 patients with a diagnosis of esophagitis and 1663 patients without any pathological diagnosis, originating from the Kvasir and HyperKvasir datasets, were split into training (80%) and test (20%) sets and used to develop and evaluate a binary deep learning classifier built on the DenseNet-201 architecture, a densely connected convolutional network, with weights pretrained on the ImageNet image set and fine-tuned during training. Classifier performance was evaluated on the test set in terms of accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC).
RESULTS The model was trained using the Adam optimizer with a learning rate of 0.0001 and a binary cross-entropy loss function. On the test set (n = 719), the classifier achieved 93.32% accuracy, 93.18% sensitivity, 93.46% specificity, and a 0.96 AUC. Heatmaps of spatial predictive relevance in esophagitis endoscopic images from the test set were also plotted. Given these results, dense convolutional neural networks with pretrained, fine-tuned weights prove to be a good strategy for predictive modeling of esophagitis recognition in endoscopic images. In addition, combining the classification approach with subsequent plotting of heatmaps associated with the classification decision makes the model more explainable.
CONCLUSION Further studies involving transfer learning for the analysis of endoscopic images are warranted, aiming to improve, validate, and disseminate its use in clinical practice.
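The training objective quoted in the results, binary cross-entropy, can be written out directly. This scalar version, with an assumed clamping epsilon for numerical stability, is an illustration rather than the authors' code.

```python
import math

def binary_cross_entropy(p, y, eps=1e-7):
    """BCE for one predicted probability p in (0, 1) against a label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp so log() never sees 0 or 1
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def mean_bce(preds, labels):
    """Batch loss: the average BCE over all (prediction, label) pairs."""
    return sum(binary_cross_entropy(p, y)
               for p, y in zip(preds, labels)) / len(preds)

# A confident correct prediction incurs low loss; a confident wrong one is
# penalized heavily, which is what drives the fine-tuning updates.
low  = binary_cross_entropy(0.95, 1)
high = binary_cross_entropy(0.05, 1)
```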
Affiliation(s)
- Elena Caires Silveira
- Multidisciplinary Institute of Health, Federal University of Bahia, Vitória da Conquista 45029-094, Bahia, Brazil
- Caio Fellipe Santos Corrêa
- Multidisciplinary Institute of Health, Federal University of Bahia, Vitória da Conquista 45029-094, Bahia, Brazil
- Leonardo Madureira Silva
- Multidisciplinary Institute of Health, Federal University of Bahia, Vitória da Conquista 45029-094, Bahia, Brazil
- Bruna Almeida Santos
- Multidisciplinary Institute of Health, Federal University of Bahia, Vitória da Conquista 45029-094, Bahia, Brazil
- Soraya Mattos Pretti
- Multidisciplinary Institute of Health, Federal University of Bahia, Vitória da Conquista 45029-094, Bahia, Brazil
- Fabrício Freire de Melo
- Multidisciplinary Institute of Health, Federal University of Bahia, Vitória da Conquista 45029-094, Bahia, Brazil
16
Deep Ensembles Based on Stochastic Activations for Semantic Segmentation. Signals 2021. [DOI: 10.3390/signals2040047]
Abstract
Semantic segmentation is a very popular topic in modern computer vision, with applications in many fields. Researchers have proposed a variety of architectures for semantic image segmentation. The most common ones exploit an encoder-decoder structure that aims to capture both the semantics of the image and its low-level features. The encoder uses convolutional layers, generally with a stride larger than one, to extract the features, while the decoder recreates the image by upsampling and using skip connections to the first layers. The objective of this study is to propose a method for creating an ensemble of CNNs by enhancing diversity among networks with different activation functions. We use DeepLabV3+ as the base architecture and test the effectiveness of creating an ensemble by randomly changing the activation functions inside the network multiple times; we also vary the backbone network in DeepLabV3+ to validate our findings. A comprehensive evaluation of the proposed approach is conducted on two image segmentation problems: the first, from the medical field, is polyp segmentation for early detection of colorectal cancer, and the second is skin detection, which serves applications including face detection and hand gesture recognition. On the first problem, we reach a Dice coefficient of 0.888 and a mean intersection over union (mIoU) of 0.825 on the competitive Kvasir-SEG dataset. The high performance of the proposed ensemble is confirmed in skin detection, where the approach ranks first against other state-of-the-art approaches (including HarDNet) on a large set of testing datasets.
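The core idea of this entry, injecting diversity by randomly swapping activation functions between ensemble members and then averaging their predictions, can be sketched as follows. The tiny one-input "members" and the particular activation set are illustrative assumptions standing in for full DeepLabV3+ networks.

```python
import math
import random

ACTIVATIONS = {
    "relu":       lambda x: max(0.0, x),
    "leaky_relu": lambda x: x if x > 0 else 0.01 * x,
    "elu":        lambda x: x if x > 0 else math.exp(x) - 1.0,
}

def make_member(rng):
    """One ensemble member: a fixed 'layer' whose activation is drawn at
    random. Randomizing activations is what injects diversity between members."""
    act = ACTIVATIONS[rng.choice(sorted(ACTIVATIONS))]
    return lambda x: 1.0 / (1.0 + math.exp(-act(x)))  # activation + sigmoid head

def ensemble_predict(members, x):
    """Soft voting: average the members' per-pixel probabilities."""
    return sum(m(x) for m in members) / len(members)

rng = random.Random(0)                  # seeded only for reproducibility
members = [make_member(rng) for _ in range(5)]
prob = ensemble_predict(members, -0.3)  # an averaged probability in (0, 1)
```

For non-negative inputs the three activations agree, so diversity (and the ensemble's benefit) comes from how the members disagree on the negative part of their inputs.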
17
Hwang J, Hwang S. Exploiting Global Structure Information to Improve Medical Image Segmentation. Sensors (Basel) 2021; 21:3249. [PMID: 34067205 PMCID: PMC8125827 DOI: 10.3390/s21093249]
Abstract
In this paper, we propose a method to enhance the performance of segmentation models for medical images. The method is based on convolutional neural networks that learn the global structure information, which corresponds to anatomical structures in medical images. Specifically, the proposed method is designed to learn the global boundary structures via an autoencoder and constrain a segmentation network through a loss function. In this manner, the segmentation model performs the prediction in the learned anatomical feature space. Unlike previous studies that considered anatomical priors by using a pre-trained autoencoder to train segmentation networks, we propose a single-stage approach in which the segmentation network and autoencoder are jointly learned. To verify the effectiveness of the proposed method, the segmentation performance is evaluated in terms of both the overlap and distance metrics on the lung area and spinal cord segmentation tasks. The experimental results demonstrate that the proposed method can enhance not only the segmentation performance but also the robustness against domain shifts.
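The single-stage, jointly learned objective this entry describes can be sketched as a weighted sum of losses: a supervised segmentation term, a shape term that compares predicted and ground-truth masks in the autoencoder's latent space, and the autoencoder's own reconstruction term. The specific terms, the mean-squared form, and the weights `lam` and `mu` are illustrative assumptions, not the paper's exact formulation.

```python
def mse(a, b):
    """Mean squared error between two flat vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def joint_loss(seg_pred, seg_truth, latent_pred, latent_truth,
               recon, recon_target, lam=0.5, mu=0.5):
    """Single-stage objective: all three terms are minimized together, so the
    autoencoder and the segmentation network shape each other during training."""
    l_seg = mse(seg_pred, seg_truth)          # pixel-wise segmentation loss
    l_shape = mse(latent_pred, latent_truth)  # latent-space anatomical prior
    l_recon = mse(recon, recon_target)        # keeps the autoencoder faithful
    return l_seg + lam * l_shape + mu * l_recon
```

The shape term is what constrains predictions to lie near the manifold of plausible anatomical boundaries, which is the claimed source of robustness to domain shift.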
Affiliation(s)
- Jaemoon Hwang
- Department of Data Science, Seoul National University of Science and Technology, Seoul 01811, Korea
- Sangheum Hwang
- Department of Data Science, Seoul National University of Science and Technology, Seoul 01811, Korea
- Department of Industrial & Information Systems Engineering, Seoul National University of Science and Technology, Seoul 01811, Korea
- Research Center for Electrical and Information Technology, Seoul National University of Science and Technology, Seoul 01811, Korea
- Correspondence: