Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Download

Total Articles

23
(from Reference Citation Analysis)

Article PDFs (4)

Cited by > 0 (8)

Searched Name

Attention mechanisms

Ranked By

Results Analysis

Year Published Analysis
Article Type Analysis
Publication Title Analysis
Category Analysis

Results Analysis

Indexed Articles

Year Published

Show more Refine

Article Type

Show more Refine

Article Statistics

Refine

MESH Headings

Show more Refine

First Author

Show more Refine

First Author Affiliations

Show more Refine

Authors

Show more Refine

Publication Titles

Show more Refine

Grant Agencies

Show more Refine

Countries/Regions

Show more Refine

Affiliations

Show more Refine

Corresponding Author Affiliations

Show more Refine

Category

Show more Refine

Number

Citation Analysis

Dong C, Du G. An enhanced real-time human pose estimation method based on modified YOLOv8 framework. Sci Rep 2024;14:8012. [PMID: 38580704 PMCID: PMC10997650 DOI: 10.1038/s41598-024-58146-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2024] [Accepted: 03/26/2024] [Indexed: 04/07/2024] Open

Hussein R, Shin D, Zhao MY, Guo J, Davidzon G, Steinberg G, Moseley M, Zaharchuk G. Turning brain MRI into diagnostic PET: ¹⁵O-water PET CBF synthesis from multi-contrast MRI via attention-based encoder-decoder networks. Med Image Anal 2024;93:103072. [PMID: 38176356 PMCID: PMC10922206 DOI: 10.1016/j.media.2023.103072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/15/2022] [Revised: 12/20/2023] [Accepted: 12/20/2023] [Indexed: 01/06/2024]

Wu Y, Li J, Wang X, Zhang Z, Zhao S. DECIDE: A decoupled semantic and boundary learning network for precise osteosarcoma segmentation by integrating multi-modality MRI. Comput Biol Med 2024;174:108308. [PMID: 38581998 DOI: 10.1016/j.compbiomed.2024.108308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2023] [Revised: 01/17/2024] [Accepted: 03/12/2024] [Indexed: 04/08/2024]

Abstract

Automated Osteosarcoma Segmentation in Multi-modality MRI (AOSMM) holds clinical significance for effective tumor evaluation and treatment planning. However, the precision of AOSMM is challenged by the diverse characteristics of multi-modality MRI and the inherent heterogeneity and boundary ambiguity of osteosarcoma. While numerous methods have made significant strides in automated osteosarcoma segmentation, they primarily focused on the use of a single MRI modality and overlooked the potential benefits of integrating complementary information from other MRI modalities. Furthermore, they did not adequately model the long-range dependencies of complex tumor features, which may lead to insufficiently discriminative feature representations. To this end, we propose a decoupled semantic and boundary learning network (DECIDE) to achieve precise AOSMM with three functional modules. The Multi-modality Feature Fusion and Recalibration (MFR) module adaptively fuses and recalibrates multi-modality features by exploiting their channel-wise dependencies to compute low-rank attention weights for effectively aggregating useful information from different MRI modalities, which promotes complementary learning between multi-modality MRI and enables a more comprehensive tumor characterization. The Lesion Attention Enhancement (LAE) module employs spatial and channel attention mechanisms to capture global contextual dependencies over local features, significantly enhancing the discriminability and representational capacity of intricate tumor features. The Boundary Context Aggregation (BCA) module further enhances semantic representations by utilizing boundary information for effective context aggregation while also ensuring intra-class consistency in cases of boundary ambiguity. Substantial experiments demonstrate that DECIDE achieves exceptional performance in osteosarcoma segmentation, surpassing state-of-the-art methods in terms of accuracy and stability.

Collapse

V JP, S AAV, P GK, N K K. A novel attention-based cross-modal transfer learning framework for predicting cardiovascular disease. Comput Biol Med 2024;170:107977. [PMID: 38217974 DOI: 10.1016/j.compbiomed.2024.107977] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2023] [Revised: 12/19/2023] [Accepted: 01/08/2024] [Indexed: 01/15/2024]

Lu P, Zhang W, Wu J. AMPCDA: Prediction of circRNA-disease associations by utilizing attention mechanisms on metapaths. Comput Biol Chem 2024;108:107989. [PMID: 38016366 DOI: 10.1016/j.compbiolchem.2023.107989] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2023] [Revised: 10/24/2023] [Accepted: 11/15/2023] [Indexed: 11/30/2023]

Abstract

Researchers have been creating an expanding corpus of experimental evidences in biomedical field which has revealed prevalent associations between circRNAs and human diseases. Such linkages unveiled afforded a new perspective for elucidating etiology and devise innovative therapeutic strategies. In recent years, many computational methods were introduced to remedy the limitations of inefficiency and exorbitant budgets brought by conventional lab-experimental approaches to enumerate possible circRNA-disease associations, but the majority of existing methods still face challenges in effectively integrating node embeddings with higher-order neighborhood representations, which might hinder the final predictive accuracy from attaining optimal measures. To overcome such constraints, we proposed AMPCDA, a computational technique harnessing predefined metapaths to predict circRNA-disease associations. Specifically, an association graph is initially built upon three source databases and two similarity derivation procedures, and DeepWalk is subsequently imposed on the graph to procure initial feature representations. Vectorial embeddings of metapath instances, concatenated by initial node features, are then fed through a customed encoder. By employing self-attention section, metapath-specific contributions to each node are accumulated before combining with node's intrinsic features and channeling into a graph attention module, which furnished the input representations for the multilayer perceptron to predict the ultimate association probability scores. By integrating graph topology features and node embedding themselves, AMPCDA managed to effectively leverage information carried by multiple nodes along paths and exhibited an exceptional predictive performance, achieving AUC values of 0.9623, 0.9675, and 0.9711 under 5-fold cross validation, 10-fold cross validation, and leave-one-out cross validation, respectively. These results signify substantial accuracy improvements compared to other prediction models. Case study assessments confirm the high predictive accuracy of our proposed technique in identifying circRNA-disease connections, highlighting its value in guiding future biological research to reveal new disease mechanisms.

Collapse

Wang Z, Yang P, Hu L, Zhang B, Lin C, Lv W, Wang Q. SLAPP: Subgraph-level attention-based performance prediction for deep learning models. Neural Netw 2024;170:285-297. [PMID: 38000312 DOI: 10.1016/j.neunet.2023.11.043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/10/2023] [Revised: 10/30/2023] [Accepted: 11/20/2023] [Indexed: 11/26/2023]

Luo H, Wei J, Wang Y, Chen J, Li W. An improved lightweight object detection algorithm for YOLOv5. PeerJ Comput Sci 2024;10:e1830. [PMID: 38435620 PMCID: PMC10909222 DOI: 10.7717/peerj-cs.1830] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/19/2023] [Accepted: 12/29/2023] [Indexed: 03/05/2024]

Abstract

Object detection based on deep learning has made great progress in the past decade and has been widely used in various fields of daily life. Model lightweighting is the core of deploying target detection models on mobile or edge devices. Lightweight models have fewer parameters and lower computational costs, but are often accompanied by lower detection accuracy. Based on YOLOv5s, this article proposes an improved lightweight target detection model, which can achieve higher detection accuracy with smaller parameters. Firstly, utilizing the lightweight feature of the Ghost module, we integrated it into the C3 structure and replaced some of the C3 modules after the upsample layer on the neck network, thereby reducing the number of model parameters and expediting the model's inference process. Secondly, the coordinate attention (CA) mechanism was added to the neck to enhance the model's ability to pay attention to relevant information and improved detection accuracy. Finally, a more efficient Simplified Spatial Pyramid Pooling-Fast (SimSPPF) module was designed to enhance the stability of the model and shorten the training time of the model. In order to verify the effectiveness of the improved model, experiments were conducted using three datasets with different features. Experimental results show that the number of parameters of our model is significantly reduced by 28% compared with the original model, and mean average precision (mAP) is increased by 3.1%, 1.1% and 1.8% respectively. The model also performs better in terms of accuracy compared to existing lightweight state-of-the-art models. On three datasets with different features, mAP of the proposed model achieved 87.2%, 77.8% and 92.3%, which is better than YOLOv7tiny (81.4%, 77.7%, 90.3%), YOLOv8n (84.7%, 77.7%, 90.6%) and other advanced models. When achieving the decreased number of parameters, the improved model can successfully increase mAP, providing great reference for deploying the model on mobile or edge devices.

Collapse

Zhao Z, Zhu J, Jiao P, Wang J, Zhang X, Lu X, Zhang Y. Hybrid-FHR: a multi-modal AI approach for automated fetal acidosis diagnosis. BMC Med Inform Decis Mak 2024;24:19. [PMID: 38247009 PMCID: PMC10801938 DOI: 10.1186/s12911-024-02423-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2023] [Accepted: 01/10/2024] [Indexed: 01/23/2024] Open

Abstract

BACKGROUND

In clinical medicine, fetal heart rate (FHR) monitoring using cardiotocography (CTG) is one of the most commonly used methods for assessing fetal acidosis. However, as the visual interpretation of CTG depends on the subjective judgment of the clinician, this has led to high inter-observer and intra-observer variability, making it necessary to introduce automated diagnostic techniques.

METHODS

In this study, we propose a computer-aided diagnostic algorithm (Hybrid-FHR) for fetal acidosis to assist physicians in making objective decisions and taking timely interventions. Hybrid-FHR uses multi-modal features, including one-dimensional FHR signals and three types of expert features designed based on prior knowledge (morphological time domain, frequency domain, and nonlinear). To extract the spatiotemporal feature representation of one-dimensional FHR signals, we designed a multi-scale squeeze and excitation temporal convolutional network (SE-TCN) backbone model based on dilated causal convolution, which can effectively capture the long-term dependence of FHR signals by expanding the receptive field of each layer's convolution kernel while maintaining a relatively small parameter size. In addition, we proposed a cross-modal feature fusion (CMFF) method that uses multi-head attention mechanisms to explore the relationships between different modalities, obtaining more informative feature representations and improving diagnostic accuracy.

RESULTS

Our ablation experiments show that the Hybrid-FHR outperforms traditional previous methods, with average accuracy, specificity, sensitivity, precision, and F1 score of 96.8, 97.5, 96, 97.5, and 96.7%, respectively.

CONCLUSIONS

Our algorithm enables automated CTG analysis, assisting healthcare professionals in the early identification of fetal acidosis and the prompt implementation of interventions.

Collapse

Jia J, Lv P, Wei X, Qiu W. SNO-DCA: A model for predicting S-nitrosylation sites based on densely connected convolutional networks and attention mechanism. Heliyon 2024;10:e23187. [PMID: 38148797 PMCID: PMC10750070 DOI: 10.1016/j.heliyon.2023.e23187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2023] [Revised: 11/22/2023] [Accepted: 11/29/2023] [Indexed: 12/28/2023] Open

Huang H, Chen P, Wen J, Lu X, Zhang N. Multiband seizure type classification based on 3D convolution with attention mechanisms. Comput Biol Med 2023;166:107517. [PMID: 37778214 DOI: 10.1016/j.compbiomed.2023.107517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2023] [Revised: 08/28/2023] [Accepted: 09/19/2023] [Indexed: 10/03/2023]

Qiu C, Huang Z, Lin C, Zhang G, Ying S. A despeckling method for ultrasound images utilizing content-aware prior and attention-driven techniques. Comput Biol Med 2023;166:107515. [PMID: 37839221 DOI: 10.1016/j.compbiomed.2023.107515] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/09/2023] [Revised: 08/25/2023] [Accepted: 09/19/2023] [Indexed: 10/17/2023]

Lu S, Liu M, Yin L, Yin Z, Liu X, Zheng W. The multi-modal fusion in visual question answering: a review of attention mechanisms. PeerJ Comput Sci 2023;9:e1400. [PMID: 37346665 PMCID: PMC10280591 DOI: 10.7717/peerj-cs.1400] [Citation(s) in RCA: 38] [Impact Index Per Article: 38.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2022] [Accepted: 04/25/2023] [Indexed: 06/23/2023]

Fan T, Qiu S, Wang Z, Zhao H, Jiang J, Wang Y, Xu J, Sun T, Jiang N. A new deep convolutional neural network incorporating attentional mechanisms for ECG emotion recognition. Comput Biol Med 2023;159:106938. [PMID: 37119553 DOI: 10.1016/j.compbiomed.2023.106938] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2023] [Revised: 03/28/2023] [Accepted: 04/14/2023] [Indexed: 05/01/2023]

Zhan B, Song E, Liu H. FSA-Net: Rethinking the attention mechanisms in medical image segmentation from releasing global suppressed information. Comput Biol Med 2023;161:106932. [PMID: 37230013 DOI: 10.1016/j.compbiomed.2023.106932] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/28/2022] [Revised: 03/28/2023] [Accepted: 04/13/2023] [Indexed: 05/27/2023]

Zhang J, Chen Y, Zeng P, Liu Y, Diao Y, Liu P. Ultra-Attention: Automatic Recognition of Liver Ultrasound Standard Sections Based on Visual Attention Perception Structures. Ultrasound Med Biol 2023;49:1007-1017. [PMID: 36681610 DOI: 10.1016/j.ultrasmedbio.2022.12.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/07/2022] [Revised: 11/12/2022] [Accepted: 12/22/2022] [Indexed: 06/17/2023]

Gómez S, Mantilla D, Rangel E, Ortiz A, D Vera D, Martínez Carrillo F. A deep supervised cross-attention strategy for ischemic stroke segmentation in MRI studies. Biomed Phys Eng Express 2023;9. [PMID: 36988115 DOI: 10.1088/2057-1976/acc853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2022] [Accepted: 03/28/2023] [Indexed: 03/30/2023]

Wei J, Liu G, Liu S, Xiao Z. A novel algorithm for small object detection based on YOLOv4. PeerJ Comput Sci 2023;9:e1314. [PMID: 37346537 PMCID: PMC10280595 DOI: 10.7717/peerj-cs.1314] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2022] [Accepted: 03/06/2023] [Indexed: 06/23/2023]

Abstract

Small object detection is one of the difficulties in the development of computer vision, especially in the case of complex image backgrounds, and the accuracy of small object detection still needs to be improved. In this article, we present a small object detection network based on YOLOv4, which solves some obstacles that hinder the performance of traditional methods in small object detection tasks in complex road environments, such as few effective features, the influence of image noise, and occlusion by large objects, and improves the detection of small objects in complex background situations such as drone aerial survey images. The improved network architecture reduces the computation and GPU memory consumption of the network by including the cross-stage partial network (CSPNet) structure into the spatial pyramid pool (SPP) structure in the YOLOv4 network and convolutional layers after concatenation operation. Secondly, the accuracy of the model on the small object detection task is improved by adding a more suitable small object detection head and removing one used for large object detection. Then, a new branch is added to extract feature information at a shallow location in the backbone part, and the feature information extracted from this branch is fused in the neck part to enrich the small object location information extracted by the model; when fusing feature information from different levels in the backbone, the fusion weight of useful information is increased by adding a weighting mechanism to improve detection performance at each scale. Finally, a coordinated attention (CA) module is embedded at a suitable location in the neck part, which enables the model to focus on spatial location relationships and inter-channel relationships and enhances feature representation capability. The proposed model has been tested to detect 10 different target objects in aerial images from drones and five different road traffic signal signs in images taken from vehicles in a complex road environment. The detection speed of the model meets the criteria of real-time detection, the model has better performance in terms of accuracy compared to the existing state-of-the-art detection models, and the model has only 44M parameters. On the drone aerial photography dataset, the average accuracy of YOLOv4 and YOLOv5L is 42.79% and 42.10%, respectively, while our model achieves an average accuracy (mAP) of 52.76%; on the urban road traffic light dataset, the proposed model achieves an average accuracy of 96.98%, which is also better than YOLOv4 (95.32%), YOLOv5L (94.79%) and other advanced models. The current work provides an efficient method for small object detection in complex road environments, which can be extended to scenarios involving small object detection, such as drone cruising and autonomous driving.

Collapse

Yu Z, Liu S, Liu P, Liu Y. Automatic detection and diagnosis of thyroid ultrasound images based on attention mechanism. Comput Biol Med 2023;155:106468. [PMID: 36841057 DOI: 10.1016/j.compbiomed.2022.106468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/10/2022] [Revised: 11/21/2022] [Accepted: 12/19/2022] [Indexed: 12/24/2022]

Zhang Y, Su L, Liu Z, Tan W, Jiang Y, Cheng C. A semi-supervised learning approach for COVID-19 detection from chest CT scans. Neurocomputing 2022;503:314-24. [PMID: 35765410 DOI: 10.1016/j.neucom.2022.06.076] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/23/2022] [Revised: 05/11/2022] [Accepted: 06/18/2022] [Indexed: 01/17/2023]

Guan A, Liu L, Fu X, Liu L. Precision medical image hash retrieval by interpretability and feature fusion. Comput Methods Programs Biomed 2022;222:106945. [PMID: 35749884 DOI: 10.1016/j.cmpb.2022.106945] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/30/2021] [Revised: 04/14/2022] [Accepted: 06/07/2022] [Indexed: 06/15/2023]

Abstract

BACKGROUND AND OBJECTIVE

To address the problem of low accuracy of medical image retrieval due to high inter-class similarity and easy omission of lesions, a precision medical image hash retrieval method combining interpretability and feature fusion is proposed, taking chest X-ray images as an example.

METHODS

Firstly, the DenseNet-121 network is pre-trained on a large dataset of medical images without manual annotation using the comparison to learn (C2L) method to obtain a backbone network model containing more medical representations with training weights. Then, a global network is constructed by using global image learning to acquire an interpretable saliency map as attention mechanisms, which can generate a mask crop to get a local discriminant region. Thirdly, the local discriminant regions are used as local network inputs to obtain local features, and the global features are used with the local features by dimension in the pooling layer. Finally, a hash layer is added between the fully connected layer and the classification layer of the backbone network, defining classification loss, quantization loss and bit-balanced loss functions to generate high-quality hash codes. The final retrieval result is output by calculating the similarity metric of the hash codes.

RESULTS

Experiments on the Chest X-ray8 dataset demonstrate that our proposed interpretable saliency map can effectively locate focal regions, the fusion of features can avoid information omission, and the combination of three loss functions can generate more accurate hash codes. Compared with the current advanced medical image retrieval methods, this method can effectively improve the accuracy of medical image retrieval.

CONCLUSIONS

The proposed hash retrieval approach combining interpretability and feature fusion can effectively improve the accuracy of medical image retrieval which can be potentially applied in computer-aided-diagnosis systems.

Collapse

Ni J, Wu J, Elazab A, Tong J, Chen Z. DNL-Net: deformed non-local neural network for blood vessel segmentation. BMC Med Imaging 2022;22:109. [PMID: 35668351 PMCID: PMC9169317 DOI: 10.1186/s12880-022-00836-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2022] [Accepted: 05/31/2022] [Indexed: 11/10/2022] Open

Yeung M, Sala E, Schönlieb CB, Rundo L. Focus U-Net: A novel dual attention-gated CNN for polyp segmentation during colonoscopy. Comput Biol Med 2021;137:104815. [PMID: 34507156 PMCID: PMC8505797 DOI: 10.1016/j.compbiomed.2021.104815] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2021] [Revised: 08/26/2021] [Accepted: 08/26/2021] [Indexed: 02/07/2023]

Abstract

BACKGROUND

Colonoscopy remains the gold-standard screening for colorectal cancer. However, significant miss rates for polyps have been reported, particularly when there are multiple small adenomas. This presents an opportunity to leverage computer-aided systems to support clinicians and reduce the number of polyps missed.

METHOD

In this work we introduce the Focus U-Net, a novel dual attention-gated deep neural network, which combines efficient spatial and channel-based attention into a single Focus Gate module to encourage selective learning of polyp features. The Focus U-Net incorporates several further architectural modifications, including the addition of short-range skip connections and deep supervision. Furthermore, we introduce the Hybrid Focal loss, a new compound loss function based on the Focal loss and Focal Tversky loss, designed to handle class-imbalanced image segmentation. For our experiments, we selected five public datasets containing images of polyps obtained during optical colonoscopy: CVC-ClinicDB, Kvasir-SEG, CVC-ColonDB, ETIS-Larib PolypDB and EndoScene test set. We first perform a series of ablation studies and then evaluate the Focus U-Net on the CVC-ClinicDB and Kvasir-SEG datasets separately, and on a combined dataset of all five public datasets. To evaluate model performance, we use the Dice similarity coefficient (DSC) and Intersection over Union (IoU) metrics.

RESULTS

Our model achieves state-of-the-art results for both CVC-ClinicDB and Kvasir-SEG, with a mean DSC of 0.941 and 0.910, respectively. When evaluated on a combination of five public polyp datasets, our model similarly achieves state-of-the-art results with a mean DSC of 0.878 and mean IoU of 0.809, a 14% and 15% improvement over the previous state-of-the-art results of 0.768 and 0.702, respectively.

CONCLUSIONS

This study shows the potential for deep learning to provide fast and accurate polyp segmentation results for use during colonoscopy. The Focus U-Net may be adapted for future use in newer non-invasive colorectal cancer screening and more broadly to other biomedical image segmentation tasks similarly involving class imbalance and requiring efficiency.

Collapse

Sridharan D, Knudsen EI. Selective disinhibition: A unified neural mechanism for predictive and post hoc attentional selection. Vision Res 2015;116:194-209. [PMID: 25542276 DOI: 10.1016/j.visres.2014.12.010] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/29/2014] [Revised: 12/04/2014] [Accepted: 12/11/2014] [Indexed: 11/23/2022]