1. Pavithra S, Jaladi D, Tamilarasi K. Optical imaging for diabetic retinopathy diagnosis and detection using ensemble models. Photodiagnosis Photodyn Ther 2024; 48:104259. [PMID: 38944405] [DOI: 10.1016/j.pdpdt.2024.104259] [Received: 04/03/2024] [Revised: 06/16/2024] [Accepted: 06/20/2024]
Abstract
Diabetes, characterized by elevated blood sugar levels, can lead to diabetic retinopathy (DR), a condition in which high blood sugar damages the retinal blood vessels. DR is considered the leading cause of blindness among people with diabetes, particularly working-age individuals in low-income countries. It can develop in people with type 1 or type 2 diabetes, and the risk rises with the duration of diabetes and with inadequate blood sugar management. Traditional approaches to the early identification of DR have limitations. This research applies a convolutional neural network (CNN)-based model to DR diagnosis in a novel way. The proposed model uses several deep learning (DL) models, namely VGG19, ResNet50, and InceptionV3, to extract features; after concatenation, these features are passed to a CNN for classification. By combining the strengths of several models, ensemble approaches can be effective tools for detecting DR and can increase overall performance and robustness. Ensemble approaches such as the combination of VGG19, InceptionV3, and ResNet50 can achieve high accuracy on tasks including classification and image recognition. The proposed model is evaluated on a publicly accessible collection of fundus images. VGG19, ResNet50, and InceptionV3 differ in their network architectures, feature extraction capabilities, object detection methods, and approaches to retinal delineation: VGG19 may excel at capturing fine details, ResNet50 at recognizing complex patterns, and InceptionV3 at efficiently capturing multi-scale features. Their combined use in an ensemble can therefore provide a comprehensive analysis of retinal images, aiding the delineation of retinal regions and the identification of abnormalities associated with DR.
For instance, microaneurysms, the earliest signs of DR, often require precise detection of subtle vascular abnormalities. VGG19's proficiency in capturing fine details allows the identification of these minute changes in retinal morphology. ResNet50's strength lies in recognizing intricate patterns, making it effective at detecting neovascularization and complex haemorrhagic lesions. Meanwhile, InceptionV3's multi-scale feature extraction enables comprehensive analysis, crucial for assessing macular oedema and ischaemic changes across different retinal layers.
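The fusion step described above reduces, schematically, to concatenating per-backbone descriptors before classification. The sketch below is an illustrative NumPy mock-up, not the authors' code; the dimensionalities are the standard globally pooled output sizes of the three backbones:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for globally pooled feature vectors from three pretrained
# backbones (VGG19 -> 512-d, ResNet50 -> 2048-d, InceptionV3 -> 2048-d).
feat_vgg19 = rng.standard_normal(512)
feat_resnet50 = rng.standard_normal(2048)
feat_inceptionv3 = rng.standard_normal(2048)

# Late fusion by concatenation: this fused descriptor is what a
# downstream CNN/MLP classification head would receive.
fused = np.concatenate([feat_vgg19, feat_resnet50, feat_inceptionv3])
print(fused.shape)  # (4608,)
```

In a real pipeline the three vectors would come from the backbones' pooled outputs (e.g. Keras applications with `include_top=False, pooling="avg"`), with a dense classifier trained on the concatenation.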
Affiliation(s)
- S Pavithra
- School of Computer Science and Engineering, VIT University, Chennai, Tamil Nadu, India.
- Deepika Jaladi
- School of Computer Science and Engineering, VIT University, Chennai, Tamil Nadu, India.
- K Tamilarasi
- School of Computer Science and Engineering, VIT University, Chennai, Tamil Nadu, India.
2. Zhang Y, Ma X, Huang K, Li M, Heng PA. Semantic-Oriented Visual Prompt Learning for Diabetic Retinopathy Grading on Fundus Images. IEEE Transactions on Medical Imaging 2024; 43:2960-2969. [PMID: 38564346] [DOI: 10.1109/tmi.2024.3383827]
Abstract
Diabetic retinopathy (DR) is a serious ocular condition that requires effective monitoring and treatment by ophthalmologists. However, constructing a reliable DR grading model remains a challenging and costly task, heavily reliant on high-quality training sets and adequate hardware resources. In this paper, we investigate the knowledge transferability of large-scale pre-trained models (LPMs) to fundus images based on prompt learning to construct a DR grading model efficiently. Unlike full-tuning, which fine-tunes all parameters of LPMs, prompt learning involves only a minimal number of additional learnable parameters while achieving performance competitive with full-tuning. Inspired by visual prompt tuning, we propose Semantic-oriented Visual Prompt Learning (SVPL) to enhance the semantic perception ability for better extracting task-specific knowledge from LPMs, without any additional annotations. Specifically, SVPL assigns a group of learnable prompts to each DR level to fit the complex pathological manifestations and then aligns each prompt group to a task-specific semantic space via a contrastive group alignment (CGA) module. We also propose a plug-and-play adapter module, Hierarchical Semantic Delivery (HSD), which allows the semantic transition of prompt groups from shallow to deep layers to facilitate efficient knowledge mining and model convergence. Our extensive experiments on three public DR grading datasets demonstrate that SVPL achieves superior results compared to other transfer tuning and DR grading methods. Further analysis suggests that the generalized knowledge from LPMs is advantageous for constructing the DR grading model on fundus images.
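Schematically, assigning a group of learnable prompts per DR level and feeding them alongside patch tokens can be mocked up as follows; this is a NumPy sketch with assumed sizes, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
num_levels, prompts_per_level, dim = 5, 4, 768  # assumed sizes

# One group of learnable prompt vectors per DR grade (random init here;
# during training these would be optimized while the LPM stays frozen).
prompt_groups = rng.standard_normal((num_levels, prompts_per_level, dim))

# Prepend all prompt groups to the patch-token sequence of one image.
patch_tokens = rng.standard_normal((196, dim))  # 14x14 ViT patch grid
sequence = np.concatenate([prompt_groups.reshape(-1, dim), patch_tokens])
print(sequence.shape)  # (216, 768)
```

The CGA and HSD modules then operate on these prompt groups; only their parameters (plus the adapters) would receive gradients.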
3. Huang Y, Lyu J, Cheng P, Tam R, Tang X. SSiT: Saliency-Guided Self-Supervised Image Transformer for Diabetic Retinopathy Grading. IEEE J Biomed Health Inform 2024; 28:2806-2817. [PMID: 38319784] [DOI: 10.1109/jbhi.2024.3362878]
Abstract
Self-supervised learning (SSL) has been widely applied to learn image representations from unlabeled images, but it has not been fully explored in medical image analysis. In this work, the Saliency-guided Self-supervised image Transformer (SSiT) is proposed for diabetic retinopathy (DR) grading from fundus images. We introduce saliency maps into SSL, with the goal of guiding self-supervised pre-training with domain-specific prior knowledge. Specifically, two saliency-guided learning tasks are employed in SSiT: 1) Saliency-guided contrastive learning is conducted based on momentum contrast, wherein saliency maps of fundus images are used to remove trivial patches from the input sequences of the momentum-updated key encoder. The key encoder is thus constrained to provide target representations focused on salient regions, guiding the query encoder to capture salient features. 2) The query encoder is trained to predict the saliency segmentation, encouraging the preservation of fine-grained information in the learned representations. To assess the proposed method, four publicly accessible fundus image datasets are adopted: one for pre-training and the other three for evaluating the pre-trained models on downstream DR grading. SSiT significantly outperforms other representative state-of-the-art SSL methods on all downstream datasets and under various evaluation settings. For example, SSiT achieves a Kappa score of 81.88% on the DDR dataset under fine-tuning evaluation, outperforming all other ViT-based SSL methods by at least 9.48%.
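The saliency-guided removal of trivial patches from the key encoder's input can be illustrated as follows; this is a schematic NumPy sketch with assumed token counts and a fixed keep ratio, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(1)

# 196 patch tokens (14x14 grid) with a saliency score per patch.
tokens = rng.standard_normal((196, 768))
saliency = rng.random(196)

# Keep only the top 50% most salient patches for the key encoder,
# discarding trivial (low-saliency) patches.
keep = np.argsort(saliency)[-98:]
salient_tokens = tokens[keep]
print(salient_tokens.shape)  # (98, 768)
```

The key encoder would then see only `salient_tokens`, so its target representation is forced to focus on salient (lesion-bearing) regions.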
4. Hai Z, Zou B, Xiao X, Peng Q, Yan J, Zhang W, Yue K. A novel approach for intelligent diagnosis and grading of diabetic retinopathy. Comput Biol Med 2024; 172:108246. [PMID: 38471350] [DOI: 10.1016/j.compbiomed.2024.108246] [Received: 10/18/2023] [Revised: 03/05/2024] [Accepted: 03/05/2024]
Abstract
Diabetic retinopathy (DR) is a severe ocular complication of diabetes that can lead to vision damage and even blindness. Currently, traditional deep convolutional neural networks (CNNs) used for DR grading face two primary challenges: (1) insensitivity to minority classes due to imbalanced data distribution, and (2) neglect of the relationship between the left and right eyes, since the fundus image of only one eye is used for training without differentiating between them. To tackle these challenges, we propose the DRGCNN (DR Grading CNN) model. To address imbalanced data distribution, our model adopts a more balanced strategy by allocating an equal number of channels to the feature maps representing each DR category. Furthermore, we introduce a CAM-EfficientNetV2-M encoder dedicated to encoding input retinal fundus images for feature vector generation. Our encoder has 52.88 M parameters, fewer than RegNet_y_16gf (80.57 M) and EfficientNetB7 (63.79 M), yet achieves a higher kappa value. Additionally, to take advantage of the binocular relationship, we input fundus images from both of the patient's eyes into the network for feature fusion during the training phase. We achieved a kappa value of 86.62% on the EyePACS dataset and 86.16% on the Messidor-2 dataset. Experimental results on these representative diabetic retinopathy datasets demonstrate the exceptional performance of our DRGCNN model, establishing it as a highly competitive intelligent classification model in the field of DR. The code is available at https://github.com/Fat-Hai/DRGCNN.
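The kappa values reported for DR grading are typically Cohen's kappa with quadratic weights, the standard agreement metric for ordinal grades. A minimal NumPy implementation of that generic metric (a sketch, not tied to the paper's code):

```python
import numpy as np

def quadratic_weighted_kappa(y_true, y_pred, num_classes=5):
    """Cohen's kappa with quadratic weights over ordinal grades 0..4."""
    cm = np.zeros((num_classes, num_classes))
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    idx = np.arange(num_classes)
    # Penalty grows with the squared distance between true and predicted grade.
    w = (idx[:, None] - idx[None, :]) ** 2 / (num_classes - 1) ** 2
    expected = np.outer(cm.sum(axis=1), cm.sum(axis=0)) / cm.sum()
    return 1.0 - (w * cm).sum() / (w * expected).sum()

print(quadratic_weighted_kappa([0, 1, 2, 3, 4], [0, 1, 2, 3, 4]))  # 1.0
```

Perfect agreement gives 1.0; predicting a distant grade is penalized far more than an adjacent one, which is why kappa is preferred over plain accuracy for DR severity.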
Affiliation(s)
- Zeru Hai
- School of Informatics, Hunan University of Chinese Medicine, Changsha, Hunan Province, 410208, China
- Beiji Zou
- School of Informatics, Hunan University of Chinese Medicine, Changsha, Hunan Province, 410208, China; School of Computer Science and Engineering, Central South University, Changsha, Hunan Province, 410083, China
- Xiaoxia Xiao
- School of Informatics, Hunan University of Chinese Medicine, Changsha, Hunan Province, 410208, China.
- Qinghua Peng
- School of Traditional Chinese Medicine, Hunan University of Chinese Medicine, Changsha, Hunan Province, 410208, China
- Junfeng Yan
- School of Informatics, Hunan University of Chinese Medicine, Changsha, Hunan Province, 410208, China
- Wensheng Zhang
- School of Informatics, Hunan University of Chinese Medicine, Changsha, Hunan Province, 410208, China; University of Chinese Academy of Sciences (UCAS), Beijing, 100049, China; Research Center of Precision Sensing and Control, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Kejuan Yue
- School of Computer Science, Hunan First Normal University, Changsha, Hunan Province, 410205, China
5. Xia H, Long J, Song S, Tan Y. Multi-scale multi-attention network for diabetic retinopathy grading. Phys Med Biol 2023; 69:015007. [PMID: 38035368] [DOI: 10.1088/1361-6560/ad111d] [Received: 07/04/2023] [Accepted: 11/30/2023]
Abstract
Objective. Diabetic retinopathy (DR) grading plays an important role in clinical diagnosis. However, automatic grading of DR is challenging due to the presence of intra-class variation and small lesions. On the one hand, deep features learned by convolutional neural networks often lose valid information about these small lesions. On the other hand, lesion features vary greatly in type and quantity and can diverge considerably even among fundus images of the same grade. To address these issues, we propose a novel multi-scale multi-attention network (MMNet). Approach. First, to focus on different lesion features of fundus images, we propose a lesion attention module, which encodes multiple lesion attention feature maps by combining channel attention and spatial attention, thus extracting global feature information while preserving diverse lesion features. Second, we propose a multi-scale feature fusion module to learn more feature information for small lesion regions, which combines complementary relationships between different convolutional layers to capture more detailed feature information. Furthermore, we introduce a Cross-layer Consistency Constraint Loss to overcome semantic differences between multi-scale features. Main results. The proposed MMNet obtains a high accuracy of 86.4% and a high kappa score of 88.4% on the multi-class DR grading task on the EyePACS dataset, and 98.6% AUC, 95.3% accuracy, 92.7% recall, 95.0% precision, and 93.3% F1-score for referral versus non-referral classification on the Messidor-1 dataset. Extensive experiments on two challenging benchmarks demonstrate that MMNet achieves significant improvements and outperforms other state-of-the-art DR grading methods. Significance. MMNet improves the efficiency and accuracy of diabetic retinopathy diagnosis and promotes the application of computer-aided medical diagnosis in DR screening.
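The combination of channel and spatial attention in the lesion attention module can be illustrated in miniature as follows; this is a NumPy sketch of the generic attention pattern, not MMNet's exact design:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
feat = rng.standard_normal((64, 28, 28))  # C x H x W feature map

# Channel attention: reweight channels by their global average response.
channel_w = sigmoid(feat.mean(axis=(1, 2)))
feat_c = feat * channel_w[:, None, None]

# Spatial attention: reweight positions by their cross-channel mean.
spatial_w = sigmoid(feat_c.mean(axis=0))
out = feat_c * spatial_w[None, :, :]
print(out.shape)  # (64, 28, 28)
```

Real implementations learn the pooling-to-weight mappings (small MLPs or convolutions); the fixed sigmoid-of-mean here only shows the reweighting structure.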
Affiliation(s)
- Haiying Xia
- School of Electronic and Information Engineering, Guangxi Normal University, Guilin 541004, People's Republic of China
- Jie Long
- School of Electronic and Information Engineering, Guangxi Normal University, Guilin 541004, People's Republic of China
- Shuxiang Song
- School of Electronic and Information Engineering, Guangxi Normal University, Guilin 541004, People's Republic of China
- Yumei Tan
- School of Computer Science and Engineering, Guangxi Normal University, Guilin 541004, People's Republic of China
6. Liu F, Huang W. ESDiff: a joint model for low-quality retinal image enhancement and vessel segmentation using a diffusion model. Biomed Opt Express 2023; 14:6563-6578. [PMID: 38420298] [PMCID: PMC10898574] [DOI: 10.1364/boe.506205] [Received: 09/19/2023] [Revised: 11/01/2023] [Accepted: 11/13/2023]
Abstract
In clinical screening, accurate diagnosis of various diseases relies on the extraction of blood vessels from fundus images. However, clinical fundus images often suffer from uneven illumination, blur, and artifacts caused by equipment or environmental factors. In this paper, we propose a unified framework called ESDiff to address these challenges by integrating retinal image enhancement and vessel segmentation. Specifically, we introduce a novel diffusion model-based framework for image enhancement, incorporating mask refinement as an auxiliary task via a vessel mask-aware diffusion model. Furthermore, we utilize low-quality retinal fundus images and their corresponding illumination maps as inputs to the modified UNet to obtain degradation factors that effectively preserve pathological features and pertinent information. This approach enhances the intermediate results within the iterative process of the diffusion model. Extensive experiments on publicly available fundus retinal datasets (i.e. DRIVE, STARE, CHASE_DB1 and EyeQ) demonstrate the effectiveness of ESDiff compared to state-of-the-art methods.
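The diffusion model underlying ESDiff rests on the standard forward noising process, which admits a closed form per timestep. The NumPy sketch below shows that generic DDPM mechanic (standard diffusion math, not ESDiff's specific mask-aware architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
x0 = rng.random((32, 32))               # a toy "clean" image patch
T = 100
betas = np.linspace(1e-4, 0.02, T)      # linear noise schedule
alphas_bar = np.cumprod(1.0 - betas)    # cumulative signal retention

def q_sample(x0, t):
    """Forward diffusion: jump straight to noised step t in closed form."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps

xt = q_sample(x0, 50)
print(xt.shape)  # (32, 32)
```

A denoising UNet is trained to invert this process step by step; ESDiff additionally conditions the iteration on illumination maps and a vessel mask-refinement task.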
Affiliation(s)
- Fengting Liu
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250300, China
- Wenhui Huang
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250300, China
7. Uppamma P, Bhattacharya S. A multidomain bio-inspired feature extraction and selection model for diabetic retinopathy severity classification: an ensemble learning approach. Sci Rep 2023; 13:18572. [PMID: 37903967] [PMCID: PMC10616283] [DOI: 10.1038/s41598-023-45886-7] [Received: 07/20/2023] [Accepted: 10/25/2023]
Abstract
Diabetic retinopathy (DR) is one of the leading causes of blindness globally. Early detection of this condition is essential for preventing the vision loss that results when diabetes mellitus goes untreated for an extended period. This paper proposes an augmented bio-inspired multidomain feature extraction and selection model for diabetic retinopathy severity estimation using an ensemble learning process. The proposed approach begins by identifying DR severity levels from retinal images, segmenting the optic disc, macula, blood vessels, exudates, and hemorrhages using an adaptive thresholding process. Once the images are segmented, multidomain features are extracted, including frequency, entropy, cosine, Gabor, and wavelet components. These data are fed into a novel Modified Moth Flame Optimization-based feature selection method that assists in optimal feature selection. Finally, an ensemble model built from various machine learning (ML) algorithms, including Naive Bayes, k-nearest neighbours, support vector machine, multilayer perceptron, random forests, and logistic regression, is used to identify the various severity complications of DR. Experiments on different openly accessible data sources show that the proposed method outperforms conventional methods and achieves an accuracy of 96.5% in identifying DR severity levels.
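Combining heterogeneous ML classifiers can be as simple as majority voting over per-model predictions, sketched below with entirely hypothetical predictions (the paper's exact combination rule may differ):

```python
import numpy as np
from collections import Counter

# Hypothetical per-model DR severity predictions (grades 0-4) for six images.
preds = np.array([
    [0, 1, 2, 2, 4, 3],   # e.g. Naive Bayes
    [0, 1, 2, 3, 4, 3],   # e.g. k-NN
    [0, 2, 2, 2, 4, 3],   # e.g. SVM
])

def majority_vote(preds):
    """Final grade per image = most common grade across base models."""
    return np.array([Counter(col).most_common(1)[0][0] for col in preds.T])

final = majority_vote(preds)
print(final.tolist())  # [0, 1, 2, 2, 4, 3]
```

Weighted or probability-averaging variants follow the same shape: combine per-model outputs column-wise into one decision per sample.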
Affiliation(s)
- Posham Uppamma
- School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, Tamilnadu, 632014, India
- Sweta Bhattacharya
- School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, Tamilnadu, 632014, India.
8. Gao Y, Ma C, Guo L, Zhang X, Ji X. CLRD: Collaborative Learning for Retinopathy Detection Using Fundus Images. Bioengineering (Basel) 2023; 10:978. [PMID: 37627863] [PMCID: PMC10451343] [DOI: 10.3390/bioengineering10080978] [Received: 06/27/2023] [Revised: 08/03/2023] [Accepted: 08/15/2023]
Abstract
Retinopathy, a prevalent disease causing visual impairment and sometimes blindness, affects many individuals. Early detection and treatment can be facilitated by monitoring the retina with fundus imaging. Nonetheless, the limited availability of fundus images and imbalanced datasets warrant the development of more precise and efficient algorithms to enhance diagnostic performance. This study presents a novel online knowledge distillation framework, CLRD, which employs a collaborative learning approach to detect retinopathy. By combining student models of varying scales and architectures, the CLRD framework extracts crucial pathological information from fundus images. Knowledge transfer is accomplished by developing distortion information particular to fundus images, thereby enhancing model invariance. Our student models are the Transformer-based BEiT and the CNN-based ConvNeXt, which achieve accuracies of 98.77% and 96.88%, respectively. Furthermore, the proposed method has 5.69-23.13%, 5.37-23.73%, 5.74-23.17%, 11.24-45.21%, and 5.87-24.96% higher accuracy, precision, recall, specificity, and F1 score, respectively, compared with advanced visual models. Our results indicate that the CLRD framework can effectively minimize generalization error without compromising the independent predictions of the student models, offering new directions for further investigation into retinopathy detection.
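Online knowledge distillation between peer students typically minimizes a temperature-softened KL divergence between their predictive distributions. A minimal sketch of that generic loss (illustrative logits, not CLRD's exact objective):

```python
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def kl_div(p, q):
    """KL(p || q) for dense probability vectors."""
    return float(np.sum(p * np.log(p / q)))

# Logits from two peer students (say, a Transformer and a CNN) for one image.
logits_a = [2.0, 0.5, 0.1]
logits_b = [1.8, 0.7, 0.0]

T = 2.0  # temperature softens both distributions before matching
loss_a = kl_div(softmax(logits_b, T), softmax(logits_a, T))  # b teaches a
print(loss_a >= 0.0)  # True
```

In mutual (online) distillation each student also keeps its own cross-entropy term, so matching the peer never overrides the ground-truth signal.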
Affiliation(s)
- Yuan Gao
- Department of Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, Beijing 100191, China
- Chenbin Ma
- Department of Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, Beijing 100191, China
- Shen Yuan Honors College, Beihang University, Beijing 100191, China
- Lishuang Guo
- Department of Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, Beijing 100191, China
- Xuxiang Zhang
- Department of Ophthalmology, Beijing Tiantan Hospital, Capital Medical University, Beijing 100050, China
- Xunming Ji
- Department of Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, Beijing 100191, China
9. Tian M, Wang H, Sun Y, Wu S, Tang Q, Zhang M. Fine-grained attention & knowledge-based collaborative network for diabetic retinopathy grading. Heliyon 2023; 9:e17217. [PMID: 37449186] [PMCID: PMC10336422] [DOI: 10.1016/j.heliyon.2023.e17217] [Received: 02/09/2023] [Revised: 06/09/2023] [Accepted: 06/10/2023]
Abstract
Accurate diabetic retinopathy (DR) grading is crucial for making a proper treatment plan to reduce the damage caused by vision loss. The task is challenging because DR-related lesions are often small, show subtle visual differences, and exhibit large intra-class variation; moreover, the relationships between the lesions and the DR levels are complicated. Although many deep learning (DL) DR grading systems have been developed with some success, there is still room for improvement in grading accuracy. A common issue is that little medical knowledge is used in these DL DR grading systems; as a result, the grading results are not readily interpreted by ophthalmologists, which hinders practical application. This paper proposes a novel fine-grained attention & knowledge-based collaborative network (FA+KC-Net) to address this concern. The fine-grained attention network dynamically divides the extracted feature maps into smaller patches and effectively captures small image features that are meaningful in the sense of its training on a large amount of retinopathy fundus images. The knowledge-based collaborative network extracts a priori medical knowledge features, i.e., lesions such as microaneurysms (MAs), soft exudates (SEs), hard exudates (EXs), and hemorrhages (HEs). Finally, decision rules are developed to fuse the DR grading results from the fine-grained network and the knowledge-based collaborative network into the final grade. Extensive experiments are carried out on four widely used datasets, DDR, Messidor, APTOS, and EyePACS, to evaluate the efficacy of our method and compare it with other state-of-the-art (SOTA) DL models. Results show that the proposed FA+KC-Net is accurate and stable, achieving the best performance on the DDR, Messidor, and APTOS datasets.
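Decision-rule fusion of the two branches can be pictured as a simple policy that escalates the grade when the knowledge branch finds sight-threatening lesions. The rule and trigger set below are entirely hypothetical, for illustration only:

```python
# Hypothetical fusion rule, NOT the paper's actual decision rules:
# escalate to the higher grade when severe lesion types are detected.
SEVERE = {"SE", "EX", "HE"}  # hypothetical trigger set of lesion codes

def fuse_grades(fine_grade, knowledge_grade, lesions_found):
    """Return the fused DR grade from the two branch outputs."""
    if lesions_found & SEVERE:
        return max(fine_grade, knowledge_grade)
    return fine_grade

print(fuse_grades(1, 3, {"SE"}))   # 3: severe lesion found, escalate
print(fuse_grades(2, 1, set()))    # 2: no trigger, keep fine-grained grade
```

The appeal of such rules is interpretability: an ophthalmologist can audit exactly why a grade was raised, which the abstract identifies as a gap in purely end-to-end DL graders.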
Affiliation(s)
- Miao Tian
- School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Hongqiu Wang
- School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Yingxue Sun
- School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Shaozhi Wu
- School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Qingqing Tang
- Department of Ophthalmology, West China Hospital, Sichuan University, Chengdu, 610041, China
- Meixia Zhang
- Department of Ophthalmology, West China Hospital, Sichuan University, Chengdu, 610041, China
10. Han Z, Yang B, Deng S, Li Z, Tong Z. Category weighted network and relation weighted label for diabetic retinopathy screening. Comput Biol Med 2023; 152:106408. [PMID: 36516580] [DOI: 10.1016/j.compbiomed.2022.106408] [Received: 08/18/2022] [Revised: 11/10/2022] [Accepted: 12/03/2022]
Abstract
Diabetic retinopathy (DR) is the primary cause of blindness in adults. Incorporating machine learning into DR grading can improve the accuracy of medical diagnosis. However, problems such as severe data imbalance persist, and existing studies on DR grading ignore the correlation between its labels. In this study, a category weighted network (CWN) is proposed to achieve data balance at the model level. In the CWN, a reference for weight settings is provided by calculating the category gradient norm, reducing experimental overhead. We further propose using relation weighted labels instead of one-hot labels to exploit the distance relationship between labels. Experiments show that the proposed CWN achieves excellent performance on various DR datasets. Moreover, relation weighted labels exhibit broad applicability and can improve other methods that use one-hot labels. The proposed method achieved kappa scores of 0.9431 and 0.9226 and accuracies of 90.94% and 86.12% on the DDR and APTOS datasets, respectively.
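A relation weighted label can be pictured as a soft target whose mass decays with ordinal distance from the true grade, so grade 2 stays "closer" to grade 1 than to grade 4. The exponential decay with temperature `tau` below is an assumption for illustration, not the paper's exact formulation:

```python
import numpy as np

def relation_weighted_label(grade, num_classes=5, tau=1.0):
    """Soft label whose mass decays with ordinal distance from the true grade."""
    grades = np.arange(num_classes)
    weights = np.exp(-np.abs(grades - grade) / tau)
    return weights / weights.sum()

label = relation_weighted_label(2)
print(int(np.argmax(label)))  # 2 -- the true grade keeps the most mass
```

Training against such targets (e.g. with cross-entropy) penalizes confusing adjacent grades less than distant ones, which one-hot labels cannot express.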
Affiliation(s)
- Zhike Han
- Zhejiang University, Hangzhou, 310027, Zhejiang, China; Zhejiang University City College, Hangzhou, 310015, Zhejiang, China
- Bin Yang
- Zhejiang University, Hangzhou, 310027, Zhejiang, China
- Zhuorong Li
- Zhejiang University City College, Hangzhou, 310015, Zhejiang, China.
- Zhou Tong
- The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, 310058, Zhejiang, China
11. Li F, Tang S, Chen Y, Zou H. Deep attentive convolutional neural network for automatic grading of imbalanced diabetic retinopathy in retinal fundus images. Biomed Opt Express 2022; 13:5813-5835. [PMID: 36733744] [PMCID: PMC9872872] [DOI: 10.1364/boe.472176] [Received: 08/04/2022] [Revised: 09/25/2022] [Accepted: 10/06/2022]
Abstract
Automated fine-grained diabetic retinopathy (DR) grading is of great significance for assisting ophthalmologists in monitoring DR and designing tailored treatments for patients. Nevertheless, it is a challenging task as a result of high intra-class variation, high inter-class similarity, small lesions, and imbalanced data distributions. The pivotal factor for success in fine-grained DR grading is to discern subtle associated lesion features, such as microaneurysms (MA), hemorrhages (HM), soft exudates (SE), and hard exudates (HE). In this paper, we constructed a simple yet effective deep attentive convolutional neural network (DACNN) for DR grading and lesion discovery with only image-wise supervision. Designed as a top-down architecture, our model incorporates stochastic atrous spatial pyramid pooling (sASPP), a global attention mechanism (GAM), a category attention mechanism (CAM), and a learnable connected module (LCM) to better extract lesion-related features and maximize DR grading performance. Concretely, we devised sASPP, which combines randomness with atrous spatial pyramid pooling (ASPP), to accommodate the various scales of the lesions and to counteract the co-adaptation of multiple atrous convolutions. GAM was then introduced to extract class-agnostic global attention feature details, whilst CAM was explored to seek class-specific, distinctive region-level lesion feature information and to treat each DR severity grade equally, which tackled the problem of imbalanced DR data distributions. Further, the LCM was designed to automatically and adaptively search for the optimal connections among layers to better extract detailed small-lesion feature representations.

The proposed approach obtained a high accuracy of 88.0% and a kappa score of 88.6% for the multi-class DR grading task on the EyePACS dataset, and 98.5% AUC, 93.8% accuracy, 87.9% kappa, 90.7% recall, 94.6% precision, and 92.6% F1-score for referral versus non-referral classification on the Messidor dataset. Extensive experimental results on three challenging benchmarks demonstrated that the proposed approach achieves competitive performance in DR grading and lesion discovery from retinal fundus images compared with existing cutting-edge methods, and has good generalization capacity to unseen DR datasets. These promising results highlight its potential as an efficient and reliable tool to assist ophthalmologists in large-scale DR screening.
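The atrous (dilated) convolutions that sASPP randomizes enlarge the receptive field without adding parameters; ASPP runs several dilation rates in parallel and fuses the results. A single-channel NumPy sketch of dilated convolution (generic mechanics, not the paper's sASPP):

```python
import numpy as np

def dilated_conv1ch(img, kernel, rate):
    """Valid-mode single-channel 2D convolution with dilation (atrous) rate."""
    k = kernel.shape[0]
    span = (k - 1) * rate + 1          # effective receptive field size
    H, W = img.shape
    out = np.zeros((H - span + 1, W - span + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Sample the input with stride `rate` inside the window.
            patch = img[i:i + span:rate, j:j + span:rate]
            out[i, j] = (patch * kernel).sum()
    return out

img = np.arange(49, dtype=float).reshape(7, 7)
k = np.ones((3, 3))
# Larger rates see a wider context with the same 3x3 kernel.
print(dilated_conv1ch(img, k, 1).shape)  # (5, 5)
print(dilated_conv1ch(img, k, 2).shape)  # (3, 3)
```

Stochastically dropping or perturbing the parallel rates, as sASPP's name suggests, would act as a regularizer against co-adaptation of the branches.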
Affiliation(s)
- Feng Li
- School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
- Shiqing Tang
- School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
- Yuyang Chen
- School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
- Haidong Zou
- Shanghai Eye Disease Prevention & Treatment Center, Shanghai 200040, China
- Ophthalmology Center, Shanghai General Hospital, Shanghai 200080, China