1
Sun Y, Li P, Xu H, Wang R. Structural prior-driven feature extraction with gradient-momentum combined optimization for convolutional neural network image classification. Neural Netw 2024; 179:106511. [PMID: 39146718] [DOI: 10.1016/j.neunet.2024.106511] [Received: 01/05/2024] [Revised: 06/12/2024] [Accepted: 07/03/2024] [Indexed: 08/17/2024]
Abstract
Recent image classification efforts have achieved certain success by incorporating prior information such as labels and logical rules to learn discriminative features. However, these methods overlook the variability of features, resulting in feature inconsistency and fluctuations in model parameter updates, which further contribute to decreased image classification accuracy and model instability. To address this issue, this paper proposes a novel method combining structural prior-driven feature extraction with gradient-momentum (SPGM), from the perspectives of consistent feature learning and precise parameter updates, to enhance the accuracy and stability of image classification. Specifically, SPGM leverages a structural prior-driven feature extraction (SPFE) approach to calculate gradients of multi-level features and original images to construct structural information, which is then transformed into prior knowledge to drive the network to learn features consistent with the original images. Additionally, an optimization strategy integrating gradients and momentum (GMO) is introduced, dynamically adjusting the direction and step size of parameter updates based on the angle and norm of the sum of gradients and momentum, enabling precise model parameter updates. Extensive experiments on CIFAR10 and CIFAR100 datasets demonstrate that the SPGM method significantly reduces the top-1 error rate in image classification, enhances the classification performance, and outperforms state-of-the-art methods.
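The abstract only states that GMO "dynamically adjusts the direction and step size of parameter updates based on the angle and norm of the sum of gradients and momentum"; the exact update rule is given in the paper. As a hedged illustration of that idea only (the cosine scaling and constants below are assumptions, not the authors' formula), a NumPy sketch:

```python
import numpy as np

def gmo_step(param, grad, momentum, lr=0.01, beta=0.9, eps=1e-12):
    """Illustrative GMO-style update (hypothetical): the direction comes
    from the sum of gradient and momentum, and the step size is scaled
    by how well the two agree (cosine of the angle between them)."""
    momentum = beta * momentum + grad          # accumulate momentum
    combined = grad + momentum                 # sum of gradient and momentum
    norm = np.linalg.norm(combined)            # its norm yields a unit direction
    # cosine of the angle between the current gradient and the momentum
    cos = grad @ momentum / (np.linalg.norm(grad) * np.linalg.norm(momentum) + eps)
    step = lr * (1.0 + cos) / 2.0              # larger step when directions align
    param = param - step * combined / (norm + eps)
    return param, momentum
```

On a simple quadratic loss, where gradient and momentum stay aligned, this shrinks the parameter norm at every step.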
Affiliation(s)
- Yunyun Sun
- School of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing, 210023, Jiangsu, China.
- Peng Li
- School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, 210023, Jiangsu, China; Jiangsu High Technology Research Key Laboratory for Wireless Sensor Networks, Nanjing, 210023, Jiangsu, China.
- He Xu
- School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, 210023, Jiangsu, China; Jiangsu High Technology Research Key Laboratory for Wireless Sensor Networks, Nanjing, 210023, Jiangsu, China.
- Ruchuan Wang
- School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, 210023, Jiangsu, China; Jiangsu High Technology Research Key Laboratory for Wireless Sensor Networks, Nanjing, 210023, Jiangsu, China.
2
Grzybowski A, Jin K, Zhou J, Pan X, Wang M, Ye J, Wong TY. Retina Fundus Photograph-Based Artificial Intelligence Algorithms in Medicine: A Systematic Review. Ophthalmol Ther 2024; 13:2125-2149. [PMID: 38913289] [PMCID: PMC11246322] [DOI: 10.1007/s40123-024-00981-4] [Received: 02/19/2024] [Accepted: 04/15/2024] [Indexed: 06/25/2024]
Abstract
We conducted a systematic review of research in artificial intelligence (AI) for retinal fundus photographic images. We highlighted the use of various AI algorithms, including deep learning (DL) models, for application in ophthalmic and non-ophthalmic (i.e., systemic) disorders. We found that the use of AI algorithms for the interpretation of retinal images, compared to clinical data and physician experts, represents an innovative solution with demonstrated superior accuracy in identifying many ophthalmic (e.g., diabetic retinopathy (DR), age-related macular degeneration (AMD), optic nerve disorders), and non-ophthalmic disorders (e.g., dementia, cardiovascular disease). There has been a significant amount of clinical and imaging data for this research, leading to the potential incorporation of AI and DL for automated analysis. AI has the potential to transform healthcare by improving accuracy, speed, and workflow, lowering cost, increasing access, reducing mistakes, and transforming healthcare worker education and training.
Affiliation(s)
- Andrzej Grzybowski
- Institute for Research in Ophthalmology, Foundation for Ophthalmology Development, Poznań, Poland.
- Kai Jin
- Eye Center, School of Medicine, The Second Affiliated Hospital, Zhejiang University, Hangzhou, Zhejiang, China.
- Jingxin Zhou
- Eye Center, School of Medicine, The Second Affiliated Hospital, Zhejiang University, Hangzhou, Zhejiang, China.
- Xiangji Pan
- Eye Center, School of Medicine, The Second Affiliated Hospital, Zhejiang University, Hangzhou, Zhejiang, China.
- Meizhu Wang
- Eye Center, School of Medicine, The Second Affiliated Hospital, Zhejiang University, Hangzhou, Zhejiang, China.
- Juan Ye
- Eye Center, School of Medicine, The Second Affiliated Hospital, Zhejiang University, Hangzhou, Zhejiang, China.
- Tien Y Wong
- School of Clinical Medicine, Tsinghua Medicine, Tsinghua University, Beijing, China.
- Singapore Eye Research Institute, Singapore National Eye Center, Singapore, Singapore.
3
Zhao J, Wan C, Li J, Zhang Z, Yang W, Li K. NCME-Net: Nuclear cataract mask encoder network for intelligent grading using self-supervised learning from anterior segment photographs. Heliyon 2024; 10:e34726. [PMID: 39149020] [PMCID: PMC11324988] [DOI: 10.1016/j.heliyon.2024.e34726] [Received: 04/16/2024] [Revised: 07/05/2024] [Accepted: 07/15/2024] [Indexed: 08/17/2024]
Abstract
Cataracts are a leading cause of blindness worldwide, making accurate diagnosis and effective surgical planning critical. However, grading the severity of the lens nucleus is challenging because deep learning (DL) models pretrained using ImageNet perform poorly when applied directly to medical data due to the limited availability of labeled medical images and high interclass similarity. Self-supervised pretraining offers a solution by circumventing the need for cost-intensive data annotations and bridging domain disparities. In this study, to address the challenges of intelligent grading, we proposed a hybrid model called nuclear cataract mask encoder network (NCME-Net), which utilizes self-supervised pretraining for the four-class analysis of nuclear cataract severity. A total of 792 images of nuclear cataracts were categorized into the training set (533 images), the validation set (139 images), and the test set (100 images). NCME-Net achieved a diagnostic accuracy of 91.0% on the test set, a 5.0% improvement over the best-performing DL model (ResNet50). Experimental results demonstrate NCME-Net's ability to distinguish between cataract severities, particularly in scenarios with limited samples, making it a valuable tool for intelligently diagnosing cataracts. In addition, the effect of different self-supervised tasks on the model's ability to capture the intrinsic structure of the data was studied. Findings indicate that image restoration tasks significantly enhance semantic information extraction.
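The abstract does not specify which image-restoration pretext task was used. As a minimal illustration of one common setup (random patch masking, with the hidden pixels as reconstruction targets), the function and parameters below are assumptions, not the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)

def mask_patches(img, patch=16, mask_ratio=0.5):
    """Hypothetical restoration pretext: zero out a random subset of
    square patches and return the masked image plus a boolean map of
    which pixels were hidden (the reconstruction targets)."""
    h, w = img.shape[:2]
    masked = img.copy()
    hidden = np.zeros((h, w), dtype=bool)
    for y in range(0, h - h % patch, patch):
        for x in range(0, w - w % patch, patch):
            if rng.random() < mask_ratio:       # hide this patch
                masked[y:y+patch, x:x+patch] = 0
                hidden[y:y+patch, x:x+patch] = True
    return masked, hidden
```

A model pretrained to fill in the hidden regions must learn the lens structure, which is the intuition behind the finding that restoration tasks enhance semantic information extraction.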
Affiliation(s)
- Jiani Zhao
- College of Electronic and Information Engineering / College of Integrated Circuits, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu, 211106, China.
- Cheng Wan
- College of Electronic and Information Engineering / College of Integrated Circuits, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu, 211106, China.
- Jiajun Li
- Eye Hospital, Nanjing Medical University, Nanjing, Jiangsu, 210029, China.
- Zhe Zhang
- Shenzhen Eye Institute, Shenzhen Eye Hospital, Jinan University, Shenzhen, Guangdong, 518040, China.
- Weihua Yang
- Shenzhen Eye Institute, Shenzhen Eye Hospital, Jinan University, Shenzhen, Guangdong, 518040, China.
- Keran Li
- Eye Hospital, Nanjing Medical University, Nanjing, Jiangsu, 210029, China.
4
Yang B, Cao L, Zhao H, Li H, Liu H, Wang N. Adaptive enhancement of cataractous retinal images for contrast standardization. Med Biol Eng Comput 2024; 62:357-369. [PMID: 37848753] [DOI: 10.1007/s11517-023-02937-5] [Received: 05/08/2023] [Accepted: 09/09/2023] [Indexed: 10/19/2023]
Abstract
Cataract affects the quality of fundus images, especially their contrast, due to lens opacity. In this paper, we propose a scheme that enhances cataractous retinal images of different severities to the same contrast as normal images, automatically choosing the suitable enhancement model based on cataract grading. A multi-level cataract dataset is constructed via a degradation model with quantified contrast. An adaptive enhancement strategy then chooses among three enhancement networks based on a blurriness classifier, and a blurriness grading loss is introduced in the enhancement models to further constrain the contrast of the enhanced images. At test time, the well-trained blurriness classifier assists in selecting the enhancement network with the appropriate enhancement ability. Our method performs best on the synthetic paired data in PSNR, SSIM, and FSIM, and achieves the best PIQE and FID on 406 clinical fundus images. It improves the introduced [Formula: see text] score by 7.78% over the second-best method without over-enhancement according to [Formula: see text], demonstrating that our enhancement is close to the high-quality images. Visual evaluation on multiple clinical datasets also shows the applicability of our method to different degrees of blurriness. The proposed method can benefit clinical diagnosis and improve the performance of computer-aided algorithms such as vessel tracking and vessel segmentation.
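The adaptive selection described above amounts to a dispatch from the predicted blurriness grade to one of the three enhancement networks. A minimal sketch, where the grade values and callables are placeholders rather than the paper's interface:

```python
def enhance(image, blurriness_classifier, enhancers):
    """Route an image to the enhancement network matching its
    predicted blurriness (cataract) grade. `enhancers` maps each
    grade to a trained enhancement network (placeholder callables)."""
    grade = blurriness_classifier(image)   # e.g. 0 / 1 / 2 for mild / moderate / severe
    return enhancers[grade](image)
```

The design choice is that the classifier, not the user, picks the model, so each network only needs to handle one narrow band of degradation.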
Affiliation(s)
- Bingyu Yang
- Beijing Institute of Technology, Beijing, 100081, China
- Lvchen Cao
- School of Artificial Intelligence, Henan University, Zhengzhou, 450046, China.
- He Zhao
- Beijing Institute of Technology, Beijing, 100081, China.
- Huiqi Li
- Beijing Institute of Technology, Beijing, 100081, China.
- Hanruo Liu
- Beijing Institute of Ophthalmology, Beijing Tongren Hospital, Capital Medical University, Beijing, 100730, China.
- Ningli Wang
- Beijing Institute of Ophthalmology, Beijing Tongren Hospital, Capital Medical University, Beijing, 100730, China.
5
Li Z, Xu M, Yang X, Han Y, Wang J. A Multi-Label Detection Deep Learning Model with Attention-Guided Image Enhancement for Retinal Images. Micromachines 2023; 14:705. [PMID: 36985112] [PMCID: PMC10054796] [DOI: 10.3390/mi14030705] [Received: 01/07/2023] [Revised: 03/05/2023] [Accepted: 03/20/2023] [Indexed: 06/18/2023]
Abstract
At present, multi-disease fundus image classification tasks still suffer from small data volumes, uneven class distributions, and low classification accuracy. To address the large data demand of deep learning models, a multi-disease fundus image classification ensemble model based on gradient-weighted class activation mapping (Grad-CAM) is proposed. The model uses VGG19 and ResNet50 as the classification networks. Grad-CAM serves as a data augmentation module that produces activation maps from a network convolutional layer's output; both the augmented and the original data are used as model input. The augmentation module guides the model to learn the feature differences of fundus lesions and enhances the robustness of the classification model. Model fine-tuning and transfer learning are used to improve the accuracy of the multiple classifiers. The proposed method is evaluated on the RFMiD (Retinal Fundus Multi-Disease Image Dataset) dataset, and an ablation experiment was performed. Compared with other methods, the accuracy, precision, and recall of this model are 97%, 92%, and 81%, respectively. The resulting activation maps show the areas the model attends to for classification, making the classification network easier to interpret.
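Grad-CAM itself reduces to a ReLU-clipped weighted sum of a convolutional layer's feature maps, with channel weights obtained by global-average-pooling the class-score gradients. A small NumPy sketch of the standard computation (not the paper's implementation):

```python
import numpy as np

def grad_cam(feature_maps, grads):
    """Grad-CAM heat map from a conv layer's activations (C, H, W) and
    the gradients of the class score w.r.t. those activations."""
    weights = grads.mean(axis=(1, 2))                  # alpha_c: pooled gradients per channel
    cam = np.tensordot(weights, feature_maps, axes=1)  # weighted sum over channels -> (H, W)
    cam = np.maximum(cam, 0)                           # ReLU keeps positive class evidence
    return cam / cam.max() if cam.max() > 0 else cam   # normalize to [0, 1]
```

The resulting map can be upsampled to image resolution and overlaid on the fundus photograph, which is how the paper derives both the augmented inputs and the interpretability maps.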
6
Zéboulon P, Panthier C, Rouger H, Bijon J, Ghazal W, Gatinel D. Development and validation of a pixel wise deep learning model to detect cataract on swept-source optical coherence tomography images. J Optom 2022; 15 Suppl 1:S43-S49. [PMID: 36229338] [PMCID: PMC9732477] [DOI: 10.1016/j.optom.2022.08.003] [Received: 05/30/2022] [Revised: 08/17/2022] [Accepted: 08/21/2022] [Indexed: 06/16/2023]
Abstract
PURPOSE The diagnosis of cataract is mostly clinical, and there is a lack of objective, specific tools to detect and grade it automatically. The goal of this study was to develop and validate a deep learning model to detect and localize cataract on swept-source optical coherence tomography (SS-OCT) images. METHODS We trained a convolutional network to detect cataract at the pixel level from 504 SS-OCT images of clear-lens and cataract patients. The model was then validated on 1326 different images of 114 patients. The output of the model is a map representing the probability of cataract for each pixel of the image. We calculated the Cataract Fraction (CF), defined as the number of pixels classified as "cataract" divided by the number of pixels representing the lens, for each image. Receiver operating characteristic (ROC) curves were plotted, and the area under the curve (ROC AUC), sensitivity, and specificity to detect cataract were calculated. RESULTS In the validation set, mean CF was 0.024 ± 0.077 and 0.479 ± 0.230 in the clear-lens and cataract groups, respectively (p < 0.001). ROC AUC was 0.98 with an optimal CF threshold of 0.14. Using that threshold, sensitivity and specificity to detect cataract were 94.4% and 94.7%, respectively. CONCLUSION We developed an automatic detection tool for cataract on SS-OCT images. Probability maps of cataract on the images provide an additional tool to help the physician in diagnosis and surgical planning.
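The CF statistic and the image-level decision at the reported 0.14 threshold are straightforward to compute from a per-pixel probability map and a lens mask. In the sketch below, only the 0.14 CF cutoff comes from the abstract; the 0.5 per-pixel cutoff and the function names are assumptions:

```python
import numpy as np

def cataract_fraction(prob_map, lens_mask, pixel_threshold=0.5):
    """CF = pixels flagged as cataract / pixels inside the lens.
    The 0.5 per-pixel cutoff is an assumed default, not from the paper."""
    cataract_pixels = ((prob_map >= pixel_threshold) & lens_mask).sum()
    return cataract_pixels / lens_mask.sum()

def has_cataract(prob_map, lens_mask, cf_threshold=0.14):
    """Image-level decision at the optimal CF threshold reported (0.14)."""
    return cataract_fraction(prob_map, lens_mask) >= cf_threshold
```

With the reported group means (0.024 vs 0.479), the 0.14 threshold sits well between them, which is consistent with the high sensitivity and specificity observed.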
Affiliation(s)
- Pierre Zéboulon
- Department of Ophthalmology, Rothschild Foundation, 25 Rue Manin, Paris 75019, France.
- Christophe Panthier
- Department of Ophthalmology, Rothschild Foundation, 25 Rue Manin, Paris 75019, France.
- Hélène Rouger
- Department of Ophthalmology, Rothschild Foundation, 25 Rue Manin, Paris 75019, France.
- Jacques Bijon
- Department of Ophthalmology, Rothschild Foundation, 25 Rue Manin, Paris 75019, France.
- Wassim Ghazal
- Department of Ophthalmology, Rothschild Foundation, 25 Rue Manin, Paris 75019, France.
- Damien Gatinel
- Department of Ophthalmology, Rothschild Foundation, 25 Rue Manin, Paris 75019, France.