1.
Zhao J, Zhu J, He J, Cao G, Dai C. Multi-label classification of retinal diseases based on fundus images using Resnet and Transformer. Med Biol Eng Comput 2024. [PMID: 38871856] [DOI: 10.1007/s11517-024-03144-6] [Received: 11/24/2023; Accepted: 05/27/2024]
Abstract
Retinal disorders are a major cause of irreversible vision loss, which can be mitigated through accurate and early diagnosis. Fundus images are conventionally the gold standard for detecting retinal diseases. In recent years, more and more researchers have employed deep learning methods to diagnose ophthalmic diseases from fundus photography datasets. Most of these studies focus on diagnosing a single disease, so the diagnosis of multiple diseases remains challenging. In this paper, we propose a framework that combines ResNet and Transformer for multi-label classification of retinal diseases. The model employs ResNet to extract image features, utilizes a Transformer to capture global information, and strengthens the relationships between categories through learnable label embeddings. On the publicly available Ocular Disease Intelligent Recognition (ODIR-5K) dataset, the proposed method achieves a mean average precision of 92.86%, an area under the curve (AUC) of 97.27%, and a recall of 90.62%, outperforming other state-of-the-art approaches to multi-label classification. The proposed method represents a significant advance in retinal disease diagnosis, offering a more accurate, efficient, and comprehensive model for detecting multiple retinal conditions.
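The label-embedding idea in this abstract can be sketched as follows: CNN feature maps are flattened into a token sequence, and one learnable query embedding per disease category attends to those tokens through a Transformer decoder, yielding one logit per label. This is a minimal PyTorch illustration under assumed sizes (`num_labels=8`, `dim=256`), not the authors' actual implementation.

```python
# Hypothetical sketch of a ResNet + Transformer multi-label head with
# learnable label embeddings; all module names and sizes are illustrative.
import torch
import torch.nn as nn

class LabelTransformerHead(nn.Module):
    def __init__(self, num_labels=8, dim=256, heads=4, layers=2):
        super().__init__()
        # one learnable query embedding per disease category
        self.label_emb = nn.Parameter(torch.randn(num_labels, dim))
        dec_layer = nn.TransformerDecoderLayer(dim, heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers=layers)
        self.cls = nn.Linear(dim, 1)  # one logit per label query

    def forward(self, feat):                      # feat: (B, C, H, W) from a ResNet
        tokens = feat.flatten(2).transpose(1, 2)  # (B, H*W, C) token sequence
        q = self.label_emb.unsqueeze(0).expand(feat.size(0), -1, -1)
        out = self.decoder(q, tokens)             # label queries attend to image tokens
        return self.cls(out).squeeze(-1)          # (B, num_labels) logits

head = LabelTransformerHead()
logits = head(torch.randn(2, 256, 7, 7))          # stand-in ResNet feature map
print(logits.shape)                               # one logit per label, per image
```

Applying a sigmoid to each logit and thresholding gives the multi-label prediction; training would use a binary cross-entropy loss per label.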
Affiliation(s)
- Jiaqing Zhao, Shanghai Institute of Technology, Shanghai, China
- Jianfeng Zhu, Shanghai Eye Disease Prevention and Control Center, Shanghai, China
- Jiangnan He, Shanghai Eye Disease Prevention and Control Center, Shanghai, China
- Guogang Cao, Shanghai Institute of Technology, Shanghai, China
- Cuixia Dai, Shanghai Institute of Technology, Shanghai, China
2.
Vafaeezadeh M, Behnam H, Gifani P. Ultrasound Image Analysis with Vision Transformers-Review. Diagnostics (Basel) 2024; 14:542. [PMID: 38473014] [DOI: 10.3390/diagnostics14050542] [Received: 12/30/2023; Revised: 02/22/2024; Accepted: 02/29/2024]
Abstract
Ultrasound (US) has become a widely used imaging modality in clinical practice, characterized by rapidly evolving technology, distinct advantages, and unique challenges, such as low imaging quality and high variability. There is a need for advanced automatic US image analysis methods that enhance diagnostic accuracy and objectivity. Vision transformers, a recent innovation in machine learning, have demonstrated significant potential across research fields, including general image analysis and computer vision, owing to their capacity to process large datasets and learn complex patterns. Their suitability for automatic US image analysis tasks such as classification, detection, and segmentation has been recognized. This review introduces vision transformers and discusses their applications in specific US image analysis tasks, while also addressing open challenges and potential future trends in medical US image analysis. Vision transformers have shown promise in improving the accuracy and efficiency of ultrasound image analysis and, as the technology progresses, are expected to play an increasingly important role in ultrasound-based diagnosis and treatment.
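The core mechanism behind the vision transformers this review surveys is patch tokenization: an ultrasound frame is split into fixed-size patches, each patch is flattened and linearly projected into a token, and the resulting sequence (plus a class token) is what the transformer encoder processes. A minimal NumPy illustration, with a random matrix standing in for the learned projection and all sizes chosen for illustration:

```python
# Patchify a toy single-channel "ultrasound" image into ViT-style tokens.
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((224, 224))             # toy single-channel frame
patch, dim = 16, 64
proj = rng.random((patch * patch, dim))  # stand-in for a learned projection

# split into non-overlapping 16x16 patches and flatten each one
patches = (img.reshape(224 // patch, patch, 224 // patch, patch)
              .transpose(0, 2, 1, 3)
              .reshape(-1, patch * patch))   # (196, 256)
tokens = patches @ proj                       # (196, 64) patch tokens
cls_token = np.zeros((1, dim))                # prepended classification token
seq = np.concatenate([cls_token, tokens])     # (197, 64) encoder input
print(seq.shape)
```

Everything downstream (self-attention, the classification head reading the class token) operates on this sequence rather than on the raw pixel grid.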
Affiliation(s)
- Majid Vafaeezadeh, Biomedical Engineering Department, School of Electrical Engineering, Iran University of Science and Technology, Tehran 1311416846, Iran
- Hamid Behnam, Biomedical Engineering Department, School of Electrical Engineering, Iran University of Science and Technology, Tehran 1311416846, Iran
- Parisa Gifani, Medical Sciences and Technologies Department, Science and Research Branch, Islamic Azad University, Tehran 1477893855, Iran
3.
Wan C, Hua R, Li K, Hong X, Fang D, Yang W. Automatic Diagnosis of Different Types of Retinal Vein Occlusion Based on Fundus Images. INT J INTELL SYST 2023; 2023:1-13. [DOI: 10.1155/2023/1587410]
Abstract
Retinal vein occlusion (RVO) is the second most common cause of blindness after diabetic retinopathy. Manual screening of fundus images to detect RVO is time-consuming. Deep-learning techniques have been used for RVO screening owing to their outstanding performance in many applications. However, unlike natural images, medical images contain smaller lesions, which demand a more elaborate approach. To provide patients with an accurate diagnosis followed by timely and effective treatment, we developed an intelligent method for automatic RVO screening on fundus images. Like a convolutional neural network, Swin Transformer learns a hierarchy of low- to high-level features; however, it extracts features from fundus images through attention modules, which emphasize the interrelationships among features. The model is more general, does not rely entirely on the data itself, and attends not only to local information but also propagates it from local to global context. To suppress overfitting, we adopt label smoothing, a regularization strategy that adds noise to the one-hot targets, reducing the weight of the true class label when computing the loss. A comparison of models using 5-fold cross-validation on our own datasets shows that Swin Transformer performs best. The overall classification accuracy is 98.75 ± 0.000, and the accuracies for identifying MRVO, CRVO, BRVO, and normal images with the proposed method are 94.49 ± 0.094, 99.98 ± 0.015, 98.88 ± 0.08, and 99.42 ± 0.012, respectively. The method will be useful for diagnosing and grading RVO from fundus images, with the potential to support patients' further diagnosis and treatment.
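The label-smoothing regularizer described in this abstract can be written in a few lines: the one-hot target keeps `1 - eps` on the true class and spreads the remaining `eps` uniformly over all classes, so the loss no longer rewards fully confident predictions. A NumPy sketch, with `eps=0.1` and the four RVO classes as illustrative choices:

```python
# Label smoothing: soften one-hot targets before computing the loss.
import numpy as np

def smooth_labels(one_hot, eps=0.1):
    k = one_hot.shape[-1]
    # true class keeps 1 - eps; the eps mass is spread over all k classes
    return one_hot * (1.0 - eps) + eps / k

y = np.array([0.0, 1.0, 0.0, 0.0])   # one-hot target, e.g. CRVO
print(smooth_labels(y))              # [0.025 0.925 0.025 0.025]
```

The smoothed vector still sums to 1, so it drops straight into a cross-entropy loss in place of the hard target.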
Affiliation(s)
- Cheng Wan, College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211100, China
- Rongrong Hua, College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211100, China
- Kunke Li, Shenzhen Eye Hospital, Jinan University, Shenzhen 518040, China
- Xiangqian Hong, Shenzhen Eye Hospital, Jinan University, Shenzhen 518040, China
- Dong Fang, Shenzhen Eye Hospital, Jinan University, Shenzhen 518040, China
- Weihua Yang, Shenzhen Eye Hospital, Jinan University, Shenzhen 518040, China
4.
Sun J, Wu B, Zhao T, Gao L, Xie K, Lin T, Sui J, Li X, Wu X, Ni X. Classification for thyroid nodule using ViT with contrastive learning in ultrasound images. Comput Biol Med 2023; 152:106444. [PMID: 36565481] [DOI: 10.1016/j.compbiomed.2022.106444] [Received: 07/05/2022; Revised: 12/01/2022; Accepted: 12/15/2022]
Abstract
The lack of representative features distinguishing benign nodules, especially level 3 of the Thyroid Imaging Reporting and Data System (TI-RADS), from malignant nodules limits diagnostic accuracy, leading to inconsistent interpretation, overdiagnosis, and unnecessary biopsies. We propose a Vision-Transformer-based (ViT) thyroid nodule classification model using contrastive learning, called TC-ViT, to improve diagnostic accuracy and the specificity of biopsy recommendations. ViT explores the global features of thyroid nodules well, and nodule images are used as ROIs to enhance its local features. Contrastive learning minimizes the representation distance between nodules of the same category, enhances the consistency of global and local features, and enables accurate diagnosis of TI-RADS 3 or malignant nodules. The test results achieve an accuracy of 86.9%, and the evaluation metrics show that the network outperforms other classical deep-learning networks in classification performance. TC-ViT can automatically classify TI-RADS 3 and malignant nodules on ultrasound images, and can serve as a key step in computer-aided diagnosis for comprehensive analysis and accurate diagnosis. The code will be available at https://github.com/Jiawei217/TC-ViT.
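The contrastive objective this abstract describes, pulling embeddings of same-category nodules together and pushing different categories apart, can be sketched as a supervised contrastive loss. The temperature, the toy embeddings, and the loss variant below are illustrative assumptions, not the paper's actual settings:

```python
# Supervised contrastive loss sketch: for each sample, same-label pairs
# are the positives; a larger share of similarity mass on positives
# means a lower loss.
import numpy as np

def contrastive_loss(z, labels, tau=0.5):
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # L2-normalize embeddings
    sim = np.exp(z @ z.T / tau)                       # exponentiated similarities
    np.fill_diagonal(sim, 0.0)                        # ignore self-pairs
    loss = 0.0
    for i in range(len(z)):
        pos = sim[i][(labels == labels[i]) & (np.arange(len(z)) != i)]
        loss += -np.log(pos.sum() / sim[i].sum())
    return loss / len(z)

z = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
labels = np.array([0, 0, 1, 1])      # same-class embeddings are already close
print(contrastive_loss(z, labels))   # low loss for this well-separated toy set
```

With labels assigned so that the close pairs are in different classes, the same embeddings yield a higher loss, which is exactly the pressure that drives same-category representations together during training.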
Affiliation(s)
- Jiawei Sun, The Affiliated Changzhou NO.2 People's Hospital of Nanjing Medical University, Changzhou 213003, China; Jiangsu Province Engineering Research Center of Medical Physics, Changzhou 213003, China; Center of Medical Physics, Nanjing Medical University, Changzhou 213003, China
- Bobo Wu, The Affiliated Changzhou NO.2 People's Hospital of Nanjing Medical University, Changzhou 213003, China
- Tong Zhao, The Affiliated Changzhou NO.2 People's Hospital of Nanjing Medical University, Changzhou 213003, China
- Liugang Gao, The Affiliated Changzhou NO.2 People's Hospital of Nanjing Medical University, Changzhou 213003, China; Jiangsu Province Engineering Research Center of Medical Physics, Changzhou 213003, China; Center of Medical Physics, Nanjing Medical University, Changzhou 213003, China
- Kai Xie, The Affiliated Changzhou NO.2 People's Hospital of Nanjing Medical University, Changzhou 213003, China; Jiangsu Province Engineering Research Center of Medical Physics, Changzhou 213003, China; Center of Medical Physics, Nanjing Medical University, Changzhou 213003, China
- Tao Lin, The Affiliated Changzhou NO.2 People's Hospital of Nanjing Medical University, Changzhou 213003, China; Jiangsu Province Engineering Research Center of Medical Physics, Changzhou 213003, China; Center of Medical Physics, Nanjing Medical University, Changzhou 213003, China
- Jianfeng Sui, The Affiliated Changzhou NO.2 People's Hospital of Nanjing Medical University, Changzhou 213003, China; Jiangsu Province Engineering Research Center of Medical Physics, Changzhou 213003, China; Center of Medical Physics, Nanjing Medical University, Changzhou 213003, China
- Xiaoqin Li, The Affiliated Changzhou NO.2 People's Hospital of Nanjing Medical University, Changzhou 213003, China
- Xiaojin Wu, Oncology Department, Xuzhou NO.1 People's Hospital, Xuzhou 221000, China
- Xinye Ni, The Affiliated Changzhou NO.2 People's Hospital of Nanjing Medical University, Changzhou 213003, China; Jiangsu Province Engineering Research Center of Medical Physics, Changzhou 213003, China; Center of Medical Physics, Nanjing Medical University, Changzhou 213003, China