1. Huang L, Zhang N, Yi Y, Zhou W, Zhou B, Dai J, Wang J. SAMCF: Adaptive global style alignment and multi-color spaces fusion for joint optic cup and disc segmentation. Comput Biol Med 2024; 178:108639. PMID: 38878394. DOI: 10.1016/j.compbiomed.2024.108639. [Received: 01/26/2024; Revised: 04/21/2024; Accepted: 05/18/2024]
Abstract
The optic cup (OC) and optic disc (OD) are two critical structures in retinal fundus images, and their relative positions and sizes are essential for effectively diagnosing eye diseases. With the success of deep learning in computer vision, deep learning-based segmentation models have been widely used for joint optic cup and disc segmentation. However, three prominent issues impact segmentation performance. First, significant differences among datasets collected from various institutions, protocols, and devices degrade model performance. Second, images with only RGB information struggle to counteract interference from brightness variations, which limits color representation capability. Third, existing methods typically ignore edge perception and struggle to produce clear, smooth edge segmentation results. To address these drawbacks, we propose a novel framework based on Style Alignment and Multi-Color Fusion (SAMCF) for joint OC and OD segmentation. First, we introduce a domain generalization method that generates uniformly styled images without damaging image content, mitigating domain shift. Next, based on multiple color spaces, we propose a feature extraction and fusion network that handles brightness variation interference and improves color representation capability. Finally, an edge-aware loss is designed to produce fine edge segmentation results. Experiments on three public datasets, DGS, RIM, and REFUGE, demonstrate that SAMCF outperforms existing state-of-the-art methods and generalizes well across multiple retinal fundus image datasets.
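The edge-aware idea in the abstract above can be illustrated with a minimal, hypothetical sketch (SAMCF's actual loss formulation is not given in the abstract): boundary pixels of the ground-truth mask receive a larger weight in a per-pixel binary cross-entropy, pushing the model toward crisper OC/OD contours. The `edge_weight` value and the 4-neighbour boundary test are assumptions for illustration only.

```python
# Hypothetical edge-aware weighting sketch, NOT the paper's exact loss:
# pixels adjacent to a label boundary get a larger weight in the loss.
import math

def boundary_weights(mask, edge_weight=5.0):
    """Return a weight map: `edge_weight` on boundary pixels, 1.0 elsewhere."""
    h, w = len(mask), len(mask[0])
    weights = [[1.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # 4-neighbour test: a pixel is on a boundary if any neighbour
            # carries a different label.
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] != mask[y][x]:
                    weights[y][x] = edge_weight
                    break
    return weights

def weighted_bce(pred, mask, weights, eps=1e-7):
    """Boundary-weighted binary cross-entropy, averaged over pixels."""
    total, n = 0.0, 0
    for p_row, m_row, w_row in zip(pred, mask, weights):
        for p, m, w in zip(p_row, m_row, w_row):
            p = min(max(p, eps), 1 - eps)  # clamp for numerical safety
            total += -w * (m * math.log(p) + (1 - m) * math.log(1 - p))
            n += 1
    return total / n
```

With a vertical-edge mask such as `[[0, 1], [0, 1]]`, every pixel touches the boundary, so confident mistakes near the contour are penalized five times as hard as interior ones.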
Affiliation(s)
- Longjun Huang
- School of Software, Nanchang Key Laboratory for Blindness and Visual Impairment Prevention Technology and Equipment, Jiangxi Normal University, Nanchang, 330022, China
- Ningyi Zhang
- School of Software, Nanchang Key Laboratory for Blindness and Visual Impairment Prevention Technology and Equipment, Jiangxi Normal University, Nanchang, 330022, China
- Yugen Yi
- School of Software, Nanchang Key Laboratory for Blindness and Visual Impairment Prevention Technology and Equipment, Jiangxi Normal University, Nanchang, 330022, China
- Wei Zhou
- College of Computer Science, Shenyang Aerospace University, Shenyang, 110136, China
- Bin Zhou
- School of Software, Nanchang Key Laboratory for Blindness and Visual Impairment Prevention Technology and Equipment, Jiangxi Normal University, Nanchang, 330022, China
- Jiangyan Dai
- School of Computer Engineering, Weifang University, Weifang, 261061, China
- Jianzhong Wang
- College of Information Science and Technology, Northeast Normal University, Changchun, 130117, China
2. Chen Y, Bai Y, Zhang Y. Optic disc and cup segmentation for glaucoma detection using Attention U-Net incorporating residual mechanism. PeerJ Comput Sci 2024; 10:e1941. PMID: 38660163. PMCID: PMC11042003. DOI: 10.7717/peerj-cs.1941. [Received: 08/20/2023; Accepted: 02/26/2024]
Abstract
Glaucoma is a common eye disease that can cause blindness, and accurate segmentation of the optic disc and optic cup is crucial for its diagnosis; artificial-intelligence models can assist doctors in improving detection performance. In this article, U-Net is used as the backbone network, and attention and residual modules are integrated to construct an end-to-end convolutional neural network for optic disc and cup segmentation. The U-Net backbone infers the basic positions of the optic disc and cup, the attention module enhances the model's ability to represent and extract their features, and the residual module alleviates the gradient vanishing or explosion that may occur during feature representation. The proposed model is trained and tested on the DRISHTI-GS1 dataset. Results show that, compared with the original U-Net, the model separates the optic disc and cup more effectively in terms of overlap error, sensitivity, and specificity.
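The attention gating described above can be sketched in a single-channel, scalar-weight form. In the real network the weights `w_x`, `w_g`, and `w_psi` are learned 1x1 convolutions; the scalar values here are hypothetical stand-ins for illustration.

```python
# Minimal single-channel sketch of an additive attention gate in the
# spirit of Attention U-Net; weights are hypothetical scalars, not the
# paper's learned parameters.
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def attention_gate(x, g, w_x=1.0, w_g=1.0, w_psi=1.0, bias=0.0):
    """Scale skip-connection features `x` by an attention map computed
    from `x` and the coarser gating signal `g` (same spatial size)."""
    gated = []
    for x_row, g_row in zip(x, g):
        out_row = []
        for xv, gv in zip(x_row, g_row):
            additive = max(w_x * xv + w_g * gv + bias, 0.0)  # ReLU
            alpha = sigmoid(w_psi * additive)                # gate in (0, 1)
            out_row.append(xv * alpha)                       # suppress or pass
        gated.append(out_row)
    return gated
```

Where the gating signal agrees with the skip feature, `alpha` approaches 1 and the feature passes through; where it disagrees, the feature is attenuated, which is how the gate focuses the decoder on disc/cup regions.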
Affiliation(s)
- Yuanyuan Chen
- School of Information Technology, Luoyang Normal University, Luoyang, China
- Yongpeng Bai
- School of Information Technology, Luoyang Normal University, Luoyang, China
- Yifan Zhang
- School of Information Technology, Luoyang Normal University, Luoyang, China
3. Zhang K, Lin PC, Pan J, Shao R, Xu PX, Cao R, Wu CG, Crookes D, Hua L, Wang L. DeepmdQCT: A multitask network with domain invariant features and comprehensive attention mechanism for quantitative computer tomography diagnosis of osteoporosis. Comput Biol Med 2024; 170:107916. PMID: 38237237. DOI: 10.1016/j.compbiomed.2023.107916. [Received: 08/30/2023; Revised: 12/18/2023; Accepted: 12/29/2023]
Abstract
In the medical field, applying machine learning to the automatic diagnosis and monitoring of osteoporosis often faces domain-adaptation challenges in drug-therapy research. Existing neural networks for osteoporosis diagnosis may suffer performance degradation on new data domains due to changes in radiation dose and equipment. To address this issue, we propose a new multi-domain diagnostic method for quantitative computed tomography (QCT) images, called DeepmdQCT. The method adopts a domain-invariant feature strategy and integrates a comprehensive attention mechanism to guide the fusion of global and local features, effectively improving diagnostic performance on multi-domain CT images. In experiments on a self-created OQCT dataset, the average accuracy reached 91% for dose-domain images and 90.5% for device-domain images, and the method estimated bone density values with a fit of 0.95 to the gold standard. Achieving both high diagnostic accuracy across dose and equipment domains and reliable bone density estimation is crucial for evaluating the effectiveness of osteoporosis drug treatment. We further validated the effectiveness of the architecture's feature extraction on three publicly available datasets, and we encourage applying DeepmdQCT to a wider range of medical image analysis fields to improve performance on multi-domain images.
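The multitask objective implied by the abstract, a diagnostic classification term plus a bone-density regression term, might be combined as below. The weighting factor `lam` and the choice of cross-entropy plus squared error are assumptions for illustration; the paper's actual loss is not stated in the abstract.

```python
# Hypothetical sketch of a diagnosis + bone-density multitask objective;
# the real DeepmdQCT loss and weighting are not specified in the abstract.
import math

def cross_entropy(probs, label, eps=1e-7):
    """Negative log-likelihood of the true class."""
    return -math.log(max(probs[label], eps))

def multitask_loss(class_probs, label, bmd_pred, bmd_true, lam=0.5):
    """Classification term plus `lam`-weighted density regression term."""
    cls_term = cross_entropy(class_probs, label)
    reg_term = (bmd_pred - bmd_true) ** 2  # squared error on bone density
    return cls_term + lam * reg_term
```

A shared encoder feeding both heads is what makes such a joint loss useful: the regression signal regularizes the diagnostic features and vice versa.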
Affiliation(s)
- Kun Zhang
- School of Electrical Engineering, Nantong University, Nantong, Jiangsu, 226001, China; Nantong Key Laboratory of Intelligent Control and Intelligent Computing, Nantong, Jiangsu, 226001, China; Nantong Key Laboratory of Intelligent Medicine Innovation and Transformation, Nantong, Jiangsu, 226001, China
- Peng-Cheng Lin
- School of Electrical Engineering, Nantong University, Nantong, Jiangsu, 226001, China
- Jing Pan
- Department of Radiology, Affiliated Hospital 2 of Nantong University, Nantong, Jiangsu, 226001, China
- Rui Shao
- School of Electrical Engineering, Nantong University, Nantong, Jiangsu, 226001, China
- Pei-Xia Xu
- School of Electrical Engineering, Nantong University, Nantong, Jiangsu, 226001, China
- Rui Cao
- Department of Radiology, Affiliated Hospital 2 of Nantong University, Nantong, Jiangsu, 226001, China
- Cheng-Gang Wu
- School of Electrical Engineering, Nantong University, Nantong, Jiangsu, 226001, China
- Danny Crookes
- School of Electronics, Electrical Engineering and Computer Science, Queen's University Belfast, Belfast, BT7 1NN, UK
- Liang Hua
- School of Electrical Engineering, Nantong University, Nantong, Jiangsu, 226001, China
- Lin Wang
- Department of Radiology, Affiliated Hospital 2 of Nantong University, Nantong, Jiangsu, 226001, China
4. Wang C, Wang L, Wang N, Wei X, Feng T, Wu M, Yao Q, Zhang R. CFATransUnet: Channel-wise cross fusion attention and transformer for 2D medical image segmentation. Comput Biol Med 2024; 168:107803. PMID: 38064854. DOI: 10.1016/j.compbiomed.2023.107803. [Received: 07/29/2023; Revised: 11/23/2023; Accepted: 11/29/2023]
Abstract
Medical image segmentation currently faces challenges in effectively extracting and fusing long-distance and local semantic information, and in mitigating or eliminating semantic gaps during encoding and decoding. To alleviate these two problems, we propose a new U-shaped network, CFATransUnet, with Transformer and CNN blocks as the backbone and a Channel-wise Cross Fusion Attention and Transformer (CCFAT) module comprising a Channel-wise Cross Fusion Transformer (CCFT) and Channel-wise Cross Fusion Attention (CCFA). Specifically, Transformer and CNN blocks build the encoder and decoder for adequate extraction and fusion of long-range and local semantic features. The CCFT module uses self-attention to reintegrate semantic information from different stages into cross-level global features, reducing semantic asymmetry between levels. The CCFA module adaptively learns the importance of each feature channel from a global perspective, strengthening effective information and suppressing unimportant features to mitigate semantic gaps. Together, CCFT and CCFA guide the fusion of features at different levels more powerfully from a global perspective, and the consistent architecture of the encoder and decoder further alleviates the semantic gap. Experimental results show that CFATransUnet achieves state-of-the-art performance on four datasets. The code is available at https://github.com/CPU0808066/CFATransUnet.
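CCFA's exact channel-reweighting formulation is not given in the abstract, but a squeeze-and-excitation-style sketch conveys the general mechanism it describes: pool each channel to a scalar "global view", pass it through a gate (a learned MLP in practice; a plain sigmoid with a `gain` parameter here, as an assumption), and rescale the channel.

```python
# Squeeze-and-excitation-style sketch of channel attention; a simplified
# stand-in for CCFA, whose exact form the abstract does not specify.
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def channel_attention(feats, gain=1.0):
    """`feats` is a [channels][height][width] nested list."""
    # Squeeze: global average pool per channel (the "global perspective").
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
              for ch in feats]
    # Excite: per-channel gate in (0, 1); a learned MLP in real networks.
    gates = [sigmoid(gain * p) for p in pooled]
    # Rescale: amplify informative channels, suppress unimportant ones.
    return [[[v * g for v in row] for row in ch]
            for ch, g in zip(feats, gates)]
```

Channels whose pooled response is strong keep most of their magnitude, while weakly responding channels are damped toward zero, which is the "suppressing non-important features" behavior the abstract attributes to CCFA.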
Affiliation(s)
- Cheng Wang
- Department of Optical Science and Engineering, Fudan University, Shanghai 200433, China
- Le Wang
- Academy for Engineering and Technology, Fudan University, Shanghai 200433, China
- Nuoqi Wang
- Department of Optical Science and Engineering, Fudan University, Shanghai 200433, China
- Xiaoling Wei
- Department of Endodontics, Shanghai Stomatological Hospital, Fudan University, Shanghai 200001, China
- Ting Feng
- Academy for Engineering and Technology, Fudan University, Shanghai 200433, China
- Minfeng Wu
- Department of Dermatology, Huadong Hospital Affiliated to Fudan University, Shanghai 200040, China
- Qi Yao
- Academy for Engineering and Technology, Fudan University, Shanghai 200433, China
- Rongjun Zhang
- Department of Optical Science and Engineering, Fudan University, Shanghai 200433, China; Academy for Engineering and Technology, Fudan University, Shanghai 200433, China; Zhuhai Fudan Innovation Institute, Zhuhai 519031, China
5. Hwang EE, Chen D, Han Y, Jia L, Shan J. Multi-Dataset Comparison of Vision Transformers and Convolutional Neural Networks for Detecting Glaucomatous Optic Neuropathy from Fundus Photographs. Bioengineering (Basel) 2023; 10:1266. PMID: 38002390. PMCID: PMC10669064. DOI: 10.3390/bioengineering10111266. [Received: 10/17/2023; Revised: 10/26/2023; Accepted: 10/27/2023] Open Access.
Abstract
Glaucomatous optic neuropathy (GON) can be diagnosed and monitored using fundus photography, a widely available and low-cost approach already adopted for automated screening of ophthalmic diseases such as diabetic retinopathy. Despite this, the lack of validated early screening approaches remains a major obstacle in the prevention of glaucoma-related blindness. Deep learning models have gained significant interest as potential solutions, as they offer objective and high-throughput methods for processing image-based medical data. While convolutional neural networks (CNNs) have been widely used for these purposes, more recent advances in Transformer architectures have produced new models, including the Vision Transformer (ViT), that show promise in many domains of image analysis. However, previous comparisons of the two architectures have not evaluated models side by side on more than a single dataset, making it unclear which is more generalizable or performs better in different clinical contexts. Our purpose is to investigate comparable ViT and CNN models tasked with GON detection from fundus photographs and to highlight their respective strengths and weaknesses. We train CNN and ViT models on six unrelated, publicly available databases and compare their performance using well-established statistics including AUC, sensitivity, and specificity. Our results indicate that ViT models often outperform similarly trained CNN models, particularly when non-glaucomatous images are over-represented in a given dataset. We discuss the clinical implications of these findings and suggest that ViT can further the development of accurate and scalable GON detection for this leading cause of irreversible blindness worldwide.
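The AUC statistic used for the comparisons above can be computed directly from model scores with the rank-based (Mann-Whitney) formulation; any labels and scores passed in below are illustrative, not data from the study.

```python
# Rank-based AUC: the probability that a randomly chosen positive case
# receives a higher score than a randomly chosen negative case.
def auc(labels, scores):
    """Mann-Whitney AUC; ties between a positive and a negative count half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Because it is rank-based, this statistic is insensitive to the score threshold, which matters when, as the abstract notes, non-glaucomatous images are over-represented and accuracy alone would be misleading.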
Affiliation(s)
- Elizabeth E. Hwang
- Department of Ophthalmology, University of California, San Francisco, San Francisco, CA 94143, USA
- Medical Scientist Training Program, University of California, San Francisco, San Francisco, CA 94143, USA
- Dake Chen
- Department of Ophthalmology, University of California, San Francisco, San Francisco, CA 94143, USA
- Ying Han
- Department of Ophthalmology, University of California, San Francisco, San Francisco, CA 94143, USA
- Lin Jia
- Digillect LLC, San Francisco, CA 94158, USA
- Jing Shan
- Department of Ophthalmology, University of California, San Francisco, San Francisco, CA 94143, USA