1
Huang L, Zhang N, Yi Y, Zhou W, Zhou B, Dai J, Wang J. SAMCF: Adaptive global style alignment and multi-color spaces fusion for joint optic cup and disc segmentation. Comput Biol Med 2024; 178:108639. [PMID: 38878394 DOI: 10.1016/j.compbiomed.2024.108639] [Received: 01/26/2024] [Revised: 04/21/2024] [Accepted: 05/18/2024]
Abstract
The optic cup (OC) and optic disc (OD) are two critical structures in retinal fundus images, and their relative positions and sizes are essential for effectively diagnosing eye diseases. With the success of deep learning in computer vision, deep learning-based segmentation models have been widely used for joint optic cup and disc segmentation. However, three prominent issues impact segmentation performance. First, significant differences among datasets collected from various institutions, protocols, and devices lead to performance degradation. Second, images with only RGB information struggle to counteract the interference caused by brightness variations, which limits color representation capability. Finally, existing methods typically ignore edge perception and therefore struggle to produce clear and smooth edge segmentation results. To address these drawbacks, we propose a novel framework based on Style Alignment and Multi-Color Fusion (SAMCF) for joint OC and OD segmentation. Initially, we introduce a domain generalization method that generates uniformly styled images without damaging image content, mitigating domain shift. Next, based on multiple color spaces, we propose a feature extraction and fusion network that handles brightness variation interference and improves color representation capability. Lastly, an edge-aware loss is designed to generate fine edge segmentation results. Experiments on three public datasets, DGS, RIM, and REFUGE, demonstrate that SAMCF achieves superior performance to existing state-of-the-art methods. Moreover, SAMCF exhibits remarkable generalization ability across multiple retinal fundus image datasets.
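The multi-color-space idea above can be illustrated with a minimal, hypothetical sketch (not the authors' code): augmenting RGB pixels with HSV channels, since hue and saturation remain stable under brightness changes that only alter the V channel.

```python
import colorsys

def rgb_to_multispace(pixels):
    """Given an iterable of (r, g, b) tuples in [0, 1], return 6-channel
    (r, g, b, h, s, v) tuples -- a minimal stand-in for fusing features
    from multiple color spaces."""
    fused = []
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        fused.append((r, g, b, h, s, v))
    return fused

# A bright red and a dark red pixel: hue and saturation are identical,
# so only the V channel carries the brightness variation.
out = rgb_to_multispace([(1.0, 0.0, 0.0), (0.5, 0.0, 0.0)])
```

In a real network the extra channels would feed separate feature extraction branches; here they simply show why a brightness change is easier to isolate outside pure RGB.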
Affiliation(s)
- Longjun Huang
- School of Software, Nanchang Key Laboratory for Blindness and Visual Impairment Prevention Technology and Equipment, Jiangxi Normal University, Nanchang, 330022, China
- Ningyi Zhang
- School of Software, Nanchang Key Laboratory for Blindness and Visual Impairment Prevention Technology and Equipment, Jiangxi Normal University, Nanchang, 330022, China
- Yugen Yi
- School of Software, Nanchang Key Laboratory for Blindness and Visual Impairment Prevention Technology and Equipment, Jiangxi Normal University, Nanchang, 330022, China
- Wei Zhou
- College of Computer Science, Shenyang Aerospace University, Shenyang, 110136, China
- Bin Zhou
- School of Software, Nanchang Key Laboratory for Blindness and Visual Impairment Prevention Technology and Equipment, Jiangxi Normal University, Nanchang, 330022, China
- Jiangyan Dai
- School of Computer Engineering, Weifang University, 261061, China
- Jianzhong Wang
- College of Information Science and Technology, Northeast Normal University, Changchun, 130117, China
2
Gao Y, Fu J, Guo Y, Wang Y. G-T correcting: an improved training of image segmentation under noisy labels. Med Biol Eng Comput 2024:10.1007/s11517-024-03170-4. [PMID: 39031327 DOI: 10.1007/s11517-024-03170-4] [Received: 03/27/2024] [Accepted: 07/06/2024]
Abstract
Data-driven medical image segmentation networks require expert annotations, which are hard to obtain. Non-expert annotations are often used instead, but these can be inaccurate (referred to as "noisy labels"), misleading the network's training and degrading segmentation performance. In this study, we focus on improving the segmentation performance of neural networks trained with noisy annotations. Specifically, we propose a two-stage framework named "G-T correcting," consisting of a "G" stage for recognizing noisy labels and a "T" stage for correcting them. In the "G" stage, a positive feedback method automatically recognizes noisy samples, using a Gaussian mixture model to separate clean and noisy labels via the per-sample loss histogram. In the "T" stage, a confident correcting strategy and an early learning strategy allow the segmentation network to receive productive guidance from noisy labels. Experiments on simulated and real-world noisy labels show that this method achieves over 90% accuracy in recognizing noisy labels and improves the network's Dice coefficient to 91%. The results demonstrate that the proposed method can enhance segmentation performance when training with noisy labels, indicating good prospects for clinical application.
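The "G" stage's clean/noisy separation can be sketched in miniature: fit a two-component Gaussian mixture to per-sample losses and flag samples whose posterior favors the high-loss component. This is a generic illustration of the technique with a hand-rolled 1-D EM on synthetic losses, not the authors' implementation.

```python
import math
import random

def normal_pdf(v, mu, var):
    return math.exp(-(v - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def clean_posteriors(losses, iters=50):
    """Fit a two-component 1-D Gaussian mixture to per-sample losses by EM
    and return each sample's posterior probability of belonging to the
    low-mean ('clean') component."""
    lo, hi = min(losses), max(losses)
    mu = [lo, hi]                              # init means at the extremes
    var = [max(1e-6, ((hi - lo) / 4) ** 2)] * 2
    pi = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibilities of each component for each sample
        resp = []
        for v in losses:
            p = [pi[k] * normal_pdf(v, mu[k], var[k]) for k in range(2)]
            s = p[0] + p[1]
            resp.append([p[0] / s, p[1] / s])
        # M-step: re-estimate means, variances, and mixing weights
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * v for r, v in zip(resp, losses)) / nk
            var[k] = max(1e-6, sum(r[k] * (v - mu[k]) ** 2
                                   for r, v in zip(resp, losses)) / nk)
            pi[k] = nk / len(losses)
    clean = 0 if mu[0] < mu[1] else 1
    return [r[clean] for r in resp]

# toy data: 80 clean samples (low loss) and 20 noisy samples (high loss)
random.seed(0)
losses = ([random.gauss(0.1, 0.02) for _ in range(80)]
          + [random.gauss(0.7, 0.05) for _ in range(20)])
flagged = [i for i, p in enumerate(clean_posteriors(losses)) if p < 0.5]
```

With well-separated loss clusters, `flagged` recovers exactly the high-loss samples; in practice the loss histogram overlaps more and the posterior threshold becomes a design choice.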
Affiliation(s)
- Yun Gao
- School of Information Science and Technology of Fudan University, 220 Handan Rd, Shanghai, 200433, China
- Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention (MICCAI) of Shanghai, Shanghai, 200032, China
- Junhu Fu
- School of Information Science and Technology of Fudan University, 220 Handan Rd, Shanghai, 200433, China
- Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention (MICCAI) of Shanghai, Shanghai, 200032, China
- Yi Guo
- School of Information Science and Technology of Fudan University, 220 Handan Rd, Shanghai, 200433, China
- Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention (MICCAI) of Shanghai, Shanghai, 200032, China
- Yuanyuan Wang
- School of Information Science and Technology of Fudan University, 220 Handan Rd, Shanghai, 200433, China
- Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention (MICCAI) of Shanghai, Shanghai, 200032, China
3
Guo R, Xu Y, Tompkins A, Pagnucco M, Song Y. Multi-degradation-adaptation network for fundus image enhancement with degradation representation learning. Med Image Anal 2024; 97:103273. [PMID: 39029157 DOI: 10.1016/j.media.2024.103273] [Received: 01/16/2024] [Revised: 05/16/2024] [Accepted: 07/09/2024]
Abstract
Fundus image quality is a crucial asset for medical diagnosis and applications. However, such images often suffer degradation during acquisition, and multiple types of degradation can occur in a single image. Although recent deep learning based methods have shown promising results in image enhancement, they tend to focus on restoring one aspect of degradation and lack generalisability to multiple modes of degradation. We propose an adaptive image enhancement network that can simultaneously handle a mixture of different degradations. The main contribution of this work is our Multi-Degradation-Adaptive module, which dynamically generates filters for different types of degradation. Moreover, we explore degradation representation learning and propose a degradation representation network and a Multi-Degradation-Adaptive discriminator for the accompanying image enhancement network. Experimental results demonstrate that our method outperforms several existing state-of-the-art methods in fundus image enhancement. Code will be available at https://github.com/RuoyuGuo/MDA-Net.
Affiliation(s)
- Ruoyu Guo
- School of Computer Science and Engineering, University of New South Wales, Australia
- Yiwen Xu
- School of Computer Science and Engineering, University of New South Wales, Australia
- Anthony Tompkins
- School of Computer Science and Engineering, University of New South Wales, Australia
- Maurice Pagnucco
- School of Computer Science and Engineering, University of New South Wales, Australia
- Yang Song
- School of Computer Science and Engineering, University of New South Wales, Australia
4
Meng Y, Zhang Y, Xie J, Duan J, Joddrell M, Madhusudhan S, Peto T, Zhao Y, Zheng Y. Multi-granularity learning of explicit geometric constraint and contrast for label-efficient medical image segmentation and differentiable clinical function assessment. Med Image Anal 2024; 95:103183. [PMID: 38692098 DOI: 10.1016/j.media.2024.103183] [Received: 08/04/2023] [Revised: 01/26/2024] [Accepted: 04/18/2024]
Abstract
Automated segmentation is a challenging task in medical image analysis that usually requires a large amount of manually labeled data. However, most current supervised learning based algorithms suffer from insufficient manual annotations, posing a significant difficulty for accurate and robust segmentation. In addition, most current semi-supervised methods lack explicit representations of geometric structure and semantic information, restricting segmentation accuracy. In this work, we propose a hybrid framework that learns polygon vertices, region masks, and their boundaries in a weakly/semi-supervised manner, significantly advancing geometric and semantic representations. Firstly, we propose multi-granularity learning of explicit geometric structure constraints via polygon vertices (PolyV) and pixel-wise region (PixelR) segmentation masks in a semi-supervised manner. Secondly, we eliminate boundary ambiguity by using an explicit contrastive objective to learn a discriminative feature space of boundary contours at the pixel level with limited annotations. Thirdly, we exploit task-specific clinical domain knowledge to perform clinical function assessment end-to-end in a differentiable manner. The ground truth of clinical function assessment, in turn, can serve as auxiliary weak supervision for PolyV and PixelR learning. We evaluate the proposed framework on two tasks: optic disc (OD) and cup (OC) segmentation with vertical cup-to-disc ratio (vCDR) estimation in fundus images, and left ventricle (LV) segmentation at end-diastolic and end-systolic frames with ejection fraction (LVEF) estimation in two-dimensional echocardiography images. Experiments on nine large-scale datasets across the two tasks under different label settings demonstrate our model's superior performance on segmentation and clinical function assessment.
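As a concrete example of the vCDR quantity estimated above: given binary OD and OC masks, the vertical cup-to-disc ratio is the vertical extent of the cup divided by that of the disc. A minimal sketch (hypothetical helper names, plain Python lists as masks, not the paper's differentiable formulation):

```python
def make_mask(h, w, r0, r1, c0, c1):
    """Binary mask with a filled rectangle covering rows r0..r1-1, cols c0..c1-1."""
    return [[1 if r0 <= r < r1 and c0 <= c < c1 else 0 for c in range(w)]
            for r in range(h)]

def vertical_extent(mask):
    """Number of image rows spanned by the mask's foreground."""
    rows = [i for i, row in enumerate(mask) if any(row)]
    return rows[-1] - rows[0] + 1 if rows else 0

def vcdr(cup_mask, disc_mask):
    """Vertical cup-to-disc ratio from binary segmentation masks."""
    d = vertical_extent(disc_mask)
    return vertical_extent(cup_mask) / d if d else 0.0

# toy masks: disc spans 40 rows, cup spans 20 rows -> vCDR = 0.5
disc = make_mask(64, 64, 10, 50, 15, 50)
cup = make_mask(64, 64, 20, 40, 22, 42)
ratio = vcdr(cup, disc)  # 0.5
```

In the paper this quantity is regressed end-to-end; the sketch only shows what the clinical measure computes from a finished segmentation.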
Affiliation(s)
- Yanda Meng
- Department of Eye and Vision Sciences, University of Liverpool, Liverpool, United Kingdom
- Yuchen Zhang
- Center for Bioinformatics, Peking University, Beijing, China
- Jianyang Xie
- Department of Eye and Vision Sciences, University of Liverpool, Liverpool, United Kingdom
- Jinming Duan
- School of Computer Science, University of Birmingham, Birmingham, United Kingdom
- Martha Joddrell
- Liverpool Centre for Cardiovascular Science, University of Liverpool and Liverpool Heart & Chest Hospital, Liverpool, United Kingdom; Department of Cardiovascular and Metabolic Medicine, University of Liverpool, Liverpool, United Kingdom
- Savita Madhusudhan
- St Paul's Eye Unit, Liverpool University Hospitals NHS Foundation Trust, Liverpool, United Kingdom
- Tunde Peto
- School of Medicine, Dentistry and Biomedical Sciences, Queen's University Belfast, Belfast, United Kingdom
- Yitian Zhao
- Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Science, Ningbo, China; Ningbo Eye Hospital, Ningbo, China
- Yalin Zheng
- Department of Eye and Vision Sciences, University of Liverpool, Liverpool, United Kingdom; Liverpool Centre for Cardiovascular Science, University of Liverpool and Liverpool Heart & Chest Hospital, Liverpool, United Kingdom
5
He H, Qiu J, Lin L, Cai Z, Cheng P, Tang X. JOINEDTrans: Prior guided multi-task transformer for joint optic disc/cup segmentation and fovea detection. Comput Biol Med 2024; 177:108613. [PMID: 38781644 DOI: 10.1016/j.compbiomed.2024.108613] [Received: 11/30/2023] [Revised: 01/18/2024] [Accepted: 05/11/2024]
Abstract
Deep learning-based image segmentation and detection models have greatly improved the efficiency of analyzing retinal landmarks such as the optic disc (OD), optic cup (OC), and fovea. However, factors including ophthalmic disease-related lesions and low image quality may severely complicate automatic OD/OC segmentation and fovea detection. Most existing works treat the identification of each landmark as a single task and take no prior information into account. To address these issues, we propose a prior guided multi-task transformer framework for joint OD/OC segmentation and fovea detection, named JOINEDTrans. JOINEDTrans effectively combines various spatial features of fundus images, relieving the structural distortions induced by lesions and other imaging issues. It contains a segmentation branch and a detection branch. Notably, we employ an encoder that learns priors from a vessel segmentation task to effectively exploit the positional relationship among vessels, OD/OC, and fovea, thereby incorporating spatial priors into the proposed framework. JOINEDTrans operates in a coarse stage and a fine stage. In the coarse stage, OD/OC coarse segmentation and fovea heatmap localization are obtained through a joint segmentation and detection module. In the fine stage, we crop regions of interest for subsequent refinement and use the coarse-stage predictions to provide additional information for better performance and faster convergence. Experimental results demonstrate that JOINEDTrans outperforms existing state-of-the-art methods on the publicly available GAMMA, REFUGE, and PALM fundus image datasets. We make our code available at https://github.com/HuaqingHe/JOINEDTrans.
Affiliation(s)
- Huaqing He
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, Guangdong, China; Jiaxing Research Institute, Southern University of Science and Technology, Jiaxing, Zhejiang, China.
- Jiaming Qiu
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, Guangdong, China
- Li Lin
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, Guangdong, China; Jiaxing Research Institute, Southern University of Science and Technology, Jiaxing, Zhejiang, China; Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, China
- Zhiyuan Cai
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, Guangdong, China; Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
- Pujin Cheng
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, Guangdong, China; Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, China
- Xiaoying Tang
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, Guangdong, China; Jiaxing Research Institute, Southern University of Science and Technology, Jiaxing, Zhejiang, China
6
Messica S, Presil D, Hoch Y, Lev T, Hadad A, Katz O, Owens DR. Enhancing stroke risk and prognostic timeframe assessment with deep learning and a broad range of retinal biomarkers. Artif Intell Med 2024; 154:102927. [PMID: 38991398 DOI: 10.1016/j.artmed.2024.102927] [Received: 01/22/2024] [Revised: 06/15/2024] [Accepted: 06/25/2024]
Abstract
Stroke stands as a major global health issue, causing high death and disability rates and significant social and economic burdens. The effectiveness of existing stroke risk assessment methods is questionable due to their use of inconsistent and varying biomarkers, which may lead to unpredictable risk evaluations. This study introduces an automatic deep learning-based system for predicting stroke risk (both ischemic and hemorrhagic) and estimating the time frame of its occurrence, utilizing a comprehensive set of known retinal biomarkers from fundus images. Our system, tested on the UK Biobank and DRSSW datasets, achieved AUROC scores of 0.83 (95% CI: 0.79-0.85) and 0.93 (95% CI: 0.9-0.95), respectively. These results not only highlight our system's advantage over established benchmarks but also underscore the predictive power of retinal biomarkers in assessing stroke risk and the unique effectiveness of each biomarker. Additionally, the correlation between retinal biomarkers and cardiovascular diseases broadens the potential application of our system, making it a versatile tool for predicting a wide range of cardiovascular conditions.
Affiliation(s)
- Dan Presil
- NEC Israeli Research Center, Herzliya, Israel
- Yaacov Hoch
- NEC Israeli Research Center, Herzliya, Israel
- Tsvi Lev
- NEC Israeli Research Center, Herzliya, Israel
- Aviel Hadad
- Ophthalmology Department, Soroka University Medical Center, Be'er Sheva, South District, Israel
- Or Katz
- NEC Israeli Research Center, Herzliya, Israel
- David R Owens
- Swansea University Medical School, Swansea, Wales, UK
7
Chiang YY, Chen CL, Chen YH. Deep Learning Evaluation of Glaucoma Detection Using Fundus Photographs in Highly Myopic Populations. Biomedicines 2024; 12:1394. [PMID: 39061968 DOI: 10.3390/biomedicines12071394] [Received: 04/28/2024] [Revised: 06/14/2024] [Accepted: 06/19/2024]
Abstract
OBJECTIVES This study aimed to use deep learning to distinguish glaucomatous from normal eyes in highly myopic populations using fundus photographs. METHODS Patients who visited Tri-Service General Hospital from 1 November 2018 to 31 October 2022 were retrospectively reviewed. Patients with high myopia (spherical equivalent refraction of ≤-6.0 D) were included, while patients with pathological myopia were excluded. The participants were divided into a high myopia group and a high myopia glaucoma group. We used two classification models with the convolutional block attention module (CBAM), an attention mechanism that enhances the performance of convolutional neural networks (CNNs), to detect glaucoma cases. The learning data were evaluated through fivefold cross-validation, with the images categorized into training, validation, and test sets in a ratio of 6:2:2. Grad-CAM visualization improved the interpretability of the CNN results. The performance indicators for evaluating the model included the area under the receiver operating characteristic curve (AUC), sensitivity, and specificity. RESULTS A total of 3088 fundus photographs were used for the deep-learning model, including 1540 and 1548 fundus photographs for the high myopia glaucoma and high myopia groups, respectively. The average refractive power of the high myopia glaucoma group and the high myopia group was -8.83 ± 2.9 D and -8.73 ± 2.6 D, respectively (p = 0.30). Based on the fivefold cross-validation assessment, the ConvNeXt_Base+CBAM architecture had the best performance, with an AUC of 0.894, accuracy of 82.16%, sensitivity of 81.04%, specificity of 83.27%, and F1 score of 81.92%. CONCLUSIONS Glaucoma in individuals with high myopia can be identified from their fundus photographs.
Affiliation(s)
- Yen-Ying Chiang
- Graduate Institute of Life Sciences, National Defense Medical Center, Taipei 114, Taiwan
- Ching-Long Chen
- Department of Ophthalmology, Tri-Service General Hospital, National Defense Medical Center, Taipei 114, Taiwan
- Yi-Hao Chen
- Department of Ophthalmology, Tri-Service General Hospital, National Defense Medical Center, Taipei 114, Taiwan
8
AlRyalat SA, Musleh AM, Kahook MY. Evaluating the strengths and limitations of multimodal ChatGPT-4 in detecting glaucoma using fundus images. Frontiers in Ophthalmology 2024; 4:1387190. [PMID: 38984105 PMCID: PMC11182172 DOI: 10.3389/fopht.2024.1387190] [Received: 02/16/2024] [Accepted: 05/17/2024]
Abstract
Overview This study evaluates the diagnostic accuracy of a multimodal large language model (LLM), ChatGPT-4, in recognizing glaucoma from color fundus photographs (CFPs) using a benchmark dataset, without prior training or fine-tuning. Methods The publicly accessible Retinal Fundus Glaucoma Challenge "REFUGE" dataset was utilized for the analyses. The input data consisted of the entire 400-image test set. The task involved classifying fundus images as either 'Likely Glaucomatous' or 'Likely Non-Glaucomatous'. We constructed a confusion matrix to visualize the results of the predictions from ChatGPT-4, focusing on the accuracy of the binary classification (glaucoma vs. non-glaucoma). Results ChatGPT-4 demonstrated an accuracy of 90% with a 95% confidence interval (CI) of 87.06%-92.94%. The sensitivity was 50% (95% CI: 34.51%-65.49%), while the specificity was 94.44% (95% CI: 92.08%-96.81%). The precision was 50% (95% CI: 34.51%-65.49%), and the F1 score was 0.50. Conclusion ChatGPT-4 achieved relatively high diagnostic accuracy without prior fine-tuning on CFPs. Considering the scarcity of data in specialized medical fields, including ophthalmology, advanced AI techniques such as LLMs may require less data for training than other forms of AI, with potential savings in time and financial resources. They may also pave the way for innovative tools to support specialized medical care, particularly care dependent on multimodal data for diagnosis and follow-up, irrespective of resource constraints.
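The reported figures are mutually consistent with a test-set class balance of 40 glaucomatous and 360 non-glaucomatous images (the split implied by the reported rates on 400 images); a quick check recovers every reported metric from a single confusion matrix:

```python
# Confusion matrix implied by the abstract (40 positives, 360 negatives):
tp, fn = 20, 20    # sensitivity 50% of 40 glaucomatous images
tn, fp = 340, 20   # specificity ~94.44% of 360 non-glaucomatous images

accuracy = (tp + tn) / (tp + tn + fp + fn)                    # 0.90
sensitivity = tp / (tp + fn)                                  # 0.50
specificity = tn / (tn + fp)                                  # ~0.9444
precision = tp / (tp + fp)                                    # 0.50
f1 = 2 * precision * sensitivity / (precision + sensitivity)  # 0.50
```

The matching sensitivity and precision follow directly from fp = fn here, which also forces the F1 score to equal both.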
Affiliation(s)
- Saif Aldeen AlRyalat
- Department of Ophthalmology, The University of Jordan, Amman, Jordan
- Department of Ophthalmology, Houston Methodist Hospital, Houston, TX, United States
- Malik Y. Kahook
- Department of Ophthalmology, University of Colorado School of Medicine, Sue Anschutz-Rodgers Eye Center, Aurora, CO, United States
9
Habeb AAAA, Taresh MM, Li J, Gao Z, Zhu N. Enhancing Medical Image Classification with an Advanced Feature Selection Algorithm: A Novel Approach to Improving the Cuckoo Search Algorithm by Incorporating Caputo Fractional Order. Diagnostics (Basel) 2024; 14:1191. [PMID: 38893717 PMCID: PMC11172208 DOI: 10.3390/diagnostics14111191] [Received: 04/30/2024] [Revised: 05/23/2024] [Accepted: 05/29/2024]
Abstract
Glaucoma is a chronic eye condition that seriously impairs vision and requires early diagnosis and treatment. Automated detection techniques are essential for obtaining a timely diagnosis. In this paper, we propose a novel feature selection method that integrates the cuckoo search algorithm with the Caputo fractional order (CFO-CS) to enhance the performance of glaucoma classification. Because the Caputo definition involves an infinite series, it suffers from memory-length truncation issues; we therefore suggest a fixed memory step and an adjustable term count for optimization. We conducted experiments integrating various feature extraction techniques, including histograms of oriented gradients (HOGs), local binary patterns (LBPs), and deep features from MobileNet and VGG19, into a unified vector. We evaluate the informative features selected by the proposed method using a k-nearest neighbor classifier. Furthermore, we use data augmentation to enhance the diversity and quantity of the training set. The proposed method improves convergence speed and the attainment of optimal solutions during training. The results demonstrate superior performance on the test set: 92.62% accuracy, 94.70% precision, 93.52% F1 score, 92.98% specificity, 92.36% sensitivity, and 85.00% Matthews correlation coefficient. These results confirm the efficiency of the proposed method, rendering it a generalizable and applicable technique in ophthalmology.
Affiliation(s)
- Jintang Li
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410012, China
- Zhan Gao
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410012, China
- Ningbo Zhu
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410012, China
- Research Institute of Hunan University in Chongqing, Chongqing 400000, China
10
Boulogne LH, Lorenz J, Kienzle D, Schön R, Ludwig K, Lienhart R, Jégou S, Li G, Chen C, Wang Q, Shi D, Maniparambil M, Müller D, Mertes S, Schröter N, Hellmann F, Elia M, Dirks I, Bossa MN, Berenguer AD, Mukherjee T, Vandemeulebroucke J, Sahli H, Deligiannis N, Gonidakis P, Huynh ND, Razzak I, Bouadjenek R, Verdicchio M, Borrelli P, Aiello M, Meakin JA, Lemm A, Russ C, Ionasec R, Paragios N, van Ginneken B, Revel-Dubois MP. The STOIC2021 COVID-19 AI challenge: Applying reusable training methodologies to private data. Med Image Anal 2024; 97:103230. [PMID: 38875741 DOI: 10.1016/j.media.2024.103230] [Received: 07/23/2023] [Revised: 01/11/2024] [Accepted: 06/03/2024]
Abstract
Challenges drive the state of the art of automated medical image analysis, but the quantity of public training data they provide can limit the performance of their solutions, and public access to the training methodology behind these solutions often remains absent. This study implements the Type Three (T3) challenge format, which allows solutions to be trained on private data while guaranteeing reusable training methodologies. With T3, challenge organizers train a codebase provided by the participants on sequestered training data. T3 was implemented in the STOIC2021 challenge, whose goal was to predict from a computed tomography (CT) scan whether subjects had a severe COVID-19 infection, defined as intubation or death within one month. STOIC2021 consisted of a Qualification phase, where participants developed challenge solutions using 2000 publicly available CT scans, and a Final phase, where participants submitted their training methodologies, with which solutions were trained on CT scans of 9724 subjects. The organizers successfully trained six of the eight Final phase submissions, and the submitted codebases for training and running inference were released publicly. The winning solution obtained an area under the receiver operating characteristic curve of 0.815 for discerning between severe and non-severe COVID-19. The Final phase solutions of all finalists improved upon their Qualification phase solutions.
Affiliation(s)
- Luuk H Boulogne
- Radboud university medical center, P.O. Box 9101, 6500HB Nijmegen, The Netherlands.
| | - Julian Lorenz
- University of Augsburg, Universitätsstraße 2, 86159 Augsburg, Germany.
| | - Daniel Kienzle
- University of Augsburg, Universitätsstraße 2, 86159 Augsburg, Germany
| | - Robin Schön
- University of Augsburg, Universitätsstraße 2, 86159 Augsburg, Germany
| | - Katja Ludwig
- University of Augsburg, Universitätsstraße 2, 86159 Augsburg, Germany
| | - Rainer Lienhart
- University of Augsburg, Universitätsstraße 2, 86159 Augsburg, Germany
| | | | - Guang Li
- Keya medical technology co. ltd, Floor 20, Building A, 1 Ronghua South Road, Yizhuang Economic Development Zone, Daxing District, Beijing, PR China.
| | - Cong Chen
- Keya medical technology co. ltd, Floor 20, Building A, 1 Ronghua South Road, Yizhuang Economic Development Zone, Daxing District, Beijing, PR China
| | - Qi Wang
- Keya medical technology co. ltd, Floor 20, Building A, 1 Ronghua South Road, Yizhuang Economic Development Zone, Daxing District, Beijing, PR China
| | - Derik Shi
- Keya medical technology co. ltd, Floor 20, Building A, 1 Ronghua South Road, Yizhuang Economic Development Zone, Daxing District, Beijing, PR China
| | - Mayug Maniparambil
- ML-Labs, Dublin City University, N210, Marconi building, Dublin City University, Glasnevin, Dublin 9, Ireland.
| | - Dominik Müller
- University of Augsburg, Universitätsstraße 2, 86159 Augsburg, Germany; Faculty of Applied Computer Science, University of Augsburg, Germany
- Silvan Mertes
- Faculty of Applied Computer Science, University of Augsburg, Germany
- Niklas Schröter
- Faculty of Applied Computer Science, University of Augsburg, Germany
- Fabio Hellmann
- Faculty of Applied Computer Science, University of Augsburg, Germany
- Miriam Elia
- Faculty of Applied Computer Science, University of Augsburg, Germany.
- Ine Dirks
- Vrije Universiteit Brussel, Department of Electronics and Informatics, Pleinlaan 2, 1050 Brussels, Belgium; imec, Kapeldreef 75, 3001 Leuven, Belgium.
- Matías Nicolás Bossa
- Vrije Universiteit Brussel, Department of Electronics and Informatics, Pleinlaan 2, 1050 Brussels, Belgium; imec, Kapeldreef 75, 3001 Leuven, Belgium
- Abel Díaz Berenguer
- Vrije Universiteit Brussel, Department of Electronics and Informatics, Pleinlaan 2, 1050 Brussels, Belgium; imec, Kapeldreef 75, 3001 Leuven, Belgium
- Tanmoy Mukherjee
- Vrije Universiteit Brussel, Department of Electronics and Informatics, Pleinlaan 2, 1050 Brussels, Belgium; imec, Kapeldreef 75, 3001 Leuven, Belgium
- Jef Vandemeulebroucke
- Vrije Universiteit Brussel, Department of Electronics and Informatics, Pleinlaan 2, 1050 Brussels, Belgium; imec, Kapeldreef 75, 3001 Leuven, Belgium
- Hichem Sahli
- Vrije Universiteit Brussel, Department of Electronics and Informatics, Pleinlaan 2, 1050 Brussels, Belgium; imec, Kapeldreef 75, 3001 Leuven, Belgium
- Nikos Deligiannis
- Vrije Universiteit Brussel, Department of Electronics and Informatics, Pleinlaan 2, 1050 Brussels, Belgium; imec, Kapeldreef 75, 3001 Leuven, Belgium
- Panagiotis Gonidakis
- Vrije Universiteit Brussel, Department of Electronics and Informatics, Pleinlaan 2, 1050 Brussels, Belgium; imec, Kapeldreef 75, 3001 Leuven, Belgium
- Imran Razzak
- University of New South Wales, Sydney, Australia.
- James A Meakin
- Radboud university medical center, P.O. Box 9101, 6500HB Nijmegen, The Netherlands
- Alexander Lemm
- Amazon Web Services, Marcel-Breuer-Str. 12, 80807 München, Germany
- Christoph Russ
- Amazon Web Services, Marcel-Breuer-Str. 12, 80807 München, Germany
- Razvan Ionasec
- Amazon Web Services, Marcel-Breuer-Str. 12, 80807 München, Germany
- Nikos Paragios
- Keya Medical Technology Co. Ltd, Floor 20, Building A, 1 Ronghua South Road, Yizhuang Economic Development Zone, Daxing District, Beijing, PR China; TheraPanacea, 75004, Paris, France
- Bram van Ginneken
- Radboud university medical center, P.O. Box 9101, 6500HB Nijmegen, The Netherlands
- Marie-Pierre Revel-Dubois
- Department of Radiology, Université de Paris, APHP, Hôpital Cochin, 27 rue du Fg Saint Jacques, 75014 Paris, France
11
He Y, Kong J, Li J, Zheng C. Entropy and distance-guided super self-ensembling for optic disc and cup segmentation. Biomed Opt Express 2024; 15:3975-3992. [PMID: 38867792 PMCID: PMC11166439 DOI: 10.1364/boe.521778] [Received: 02/21/2024] [Revised: 04/14/2024] [Accepted: 05/06/2024] [Indexed: 06/14/2024]
Abstract
Segmenting the optic disc (OD) and optic cup (OC) is crucial to accurately detect changes in glaucoma progression in the elderly. Recently, various convolutional neural networks have emerged to deal with OD and OC segmentation. Due to the domain shift problem, achieving high-accuracy segmentation of OD and OC across datasets from different domains remains highly challenging. Unsupervised domain adaptation has attracted extensive attention as a way to address this problem. In this work, we propose a novel unsupervised domain adaptation method, called entropy and distance-guided super self-ensembling (EDSS), to enhance the segmentation performance on OD and OC. EDSS comprises two self-ensembling models, with Gaussian noise added to the weights of the whole network. Firstly, we design a super self-ensembling (SSE) framework, which combines two self-ensembling models to learn more discriminative information about images. Secondly, we propose a novel exponential moving average with Gaussian noise (G-EMA) to enhance the robustness of the self-ensembling framework. Thirdly, we propose an effective multi-information fusion strategy (MFS) to guide and improve the domain adaptation process. We evaluate the proposed EDSS on two public fundus image datasets, RIGA+ and REFUGE. Extensive experimental results demonstrate that the proposed EDSS outperforms state-of-the-art segmentation methods with unsupervised domain adaptation, e.g., the mean Dice scores on the three test sub-datasets of RIGA+ are 0.8442, 0.8772, and 0.9006, respectively, and the mean Dice score on the REFUGE dataset is 0.9154.
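The G-EMA step described in this abstract can be sketched as a standard teacher-student exponential moving average with zero-mean Gaussian noise added to the updated weights. The function below is an illustrative sketch, not the paper's implementation; the `alpha` and `sigma` defaults are assumed values:

```python
import random

def g_ema_update(teacher_w, student_w, alpha=0.99, sigma=0.01, rng=None):
    """One EMA step pulling teacher weights toward the student, with
    zero-mean Gaussian noise (std sigma) added to each updated teacher
    weight to improve robustness (sketch of the G-EMA idea)."""
    rng = rng or random.Random(0)
    return [alpha * t + (1.0 - alpha) * s + rng.gauss(0.0, sigma)
            for t, s in zip(teacher_w, student_w)]
```

With `sigma = 0` this reduces to the plain EMA used by ordinary self-ensembling teachers.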
Affiliation(s)
- Yanlin He
- College of Information Sciences and Technology, Northeast Normal University, Changchun 130117, China
- Jun Kong
- College of Information Sciences and Technology, Northeast Normal University, Changchun 130117, China
- Juan Li
- Jilin Engineering Normal University, Changchun 130052, China
- Business School, Northeast Normal University, Changchun 130117, China
- Caixia Zheng
- College of Information Sciences and Technology, Northeast Normal University, Changchun 130117, China
- Key Laboratory of Applied Statistics of MOE, Northeast Normal University, Changchun 130024, China
12
Wang R, Zheng G. PFMNet: Prototype-based feature mapping network for few-shot domain adaptation in medical image segmentation. Comput Med Imaging Graph 2024; 116:102406. [PMID: 38824715 DOI: 10.1016/j.compmedimag.2024.102406] [Received: 01/23/2024] [Revised: 05/23/2024] [Accepted: 05/24/2024] [Indexed: 06/04/2024]
Abstract
Lack of data is one of the biggest hurdles for rare disease research using deep learning. Due to the lack of rare-disease images and annotations, training a robust network for automatic rare-disease image segmentation is very challenging. To address this challenge, few-shot domain adaptation (FSDA) has emerged as a practical research direction, aiming to leverage a limited number of annotated images from a target domain to facilitate adaptation of models trained on other large datasets in a source domain. In this paper, we present a novel prototype-based feature mapping network (PFMNet) designed for FSDA in medical image segmentation. PFMNet adopts an encoder-decoder structure for segmentation, with the prototype-based feature mapping (PFM) module positioned at the bottom of the encoder-decoder structure. The PFM module transforms high-level features from the target domain into source-domain-like features that the decoder can interpret more readily. By leveraging these source-domain-like features, the decoder can effectively perform few-shot segmentation in the target domain and generate accurate segmentation masks. We evaluate the performance of PFMNet through experiments on three typical yet challenging few-shot medical image segmentation tasks: cross-center optic disc/cup segmentation, cross-center polyp segmentation, and cross-modality cardiac structure segmentation. We consider four different settings: 5-shot, 10-shot, 15-shot, and 20-shot. The experimental results substantiate the efficacy of our proposed approach for few-shot domain adaptation in medical image segmentation.
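The prototype idea behind a module like PFM can be illustrated in a deliberately simplified form: compute class-wise mean features (prototypes) per domain, then translate a target-domain feature by the offset between its nearest target prototype and the corresponding source prototype. This is a hypothetical sketch of the mechanism, not the paper's module:

```python
def class_prototypes(features, labels, num_classes):
    """Class-wise mean feature vectors (prototypes)."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for f, y in zip(features, labels):
        counts[y] += 1
        for j, v in enumerate(f):
            sums[y][j] += v
    return [[v / c for v in row] for row, c in zip(sums, counts)]

def map_to_source(feature, tgt_protos, src_protos):
    """Translate a target-domain feature into 'source-domain-like' space
    by the offset of its nearest target prototype (squared Euclidean)."""
    d2 = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    k = min(range(len(tgt_protos)), key=lambda i: d2(feature, tgt_protos[i]))
    return [f + s - t for f, s, t in zip(feature, src_protos[k], tgt_protos[k])]
```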
Affiliation(s)
- Runze Wang
- Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, No. 800, Dongchuan Road, Shanghai, 200240, China
- Guoyan Zheng
- Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, No. 800, Dongchuan Road, Shanghai, 200240, China.
13
Cheng Z, Wang S, Gao Y, Zhu Z, Yan C. Invariant Content Representation for Generalizable Medical Image Segmentation. J Imaging Inform Med 2024:10.1007/s10278-024-01088-9. [PMID: 38758420 DOI: 10.1007/s10278-024-01088-9] [Received: 10/17/2023] [Revised: 01/20/2024] [Accepted: 02/09/2024] [Indexed: 05/18/2024]
Abstract
Because privacy preservation often restricts training to a single source domain, domain generalization (DG) for medical image segmentation must learn from one domain while remaining robust on unseen target domains. To achieve this goal, previous methods mainly use data augmentation to expand the distribution of samples and learn invariant content from them. However, most of these methods perform only global augmentation, which limits the diversity of augmented samples. In addition, the styles of the augmented images are more scattered than those of the source domain, which may cause the model to overfit the source-domain style. To address these issues, we propose an invariant content representation network (ICRN) to enhance the learning of invariant content and suppress the learning of variable styles. Specifically, we first design a gamma correction-based local style augmentation (LSA) to expand the distribution of samples by augmenting foreground and background styles, respectively. Then, based on the augmented samples, we introduce invariant content learning (ICL) to learn generalizable invariant content from both augmented and source-domain samples. Finally, we design domain-specific batch normalization (DSBN) based style adversarial learning (SAL) to suppress the learning of preferences for source-domain styles. Experimental results show that our proposed method improves the overall Dice coefficient (Dice) by 8.74% and 11.33% and reduces the overall average surface distance (ASD) by 15.88 mm and 3.87 mm on two publicly available cross-domain datasets, Fundus and Prostate, compared to state-of-the-art DG methods. The code is available at https://github.com/ZMC-IIIM/ICRN-DG .
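The LSA step described above can be sketched as per-region gamma correction on intensities normalized to [0, 1], with separate curves for foreground and background. In training the two gamma values would presumably be sampled randomly per image; the function below takes them as explicit arguments and is an illustrative sketch only:

```python
def local_gamma_augment(image, fg_mask, gamma_fg, gamma_bg):
    """Re-style foreground and background with separate gamma curves.
    `image` holds intensities in [0, 1]; `fg_mask` is a same-shape 0/1 mask."""
    return [[p ** (gamma_fg if m else gamma_bg)
             for p, m in zip(prow, mrow)]
            for prow, mrow in zip(image, fg_mask)]
```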
Affiliation(s)
- Zhiming Cheng
- School of Automation, Hangzhou Dianzi University, Hangzhou, 310018, China
- Shuai Wang
- School of Cyberspace, Hangzhou Dianzi University, Hangzhou, 310018, China.
- Suzhou Research Institute of Shandong University, Suzhou, 215123, China.
- Yuhan Gao
- School of Automation, Hangzhou Dianzi University, Hangzhou, 310018, China
- Lishui Institute of Hangzhou Dianzi University, Lishui, 323010, China
- Zunjie Zhu
- Lishui Institute of Hangzhou Dianzi University, Lishui, 323010, China
- School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, 310018, China
- Chenggang Yan
- School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, 310018, China
14
Shi M, Tian Y, Luo Y, Elze T, Wang M. RNFLT2Vec: Artifact-corrected representation learning for retinal nerve fiber layer thickness maps. Med Image Anal 2024; 94:103110. [PMID: 38458093 DOI: 10.1016/j.media.2024.103110] [Received: 11/30/2022] [Revised: 02/09/2024] [Accepted: 02/15/2024] [Indexed: 03/10/2024]
Abstract
Optical coherence tomography imaging provides a crucial clinical measurement for diagnosing and monitoring glaucoma through the two-dimensional retinal nerve fiber layer (RNFL) thickness (RNFLT) map. Researchers have been increasingly using neural models to extract meaningful features from the RNFLT map, aiming to identify biomarkers for glaucoma and its progression. However, accurately representing the RNFLT map features relevant to glaucoma is challenging due to significant variations in retinal anatomy among individuals, which confound the pathological thinning of the RNFL. Moreover, the presence of artifacts in the RNFLT map, caused by segmentation errors in the context of degraded image quality and defective imaging procedures, further complicates the task. In this paper, we propose a general framework called RNFLT2Vec for unsupervised learning of vectorized feature representations from RNFLT maps. Our method includes an artifact correction component that learns to rectify RNFLT values at artifact locations, producing a representation reflecting the RNFLT map without artifacts. Additionally, we incorporate two regularization techniques to encourage discriminative representation learning. Firstly, we introduce a contrastive learning-based regularization to capture the similarities and dissimilarities between RNFLT maps. Secondly, we employ a consistency learning-based regularization to align pairwise distances of RNFLT maps with their corresponding thickness distributions. Through extensive experiments on a large-scale real-world dataset, we demonstrate the superiority of RNFLT2Vec in three different clinical tasks: RNFLT pattern discovery, glaucoma detection, and visual field prediction. Our results validate the effectiveness of our framework and its potential to contribute to a better understanding and diagnosis of glaucoma.
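The consistency-learning regularization described above can be sketched as matching pairwise distances in the embedding space to pairwise distances between the corresponding thickness distributions. The function names and the use of plain Euclidean distance here are illustrative assumptions, not the paper's exact formulation:

```python
import math

def pairwise_dists(vectors):
    """Euclidean distances for all ordered pairs i < j."""
    out = []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            out.append(math.sqrt(sum((a - b) ** 2
                                     for a, b in zip(vectors[i], vectors[j]))))
    return out

def distance_consistency_loss(embeddings, thickness_hists):
    """Mean squared difference between embedding-space and
    thickness-distribution pairwise distances."""
    d_emb = pairwise_dists(embeddings)
    d_thk = pairwise_dists(thickness_hists)
    return sum((a - b) ** 2 for a, b in zip(d_emb, d_thk)) / len(d_emb)
```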
Affiliation(s)
- Min Shi
- Harvard Ophthalmology AI Lab, Schepens Eye Research Institute of Massachusetts Eye and Ear, Harvard Medical School, Boston, MA, USA
- Yu Tian
- Harvard Ophthalmology AI Lab, Schepens Eye Research Institute of Massachusetts Eye and Ear, Harvard Medical School, Boston, MA, USA
- Yan Luo
- Harvard Ophthalmology AI Lab, Schepens Eye Research Institute of Massachusetts Eye and Ear, Harvard Medical School, Boston, MA, USA
- Tobias Elze
- Harvard Ophthalmology AI Lab, Schepens Eye Research Institute of Massachusetts Eye and Ear, Harvard Medical School, Boston, MA, USA
- Mengyu Wang
- Harvard Ophthalmology AI Lab, Schepens Eye Research Institute of Massachusetts Eye and Ear, Harvard Medical School, Boston, MA, USA.
15
Gibbon S, Muniz-Terrera G, Yii FSL, Hamid C, Cox S, Maccormick IJC, Tatham AJ, Ritchie C, Trucco E, Dhillon B, MacGillivray TJ. PallorMetrics: Software for Automatically Quantifying Optic Disc Pallor in Fundus Photographs, and Associations With Peripapillary RNFL Thickness. Transl Vis Sci Technol 2024; 13:20. [PMID: 38780955 PMCID: PMC11127490 DOI: 10.1167/tvst.13.5.20] [Received: 06/08/2023] [Accepted: 04/10/2024] [Indexed: 05/25/2024]
Abstract
Purpose We sought to develop an automatic method of quantifying optic disc pallor in fundus photographs and determine associations with peripapillary retinal nerve fiber layer (pRNFL) thickness. Methods We used deep learning to segment the optic disc, fovea, and vessels in fundus photographs, and measured pallor. We assessed the relationship between pallor and pRNFL thickness derived from optical coherence tomography scans in 118 participants. Separately, we used images diagnosed by clinical inspection as pale (n = 45) and assessed how measurements compared with healthy controls (n = 46). We also developed automatic rejection thresholds and tested the software for robustness to camera type, image format, and resolution. Results We developed software that automatically quantified disc pallor across several zones in fundus photographs. Pallor was associated with pRNFL thickness globally (β = -9.81; standard error [SE] = 3.16; P < 0.05), in the temporal inferior zone (β = -29.78; SE = 8.32; P < 0.01), with the nasal/temporal ratio (β = 0.88; SE = 0.34; P < 0.05), and in the whole disc (β = -8.22; SE = 2.92; P < 0.05). Furthermore, pallor was significantly higher in the patient group. Last, we demonstrate the analysis to be robust to camera type, image format, and resolution. Conclusions We developed software that automatically locates and quantifies disc pallor in fundus photographs and found associations between pallor measurements and pRNFL thickness. Translational Relevance We think our method will be useful for the identification, monitoring, and progression of diseases characterized by disc pallor and optic atrophy, including glaucoma, compression, and potentially in neurodegenerative disorders.
Affiliation(s)
- Samuel Gibbon
- Centre for Clinical Brain Sciences, Edinburgh, UK
- Robert O Curle Ophthalmology Suite, Institute for Regeneration and Repair, University of Edinburgh, Edinburgh, UK
- Fabian S. L. Yii
- Centre for Clinical Brain Sciences, Edinburgh, UK
- Robert O Curle Ophthalmology Suite, Institute for Regeneration and Repair, University of Edinburgh, Edinburgh, UK
- Simon Cox
- Lothian Birth Cohorts, Department of Psychology, University of Edinburgh, Edinburgh, UK
- Ian J. C. Maccormick
- Centre for Inflammation Research, University of Edinburgh, Edinburgh, UK
- Institute for Adaptive and Neural Computation, University of Edinburgh, Edinburgh, UK
- Andrew J. Tatham
- Centre for Clinical Brain Sciences, Edinburgh, UK
- Princess Alexandra Eye Pavilion, Chalmers Street, Edinburgh, UK
- Craig Ritchie
- Centre for Clinical Brain Sciences, Edinburgh, UK
- Centre for Dementia Prevention, University of Edinburgh, Edinburgh, UK
- Emanuele Trucco
- VAMPIRE Project, Computing (SSEN), University of Dundee, Dundee, UK
- Baljean Dhillon
- Centre for Clinical Brain Sciences, Edinburgh, UK
- Princess Alexandra Eye Pavilion, Chalmers Street, Edinburgh, UK
- Thomas J. MacGillivray
- Centre for Clinical Brain Sciences, Edinburgh, UK
- Robert O Curle Ophthalmology Suite, Institute for Regeneration and Repair, University of Edinburgh, Edinburgh, UK
- VAMPIRE Project, Edinburgh Clinical Research Facility, University of Edinburgh, Edinburgh, UK
16
Zhou H, He Y, Cui X, Xie Z. AGSAM: Agent-Guided Segment Anything Model for Automatic Segmentation in Few-Shot Scenarios. Bioengineering (Basel) 2024; 11:447. [PMID: 38790313 PMCID: PMC11118214 DOI: 10.3390/bioengineering11050447] [Received: 03/05/2024] [Revised: 03/25/2024] [Accepted: 04/16/2024] [Indexed: 05/26/2024]
Abstract
Precise medical image segmentation of regions of interest (ROIs) is crucial for accurate disease diagnosis and progression assessment. However, acquiring high-quality annotated data at the pixel level poses a significant challenge due to the resource-intensive nature of this process. This scarcity of high-quality annotated data results in few-shot scenarios, which are highly prevalent in clinical applications. To address this obstacle, this paper introduces Agent-Guided SAM (AGSAM), an innovative approach that transforms the Segment Anything Model (SAM) into a fully automated segmentation method by automating prompt generation. Capitalizing on the pre-trained feature extraction and decoding capabilities of SAM-Med2D, AGSAM circumvents the need for manual prompt engineering, ensuring adaptability across diverse segmentation methods. Furthermore, the proposed feature augmentation convolution module (FACM) enhances model accuracy by promoting stable feature representations. Experimental evaluations demonstrate AGSAM's consistent superiority over other methods across various metrics. These findings highlight AGSAM's efficacy in tackling the challenges associated with limited annotated data while achieving high-quality medical image segmentation.
Affiliation(s)
- Hao Zhou
- State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangzhou 510000, China; (H.Z.); (Y.H.)
- Yao He
- State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangzhou 510000, China; (H.Z.); (Y.H.)
- Xiaoxiao Cui
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250000, China
- Zhi Xie
- State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangzhou 510000, China; (H.Z.); (Y.H.)
17
Yang X, Zheng Y, Mei C, Jiang G, Tian B, Wang L. UGLS: an uncertainty guided deep learning strategy for accurate image segmentation. Front Physiol 2024; 15:1362386. [PMID: 38651048 PMCID: PMC11033460 DOI: 10.3389/fphys.2024.1362386] [Received: 01/05/2024] [Accepted: 03/26/2024] [Indexed: 04/25/2024]
Abstract
Accurate image segmentation plays a crucial role in computer vision and medical image analysis. In this study, we developed a novel uncertainty guided deep learning strategy (UGLS) to enhance the performance of an existing neural network (i.e., U-Net) in segmenting multiple objects of interest from images of varying modalities. In the developed UGLS, a boundary uncertainty map is introduced for each object based on its coarse segmentation (obtained by the U-Net) and then combined with the input images for fine segmentation of the objects. We validated the developed method by segmenting optic cup (OC) regions from color fundus images and left and right lung regions from X-ray images. Experiments on public fundus and X-ray image datasets showed that the developed method achieved an average Dice score (DS) of 0.8791 and a sensitivity (SEN) of 0.8858 for OC segmentation, and scores of 0.9605, 0.9607, 0.9621, and 0.9668 for left and right lung segmentation, respectively. Our method significantly improved the segmentation performance of the U-Net, making it comparable or superior to five sophisticated networks (i.e., AU-Net, BiO-Net, AS-Net, Swin-Unet, and TransUNet).
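A boundary uncertainty map of the kind described above can be sketched as the per-pixel binary entropy of the coarse segmentation's probability map: pixels near 0.5 (typically around object boundaries) get uncertainty near 1, confident pixels near 0. Whether UGLS uses entropy or another measure is an assumption here:

```python
import math

def boundary_uncertainty(prob_map):
    """Per-pixel binary entropy of a coarse segmentation probability map.
    This map would then be concatenated with the input image for the
    fine-segmentation pass (sketch; eps guards log(0))."""
    eps = 1e-12
    return [[-(p * math.log2(p + eps) + (1 - p) * math.log2(1 - p + eps))
             for p in row] for row in prob_map]
```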
Affiliation(s)
- Xiaoguo Yang
- Wenzhou People’s Hospital, The Third Affiliated Hospital of Shanghai University, Wenzhou, China
- Yanyan Zheng
- Wenzhou People’s Hospital, The Third Affiliated Hospital of Shanghai University, Wenzhou, China
- Chenyang Mei
- School of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou, China
- Gaoqiang Jiang
- School of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou, China
- Bihan Tian
- School of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou, China
- Lei Wang
- School of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou, China
18
Yap BP, Kelvin LZ, Toh EQ, Low KY, Rani SK, Goh EJH, Hui VYC, Ng BK, Lim TH. Generalizability of Deep Neural Networks for Vertical Cup-to-Disc Ratio Estimation in Ultra-Widefield and Smartphone-Based Fundus Images. Transl Vis Sci Technol 2024; 13:6. [PMID: 38568608 PMCID: PMC10996969 DOI: 10.1167/tvst.13.4.6] [Received: 10/12/2023] [Accepted: 02/19/2024] [Indexed: 04/05/2024]
Abstract
Purpose To develop and validate a deep learning system (DLS) for estimation of vertical cup-to-disc ratio (vCDR) in ultra-widefield (UWF) and smartphone-based fundus images. Methods A DLS consisting of two sequential convolutional neural networks (CNNs) to delineate optic disc (OD) and optic cup (OC) boundaries was developed using 800 standard fundus images from the public REFUGE data set. The CNNs were tested on 400 test images from the REFUGE data set and 296 UWF and 300 smartphone-based images from a teleophthalmology clinic. vCDRs derived from the delineated OD/OC boundaries were compared with optometrists' annotations using mean absolute error (MAE). Subgroup analysis was conducted to study the impact of peripapillary atrophy (PPA), and correlation study was performed to investigate potential correlations between sectoral CDR (sCDR) and retinal nerve fiber layer (RNFL) thickness. Results The system achieved MAEs of 0.040 (95% CI, 0.037-0.043) in the REFUGE test images, 0.068 (95% CI, 0.061-0.075) in the UWF images, and 0.084 (95% CI, 0.075-0.092) in the smartphone-based images. There was no statistical significance in differences between PPA and non-PPA images. Weak correlation (r = -0.4046, P < 0.05) between sCDR and RNFL thickness was found only in the superior sector. Conclusions We developed a deep learning system that estimates vCDR from standard, UWF, and smartphone-based images. We also described anatomic peripapillary adversarial lesion and its potential impact on OD/OC delineation. Translational Relevance Artificial intelligence can estimate vCDR from different types of fundus images and may be used as a general and interpretable screening tool to improve community reach for diagnosis and management of glaucoma.
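Given the delineated OD/OC boundaries as binary masks, the vCDR and the MAE against annotations reduce to small computations. A minimal sketch with illustrative helper names (the DLS itself produces the masks):

```python
def vertical_diameter(mask):
    """Vertical extent (in rows) of a binary mask given as a list of rows."""
    rows = [i for i, row in enumerate(mask) if any(row)]
    return (max(rows) - min(rows) + 1) if rows else 0

def vcdr(cup_mask, disc_mask):
    """Vertical cup-to-disc ratio from OC/OD masks."""
    return vertical_diameter(cup_mask) / vertical_diameter(disc_mask)

def mean_absolute_error(preds, refs):
    """MAE between predicted and annotated vCDR values."""
    return sum(abs(p - r) for p, r in zip(preds, refs)) / len(preds)
```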
Affiliation(s)
- Boon Peng Yap
- School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore
- Li Zhenghao Kelvin
- Department of Ophthalmology, Tan Tock Seng Hospital, Singapore, Singapore
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
- National Healthcare Group Eye Institute, Singapore, Singapore
- En Qi Toh
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
- Kok Yao Low
- Department of Ophthalmology, Tan Tock Seng Hospital, Singapore, Singapore
- National Healthcare Group Eye Institute, Singapore, Singapore
- Sumaya Khan Rani
- Department of Ophthalmology, Tan Tock Seng Hospital, Singapore, Singapore
- National Healthcare Group Eye Institute, Singapore, Singapore
- Eunice Jin Hui Goh
- Department of Ophthalmology, Tan Tock Seng Hospital, Singapore, Singapore
- National Healthcare Group Eye Institute, Singapore, Singapore
- Vivien Yip Cherng Hui
- Department of Ophthalmology, Tan Tock Seng Hospital, Singapore, Singapore
- National Healthcare Group Eye Institute, Singapore, Singapore
- Beng Koon Ng
- School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore
- Tock Han Lim
- Department of Ophthalmology, Tan Tock Seng Hospital, Singapore, Singapore
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
- National Healthcare Group Eye Institute, Singapore, Singapore
19
Morano J, Aresta G, Grechenig C, Schmidt-Erfurth U, Bogunovic H. Deep Multimodal Fusion of Data With Heterogeneous Dimensionality via Projective Networks. IEEE J Biomed Health Inform 2024; 28:2235-2246. [PMID: 38206782 DOI: 10.1109/jbhi.2024.3352970] [Indexed: 01/13/2024]
Abstract
The use of multimodal imaging has led to significant improvements in the diagnosis and treatment of many diseases. Similar to clinical practice, some works have demonstrated the benefits of multimodal fusion for automatic segmentation and classification using deep learning-based methods. However, current segmentation methods are limited to fusion of modalities with the same dimensionality (e.g., 3D + 3D, 2D + 2D), which is not always possible, and the fusion strategies implemented by classification methods are incompatible with localization tasks. In this work, we propose a novel deep learning-based framework for the fusion of multimodal data with heterogeneous dimensionality (e.g., 3D + 2D) that is compatible with localization tasks. The proposed framework extracts the features of the different modalities and projects them into the common feature subspace. The projected features are then fused and further processed to obtain the final prediction. The framework was validated on the following tasks: segmentation of geographic atrophy (GA), a late-stage manifestation of age-related macular degeneration, and segmentation of retinal blood vessels (RBV) in multimodal retinal imaging. Our results show that the proposed method outperforms the state-of-the-art monomodal methods on GA and RBV segmentation by up to 3.10% and 4.64% Dice, respectively.
20
Zhou Z, Zheng Y, Zhou X, Yu J, Rong S. Self-supervised pre-training for joint optic disc and cup segmentation via attention-aware network. BMC Ophthalmol 2024; 24:98. [PMID: 38438876 PMCID: PMC10910696 DOI: 10.1186/s12886-024-03376-y] [Received: 12/18/2023] [Accepted: 02/28/2024] [Indexed: 03/06/2024]
Abstract
Image segmentation is a fundamental task in deep learning that underpins further analysis of image content. However, for supervised segmentation methods, collecting pixel-level labels is very time-consuming and labour-intensive. In the medical image processing area of optic disc and cup segmentation, we consider two challenging problems that remain unsolved. One is how to design an efficient network that captures the global context of the medical image and executes fast in real applications. The other is how to train a deep segmentation network with few training data, owing to medical privacy issues. In this paper, to tackle these issues, we first design a novel attention-aware segmentation model equipped with a multi-scale attention module in a pyramid-structured encoder-decoder network, which can efficiently learn the global semantics and long-range dependencies of the input images. Furthermore, we inject the prior knowledge that the optic cup lies inside the optic disc via a novel loss function. We then propose a self-supervised contrastive learning method for optic disc and cup segmentation. The unsupervised feature representation is learned by matching an encoded query to a dictionary of encoded keys using a contrastive technique. Fine-tuning the pre-trained model with the proposed loss function helps achieve good performance on the task. To validate the effectiveness of the proposed method, extensive systematic evaluations on challenging public optic disc and cup benchmarks, including the DRISHTI-GS and REFUGE datasets, demonstrate its superiority: it achieves new state-of-the-art performance, approaching F1 scores of 0.9801 and 0.9087, respectively, with Dice coefficients of 0.9657 for the disc and 0.8976 for the cup. The code will be made publicly available.
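The cup-inside-disc prior mentioned above can be illustrated as a hinge-style penalty on predicted probability maps, charging the model wherever cup probability exceeds disc probability. The exact loss in the paper may differ; treat this as a sketch:

```python
def cup_inside_disc_penalty(cup_prob, disc_prob):
    """Mean hinge penalty encoding the anatomical prior that the optic
    cup lies inside the optic disc: a pixel is penalized only when its
    predicted cup probability exceeds its disc probability."""
    total, n = 0.0, 0
    for crow, drow in zip(cup_prob, disc_prob):
        for c, d in zip(crow, drow):
            total += max(0.0, c - d)
            n += 1
    return total / n
```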
Affiliation(s)
- Zhiwang Zhou
- Institute for Advanced Study, Nanchang University, Nanchang, 330031, China.
- Yuanchang Zheng
- Institute for Advanced Study, Nanchang University, Nanchang, 330031, China
- Institute of Science and Technology, Waseda University, Tokyo, 63-8001, Japan
- Xiaoyu Zhou
- School of Transportation Engineering, Tongji University, Shanghai, 200000, China
- Jie Yu
- School of Electrical Automation and Information Engineering, Tianjin University, Tianjin, 300000, China
- Shangjie Rong
- School of Mathematical Sciences, Xiamen University, Xiamen, 361000, China
21
Song Y, Zhang W, Zhang Y. A novel lightweight deep learning approach for simultaneous optic cup and optic disc segmentation in glaucoma detection. Math Biosci Eng 2024; 21:5092-5117. [PMID: 38872528 DOI: 10.3934/mbe.2024225] [Indexed: 06/15/2024]
Abstract
Glaucoma is a chronic neurodegenerative disease that can result in irreversible vision loss if not treated in its early stages. The cup-to-disc ratio is a key criterion for glaucoma screening and diagnosis, and it is determined by dividing the area of the optic cup (OC) by that of the optic disc (OD) in fundus images. Consequently, the automatic and accurate segmentation of the OC and OD is a pivotal step in glaucoma detection. In recent years, numerous methods have achieved great success on this task. However, most existing methods either have unsatisfactory segmentation accuracy or high time costs. In this paper, we propose a lightweight deep-learning architecture for the simultaneous segmentation of the OC and OD, where we have adopted fuzzy learning and a multi-layer perceptron to simplify the learning complexity and improve segmentation accuracy. Experimental results demonstrate the superiority of our proposed method as compared to most state-of-the-art approaches in terms of both training time and segmentation accuracy.
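The screening criterion as described in this abstract, dividing OC area by OD area over the segmented masks, reduces to a one-line computation; a minimal sketch:

```python
def area_cdr(cup_mask, disc_mask):
    """Area-based cup-to-disc ratio |OC| / |OD| from binary masks
    (lists of 0/1 rows), as described in the abstract."""
    area = lambda m: sum(sum(row) for row in m)
    return area(cup_mask) / area(disc_mask)
```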
Affiliation(s)
- Yantao Song
- Institute of Big Data Science and Industry, Shanxi University, Taiyuan 030006, China
- School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
- Wenjie Zhang
- Institute of Big Data Science and Industry, Shanxi University, Taiyuan 030006, China
- School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
- Yue Zhang
- School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
22
Bragança CP, Torres JM, Macedo LO, Soares CPDA. Advancements in Glaucoma Diagnosis: The Role of AI in Medical Imaging. Diagnostics (Basel) 2024; 14:530. [PMID: 38473002 DOI: 10.3390/diagnostics14050530] [Received: 11/30/2023] [Revised: 02/17/2024] [Accepted: 02/23/2024] [Indexed: 03/14/2024]
Abstract
The application of artificial intelligence algorithms to digital image processing and the automatic diagnosis of glaucoma has been growing, delivering important advances toward better clinical care for the population. In this context, this article describes the main types of glaucoma and traditional forms of diagnosis, and presents the global epidemiology of the disease. Furthermore, it explores how artificial intelligence algorithms have been investigated as possible tools to aid the early diagnosis of this pathology through population screening. The related-work section presents the main studies and methodologies used in the automatic classification of glaucoma from digital fundus images, as well as the main publicly available databases of images labeled for glaucoma that can be used to train machine learning algorithms.
Affiliation(s)
- Clerimar Paulo Bragança
- ISUS Unit, Faculty of Science and Technology, University Fernando Pessoa, 4249-004 Porto, Portugal
- Department of Ophthalmology, Eye Hospital of Southern Minas Gerais State, Rua Joaquim Rosa 14, Itanhandu 37464-000, MG, Brazil
- José Manuel Torres
- ISUS Unit, Faculty of Science and Technology, University Fernando Pessoa, 4249-004 Porto, Portugal
- Artificial Intelligence and Computer Science Laboratory, LIACC, University of Porto, 4100-000 Porto, Portugal
- Luciano Oliveira Macedo
- Department of Ophthalmology, Eye Hospital of Southern Minas Gerais State, Rua Joaquim Rosa 14, Itanhandu 37464-000, MG, Brazil
- Christophe Pinto de Almeida Soares
- ISUS Unit, Faculty of Science and Technology, University Fernando Pessoa, 4249-004 Porto, Portugal
- Artificial Intelligence and Computer Science Laboratory, LIACC, University of Porto, 4100-000 Porto, Portugal
23
Yap BP, Ng BK. Coarse-to-fine visual representation learning for medical images via class activation maps. Comput Biol Med 2024; 171:108203. [PMID: 38430741] [DOI: 10.1016/j.compbiomed.2024.108203]
Abstract
The value of coarsely labeled datasets in learning transferable representations for medical images is investigated in this work. Compared to fine labels, which require meticulous effort to annotate, coarse labels can be acquired at a significantly lower cost and can provide useful training signals for data-hungry deep neural networks. We consider coarse labels in the form of binary labels differentiating a normal (healthy) image from an abnormal (diseased) image and propose CAMContrast, a two-stage representation learning framework for medical images. Using class activation maps, CAMContrast makes use of the binary labels to generate heatmaps as positive views for contrastive representation learning. Specifically, the learning objective is optimized to maximize the agreement within fixed crops of each image-heatmap pair to learn fine-grained representations that generalize to different downstream tasks. We empirically validate the transfer learning performance of CAMContrast on several public datasets, covering classification and segmentation tasks on fundus photographs and chest X-ray images. The experimental results showed that our method outperforms other self-supervised and supervised pretraining methods in terms of data efficiency and downstream performance.
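The class activation maps that CAMContrast uses to build heatmap views follow the classic CAM construction: a class-weighted sum of the final convolutional feature maps. A minimal numpy sketch (array shapes and normalization here are illustrative assumptions, not the paper's exact pipeline):

```python
import numpy as np

def class_activation_map(features: np.ndarray, weights: np.ndarray, cls: int) -> np.ndarray:
    """Classic CAM (Zhou et al., 2016): weighted sum of the final conv
    feature maps using the classifier weights of the target class.

    features: (K, H, W) feature maps; weights: (C, K) classifier matrix.
    Returns an (H, W) map normalized to [0, 1].
    """
    cam = np.tensordot(weights[cls], features, axes=1)  # contract over K -> (H, W)
    cam = np.maximum(cam, 0)                            # keep positive evidence only
    if cam.max() > 0:
        cam = cam / cam.max()                           # normalize to [0, 1]
    return cam

# Toy check: two 2x2 feature maps, single-class weights [1, 1]
feats = np.stack([np.ones((2, 2)), 2 * np.ones((2, 2))])
w = np.array([[1.0, 1.0]])
cam = class_activation_map(feats, w, cls=0)
```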
Affiliation(s)
- Boon Peng Yap
- School of Electrical and Electronic Engineering, Nanyang Technological University, 50 Nanyang Ave, 639798, Singapore; Centre for OptoElectronics and Biophotonics, Nanyang Technological University, 50 Nanyang Ave, 639798, Singapore.
- Beng Koon Ng
- School of Electrical and Electronic Engineering, Nanyang Technological University, 50 Nanyang Ave, 639798, Singapore; Centre for OptoElectronics and Biophotonics, Nanyang Technological University, 50 Nanyang Ave, 639798, Singapore.
24
Hasan MM, Phu J, Sowmya A, Meijering E, Kalloniatis M. Artificial intelligence in the diagnosis of glaucoma and neurodegenerative diseases. Clin Exp Optom 2024; 107:130-146. [PMID: 37674264] [DOI: 10.1080/08164622.2023.2235346]
Abstract
Artificial intelligence is a rapidly expanding field within computer science that encompasses the emulation of human intelligence by machines. Machine learning and deep learning, the two primary data-driven pattern analysis approaches under the umbrella of artificial intelligence, have attracted considerable interest in the last few decades. The evolution of technology has resulted in a substantial amount of artificial intelligence research on ophthalmic and neurodegenerative disease diagnosis using retinal images. Various artificial intelligence-based techniques have been used for diagnostic purposes, including traditional machine learning, deep learning, and their combinations. Presented here is a review of the literature covering the last 10 years on this topic, discussing the use of artificial intelligence in analysing data from different modalities and their combinations for the diagnosis of glaucoma and neurodegenerative diseases. The performance of published artificial intelligence methods varies due to several factors, yet the results suggest that such methods can potentially facilitate clinical diagnosis. Generally, the accuracy of artificial intelligence-assisted diagnosis ranges from 67% to 98%, and the area under the sensitivity-specificity curve (AUC) ranges from 0.71 to 0.98, which outperforms typical human performance of 71.5% accuracy and 0.86 area under the curve. This indicates that artificial intelligence-based tools can provide clinicians with useful information that can support improved diagnosis. The review suggests that there is room for improvement of existing artificial intelligence-based models using retinal imaging modalities before they are incorporated into clinical practice.
Affiliation(s)
- Md Mahmudul Hasan
- School of Computer Science and Engineering, University of New South Wales, Kensington, New South Wales, Australia
- Jack Phu
- School of Optometry and Vision Science, University of New South Wales, Kensington, Australia
- Centre for Eye Health, University of New South Wales, Sydney, New South Wales, Australia
- School of Medicine (Optometry), Deakin University, Waurn Ponds, Victoria, Australia
- Arcot Sowmya
- School of Computer Science and Engineering, University of New South Wales, Kensington, New South Wales, Australia
- Erik Meijering
- School of Computer Science and Engineering, University of New South Wales, Kensington, New South Wales, Australia
- Michael Kalloniatis
- School of Optometry and Vision Science, University of New South Wales, Kensington, Australia
- School of Medicine (Optometry), Deakin University, Waurn Ponds, Victoria, Australia
25
Kumari S, Singh P. Deep learning for unsupervised domain adaptation in medical imaging: Recent advancements and future perspectives. Comput Biol Med 2024; 170:107912. [PMID: 38219643] [DOI: 10.1016/j.compbiomed.2023.107912]
Abstract
Deep learning has demonstrated remarkable performance across various tasks in medical imaging. However, these approaches primarily focus on supervised learning, assuming that the training and testing data are drawn from the same distribution. Unfortunately, this assumption may not always hold true in practice. To address these issues, unsupervised domain adaptation (UDA) techniques have been developed to transfer knowledge from a labeled domain to a related but unlabeled domain. In recent years, significant advancements have been made in UDA, resulting in a wide range of methodologies, including feature alignment, image translation, self-supervision, and disentangled representation methods, among others. In this paper, we provide a comprehensive literature review of recent deep UDA approaches in medical imaging from a technical perspective. Specifically, we categorize current UDA research in medical imaging into six groups and further divide them into finer subcategories based on the different tasks they perform. We also discuss the respective datasets used in the studies to assess the divergence between the different domains. Finally, we discuss emerging areas and provide insights and discussions on future research directions to conclude this survey.
Affiliation(s)
- Suruchi Kumari
- Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, India.
- Pravendra Singh
- Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, India.
26
Gao XR, Wu F, Yuhas PT, Rasel RK, Chiariglione M. Automated vertical cup-to-disc ratio determination from fundus images for glaucoma detection. Sci Rep 2024; 14:4494. [PMID: 38396048] [PMCID: PMC10891153] [DOI: 10.1038/s41598-024-55056-y]
Abstract
Glaucoma is the leading cause of irreversible blindness worldwide. Often asymptomatic for years, the disease can progress significantly before patients become aware of the loss of visual function. Critical examination of the optic nerve through ophthalmoscopy or fundus images is a crucial component of glaucoma detection before the onset of vision loss. The vertical cup-to-disc ratio (VCDR) is a key structural indicator for glaucoma, as thinning of the superior and inferior neuroretinal rim is a hallmark of the disease. However, manual assessment of fundus images is both time-consuming and subject to variability based on clinician expertise and interpretation. In this study, we develop a robust and accurate automated system employing deep learning (DL) techniques, specifically the YOLOv7 architecture, for detecting the optic disc and optic cup in fundus images and subsequently calculating the VCDR. We also address the often-overlooked issue of adapting a DL model, initially trained on a specific population (e.g., European), for VCDR estimation in a different population. Our model was initially trained on ten publicly available datasets and subsequently fine-tuned on the REFUGE dataset, which comprises images collected from Chinese patients. The DL-derived VCDR displayed exceptional accuracy, achieving a Pearson correlation coefficient of 0.91 (P = 4.12 × 10⁻⁴¹²) and a mean absolute error (MAE) of 0.0347 when compared to assessments by human experts. Our models also surpassed existing approaches on the REFUGE dataset, demonstrating higher Dice similarity coefficients and lower MAEs. Moreover, we developed an optimization approach capable of calibrating DL results for new populations. Our novel approaches for detecting the optic disc and cup and calculating the VCDR offer clinicians a promising tool that significantly reduces the manual workload of image assessment while improving both speed and accuracy. Most importantly, this automated method effectively differentiates between glaucoma and non-glaucoma cases, making it a valuable asset for glaucoma detection.
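The VCDR computed downstream of the detections is a ratio of vertical extents of the cup and disc regions. A minimal numpy sketch of that geometric step (the YOLOv7 detector itself is not reproduced; the masks are toy data):

```python
import numpy as np

def vertical_extent(mask: np.ndarray) -> int:
    """Number of image rows spanned by a binary mask."""
    rows = np.flatnonzero(mask.any(axis=1))
    return 0 if rows.size == 0 else int(rows[-1] - rows[0] + 1)

def vertical_cdr(cup_mask: np.ndarray, disc_mask: np.ndarray) -> float:
    """Vertical cup-to-disc ratio (VCDR): ratio of the vertical extents
    of the segmented cup and disc."""
    disc_h = vertical_extent(disc_mask)
    if disc_h == 0:
        raise ValueError("empty optic disc mask")
    return vertical_extent(cup_mask) / disc_h

disc = np.zeros((16, 16), dtype=bool)
disc[2:12, 2:12] = True          # disc spans 10 rows
cup = np.zeros_like(disc)
cup[5:9, 5:9] = True             # cup spans 4 rows
print(vertical_cdr(cup, disc))   # 0.4
```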
Affiliation(s)
- Xiaoyi Raymond Gao
- Department of Ophthalmology and Visual Sciences, The Ohio State University, Columbus, OH, 43210, USA.
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, 43210, USA.
- Division of Human Genetics, The Ohio State University, Columbus, OH, 43210, USA.
- College of Optometry, The Ohio State University, Columbus, OH, USA.
- Fengze Wu
- Department of Ophthalmology and Visual Sciences, The Ohio State University, Columbus, OH, 43210, USA
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, 43210, USA
- Phillip T Yuhas
- College of Optometry, The Ohio State University, Columbus, OH, USA
- Rafiul Karim Rasel
- Department of Ophthalmology and Visual Sciences, The Ohio State University, Columbus, OH, 43210, USA
- Marion Chiariglione
- Department of Ophthalmology and Visual Sciences, The Ohio State University, Columbus, OH, 43210, USA
27
Pandey PU, Ballios BG, Christakis PG, Kaplan AJ, Mathew DJ, Ong Tone S, Wan MJ, Micieli JA, Wong JCY. Ensemble of deep convolutional neural networks is more accurate and reliable than board-certified ophthalmologists at detecting multiple diseases in retinal fundus photographs. Br J Ophthalmol 2024; 108:417-423. [PMID: 36720585] [PMCID: PMC10894841] [DOI: 10.1136/bjo-2022-322183]
Abstract
AIMS To develop an algorithm to classify multiple retinal pathologies accurately and reliably from fundus photographs and to validate its performance against human experts. METHODS We trained a deep convolutional ensemble (DCE), an ensemble of five convolutional neural networks (CNNs), to classify retinal fundus photographs into diabetic retinopathy (DR), glaucoma, age-related macular degeneration (AMD) and normal eyes. The CNN architecture was based on the InceptionV3 model, and initial weights were pretrained on the ImageNet dataset. We used 43 055 fundus images from 12 public datasets. Five trained ensembles were then tested on an 'unseen' set of 100 images. Seven board-certified ophthalmologists were asked to classify these test images. RESULTS Board-certified ophthalmologists achieved a mean accuracy of 72.7% over all classes, while the DCE achieved a mean accuracy of 79.2% (p=0.03). The DCE had a statistically significant higher mean F1-score for DR classification compared with the ophthalmologists (76.8% vs 57.5%; p=0.01) and greater but statistically non-significant mean F1-scores for glaucoma (83.9% vs 75.7%; p=0.10), AMD (85.9% vs 85.2%; p=0.69) and normal eyes (73.0% vs 70.5%; p=0.39). The DCE also had greater mean agreement between accuracy and confidence (81.6% vs 70.3%; p<0.001). DISCUSSION We developed a deep learning model and found that it could more accurately and reliably classify four categories of fundus images compared with board-certified ophthalmologists. This work provides proof-of-principle that an algorithm is capable of accurate and reliable recognition of multiple retinal diseases using only fundus photographs.
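The abstract does not specify how the five CNN outputs are fused; soft voting (averaging class probabilities, then taking the argmax) is a standard choice for such ensembles. A hedged numpy sketch under that assumption:

```python
import numpy as np

def ensemble_predict(prob_list):
    """Soft-voting ensemble: average per-model class probabilities and
    pick the argmax. The DCE's exact fusion rule is assumed, not stated
    in the abstract."""
    mean_probs = np.mean(np.stack(prob_list, axis=0), axis=0)
    return int(np.argmax(mean_probs)), mean_probs

# Toy example: three models scoring classes [DR, glaucoma, AMD, normal]
models = [
    np.array([0.6, 0.2, 0.1, 0.1]),
    np.array([0.4, 0.4, 0.1, 0.1]),
    np.array([0.5, 0.3, 0.1, 0.1]),
]
label, probs = ensemble_predict(models)
print(label)  # 0 (DR)
```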
Affiliation(s)
- Prashant U Pandey
- School of Biomedical Engineering, The University of British Columbia, Vancouver, British Columbia, Canada
- Brian G Ballios
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada
- Krembil Research Institute, University Health Network, Toronto, Ontario, Canada
- Kensington Vision and Research Centre and Kensington Research Institute, Toronto, Ontario, Canada
- Panos G Christakis
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada
- Kensington Vision and Research Centre and Kensington Research Institute, Toronto, Ontario, Canada
- Alexander J Kaplan
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada
- David J Mathew
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada
- Krembil Research Institute, University Health Network, Toronto, Ontario, Canada
- Kensington Vision and Research Centre and Kensington Research Institute, Toronto, Ontario, Canada
- Stephan Ong Tone
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada
- Sunnybrook Research Institute, Toronto, Ontario, Canada
- Michael J Wan
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada
- Jonathan A Micieli
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada
- Kensington Vision and Research Centre and Kensington Research Institute, Toronto, Ontario, Canada
- Department of Ophthalmology, St. Michael's Hospital, Unity Health, Toronto, Ontario, Canada
- Jovi C Y Wong
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada
28
Mei C, Yang X, Zhou M, Zhang S, Chen H, Yang X, Wang L. Semi-supervised image segmentation using a residual-driven mean teacher and an exponential Dice loss. Artif Intell Med 2024; 148:102757. [PMID: 38325920] [DOI: 10.1016/j.artmed.2023.102757]
Abstract
Semi-supervised segmentation plays an important role in computer vision and medical image analysis and can alleviate the burden of acquiring abundant expert-annotated images. In this paper, we developed a residual-driven semi-supervised segmentation method (termed RDMT) based on the classical mean teacher (MT) framework by introducing a novel model-level residual perturbation and an exponential Dice (eDice) loss. The introduced perturbation was integrated into the exponential moving average (EMA) scheme to enhance the performance of the MT, while the eDice loss was used to improve the detection sensitivity of a given network to object boundaries. We validated the developed method by applying it to segment the 3D left atrium (LA) and the 2D optic cup (OC) from the public LASC and REFUGE datasets, based on the V-Net and U-Net, respectively. Extensive experiments demonstrated that the developed method achieved average Dice scores of 0.8776 and 0.7751 when trained on 10% and 20% labeled images, respectively, for the LA and OC regions depicted on the LASC and REFUGE datasets. It significantly outperformed the MT and can compete with several existing semi-supervised segmentation methods (i.e., HCMT, UAMT, DTC and SASS).
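The core of the mean teacher framework is the exponential moving average (EMA) update of the teacher weights from the student weights. The abstract does not give the exact form of the eDice loss, so the version below follows the exponential-logarithmic Dice family and should be read as an assumption:

```python
import numpy as np

def ema_update(teacher: dict, student: dict, alpha: float = 0.99) -> dict:
    """Core mean teacher update: teacher weights track an exponential
    moving average of the student weights."""
    return {k: alpha * teacher[k] + (1.0 - alpha) * student[k] for k in teacher}

def dice_score(pred: np.ndarray, target: np.ndarray, eps: float = 1e-6) -> float:
    inter = np.sum(pred * target)
    return (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)

def exp_dice_loss(pred: np.ndarray, target: np.ndarray, gamma: float = 0.3) -> float:
    # Hypothetical eDice form, (-ln Dice)^gamma, following
    # exponential-logarithmic Dice losses; the paper's exact
    # formulation may differ.
    return float((-np.log(dice_score(pred, target))) ** gamma)

# One EMA step: teacher w=1.0, student w=0.0, alpha=0.9 -> teacher w=0.9
teacher = ema_update({"w": 1.0}, {"w": 0.0}, alpha=0.9)
```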
Affiliation(s)
- Chenyang Mei
- School of Ophthalmology & Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- Xiaoguo Yang
- Department of Neurology, Wenzhou People's Hospital, The Third Affiliated Hospital of Shanghai University, Wenzhou 325041, China
- Mi Zhou
- Department of Neurology, Wenzhou People's Hospital, The Third Affiliated Hospital of Shanghai University, Wenzhou 325041, China; School of Medicine, Shanghai University, Shanghai 200444, China
- Shaodan Zhang
- School of Ophthalmology & Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- Hao Chen
- School of Ophthalmology & Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- Xiaokai Yang
- Department of Neurology, Wenzhou People's Hospital, The Third Affiliated Hospital of Shanghai University, Wenzhou 325041, China; School of Medicine, Shanghai University, Shanghai 200444, China.
- Lei Wang
- School of Ophthalmology & Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China.
29
Zhu Y, Salowe R, Chow C, Li S, Bastani O, O'Brien JM. Advancing Glaucoma Care: Integrating Artificial Intelligence in Diagnosis, Management, and Progression Detection. Bioengineering (Basel) 2024; 11:122. [PMID: 38391608] [PMCID: PMC10886285] [DOI: 10.3390/bioengineering11020122]
Abstract
Glaucoma, the leading cause of irreversible blindness worldwide, comprises a group of progressive optic neuropathies requiring early detection and lifelong treatment to preserve vision. Artificial intelligence (AI) technologies are now demonstrating transformative potential across the spectrum of clinical glaucoma care. This review summarizes current capabilities, future outlooks, and practical translation considerations. For enhanced screening, algorithms analyzing retinal photographs and machine learning models synthesizing risk factors can identify high-risk patients needing diagnostic workup and close follow-up. To augment definitive diagnosis, deep learning techniques detect characteristic glaucomatous patterns by interpreting results from optical coherence tomography, visual field testing, fundus photography, and other ocular imaging. AI-powered platforms also enable continuous monitoring, with algorithms that analyze longitudinal data alerting physicians about rapid disease progression. By integrating predictive analytics with patient-specific parameters, AI can also guide precision medicine for individualized glaucoma treatment selections. Advances in robotic surgery and computer-based guidance demonstrate AI's potential to improve surgical outcomes and surgical training. Beyond the clinic, AI chatbots and reminder systems could provide patient education and counseling to promote medication adherence. However, thoughtful approaches to clinical integration, usability, diversity, and ethical implications remain critical to successfully implementing these emerging technologies. This review highlights AI's vast capabilities to transform glaucoma care while summarizing key achievements, future prospects, and practical considerations to progress from bench to bedside.
Affiliation(s)
- Yan Zhu
- Department of Ophthalmology, Scheie Eye Institute, University of Pennsylvania, Philadelphia, PA 19104, USA
- Rebecca Salowe
- Department of Ophthalmology, Scheie Eye Institute, University of Pennsylvania, Philadelphia, PA 19104, USA
- Caven Chow
- Department of Ophthalmology, Scheie Eye Institute, University of Pennsylvania, Philadelphia, PA 19104, USA
- Shuo Li
- Department of Computer & Information Science, University of Pennsylvania, Philadelphia, PA 19104, USA
- Osbert Bastani
- Department of Computer & Information Science, University of Pennsylvania, Philadelphia, PA 19104, USA
- Joan M O'Brien
- Department of Ophthalmology, Scheie Eye Institute, University of Pennsylvania, Philadelphia, PA 19104, USA
30
Fang H, Li F, Wu J, Fu H, Sun X, Orlando JI, Bogunović H, Zhang X, Xu Y. Open Fundus Photograph Dataset with Pathologic Myopia Recognition and Anatomical Structure Annotation. Sci Data 2024; 11:99. [PMID: 38245589] [PMCID: PMC10799845] [DOI: 10.1038/s41597-024-02911-2]
Abstract
Pathologic myopia (PM) is a common blinding retinal degeneration affecting the highly myopic population. Early screening of this condition can reduce the damage caused by the associated fundus lesions and therefore prevent vision loss. Automated diagnostic tools based on artificial intelligence methods can benefit this process by aiding clinicians to identify disease signs or to screen mass populations using color fundus photographs as inputs. This paper provides insights about PALM, our open fundus imaging dataset for pathologic myopia recognition and anatomical structure annotation. Our database comprises 1200 images with associated labels for the pathologic myopia category and manual annotations of the optic disc, the position of the fovea, and delineations of lesions such as patchy retinal atrophy (including peripapillary atrophy) and retinal detachment. In addition, this paper elaborates on other details such as the labeling process used to construct the database and the quality and characteristics of the samples, and provides other relevant usage notes.
Affiliation(s)
- Huihui Fang
- South China University of Technology, Guangzhou, China
- Pazhou Lab., Guangzhou, China
- Fei Li
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangzhou, China
- Junde Wu
- National University of Singapore, Singapore, Singapore
- Huazhu Fu
- Institute of High Performance Computing, Agency for Science, Technology and Research, Singapore, Singapore
- Xu Sun
- Pazhou Lab., Guangzhou, China
- Hrvoje Bogunović
- Christian Doppler Lab for Artificial Intelligence in Retina, Department of Ophthalmology and Optometry, Medical University of Vienna, Vienna, Austria
- Xiulan Zhang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangzhou, China.
- Yanwu Xu
- South China University of Technology, Guangzhou, China.
- Pazhou Lab., Guangzhou, China.
31
Tang S, Song C, Wang D, Gao Y, Liu Y, Lv W. W-Net: A boundary-aware cascade network for robust and accurate optic disc segmentation. iScience 2024; 27:108247. [PMID: 38230262] [PMCID: PMC10790032] [DOI: 10.1016/j.isci.2023.108247]
Abstract
Accurate optic disc (OD) segmentation is of great significance for computer-aided diagnosis of different types of eye diseases. Due to differences in image acquisition equipment and acquisition methods, the resolution, size, contrast, and clarity of images from different datasets vary significantly, resulting in poor generalization performance of deep learning networks. To solve this problem, this study proposes a multi-level segmentation network comprising a data quality enhancement module (DQEM), a coarse segmentation module (CSM), an OD localization module (OLM), and a fine segmentation module (FSM). In the FSM, W-Net is proposed for the first time, and a boundary loss is introduced into the loss function, which effectively improves OD segmentation performance. We evaluated the generalization of the model on the REFUGE test dataset, the GAMMA dataset, the Drishti-GS1 dataset, and the IDRiD dataset. The results show that our method achieves the best OD segmentation performance across datasets compared with state-of-the-art networks.
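The abstract introduces a boundary loss without giving its form; one common boundary-aware formulation (Kervadec et al.) weights the predicted foreground probabilities by a signed distance map of the ground-truth boundary. A hedged numpy sketch combining such a term with a Dice term (the paper's exact loss may differ; the distance map here is a toy stand-in):

```python
import numpy as np

def boundary_loss(probs: np.ndarray, signed_dist: np.ndarray) -> float:
    """Boundary-aware term: mean of predicted foreground probabilities
    weighted by a signed distance map of the ground-truth boundary
    (negative inside the object, positive outside)."""
    return float(np.mean(probs * signed_dist))

def combined_loss(probs, target, signed_dist, lam=0.1, eps=1e-6):
    """Dice loss plus a weighted boundary term, a common combination
    for edge-sensitive segmentation training."""
    inter = np.sum(probs * target)
    dice = (2.0 * inter + eps) / (np.sum(probs) + np.sum(target) + eps)
    return (1.0 - dice) + lam * boundary_loss(probs, signed_dist)

# Toy check: a perfect prediction scores lower than an inverted one
target = np.zeros((4, 4))
target[1:3, 1:3] = 1.0
sdist = np.where(target > 0, -1.0, 1.0)  # crude signed "distance" map
good = combined_loss(target, target, sdist)
bad = combined_loss(1.0 - target, target, sdist)
```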
Affiliation(s)
- Shuo Tang
- School of Instrumentation and Optoelectronic Engineering, Beihang University, Beijing 100191, China
- Chongchong Song
- School of Instrumentation and Optoelectronic Engineering, Beihang University, Beijing 100191, China
- Defeng Wang
- School of Instrumentation and Optoelectronic Engineering, Beihang University, Beijing 100191, China
- Yang Gao
- School of Instrumentation and Optoelectronic Engineering, Beihang University, Beijing 100191, China
- Yuchen Liu
- School of Instrumentation and Optoelectronic Engineering, Beihang University, Beijing 100191, China
- Wang Lv
- School of Instrumentation and Optoelectronic Engineering, Beihang University, Beijing 100191, China
32
Azad R, Kazerouni A, Heidari M, Aghdam EK, Molaei A, Jia Y, Jose A, Roy R, Merhof D. Advances in medical image analysis with vision Transformers: A comprehensive review. Med Image Anal 2024; 91:103000. [PMID: 37883822] [DOI: 10.1016/j.media.2023.103000]
Abstract
The remarkable performance of the Transformer architecture in natural language processing has recently also triggered broad interest in Computer Vision. Among other merits, Transformers have been shown to be capable of learning long-range dependencies and spatial correlations, a clear advantage over convolutional neural networks (CNNs), which have so far been the de facto standard in Computer Vision. Thus, Transformers have become an integral part of modern medical image analysis. In this review, we provide an encyclopedic survey of the applications of Transformers in medical imaging. Specifically, we present a systematic and thorough review of recent Transformer literature for different medical image analysis tasks, including classification, segmentation, detection, registration, synthesis, and clinical report generation. For each of these applications, we investigate the novelty, strengths and weaknesses of the different proposed strategies and develop taxonomies highlighting key properties and contributions. Further, if applicable, we outline current benchmarks on different datasets. Finally, we summarize key challenges and discuss future research directions. In addition, we have provided the cited papers with their corresponding implementations at https://github.com/mindflow-institue/Awesome-Transformer.
Affiliation(s)
- Reza Azad
- Faculty of Electrical Engineering and Information Technology, RWTH Aachen University, Aachen, Germany
- Amirhossein Kazerouni
- School of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran
- Moein Heidari
- School of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran
- Amirali Molaei
- School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
- Yiwei Jia
- Faculty of Electrical Engineering and Information Technology, RWTH Aachen University, Aachen, Germany
- Abin Jose
- Faculty of Electrical Engineering and Information Technology, RWTH Aachen University, Aachen, Germany
- Rijo Roy
- Faculty of Electrical Engineering and Information Technology, RWTH Aachen University, Aachen, Germany
- Dorit Merhof
- Faculty of Informatics and Data Science, University of Regensburg, Regensburg, Germany; Fraunhofer Institute for Digital Medicine MEVIS, Bremen, Germany.
33
Wang J, Jin Y, Stoyanov D, Wang L. FedDP: Dual Personalization in Federated Medical Image Segmentation. IEEE Trans Med Imaging 2024; 43:297-308. [PMID: 37494156] [DOI: 10.1109/tmi.2023.3299206]
Abstract
Personalized federated learning (PFL) addresses the data heterogeneity challenge faced by general federated learning (GFL). Rather than learning a single global model, with PFL a collection of models is adapted to the unique feature distribution of each site. However, current PFL methods rarely consider self-attention networks, which can handle data heterogeneity through long-range dependency modeling, and they do not utilize prediction inconsistencies in local models as an indicator of site uniqueness. In this paper, we propose FedDP, a novel federated learning scheme with dual personalization, which improves model personalization from both the feature and prediction aspects to boost image segmentation results. We leverage long-range dependencies by designing a local query (LQ) that decouples the query embedding layer out of each local model, whose parameters are trained privately to better adapt to the respective feature distribution of the site. We then propose inconsistency-guided calibration (IGC), which exploits the inter-site prediction inconsistencies to accommodate the model learning concentration. By encouraging a model to penalize pixels with larger inconsistencies, we better tailor prediction-level patterns to each local site. Experimentally, we compare FedDP with the state-of-the-art PFL methods on two popular medical image segmentation tasks with different modalities, where our results consistently outperform others on both tasks. Our code and models are available at https://github.com/jcwang123/PFL-Seg-Trans.
34
Sigut J, Fumero F, Estévez J, Alayón S, Díaz-Alemán T. In-Depth Evaluation of Saliency Maps for Interpreting Convolutional Neural Network Decisions in the Diagnosis of Glaucoma Based on Fundus Imaging. Sensors (Basel) 2023; 24:239. [PMID: 38203101 PMCID: PMC10781365 DOI: 10.3390/s24010239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 12/14/2023] [Accepted: 12/29/2023] [Indexed: 01/12/2024]
Abstract
Glaucoma, a leading cause of blindness, damages the optic nerve, making early diagnosis challenging because the disease initially presents no symptoms. Fundus eye images taken with a non-mydriatic retinograph help diagnose glaucoma by revealing structural changes, including those of the optic disc and cup. This research aims to thoroughly analyze saliency maps in interpreting convolutional neural network decisions for diagnosing glaucoma from fundus images. These maps highlight the image regions that most influence the network's decisions. Various network architectures were trained and tested on 739 optic nerve head images, with nine saliency methods used. Other popular datasets were also used for further validation. The results reveal disparities among saliency maps, with some consensus between the folds corresponding to the same architecture. Concerning the significance of optic disc sectors, there is generally a lack of agreement with standard medical criteria. The background, nasal, and temporal sectors emerge as particularly influential for neural network decisions, with a likelihood of being the most relevant ranging from 14.55% to 28.16% on average across all evaluated datasets. We conclude that saliency maps are usually difficult to interpret, and even the areas indicated as most relevant can be very unintuitive. Their usefulness as an explanatory tool may therefore be compromised, at least in problems such as the one addressed in this study, where the features defining the model prediction are generally not consistently reflected in relevant regions of the saliency maps and cannot always be related to those used as medical standards.
Affiliation(s)
- Jose Sigut
- Department of Computer Science and Systems Engineering, Universidad de La Laguna, Camino San Francisco de Paula, 19, La Laguna, 38203 Santa Cruz de Tenerife, Spain
- Francisco Fumero
- Department of Computer Science and Systems Engineering, Universidad de La Laguna, Camino San Francisco de Paula, 19, La Laguna, 38203 Santa Cruz de Tenerife, Spain
- José Estévez
- Department of Computer Science and Systems Engineering, Universidad de La Laguna, Camino San Francisco de Paula, 19, La Laguna, 38203 Santa Cruz de Tenerife, Spain
- Silvia Alayón
- Department of Computer Science and Systems Engineering, Universidad de La Laguna, Camino San Francisco de Paula, 19, La Laguna, 38203 Santa Cruz de Tenerife, Spain
- Tinguaro Díaz-Alemán
- Department of Ophthalmology, Hospital Universitario de Canarias, Carretera Ofra S/N, La Laguna, 38320 Santa Cruz de Tenerife, Spain
35
Xu Y, Yang W. Editorial: Artificial intelligence applications in chronic ocular diseases. Front Cell Dev Biol 2023; 11:1295850. [PMID: 38143924 PMCID: PMC10740206 DOI: 10.3389/fcell.2023.1295850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2023] [Accepted: 11/28/2023] [Indexed: 12/26/2023]
Affiliation(s)
- Yanwu Xu
- School of Future Technology, South China University of Technology, Guangzhou, Guangdong Province, China
- Pazhou Lab, Guangzhou, Guangdong Province, China
- Weihua Yang
- Shenzhen Eye Institute, Shenzhen Eye Hospital, Jinan University, Shenzhen, Guangdong Province, China
36
Wang Z, Wang J, Zhang H, Yan C, Wang X, Wen X. Mstnet: method for glaucoma grading based on multimodal feature fusion of spatial relations. Phys Med Biol 2023; 68:245002. [PMID: 37857309 DOI: 10.1088/1361-6560/ad0520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2023] [Accepted: 10/19/2023] [Indexed: 10/21/2023]
Abstract
Objective. The objective of this study is to develop an efficient multimodal learning framework for the classification of glaucoma. Glaucoma is a group of eye diseases that can result in vision loss and blindness, often due to delayed detection and treatment. Fundus images and optical coherence tomography (OCT) images have proven valuable for the diagnosis and management of glaucoma. However, current models that combine features from both modalities often lack efficient spatial relationship modeling. Approach. In this study, we propose an innovative approach to the classification of glaucoma. We focus on leveraging the features of OCT volumes and harness the capabilities of transformer models to capture long-range spatial relationships. To achieve this, we introduce a 3D transformer model to extract features from OCT volumes, enhancing the model's effectiveness, and employ downsampling techniques to improve model efficiency. We then utilize the spatial feature relationships between OCT volumes and fundus images to fuse the features extracted from both sources. Main results. Our proposed framework has yielded remarkable results, particularly in glaucoma grading performance. We conducted our experiments on the GAMMA dataset, and our approach outperformed traditional feature fusion methods. By effectively modeling spatial relationships and combining OCT volume and fundus image features, our framework achieved outstanding classification results. Significance. This research is of significant importance for glaucoma diagnosis and management. Efficient and accurate glaucoma classification is essential for timely intervention and prevention of vision loss. Our proposed approach, which integrates 3D transformer models, offers a novel way to extract and fuse features from OCT volumes and fundus images, ultimately enhancing the effectiveness of glaucoma classification. This work has the potential to contribute to improved patient care, particularly in the early detection and treatment of glaucoma, thereby reducing the risk of vision impairment and blindness.
Affiliation(s)
- Zhizhou Wang
- No. 209, University Street, Yuci District, Jinzhong City, Shanxi Province, People's Republic of China
- Jun Wang
- No. 209, University Street, Yuci District, Jinzhong City, Shanxi Province, People's Republic of China
- Hongru Zhang
- No. 209, University Street, Yuci District, Jinzhong City, Shanxi Province, People's Republic of China
- Chen Yan
- No. 209, University Street, Yuci District, Jinzhong City, Shanxi Province, People's Republic of China
- Xingkui Wang
- No. 209, University Street, Yuci District, Jinzhong City, Shanxi Province, People's Republic of China
- Xin Wen
- No. 209, University Street, Yuci District, Jinzhong City, Shanxi Province, People's Republic of China
37
Wu JCH, Yu HW, Tsai TH, Lu HHS. Dynamically Synthetic Images for Federated Learning of medical images. Comput Methods Programs Biomed 2023; 242:107845. [PMID: 37852147 DOI: 10.1016/j.cmpb.2023.107845] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Revised: 09/28/2023] [Accepted: 10/03/2023] [Indexed: 10/20/2023]
Abstract
BACKGROUND To develop deep learning models for medical diagnosis, it is important to collect medical data from several institutions. Due to privacy regulations, it is infeasible to gather data from various medical institutions into one site for centralized learning. Federated Learning (FL) provides a feasible approach to jointly training a deep learning model on data stored at various medical institutions rather than collected together. However, the resulting FL models can be biased towards institutions with larger training datasets. METHODOLOGY In this study, we propose Dynamically Synthetic Images for Federated Learning (DSIFL), a method that aims to integrate the information of local institutions with heterogeneous types of data. The core technique of DSIFL is a synthesis method that dynamically adjusts the number of synthetic images resembling local data that are misclassified by the current model. The resulting global model can handle the diversity of heterogeneous data collected at local medical institutions by including synthetic images similar to misclassified cases in training. RESULTS In model evaluation, we focus on the accuracy on each client's dataset. In our experiments, DSIFL achieves higher accuracy than the conventional FL approach. CONCLUSION In this study, we propose the DSIFL framework, which improves on the conventional FL approach. We conduct empirical studies with two kinds of medical images and compare the performance of FL variants vs. DSIFL. The performance of individual training serves as the baseline, and the performance of centralized learning as the target for the comparison studies. The empirical findings suggest that DSIFL improves on FL through its technique of dynamically synthesizing training images.
Affiliation(s)
- Jacky Chung-Hao Wu
- Institute of Statistics, National Yang Ming Chiao Tung University, Hsinchu, Taiwan, ROC
- Hsuan-Wen Yu
- Institute of Statistics, National Yang Ming Chiao Tung University, Hsinchu, Taiwan, ROC
- Tsung-Hung Tsai
- Institute of Statistics, National Yang Ming Chiao Tung University, Hsinchu, Taiwan, ROC
- Henry Horng-Shing Lu
- Institute of Statistics, National Yang Ming Chiao Tung University, Hsinchu, Taiwan, ROC; Department of Statistics and Data Science, Cornell University, New York, USA.
38
Wu J, Fang H, Li F, Fu H, Lin F, Li J, Huang Y, Yu Q, Song S, Xu X, Xu Y, Wang W, Wang L, Lu S, Li H, Huang S, Lu Z, Ou C, Wei X, Liu B, Kobbi R, Tang X, Lin L, Zhou Q, Hu Q, Bogunović H, Orlando JI, Zhang X, Xu Y. GAMMA challenge: Glaucoma grAding from Multi-Modality imAges. Med Image Anal 2023; 90:102938. [PMID: 37806020 DOI: 10.1016/j.media.2023.102938] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/21/2022] [Revised: 07/04/2023] [Accepted: 08/16/2023] [Indexed: 10/10/2023]
Abstract
Glaucoma is a chronic neurodegenerative condition that is one of the world's leading causes of irreversible but preventable blindness. The blindness is generally caused by a lack of timely detection and treatment, so early screening is essential to preserve vision and maintain quality of life. Colour fundus photography and Optical Coherence Tomography (OCT) are the two most cost-effective tools for glaucoma screening. Both imaging modalities have prominent biomarkers that indicate glaucoma suspects, such as the vertical cup-to-disc ratio (vCDR) on fundus images and retinal nerve fiber layer (RNFL) thickness on OCT volumes. In clinical practice, both screenings are often recommended for a more accurate and reliable diagnosis. However, although numerous algorithms based on fundus images or OCT volumes have been proposed for automated glaucoma detection, few methods leverage both modalities. To fill this research gap, we set up the Glaucoma grAding from Multi-Modality imAges (GAMMA) Challenge to encourage the development of fundus & OCT-based glaucoma grading. The primary task of the challenge is to grade glaucoma from both 2D fundus images and 3D OCT scanning volumes. As part of GAMMA, we have publicly released a glaucoma-annotated dataset with both 2D fundus colour photography and 3D OCT volumes, the first multi-modality dataset for machine-learning-based glaucoma grading. An evaluation framework is also established to assess the performance of the submitted methods. During the challenge, 1272 results were submitted, and the ten best-performing teams were selected for the final stage. We analyse their results and summarize their methods in the paper. Since all the teams submitted their source code in the challenge, we conducted a detailed ablation study to verify the effectiveness of the particular modules proposed. Finally, we identify the proposed techniques and strategies that could be of practical value for the clinical diagnosis of glaucoma. As the first in-depth study of fundus & OCT multi-modality glaucoma grading, we believe the GAMMA Challenge will serve as an essential guideline and benchmark for future research.
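The vertical cup-to-disc ratio (vCDR) named as a fundus biomarker in this abstract is the ratio of the vertical extent of the optic cup to that of the optic disc. A minimal sketch, assuming binary 2D segmentation masks (illustrative only, not code from the challenge):

```python
import numpy as np

def vertical_cup_to_disc_ratio(cup_mask, disc_mask):
    """Vertical cup-to-disc ratio (vCDR) from binary segmentation masks.

    Assumes 2D boolean arrays where True marks cup or disc pixels;
    vCDR = vertical cup height / vertical disc height.
    """
    def vertical_extent(mask):
        rows = np.where(mask.any(axis=1))[0]  # row indices containing the structure
        if rows.size == 0:
            return 0
        return rows[-1] - rows[0] + 1

    disc_h = vertical_extent(disc_mask)
    if disc_h == 0:
        raise ValueError("empty disc mask")
    return vertical_extent(cup_mask) / disc_h

# Toy example: disc spans rows 10..49 (height 40), cup spans rows 20..39 (height 20)
disc = np.zeros((64, 64), dtype=bool)
disc[10:50, 20:44] = True
cup = np.zeros((64, 64), dtype=bool)
cup[20:40, 26:38] = True
print(vertical_cup_to_disc_ratio(cup, disc))  # 0.5
```

Larger vCDR values (cup occupying more of the disc vertically) are associated with glaucoma suspicion, which is why cup and disc segmentation quality directly affects grading.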
Affiliation(s)
- Junde Wu
- South China University of Technology, Guangzhou, China; Pazhou Lab, Guangzhou, China
- Huihui Fang
- South China University of Technology, Guangzhou, China; Pazhou Lab, Guangzhou, China
- Fei Li
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangzhou, China
- Huazhu Fu
- Institute of High Performance Computing (IHPC), Agency for Science, Technology and Research (A*STAR), Singapore
- Fengbin Lin
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangzhou, China
- Jiongcheng Li
- School of Informatics, Xiamen University, Xiamen, China
- Yue Huang
- School of Informatics, Xiamen University, Xiamen, China
- Qinji Yu
- Shanghai Jiao Tong University, Shanghai, China
- Sifan Song
- Xi'an Jiaotong-Liverpool University, Suzhou, China
- Xinxing Xu
- Institute of High Performance Computing (IHPC), Agency for Science, Technology and Research (A*STAR), Singapore
- Yanyu Xu
- Institute of High Performance Computing (IHPC), Agency for Science, Technology and Research (A*STAR), Singapore
- Wensai Wang
- Institute of Biomedical Engineering, Chinese Academy of Medical Sciences and Peking Union Medical College, Tianjin, China
- Lingxiao Wang
- Institute of Biomedical Engineering, Chinese Academy of Medical Sciences and Peking Union Medical College, Tianjin, China
- Shuai Lu
- School of Medical Technology, Beijing Institute of Technology, Beijing, China
- Huiqi Li
- School of Medical Technology, Beijing Institute of Technology, Beijing, China; School of Information and Electronics, Beijing Institute of Technology, Beijing, China
- Shihua Huang
- Department of Computing, Hong Kong Polytechnic University, Hong Kong, China
- Zhichao Lu
- Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, China
- Chubin Ou
- Weizhi Medical Technology Company, Suzhou, China
- Xifei Wei
- Weizhi Medical Technology Company, Suzhou, China
- Bingyuan Liu
- École de technologie supérieure, Montreal, Canada
- Xiaoying Tang
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China
- Li Lin
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China; Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, China
- Qiang Zhou
- Suixin (Shanghai) Technology Co., Ltd., Shanghai, China
- Qiang Hu
- Suixin (Shanghai) Technology Co., Ltd., Shanghai, China
- Hrvoje Bogunović
- Christian Doppler Lab for Artificial Intelligence in Retina, Department of Ophthalmology, Medical University of Vienna, Austria
- Xiulan Zhang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangzhou, China.
- Yanwu Xu
- South China University of Technology, Guangzhou, China; Pazhou Lab, Guangzhou, China.
39
Bo ZH, Guo Y, Lyu J, Liang H, He J, Deng S, Xu F, Lou X, Dai Q. Relay learning: a physically secure framework for clinical multi-site deep learning. NPJ Digit Med 2023; 6:204. [PMID: 37925578 PMCID: PMC10625523 DOI: 10.1038/s41746-023-00934-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2023] [Accepted: 09/25/2023] [Indexed: 11/06/2023]
Abstract
Big data serves as the cornerstone for constructing real-world deep learning systems across various domains. In medicine and healthcare, a single clinical site lacks sufficient data, necessitating the involvement of multiple sites. Unfortunately, concerns regarding data security and privacy hinder the sharing and reuse of data across sites. Existing approaches to multi-site clinical learning depend heavily on the security of the network firewall and system implementation. To address this issue, we propose Relay Learning, a secure deep-learning framework that physically isolates clinical data from external intruders while still leveraging the benefits of multi-site big data. We demonstrate the efficacy of Relay Learning on three medical tasks spanning different diseases and anatomical structures: retinal fundus structure segmentation, mediastinal tumor diagnosis, and brain midline localization. We evaluate Relay Learning by comparing its performance to alternative solutions through multi-site validation and external validation. Incorporating a total of 41,038 medical images from 21 medical hosts, including 7 external hosts, with non-uniform distributions, we observe significant performance improvements with Relay Learning across all three tasks. Specifically, it achieves an average performance increase of 44.4%, 24.2%, and 36.7% for retinal fundus segmentation, mediastinal tumor diagnosis, and brain midline localization, respectively. Remarkably, Relay Learning even outperforms central learning on external test sets. Meanwhile, Relay Learning keeps data sovereignty local, without cross-site network connections. We anticipate that Relay Learning will revolutionize clinical multi-site collaboration and reshape the landscape of healthcare in the future.
Affiliation(s)
- Zi-Hao Bo
- School of Software, Tsinghua University, Beijing, China
- BNRist, Tsinghua University, Beijing, China
- Yuchen Guo
- BNRist, Tsinghua University, Beijing, China.
- Jinhao Lyu
- Department of Radiology, Chinese PLA General Hospital / Chinese PLA Medical School, Beijing, China
- Hengrui Liang
- Department of Thoracic Oncology and Surgery, China State Key Laboratory of Respiratory Disease & National Clinical Research Center for Respiratory Disease, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Jianxing He
- Department of Thoracic Oncology and Surgery, China State Key Laboratory of Respiratory Disease & National Clinical Research Center for Respiratory Disease, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Shijie Deng
- Department of Radiology, The 921st Hospital of Chinese PLA, Changsha, China
- Feng Xu
- School of Software, Tsinghua University, Beijing, China.
- BNRist, Tsinghua University, Beijing, China.
- Xin Lou
- Department of Radiology, Chinese PLA General Hospital / Chinese PLA Medical School, Beijing, China.
- Qionghai Dai
- BNRist, Tsinghua University, Beijing, China.
- Department of Automation, Tsinghua University, Beijing, China.
40
Rashidisabet H, Sethi A, Jindarak P, Edmonds J, Chan RVP, Leiderman YI, Vajaranant TS, Yi D. Validating the Generalizability of Ophthalmic Artificial Intelligence Models on Real-World Clinical Data. Transl Vis Sci Technol 2023; 12:8. [PMID: 37922149 PMCID: PMC10629532 DOI: 10.1167/tvst.12.11.8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2022] [Accepted: 08/21/2023] [Indexed: 11/05/2023]
Abstract
Purpose: This study aims to investigate the generalizability of deep learning (DL) models trained on commonly used public fundus images to an instance of real-world data (RWD) for glaucoma diagnosis. Methods: We used the Illinois Eye and Ear Infirmary fundus data set as an instance of RWD in addition to six publicly available fundus data sets. We compared the performance of DL-trained models on public data and RWD for glaucoma classification and optic disc (OD) segmentation tasks. For each task, we created models trained on each data set, respectively, and each model was tested on both data sets. We further examined each model's decision-making process and learned embeddings for the glaucoma classification task. Results: Using public data for the test set, public-trained models outperformed RWD-trained models in OD segmentation and glaucoma classification, with a mean intersection over union of 96.3% and a mean area under the receiver operating characteristic curve of 95.0%, respectively. Using the RWD test set, the performance of public models decreased by 8.0% and 18.4%, to 85.6% and 76.6%, for the OD segmentation and glaucoma classification tasks, respectively. RWD models outperformed public models on RWD test sets by 2.0% and 9.5%, respectively, in the OD segmentation and glaucoma classification tasks. Conclusions: DL models trained on commonly used public data have limited ability to generalize to RWD for classifying glaucoma. They perform similarly to RWD models for OD segmentation. Translational Relevance: RWD is a potential solution for improving the generalizability of DL models and enabling clinical translation in the care of prevalent blinding ophthalmic conditions, such as glaucoma.
Affiliation(s)
- Homa Rashidisabet
- Department of Biomedical Engineering, University of Illinois Chicago, Chicago, IL, USA
- Artificial Intelligence in Ophthalmology (Ai-O) Center, University of Illinois Chicago, Chicago, IL, USA
- Abhishek Sethi
- Artificial Intelligence in Ophthalmology (Ai-O) Center, University of Illinois Chicago, Chicago, IL, USA
- Illinois Eye and Ear Infirmary, Department of Ophthalmology and Visual Sciences, University of Illinois Chicago, Chicago, IL, USA
- Ponpawee Jindarak
- Illinois Eye and Ear Infirmary, Department of Ophthalmology and Visual Sciences, University of Illinois Chicago, Chicago, IL, USA
- James Edmonds
- Artificial Intelligence in Ophthalmology (Ai-O) Center, University of Illinois Chicago, Chicago, IL, USA
- Illinois Eye and Ear Infirmary, Department of Ophthalmology and Visual Sciences, University of Illinois Chicago, Chicago, IL, USA
- R V Paul Chan
- Artificial Intelligence in Ophthalmology (Ai-O) Center, University of Illinois Chicago, Chicago, IL, USA
- Illinois Eye and Ear Infirmary, Department of Ophthalmology and Visual Sciences, University of Illinois Chicago, Chicago, IL, USA
- Yannek I Leiderman
- Department of Biomedical Engineering, University of Illinois Chicago, Chicago, IL, USA
- Artificial Intelligence in Ophthalmology (Ai-O) Center, University of Illinois Chicago, Chicago, IL, USA
- Illinois Eye and Ear Infirmary, Department of Ophthalmology and Visual Sciences, University of Illinois Chicago, Chicago, IL, USA
- Thasarat Sutabutr Vajaranant
- Artificial Intelligence in Ophthalmology (Ai-O) Center, University of Illinois Chicago, Chicago, IL, USA
- Illinois Eye and Ear Infirmary, Department of Ophthalmology and Visual Sciences, University of Illinois Chicago, Chicago, IL, USA
- Darvin Yi
- Department of Biomedical Engineering, University of Illinois Chicago, Chicago, IL, USA
- Artificial Intelligence in Ophthalmology (Ai-O) Center, University of Illinois Chicago, Chicago, IL, USA
- Illinois Eye and Ear Infirmary, Department of Ophthalmology and Visual Sciences, University of Illinois Chicago, Chicago, IL, USA
41
Qiu L, Cheng J, Gao H, Xiong W, Ren H. Federated Semi-Supervised Learning for Medical Image Segmentation via Pseudo-Label Denoising. IEEE J Biomed Health Inform 2023; 27:4672-4683. [PMID: 37155394 DOI: 10.1109/jbhi.2023.3274498] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 05/10/2023]
Abstract
Distributed big data and digital healthcare technologies have great potential to improve medical services, but challenges arise when learning predictive models from diverse and complex e-health datasets. Federated Learning (FL), a collaborative machine learning technique, aims to address these challenges by learning a joint predictive model across multi-site clients, especially distributed medical institutions or hospitals. However, most existing FL methods assume that clients possess fully labeled data for training, which is often not the case for e-health datasets due to high labeling costs or expertise requirements. This work therefore proposes a novel and feasible approach to learning a Federated Semi-Supervised Learning (FSSL) model from distributed medical image domains, in which a federated pseudo-labeling strategy for unlabeled clients is developed based on the embedded knowledge learned from labeled clients. This greatly mitigates the annotation deficiency at unlabeled clients and leads to a cost-effective and efficient medical image analysis tool. We demonstrate the effectiveness of our method with significant improvements over the state of the art on both fundus image and prostate MRI segmentation tasks, achieving the highest Dice scores of 89.23% and 91.95%, respectively, even with only a few labeled clients participating in model training. This reveals the method's suitability for practical deployment, ultimately facilitating wider use of FL in healthcare and leading to better patient outcomes.
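The pseudo-labeling idea underlying FSSL can be illustrated with a simple confidence threshold: predictions from a model trained on labeled clients supervise unlabeled clients only where they are reliable. This is a generic sketch, not the paper's denoising strategy; the threshold value and the ignore-label convention are assumptions:

```python
import numpy as np

IGNORE = 255  # label value excluded from the segmentation loss (assumed convention)

def pseudo_labels(probs, threshold=0.9):
    """Confidence-thresholded pixel-wise pseudo-labels.

    `probs` is a (C, H, W) softmax map from a model trained on labeled
    clients; pixels whose top-class probability falls below `threshold`
    are masked out so they do not contribute to the unlabeled client's loss.
    """
    conf = probs.max(axis=0)           # per-pixel max class probability
    labels = probs.argmax(axis=0)      # per-pixel predicted class
    labels[conf < threshold] = IGNORE  # drop unreliable pixels
    return labels

# Toy 2-class example on a 1x2 "image"
p = np.array([[[0.95, 0.55]],   # class 0 probabilities
              [[0.05, 0.45]]])  # class 1 probabilities
out = pseudo_labels(p)  # pixel 0 keeps label 0; pixel 1 is masked to 255
```

A denoising scheme such as the paper's would refine or reweight these labels further; the threshold alone already prevents the most uncertain predictions from propagating errors.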
42
Gu R, Wang G, Lu J, Zhang J, Lei W, Chen Y, Liao W, Zhang S, Li K, Metaxas DN, Zhang S. CDDSA: Contrastive domain disentanglement and style augmentation for generalizable medical image segmentation. Med Image Anal 2023; 89:102904. [PMID: 37506556 DOI: 10.1016/j.media.2023.102904] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 06/06/2023] [Accepted: 07/12/2023] [Indexed: 07/30/2023]
Abstract
Generalization to previously unseen images with potential domain shifts is essential for clinically applicable medical image segmentation. Disentangling domain-specific and domain-invariant features is key for Domain Generalization (DG), yet existing DG methods struggle to achieve effective disentanglement. To address this problem, we propose an efficient framework called Contrastive Domain Disentanglement and Style Augmentation (CDDSA) for generalizable medical image segmentation. First, a disentanglement network decomposes each image into a domain-invariant anatomical representation and a domain-specific style code; the former is passed on for segmentation unaffected by domain shift, and the disentanglement is regularized by a decoder that combines the anatomical representation and style code to reconstruct the original image. Second, to achieve better disentanglement, a contrastive loss is proposed to encourage style codes from the same domain to be compact and those from different domains to be divergent. Finally, to further improve generalizability, we propose a style augmentation strategy that synthesizes images with various unseen styles in real time while preserving anatomical information. Comprehensive experiments on a public multi-site fundus image dataset and an in-house multi-site Nasopharyngeal Carcinoma Magnetic Resonance Image (NPC-MRI) dataset show that the proposed CDDSA achieves remarkable generalizability across domains and outperforms several state-of-the-art methods in generalizable segmentation. Code is available at https://github.com/HiLab-git/DAG4MIA.
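The compact/divergent objective on style codes can be illustrated with a margin-based contrastive loss over pairs of codes: same-domain pairs are pulled together, cross-domain pairs pushed beyond a margin. This is a hedged sketch only; the paper's exact contrastive formulation may differ:

```python
import numpy as np

def style_contrastive_loss(codes, domains, margin=1.0):
    """Margin-based contrastive loss on style codes (illustrative sketch).

    codes:   (N, D) array of style vectors
    domains: length-N sequence of domain ids
    Same-domain pairs are penalized by squared distance (compactness);
    different-domain pairs by squared margin violation (divergence).
    """
    loss, pairs = 0.0, 0
    for i in range(len(codes)):
        for j in range(i + 1, len(codes)):
            d = np.linalg.norm(codes[i] - codes[j])
            if domains[i] == domains[j]:
                loss += d ** 2                     # pull same-domain codes together
            else:
                loss += max(0.0, margin - d) ** 2  # push cross-domain codes apart
            pairs += 1
    return loss / pairs

# Identical same-domain codes and well-separated cross-domain codes give zero loss
codes = np.array([[0.0, 0.0], [0.0, 0.0], [5.0, 0.0], [5.0, 0.0]])
print(style_contrastive_loss(codes, [0, 0, 1, 1]))  # 0.0
```

In practice such a loss would be computed on encoder outputs within each training batch and minimized jointly with the reconstruction and segmentation objectives.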
Affiliation(s)
- Ran Gu
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Guotai Wang
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China; Shanghai AI Lab, Shanghai, China.
- Jiangshan Lu
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Jingyang Zhang
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China; School of Biomedical Engineering, ShanghaiTech University, Shanghai, China
- Wenhui Lei
- School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China; Shanghai AI Lab, Shanghai, China
- Yinan Chen
- SenseTime Research, Shanghai, China; West China Hospital-SenseTime Joint Lab, West China Biomedical Big Data Center, Sichuan University, Chengdu, China
- Wenjun Liao
- Department of Radiation Oncology, Sichuan Cancer Hospital and Institute, University of Electronic Science and Technology of China, Chengdu, China
- Shichuan Zhang
- Department of Radiation Oncology, Sichuan Cancer Hospital and Institute, University of Electronic Science and Technology of China, Chengdu, China
- Kang Li
- West China Hospital-SenseTime Joint Lab, West China Biomedical Big Data Center, Sichuan University, Chengdu, China
- Dimitris N Metaxas
- Department of Computer Science, Rutgers University, Piscataway NJ 08854, USA
- Shaoting Zhang
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China; SenseTime Research, Shanghai, China; Shanghai AI Lab, Shanghai, China.
43
Cellini F, Caamaño D, Carrasco B, Juberías JR, Ossa C, Bringas R, de la Fuente F, Franco P, Coronado D, Pastor JC. Deep Learning Application to Detect Glaucoma with a Mixed Training Approach: Public Database and Expert-Labeled Glaucoma Population. Ophthalmic Res 2023; 66:1278-1285. [PMID: 37778337 DOI: 10.1159/000534251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Accepted: 09/18/2023] [Indexed: 10/03/2023]
Abstract
INTRODUCTION Artificial intelligence has real potential for early identification of ocular diseases such as glaucoma. An important challenge is the requirement for large, properly selected databases, which are not easily obtained. We used a relatively original strategy: a glaucoma recognition algorithm trained with fundus images from public databases and then tested and retrained with a carefully selected patient database. METHODS The study's supervised deep learning method was an adapted version of the ResNet-50 architecture, previously trained on 10,658 optic nerve head images (glaucomatous or non-glaucomatous) from seven public databases. A total of 1,158 new images from 616 patients, labeled by experts, were added. After clinical examination including visual fields, the images were categorized as 304 (26%) control or ocular hypertension images, 347 (30%) with early, 290 (25%) with moderate, and 217 (19%) with advanced glaucoma. The initial algorithm was tested using 30% of the selected glaucoma database, then re-trained with 70% of this database and tested again. RESULTS In the initial sample, the area under the curve (AUC) was 76% for all images, and 66% for early, 82% for moderate, and 84% for advanced glaucoma. After retraining the algorithm, the respective AUC results were 82%, 72%, 89%, and 91%. CONCLUSION Combining data from public databases with data selected and labeled by experts improved the system's precision and identified interesting possibilities for obtaining more affordable tools for automatic screening of glaucomatous eyes.
Affiliation(s)
- Florencia Cellini
- Instituto de Oftalmobiología Aplicada (IOBA), University of Valladolid, Valladolid, Spain
- Deborah Caamaño
- Instituto de Oftalmobiología Aplicada (IOBA), University of Valladolid, Valladolid, Spain
- Belen Carrasco
- Ophthalmology Department, Hospital Clinico Universitario (HCUV), Valladolid, Spain
- José R Juberías
- Instituto de Oftalmobiología Aplicada (IOBA), University of Valladolid, Valladolid, Spain
- Ophthalmology Department, Hospital Clinico Universitario (HCUV), Valladolid, Spain
- Carolina Ossa
- Instituto de Oftalmobiología Aplicada (IOBA), University of Valladolid, Valladolid, Spain
- Ramón Bringas
- Ophthalmology Department, Hospital Universitario Río Hortega (HURH), Valladolid, Spain
- Jose Carlos Pastor
- Instituto de Oftalmobiología Aplicada (IOBA), University of Valladolid, Valladolid, Spain
44
Yi Y, Jiang Y, Zhou B, Zhang N, Dai J, Huang X, Zeng Q, Zhou W. C2FTFNet: Coarse-to-fine transformer network for joint optic disc and cup segmentation. Comput Biol Med 2023; 164:107215. [PMID: 37481947 DOI: 10.1016/j.compbiomed.2023.107215] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2023] [Revised: 06/07/2023] [Accepted: 06/25/2023] [Indexed: 07/25/2023]
Abstract
Glaucoma is a leading cause of blindness and visual impairment worldwide, making early screening and diagnosis crucial to prevent vision loss. Cup-to-Disc Ratio (CDR) evaluation serves as a widely applied approach for effective glaucoma screening. At present, deep learning methods have exhibited outstanding performance in optic disc (OD) and optic cup (OC) segmentation and have been widely deployed in computer-aided diagnosis (CAD) systems. However, owing to the complexity of clinical data, these techniques can be constrained. Therefore, an original Coarse-to-Fine Transformer Network (C2FTFNet), composed of two stages, is designed to segment OD and OC jointly. In the coarse stage, to eliminate the effects of irrelevant tissue on the segmented OC and OD regions, U-Net and the Circular Hough Transform (CHT) are employed to segment the Region of Interest (ROI) around the OD. Meanwhile, a TransUnet3+ model is designed in the fine segmentation stage to extract the OC and OD regions more accurately from the ROI. In this model, to alleviate the limited receptive field of traditional convolutional methods, a Transformer module is introduced into the backbone to capture long-distance dependencies and retain more global information. Then, a Multi-Scale Dense Skip Connection (MSDC) module is proposed to fuse low-level and high-level features from different layers, reducing the semantic gap among features at different levels. Comprehensive experiments conducted on the DRIONS-DB, Drishti-GS, and REFUGE datasets validate the superior effectiveness of the proposed C2FTFNet compared to existing state-of-the-art approaches.
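Since the abstract motivates OD/OC segmentation through Cup-to-Disc Ratio screening, a minimal sketch of how a vertical CDR could be derived from binary segmentation masks may be useful; this is a generic illustration, not the paper's implementation:

```python
import numpy as np

def vertical_cdr(cup_mask, disc_mask):
    """Vertical cup-to-disc ratio: ratio of the vertical extents (heights)
    of the cup and disc regions in binary segmentation masks."""
    def height(mask):
        rows = np.any(mask, axis=1)           # which rows contain the region
        idx = np.where(rows)[0]
        return 0 if idx.size == 0 else idx[-1] - idx[0] + 1
    return height(np.asarray(cup_mask)) / height(np.asarray(disc_mask))

# Toy masks: a disc spanning 8 rows with a cup spanning 4 of them.
disc = np.zeros((12, 12), dtype=bool)
disc[2:10, 2:10] = True
cup = np.zeros((12, 12), dtype=bool)
cup[4:8, 4:8] = True
cdr = vertical_cdr(cup, disc)   # 4 / 8 = 0.5
```

A larger CDR (typically above roughly 0.5-0.6) is the glaucoma indicator such screening pipelines look for.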
Affiliation(s)
- Yugen Yi
- School of Software, Jiangxi Normal University, Nanchang, 330022, China; Jiangxi Provincial Engineering Research Center of Blockchain Data Security and Governance, Nanchang, 330022, China
- Yan Jiang
- School of Software, Jiangxi Normal University, Nanchang, 330022, China
- Bin Zhou
- School of Software, Jiangxi Normal University, Nanchang, 330022, China
- Ningyi Zhang
- School of Software, Jiangxi Normal University, Nanchang, 330022, China
- Jiangyan Dai
- School of Computer Engineering, Weifang University, 261061, China
- Xin Huang
- School of Software, Jiangxi Normal University, Nanchang, 330022, China
- Qinqin Zeng
- Department of Ophthalmology, The Second Affiliated Hospital of Nanchang University, Nanchang, 330006, China
- Wei Zhou
- College of Computer Science, Shenyang Aerospace University, Shenyang, 110136, China
45
Zhang J, Gu R, Xue P, Liu M, Zheng H, Zheng Y, Ma L, Wang G, Gu L. S3R: Shape and Semantics-Based Selective Regularization for Explainable Continual Segmentation Across Multiple Sites. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:2539-2551. [PMID: 37030841 DOI: 10.1109/tmi.2023.3260974] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
In clinical practice, it is desirable for medical image segmentation models to be able to continually learn on a sequential data stream from multiple sites, rather than a consolidated dataset, due to storage cost and privacy restrictions. However, when learning on a new site, existing methods struggle with a weak memorizability for previous sites with complex shape and semantic information, and a poor explainability for the memory consolidation process. In this work, we propose a novel Shape and Semantics-based Selective Regularization (S3R) method for explainable cross-site continual segmentation to maintain both shape and semantic knowledge of previously learned sites. Specifically, the S3R method adopts a selective regularization scheme to penalize changes of parameters with high Joint Shape and Semantics-based Importance (JSSI) weights, which are estimated based on the parameter sensitivity to shape properties and reliable semantics of the segmentation object. This helps to prevent the related shape and semantic knowledge from being forgotten. Moreover, we propose an Importance Activation Mapping (IAM) method for memory interpretation, which indicates the spatial support for important parameters to visualize the memorized content. We have extensively evaluated our method on prostate segmentation and optic cup and disc segmentation tasks. Our method outperforms other comparison methods in reducing model forgetting and increasing explainability. Our code is available at https://github.com/jingyzhang/S3R.
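The JSSI importance estimation is the paper's contribution and is not reproduced here; the selective regularization itself, however, follows the familiar importance-weighted quadratic penalty, which can be sketched as follows (function and variable names are illustrative):

```python
import numpy as np

def selective_regularization_loss(task_loss, params, anchor_params, importance, lam=1.0):
    """EWC-style selective penalty: parameters with high importance weights
    are pulled back toward the values learned on previous sites."""
    penalty = sum(np.sum(w * (p - a) ** 2)
                  for p, a, w in zip(params, anchor_params, importance))
    return task_loss + lam * penalty

# Toy example: one parameter vector where only the first entry is "important".
params = [np.array([1.0, 2.0])]
anchors = [np.array([0.0, 0.0])]
importance = [np.array([1.0, 0.0])]   # the second parameter is free to drift
loss = selective_regularization_loss(0.5, params, anchors, importance, lam=1.0)
# 0.5 + (1.0 * 1.0**2 + 0.0 * 2.0**2) = 1.5
```

The unimportant parameter incurs no penalty, which is exactly what lets the model adapt to a new site without forgetting protected knowledge.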
46
Li Z, Zhao C, Han Z, Hong C. TUNet and domain adaptation based learning for joint optic disc and cup segmentation. Comput Biol Med 2023; 163:107209. [PMID: 37442009 DOI: 10.1016/j.compbiomed.2023.107209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/11/2023] [Revised: 06/02/2023] [Accepted: 06/25/2023] [Indexed: 07/15/2023]
Abstract
Glaucoma is a chronic disorder that harms the optic nerve and causes irreversible blindness. The calculation of the optic cup (OC) to optic disc (OD) ratio plays an important role in the primary screening and diagnosis of glaucoma. Thus, automatic and precise segmentation of the OD and OC is highly desirable. Recently, deep neural networks have demonstrated remarkable progress in OD and OC segmentation; however, they are severely hindered in generalizing across different scanners and image resolutions. In this work, we propose a novel domain adaptation-based framework to mitigate this performance degradation in OD and OC segmentation. We first devise an effective transformer-based segmentation network as a backbone to accurately segment the OD and OC regions. Then, to address the issue of domain shift, we introduce domain adaptation into the learning paradigm to encourage domain-invariant features. Since the segmentation-based domain adaptation loss is insufficient for capturing segmentation details, we further propose an auxiliary classifier to enable discrimination on segmentation details. Exhaustive experiments on three public retinal fundus image datasets, i.e., REFUGE, Drishti-GS and RIM-ONE-r3, demonstrate our superior performance on the segmentation of OD and OC. These results suggest that our proposal has great potential to be an important component of an automated glaucoma screening system.
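The paper's auxiliary classifier and exact loss formulation are not detailed in the abstract; the sketch below only illustrates the general idea of coupling a segmentation objective with a weighted domain-classification term (all names and the weighting factor are illustrative assumptions, and in adversarial training the feature extractor would try to *maximize* the domain term):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_entropy(probs, targets, eps=1e-12):
    return -np.mean(np.log(probs[np.arange(len(targets)), targets] + eps))

def joint_objective(seg_logits, seg_targets, dom_logits, dom_targets, lam=0.1):
    """Segmentation loss plus a lambda-weighted domain-classification loss."""
    seg_loss = cross_entropy(softmax(seg_logits), seg_targets)
    dom_loss = cross_entropy(softmax(dom_logits), dom_targets)
    return seg_loss + lam * dom_loss

seg_logits = np.array([[10.0, 0.0]])   # near-certain correct segmentation output
dom_logits = np.array([[0.0, 0.0]])    # maximally confused domain classifier
total = joint_objective(seg_logits, np.array([0]), dom_logits, np.array([0]), lam=0.1)
```

A confused domain classifier (uniform logits) is the desired end state for domain-invariant features, so here the total is dominated by the small residual lambda-weighted ln 2 term.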
Affiliation(s)
- Zhuorong Li
- Hangzhou City University, Hangzhou, 310015, Zhejiang, China
- Chen Zhao
- Zhejiang University, Hangzhou, 310027, Zhejiang, China
- Zhike Han
- Hangzhou City University, Hangzhou, 310015, Zhejiang, China
- Chaoyang Hong
- Zhejiang Provincial People's Hospital, Hangzhou, 310014, Zhejiang, China
47
Rezaei M, Näppi JJ, Bischl B, Yoshida H. Bayesian uncertainty estimation for detection of long-tailed and unseen conditions in medical images. J Med Imaging (Bellingham) 2023; 10:054501. [PMID: 37818179 PMCID: PMC10560997 DOI: 10.1117/1.jmi.10.5.054501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2023] [Revised: 09/19/2023] [Accepted: 09/20/2023] [Indexed: 10/12/2023] Open
Abstract
Purpose Deep supervised learning provides an effective approach for developing robust models for various computer-aided diagnosis tasks. However, there is often an underlying assumption that the frequencies of the samples between the different classes of the training dataset are either similar or balanced. In real-world medical data, the samples of positive classes often occur too infrequently to satisfy this assumption. Thus, there is an unmet need for deep-learning systems that can automatically identify and adapt to the real-world conditions of imbalanced data. Approach We propose a deep Bayesian ensemble learning framework to address the representation learning problem of long-tailed and out-of-distribution (OOD) samples when training from medical images. By estimating the relative uncertainties of the input data, our framework can adapt to imbalanced data for learning generalizable classifiers. We trained and tested our framework on four public medical imaging datasets with various imbalance ratios and imaging modalities across three different learning tasks: semantic medical image segmentation, OOD detection, and in-domain generalization. We compared the performance of our framework with those of state-of-the-art comparator methods. Results Our proposed framework outperformed the comparator models significantly across all performance metrics (pairwise t-test: p < 0.01) in the semantic segmentation of high-resolution CT and MR images as well as in the detection of OOD samples (p < 0.01), thereby showing significant improvement in handling the associated long-tailed data distribution. The results of the in-domain generalization also indicated that our framework can enhance the prediction of retinal glaucoma, contributing to clinical decision-making processes. Conclusions Training of the proposed deep Bayesian ensemble learning framework with dynamic Monte-Carlo dropout and a combination of losses yielded the best generalization to unseen samples from imbalanced medical imaging datasets across different learning tasks.
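The framework's dynamic Monte-Carlo dropout is more elaborate than shown here; this minimal sketch only illustrates the basic MC-dropout idea of keeping dropout active at inference and reading an uncertainty proxy from the spread of stochastic forward passes (a single linear unit, with illustrative parameters):

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_dropout_predict(x, weights, drop_p=0.5, n_samples=100):
    """Monte-Carlo dropout: run several stochastic forward passes with
    dropout enabled and use mean/std as prediction and uncertainty."""
    preds = []
    for _ in range(n_samples):
        mask = rng.random(weights.shape) >= drop_p   # randomly drop units
        w = weights * mask / (1.0 - drop_p)          # inverted-dropout scaling
        preds.append(float(x @ w))
    preds = np.array(preds)
    return preds.mean(), preds.std()                 # predictive mean, uncertainty

x = np.ones(8)
weights = np.full(8, 0.5)
mean, std = mc_dropout_predict(x, weights, n_samples=500)
```

The inverted-dropout scaling keeps the predictive mean unbiased (about 4.0 here), while a nonzero standard deviation flags how sensitive the output is to the stochastic masks.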
Affiliation(s)
- Mina Rezaei
- LMU Munich, Department of Statistics, Munich, Germany
- Munich Center for Machine Learning, Munich, Germany
- Janne J. Näppi
- Massachusetts General Hospital, Harvard Medical School, 3D Imaging Research, Department of Radiology, Boston, Massachusetts, United States
- Bernd Bischl
- LMU Munich, Department of Statistics, Munich, Germany
- Munich Center for Machine Learning, Munich, Germany
- Hiroyuki Yoshida
- Massachusetts General Hospital, Harvard Medical School, 3D Imaging Research, Department of Radiology, Boston, Massachusetts, United States
48
Liu Z, Lv Q, Yang Z, Li Y, Lee CH, Shen L. Recent progress in transformer-based medical image analysis. Comput Biol Med 2023; 164:107268. [PMID: 37494821 DOI: 10.1016/j.compbiomed.2023.107268] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2023] [Revised: 05/30/2023] [Accepted: 07/16/2023] [Indexed: 07/28/2023]
Abstract
The transformer is primarily used in the field of natural language processing. Recently, it has been adopted and shows promise in the computer vision (CV) field. Medical image analysis (MIA), as a critical branch of CV, also greatly benefits from this state-of-the-art technique. In this review, we first recap the core component of the transformer, the attention mechanism, and the detailed structures of the transformer. After that, we depict the recent progress of the transformer in the field of MIA. We organize the applications in a sequence of different tasks, including classification, segmentation, captioning, registration, detection, enhancement, localization, and synthesis. The mainstream classification and segmentation tasks are further divided into eleven medical image modalities. A large number of experiments studied in this review illustrate that the transformer-based method outperforms existing methods through comparisons with multiple evaluation metrics. Finally, we discuss the open challenges and future opportunities in this field. This task-modality review with the latest contents, detailed information, and comprehensive comparison may greatly benefit the broad MIA community.
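The attention mechanism the review recaps reduces, in its simplest form, to scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V; a minimal self-contained sketch:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Core transformer operation: each query's output is a softmax-weighted
    average of the value vectors, weighted by query-key similarity."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # pairwise similarities, scaled
    return softmax(scores, axis=-1) @ V  # convex combination of values

# Three 4-dimensional tokens attending over themselves (self-attention).
rng = np.random.default_rng(42)
tokens = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(tokens, tokens, tokens)
```

Because every attention weight row sums to one, each output token is a convex combination of the value vectors, which is what gives the transformer its global (long-range) receptive field.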
Affiliation(s)
- Zhaoshan Liu
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore
- Qiujie Lv
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore; School of Intelligent Systems Engineering, Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, 518107, China
- Ziduo Yang
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore; School of Intelligent Systems Engineering, Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, 518107, China
- Yifan Li
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore
- Chau Hung Lee
- Department of Radiology, Tan Tock Seng Hospital, 11 Jalan Tan Tock Seng, Singapore, 308433, Singapore
- Lei Shen
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore
49
Xie Y, Wan Q, Xie H, Xu Y, Wang T, Wang S, Lei B. Fundus Image-Label Pairs Synthesis and Retinopathy Screening via GANs With Class-Imbalanced Semi-Supervised Learning. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:2714-2725. [PMID: 37030825 DOI: 10.1109/tmi.2023.3263216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
Retinopathy is the primary cause of irreversible yet preventable blindness. Numerous deep-learning algorithms have been developed for automatic retinal fundus image analysis. However, existing methods are usually data-driven and rarely consider the costs associated with fundus image collection and annotation, along with the class-imbalanced distribution that arises from the relative scarcity of disease-positive individuals in the population. Semi-supervised learning on class-imbalanced data, despite being a realistic problem, has been relatively little studied. To fill this research gap, we explore generative adversarial networks (GANs) as a potential answer to the problem. Specifically, we present a novel framework, named CISSL-GANs, for class-imbalanced semi-supervised learning (CISSL) that leverages a dynamic class-rebalancing (DCR) sampler, which exploits the property that a classifier trained on class-imbalanced data produces high-precision pseudo-labels on minority classes, thereby leveraging the bias inherent in pseudo-labels. Also, given the well-known difficulty of training GANs on complex data, we investigate three practical techniques to improve the training dynamics without altering the global equilibrium. Experimental results demonstrate that our CISSL-GANs are capable of simultaneously improving fundus image class-conditional generation and classification performance under a typical label-insufficient and imbalanced scenario. Our code is available at: https://github.com/Xyporz/CISSL-GANs.
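The DCR sampler updates its class estimates dynamically from pseudo-labels during training; the static inverse-frequency weighting below is only a simplified illustration of the rebalancing idea (names are illustrative):

```python
import numpy as np

def class_rebalancing_weights(labels):
    """Per-sample sampling weights inversely proportional to (pseudo-)label
    frequency, so minority-class samples are drawn more often."""
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    freq = dict(zip(classes.tolist(), counts.tolist()))
    w = np.array([1.0 / freq[l] for l in labels.tolist()], dtype=float)
    return w / w.sum()   # normalized sampling distribution

# 9 majority-class samples vs 1 minority-class sample.
labels = [0] * 9 + [1]
probs = class_rebalancing_weights(labels)
```

With inverse-frequency weights, each class receives equal total sampling mass (0.5 here), which is the rebalanced training distribution the sampler aims for.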
50
Hua K, Fang X, Tang Z, Cheng Y, Yu Z. DCAM-NET: A novel domain generalization optic cup and optic disc segmentation pipeline with multi-region and multi-scale convolution attention mechanism. Comput Biol Med 2023; 163:107076. [PMID: 37379616 DOI: 10.1016/j.compbiomed.2023.107076] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/01/2023] [Revised: 04/27/2023] [Accepted: 05/27/2023] [Indexed: 06/30/2023]
Abstract
Fundus images are an essential basis for diagnosing ocular diseases, and convolutional neural networks have shown promising results in achieving accurate fundus image segmentation. However, differences between the training data (source domain) and the testing data (target domain) can significantly affect the final segmentation performance. This paper proposes a novel framework named DCAM-NET for domain-generalized fundus segmentation, which substantially improves the generalization ability of the segmentation model to target domain data and enhances the extraction of detailed information from source domain data. This model can effectively overcome the problem of poor performance in cross-domain segmentation. To enhance the adaptability of the segmentation model to target domain data, this paper proposes a multi-scale attention mechanism module (MSA) that operates at the feature extraction level. It extracts features of different attributes and feeds each into the corresponding scale attention module to further capture the critical features in channel, position, and spatial regions. The MSA module also integrates the characteristics of the self-attention mechanism: it can capture dense context information, and its aggregation of multi-feature information effectively enhances the generalization of the model when dealing with unknown domain data. In addition, this paper proposes the multi-region weight fusion convolution module (MWFC), which is essential for the segmentation model to accurately extract feature information from the source domain data. Fusing multiple region weights and convolutional kernel weights on the image enhances the model's adaptability to information at different locations, and the fusion of weights deepens the capacity and depth of the model, enhancing its ability to learn from multiple regions of the source domain.
Our experiments on fundus data for cup/disc segmentation show that the MSA and MWFC modules introduced in this paper effectively improve the segmentation ability of the model on unknown domains, and the performance of the proposed method is significantly better than that of other current domain generalization methods for optic cup/disc segmentation.
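The MWFC module's exact formulation is not given in the abstract; as a loose, hypothetical simplification of the idea of fusing region-specific responses with learned weights (all names and shapes are assumptions):

```python
import numpy as np

def region_weighted_fusion(feature_maps, region_weights):
    """Fuse per-region feature responses with learned scalar weights:
    a hypothetical simplification of multi-region weight fusion."""
    feature_maps = np.asarray(feature_maps, dtype=float)   # shape (R, H, W)
    w = np.asarray(region_weights, dtype=float)
    w = w / w.sum()                                        # normalize weights
    return np.tensordot(w, feature_maps, axes=1)           # fused (H, W) map

# Two 2x2 "region" responses fused with weights favouring the first region.
fmaps = [np.ones((2, 2)), 3 * np.ones((2, 2))]
fused = region_weighted_fusion(fmaps, [3.0, 1.0])
# 0.75 * 1 + 0.25 * 3 = 1.5 everywhere
```

Letting the weights vary per region (rather than applying one kernel uniformly) is what gives such a module its location-dependent adaptability.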
Affiliation(s)
- Kaiwen Hua
- School of Computer Science and Engineering, Anhui University of Science and Technology, 232001, Huainan, Anhui, China
- Xianjin Fang
- School of Computer Science and Engineering, Anhui University of Science and Technology, 232001, Huainan, Anhui, China
- Zhiri Tang
- Academy for Engineering and Technology, Fudan University, 200433, Shanghai, China
- Ying Cheng
- School of Artificial Intelligence Academy, Anhui University of Science and Technology, 232001, Huainan, Anhui, China
- Zekuan Yu
- Academy for Engineering and Technology, Fudan University, 200433, Shanghai, China